CN116963072A - Fraud user early warning method and device, electronic equipment and storage medium - Google Patents

Fraud user early warning method and device, electronic equipment and storage medium

Info

Publication number
CN116963072A
Authority
CN
China
Prior art keywords
user
fraud
separation
person
early warning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310930064.0A
Other languages
Chinese (zh)
Inventor
王宪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Qinghai Co ltd
China Mobile Communications Group Co Ltd
Original Assignee
China Mobile Group Qinghai Co ltd
China Mobile Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Qinghai Co ltd, China Mobile Communications Group Co Ltd
Priority to CN202310930064.0A
Publication of CN116963072A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/12 Detection or prevention of fraud
    • H04W12/128 Anti-malware arrangements, e.g. protection against SMS fraud or mobile malware
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application relates to the technical field of mobile communication and provides a fraud user early warning method and device, electronic equipment and a storage medium. The fraud user early warning method comprises the following steps: acquiring operation domain data and business domain data of a user to be analyzed; inputting the operation domain data and the business domain data into a person-certificate separation user prediction model to obtain a prediction result output by that model, wherein the person-certificate separation user prediction model is used for determining whether user information is consistent with certificate information; if the prediction result is that the user to be analyzed is a person-certificate separation user, acquiring the communication features of the person-certificate separation user; and inputting the communication features into a fraud user identification model to obtain the fraud early warning level output by that model, wherein the fraud user identification model is used for predicting fraud levels. The application can improve the accuracy of fraud user prediction, makes it convenient to intercept malicious numbers according to the fraud early warning level, and can thereby improve the efficiency of fraud early warning.

Description

Fraud user early warning method and device, electronic equipment and storage medium
Technical Field
The application relates to the technical field of mobile communication, in particular to a fraud user early warning method, a fraud user early warning device, electronic equipment and a storage medium.
Background
In recent years, telecommunication fraud has occurred frequently and caused great social harm, and the related prevention and control work is highly valued. Telephone fraud is a high-incidence crime with great social harm: fraudsters impersonate other people by telephone to maliciously cheat victims out of property, which affects social stability, and with the application of new technologies telephone fraud increasingly shows characteristics such as variability and adversarial behavior. At present, users mark fraud calls through the marking function of their mobile phones, so that a server can establish a fraud number marking library from the marked numbers and remind users about incoming calls according to that library. However, this method has the following defects:
The data is not comprehensive enough: each party establishes its own fraud number marking library independently, and because each user group is limited, number marking is incomplete, making it difficult to comprehensively prevent telecommunication fraud and harassment.
Data authenticity cannot be guaranteed: because there is no restriction on who may mark numbers, and because of user mistakes and the like, malicious marking and marking errors exist in each party's fraud number marking library; owners or operators of a fraud number marking library may also collude with lawbreakers and tamper with the marking data for their own benefit, so the authenticity of the data is questionable.
Only post-event reminding, no advance defense: the number type reminder can only be given after the call has reached the user, and a number already determined to be malicious cannot be intercepted in advance.
Because of the above defects, the current fraud number marking method is inefficient.
Disclosure of Invention
The embodiment of the application provides a fraud user early warning method, a fraud user early warning device, electronic equipment and a storage medium, which are used for solving the problem of low efficiency of the current fraud early warning.
In a first aspect, an embodiment of the present application provides a method for early warning a fraud user, including:
acquiring operation domain data and business domain data of a user to be analyzed;
inputting the operation domain data and the business domain data into a person-certificate separation user prediction model to obtain a prediction result output by the person-certificate separation user prediction model; the person-certificate separation user prediction model is used for determining whether user information is consistent with certificate information;
if the prediction result is that the user to be analyzed is a person-certificate separation user, acquiring communication features of the person-certificate separation user;
inputting the communication features of the person-certificate separation user into a fraud user identification model to obtain the fraud early warning level output by the fraud user identification model; wherein the fraud user identification model is used for predicting fraud levels.
In one embodiment, after inputting the communication features of the person-certificate separation user into the fraud user identification model to obtain the fraud early warning level output by the fraud user identification model, the method further comprises:
processing the telephone number of the person-certificate separation user according to the fraud early warning level.
In one embodiment, the fraud early warning level comprises at least a first early warning level, a second early warning level and a third early warning level; processing the telephone number of the person-certificate separation user according to the fraud early warning level comprises any one of the following steps:
if the fraud early warning level is the first early warning level, shutting down the communication function of the telephone number of the person-certificate separation user;
if the fraud early warning level is the second early warning level, outputting the telephone number of the person-certificate separation user and receiving an audit result based on the telephone number, and if the audit result is a failure, shutting down the communication function of the telephone number of the person-certificate separation user;
and if the fraud early warning level is the third early warning level, outputting the telephone number of the person-certificate separation user and receiving an audit result based on the telephone number, and shutting down the communication function of the telephone number of the person-certificate separation user if the audit result is a failure and secondary real-name authentication is not completed within a preset time period.
In one embodiment, after the communication function of the telephone number of the person-certificate separation user is shut down, the method further comprises:
if the person-certificate separation user completes the secondary real-name authentication, restoring the communication function of the telephone number of the person-certificate separation user.
In one embodiment, after inputting the communication features of the person-certificate separation user into the fraud user identification model to obtain the fraud early warning level output by the fraud user identification model, the method further comprises:
updating a preset fraud user list based on the telephone number of the person-certificate separation user and the fraud early warning level.
In one embodiment, the fraud user identification model is constructed based on the following steps:
collecting communication characteristics of a fraud user as a positive sample set, and collecting communication characteristics of a non-fraud user as a negative sample set;
generating a plurality of decision trees through the positive sample set and the negative sample set by adopting a random forest algorithm;
and constructing a fraud user identification model based on the plurality of decision trees.
In one embodiment, after constructing the fraud user identification model based on the plurality of decision trees, the method further comprises:
acquiring the telephone numbers of users who fail the secondary real-name authentication;
constructing a fraud number library based on the telephone numbers of the users who fail the secondary real-name authentication;
training the fraud user identification model with a radial basis function, based on the fraud number library.
In a second aspect, an embodiment of the present application provides a fraud user early warning device, including:
the first acquisition module is used for acquiring operation domain data and business domain data of a user to be analyzed;
the first input module is used for inputting the operation domain data and the business domain data into a person-certificate separation user prediction model to obtain a prediction result output by the person-certificate separation user prediction model; the person-certificate separation user prediction model is used for determining whether user information is consistent with certificate information;
the second acquisition module is used for acquiring the communication features of the person-certificate separation user if the prediction result is that the user to be analyzed is a person-certificate separation user;
the second input module is used for inputting the communication features of the person-certificate separation user into a fraud user identification model to obtain the fraud early warning level output by the fraud user identification model; wherein the fraud user identification model is used for predicting fraud levels.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory storing a computer program, where the processor implements the fraud user early warning method of the first aspect when executing the program.
In a fourth aspect, an embodiment of the present application provides a storage medium, where the storage medium is a computer readable storage medium, including a computer program, where the computer program, when executed by a processor, implements the fraud user early warning method of the first aspect.
According to the fraud user early warning method and device, the electronic equipment and the storage medium, whether the user to be analyzed is a person-certificate separation user is determined by applying the person-certificate separation user prediction model to the operation domain data and the business domain data. Once the user to be analyzed is determined to be a person-certificate separation user, the fraud early warning level of that user is determined by applying the fraud user identification model to the user's communication features. This improves the accuracy of fraud user prediction, makes it convenient to intercept malicious numbers according to the fraud early warning level, and thereby improves the efficiency of fraud early warning.
Drawings
In order to more clearly illustrate the application or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a fraud user early warning method provided by an embodiment of the present application;
FIG. 2 is a radial basis function neural network structure diagram of a fraud user early warning method provided by an embodiment of the present application;
FIG. 3 is a block diagram of a fraud user alert system to which the fraud user alert method provided by the embodiment of the present application is applicable;
FIG. 4 is a schematic diagram of functional modules of an embodiment of a fraud user alert apparatus of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The fraud user early warning method and device, the electronic equipment and the storage medium of the present application are described in detail below with reference to the embodiments.
FIG. 1 is a flow chart of a fraud user early warning method according to an embodiment of the present application. Referring to FIG. 1, an embodiment of the present application provides a fraud user early warning method, which may include:
Step 100, acquiring operation domain data and business domain data of a user to be analyzed;
It should be noted that the execution subject of the fraud user early warning method provided by the embodiment of the present application may be a computer device, such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or the like.
The business domain data in the present application may include network data such as signaling, map data, alarms, faults and network resources. The operation domain data may include user data and business data such as users' consumption habits, terminal information, average revenue per user (ARPU) groupings, business content and business audience.
Step 200, inputting operation domain data and business domain data into a human-evidence separation user prediction model to obtain a prediction result output by the human-evidence separation user prediction model;
In the application, the person-certificate separation user prediction model is used for determining whether user information is consistent with certificate information.
The person-certificate separation user prediction model can be a model constructed based on a decision tree algorithm.
The decision tree algorithm is a typical classification method: it first processes the data and uses an induction algorithm to generate readable rules and a decision tree, and then uses the decision tree to analyze new data. Decision tree algorithms include the Iterative Dichotomiser 3 algorithm (ID3), the classification algorithm based on information gain ratio (C4.5), and the classification and regression tree algorithm (Classification and Regression Trees, CART); in one embodiment of the present application, the C4.5 algorithm may be used.
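As an illustration of the splitting criterion that C4.5 relies on, the following minimal sketch computes the information gain ratio of categorical attributes and picks the best one. The attribute names (`arpu_group`, `terminal_changed`) and the toy records are hypothetical; the patent does not specify which fields the prediction model uses.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_ratio(rows, labels, attribute):
    """Information gain ratio obtained by splitting on one categorical attribute."""
    total = len(rows)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Expected entropy after the split, weighted by partition size.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = -sum(len(p) / total * math.log2(len(p) / total) for p in partitions.values())
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy records; label 1 marks a person-certificate separation user.
rows = [
    {"arpu_group": "low", "terminal_changed": "yes"},
    {"arpu_group": "low", "terminal_changed": "yes"},
    {"arpu_group": "high", "terminal_changed": "no"},
    {"arpu_group": "high", "terminal_changed": "no"},
    {"arpu_group": "low", "terminal_changed": "no"},
    {"arpu_group": "high", "terminal_changed": "yes"},
]
labels = [1, 1, 0, 0, 0, 1]

# C4.5 would split first on the attribute with the highest gain ratio.
best_attribute = max(rows[0], key=lambda a: gain_ratio(rows, labels, a))
```

In this toy data `terminal_changed` separates the two classes perfectly, so it is selected over `arpu_group`.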
The prediction result may be either that the user to be analyzed is a person-certificate separation user or that the user to be analyzed is not a person-certificate separation user.
Step 300, if the prediction result is that the user to be analyzed is a person-certificate separation user, acquiring the communication features of the person-certificate separation user;
The communication features in the present application may include, but are not limited to, one or more of the following: user home location, payment amount, contract customer, average call duration, high-risk calling frequency, real-name user, customer balance, campus customer, number of strangers called, frequency of high-risk calls to strangers, account opening channel, time on network, number of one-card-multiple-number users, roaming call frequency, number of communication circles, user star rating, customer credit line, number of voice calls, roaming calling frequency, number of contacts outside the communication circle, user brand, group key person, calling frequency, roaming to a high-risk location, called number dispersion, user package, 4G customer, number of long outgoing voice calls, personal identification information, number of telephone numbers under the same identity card, network access dispersion, number of calls made at the place of residence, and high-risk location.
Each communication feature of a single user corresponds to a fixed attribute value; for example, the attribute value of the user home location is Beijing and that of the payment amount is 100 yuan.
Step 400, inputting the communication features of the person-certificate separation user into the fraud user identification model to obtain the fraud early warning level output by the fraud user identification model.
In the present application, the fraud user identification model is used to predict fraud levels.
According to the application, one or more of the communication features of the person-certificate separation user can be randomly combined and then input into the fraud user identification model.
The fraud user identification model in the application can be an identification model constructed from a plurality of decision trees, the decision trees being generated from a positive sample set and a negative sample set by a random forest algorithm.
The fraud early warning level in the present application may include, but is not limited to, a first early warning level, a second early warning level, a third early warning level, etc.
In the application, the first early warning level can be a high-risk early warning level, the second early warning level can be a medium-risk early warning level, and the third early warning level can be a low-risk early warning level. The criteria for distinguishing the high, medium and low levels can be set according to actual requirements.
According to the fraud user early warning method provided by the embodiment of the application, whether the user to be analyzed is a person-certificate separation user is determined by applying the person-certificate separation user prediction model to the operation domain data and the business domain data. Once the user to be analyzed is determined to be a person-certificate separation user, the fraud early warning level of that user is determined by applying the fraud user identification model to the user's communication features. This improves the accuracy of fraud user prediction, makes it convenient to intercept malicious numbers according to the fraud early warning level, and thereby improves the efficiency of fraud early warning.
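The two-stage flow of steps 100 to 400 can be sketched as follows. This is only an illustrative outline: the function name `early_warning`, the record format and the stub models are assumptions, since the patent does not fix a data format or a programming interface.

```python
from typing import Callable, Mapping, Optional

# Hypothetical record format: a mapping from field name to value.
Features = Mapping[str, object]

def early_warning(
    operation_data: Features,
    business_data: Features,
    separation_model: Callable[[Features, Features], bool],
    load_communication_features: Callable[[], Features],
    fraud_model: Callable[[Features], str],
) -> Optional[str]:
    """Return a fraud early warning level, or None if the user is not flagged."""
    # Step 200: is the user information inconsistent with the certificate information?
    if not separation_model(operation_data, business_data):
        return None  # not a person-certificate separation user, so stop here
    # Step 300: communication features are fetched only for flagged users.
    features = load_communication_features()
    # Step 400: the second model grades the fraud risk.
    return fraud_model(features)

# Usage with stub models (hypothetical rules, for illustration only).
load_calls = []

def stub_load_features():
    load_calls.append("loaded")
    return {"called_number_dispersion": 0.9}

level = early_warning(
    {"terminal_changed": True},
    {"alarm_count": 3},
    separation_model=lambda op, biz: bool(op["terminal_changed"]),
    load_communication_features=stub_load_features,
    fraud_model=lambda f: "first" if f["called_number_dispersion"] > 0.8 else "third",
)
```

Passing the feature loader as a callable reflects the order of the steps: the second model and its inputs are only needed for users the first model has flagged.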
In one embodiment, after inputting the communication features of the person-certificate separation user into the fraud user identification model to obtain the fraud early warning level output by the fraud user identification model, the method further comprises:
Step 500, processing the telephone number of the person-certificate separation user according to the fraud early warning level.
After the fraud early warning level of the person-certificate separation user is obtained, whether the communication function of the user's telephone number needs to be shut down can be determined according to that level, so as to reduce the social harm caused by fraud users.
Specifically, processing the telephone number of the person-certificate separation user according to the fraud early warning level comprises any one of the following steps:
Step 501, if the fraud early warning level is the first early warning level, shutting down the communication function of the telephone number of the person-certificate separation user;
If the fraud early warning level is determined to be the high-risk level, the communication function of the telephone number of the person-certificate separation user is shut down.
It should be noted that a telephone number whose communication function has been shut down cannot be used for communication until the communication function is restored.
In this way, the social harm caused by fraud users can be reduced.
Step 502, if the fraud early warning level is the second early warning level, outputting the telephone number of the person-certificate separation user and receiving an audit result based on the telephone number, and if the audit result is a failure, shutting down the communication function of the telephone number of the person-certificate separation user;
If the fraud early warning level is determined to be the medium-risk level, the telephone number of the medium-risk user, that is, the person-certificate separation user, is output for manual audit. If the received audit result is a pass, the communication function of the user's telephone number is not shut down; if the received audit result is a failure, the communication function of the telephone number of the person-certificate separation user is shut down.
Step 503, if the fraud early warning level is the third early warning level, outputting the telephone number of the person-certificate separation user and receiving an audit result based on the telephone number, and shutting down the communication function of the telephone number of the person-certificate separation user if the audit result is a failure and the secondary real-name authentication is not completed within a preset time period.
If the fraud early warning level is determined to be the low-risk level, the telephone number of the low-risk user is output for manual audit. If the received audit result is a pass, the communication function of the user's number is not shut down. If the received audit result is a failure, it is determined whether the user completes the secondary real-name authentication within the preset time period: if so, the communication function of the user's number is not shut down; if not, the communication function of the telephone number of the person-certificate separation user is shut down.
Preferably, the preset time period in the present application may be 3 hours. A user who completes the secondary real-name authentication within 3 hours avoids having the communication function of the number shut down.
In this embodiment, different processing modes are adopted for high-risk, medium-risk and low-risk users, so that fraud is effectively controlled; setting a preset time period for low-risk users reduces the impact on customers and improves the user experience.
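The decision logic of steps 501 to 503 can be summarised in a small function. This is a sketch under assumptions: the names `Level` and `should_shut_down` are hypothetical, and the audit result and the authentication outcome are passed in as plain booleans rather than obtained from a real audit workflow.

```python
from enum import Enum

class Level(Enum):
    FIRST = "high risk"
    SECOND = "medium risk"
    THIRD = "low risk"

def should_shut_down(level, audit_passed=False, reauthenticated_in_time=False):
    """Decide whether the communication function of a number is shut down.

    audit_passed: result of the manual audit (ignored for the first level).
    reauthenticated_in_time: whether secondary real-name authentication was
    completed within the preset time period (used only for the third level).
    """
    if level is Level.FIRST:
        return True  # step 501: shut down directly
    if level is Level.SECOND:
        return not audit_passed  # step 502: shut down when the audit fails
    if level is Level.THIRD:
        # Step 503: shut down only if the audit fails and the user does not
        # complete secondary real-name authentication in time.
        return not audit_passed and not reauthenticated_in_time
    raise ValueError(f"unknown early warning level: {level!r}")
```

For example, `should_shut_down(Level.THIRD, audit_passed=False, reauthenticated_in_time=True)` returns `False`, matching the case in which a low-risk user completes the secondary real-name authentication within the preset time period.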
Further, after the communication function of the telephone number of the person-certificate separation user is shut down, the method further comprises:
if the person-certificate separation user completes the secondary real-name authentication, restoring the communication function of the telephone number of the person-certificate separation user.
If it is determined that a person-certificate separation user whose telephone number has had its communication function shut down completes the secondary real-name authentication, the communication function of that user's telephone number is restored.
The users whose communication function has been shut down include high-risk, medium-risk and low-risk users. Such users can perform the secondary real-name authentication in the client.
Specifically, the user enters the mobile phone number in the client, logs in with an SMS verification code, and after a successful login performs supplementary registration using the service password or recent call records. After successful registration, the user uploads a picture of the identity card, and the system uses optical character recognition to compare the uploaded identity card picture with a pre-stored identity information base, thereby ensuring the authenticity of the user's identity information. After verifying the identity information, the system starts the camera and the recording function and generates a prompt asking the user to read out preset digits. The background uses speech recognition to check whether the digits read by the user are consistent with those issued by the system and whether they are read by the user in person; at the same time, assisted by portrait comparison and silent liveness detection, it judges whether the video was shot by the user in person and whether it shows a live person.
If the digits are read correctly and the video was shot by the user in person and shows a live person, the secondary real-name authentication is passed. The user's identity information is transmitted to the background, a background interface is called with the identity information uploaded by the user to complete the customer's information, and the communication function of the user's telephone number is restored.
This embodiment provides a secondary real-name authentication method that offers a quick recovery function for normal customers who have been misjudged, reducing the unfriendly experience and the impact on normal customers.
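The pass condition of the secondary real-name authentication can be sketched as the conjunction of the individual checks described above. The sketch assumes each check has already produced a boolean result; the names `AuthChecks`, `secondary_auth_passed` and `restore_if_authenticated` are hypothetical, and no real OCR, speech recognition or liveness library is called.

```python
from dataclasses import dataclass

@dataclass
class AuthChecks:
    id_card_matches_records: bool  # OCR result compared with the stored identity information
    digits_read_correctly: bool    # speech recognition against the prompted digits
    face_matches_id: bool          # portrait comparison
    is_live_person: bool           # silent liveness detection

def secondary_auth_passed(checks: AuthChecks) -> bool:
    """Authentication passes only if every individual check passes."""
    return all((
        checks.id_card_matches_records,
        checks.digits_read_correctly,
        checks.face_matches_id,
        checks.is_live_person,
    ))

def restore_if_authenticated(checks: AuthChecks, shut_down_numbers: set, phone_number: str) -> bool:
    """Restore the communication function of a number once authentication passes."""
    if secondary_auth_passed(checks):
        shut_down_numbers.discard(phone_number)  # number is no longer shut down
        return True
    return False
```

A number remains in `shut_down_numbers` until all four checks succeed, which mirrors the rule that the communication function is restored only after the secondary real-name authentication is passed.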
Further, after inputting the communication features of the person-certificate separation user into the fraud user identification model to obtain the fraud early warning level output by the fraud user identification model, the method further comprises:
Step 600, updating the preset fraud user list based on the telephone number of the person-certificate separation user and the fraud early warning level.
After the fraud early warning level is obtained, the telephone number of the person-certificate separation user and its fraud early warning level can be added as data to the preset fraud user list, so that an updated preset fraud user list is obtained.
The preset fraud user list of the application can be a pre-constructed list that stores information such as the telephone numbers of fraud users and their fraud early warning levels.
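One simple way to hold such a list is a mapping keyed by telephone number, so that adding a new user and changing the level of a known user are the same operation. The record layout and names below are assumptions made for illustration; the patent does not describe how the list is stored.

```python
from dataclasses import dataclass

@dataclass
class FraudRecord:
    phone_number: str
    level: str  # "first", "second" or "third" early warning level

def update_fraud_user_list(fraud_list: dict, phone_number: str, level: str) -> dict:
    """Insert a new entry, or overwrite the level of an existing entry."""
    fraud_list[phone_number] = FraudRecord(phone_number, level)
    return fraud_list

# Hypothetical numbers used only to show the behaviour.
fraud_list = {}
update_fraud_user_list(fraud_list, "13800000000", "third")
update_fraud_user_list(fraud_list, "13800000001", "first")
update_fraud_user_list(fraud_list, "13800000000", "second")  # level changed on re-evaluation
```

Keying by telephone number keeps one entry per number, so a later evaluation replaces the earlier level instead of creating a duplicate.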
In one embodiment, the data updating of the list of preset fraud users further comprises the following steps:
and acquiring high-risk user communication behaviors accessing a preset high-risk land at intervals of preset time according to the acquired operation domain position information data.
Specifically, the operation domain position information data such as the signaling of the port a, the map data and the like are collected across the system, and the communication behavior of the high-risk user accessing the preset high-risk area is tracked and roamed in real time through the operation domain position information data, wherein the preset high-risk area can be set according to the actual situation, and the specific limitation is not carried out here.
According to the high-risk communication behaviors, the suspected fraud users and the fraud early warning level of the suspected fraud users are determined through a flow calculation mode.
Specifically, according to the known attribute information of the high-risk communication behavior and related users, such as whether the high-risk communication behavior is a star-class user, the online time length, the discrete degree of the calling number and the like, the suspected fraud user and the fraud early warning level of the suspected fraud user are determined every preset time period in a flow calculation mode.
Compared with the off-line calculation, the streaming calculation mode has higher real-time performance, and has a certain time delay different from the real-time calculation.
Preferably, the fraud pre-warning level of the fraud user may be updated every 1 hour in the present embodiment.
Based on the suspected fraud user and the fraud pre-warning level of the suspected fraud user, updating the preset fraud user list.
Specifically, the preset fraud user list is updated according to the fraud users and their corresponding fraud early warning levels determined based on the streaming calculation.
In this embodiment, the high-risk communication behavior of users is evaluated by stream computing and the preset fraud user list is updated accordingly, which improves the accuracy and timeliness of fraud user prediction.
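The periodic list update described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the names `SuspectedUser` and `update_fraud_list`, the integer encoding of warning levels, and the rule that the most recently computed level overwrites the stored one are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class SuspectedUser:
    phone_number: str
    warning_level: int  # assumed encoding: 1 = first (highest) level, 3 = third level


def update_fraud_list(fraud_list: Dict[str, int],
                      batch: Iterable[SuspectedUser]) -> Dict[str, int]:
    """Return a copy of the preset list with one micro-batch of results merged in."""
    updated = dict(fraud_list)
    for user in batch:
        # Assumption: the latest computed level replaces any stored level.
        updated[user.phone_number] = user.warning_level
    return updated
```

Calling `update_fraud_list` once per preset period (for example every hour) with the suspects produced in that period would keep the list current without recomputing it from scratch.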
In one embodiment, the fraud user identification model is constructed based on the following steps:
collecting communication characteristics of a fraud user as a positive sample set, and collecting communication characteristics of a non-fraud user as a negative sample set;
Specifically, the fraud users include fraud users captured in real time and users in a fraud library pre-stored in the system. The communication characteristics of these fraud users, such as voice information, short message information, traffic information, location information, network age information, terminal roaming index information, or other communication characteristics, are collected as the positive sample set. The non-fraud users include users who have completed secondary real-name authentication and non-fraud users pre-stored in the system; their communication characteristics are collected as the negative sample set.
Generating a plurality of decision trees through a positive sample set and a negative sample set by adopting a random forest algorithm;
Random forest is a relatively recent machine learning model. Neural networks, a classical machine learning model, have a history of more than half a century; they predict accurately but are computationally expensive. Classification tree algorithms appeared in the 1980s; they classify or regress by repeatedly splitting the data in two, which greatly reduces the amount of computation. In 2001, combining classification trees into a random forest was proposed: the use of variables (columns) and of data (rows) is randomized, many classification trees are generated, and their results are aggregated. Random forests improve prediction accuracy without a significant increase in computation. They are insensitive to multicollinearity, their results are relatively robust to missing and imbalanced data, and they can handle up to several thousand explanatory variables well, so random forest is regarded as one of the best algorithms currently available.
The random forest algorithm has the following main advantages for the potential customer mining business. It performs well on data sets: because two sources of randomness are introduced, a random forest does not easily overfit and has good noise resistance, which helps in processing the very large customer information data sets of the potential customer mining business. It can process very high-dimensional data without requiring feature selection, and it adapts well to data sets: it handles both discrete and continuous data, and the data set does not need to be normalized, so the input potential customer data need no preprocessing. A variable importance ranking can be obtained quickly, in two forms: the increase in the out-of-bag (OOB) error rate, and the decrease in the Gini index at splits. Interactions between characteristic values can be detected during training. The algorithm parallelizes well and can take full advantage of a Hadoop parallel big data platform.
The random forest algorithm constructs a plurality of decision trees from the data of the positive and negative sample sets, each decision tree being used to label a high-frequency fraud number class. For each decision tree, k samples are drawn at random with replacement from the training sample set N to form a new training sample set, and g characteristic values are extracted for each sample. The classification of new data is determined by the number of votes cast by the decision trees, which forms a score, and the best set of characteristic values is then screened according to the quality of the classification. In essence, random forest is an improvement on the decision tree algorithm that combines multiple decision trees: each tree is built from an independently drawn sample, all trees in the forest have the same distribution, and the classification error depends on the classification capability of each tree and on the correlation between trees. Feature selection splits each node in a randomized way and then compares the errors produced under the different conditions. The estimated intrinsic error, classification capability, and correlation determine which valuable characteristic values are selected. A single decision tree may have little classification capability, but once a large number of trees have been generated at random, the most likely classification of a test sample, and the most valuable characteristic values, can be chosen from the statistics of the trees' classification results.
The key to constructing a decision tree is the choice of split points, which a greedy algorithm makes by considering the purity difference at the current split point. Purity is quantified with the ID3 algorithm: attributes are measured by information gain, and the attribute with the largest information gain after splitting is chosen for the split. The formula for calculating the information entropy of the set is as follows:
$$ \mathrm{info}(D) = -\sum_{i=1}^{m} p_i \log_2 p_i $$
wherein info(D) is the information entropy of the set D, p_i is the probability that the i-th category appears in set D, and m is the number of categories.
The expected information entropy after the set is divided according to a characteristic attribute is calculated as follows:
$$ \mathrm{info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\, \mathrm{info}(D_j) $$
wherein info_A(D) is the expected information entropy of D after division by A, D is the training set, A is the characteristic attribute, and D_j is the j-th of the v subsets into which A divides D.
The information gain after the set is divided according to the characteristic attribute is calculated as follows:
$$ \mathrm{gain}(A) = \mathrm{info}(D) - \mathrm{info}_A(D) $$
wherein gain(A) is the information gain obtained after division by the characteristic attribute A, info(D) is the information entropy of the set D, and info_A(D) is the expected information entropy of D after division by A.
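The entropy and information gain calculations can be illustrated with a short sketch. The function names `entropy` and `information_gain`, the dictionary representation of a sample, and the attribute names used in the usage note are hypothetical and chosen only for the example; the logarithm base 2 follows the usual ID3 convention.

```python
import math
from collections import Counter


def entropy(labels):
    """info(D): information entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """gain(A) = info(D) - info_A(D) for one categorical attribute."""
    n = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        # Group the labels by the value the attribute takes in each sample.
        partitions.setdefault(row[attribute], []).append(label)
    # info_A(D): entropy of each subset, weighted by the subset's share of D.
    expected = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - expected
```

For instance, with four samples labelled fraud, fraud, normal, normal, an attribute such as `roaming` that separates the two classes perfectly has a gain equal to the full entropy of the set, while an attribute that is evenly mixed across both classes has a gain of zero, so ID3 would split on the former.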
All characteristic values are ranked recursively by information gain to build the entire decision tree. The decision trees built by the random forest system do not need pruning, so they can fit the training data very accurately; although a single tree may then overfit and be less accurate on other data, ensemble learning in which many decision trees decide jointly avoids the overfitting of any single tree.
Specifically, a plurality of training set samples are selected, a decision tree is established according to the training set and the characteristic values thereof, and the steps are repeated to establish a plurality of decision trees. The training set is composed of a positive sample set and a negative sample set.
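The construction just described (draw k samples with replacement, consider g randomly chosen characteristics, and let the trees vote) can be sketched as below. To stay short, each "tree" here is a one-level stump rather than a full ID3 tree, and the candidate characteristic is chosen by training error rather than information gain; both are simplifications made for the example. All function names, the seed, and the default parameters are assumptions.

```python
import random
from collections import Counter


def bootstrap(rows, labels, k, rng):
    """Draw k samples with replacement from the training set."""
    idx = [rng.randrange(len(rows)) for _ in range(k)]
    return [rows[i] for i in idx], [labels[i] for i in idx]


def fit_stump(rows, labels, feature):
    """One-level tree: map each value of `feature` to its majority label."""
    counts = {}
    for row, label in zip(rows, labels):
        counts.setdefault(row[feature], Counter())[label] += 1
    default = Counter(labels).most_common(1)[0][0]
    table = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    return lambda row: table.get(row[feature], default)


def fit_tree(rows, labels, features, g, rng):
    """Try g randomly chosen features and keep the stump with the fewest errors."""
    best_errors, best_stump = None, None
    for feature in rng.sample(features, g):
        stump = fit_stump(rows, labels, feature)
        errors = sum(stump(r) != y for r, y in zip(rows, labels))
        if best_errors is None or errors < best_errors:
            best_errors, best_stump = errors, stump
    return best_stump


def fit_forest(rows, labels, n_trees=25, k=None, g=1, seed=0):
    rng = random.Random(seed)
    features = sorted(rows[0])
    k = k or len(rows)
    forest = []
    for _ in range(n_trees):
        sample_rows, sample_labels = bootstrap(rows, labels, k, rng)
        forest.append(fit_tree(sample_rows, sample_labels, features, g, rng))
    return forest


def predict(forest, row):
    """Majority vote over all trees."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]
```

A production system would more likely rely on an existing random forest implementation; the sketch only makes the two sources of randomness (rows and columns) and the voting step explicit.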
A fraud user identification model is built based on the plurality of decision trees.
Specifically, each decision tree makes a decision on the training set data to determine the fraud early warning level of the user; the classification result of each decision tree is evaluated, a subset of characteristic types is screened out, and the suspected fraud user identification model is constructed.
Further, after constructing the fraud user identification model based on the plurality of decision trees, it further comprises:
acquiring the telephone number of the user which fails the secondary real-name authentication;
constructing a fraud number library based on telephone numbers of users which do not pass the secondary real-name authentication;
based on the fraud number library, a radial basis function is adopted to train a fraud user identification model.
Specifically, to improve the identification accuracy of the fraud user identification model and to optimize its characteristic variables, number of parameters, and parameter weight coefficients intelligently, this embodiment combines historical data on the numbers identified as high-risk, medium-risk, and low-risk after secondary real-name authentication, including the authentication conclusion and the basic attributes, roaming attributes, behavior attributes, and location tracks of the fraud numbers, and adds the telephone numbers of users who do not pass real-name authentication to the fraud number library.
Fig. 2 is a structure diagram of the radial basis function neural network used in the fraud user early warning method according to an embodiment of the present application. As shown in Fig. 2, the radial basis function neural network includes an input layer, a hidden layer, and an output layer. The transformation from the input layer to the hidden layer is nonlinear, and the output layer is a linear weighted combination of the hidden neuron outputs.
When a Gaussian function is selected as the radial basis function, the output formula of the radial basis function neural network, written with the Gaussian in its standard form, is as follows:
$$ \hat{y}_j(t) = \sum_{i} w_{ij}\, G_i\big(x(t)\big), \qquad G_i(x) = \exp\!\left(-\frac{\lVert x - \mu_i \rVert^2}{2\sigma_i^2}\right) $$
wherein \hat{y}_j(t) is the j-th output, x(t) is the input vector, w_ij is the synaptic weight between the i-th hidden neuron and the j-th output neuron, G_i is the Gaussian function of the i-th hidden neuron, and μ_i and σ_i are the center and width of the corresponding Gaussian function.
The main task of the algorithm is to estimate the three parameters w_ij, μ_i, and σ_i of the radial basis function neural network. Given a group of input-output data pairs for training, the parameters w_ij, μ_i, and σ_i are adjusted so as to minimize the value of J, which yields their estimates. J measures the squared error between the given and the computed outputs over the training pairs:
$$ J = \sum_{k} \big\lVert y^{(k)} - \hat{y}^{(k)} \big\rVert^2 $$
wherein y^(k) is the output data of the k-th input-output data pair, and \hat{y}^(k) is the output calculated from the input data of that pair with the output formula of the radial basis function neural network.
After the radial basis function neural network has been trained successfully, the previously unknown parameters w_ij, μ_i, and σ_i of the model are obtained, and the output formula can then be used to derive the predicted output \hat{y}(t) from an input vector x(t).
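A forward pass through such a network, together with a squared-error cost, can be sketched as follows. The exact normalization of the Gaussian and of the cost is not recoverable from the text, so the standard forms exp(-||x - μ||² / (2σ²)) and a plain sum of squared errors are assumed here; the function names and the list-of-lists layout of `weights` are likewise hypothetical.

```python
import math


def gaussian(x, mu, sigma):
    """Gaussian radial basis function with center mu and width sigma (assumed form)."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x, mu))
    return math.exp(-dist2 / (2.0 * sigma ** 2))


def rbf_forward(x, centers, widths, weights):
    """Compute all outputs; weights[i][j] links hidden neuron i to output j."""
    hidden = [gaussian(x, mu, s) for mu, s in zip(centers, widths)]
    n_out = len(weights[0])
    return [sum(weights[i][j] * hidden[i] for i in range(len(hidden)))
            for j in range(n_out)]


def cost(samples, centers, widths, weights):
    """Sum of squared errors over (input, output) training pairs (assumed form of J)."""
    total = 0.0
    for x, y in samples:
        y_hat = rbf_forward(x, centers, widths, weights)
        total += sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return total
```

Training would adjust `centers`, `widths`, and `weights` to reduce `cost`, for example by gradient descent; the optimizer is outside the scope of this sketch.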
In this embodiment, the algorithm uses the historical data of identified suspected fraud clients, such as basic attributes, roaming attributes, behavior attributes, and location tracks, to predict the communication behavior of a client accurately; the predicted behavior then serves as input data for the suspected fraud client identification model, which improves identification accuracy.
Further, the application also provides a fraud user early warning system.
Fig. 3 is a schematic diagram of a fraud user early warning system to which the fraud user early warning method provided by the embodiment of the present application is applicable. As shown in Fig. 3, the system includes a person-certificate separation identifier, a suspected fraud identification module, a hierarchical classification treatment module, a fraud number library, a suspected fraud number identification optimizer, a real-name authentication handler, and a fraud-related number communication restoration handler. In one embodiment, the system operates through the following steps:
Step 700: the mobile client number to be analyzed, entered through the client information input interface, is acquired.
Step 701: the characteristics of the client are analyzed and extracted, and the client information is then input to the person-certificate separation identifier and to the suspected fraud identification module respectively.
Step 702: the person-certificate separation identifier identifies numbers whose real-name holder and actual user differ. The identified high-risk suspected separation numbers are passed to the suspected fraud identification module as an important factor, while medium-risk and low-risk suspected separation numbers go directly to real-name authentication treatment.
Step 703: the suspected fraud identification module, which comprises a suspected fraud number identifier and a quasi-real-time fraud number identifier, identifies suspected fraud numbers among the input clients and outputs them, by level and class, to the hierarchical classification treatment module.
Step 704: for identified high-risk fraud numbers, the high-risk fraud number shutdown handler automatically shuts down the calling, short message sending, and internet access functions. The medium-risk fraud number shutdown handler provides business personnel with a medium-risk number auditing function; if the audit concludes that the number should be shut down immediately, the handler automatically shuts down the calling, short message sending, and internet access functions of the number. The low-risk fraud number shutdown handler provides business personnel with a low-risk number auditing function; if the audit finds the number fraud-related, the system grants a 3-hour call window, and for numbers that do not perform secondary authentication or fail it within that window, the handler automatically shuts down the calling, short message sending, and internet access functions.
Step 705: after the hierarchical classification treatment module has treated the fraud-related numbers, the numbers are submitted to the real-name authentication handler; the user submits real-name authentication information, and the system then judges whether the number passes real-name authentication.
Step 706: numbers that pass real-name authentication are passed to the fraud-related number communication restoration handler, which restores the calling, short message sending, and internet access functions of numbers for which they were suspended. Numbers that do not pass are added to the fraud number library; the numbers in the library are passed to the suspected fraud number identification optimizer for number attribute and behavior analysis, and the suspected fraud number identification model is optimized.
Step 707: the results from the real-name authentication handler are presented to the user.
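The two routing decisions in this flow, namely where a suspected separation number goes after step 702 and what happens to a number after authentication in step 706, can be sketched as follows. The function names, the string risk labels, the returned destination names, and the use of in-memory sets for the suspended numbers and the fraud number library are all assumptions made for the example.

```python
def route_separation_result(risk):
    """Step 702 sketch: pick the next module for a suspected separation number."""
    if risk == "high":
        return "suspected_fraud_identification"
    if risk in ("medium", "low"):
        return "real_name_authentication"
    raise ValueError(f"unknown risk label: {risk!r}")


def route_after_authentication(number, passed, suspended, fraud_library):
    """Step 706 sketch: restore service, or record the number for model optimization."""
    if passed:
        # Restoring service is modelled as removing the number from the suspended set.
        suspended.discard(number)
        return "restored"
    # Failed numbers feed the fraud number library used by the optimizer.
    fraud_library.add(number)
    return "added_to_fraud_library"
```

Keeping the failed numbers in a separate library, rather than discarding them, is what closes the loop: they become new labelled data for optimizing the identification model.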
This embodiment of the invention uses the real-name-authentication-based method and device for predicting suspected telecom fraud users to realize a closed loop of signaling collection, signaling transmission, fraud number detection, fraud number shutdown, convenient secondary real-name authentication, and rapid automatic reactivation, with the whole system processed automatically. A complete process of fraud monitoring, analysis, early warning, and treatment is built and the line of defense is moved forward, so that emerging and trend-type problems are dealt with promptly. Fraud numbers are identified with high precision and shut down efficiently, which effectively curbs the rising trend of fraud reports and reduces customer complaints about misjudged numbers.
Further, the application also provides a device for early warning the fraud user.
Referring to fig. 4, fig. 4 is a schematic diagram of functional modules of an embodiment of a fraud user early warning device according to the present application.
The fraud user early warning device includes:
a first obtaining module 410, configured to obtain operation domain data and service domain data of a user to be analyzed;
a first input module 420, configured to input the operation domain data and the service domain data to a person-certificate separation user prediction model and obtain the prediction result output by that model, wherein the person-certificate separation user prediction model is used for determining whether user information is consistent with certificate information;
a second obtaining module 430, configured to obtain the communication characteristics of the person-certificate separation user if the prediction result indicates that the user to be analyzed is a person-certificate separation user;
a second input module 440, configured to input the communication characteristics of the person-certificate separation user to a fraud user identification model and obtain the fraud early warning level output by that model, wherein the fraud user identification model is used for predicting fraud levels.
According to the fraud user early warning device provided by the embodiment of the application, the person-certificate separation user prediction model, combined with the operation domain data and service domain data, determines whether the user to be analyzed is a person-certificate separation user; once this is established, the fraud user identification model, combined with the user's communication characteristics, determines the fraud early warning level. This improves the accuracy of fraud user prediction, makes it possible to intercept malicious numbers according to the fraud early warning level, and thereby improves the efficiency of fraud early warning.
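The way the four modules chain together can be sketched as a single function in which the two models are passed in as callables. Everything here is hypothetical scaffolding: the name `early_warning`, the callable signatures, and the convention of returning `None` for a user who is not a person-certificate separation user are assumptions, and the models themselves are not implemented.

```python
from typing import Any, Callable, Mapping, Optional


def early_warning(operation_domain: Mapping[str, Any],
                  service_domain: Mapping[str, Any],
                  separation_model: Callable[[Mapping[str, Any], Mapping[str, Any]], bool],
                  fetch_features: Callable[[], Mapping[str, Any]],
                  fraud_model: Callable[[Mapping[str, Any]], int]) -> Optional[int]:
    """Return a fraud early warning level, or None if no separation is predicted."""
    # First model: does the actual user differ from the certificate holder?
    if not separation_model(operation_domain, service_domain):
        return None
    # Communication characteristics are fetched only for separation users.
    features = fetch_features()
    # Second model: map the characteristics to a warning level.
    return fraud_model(features)
```

Gating the second model on the first means communication characteristics are collected and scored only for the smaller population of suspected separation users.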
In one embodiment, the second input module 440 is further configured to:
and processing the telephone number of the person and certificate separation user according to the fraud early warning level.
In one embodiment, the second input module 440 includes a processing unit for:
if the fraud early warning level is the first early warning level, carrying out communication function shutdown processing on the telephone number of the person and certificate separation user;
if the fraud early warning level is the second early warning level, outputting the telephone number of the person and certificate separation user and receiving an auditing result based on the telephone number, and if the auditing result is not passed, performing communication function shutdown processing on the telephone number of the person and certificate separation user;
and if the fraud early warning level is a third early warning level, outputting the telephone number of the person and certificate separation user, receiving an auditing result based on the telephone number, and performing communication function shutdown processing on the telephone number of the person and certificate separation user under the condition that the auditing result is not passed and secondary real-name authentication is not completed within a preset time.
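The three-level handling rule can be written as one decision function. This is a sketch under stated assumptions: the enum `WarningLevel`, the function name `should_shut_down`, and the boolean parameters `audit_passed` and `reauthenticated_in_time` are introduced for illustration only.

```python
from enum import Enum


class WarningLevel(Enum):
    FIRST = 1
    SECOND = 2
    THIRD = 3


def should_shut_down(level: WarningLevel,
                     audit_passed: bool = True,
                     reauthenticated_in_time: bool = False) -> bool:
    """Decide whether the communication functions of a number are shut down."""
    if level is WarningLevel.FIRST:
        # First level: shut down without manual audit.
        return True
    if level is WarningLevel.SECOND:
        # Second level: shut down only if the audit result is "not passed".
        return not audit_passed
    # Third level: additionally requires that secondary real-name
    # authentication was not completed within the preset time.
    return (not audit_passed) and (not reauthenticated_in_time)
```

The only difference between the second and third levels is the grace period: a third-level number that fails the audit still keeps service if its user completes secondary real-name authentication in time.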
In an embodiment, the processing unit further comprises a recovery unit for:
And if the person and certificate separation user completes the secondary real-name authentication, recovering the communication function of the telephone number of the person and certificate separation user.
In one embodiment, the second input module 440 is further configured to:
and based on the phone number of the person certificate separation user and the fraud early warning level, updating data of a preset fraud user list.
Fig. 5 illustrates a physical schematic diagram of an electronic device, as shown in fig. 5, which may include: processor 510, communication interface (Communication Interface) 520, memory 530, and communication bus 540, wherein processor 510, communication interface 520, memory 530 complete communication with each other through communication bus 540. Processor 510 may invoke a computer program in memory 530 to perform the steps of the fraud user pre-warning method, including, for example:
acquiring operation domain data and service domain data of a user to be analyzed;
inputting the operation domain data and the service domain data into a person-certificate separation user prediction model to obtain a prediction result output by the person-certificate separation user prediction model, wherein the person-certificate separation user prediction model is used for determining whether user information is consistent with certificate information;
if the prediction result is that the user to be analyzed is a person-certificate separation user, acquiring communication characteristics of the person-certificate separation user;
inputting the communication characteristics of the person-certificate separation user to a fraud user identification model to obtain the fraud early warning level output by the fraud user identification model, wherein the fraud user identification model is used for predicting fraud levels.
Further, the logic instructions in the memory 530 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In another aspect, an embodiment of the present application further provides a storage medium, where the storage medium is a computer readable storage medium, where the computer readable storage medium stores a computer program, where the computer program is configured to cause a processor to execute the steps of the method provided in the foregoing embodiments, where the method includes:
acquiring operation domain data and service domain data of a user to be analyzed;
inputting the operation domain data and the service domain data into a person-certificate separation user prediction model to obtain a prediction result output by the person-certificate separation user prediction model, wherein the person-certificate separation user prediction model is used for determining whether user information is consistent with certificate information;
if the prediction result is that the user to be analyzed is a person-certificate separation user, acquiring communication characteristics of the person-certificate separation user;
inputting the communication characteristics of the person-certificate separation user to a fraud user identification model to obtain the fraud early warning level output by the fraud user identification model, wherein the fraud user identification model is used for predicting fraud levels.
The computer readable storage medium may be any available medium or data storage device that can be accessed by a processor including, but not limited to, magnetic memory (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical memory (e.g., CD, DVD, BD, HVD, etc.), and semiconductor memory (e.g., ROM, EPROM, EEPROM, nonvolatile memory (NAND FLASH), solid State Disk (SSD)), etc.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A fraud user early warning method, comprising:
acquiring operation domain data and service domain data of a user to be analyzed;
inputting the operation domain data and the service domain data into a person-certificate separation user prediction model to obtain a prediction result output by the person-certificate separation user prediction model, wherein the person-certificate separation user prediction model is used for determining whether user information is consistent with certificate information;
if the prediction result is that the user to be analyzed is a person-certificate separation user, acquiring communication characteristics of the person-certificate separation user;
inputting the communication characteristics of the person-certificate separation user to a fraud user identification model to obtain the fraud early warning level output by the fraud user identification model, wherein the fraud user identification model is used for predicting fraud levels.
2. The fraud user alert method of claim 1, further comprising, after inputting the communication features of the person-separation user to a fraud user identification model to obtain a fraud alert level output by the fraud user identification model:
and processing the telephone number of the person and certificate separation user according to the fraud early warning level.
3. The fraud user alert method of claim 2, wherein the fraud alert level includes at least a first alert level, a second alert level, and a third alert level; the telephone number of the person and certificate separation user is processed according to the fraud early warning level, and the telephone number comprises any one of the following steps:
if the fraud early warning level is the first early warning level, carrying out communication function shutdown processing on the telephone number of the person and certificate separation user;
if the fraud early warning level is the second early warning level, outputting the telephone number of the person and certificate separation user and receiving an auditing result based on the telephone number, and if the auditing result is not passed, performing communication function shutdown processing on the telephone number of the person and certificate separation user;
And if the fraud early warning level is a third early warning level, outputting the telephone number of the person and certificate separation user, receiving an auditing result based on the telephone number, and performing communication function shutdown processing on the telephone number of the person and certificate separation user under the condition that the auditing result is not passed and secondary real-name authentication is not completed within a preset time.
4. The fraud user alert method of claim 3, further comprising, after the communication function disabling process is performed on the phone number of the person and certificate separation user:
and if the person and certificate separation user completes the secondary real-name authentication, recovering the communication function of the telephone number of the person and certificate separation user.
5. The fraud user alert method of claim 1, further comprising, after inputting the communication features of the person-separation user to a fraud user identification model to obtain a fraud alert level output by the fraud user identification model:
and based on the phone number of the person certificate separation user and the fraud early warning level, updating data of a preset fraud user list.
6. The fraud user alert method of claim 1, wherein the fraud user identification model is constructed based on the steps of:
Collecting communication characteristics of a fraud user as a positive sample set, and collecting communication characteristics of a non-fraud user as a negative sample set;
generating a plurality of decision trees through the positive sample set and the negative sample set by adopting a random forest algorithm;
and constructing a fraud user identification model based on the plurality of decision trees.
7. The fraud user alert method of claim 6, further comprising, after constructing a fraud user identification model based on the plurality of decision trees:
acquiring the telephone number of the user which fails the secondary real-name authentication;
constructing a fraud number library based on the telephone numbers of the users which do not pass the secondary real-name authentication;
based on the fraud number library, training the fraud user identification model by adopting a radial basis function.
8. A fraud user alert device, comprising:
the first acquisition module is used for acquiring operation domain data and business domain data of a user to be analyzed;
the first input module is used for inputting the operation domain data and the service domain data into a person-certificate separation user prediction model to obtain a prediction result output by the person-certificate separation user prediction model, wherein the person-certificate separation user prediction model is used for determining whether user information is consistent with certificate information;
the second obtaining module is used for obtaining the communication characteristics of the person-certificate separation user if the prediction result is that the user to be analyzed is a person-certificate separation user;
the second input module is used for inputting the communication characteristics of the person-certificate separation user into a fraud user identification model to obtain the fraud early warning level output by the fraud user identification model, wherein the fraud user identification model is used for predicting fraud levels.
9. An electronic device comprising a processor and a memory storing a computer program, characterized in that the processor implements the fraud user early warning method of any of claims 1 to 7 when executing the computer program.
10. A storage medium, which is a computer readable storage medium comprising a computer program, characterized in that the computer program, when executed by a processor, implements the fraud user early warning method of any of claims 1 to 7.
CN202310930064.0A 2023-07-27 2023-07-27 Fraud user early warning method and device, electronic equipment and storage medium Pending CN116963072A (en)

Publications (1)

Publication Number Publication Date
CN116963072A (en) 2023-10-27

Family

ID=88450917

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination