CN108898418B - User account detection method, device, computer equipment and storage medium - Google Patents
- Publication number
- CN108898418B (application number CN201810547162.5A)
- Authority
- CN
- China
- Prior art keywords
- user account
- historical
- obtaining
- user
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0248—Avoiding fraud
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Strategic Management (AREA)
- Finance (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- Marketing (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application relates to a user account detection method, a device, computer equipment and a storage medium. The method comprises the following steps: acquiring user account data, and acquiring user characteristic attributes according to the user account data; inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic; and obtaining a user account detection result according to the output characteristics. By adopting the method, the abnormal user account can be effectively detected.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and apparatus for detecting a user account, a computer device, and a storage medium.
Background
Most internet platforms run promotions to increase user activity, such as red packets for registration, coupons, cashback on purchases, returned coupons, and discounted campaign prices. However, abnormal user accounts may claim these benefits in batches or obtain them by exploiting vulnerabilities, so the benefits do not reach normal user accounts and the platform suffers a large economic loss. Protective measures such as verification codes and SMS verification codes are currently used to prevent accounts from claiming benefits in batches or exploiting platform vulnerabilities, but these measures have shortcomings: users of abnormal accounts can easily bypass them to obtain benefits, causing heavy losses to the internet platform.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a user account detection method, apparatus, computer device, and storage medium capable of effectively detecting an abnormal user account.
A method for detecting a user account, the method comprising:
acquiring user account data, and acquiring user characteristic attributes according to the user account data;
inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic;
and obtaining a user account detection result according to the output characteristics.
In one embodiment, the step of generating the preset user account classifier includes:
acquiring historical user account data and a corresponding detection result, wherein the detection result comprises a historical normal user account and a historical abnormal user account;
counting the historical user account number, the historical normal user account number and the historical abnormal user account number according to the historical user account data and the corresponding detection results, and calculating the historical normal user account frequency and the historical abnormal user account frequency;
obtaining corresponding historical user characteristics according to the historical user account data, and dividing the historical user characteristics according to preset conditions to obtain items to be classified;
and counting the number of historical user accounts, the number of historical normal user accounts and the number of historical abnormal user accounts corresponding to the items to be classified, and calculating the conditional probability that the historical user account corresponding to the items to be classified is the historical normal user account and the conditional probability that the historical user account is the historical abnormal user account, so as to obtain a preset user account classifier.
In one embodiment, the inputting the user feature attribute into a preset user account classifier to obtain an output feature includes:
acquiring a target item to be classified corresponding to a user characteristic attribute;
acquiring a conditional probability corresponding to a target item to be classified, and respectively calculating the probability that the user account is a normal user account and the probability that the user account is an abnormal user account according to the conditional probability by using a Bayesian theorem;
and comparing the probability of the normal user account with the probability of the abnormal user account, and obtaining output characteristics according to the comparison result.
In one embodiment, the method further comprises:
obtaining user account receiving address information according to the user account data;
performing word segmentation on the user account receiving address information to obtain word segmentation results, inputting the word segmentation results into a clustering model to obtain classification results, and obtaining receiving address similarity according to the classification results;
obtaining a suspected abnormal user account according to the similarity of the receiving addresses;
the acquiring of user account data then includes:
and obtaining user account data of the suspected abnormal user account.
In one embodiment, the inputting the word segmentation result into the clustering model to obtain the classification result includes:
grouping the user account receiving address information according to preset conditions, calculating the total number of groups, and calculating the clustering number according to the total number of groups;
obtaining target words of the clustering number from the word segmentation result as an initial cluster center, and taking the initial cluster center as a current cluster center;
acquiring other words except the target word in the word segmentation result, and calculating the distance from the other words except the target word to the center of the current cluster;
distributing other words except the target word into the cluster corresponding to the center of the current cluster according to the distance to obtain a target cluster with the clustering number;
and calculating a target cluster center of the target cluster, taking the target cluster center as a current cluster center, and returning to the step of calculating the distance from the other words except the target word to the current cluster center for repeated clustering until a convergence condition is met, so as to obtain a classification result.
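The iteration described above can be sketched as follows. This is only an illustration under several assumptions that the text does not state: each word is assumed to have already been mapped to a numeric vector, distance is Euclidean, the cluster count `k` is passed in rather than derived from the group total, and "assignments no longer change" is used as the convergence condition. The function name and sample vectors are hypothetical. For simplicity the words chosen as initial centers are assigned along with the others; each lands in its own cluster at distance zero.

```python
import math

def kmeans(points, k, max_iter=100):
    """Cluster vectors, using the first k points as the initial cluster centers."""
    centers = [points[i] for i in range(k)]
    assignment = None
    for _ in range(max_iter):
        # Assign every point to the cluster whose current center is nearest.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # assumed convergence condition: assignments stopped changing
        assignment = new_assignment
        # Recompute each cluster center as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment, centers

# Hypothetical 2-D word vectors forming two well-separated groups.
word_vectors = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
assignment, centers = kmeans(word_vectors, k=2)
print(assignment)
```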
In one embodiment, the method further comprises:
obtaining an input feature vector according to user account data, and inputting the input feature vector into a preset user account detection model to obtain an output feature vector;
and obtaining a user account detection result according to the output feature vector.
In one embodiment, the generating step of the preset user account detection model includes:
acquiring historical user account data and corresponding detection results, acquiring a historical input feature vector according to the historical user account data, and acquiring a historical output feature vector according to the detection results;
taking the historical input feature vector as the input of the logistic regression model, and taking the historical output feature vector as the label of the logistic regression model for training;
and when the cost function of the logistic regression model reaches a preset threshold, obtaining a preset user account detection model.
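A minimal sketch of this training loop, assuming batch gradient descent on the average cross-entropy cost; the learning rate, threshold value, epoch cap and sample vectors are hypothetical and are not taken from the text.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def cost(weights, bias, X, y):
    """Average cross-entropy cost over the training samples."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, label in zip(X, y):
        p = predict(weights, bias, x)
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(X)

def train(X, y, threshold=0.1, lr=0.5, max_epochs=10000):
    """Run gradient descent until the cost reaches the preset threshold."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(max_epochs):
        if cost(weights, bias, X, y) <= threshold:
            break
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, label in zip(X, y):
            err = predict(weights, bias, x) - label
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias

# Hypothetical historical input feature vectors and labels (1 = normal, 0 = abnormal).
X = [[0.1, 0.0], [0.2, 0.1], [0.9, 1.0], [0.8, 0.9]]
y = [0, 0, 1, 1]
weights, bias = train(X, y)
predictions = [1 if predict(weights, bias, x) >= 0.5 else 0 for x in X]
print(predictions)
```

The epoch cap is a safeguard added here so the loop ends even if the threshold is never reached; the text itself only mentions the threshold.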
A user account detection apparatus, the apparatus comprising:
the characteristic attribute obtaining module is used for obtaining user account data and obtaining user characteristic attributes according to the user account data;
the output characteristic obtaining module is used for inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic;
and the detection result obtaining module is used for obtaining a user account detection result according to the output characteristics.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of:
acquiring user account data, and acquiring user characteristic attributes according to the user account data;
inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic;
and obtaining a user account detection result according to the output characteristics.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring user account data, and acquiring user characteristic attributes according to the user account data;
inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic;
and obtaining a user account detection result according to the output characteristics.
According to the user account detection method, device, computer equipment and storage medium, user account data is acquired and user characteristic attributes are obtained from it; the user characteristic attributes are input into a preset user account classifier to obtain an output characteristic; and a user account detection result is obtained according to the output characteristic. Detecting the user account data with the preset user account classifier in this way yields the user account detection result, so that abnormal user accounts can be effectively detected.
Drawings
FIG. 1 is an application scenario diagram of a user account detection method in one embodiment;
FIG. 2 is a flow chart of a user account detection method according to an embodiment;
FIG. 3 is a flowchart of a method for obtaining a preset user account classifier in one embodiment;
FIG. 4 is a flow diagram of obtaining output features in one embodiment;
FIG. 5 is a schematic flow chart of obtaining a suspected abnormal user account in one embodiment;
FIG. 6 is a flow chart of clustering shipping addresses according to user account numbers in one embodiment;
FIG. 7 is a flowchart of a user account detection method according to another embodiment;
FIG. 8 is a flowchart of a method for obtaining a preset user account detection model in an embodiment;
FIG. 9 is a block diagram of a user account detection device according to an embodiment;
FIG. 10 is an internal structural view of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The user account detection method provided by the application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. The server 104 obtains user account data sent by the terminal 102, and obtains user characteristic attributes according to the user account data; inputs the user characteristic attributes into a preset user account classifier to obtain an output characteristic; and obtains a user account detection result according to the output characteristic. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device, and the server 104 may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
In one embodiment, as shown in fig. 2, a method for detecting a user account is provided, and the method is applied to the server in fig. 1 for illustration, and includes the following steps:
S202, obtaining user account data, and obtaining user characteristic attributes according to the user account data.
The user account data comprises account basic attributes, device information, user behavior data and business data. The account basic attributes reflect personal information of a user and can include an account name, a mobile phone number, a bank card number, a name, an age, a gender, an identity card, an address and the like. The device information describes the parameters of the devices used by a user and can include parameters of devices such as a mobile phone, a tablet computer, a notebook computer or a PC (personal computer), as well as device fingerprints of devices the user frequently uses. The user behavior data refers to data generated when a user performs operations on a webpage or a client, and can include the time the user stays on a webpage, the user's access sequence, operation frequency, key information and the like. Business data refers to data generated during business activities; for example, when the business activity is a flash sale of a commodity, the business data is the dimension attributes of that commodity.
The user characteristic attribute refers to the set of values corresponding to the user characteristics and is obtained by processing the user account data. The user characteristics include basic attribute characteristics, device information characteristics, user behavior characteristics, business characteristics and the like, and are obtained by feature extraction from historical user account data. The processing may calculate the user characteristic attribute from the user account data, or may look it up in a preset correspondence between user account data and user characteristic attributes. For example, if the basic attribute feature a1 is the ratio of the number of friends to the number of registration days, and the user account data records 100 friends and 400 registration days, the user characteristic attribute corresponding to a1 is calculated as 100/400 = 0.25. As another example, if the device information feature b1 is the device chip type and the chip type in the user account data is an X86 chip, the user characteristic attribute corresponding to b1 is obtained from the preset correspondence between device chip types and user characteristic attributes.
Specifically, with the user's authorization, different data collection methods can be used to acquire the user account data: by embedding business event-tracking points in the user terminal; by presetting a data collection script on the user terminal that starts when the platform page is loaded; or by obtaining log information from the server and extracting the user account data from it. The acquired data is then cleaned. Data cleaning is the final procedure for finding and correcting identifiable errors in a data file, and includes checking data consistency and handling invalid values and missing values. Features are extracted from the cleaned data to obtain the user characteristics, and the user characteristic attributes corresponding to those characteristics are obtained from the user account data.
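The two ways of obtaining a user characteristic attribute, calculation and lookup in a preset correspondence, can be illustrated with a short sketch. It assumes that a1 is the ratio of friend count to registration days, which reproduces the 0.25 of the example; the function names, the dictionary keys and the numeric codes in the chip-type table are hypothetical.

```python
# Hypothetical preset correspondence between chip type and attribute value.
CHIP_TYPE_TO_ATTRIBUTE = {"X86": 0, "ARM": 1}

def friends_per_registration_day(friend_count: int, registration_days: int) -> float:
    """Feature a1: calculated directly from the user account data."""
    if registration_days <= 0:
        raise ValueError("registration_days must be positive")
    return friend_count / registration_days

def chip_type_attribute(chip_type: str) -> int:
    """Feature b1: looked up in the preset correspondence."""
    return CHIP_TYPE_TO_ATTRIBUTE[chip_type]

account = {"friend_count": 100, "registration_days": 400, "chip_type": "X86"}
attributes = {
    "a1": friends_per_registration_day(account["friend_count"], account["registration_days"]),
    "b1": chip_type_attribute(account["chip_type"]),
}
print(attributes)  # {'a1': 0.25, 'b1': 0}
```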
S204, inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic.
The preset user account classifier is obtained in advance by training on historical user account data with a naive Bayes algorithm. The output feature is a feature value used to determine the detection result. Different output features, that is, different feature values, are preset for the different detection results of historical user accounts. For example, the output feature preset for the detection result "normal user account" is the feature value 1, and the output feature preset for the detection result "abnormal user account" is the feature value 0. The resulting output feature is therefore either 1 or 0.
Specifically, the user characteristic attributes obtained from the user account data are input into the user account classifier trained in advance with the naive Bayes algorithm, and the output feature corresponding to the detection result is obtained. The naive Bayes algorithm is a classification method based on Bayes' theorem and the assumption that the features are conditionally independent given the class.
S206, obtaining a user account detection result according to the output characteristics.
Specifically, according to the corresponding relation between the preset output characteristics and the historical user account detection results, the user account detection results corresponding to the output characteristics at the moment are obtained. For example, if the preset user account detection result is a normal user account, the corresponding output characteristic is 1, and when the user account detection result is an abnormal user account, the corresponding output characteristic is 0. And when the output characteristic is 1, the user account at the moment is a normal user account, and when the output characteristic is 0, the user account at the moment is an abnormal user account. At this time, when the user account is an abnormal user account, various operation requests of the abnormal user account can be intercepted in real time.
In the user account detection method, the user characteristic attribute is obtained according to the user account data by obtaining the user account data; inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic; and obtaining a user account detection result according to the output characteristics. The user account detection result can be obtained by detecting the user account data by using the preset user account classifier, and abnormal user accounts can be effectively detected. When the abnormal user account is detected, various operation requests of the abnormal user account can be intercepted, and economic benefit loss caused by the abnormal user account to the platform can be effectively reduced.
In one embodiment, as shown in fig. 3, the step of generating the preset user account classifier includes:
S302, acquiring historical user account data and corresponding detection results, wherein the detection results comprise a historical normal user account and a historical abnormal user account.
A historical normal user account is a user account for which detection of the historical user account data found no abnormal phenomenon. A historical abnormal user account is a user account for which detection of the historical user account data found an abnormal phenomenon. An abnormal phenomenon is one in which the user's operation requests and operation behavior in a business activity do not conform to preset behavior rules, or in which the user account logs in from an abnormal device or from an unusual location.
Specifically, the historical user account data and the corresponding detection results are acquired. The detection results can be obtained by examining the historical user account data manually or with expert rules, and are then stored. The acquired historical user account data and corresponding detection results are used as sample data.
S304, counting the historical user account numbers, the historical normal user account numbers and the historical abnormal user account numbers according to the historical user account data and the corresponding detection results, and calculating the historical normal user account frequency and the historical abnormal user account frequency.
Specifically, the historical user account number, the historical normal user account number and the historical abnormal user account number in the sample data are counted. And calculating the historical normal user account frequency according to the historical user account number and the historical normal user account number, and calculating the historical abnormal user account frequency according to the historical user account number and the historical abnormal user account number.
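As an illustration of this counting step, the frequencies can be computed as class counts divided by the total number of historical accounts. The label strings and the 8-to-2 split below are hypothetical sample data.

```python
from collections import Counter

def class_frequencies(labels):
    """Return the share of historical accounts carrying each detection result."""
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total for label in counts}

# Hypothetical detection results for ten historical user accounts.
history_labels = ["normal"] * 8 + ["abnormal"] * 2
priors = class_frequencies(history_labels)
print(priors)  # {'normal': 0.8, 'abnormal': 0.2}
```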
S306, obtaining corresponding historical user characteristics according to the historical user account data, and dividing the historical user characteristics according to preset conditions to obtain items to be classified.
An item to be classified is one of the division results obtained when a user characteristic is divided according to preset conditions. Each user characteristic can be divided in several ways, so each user characteristic can correspond to several items to be classified. The preset condition may be a division condition for the historical user characteristics set according to human experience. For example, the extracted historical user characteristic may be a1 = amount of log data / number of registration days, and its items to be classified may be a1<=0.05, 0.05<a1<0.2 and a1>=0.2. Another historical user characteristic may be a2 = user avatar, whose items to be classified may be that the user avatar is a real avatar or that it is some other avatar. If, in one piece of historical user account data, the ratio of log data to registration days is 0.15 and the user avatar is a real avatar, then the user characteristic attribute of a1 is 0.15 and the item to be classified to which it belongs is 0.05<a1<0.2, while the item to be classified for a2 is the real avatar.
Specifically, feature extraction is performed according to historical user account data to obtain corresponding historical user features, and each extracted historical user feature is divided according to preset conditions to obtain to-be-classified items corresponding to each user feature.
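A small sketch of the division step, using the example thresholds for a1 and the two avatar categories for a2. The function names and the returned label strings are hypothetical.

```python
def item_for_a1(value: float) -> str:
    """Map the a1 attribute to its item to be classified."""
    if value <= 0.05:
        return "a1<=0.05"
    if value < 0.2:
        return "0.05<a1<0.2"
    return "a1>=0.2"

def item_for_a2(is_real_avatar: bool) -> str:
    """Map the a2 (avatar) attribute to its item to be classified."""
    return "real avatar" if is_real_avatar else "other avatar"

items = {"a1": item_for_a1(0.15), "a2": item_for_a2(True)}
print(items)  # {'a1': '0.05<a1<0.2', 'a2': 'real avatar'}
```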
S308, counting the number of historical user accounts, the number of historical normal user accounts and the number of historical abnormal user accounts corresponding to the items to be classified, and calculating the conditional probability that the historical user account corresponding to the items to be classified is the historical normal user account and the conditional probability that the historical user account is the historical abnormal user account, so as to obtain a preset user account classifier.
Specifically, the number of historical user accounts, the number of historical normal user accounts and the number of historical abnormal user accounts are counted for each item to be classified. The conditional probability that the historical user account corresponding to each item to be classified is a historical normal user account and the conditional probability that it is a historical abnormal user account are then calculated, that is, the probability that each item to be classified occurs given that the historical user account is a historical normal user account or a historical abnormal user account, thereby obtaining the preset user account classifier.
In one embodiment, the sample data may be divided into training sample data and test sample data. The training sample data is used for training to obtain an initial user account classifier. After the initial user account classifier is obtained, it is tested with the test sample data, and when the test result reaches a preset accuracy the test is complete and the preset user account classifier is obtained. When the test result does not reach the preset accuracy, the initial user account classifier is trained again, optionally with more training sample data, until the test result reaches the preset accuracy.
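The counting described here can be sketched as follows. Each sample is assumed to be a pair of (items to be classified per feature, detection result); the sample values are hypothetical, and no smoothing is applied because the text does not mention any, so an item never seen with a class gets no entry at all.

```python
from collections import Counter, defaultdict

def train_classifier(samples):
    """Estimate class frequencies and per-item conditional probabilities by counting."""
    class_counts = Counter(label for _, label in samples)
    item_counts = defaultdict(Counter)  # label -> Counter of (feature, item)
    for items, label in samples:
        for feature, item in items.items():
            item_counts[label][(feature, item)] += 1
    priors = {label: n / len(samples) for label, n in class_counts.items()}
    conditionals = {
        label: {key: n / class_counts[label] for key, n in counts.items()}
        for label, counts in item_counts.items()
    }
    return priors, conditionals

# Hypothetical sample data: six historical accounts with two features each.
samples = [
    ({"a1": "a1<=0.05", "a2": "other"}, "abnormal"),
    ({"a1": "a1<=0.05", "a2": "real"}, "abnormal"),
    ({"a1": "0.05<a1<0.2", "a2": "real"}, "normal"),
    ({"a1": "a1>=0.2", "a2": "real"}, "normal"),
    ({"a1": "0.05<a1<0.2", "a2": "other"}, "normal"),
    ({"a1": "0.05<a1<0.2", "a2": "real"}, "normal"),
]
priors, conditionals = train_classifier(samples)
print(conditionals["normal"][("a1", "0.05<a1<0.2")])  # 0.75
```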
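The split-and-test procedure might look like the sketch below. The 80/20 split ratio, the fixed random seed, the function names and the trivial stand-in classifier are all assumptions made for illustration.

```python
import random

def split_samples(samples, test_ratio=0.2, seed=0):
    """Shuffle the sample data and divide it into training and test parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(classifier, test_samples):
    """Share of test samples whose predicted label matches the stored detection result."""
    correct = sum(1 for features, label in test_samples if classifier(features) == label)
    return correct / len(test_samples)

# Hypothetical sample data and a stand-in classifier that always answers "normal".
samples = [({"a1": i}, "normal") for i in range(10)]
train_set, test_set = split_samples(samples)
score = accuracy(lambda features: "normal", test_set)
print(len(train_set), len(test_set), score)
```

In practice `score` would be compared with the preset accuracy, and training repeated with more samples when it falls short.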
In the above embodiment, by acquiring the historical user account data and the corresponding detection result, the detection result includes a historical normal user account and a historical abnormal user account; counting the historical user account numbers, the historical normal user account numbers and the historical abnormal user account numbers according to the historical user account data and the corresponding detection results, and calculating the historical normal user account frequency and the historical abnormal user account frequency; obtaining corresponding historical user characteristics according to the historical user account data, and dividing the historical user characteristics according to preset conditions to obtain items to be classified; and counting the number of historical user accounts, the number of historical normal user accounts and the number of historical abnormal user accounts corresponding to the items to be classified, and calculating the conditional probability that the historical user account corresponding to the items to be classified is the historical normal user account and the conditional probability that the historical user account is the historical abnormal user account to obtain a preset user account classifier. Through presetting the user account classifier, the user account classifier can be directly used when user account detection is carried out, and the user account detection efficiency is improved.
In one embodiment, as shown in fig. 4, step S204, that is, inputting the user feature attribute into a preset user account classifier to obtain an output feature, includes the steps of:
S402, obtaining target items to be classified corresponding to the user characteristic attributes.
Specifically, target items to be classified corresponding to user feature attributes are obtained according to the pre-divided items to be classified, and each user feature has a corresponding target item to be classified.
S404, obtaining the conditional probability corresponding to the target item to be classified, and respectively calculating the probability that the user account is a normal user account and the probability that the user account is an abnormal user account according to the conditional probability by using the Bayes theorem.
Specifically, the conditional probability corresponding to each target item to be classified is obtained from the pre-trained user account classifier. The probability that the user account is a normal user account is then obtained by Bayes' theorem from these conditional probabilities and the historical normal user account frequency, and the probability that the user account is an abnormal user account is obtained by Bayes' theorem from the conditional probabilities and the historical abnormal user account frequency. Bayes' theorem is $P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$, where $P(B \mid A)$ is the probability that the user account is a normal user account or an abnormal user account given that the item to be classified $A$ occurs, $P(A \mid B)$ is the conditional probability corresponding to the item to be classified $A$, and $P(B)$ is the historical normal user account frequency or the historical abnormal user account frequency. Because the denominator $P(A)$ is the same constant for both classes, it can be dropped when the two are compared. When there are multiple items to be classified, the quantity calculated is $P(B \mid A) \propto P(A_1 \mid B)\,P(A_2 \mid B) \cdots P(A_m \mid B)\,P(B)$, where $A_m$ denotes the m-th item to be classified and $P(A_m \mid B)$ is its conditional probability.
S406, comparing the probability of the normal user account with the probability of the abnormal user account, and obtaining output characteristics according to the comparison result.
Specifically, the probability of the normal user account is compared with the probability of the abnormal user account, and the output feature is obtained from the comparison result. For example, when the output feature corresponding to a normal user account is 1 and the output feature corresponding to an abnormal user account is 0, the output feature obtained is 1 if the probability of the normal user account is higher, and 0 if the probability of the abnormal user account is higher.
In the above embodiment, the target items to be classified corresponding to the user characteristic attribute are obtained; acquiring a conditional probability corresponding to a target item to be classified, and respectively calculating the probability that the user account is a normal user account and the probability that the user account is an abnormal user account according to the conditional probability by using a Bayesian theorem; the probability of the normal user account number is compared with the probability of the abnormal user account number, and the output characteristics are obtained according to the comparison result, so that the output characteristics can be obtained more conveniently and accurately.
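Steps S402 to S406 can be sketched as below. The frequencies and conditional probabilities are hypothetical numbers standing in for a trained classifier, the score omits the common denominator P(A), and an item missing from the table is treated as probability 0, which is an assumption.

```python
# Hypothetical trained values: class frequencies and per-item conditional probabilities.
PRIORS = {"normal": 0.8, "abnormal": 0.2}
CONDITIONALS = {
    "normal": {("a1", "0.05<a1<0.2"): 0.5, ("a2", "real"): 0.8},
    "abnormal": {("a1", "0.05<a1<0.2"): 0.25, ("a2", "real"): 0.25},
}
OUTPUT_FEATURE = {"normal": 1, "abnormal": 0}

def class_score(label, target_items):
    """Unnormalized posterior: P(A_1|B) * ... * P(A_m|B) * P(B)."""
    score = PRIORS[label]
    for key in target_items:
        score *= CONDITIONALS[label].get(key, 0.0)
    return score

def output_feature(target_items):
    """Compare the two class scores and return the output feature of the larger one."""
    scores = {label: class_score(label, target_items) for label in PRIORS}
    best = max(scores, key=scores.get)
    return OUTPUT_FEATURE[best], scores

feature, scores = output_feature([("a1", "0.05<a1<0.2"), ("a2", "real")])
print(feature, scores)
```

With these numbers the normal score is 0.8 × 0.5 × 0.8 = 0.32 and the abnormal score is 0.2 × 0.25 × 0.25 = 0.0125, so the output feature is 1.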
In one embodiment, as shown in fig. 5, the method further includes the following steps, where the following steps are used to obtain a suspected abnormal user account:
S502, obtaining user account receiving address information according to the user account data.
The user account receiving address information refers to receiving address information of the commodity when the user account carries out transaction.
Specifically, the detailed receiving address information is obtained from the account basic attributes in the user account data, or the detailed receiving address of the goods for a business activity is obtained from the business data. Different user accounts have different receiving addresses. For example, a user account receiving address may be: XX County, XX City, XX Avenue, XX Garden, No. XX.
S504, word segmentation is carried out on the receiving address information of the user account to obtain word segmentation results, the word segmentation results are input into a clustering model to obtain classification results, and the receiving address similarity is obtained according to the classification results.
The clustering model is a classification model built with a clustering algorithm, and the receiving address similarity is used to describe how similar the receiving addresses are.
Specifically, the user account receiving address information is segmented to obtain a segmentation result, the segmentation result is used as a word set to be input into a clustering model to obtain a classification result, and the receiving address similarity is obtained according to the classification result. For example: the word segmentation results obtained after the word segmentation of the receiving address of the user account are XX province, XX city, XX county, XX lane, XX garden and XX number.
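The word segmentation step can be illustrated with a rule-based sketch. The text does not name a segmenter, so the suffix list, the pattern `ADDRESS_UNIT`, and the sample address below are all assumptions; a production system would more likely use a dictionary-based or statistical word segmenter.

```python
import re

# Hypothetical list of administrative and street-level suffixes:
# province, city, county, district, avenue, road, garden, number.
ADDRESS_UNIT = re.compile(r".+?(?:省|市|县|区|大道|路|花园|号)")

def segment_address(address):
    """Split an address into tokens that each end at a known suffix.

    Any trailing text that does not end in one of the suffixes is dropped.
    """
    return ADDRESS_UNIT.findall(address)

tokens = segment_address("广东省深圳市南山区科技路幸福花园8号")
```

Each token then serves as one word of the word set passed to the clustering model.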
S506, obtaining the suspected abnormal user account according to the similarity of the receiving addresses.
Specifically, the higher the receiving address similarity among user accounts, the more likely those accounts are abnormal; user accounts with high receiving address similarity are therefore taken as suspected abnormal user accounts.
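The text does not define how the similarity is scored or where the cut-off lies. One possible reading, sketched here purely as an assumption, is that accounts whose addresses fall into the same cluster are "similar", and that a cluster shared by at least `min_shared` accounts is suspicious. Both `suspected_accounts` and `min_shared` are hypothetical names.

```python
from collections import defaultdict

def suspected_accounts(account_cluster, min_shared=3):
    """Return accounts whose address cluster is shared by >= min_shared accounts.

    account_cluster: mapping of account id -> cluster label of its address
    """
    members = defaultdict(set)
    for account, cluster in account_cluster.items():
        members[cluster].add(account)
    return sorted(
        account
        for accounts in members.values()
        if len(accounts) >= min_shared
        for account in accounts
    )

# Hypothetical data: three accounts share cluster 0, one account is alone in cluster 1.
flagged = suspected_accounts({"u1": 0, "u2": 0, "u3": 0, "u4": 1})
```

Only the flagged accounts would then be passed on to the preset user account classifier.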
Step S202, namely, the step of obtaining the user account data includes the steps of:
and obtaining user account data of the suspected abnormal user account.
Specifically, the user account data of the suspected abnormal user account is obtained, and the suspected abnormal user account can be detected by using a preset user account classifier to determine whether the suspected abnormal user account is an abnormal user account.
In the above embodiment, the user account receiving address information is obtained according to the user account data; the method comprises the steps of performing word segmentation on user account receiving address information to obtain word segmentation results, inputting the word segmentation results into a clustering model to obtain classification results, and obtaining receiving address similarity according to the classification results; and obtaining the suspected abnormal user account according to the similarity of the receiving addresses, and then detecting the user account data of the suspected abnormal user account when using a preset user account classifier, so that the detection quantity of the user account can be reduced, and the detection efficiency of the user account can be improved.
In one embodiment, as shown in fig. 6, step S504, that is, inputting the word segmentation result into a cluster model to obtain a classification result, includes the steps of:
S602, grouping the user account receiving address information according to preset conditions, calculating the total number of the groups, and calculating the clustering number according to the total number of the groups.
The number of clusters refers to the number of categories used for classification in the cluster model.
Specifically, the user account receiving address information is grouped by province, city, county, district and the like; that is, different provinces, different cities, and different counties or districts each form a separate group. The groups are counted to obtain the total number of groups. The number of clusters is then calculated as $N = M \times 1.1$, where M is the total number of groups and N is the number of clusters.
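A small sketch of the cluster-count calculation follows. The source gives only N = M × 1.1 and does not say how a fractional result is handled; rounding up is an assumption made here, and the function name and sample records are hypothetical.

```python
def cluster_count(addresses, group_key):
    """Compute N = M * 1.1, where M is the number of distinct groups."""
    total_groups = len({group_key(a) for a in addresses})  # M
    # Assumption: round up. Integer arithmetic avoids floating-point
    # artefacts such as 10 * 1.1 evaluating to slightly more than 11.
    return (total_groups * 11 + 9) // 10

# Hypothetical records: (province, city, county) tuples act as the group key.
addresses = [
    ("A", "a1", "x"),
    ("A", "a1", "x"),
    ("A", "a2", "y"),
    ("B", "b1", "z"),
]
n_clusters = cluster_count(addresses, group_key=lambda a: a)
```

Here M is 3, so N is 3.3 rounded up to 4.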
S604, obtaining target words of the clustering number from the word segmentation result as an initial cluster center, and taking the initial cluster center as a current cluster center.
Specifically, after determining that the number of clusters is N, randomly initializing N cluster centers, that is, acquiring N target words from the word segmentation result as an initial cluster center, and taking the initial cluster center as a current cluster center.
S606, obtaining other words except the target word in the word segmentation result, and calculating the distance from the other words except the target word to the center of the current cluster.
Specifically, other words except the target word in the word segmentation result are obtained from the word segmentation result, and the Euclidean distance is used for calculating the distance from the other words except the target word to the center of the current cluster.
And S608, distributing other words except the target word into the clusters corresponding to the centers of the current clusters according to the distances to obtain the target clusters with the clustering number.
Specifically, according to the distances from the other words except the target word to the centers of all the current clusters, each of these words is assigned to the cluster with the smallest distance, yielding the target clusters of the clustering number. That is, for each such word, the distances to all current cluster centers are obtained, and the word is assigned to the cluster whose center is nearest.
S610, calculating a target cluster center of the target cluster, taking the target cluster center as a current cluster center, and returning to the step of calculating the distance from the other words except the target word to the current cluster center for repeated clustering until a convergence condition is met, so as to obtain a classification result.
Specifically, the target cluster center of each target cluster is recalculated and taken as the current cluster center, and the process returns to the step of calculating the distance from the other words except the target word to the current cluster center, repeating the clustering until the convergence condition is met. The convergence condition is met when the current cluster centers are the same as the cluster centers of the previous iteration, at which point the target clusters are taken as the classification result. The sum of squared distances between each sample point and the centroid of the cluster to which it belongs may be used as a cost function; when the cost function reaches a minimum, the current cluster centers are consistent with the previous ones.
In one embodiment, a different set of target words of the clustering number may be selected as the initial cluster centers and the clustering calculation repeated to obtain further classification results; the cost function values are then compared, and the classification result with the smallest cost function value is used as the clustering model.
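Steps S604 to S610, together with the restart variant, correspond to the standard k-means procedure, sketched below. The text does not say how words are turned into points for the Euclidean distance, so this sketch assumes each word has already been mapped to a numeric vector. It also assigns every point, including the initial centers, as standard k-means does. All function names are hypothetical.

```python
import random

def squared_distance(p, q):
    # Squared Euclidean distance; it ranks centers the same way as the distance itself.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, seed=0, max_iter=100):
    """Cluster `points` (tuples of floats) into k clusters.

    Returns (centers, assignment, cost), where cost is the sum of squared
    distances from each point to its assigned center.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)              # S604: random initial centers
    for _ in range(max_iter):
        # S606/S608: assign each point to its nearest current center
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # S610: recompute each center as the mean of its members
        new_centers = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])   # an empty cluster keeps its center
        if new_centers == centers:               # converged: centers unchanged
            break
        centers = new_centers
    cost = sum(squared_distance(p, centers[a]) for p, a in zip(points, assignment))
    return centers, assignment, cost

def best_of_restarts(points, k, restarts=5):
    """Re-run with different initial centers and keep the lowest-cost result."""
    return min((k_means(points, k, seed=s) for s in range(restarts)), key=lambda r: r[2])

# Hypothetical two-dimensional points forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, assignment, cost = best_of_restarts(points, 2)
```

Restarting matters because k-means only finds a local minimum of the cost function, and that minimum depends on the initial centers.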
In the above embodiment, the user account receiving address information is grouped according to a preset condition, the total number of the groups is calculated, and the clustering number is calculated according to the total number of the groups; obtaining target words of the clustering number from the word segmentation result as an initial cluster center, and taking the initial cluster center as a current cluster center; acquiring other words except the target word in the word segmentation result, and calculating the distance from the other words except the target word to the center of the current cluster; distributing other words except the target word into the cluster corresponding to the center of the current cluster according to the distance to obtain a target cluster with the clustering number; calculating a target cluster center of the target cluster, taking the target cluster center as a current cluster center, and returning to the step of calculating the distance from the other words except the target word to the current cluster center to perform repeated clustering until a convergence condition is met, so that a classification result is obtained, and the classification result can be obtained more conveniently and accurately.
In one embodiment, as shown in fig. 7, the method for detecting a user account further includes the steps of:
S702, obtaining an input feature vector according to the user account data, and inputting the input feature vector into a preset user account detection model to obtain an output feature vector.
The preset user account detection model is obtained by training through a logistic regression algorithm in advance by using sample data and is used for detecting the user account. The input feature vector comprises an account basic attribute vector, a device information vector, a user behavior vector and a service information vector, and the output feature vector comprises a detection result vector.
Specifically, feature extraction is performed on the historical user account data to obtain the input features, which comprise account basic attribute features, device information features, user behavior features and service information features. The account basic attribute features describe basic information about the user, such as account name, gender, age, address and mobile phone number features. The device information features describe parameters of the device used to log in to the user account, such as the device operating system version number, device fingerprint, nearby chip features, hardware features, and whether the device is in a jailbroken or cracked mode. The user behavior features describe the data generated under the user account when the user performs various operations on the web page or client, such as page stay time, access sequence, operation frequency and key information features. The service information features describe the information generated when the platform carries out business activities; for example, when there is a coupon activity, the service information features may be coupon information features, coupon rule features and the like. The user account data corresponding to the input features is then obtained from the user account data, the input feature vector is built from it, and the input feature vector is input into the preset user account detection model to obtain the output feature vector.
And S704, obtaining a user account detection result according to the output feature vector.
Specifically, the output characteristics corresponding to the various detection results are defined in advance, and the corresponding output feature vectors are derived from these output characteristics. During detection, the output characteristic is read from the output feature vector produced by the model, and the corresponding detection result is obtained from that output characteristic. For example, if the output characteristic is 1 when the user account detection result is a normal user account and 0 when it is an abnormal user account, then the output feature vector corresponding to a normal user account is [1] and the output feature vector corresponding to an abnormal user account is [0].
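Steps S702 and S704 might look like the following at inference time. This is a sketch under assumptions: the weights, bias, feature values and the 0.5 decision threshold are hypothetical, and the text does not state how the model's probability is turned into the output feature vector.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def detect(input_vector, weights, bias, threshold=0.5):
    """Return (output_feature_vector, detection_result) for one account."""
    z = sum(w * x for w, x in zip(weights, input_vector)) + bias
    probability_normal = sigmoid(z)
    # Assumption: [1] encodes a normal account and [0] an abnormal one.
    output_vector = [1] if probability_normal >= threshold else [0]
    result = "normal" if output_vector == [1] else "abnormal"
    return output_vector, result

# Hypothetical trained parameters and two example input feature vectors.
weights, bias = [2.0, -3.0], 0.0
normal_case = detect([1.0, 0.0], weights, bias)
abnormal_case = detect([0.0, 1.0], weights, bias)
```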
In the above embodiment, the input feature vector is obtained according to the user account data, and the input feature vector is input into the preset user account detection model to obtain the output feature vector; and obtaining a user account detection result according to the output feature vector, and effectively detecting an abnormal user account by using a preset user account detection model.
In one embodiment, as shown in fig. 8, the step of generating the preset user account detection model includes the steps of:
S802, acquiring historical user account data and corresponding detection results, obtaining a historical input feature vector according to the historical user account data, and obtaining a historical output feature vector according to the detection results.
Specifically, input feature data corresponding to the pre-extracted input features is obtained according to the historical user account data, input feature vectors are obtained according to the input feature data, corresponding output features are obtained according to the detection result, and historical output feature vectors are obtained according to the output features.
S804, taking the historical input feature vector as the input of the logistic regression model, and taking the historical output feature vector as the label of the logistic regression model for training.
Specifically, the historical input feature vector is used as the input of a logistic regression model built with the Sigmoid function, $g(z) = \frac{1}{1 + e^{-z}}$, and the historical output feature vector is used as the label of the logistic regression model for training.
S806, when the cost function of the logistic regression model reaches a preset threshold, a preset user account detection model is obtained.
Wherein the cost function is a function for measuring the difference between the value predicted by the model and the true value.
Specifically, the logistic regression model may use cross entropy as its cost function. The cross entropy function is $C = -\frac{1}{n}\sum_{x}\left[\,y \ln a + (1 - y)\ln(1 - a)\,\right]$, where C is the difference value, y is the desired output, a is the actual output, and the sum runs over the n training samples x. When C reaches the preset threshold, training is finished and the preset user account detection model is obtained.
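Steps S802 to S806 can be sketched as batch gradient descent on the cross-entropy cost, stopping once the cost falls to the preset threshold. The optimiser, learning rate, threshold value and training data below are all assumptions; the text specifies only the Sigmoid function, the cross-entropy cost and the threshold-based stopping rule.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def cross_entropy(labels, outputs, eps=1e-12):
    """C = -(1/n) * sum(y*ln(a) + (1-y)*ln(1-a)); eps guards against log(0)."""
    n = len(labels)
    return -sum(
        y * math.log(max(a, eps)) + (1 - y) * math.log(max(1 - a, eps))
        for y, a in zip(labels, outputs)
    ) / n

def train(features, labels, cost_threshold=0.3, learning_rate=0.5, max_epochs=10000):
    """Fit weights and bias; stop when the cost reaches cost_threshold."""
    dim, n = len(features[0]), len(features)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(max_epochs):
        outputs = [
            sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
            for row in features
        ]
        cost = cross_entropy(labels, outputs)
        if cost <= cost_threshold:          # S806: preset threshold reached
            break
        # Gradient of the cross-entropy cost with respect to each parameter
        errors = [a - y for a, y in zip(outputs, labels)]
        for j in range(dim):
            gradient = sum(e * row[j] for e, row in zip(errors, features)) / n
            weights[j] -= learning_rate * gradient
        bias -= learning_rate * sum(errors) / n
    return weights, bias, cost

# Hypothetical one-feature data: label 1 = normal account, 0 = abnormal account.
features = [[0.0], [1.0], [3.0], [4.0]]
labels = [0, 0, 1, 1]
weights, bias, cost = train(features, labels)
```

The `max_epochs` cap is a safeguard added in this sketch, so that training still terminates if the threshold is set lower than the data allows.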
In the above embodiment, by acquiring the historical user account data and the corresponding detection result, obtaining the historical input feature vector according to the historical user account data, obtaining the historical output feature vector according to the detection result, taking the historical input feature vector as the input of the logistic regression model, training the historical output feature vector as the label of the logistic regression model, and obtaining the preset user account detection model when the cost function of the logistic regression model reaches the preset threshold, the user account detection model can be trained in advance, and the user account detection model can be directly used in detection, so that the user account detection efficiency is improved.
In a specific embodiment, when the user registers the user account, the user account registered by the user may be detected, so as to prevent the user from registering the user account in batches. Specifically, user account data including user account basic information and device information are obtained, user characteristic attributes are obtained according to the user account data, specifically, the user characteristic attributes include user mobile phone numbers, device chip information, device fingerprint information, hardware information, device mode information, device operating system models and the like, the user characteristic attributes are input into a preset user account classifier to obtain output characteristics, if a detection result corresponding to the output characteristics is an abnormal user account, the registered user account is a user account which is registered in batches by a user, and at the moment, information of registration failure can be sent to the user to prevent the user from registering the user account in batches by a client.
It should be understood that, although the steps in the flowcharts of fig. 2-8 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of the steps is not strictly limited to this order, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 2-8 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same moment and may be performed at different times; nor are they necessarily performed sequentially, as they may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 9, there is provided a user account detection apparatus 900, including: a feature attribute obtaining module 902, an output feature obtaining module 904, and a detection result obtaining module 906, wherein:
the feature attribute obtaining module 902 is configured to obtain user account data, and obtain a feature attribute of a user according to the user account data;
an output feature obtaining module 904, configured to input the user feature attribute into a preset user account classifier, to obtain an output feature;
And the detection result obtaining module 906 is configured to obtain a detection result of the user account according to the output feature.
In the above embodiment, the user characteristic attribute is obtained according to the user account data in the characteristic attribute obtaining module 902, then the user characteristic attribute is input into the preset user account classifier in the output characteristic obtaining module 904 to obtain the output characteristic, and finally the user account detection result is obtained according to the output characteristic in the detection result obtaining module 906, so that the abnormal user account can be effectively detected.
In one embodiment, the user account detection apparatus 900 further includes:
the historical data acquisition module is used for acquiring historical user account data and corresponding detection results, wherein the detection results comprise a historical normal user account and a historical abnormal user account;
the frequency calculation module is used for counting the historical user account number, the historical normal user account number and the historical abnormal user account number according to the historical user account number data and the corresponding detection results, and calculating the historical normal user account number frequency and the historical abnormal user account number frequency;
the dividing module is used for obtaining corresponding historical user characteristics according to the historical user account data, and dividing the historical user characteristics according to preset conditions to obtain items to be classified;
The conditional probability calculation module is used for counting the number of historical user accounts, the number of historical normal user accounts and the number of historical abnormal user accounts corresponding to the items to be classified, calculating the conditional probability that the historical user account corresponding to the items to be classified is the historical normal user account and the conditional probability that the historical user account is the historical abnormal user account, and obtaining the preset user account classifier.
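The counting performed by the frequency calculation module and the conditional probability calculation module can be sketched as follows. It is assumed here that the conditional probability of an item is the share of accounts with a given label that exhibit the item, i.e. P(item | label). The function name, item names and sample history are hypothetical, and no smoothing is applied, so an item never seen with a label has no entry.

```python
from collections import Counter, defaultdict

def train_classifier(history):
    """Build the classifier tables from labelled historical accounts.

    history: list of (items, label) pairs, label being "normal" or "abnormal"
    Returns (prior, cond_prob) with
      prior[label]           = historical frequency of the label
      cond_prob[label][item] = P(item | label)
    """
    label_counts = Counter(label for _, label in history)
    total = len(history)
    prior = {label: count / total for label, count in label_counts.items()}

    item_counts = defaultdict(Counter)
    for items, label in history:
        for item in set(items):              # count each item once per account
            item_counts[label][item] += 1

    cond_prob = {
        label: {item: n / label_counts[label] for item, n in counts.items()}
        for label, counts in item_counts.items()
    }
    return prior, cond_prob

# Hypothetical history of four accounts.
history = [
    (["age_18_30"], "normal"),
    (["age_18_30", "rooted_device"], "normal"),
    (["rooted_device"], "abnormal"),
    (["rooted_device"], "abnormal"),
]
prior, cond_prob = train_classifier(history)
```

In practice a smoothing term such as Laplace smoothing is commonly added so that an unseen item does not force a class probability to zero.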
In one embodiment, the output feature derivation module 904 includes:
the target acquisition module is used for acquiring target items to be classified corresponding to the user characteristic attributes;
the Bayesian calculation module is used for acquiring the conditional probability corresponding to the target item to be classified, and respectively calculating the probability that the user account is a normal user account and the probability that the user account is an abnormal user account according to the conditional probability by using the Bayesian theorem;
and the comparison module is used for comparing the probability of the normal user account with the probability of the abnormal user account and obtaining output characteristics according to the comparison result.
In one embodiment, the user account detection apparatus 900 further includes:
the address obtaining module is used for obtaining user account receiving address information according to the user account data;
the classification module is used for segmenting the user account receiving address information to obtain a segmentation result, inputting the segmentation result into the clustering model to obtain a classification result, and obtaining receiving address similarity according to the classification result;
The suspected account obtaining module is used for obtaining a suspected abnormal user account according to the similarity of the receiving addresses;
the feature attribute obtaining module 902 includes:
the suspected abnormal data acquisition module is used for acquiring user account data of suspected abnormal user accounts.
In one embodiment, the classification module comprises:
the clustering number calculating module is used for grouping the user account receiving address information according to preset conditions, calculating the total number of the groups and calculating the clustering number according to the total number of the groups;
the cluster center determining module is used for acquiring target words of the clustering number from the word segmentation result as an initial cluster center, and taking the initial cluster center as a current cluster center;
the distance calculation module is used for obtaining other words except the target word in the word segmentation result and calculating the distance from the other words except the target word to the center of the current cluster;
the target cluster obtaining module is used for distributing other words except the target words into clusters corresponding to the centers of the current clusters according to the distances to obtain target clusters with the clustering number;
the classification result obtaining module is used for calculating a target cluster center of the target cluster, taking the target cluster center as a current cluster center, and returning to the step of calculating the distance from the other words except the target word to the current cluster center for repeated clustering until the convergence condition is met, so that a classification result is obtained.
In one embodiment, the user account detection apparatus 900 further includes:
the user account detection module is used for obtaining an input feature vector according to the user account data, inputting the input feature vector into a preset user account detection model and obtaining an output feature vector;
and the detection result obtaining module is used for obtaining a user account detection result according to the output feature vector.
In one embodiment, the user account detection apparatus 900 further includes:
the historical vector obtaining module is used for obtaining historical user account data and corresponding detection results, obtaining a historical input characteristic vector according to the historical user account data and obtaining a historical output characteristic vector according to the detection results;
the training module is used for taking the historical input feature vector as the input of the logistic regression model and taking the historical output feature vector as the label of the logistic regression model for training;
the detection model obtaining module is used for obtaining a preset user account detection model when the cost function of the logistic regression model reaches a preset threshold.
For specific limitations of the user account detection device, reference may be made to the limitations of the user account detection method above, which are not repeated here. All or part of the modules in the user account detection device may be implemented by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing user account data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a user account detection method.
It will be appreciated by those skilled in the art that the structure shown in fig. 10 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring user account data, and acquiring user characteristic attributes according to the user account data; inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic; and obtaining a user account detection result according to the output characteristics.
In one embodiment, the processor when executing the computer program further performs the steps of: acquiring historical user account data and a corresponding detection result, wherein the detection result comprises a historical normal user account and a historical abnormal user account; counting the historical user account number, the historical normal user account number and the historical abnormal user account number according to the historical user account data and the corresponding detection results, and calculating the historical normal user account frequency and the historical abnormal user account frequency; obtaining corresponding historical user characteristics according to the historical user account data, and dividing the historical user characteristics according to preset conditions to obtain items to be classified; and counting the number of historical user accounts, the number of historical normal user accounts and the number of historical abnormal user accounts corresponding to the items to be classified, and calculating the conditional probability that the historical user account corresponding to the items to be classified is the historical normal user account and the conditional probability that the historical user account is the historical abnormal user account, so as to obtain a preset user account classifier.
In one embodiment, the processor when executing the computer program further performs the steps of: acquiring a target item to be classified corresponding to a user characteristic attribute; acquiring a conditional probability corresponding to a target item to be classified, and respectively calculating the probability that the user account is a normal user account and the probability that the user account is an abnormal user account according to the conditional probability by using a Bayesian theorem; and comparing the probability of the normal user account with the probability of the abnormal user account, and obtaining output characteristics according to the comparison result.
In one embodiment, the processor when executing the computer program further performs the steps of: obtaining user account receiving address information according to the user account data; the method comprises the steps of performing word segmentation on user account receiving address information to obtain word segmentation results, inputting the word segmentation results into a clustering model to obtain classification results, and obtaining receiving address similarity according to the classification results; obtaining a suspected abnormal user account according to the similarity of the receiving addresses; the processor, when executing the computer program, further performs the steps of: and obtaining user account data of the suspected abnormal user account.
In one embodiment, the processor when executing the computer program further performs the steps of: grouping the user account receiving address information according to preset conditions, calculating the total number of groups, and calculating the clustering number according to the total number of groups; obtaining target words of the clustering number from the word segmentation result as an initial cluster center, and taking the initial cluster center as a current cluster center; acquiring other words except the target word in the word segmentation result, and calculating the distance from the other words except the target word to the center of the current cluster; distributing other words except the target word into the cluster corresponding to the center of the current cluster according to the distance to obtain a target cluster with the clustering number; and calculating a target cluster center of the target cluster, taking the target cluster center as a current cluster center, and returning to the step of calculating the distance from the other words except the target word to the current cluster center for repeated clustering until a convergence condition is met, so as to obtain a classification result.
In one embodiment, the processor when executing the computer program further performs the steps of: obtaining an input feature vector according to user account data, and inputting the input feature vector into a preset user account detection model to obtain an output feature vector; and obtaining a user account detection result according to the output feature vector.
In one embodiment, the processor when executing the computer program further performs the steps of: acquiring historical user account data and corresponding detection results, acquiring a historical input feature vector according to the historical user account data, and acquiring a historical output feature vector according to the detection results; taking the historical input feature vector as the input of the logistic regression model, and taking the historical output feature vector as the label of the logistic regression model for training; and when the cost function of the logistic regression model reaches a preset threshold, obtaining a preset user account detection model.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring user account data, and acquiring user characteristic attributes according to the user account data; inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic; and obtaining a user account detection result according to the output characteristics.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring historical user account data and a corresponding detection result, wherein the detection result comprises a historical normal user account and a historical abnormal user account; counting the historical user account number, the historical normal user account number and the historical abnormal user account number according to the historical user account data and the corresponding detection results, and calculating the historical normal user account frequency and the historical abnormal user account frequency; obtaining corresponding historical user characteristics according to the historical user account data, and dividing the historical user characteristics according to preset conditions to obtain items to be classified; and counting the number of historical user accounts, the number of historical normal user accounts and the number of historical abnormal user accounts corresponding to the items to be classified, and calculating the conditional probability that the historical user account corresponding to the items to be classified is the historical normal user account and the conditional probability that the historical user account is the historical abnormal user account, so as to obtain a preset user account classifier.
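The counting steps above correspond closely to fitting a naive Bayes classifier, and a minimal sketch under that reading follows. The assumptions are: each historical account has already been divided into "items to be classified", represented as (feature name, bucket) pairs; the class frequencies serve as prior probabilities; and the conditional probability is read as P(item | class), which is the usual naive Bayes form but only one interpretation of the source's wording. The function name `train_classifier` and the record layout are hypothetical.

```python
from collections import Counter, defaultdict


def train_classifier(records):
    """Build priors and conditional probabilities from labeled history.

    records: list of (features, label) pairs, where features maps a
    feature name to its bucket and label is "normal" or "abnormal".
    """
    total = len(records)
    label_counts = Counter(label for _, label in records)
    # Frequency of normal / abnormal accounts among all historical accounts.
    priors = {label: n / total for label, n in label_counts.items()}

    # item_counts[label][(feature, bucket)] = number of matching accounts.
    item_counts = defaultdict(Counter)
    for features, label in records:
        for item in features.items():
            item_counts[label][item] += 1

    # Assumed reading: P(item | label) = count(item, label) / count(label).
    conditionals = {
        label: {item: n / label_counts[label] for item, n in counts.items()}
        for label, counts in item_counts.items()
    }
    return priors, conditionals
```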
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring a target item to be classified corresponding to a user characteristic attribute; acquiring a conditional probability corresponding to a target item to be classified, and respectively calculating the probability that the user account is a normal user account and the probability that the user account is an abnormal user account according to the conditional probability by using a Bayesian theorem; and comparing the probability of the normal user account with the probability of the abnormal user account, and obtaining output characteristics according to the comparison result.
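To make the Bayesian comparison concrete, the sketch below multiplies each class prior by the conditional probabilities of the account's items, normalizes, and picks the larger posterior. It assumes conditional independence between items (the naive Bayes assumption) and uses a small floor probability for items never seen in training; both are assumptions, as are the names `classify` and `floor` and the numbers in the usage note.

```python
def classify(features, priors, conditionals, floor=1e-6):
    """Return (label, posteriors) for one account.

    features:     dict mapping feature name -> bucket
    priors:       dict mapping label -> prior probability
    conditionals: dict mapping label -> {(feature, bucket): P(item | label)}
    floor:        assumed fallback probability for unseen items
    """
    scores = {}
    for label, prior in priors.items():
        score = prior
        for item in features.items():
            score *= conditionals[label].get(item, floor)
        scores[label] = score
    evidence = sum(scores.values())  # normalizing constant in Bayes' theorem
    posteriors = {label: s / evidence for label, s in scores.items()}
    # The output characteristic is the class with the larger posterior.
    return max(posteriors, key=posteriors.get), posteriors
```

For example, with hypothetical priors of 0.9 (normal) and 0.1 (abnormal), an account whose two items are much more common among abnormal accounts can still end up classified as abnormal, because the product of the conditional probabilities outweighs the prior.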
In one embodiment, the computer program when executed by the processor further performs the steps of: obtaining user account receiving address information according to the user account data; performing word segmentation on the user account receiving address information to obtain word segmentation results, inputting the word segmentation results into a clustering model to obtain classification results, and obtaining receiving address similarity according to the classification results; and obtaining a suspected abnormal user account according to the receiving address similarity. In this embodiment, the step of acquiring user account data comprises: acquiring user account data of the suspected abnormal user account.
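The pre-screening idea, flagging accounts whose receiving addresses are unusually similar before running the classifier on them, can be illustrated with a simplified stand-in. The sketch below does not use a clustering model: it swaps in Jaccard similarity over token sets purely to show the flow from addresses to suspected accounts. The whitespace tokenizer is also a stand-in (real addresses would need a proper word segmenter), and the names `segment`, `jaccard`, `suspected_accounts`, and the 0.6 threshold are hypothetical.

```python
from itertools import combinations


def segment(address):
    # Stand-in for word segmentation: lowercase and split on whitespace.
    return set(address.lower().split())


def jaccard(a, b):
    # Share of tokens the two addresses have in common.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suspected_accounts(addresses, threshold=0.6):
    """Return account ids whose address is similar to another account's.

    addresses: dict mapping account id -> receiving address string
    threshold: assumed similarity cutoff
    """
    tokens = {account: segment(addr) for account, addr in addresses.items()}
    flagged = set()
    for (a, tokens_a), (b, tokens_b) in combinations(tokens.items(), 2):
        if jaccard(tokens_a, tokens_b) >= threshold:
            flagged.update((a, b))
    return flagged
```

Only the flagged accounts would then have their account data passed on to the classifier, which is the narrowing effect the embodiment describes.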
In one embodiment, the computer program when executed by the processor further performs the steps of: grouping the user account receiving address information according to preset conditions, calculating the total number of groups, and calculating the clustering number according to the total number of groups; obtaining target words of the clustering number from the word segmentation result as an initial cluster center, and taking the initial cluster center as a current cluster center; acquiring other words except the target word in the word segmentation result, and calculating the distance from the other words except the target word to the center of the current cluster; distributing other words except the target word into the cluster corresponding to the center of the current cluster according to the distance to obtain a target cluster with the clustering number; and calculating a target cluster center of the target cluster, taking the target cluster center as a current cluster center, and returning to the step of calculating the distance from the other words except the target word to the current cluster center for repeated clustering until a convergence condition is met, so as to obtain a classification result.
In one embodiment, the computer program when executed by the processor further performs the steps of: obtaining an input feature vector according to user account data, and inputting the input feature vector into a preset user account detection model to obtain an output feature vector; and obtaining a user account detection result according to the output feature vector.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring historical user account data and corresponding detection results, acquiring a historical input feature vector according to the historical user account data, and acquiring a historical output feature vector according to the detection results; taking the historical input feature vector as the input of the logistic regression model, and taking the historical output feature vector as the label of the logistic regression model for training; and when the cost function of the logistic regression model reaches a preset threshold, obtaining a preset user account detection model.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may carry out the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, any combination that involves no contradiction should be considered within the scope of this description.
The above embodiments represent only a few implementations of the present application. They are described in specific detail, but this should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art may make various modifications and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is determined by the appended claims.
Claims (10)
1. A method for detecting a user account, the method comprising:
acquiring user account data, and acquiring user characteristic attributes according to the user account data;
inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic;
obtaining a user account detection result according to the output characteristics;
the method further comprises the steps of:
obtaining user account receiving address information according to the user account data;
performing word segmentation on the user account receiving address information to obtain a word segmentation result, inputting the word segmentation result into a clustering model to obtain a classification result, and obtaining the receiving address similarity according to the classification result;
obtaining a suspected abnormal user account according to the similarity of the receiving addresses;
the obtaining the user account data includes:
and acquiring user account data of the suspected abnormal user account.
2. The method of claim 1, wherein the step of generating the preset user account classifier comprises:
acquiring historical user account data and corresponding detection results, wherein the detection results comprise a historical normal user account and a historical abnormal user account;
counting the historical user account numbers, the historical normal user account numbers and the historical abnormal user account numbers according to the historical user account data and the corresponding detection results, and calculating the historical normal user account frequency and the historical abnormal user account frequency;
obtaining corresponding historical user characteristics according to the historical user account data, and dividing the historical user characteristics according to preset conditions to obtain items to be classified;
and counting the number of historical user accounts, the number of historical normal user accounts and the number of historical abnormal user accounts corresponding to the items to be classified, and calculating the conditional probability that the historical user account corresponding to the items to be classified is the historical normal user account and the conditional probability that the historical user account is the historical abnormal user account to obtain a preset user account classifier.
3. The method according to claim 1, wherein the inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic includes:
acquiring a target item to be classified corresponding to the user characteristic attribute;
acquiring a conditional probability corresponding to the target item to be classified, and respectively calculating the probability that the user account is a normal user account and the probability that the user account is an abnormal user account according to the conditional probability by using a Bayesian theorem;
and comparing the probability of the normal user account with the probability of the abnormal user account, and obtaining output characteristics according to a comparison result.
4. The method of claim 1, wherein inputting the word segmentation result into a clustering model to obtain a classification result comprises:
grouping the user account receiving address information according to preset conditions, calculating the total number of groups, and calculating the clustering number according to the total number of groups;
obtaining target words of the clustering number from the word segmentation result as an initial cluster center, and taking the initial cluster center as a current cluster center;
acquiring other words except the target word in the word segmentation result, and calculating the distance from the other words except the target word to the center of the current cluster;
distributing the other words except the target word into the clusters corresponding to the current cluster centers according to the distances to obtain target clusters of the clustering number;
and calculating a target cluster center of the target cluster, taking the target cluster center as a current cluster center, and returning to the step of calculating the distances from the other words except the target word to the current cluster center for repeated clustering until a convergence condition is met, so as to obtain a classification result.
5. The method according to claim 1, characterized in that the method further comprises:
obtaining an input feature vector according to the user account data, and inputting the input feature vector into a preset user account detection model to obtain an output feature vector;
and obtaining a user account detection result according to the output feature vector.
6. The method according to claim 5, wherein the step of generating the preset user account detection model includes:
acquiring historical user account data and a corresponding detection result, acquiring a historical input feature vector according to the historical user account data, and acquiring a historical output feature vector according to the detection result;
training a logistic regression model with the historical input feature vector as the input of the logistic regression model and the historical output feature vector as the label of the logistic regression model;
and when the cost function of the logistic regression model reaches a preset threshold, obtaining the preset user account detection model.
7. A user account detection apparatus, the apparatus comprising:
the characteristic attribute obtaining module is used for obtaining user account data and obtaining user characteristic attributes according to the user account data;
the output characteristic obtaining module is used for inputting the user characteristic attribute into a preset user account classifier to obtain an output characteristic;
the detection result obtaining module is used for obtaining a user account detection result according to the output characteristics;
the address obtaining module is used for obtaining user account receiving address information according to the user account data;
the classification module is used for segmenting the user account receiving address information to obtain a segmentation result, inputting the segmentation result into the clustering model to obtain a classification result, and obtaining receiving address similarity according to the classification result;
the suspected account obtaining module is used for obtaining a suspected abnormal user account according to the similarity of the receiving addresses;
the feature attribute obtaining module comprises:
the suspected abnormal data acquisition module is used for acquiring user account data of suspected abnormal user accounts.
8. The apparatus of claim 7, wherein the output feature derivation module comprises:
the target acquisition module is used for acquiring target items to be classified corresponding to the user characteristic attributes;
the Bayesian calculation module is used for acquiring the conditional probability corresponding to the target item to be classified, and respectively calculating the probability that the user account is a normal user account and the probability that the user account is an abnormal user account according to the conditional probability by using the Bayesian theorem;
and the comparison module is used for comparing the probability of the normal user account with the probability of the abnormal user account and obtaining output characteristics according to the comparison result.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 6 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810547162.5A CN108898418B (en) | 2018-05-31 | 2018-05-31 | User account detection method, device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810547162.5A CN108898418B (en) | 2018-05-31 | 2018-05-31 | User account detection method, device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108898418A (en) | 2018-11-27 |
CN108898418B (en) | 2023-06-23 |
Family
ID=64343899
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810547162.5A Active CN108898418B (en) | 2018-05-31 | 2018-05-31 | User account detection method, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108898418B (en) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111242398B (en) * | 2018-11-29 | 2024-06-07 | 北京搜狗科技发展有限公司 | Data processing method and device for data processing |
CN109784811A (en) * | 2019-01-15 | 2019-05-21 | 长春市震撼科技有限公司 | A kind of commodity sale system for e-commerce |
CN118170663A (en) * | 2019-02-14 | 2024-06-11 | 创新先进技术有限公司 | Method and device for detecting resource loss check script |
CN110020858A (en) * | 2019-03-13 | 2019-07-16 | 北京三快在线科技有限公司 | Pay method for detecting abnormality, device, storage medium and electronic equipment |
CN110225036B (en) * | 2019-06-12 | 2022-03-22 | 北京奇艺世纪科技有限公司 | Account detection method, device, server and storage medium |
CN110378781A (en) * | 2019-06-17 | 2019-10-25 | 深圳壹账通智能科技有限公司 | Data monitoring method, device, computer equipment and storage medium |
CN110474871B (en) * | 2019-07-05 | 2023-10-13 | 中国平安财产保险股份有限公司 | Abnormal account detection method and device, computer equipment and storage medium |
CN110427999B (en) * | 2019-07-26 | 2022-02-22 | 武汉斗鱼网络科技有限公司 | Account correlation evaluation method, device, equipment and medium |
CN111079029B (en) * | 2019-12-20 | 2023-11-21 | 珠海格力电器股份有限公司 | Sensitive account detection method, storage medium and computer equipment |
CN113971038B (en) * | 2020-07-22 | 2024-07-02 | 北京达佳互联信息技术有限公司 | Abnormality identification method and device for application account, server and storage medium |
CN111831825B (en) * | 2020-07-23 | 2024-03-15 | 咪咕文化科技有限公司 | Account detection method, device, network equipment and storage medium |
CN111985703B (en) * | 2020-08-12 | 2022-07-29 | 支付宝(杭州)信息技术有限公司 | User identity state prediction method, device and equipment |
CN112329811B (en) * | 2020-09-18 | 2024-07-26 | 广州三七网络科技有限公司 | Abnormal account identification method, device, computer equipment and storage medium |
CN112153426B (en) * | 2020-09-21 | 2023-08-29 | 腾讯科技(深圳)有限公司 | Content account management method and device, computer equipment and storage medium |
CN113129018B (en) * | 2021-05-17 | 2023-12-08 | 无锡航吴科技有限公司 | Financing platform account classification method and system |
CN113521750B (en) * | 2021-07-15 | 2023-10-24 | 珠海金山数字网络科技有限公司 | Abnormal account detection model training method and abnormal account detection method |
CN114861177B (en) * | 2022-04-19 | 2024-08-23 | 中国科学院信息工程研究所 | Method and device for detecting suspicious account on social network |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110009372B (en) * | 2012-08-03 | 2023-08-18 | 创新先进技术有限公司 | User risk identification method and device |
CN104008332A (en) * | 2014-04-30 | 2014-08-27 | 浪潮电子信息产业股份有限公司 | Intrusion detection system based on Android platform |
CN106886518B (en) * | 2015-12-15 | 2020-10-09 | 国家计算机网络与信息安全管理中心 | Microblog account number classification method |
CN105787025B (en) * | 2016-02-24 | 2021-07-09 | 腾讯科技(深圳)有限公司 | Network platform public account classification method and device |
CN107872436B (en) * | 2016-09-27 | 2020-11-24 | 阿里巴巴集团控股有限公司 | Account identification method, device and system |
Also Published As
Publication number | Publication date |
---|---|
CN108898418A (en) | 2018-11-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108898418B (en) | User account detection method, device, computer equipment and storage medium | |
CN108769026B (en) | User account detection system and method | |
CN110598845B (en) | Data processing method, data processing device, computer equipment and storage medium | |
CN107872436B (en) | Account identification method, device and system | |
WO2019237526A1 (en) | Risk information determining method and apparatus, computer device, and storage medium | |
CN111291264B (en) | Access object prediction method and device based on machine learning and computer equipment | |
WO2020253357A1 (en) | Data product recommendation method and apparatus, computer device and storage medium | |
CN109508903B (en) | Risk assessment method, risk assessment device, computer equipment and storage medium | |
CN110737818B (en) | Network release data processing method, device, computer equipment and storage medium | |
CN110782277A (en) | Resource processing method, resource processing device, computer equipment and storage medium | |
CN108304935B (en) | Machine learning model training method and device and computer equipment | |
CN110245714B (en) | Image recognition method and device and electronic equipment | |
CN111259952B (en) | Abnormal user identification method, device, computer equipment and storage medium | |
CN112070506A (en) | Risk user identification method, device, server and storage medium | |
CN114693192A (en) | Wind control decision method and device, computer equipment and storage medium | |
CN115795000A (en) | Joint similarity algorithm comparison-based enclosure identification method and device | |
CN112258238A (en) | User life value cycle detection method and device and computer equipment | |
CN116307671A (en) | Risk early warning method, risk early warning device, computer equipment and storage medium | |
CN112818868B (en) | Method and device for identifying illegal user based on behavior sequence characteristic data | |
CN111476668B (en) | Identification method and device of credible relationship, storage medium and computer equipment | |
CN117422578A (en) | Entity relationship identification method, device, equipment and medium based on block chain | |
CN111709422A (en) | Image identification method and device based on neural network and computer equipment | |
CN117291535A (en) | Service processing method, device and computer equipment | |
CN108985755A (en) | A kind of account state identification method, device and server | |
WO2020150376A1 (en) | Real time user matching using purchasing behavior |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||