WO2020257991A1 - User identification method and related product

User identification method and related product

Info

Publication number
WO2020257991A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
target user
identified
input
groups
Prior art date
Application number
PCT/CN2019/092592
Other languages
English (en)
Chinese (zh)
Inventor
石露
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市欢太科技有限公司, Oppo广东移动通信有限公司
Priority to CN201980091203.7A (patent CN113383362B, zh)
Priority to PCT/CN2019/092592 (WO2020257991A1, fr)
Publication of WO2020257991A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions

Definitions

  • This application relates to the field of communication technology, and specifically to a user identification method and related products.
  • On a content platform, the more useful the resources pushed to users in prominent positions, the greater the value users perceive and the better those resources perform.
  • A resource displayed in a better position sees its indicators, such as user clicks and other user behavior, improve further.
  • To obtain better indicators, some content producers resort to brushing, that is, artificially inflating click or download counts.
  • On the one hand, they can obtain better display positions for their own resources; on the other hand, they can get more exposure to real users. From the perspective of the resource display platform, however, brushing makes the platform unfair to other resources and causes users to mistrust the platform. How to identify brushing users has therefore become an urgent problem.
  • Current identification of brushing users mainly examines users one by one, for example by checking whether one account is logged in on multiple mobile phones, whether multiple accounts are registered and logged in on one mobile phone, or whether one mobile phone accesses the same URL continuously or far more often than normal. The accuracy of such identification is currently low.
  • The embodiments of the present application provide a user identification method and related products, which can improve the accuracy of identifying brushing users.
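  • In a first aspect, an embodiment of the present application provides a user identification method, including:
  • when a target user ID needs to be identified, determining whether N identified brushing groups exist, where the N brushing groups are classified according to group user rules, the number of brushing user IDs contained in any one of the N brushing groups is greater than a preset number threshold, and N is a positive integer;
  • if so, obtaining the input characteristics of the target user ID, the input characteristics including user location characteristics, user APP usage characteristics, user equipment usage characteristics, and user click-through rate (CTR) characteristics;
  • identifying, based on the input characteristics of the target user ID, the similarity between the target user ID and each of the N brushing groups;
  • if among the N brushing groups there is a brushing group whose similarity with the target user ID is greater than a preset similarity threshold, determining that the target user ID is a brushing user ID.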
  • In a second aspect, an embodiment of the present application provides a user identification device, the user identification device including a first determining unit, an acquiring unit, an identifying unit, and a second determining unit, wherein:
  • the first determining unit is configured to determine, when a target user ID needs to be identified, whether N identified brushing groups exist, where the N brushing groups are classified according to group user rules, the number of brushing user IDs contained in any one of the N brushing groups is greater than a preset number threshold, and N is a positive integer;
  • the acquiring unit is configured to acquire the input characteristics of the target user ID when the first determining unit determines that N identified brushing groups exist, the input characteristics including user location characteristics, user APP usage characteristics, user equipment usage characteristics, and user click-through rate (CTR) characteristics;
  • the identifying unit is configured to identify the similarity between the target user ID and each of the N brushing groups based on the input characteristics of the target user ID;
  • the second determining unit is configured to determine that the target user ID is a brushing user ID when, among the N brushing groups, there is a brushing group whose similarity with the target user ID is greater than a preset similarity threshold.
  • In a third aspect, an embodiment of the present application provides a server, including a processor and a memory, where the memory is used to store one or more programs, the one or more programs are configured to be executed by the processor, and the programs include instructions for executing the steps in the first aspect of the embodiments of the present application.
  • In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program enables a computer to execute some or all of the steps described in the first aspect.
  • In a fifth aspect, an embodiment of the present application provides a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps described in the first aspect.
  • The computer program product may be a software installation package.
  • The user identification method described in the embodiment of the present application includes the following steps: when a target user ID needs to be identified, determine whether N identified brushing groups exist, where the N brushing groups are classified according to group user rules, the number of brushing user IDs contained in any one of the N brushing groups is greater than a preset number threshold, and N is a positive integer; if so, obtain the input characteristics of the target user ID, which include user location characteristics, user APP usage characteristics, user equipment usage characteristics, and user click-through rate (CTR) characteristics; identify, based on the input characteristics of the target user ID, the similarity between the target user ID and each of the N brushing groups; and if among the N brushing groups there is a brushing group whose similarity with the target user ID is greater than a preset similarity threshold, determine the target user ID as a brushing user ID.
  • In the embodiment of the present application, when a target user ID is to be identified, its similarity with the identified brushing groups can be evaluated. If the similarity is greater than the preset similarity threshold, the target user ID can be directly identified as a brushing user ID. Because brushing users often share the characteristics of a brushing group, similarity recognition against the brushing groups can quickly and accurately determine whether the target user ID is a brushing user ID, thereby improving the accuracy of identifying brushing users.
  • FIG. 1 is a schematic flowchart of a user identification method disclosed in an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of another user identification method disclosed in an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of an algorithm for identifying brushing users disclosed in an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of another user identification method disclosed in an embodiment of the present application;
  • FIG. 5 is a schematic structural diagram of a user identification device disclosed in an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of a server disclosed in an embodiment of the present application.
  • the mobile terminals involved in the embodiments of this application may include various handheld devices with wireless communication functions, vehicle-mounted devices, wearable devices, computing devices or other processing devices connected to wireless modems, as well as various forms of user equipment (User Equipment, UE), mobile station (Mobile Station, MS), terminal device (terminal device), etc.
  • FIG. 1 is a schematic flowchart of a user identification method disclosed in an embodiment of the present application. As shown in FIG. 1, the user identification method includes the following steps.
  • 101. When a target user ID needs to be identified, the server determines whether N identified brushing groups exist. The N brushing groups are classified according to group user rules, the number of brushing user IDs contained in any one of the N brushing groups is greater than a preset number threshold, and N is a positive integer.
  • The server serves the client; its services include providing resources to the client and storing client data.
  • The server is a dedicated service program, and the device that runs it can also be called a server.
  • The server can establish connections with multiple clients at the same time and provide services to them simultaneously.
  • The server can be used to identify brushing user IDs.
  • The client, the content provider, and the server can form a content distribution system.
  • The client is a content distribution client.
  • The client can provide a display interface for displaying various content resources. Different content resources occupy different positions of the display interface.
  • The content distribution system counts the clicks or downloads of each content resource on each client and, according to those counts, determines the position of the client's display interface at which each content resource is displayed.
  • The server displays the content of the content provider on the display interface of the client.
  • The content resources can be application resources, audio and video resources, and so on. APP resources are used as the example below.
  • The content distribution system usually counts the clicks or downloads of various APPs, displays the APPs in different positions of the content distribution platform (i.e., the client) based on the statistics, recommends APPs with high download counts to users, and dedicates resources to forming lists for operation.
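  • As a minimal sketch of why inflated counts matter, the following assumes a platform that orders APPs purely by download count. The function name `rank_for_display` and the sample counts are hypothetical and are not taken from the application.

```python
def rank_for_display(download_counts: dict[str, int]) -> list[str]:
    """Return APP names ordered for display, most downloaded first.

    Assumption: position depends only on the download count; ties are
    broken alphabetically so that the ordering is deterministic.
    """
    return sorted(download_counts, key=lambda app: (-download_counts[app], app))


# Hypothetical counts: inflating app_c's downloads would move it up the list.
counts = {"app_a": 120, "app_b": 950, "app_c": 120}
print(rank_for_display(counts))
```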
  • The APP producer (i.e., the content provider) or the APP publisher may use a brushing application to inflate the clicks or downloads of an APP.
  • For example, the APP publisher sends a brushing task request through the brushing application, and a terminal on which the brushing application is installed obtains the request. The terminal then uses the installed brushing application to generate users who do not really exist, that is, brushing users.
  • APPs recommended on the basis of such unreal click or download data may not be high-quality APPs, which undermines users' trust in the content distribution platform.
  • The content distribution platform therefore needs to identify which of the users who click or view a certain APP are brushing users.
  • To identify whether the target user ID is a brushing user ID, the server first determines whether N identified brushing groups exist.
  • The N brushing groups are classified according to the group user rules.
  • Group user rules can be determined based on the location of the device corresponding to the user ID, the time series of application usage corresponding to the user ID, the cumulative usage time of the application corresponding to the user ID, the usage frequency of the application corresponding to the user ID, and the ratio of the usage time of the application corresponding to the user ID to the usage time of all applications on the client.
  • For example, the rules may require that the device locations are the same, the time series are similar, the cumulative usage time is greater than a duration threshold (for example, 2 hours), the usage frequency is greater than a frequency threshold (for example, 100 times), and the ratio of the usage time of the application corresponding to the user ID to the usage time of all applications on the client is greater than a ratio threshold (for example, 80%).
  • The number of brushing user IDs included in a brushing group is greater than the preset number threshold, which can be set in advance and stored in the memory (for example, non-volatile memory) of the server.
  • the preset number threshold may be an integer greater than or equal to 2, for example, the preset number threshold may be set to 5.
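  • The usage-based part of the group user rules can be sketched as a simple predicate. This is only an illustration: the class `UsageStats`, the function `exceeds_usage_thresholds`, and the choice to require all three conditions at once are assumptions, while the threshold values are the examples quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class UsageStats:
    cumulative_hours: float  # cumulative usage time of the application
    use_count: int           # usage frequency of the application
    app_time_ratio: float    # application time / time of all applications on the client


# Example thresholds from the text; a real deployment would tune them.
DURATION_THRESHOLD_HOURS = 2.0
FREQUENCY_THRESHOLD = 100
RATIO_THRESHOLD = 0.80


def exceeds_usage_thresholds(stats: UsageStats) -> bool:
    """Assumption: the three usage conditions are combined with a logical AND."""
    return (
        stats.cumulative_hours > DURATION_THRESHOLD_HOURS
        and stats.use_count > FREQUENCY_THRESHOLD
        and stats.app_time_ratio > RATIO_THRESHOLD
    )


print(exceeds_usage_thresholds(UsageStats(3.0, 150, 0.9)))
```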
  • Optionally, the group user rule is determined based on the location of the device corresponding to the user ID and the time series of application usage corresponding to the user ID.
  • Among the brushing user IDs that have been identified, the server classifies into the first type of brushing group those whose corresponding device positions are less than a preset distance threshold apart and whose application-usage time series fall within a first preset time period.
  • Before determining whether N identified brushing groups exist, the server can use the group user rules to classify the brushing user IDs that have already been identified: brushing user IDs whose device positions are less than the preset distance threshold apart and whose application-usage time series fall within the first preset time period are classified into the same type of brushing group.
  • The application-usage time series is APP usage data carrying time labels, that is, a time label is recorded for each operation of the APP to record when the operation took place. Because group users concentrate their activity in a certain period of time, the APP usage time series of users in the same group are highly similar.
  • The embodiment of the present application can classify brushing user IDs according to the distance between their device positions and the similarity of their application-usage time series, thereby improving the accuracy of classifying brushing user IDs.
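  • The grouping described above can be sketched as follows. This is a rough illustration under stated assumptions: device positions are latitude/longitude pairs, the time series is reduced to the set of hours of the day with recorded operations, the first preset time period is a set of hours, and grouping is a simple greedy pass. The names `BrushingUser`, `haversine_km`, and `group_brushing_users` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class BrushingUser:
    user_id: str
    lat: float
    lon: float
    active_hours: frozenset  # hours of day (0-23) with recorded APP operations


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def group_brushing_users(users, window, max_km=1.0, min_size=5):
    """Greedy grouping (an assumption, not the application's algorithm).

    A user is considered only if all of its activity falls inside `window`;
    it joins the first group whose first member is closer than `max_km`.
    Only groups with more than `min_size` members are kept.
    """
    groups = []
    for user in users:
        if not user.active_hours or not user.active_hours <= window:
            continue
        for group in groups:
            seed = group[0]
            if haversine_km(user.lat, user.lon, seed.lat, seed.lon) < max_km:
                group.append(user)
                break
        else:
            groups.append([user])
    return [group for group in groups if len(group) > min_size]
```

  • A production system would more likely use a proper clustering method; the greedy pass is used here only to keep the example short.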
  • 102. If so, the server obtains the input characteristics of the target user ID.
  • The input characteristics include user location characteristics, user APP usage characteristics, user equipment usage characteristics, and user click-through rate (CTR) characteristics.
  • Specifically, the server may obtain the input characteristics of the target user ID by extracting them from the historical behavior data of the target user ID.
  • The historical behavior data of the target user ID may include, within a preset time period, the location information of the device on which the target user ID is logged in, the APP usage information of the target user ID, the usage information of the device on which the target user ID is logged in, and the CTR characteristics of the target user ID.
  • The embodiment of the present application takes the user's location characteristics as one of the considerations.
  • Brushing simulates normal terminal operation, but because of the brushing task, the content targeted by the task is opened more often and used longer, while other APPs are used less.
  • This embodiment of the application therefore examines how often and how long the terminal uses commonly used APPs, as well as the time distribution of APP usage across the whole terminal, and adds the user APP usage characteristics as one of the considerations.
  • The operation behavior of the terminal also differs, for example in whether there is a call history, whether a card is inserted, and whether short messages are received, so the user equipment usage characteristics are added as one of the considerations. Since the final success indicator of brushing is the click-through rate of exposures, the download rate, or the success rate of a certain behavior, a brushing user's clicks on CTR-related tasks are markedly higher than those of other users, so the user CTR characteristics are also used as one of the considerations.
  • The user location feature includes the location feature of the device on which the target user ID is logged in (including the location of the device when the user ID is logged in, the range over which the device location changes, etc.).
  • The user APP usage characteristics include the usage time of the target APP logged in to by the user ID, the usage frequency of the target APP, and the usage time distribution of the target APP.
  • Generally speaking, the longer the usage time of the target APP logged in to by the user ID, the higher the usage frequency of the target APP, and the more concentrated the usage time distribution of the target APP, the greater the probability that the user ID is a brushing user.
  • The user equipment usage characteristics include the usage characteristics of the device on which the target user ID is logged in (for example, whether the device has a call record, whether a card is inserted, and whether short messages are received while the target user ID is logged in). Generally speaking, if the device has no call history, no card inserted, and no short message reception while the target user ID is logged in, the user ID is more likely to be a brushing user.
  • CTR (click-through rate) is defined as follows: after a user enters keywords into a search engine, the engine lists relevant web pages in order according to factors such as bidding, and the user clicks into the websites of interest. Taking the total number of times a website is returned by searches as the total count, the ratio of the number of times users click into the website to that total count is called the click-through rate.
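  • The four kinds of input characteristics can be gathered into one record, with CTR computed as clicks divided by the number of times the item was shown. The sketch below is illustrative only: the field names, the class `InputFeatures`, and the encoding of booleans as 0/1 are assumptions, not the application's data format.

```python
from dataclasses import dataclass


def click_through_rate(clicks: int, impressions: int) -> float:
    """Clicks divided by total impressions; 0.0 when nothing was shown."""
    return clicks / impressions if impressions else 0.0


@dataclass
class InputFeatures:
    # All field names are hypothetical stand-ins for the characteristics in the text.
    location_range_km: float  # user location: range over which the device moved
    target_app_hours: float   # APP usage: usage time of the target APP
    target_app_opens: int     # APP usage: usage frequency of the target APP
    has_call_history: bool    # equipment usage
    has_card_inserted: bool   # equipment usage
    receives_sms: bool        # equipment usage
    ctr: float                # click-through rate

    def as_vector(self) -> list[float]:
        """Flatten to numbers so the record can be fed to a classifier."""
        return [
            self.location_range_km,
            self.target_app_hours,
            float(self.target_app_opens),
            float(self.has_call_history),
            float(self.has_card_inserted),
            float(self.receives_sms),
            self.ctr,
        ]


features = InputFeatures(0.2, 6.5, 240, False, False, False, click_through_rate(30, 120))
print(features.as_vector())
```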
  • 103. The server identifies the similarity between the target user ID and each of the N brushing groups based on the input characteristics of the target user ID.
  • Each of the N brushing groups has characteristics common to the group.
  • The common characteristics of a group include similar group positions and similar time series of group application usage.
  • Specifically, the server can calculate the location similarity between the user location feature of the target user ID and the group location feature of each of the N brushing groups, and the time similarity between the application-usage time series of the target user ID and the group application-usage time series of each of the N brushing groups. It then determines the similarity between the target user ID and each brushing group from that group's location similarity and time similarity.
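  • One way to combine a location similarity and a time similarity into a single score is sketched below. The text does not specify the formulas, so everything here is an assumption: location similarity decays exponentially with distance, time similarity is the Jaccard overlap of active hours, and the two are mixed with a hypothetical weight `location_weight`.

```python
import math


def location_similarity(distance_km: float, scale_km: float = 1.0) -> float:
    """Map a distance to (0, 1]: 1.0 at zero distance, decaying as distance grows."""
    return math.exp(-distance_km / scale_km)


def time_similarity(user_hours: set, group_hours: set) -> float:
    """Jaccard overlap between the user's and the group's active hours."""
    union = user_hours | group_hours
    return len(user_hours & group_hours) / len(union) if union else 0.0


def group_similarity(distance_km: float, user_hours: set, group_hours: set,
                     location_weight: float = 0.5) -> float:
    """Weighted mix of the two similarities (the weighting is an assumption)."""
    loc = location_similarity(distance_km)
    tim = time_similarity(user_hours, group_hours)
    return location_weight * loc + (1.0 - location_weight) * tim


print(group_similarity(0.3, {1, 2, 3}, {2, 3, 4}))
```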
  • 104. If among the N brushing groups there is a brushing group whose similarity with the target user ID is greater than a preset similarity threshold, the server determines the target user ID as a brushing user ID.
  • If the similarity between a target brushing group and the target user ID is greater than the preset similarity threshold, the target user ID is classified into the target brushing group and determined as a brushing user ID.
  • In the embodiment of the present application, when a target user ID is to be identified, its similarity with the identified brushing groups can be evaluated. If the similarity is greater than the preset similarity threshold, the target user ID can be directly identified as a brushing user ID. Because brushing users often share the characteristics of a brushing group, similarity recognition against the brushing groups can quickly and accurately determine whether the target user ID is a brushing user ID, thereby improving the accuracy of identifying brushing users.
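  • The threshold decision in the final step can be sketched as a small helper that takes already-computed similarities. The function name `match_brushing_group`, the group identifiers, and the default threshold of 0.8 are hypothetical; the application only says the threshold is preset.

```python
from typing import Optional


def match_brushing_group(similarities: dict[str, float],
                         threshold: float = 0.8) -> Optional[str]:
    """Return the most similar brushing group if it exceeds the threshold.

    `similarities` maps a group id to the similarity between the target
    user ID and that group. None means no group matched, so the target
    user ID is not identified as a brushing user ID by this step.
    """
    if not similarities:
        return None
    best = max(similarities, key=similarities.get)
    return best if similarities[best] > threshold else None


print(match_brushing_group({"group_1": 0.42, "group_2": 0.91}))
```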
  • FIG. 2 is a schematic flowchart of another user identification method disclosed in an embodiment of the present application.
  • FIG. 2 is obtained by further optimization on the basis of FIG. 1.
  • As shown in FIG. 2, the user identification method includes the following steps.
  • 201. When a target user ID needs to be identified, the server determines whether N identified brushing groups exist. The N brushing groups are classified according to group user rules, the number of brushing user IDs contained in any one of the N brushing groups is greater than a preset number threshold, and N is a positive integer.
  • 202. If so, the server obtains the input characteristics of the target user ID. The input characteristics include user location characteristics, user APP usage characteristics, user equipment usage characteristics, and user click-through rate (CTR) characteristics.
  • 203. The server identifies the similarity between the target user ID and each of the N brushing groups based on the input characteristics of the target user ID.
  • 204. If among the N brushing groups there is a brushing group whose similarity with the target user ID is greater than a preset similarity threshold, the server determines that the target user ID is a brushing user ID.
  • For the specific implementation of step 201 to step 204 in the embodiment of the present application, reference may be made to the description of step 101 to step 104 shown in FIG. 1, which will not be repeated here.
  • 205. If there is no brushing group among the N brushing groups whose similarity with the target user ID is greater than the preset similarity threshold, the server inputs the input characteristics of the target user ID into the trained binary classification model to obtain a preliminary classification result for the input characteristics of the target user ID.
  • 206. The server inputs the preliminary classification result into the trained classifier for calculation to obtain an intermediate calculation result, and inputs the intermediate calculation result into the trained neural network model to obtain the identification result of the target user ID.
  • The binary classification model can adopt a multi-algorithm fusion method.
  • Specifically, the binary classification model can combine one or more of the k-nearest neighbor (KNN) classification algorithm, the logistic regression (LR) algorithm, and the support vector machine (SVM) algorithm.
  • The classifier may include an extreme gradient boosting (eXtreme Gradient Boosting, XGBoost) classifier or a random forest classifier.
  • FIG. 3 is a schematic flowchart of an algorithm for identifying a brushing user disclosed in an embodiment of the present application.
  • As shown in FIG. 3, the input features of the target user are first input into the binary classification model.
  • The KNN classification algorithm, LR algorithm, and SVM algorithm in the binary classification model are single algorithms, which are used to classify the input features of the target user.
  • The preliminary results produced by the binary classification model are then input to the classifier.
  • The XGBoost and random forest algorithms in the classifier are fusion algorithms, which perform a calculation on the preliminary results output by the binary classification model. The intermediate results output by the classifier are then input to the neural network model, which finally produces the recognition result for the target user.
  • There are only two possible recognition results for a target user: the user is a brushing user, or the user is not a brushing user.
  • The identification process for the target user ID in the embodiments of this application successively adopts a single algorithm, a fusion algorithm, and a neural network.
  • The single algorithms preliminarily classify the input features and reduce the computational complexity of the subsequent fusion algorithm.
  • The fusion algorithm takes the possibility of brushing users into account, which helps ensure the accuracy of its calculation results.
  • Finally, the neural network model is used to reduce the possibility of misjudgment, thereby improving the accuracy of the recognition result for the target user ID.
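  • The three-stage data flow (single algorithms, then fusion algorithms, then a final model) can be sketched with plain callables. This shows only how results pass between stages: the function `identify` and the stand-in models below are hypothetical placeholders, not trained KNN, LR, SVM, XGBoost, random forest, or neural network models.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
Model = Callable[[Vector], float]


def identify(features: Vector,
             base_models: Sequence[Model],
             fusion_models: Sequence[Model],
             final_model: Model,
             cutoff: float = 0.5) -> bool:
    """Return True if the target is classified as a brushing user."""
    # Stage 1: each single algorithm scores the raw input features.
    preliminary = [model(features) for model in base_models]
    # Stage 2: each fusion algorithm scores the preliminary results.
    intermediate = [model(preliminary) for model in fusion_models]
    # Stage 3: the final model turns the intermediate results into one score.
    return final_model(intermediate) > cutoff


# Stand-in models, purely for illustration (assumptions, not trained models).
def mean(values: Vector) -> float:
    return sum(values) / len(values)


def first_is_high(values: Vector) -> float:
    return float(values[0] > 0.5)


def second_is_high(values: Vector) -> float:
    return float(values[1] > 0.5)


base = [first_is_high, second_is_high, mean]
fusion = [mean, max]
print(identify([0.9, 0.8], base, fusion, mean))
```

  • Keeping each stage behind the same callable interface lets any stage be replaced by a real trained model without changing the flow.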
  • Before step 205 is performed, the following steps may also be performed:
  • the server extracts the input features of a first user ID, where the first user ID is any one of M user IDs to be identified, and M is a positive integer;
  • the server uses the single-user rules to identify P brushing user IDs and the non-brushing user IDs among the M user IDs to be identified, where P is a positive integer less than or equal to M;
  • the server inputs the input features of the M user IDs to be identified into the initial binary classification model for training, and obtains M training results;
  • the server determines that the initial binary classification model after training is the trained binary classification model.
  • Whether each of the M user IDs to be identified is a brushing user can be determined by the single-user rules.
  • The single-user rules can include the following: (1) the same user ID logs in on multiple terminals (for example, mobile phones) within a short time; (2) multiple user IDs are registered and logged in on one terminal at the same time; (3) one terminal accesses the same URL continuously, or its number of visits far exceeds that of ordinary users.
  • Each of the M user IDs to be identified either satisfies all three single-user rules at the same time or satisfies none of them.
  • Among the M user IDs to be identified, a user ID that satisfies all three single-user rules at the same time is a brushing user ID, and a user ID that satisfies none of the three single-user rules is a non-brushing user ID. In other words, every one of the M user IDs to be identified can be classified as brushing or non-brushing by the single-user rules.
  • The brushing user IDs among the M user IDs to be identified serve as black samples for training the binary classification model, and the non-brushing user IDs serve as white samples. This ensures the accuracy of the initial training data and improves the training effect of the binary classification model.
  • the value of M can be as large as possible.
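  • The labeling of training samples by the single-user rules can be sketched as follows. The class `RuleHits` and the function `label_sample` are hypothetical names; how each rule is actually detected from logs is outside this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleHits:
    multi_terminal_login: bool      # rule (1): same ID on multiple terminals in a short time
    multi_account_terminal: bool    # rule (2): multiple IDs registered and logged in on one terminal
    abnormal_url_access: bool       # rule (3): continuous or far-above-normal access to one URL


def label_sample(hits: RuleHits) -> Optional[int]:
    """Return 1 for a black sample, 0 for a white sample, None if ambiguous.

    Only candidates that satisfy all three rules or none of them are used
    for training; anything in between is left out of the M samples.
    """
    flags = (hits.multi_terminal_login, hits.multi_account_terminal, hits.abnormal_url_access)
    if all(flags):
        return 1
    if not any(flags):
        return 0
    return None


print(label_sample(RuleHits(True, True, True)))
```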
  • This embodiment of the application provides a method for training the binary classification model.
  • First, the single-user rules are used to identify brushing users; the brushing users identified with higher confidence serve as black samples, and other normal users serve as white samples.
  • The binary classification model then makes predictions on these samples, and the accuracy of the prediction results is counted.
  • If a training result is wrong, the binary classification model is adjusted accordingly so that it does not make the same error next time.
  • When the accuracy of the binary classification model reaches the first preset accuracy threshold, training stops, and the initial binary classification model after training is determined to be the trained binary classification model.
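  • The loop of predicting, adjusting on errors, and stopping once an accuracy threshold is reached can be illustrated with a perceptron. The perceptron is a stand-in chosen for brevity, not the KNN, LR, or SVM models named in the text, and the sample data, the learning rate, and the function names are hypothetical.

```python
def predict(weights, bias, x):
    """Linear threshold prediction: 1 = brushing user, 0 = normal user."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_until_accurate(samples, labels, accuracy_threshold=0.95,
                         max_epochs=100, learning_rate=0.1):
    """Train until accuracy on the samples reaches the threshold or epochs run out."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    accuracy = 0.0
    for _ in range(max_epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)
            if error:
                # A wrong prediction adjusts the model so the same error is less likely.
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
        correct = sum(predict(weights, bias, x) == y for x, y in zip(samples, labels))
        accuracy = correct / len(samples)
        if accuracy >= accuracy_threshold:
            break  # the preset accuracy threshold has been reached
    return weights, bias, accuracy


# Hypothetical two-feature samples: black samples (1) and white samples (0).
samples = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]
weights, bias, accuracy = train_until_accurate(samples, labels, accuracy_threshold=1.0)
print(accuracy)
```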
  • Before step 206 is performed, the following steps may also be performed:
  • the server inputs the M training results into the initial classifier for calculation, and obtains M intermediate calculation results;
  • the server determines that the initial classifier after training is the trained classifier.
  • The embodiment of the present application provides a method for training the classifier. By training with the previously identified, higher-confidence brushing users as black samples and other normal users as white samples, a classifier with higher accuracy can be obtained.
  • Before step 206 is performed, the following steps may also be performed:
  • the server inputs the M intermediate calculation results into the initial neural network model for training, and obtains M recognition results;
  • the server determines that the initial neural network model after training is the trained neural network model.
  • The embodiment of the application provides a method for training the neural network model. By training with the previously identified, higher-confidence brushing users as black samples and other normal users as white samples, a neural network model with higher accuracy can be obtained.
  • FIG. 4 is a schematic flowchart of another user identification method disclosed in an embodiment of the present application.
  • Figure 4 is further optimized on the basis of Figure 2.
  • the user identification method includes the following steps.
  • 401. When a target user ID needs to be identified, the server determines whether N identified brushing groups exist. The N brushing groups are classified according to group user rules, the number of brushing user IDs contained in any one of the N brushing groups is greater than a preset number threshold, and N is a positive integer.
  • 402. If so, the server obtains the input characteristics of the target user ID. The input characteristics include user location characteristics, user APP usage characteristics, user equipment usage characteristics, and user click-through rate (CTR) characteristics.
  • 403. The server identifies the similarity between the target user ID and each of the N brushing groups based on the input characteristics of the target user ID.
  • 404. If among the N brushing groups there is a brushing group whose similarity with the target user ID is greater than a preset similarity threshold, the server determines that the target user ID is a brushing user ID.
  • 405. If there is no such brushing group, the server inputs the input characteristics of the target user ID into the trained binary classification model to obtain a preliminary classification result for the input characteristics of the target user ID.
  • 406. The server inputs the preliminary classification result into the trained classifier for calculation to obtain an intermediate calculation result, and inputs the intermediate calculation result into the trained neural network model to obtain the identification result of the target user ID.
  • For the specific implementation of step 401 to step 406, reference may be made to step 201 to step 206 shown in FIG. 2, which will not be repeated here.
  • 407. The server determines whether multiple identified brushing user IDs exist.
  • 408. If so, the server identifies the similarity between the target user ID and each of the identified brushing user IDs.
  • 409. If among the identified brushing user IDs there is one whose similarity with the target user ID is greater than the preset similarity threshold, the server adds a brushing-user-related feature to the input features of the target user ID, and then performs the operation of step 405, in which the server inputs the input features of the target user ID into the trained binary classification model to obtain the preliminary classification result for the input features of the target user ID.
  • the similarity between the target user ID and a single identified fake-traffic user ID can be calculated.
  • a similarity analysis algorithm can be used to judge the user: the similarity between the target user ID and the identified fake-traffic user IDs is calculated to enrich the input features of the target user ID, thereby improving the accuracy of identifying the target user ID and further determining whether the target user is indeed a fake-traffic user.
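  • One possible way to enrich the input features is sketched below. The application does not specify what the fake-traffic-user-related feature is, so this example simply appends the highest similarity to any identified fake-traffic user when it exceeds the threshold; the cosine similarity measure and the name `add_fake_user_feature` are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def add_fake_user_feature(
    target_features: Sequence[float],
    known_fake_users: List[Sequence[float]],
    similarity_threshold: float,
) -> List[float]:
    """Append a fake-traffic-user-related feature when a similar known user exists.

    Hypothetical choice: the added feature is the best similarity score itself.
    If no known user exceeds the threshold, the features are returned unchanged.
    """
    best = max(
        (cosine_similarity(target_features, user) for user in known_fake_users),
        default=0.0,
    )
    if best > similarity_threshold:
        return list(target_features) + [best]
    return list(target_features)
```

  • A model with a fixed input width would need a default value for the extra feature when no similar user is found; that detail is left out of this sketch.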
  • the embodiment of the present application may also use an unsupervised algorithm to identify fake-traffic groups, for example a clustering algorithm or an isolation forest algorithm to identify abnormal users in a group.
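  • The isolation forest idea can be illustrated with a simplified, standard-library-only sketch: points that are isolated by few random axis-aligned splits get a short average path length and are treated as anomalous. This is not a full isolation forest (there is no score normalization and no correction for unresolved leaves), and the function names, parameters and data layout are assumptions for illustration.

```python
import random
from typing import List


def _path_length(point: List[float], sample: List[List[float]],
                 rng: random.Random, depth: int = 0, max_depth: int = 8) -> int:
    """Number of random splits needed to isolate `point` within `sample`."""
    if len(sample) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))          # pick a random feature
    lo = min(row[dim] for row in sample)
    hi = max(row[dim] for row in sample)
    if lo == hi:                             # feature cannot split this sample
        return depth
    split = rng.uniform(lo, hi)              # pick a random split value
    if point[dim] < split:
        side = [row for row in sample if row[dim] < split]
    else:
        side = [row for row in sample if row[dim] >= split]
    return _path_length(point, side, rng, depth + 1, max_depth)


def average_path_lengths(data: List[List[float]], n_trees: int = 100,
                         sample_size: int = 64, seed: int = 0) -> List[float]:
    """Average isolation path length per point; shorter means more anomalous."""
    rng = random.Random(seed)
    scores = []
    for point in data:
        total = 0
        for _ in range(n_trees):
            sample = rng.sample(data, min(sample_size, len(data)))
            if point not in sample:
                sample = sample + [point]    # the scored point must be present
            total += _path_length(point, sample, rng)
        scores.append(total / n_trees)
    return scores
```

  • Users whose average path length falls well below that of the rest of the group would be candidates for abnormal users; how to choose that cut-off is not stated in the application.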
  • the server includes hardware structures and/or software modules corresponding to each function.
  • the present invention can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered as going beyond the scope of the present invention.
  • the embodiment of the present application may divide the server side into functional units according to the foregoing method examples.
  • each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit. It should be noted that the division of units in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 5 is a schematic structural diagram of a user identification device disclosed in an embodiment of the present application.
  • the user identification device 500 includes a first determination unit 501, an acquisition unit 502, an identification unit 503, and a second determination unit 504, wherein:
  • the first determining unit 501 is configured to determine whether there are N identified fake-traffic groups when user identification needs to be performed on the target user ID.
  • the N fake-traffic groups are classified according to group user rules, the number of fake-traffic user IDs contained in any one of the N fake-traffic groups is greater than the preset number threshold, and N is a positive integer;
  • the acquiring unit 502 is configured to acquire the input features of the target user ID when the first determining unit 501 determines that there are N identified fake-traffic groups, and the input features include user location features, user APP usage features, user equipment usage features, and user click-through rate (CTR) features;
  • the identification unit 503 is configured to identify the similarity between the target user ID and each of the N fake-traffic groups based on the input features of the target user ID;
  • the second determining unit 504 is configured to determine that the target user ID is a fake-traffic user ID when the identification unit 503 recognizes that, among the N fake-traffic groups, there is a group whose similarity with the target user ID is greater than a preset similarity threshold.
  • the user identification device 500 may further include a processing unit 505.
  • the processing unit 505 is configured to: if the identification unit 503 recognizes that, among the N fake-traffic groups, there is no group whose similarity with the target user ID is greater than the preset similarity threshold, input the input features of the target user ID into a trained binary classification model to obtain a preliminary classification result of the input features of the target user ID;
  • the processing unit 505 is further configured to input the preliminary classification result into the trained classifier for calculation to obtain an intermediate calculation result, and input the intermediate calculation result into the trained neural network model for training to obtain the target user ID recognition result.
  • the processing unit 505 is further configured to, before inputting the input features of the target user ID into the trained binary classification model to obtain the preliminary classification result of the input features of the target user ID, extract a first user ID
  • the first user ID is any one of the M user IDs to be identified, and M is a positive integer
  • the single-user rule is used to identify the fake-traffic user IDs and the non-fake-traffic user IDs among the M user IDs to be identified
  • the initial binary classification model after training is determined to be the trained binary classification model.
  • the processing unit 505 is further configured to, before inputting the preliminary classification result into the trained classifier for calculation to obtain the intermediate calculation result, input the M training results into the initial classifier for calculation to obtain M intermediate calculation results; when the accuracy of the M intermediate calculation results reaches a second preset accuracy threshold, it is determined that the initial classifier after training is the trained classifier.
  • the processing unit 505 is further configured to, before inputting the intermediate calculation results into the trained neural network model for training to obtain the identification result of the target user ID, input the M intermediate calculation results into the initial neural network model for training to obtain M recognition results;
  • the initial neural network model after training is determined to be the trained neural network model.
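  • The "train until a preset accuracy threshold is reached" criterion used for each model stage can be sketched generically. The example below uses a simple perceptron purely as a stand-in; the application does not disclose the actual model types, the update rule or the threshold values, so `train_until_accurate`, `predict`, `max_epochs` and `learning_rate` are hypothetical.

```python
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Binary prediction: 1 (black sample) if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_until_accurate(
    samples: List[Sequence[float]],
    labels: List[int],
    accuracy_threshold: float,
    max_epochs: int = 1000,
    learning_rate: float = 0.1,
) -> Tuple[List[float], float, float]:
    """Train an initial model until its accuracy reaches the preset threshold."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        # One pass of perceptron updates over the M labeled samples.
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
        # Accuracy of the M results produced by the current model.
        correct = sum(predict(weights, bias, x) == y for x, y in zip(samples, labels))
        accuracy = correct / len(samples)
        if accuracy >= accuracy_threshold:
            # Threshold reached: the initial model is now the trained model.
            return weights, bias, accuracy
    raise RuntimeError("accuracy threshold not reached within max_epochs")
```

  • The same stopping rule would apply to the binary classification model, the classifier and the neural network model, each with its own preset accuracy threshold.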
  • the group user rule is determined based on the location of the device corresponding to the user ID and the time series of application usage corresponding to the user ID, and the processing unit 505 is further configured to determine whether there is an existing user ID in the first determining unit 501.
  • the distance between the corresponding device positions of the multiple identified fake-traffic user IDs is smaller than the preset distance threshold, and the time series of application usage of the multiple fake-traffic user IDs
  • the fake-traffic user IDs within the first preset time period are classified into the first type of fake-traffic group.
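  • One possible reading of this group user rule is sketched below: identified fake-traffic users whose devices are close together and who used the application within the same preset time period are placed in one group. The haversine distance, the greedy grouping strategy, and the names `KnownFakeUser` and `first_type_group` are assumptions for illustration; the application does not specify them.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class KnownFakeUser:
    user_id: str
    lat: float                 # device latitude in degrees
    lon: float                 # device longitude in degrees
    usage_times: List[float]   # application-usage timestamps in seconds


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on Earth, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def first_type_group(users: List[KnownFakeUser], distance_threshold_km: float,
                     window_start: float, window_end: float) -> List[str]:
    """Greedily collect users active in the time window whose devices are mutually close."""
    active = [u for u in users
              if any(window_start <= t <= window_end for t in u.usage_times)]
    group: List[KnownFakeUser] = []
    for user in active:
        # Join only if within the distance threshold of every current member.
        if all(haversine_km(user.lat, user.lon, m.lat, m.lon) < distance_threshold_km
               for m in group):
            group.append(user)
    return [u.user_id for u in group]
```

  • The greedy pass is seeded by the first active user, so the result depends on input order; a clustering step could replace it where that matters.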
  • the processing unit 505 is further configured to: when the first determining unit 501 determines that there are no identified N fake-traffic groups, determine whether there are multiple identified fake-traffic user IDs; if so, identify the similarity between the target user ID and the multiple identified fake-traffic user IDs; if, among the multiple fake-traffic user IDs, there is a fake-traffic user ID whose similarity to the target user ID is greater than the preset similarity threshold, add a fake-traffic-user-related feature to the input features of the target user ID; and input the input features of the target user ID into the trained binary classification model to obtain the preliminary classification result of the input features of the target user ID.
  • the first determining unit 501, acquiring unit 502, identifying unit 503, second determining unit 504, and processing unit 505 in FIG. 5 may be processors.
  • when user identification is performed on the target user ID, the similarity between the target user and the identified fake-traffic groups can be evaluated. If the similarity is greater than the preset similarity threshold, the target user ID can be directly identified as a fake-traffic user ID. Since fake-traffic users often share the characteristics of a fake-traffic group, similarity identification against the fake-traffic groups can quickly and accurately determine whether the target user ID is a fake-traffic user ID, thereby improving the accuracy of identifying fake-traffic users.
  • FIG. 6 is a schematic structural diagram of a server disclosed in an embodiment of the present application.
  • the server 600 includes a processor 601 and a memory 602.
  • the server 600 may also include a bus 603.
  • the processor 601 and the memory 602 may be connected to each other through the bus 603.
  • the bus 603 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc.
  • the bus 603 can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 6.
  • the server 600 may further include an input communication interface 604, and the communication interface 604 may obtain data from an external device (for example, other servers or databases).
  • the memory 602 is used to store one or more programs containing instructions; the processor 601 is used to call the instructions stored in the memory 602 to execute some or all of the method steps in FIGS. 1 to 4.
  • when the target user ID is identified, the similarity between the target user and the identified fake-traffic groups can be evaluated. If the similarity is greater than the preset similarity threshold, the target user ID can be directly identified as a fake-traffic user ID. Since fake-traffic users often share the characteristics of a fake-traffic group, similarity identification against the fake-traffic groups can quickly and accurately determine whether the target user ID is a fake-traffic user ID, thereby improving the accuracy of identifying fake-traffic users.
  • An embodiment of the present application also provides a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute some or all of the steps of any user identification method described in the above method embodiments.
  • the embodiments of the present application also provide a computer program product.
  • the computer program product includes a non-transitory computer-readable storage medium storing a computer program.
  • the computer program is operable to cause a computer to execute some or all of the steps of any user identification method described in the foregoing method embodiments.
  • the disclosed device may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable memory.
  • the technical solution of the present invention essentially, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a memory and includes a number of instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the method described in each embodiment of the present invention.
  • the aforementioned memory includes: U disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), mobile hard disk, magnetic disk or optical disk and other various media that can store program codes.
  • the program can be stored in a computer-readable memory, and the memory can include: flash disk, read-only memory (ROM), random access memory (RAM), magnetic disk or optical disc, etc.

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

According to some embodiments, the present invention relates to a user identification method and a related product. The method comprises: when user identification needs to be performed on a target user ID, determining whether there are N identified fake-traffic groups, the N fake-traffic groups being classified according to group user rules, and the number of fake-traffic user IDs contained in any one of the N fake-traffic groups being greater than a preset number threshold; if so, obtaining input features of the target user ID, the input features comprising a user location feature, a user application usage feature, a user equipment usage feature, and a user click-through rate (CTR) feature; identifying a similarity between the target user ID and each of the N fake-traffic groups based on the input features of the target user ID; and if, among the N fake-traffic groups, there is a fake-traffic group whose similarity to the target user ID is greater than a preset similarity threshold, determining that the target user ID is a fake-traffic user ID. The embodiments of the present invention can improve the accuracy of identifying fake-traffic users.
PCT/CN2019/092592 2019-06-24 2019-06-24 User identification method and related product WO2020257991A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980091203.7A CN113383362B (zh) 2019-06-24 2019-06-24 User identification method and related product
PCT/CN2019/092592 WO2020257991A1 (fr) 2019-06-24 2019-06-24 User identification method and related product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/092592 WO2020257991A1 (fr) 2019-06-24 2019-06-24 User identification method and related product

Publications (1)

Publication Number Publication Date
WO2020257991A1 true WO2020257991A1 (fr) 2020-12-30

Family

ID=74061199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/092592 WO2020257991A1 (fr) 2019-06-24 2019-06-24 User identification method and related product

Country Status (2)

Country Link
CN (1) CN113383362B (fr)
WO (1) WO2020257991A1 (fr)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
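CN111930995A (zh) * 2020-08-18 2020-11-13 湖南快乐阳光互动娱乐传媒有限公司 Data processing method and apparatus
CN112819527A (zh) * 2021-01-29 2021-05-18 百果园技术(新加坡)有限公司 User grouping processing method and apparatus
CN113947139A (zh) * 2021-10-13 2022-01-18 咪咕视讯科技有限公司 User identification method, apparatus and device
CN114466214A (zh) * 2022-02-09 2022-05-10 上海哔哩哔哩科技有限公司 Method and apparatus for counting the number of people in a live-streaming room
CN114679600A (zh) * 2022-03-24 2022-06-28 上海哔哩哔哩科技有限公司 Data processing method and apparatus
CN114926221A (zh) * 2022-05-31 2022-08-19 北京奇艺世纪科技有限公司 Cheating user identification method and apparatus, electronic device and storage medium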

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113704566B (zh) * 2021-10-29 2022-01-18 贝壳技术有限公司 Identification number subject identification method, storage medium and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100100486A1 (en) * 2008-10-17 2010-04-22 At&T Mobility Ii Llc User terminal and wireless item-based credit card authorization servers, systems, methods and computer program products
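CN106651475A (zh) * 2017-02-22 2017-05-10 广州万唯邑众信息科技有限公司 Method and system for identifying fake traffic in mobile video advertising
CN107169769A (zh) * 2016-03-08 2017-09-15 广州市动景计算机科技有限公司 Method and apparatus for identifying fake traffic of an application
CN109241343A (zh) * 2018-07-27 2019-01-18 北京奇艺世纪科技有限公司 System, method and apparatus for identifying fake-traffic users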

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
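CN106294508B (zh) * 2015-06-10 2020-02-11 深圳市腾讯计算机系统有限公司 Method and apparatus for detecting fake-traffic tools
CN104932966B (zh) * 2015-06-19 2017-09-15 广东欧珀移动通信有限公司 Method and apparatus for detecting fake download volume of application software
CN106612202A (zh) * 2015-10-27 2017-05-03 网易(杭州)网络有限公司 Method and system for estimating and discriminating fake traffic in online game channels
CN106022834B (zh) * 2016-05-24 2020-04-07 腾讯科技(深圳)有限公司 Advertisement anti-cheating method and apparatus
CN107634952B (zh) * 2017-09-22 2020-12-08 Oppo广东移动通信有限公司 Fake-traffic resource determination method and apparatus, service device, mobile terminal and storage medium
CN108921581B (zh) * 2018-07-18 2021-07-02 北京三快在线科技有限公司 Method and apparatus for identifying fake-order operations, and computer-readable storage medium
CN109525595B (zh) * 2018-12-25 2021-04-16 广州方硅信息技术有限公司 Method and device for identifying black-market accounts based on time-flow features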

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
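CN111930995A (zh) * 2020-08-18 2020-11-13 湖南快乐阳光互动娱乐传媒有限公司 Data processing method and apparatus
CN112819527A (zh) * 2021-01-29 2021-05-18 百果园技术(新加坡)有限公司 User grouping processing method and apparatus
CN112819527B (zh) * 2021-01-29 2024-05-24 百果园技术(新加坡)有限公司 User grouping processing method and apparatus
CN113947139A (zh) * 2021-10-13 2022-01-18 咪咕视讯科技有限公司 User identification method, apparatus and device
CN114466214A (zh) * 2022-02-09 2022-05-10 上海哔哩哔哩科技有限公司 Method and apparatus for counting the number of people in a live-streaming room
CN114679600A (zh) * 2022-03-24 2022-06-28 上海哔哩哔哩科技有限公司 Data processing method and apparatus
CN114926221A (zh) * 2022-05-31 2022-08-19 北京奇艺世纪科技有限公司 Cheating user identification method and apparatus, electronic device and storage medium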

Also Published As

Publication number Publication date
CN113383362A (zh) 2021-09-10
CN113383362B (zh) 2022-05-13

Similar Documents

Publication Publication Date Title
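WO2020257991A1 (fr) User identification method and related product
US11138381B2 (en) Method, computer device and readable medium for user's intent mining
CN104091276B (zh) Method for online analysis of clickstream data, and related apparatus and system
CN106919661B (zh) Emotion type recognition method and related apparatus
CN106339507B (zh) Streaming media message push method and apparatus
CN110475155B (zh) Live video popularity state recognition method, apparatus, device and readable medium
CN110442712B (zh) Risk determination method, apparatus, server and text review system
WO2020093289A1 (fr) Resource recommendation method and apparatus, electronic device and storage medium
CN109509010B (zh) Multimedia information processing method, terminal and storage medium
CN104281622A (zh) Information recommendation method and apparatus in social media
WO2015120798A1 (fr) Method for processing network multimedia information and related system
CN108112038B (zh) Method and apparatus for controlling access traffic
CN104750760B (zh) Implementation method and apparatus for recommending application software
CN103336766A (zh) Short-text spam recognition and modeling method and apparatus
CN113111264B (zh) Interface content display method and apparatus, electronic device and storage medium
WO2023000491A1 (fr) Application recommendation method, apparatus and device, and computer-readable storage medium
CN111523035B (zh) Recommendation method, apparatus, server and medium for APP browsing content
CN113505272B (zh) Control method and apparatus based on behavior habits, electronic device and storage medium
US20200394448A1 (en) Methods for more effectively moderating one or more images and devices thereof
CN113127746A (zh) Information push method based on user chat content analysis and related device
CN112884529A (zh) Advertisement bidding method, apparatus, device and medium
WO2018171288A1 (fr) Information flow labeling method and apparatus, terminal device and storage medium
CN111126071A (zh) Method and apparatus for determining question text data, and data processing method for customer service group
CN113010785A (zh) User recommendation method and device
CN110460593B (zh) Network address identification method, apparatus and medium for mobile traffic gateway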

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19934765

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 15/02/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19934765

Country of ref document: EP

Kind code of ref document: A1