CN114676740A - User identification method, device, equipment and storage medium - Google Patents


Info

Publication number
CN114676740A
Authority
CN
China
Prior art keywords
target
application program
association
user
installation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110583938.0A
Other languages
Chinese (zh)
Inventor
周远远
吴春成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Cloud Computing Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing Beijing Co Ltd
Priority to CN202110583938.0A
Publication of CN114676740A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The embodiments of the present application disclose a user identification method, apparatus, device and storage medium. The method comprises: acquiring a sample user set for a target service, a target application program installed by a sample user in the sample user set, user attributes of the sample user, and a reference application program; counting a first installation proportion corresponding to the reference application program installed by the sample users in the sample user set, and performing association category division on the association relationship between the reference application program and the target service according to the first installation proportion to obtain a reference association category; if the target application program matches the reference application program, taking the reference association category as the target association category corresponding to the target application program; and adjusting a user identification model by using the target association category and the user attributes of the sample user to obtain a target user identification model. The method can improve the discrimination between application programs and the identification precision of the user identification model.

Description

User identification method, device, equipment and storage medium
Technical Field
The present application relates to the technical field of machine learning within artificial intelligence, and in particular, to a user identification method, apparatus, device, and storage medium.
Background
With the rapid development of internet technology, more and more users participate in network activities, so the volume of user behavior data on the internet has grown substantially and that data has become increasingly valuable. For example, behavior data about the applications a user installs is used to train a user recognition model, and content of interest, such as network courses, games, audio-video memberships and coupons, is recommended to the user based on the recognition results output by the user recognition model.
Practice shows that when a user recognition model is trained with the same general features for all business scenarios, its recognition accuracy is not high. For example, in the "credit overdue" scenario, users who install 741-type applications in the "finance and financing" category have a high overdue rate, whereas users who install other finance and financing applications, such as eastern wealth and continental institute, have a low overdue rate. In this case, if the user recognition model is trained with the general "finance and financing" classification feature, the discrimination between application programs is relatively low, which reduces the recognition accuracy of the user recognition model.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present application is to provide a user identification method, apparatus, device and storage medium, which can improve the discrimination of an application program and improve the identification precision of a user identification model.
An embodiment of the present application provides a user identification method, including:
acquiring a sample user set for a target service, a target application program installed by a sample user in the sample user set, user attributes of the sample user, and a reference application program having an association relationship with the target service;
counting a first installation proportion corresponding to the reference application program installed by the sample users in the sample user set, and performing association category division on the association relationship between the reference application program and the target service according to the first installation proportion to obtain a reference association category;
if the target application program is matched with the reference application program, determining that the target application program has an association relation with the target service, and taking the reference association category as a target association category corresponding to the association relation between the target application program and the target service;
and adjusting the user identification model by adopting the target association category and the user attribute of the sample user to obtain a target user identification model for identifying the target user associated with the target service.
An embodiment of the present application provides a user identification apparatus, including:
an acquisition module, configured to acquire a sample user set for a target service, a target application program installed by a sample user in the sample user set, a user attribute of the sample user, and a reference application program having an association relationship with the target service;
a dividing module, configured to count a first installation proportion corresponding to the reference application program installed by the sample users in the sample user set, and perform association category division on the association relationship between the reference application program and the target service according to the first installation proportion to obtain a reference association category;
a determining module, configured to determine that an association relationship exists between the target application program and the target service if the target application program is matched with the reference application program, and use the reference association category as a target association category corresponding to the association relationship between the target application program and the target service;
and an adjusting module, configured to adjust the user identification model by using the target association category and the user attribute of the sample user, to obtain a target user identification model for identifying the target user associated with the target service.
One aspect of the present application provides a computer device, including: a processor and a memory;
wherein, the memory is used for storing computer programs, and the processor is used for calling the computer programs to execute the following steps:
acquiring a sample user set for a target service, a target application program installed by a sample user in the sample user set, user attributes of the sample user, and a reference application program having an association relationship with the target service;
counting a first installation proportion corresponding to the reference application program installed by the sample users in the sample user set, and performing association category division on the association relationship between the reference application program and the target service according to the first installation proportion to obtain a reference association category;
if the target application program is matched with the reference application program, determining that the target application program has an association relation with the target service, and taking the reference association category as a target association category corresponding to the association relation between the target application program and the target service;
and adjusting the user identification model by adopting the target association category and the user attribute of the sample user to obtain a target user identification model for identifying the target user associated with the target service.
An aspect of the embodiments of the present application provides a computer-readable storage medium in which a computer program is stored, where the computer program includes program instructions that, when executed by a processor, perform the following steps:
acquiring a sample user set for a target service, a target application program installed by a sample user in the sample user set, user attributes of the sample user, and a reference application program having an association relationship with the target service;
counting a first installation proportion corresponding to the reference application program installed by the sample users in the sample user set, and performing association type division on the association relationship between the reference application program and the target service according to the first installation proportion to obtain a reference association type;
if the target application program is matched with the reference application program, determining that the target application program has an association relation with the target service, and taking the reference association category as a target association category corresponding to the association relation between the target application program and the target service;
and adjusting the user identification model by adopting the target association category and the user attribute of the sample user to obtain a target user identification model for identifying the target user associated with the target service.
In the present application, the electronic device first counts the first installation proportion corresponding to the reference application programs (i.e., application programs on the market) installed by the sample users, and performs association category division on the association relationship between each reference application program and the target service according to the first installation proportion, so as to obtain a reference association category. Then, when the target application program matches the reference application program, it is determined that the target application program has an association relationship with the target service, and the reference association category is taken as the target association category corresponding to the association relationship between the target application program and the target service. Because the target association category is obtained by matching the target application program with the reference application program, there is no need to calculate a first installation proportion for the target application program installed by each sample user, which reduces the amount of calculation and improves the efficiency of obtaining the target association category corresponding to the target application program. Further, the target association category and the user attributes of the sample users can be used as features to train the user identification model and obtain a target user identification model. In different application scenarios, the reference association categories corresponding to the reference application programs differ, so the target association categories corresponding to the target application programs also differ.
That is to say, the method and the device can construct the characteristics of the user identification model in a dynamic self-adaptive manner according to the service scene, can improve the distinguishing degree between application programs, and further improve the identification precision of the user identification model.
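The matching and feature-construction steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the present application: the application names, the category labels, the feature names `n_positive_apps` and `n_negative_apps`, and the choice to count matched applications per category are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from reference application to its reference association
# category for one target service (names and categories are illustrative only).
REFERENCE_CATEGORIES = {
    "loan_app_a": "positive",
    "wealth_app_b": "negative",
}


def target_categories(installed_apps, reference_categories):
    """Return {app: category} for installed apps that match a reference app.

    A target application inherits the reference association category of the
    reference application it matches; unmatched applications are skipped.
    """
    return {
        app: reference_categories[app]
        for app in installed_apps
        if app in reference_categories
    }


def build_features(user_attributes, installed_apps, reference_categories):
    """Combine user attributes with per-category counts of matched apps.

    Counting matches per category is an assumed encoding; other encodings
    of the target association category are equally possible.
    """
    matched = target_categories(installed_apps, reference_categories)
    counts = Counter(matched.values())
    features = dict(user_attributes)
    features["n_positive_apps"] = counts.get("positive", 0)
    features["n_negative_apps"] = counts.get("negative", 0)
    return features


features = build_features(
    {"age": 30, "gender": "F"},
    ["loan_app_a", "chat_app_c"],
    REFERENCE_CATEGORIES,
)
print(features)
```

Because the lookup only consults the precomputed reference mapping, no installation proportion has to be recomputed for each sample user's applications, which is the efficiency argument made above.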
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic architecture diagram of a user identification system provided in the present application;
FIG. 2a is a schematic diagram of an application category tree provided herein;
FIG. 2b is a schematic diagram of an application category tree provided herein;
FIG. 3a is a schematic diagram of a data interaction scenario provided in the present application;
FIG. 3b is a schematic diagram of a data interaction scenario provided by the present application;
FIG. 4 is a schematic flow chart of a user identification method provided in the present application;
FIG. 5 is a schematic diagram of a scenario for training a user recognition model according to the present application;
FIG. 6 is a schematic diagram of a scenario for determining a target association category through an application category tree according to the present application;
FIG. 7 is a schematic diagram of a scenario for determining a target association category through an application category tree according to the present application;
FIG. 8 is a schematic diagram of a scenario for training a user recognition model according to the present application;
FIG. 9 is a schematic structural diagram of a user identification device according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the implementation method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operating/interactive systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, and the like.
Machine Learning (ML) is a multi-disciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and how it reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence, is the fundamental way to make computers intelligent, and is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
The user identification method provided by the embodiment of the application mainly relates to an artificial intelligence-machine learning technology, namely, a target association type corresponding to an association relation between a target application program and a target service is obtained by analyzing the target application program installed by a sample user, and a target user identification model for identifying the target user associated with the target service is obtained by training a user identification model by taking the target association type and the user attribute of the sample user as characteristics. Therefore, the target association category is constructed in a dynamic self-adaptive manner according to the service scene, so that the discrimination of the application program can be improved, and the identification precision of the user identification model can be further improved.
To explain the present application more clearly, the user identification system used to implement the user identification method is introduced first. As shown in FIG. 1, the user identification system includes a server 10 and a plurality of terminals, for example the four terminals in FIG. 1: terminal 11, terminal 12, terminal 13 and terminal 14. Each terminal is connected to the server 10 through a network, so that each terminal can communicate with the server 10 over the network connection.
The server 10 may be a backend device for providing an application program for a user, and in this application, the application program provided by the server may be referred to as a candidate application program, that is, the candidate application program may be an application program that is sold in the market, and the candidate application program may be downloaded and installed by the user; candidate applications may include game applications, payment applications, shopping applications, multimedia applications (such as audio-video applications), educational applications, and the like. The server 10 may also be used to record target applications downloaded and installed by the sample user from the server 10, that is, the target application may refer to any one of the applications installed by the sample user; further, the server 10 may be configured to analyze a target application installed by the sample user, train a user identification model by using the analysis result and the user attribute of the sample user as features, and obtain a target user identification model for identifying a target user associated with the target service. The server 10 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like.
It is understood that the candidate application in the server 10 having an association relationship with the target service may be referred to as a reference application, i.e., the reference application may refer to an application having a significant characteristic with respect to the target service, i.e., the reference application refers to an application having a positive or negative impact on the target service. The target service may refer to a service scenario applied in the present application, that is, the target service may include a game service scenario, an education service scenario, a credit overdue service scenario, and the like.
For example, a credit overdue business scenario refers to a business for identifying users with credit overdue characteristics. In this scenario, users who have installed 741-type applications generally have a relatively high overdue probability, that is, 741-type applications have a positive influence on the credit overdue business scenario. Users who have installed legal applications have a low overdue probability, that is, legal applications have a negative influence on the credit overdue scenario. In other words, both 741-type applications and legal applications have significant characteristics in the credit overdue scenario. By contrast, the installation proportion of a certain social application differs little between users with a high overdue probability and users with a low overdue probability, that is, the social application has no significant characteristics in the credit overdue scenario.
It can be understood that, for ease of distinction, the association category corresponding to the association relationship between the reference application program and the target service is referred to as a reference association category. The reference association category includes positive correlation and negative correlation. Positive correlation means that the reference application program has a positive influence on the target service, that is, a user who has installed a reference application program positively correlated with the target service has a relatively high probability of becoming a target user of the target service. Negative correlation means that the reference application program has a negative influence on the target service, that is, a user who has installed a reference application program negatively correlated with the target service has a relatively low probability of becoming a target user of the target service.
For example, in a credit overdue service scenario, the association relationship between 741-type applications and the credit overdue service is referred to as positive correlation, and the association relationship between legal applications and the credit overdue service is referred to as negative correlation. Furthermore, positive correlation can be subdivided into first-level positive correlation, second-level positive correlation, and so on, according to the degree of association between the reference application program and the target service; the degree of association between a reference application program belonging to first-level positive correlation and the target service is greater than that between a reference application program belonging to second-level positive correlation and the target service. Similarly, negative correlation can be subdivided into first-level negative correlation and second-level negative correlation; the degree of association between a reference application program belonging to first-level negative correlation and the target service is smaller than that between a reference application program belonging to second-level negative correlation and the target service.
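The category levels described above can be illustrated with a small sketch. It assumes, purely for illustration, that the category is derived from the gap between an application's installation proportion among positive sample users and among negative sample users; the function name `classify_association`, the thresholds `significant` and `strong`, and the mapping of a larger negative gap to the first-level negative category are hypothetical and are not taken from the present application.

```python
from enum import Enum


class AssociationCategory(Enum):
    FIRST_LEVEL_POSITIVE = "first-level positive correlation"
    SECOND_LEVEL_POSITIVE = "second-level positive correlation"
    FIRST_LEVEL_NEGATIVE = "first-level negative correlation"
    SECOND_LEVEL_NEGATIVE = "second-level negative correlation"


def classify_association(share_in_positive, share_in_negative,
                         significant=0.05, strong=0.20):
    """Map two installation proportions to an association category.

    Returns None when the gap is too small to be significant, i.e. the
    application is not treated as a reference application. Both thresholds
    are assumed values chosen only for this example.
    """
    gap = share_in_positive - share_in_negative
    if abs(gap) < significant:
        return None
    if gap > 0:
        if gap >= strong:
            return AssociationCategory.FIRST_LEVEL_POSITIVE
        return AssociationCategory.SECOND_LEVEL_POSITIVE
    if -gap >= strong:
        return AssociationCategory.FIRST_LEVEL_NEGATIVE
    return AssociationCategory.SECOND_LEVEL_NEGATIVE


print(classify_association(0.40, 0.10))
print(classify_association(0.30, 0.28))
```

An application such as the social application in the credit overdue example, whose installation proportion differs little between the two groups, would fall below the `significant` threshold and receive no category.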
Optionally, the server 10 may determine a service relationship between the reference application program and the target service according to the service keywords of the target service. The service relationship includes a homogeneous relationship and a heterogeneous relationship: a homogeneous relationship means that the reference application program and the target service belong to the same service scenario, and a heterogeneous relationship means that they do not belong to the same service scenario. For example, in an educational business scenario, the business keywords may include training, learning, teaching, and the like; if the reference application includes a business keyword such as learning or training, the reference application is said to belong to the educational business scenario, and otherwise it is said not to belong to the educational business scenario. The business relationship between the reference application and the target business can be used for explaining the recognition result output by the user recognition model.
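The keyword check described above can be sketched as follows. The text does not say which field of an application is searched for keywords, so this example assumes a free-text description; the function name `business_relationship` and the sample descriptions are hypothetical.

```python
# Business keywords for the educational scenario, as listed in the text.
EDUCATION_KEYWORDS = ("training", "learning", "teaching")


def business_relationship(app_description, service_keywords):
    """Return 'homogeneous' if the description contains any service keyword.

    Matching is a case-insensitive substring test, which is an assumed
    simplification of however the keywords are actually compared.
    """
    text = app_description.lower()
    if any(keyword in text for keyword in service_keywords):
        return "homogeneous"
    return "heterogeneous"


print(business_relationship("Online learning platform for kids", EDUCATION_KEYWORDS))
print(business_relationship("Mobile payment wallet", EDUCATION_KEYWORDS))
```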
The application category tree in the present application is a network for describing the association category corresponding to the association relationship between a reference application and the target service, as well as the service relationship between the reference application and the target service. The application category tree comprises a root node and a plurality of leaf nodes. The root node is used for storing the reference applications having an association relationship with the target service, that is, the applications having significant characteristics, and the leaf nodes are used for storing the reference applications belonging to different association categories.
For example, as shown in FIG. 2a, the application category tree 1 includes three layers. The first layer includes a root node 15 for storing all reference applications. The second layer includes a first leaf node 16 and a second leaf node 17: the first leaf node 16 stores the reference applications whose reference association category with the target service is positive correlation, and the second leaf node 17 stores the reference applications whose reference association category with the target service is negative correlation. The third layer includes a third leaf node 18 and a fourth leaf node 19: the third leaf node 18 stores the reference applications in the first leaf node 16 whose reference association category is first-level positive correlation, and the fourth leaf node 19 stores the reference applications in the first leaf node 16 whose reference association category is second-level positive correlation.
Optionally, as shown in FIG. 2b, the application category tree 2 includes four layers. The first layer includes a root node 20 for storing all reference applications. The second layer includes a first leaf node 21 and a second leaf node 22: the first leaf node 21 stores the reference applications whose reference association category with the target service is positive correlation, and the second leaf node 22 stores the reference applications whose reference association category with the target service is negative correlation. The third layer includes a third leaf node 23, a fourth leaf node 24, a fifth leaf node 25 and a sixth leaf node 26: the third leaf node 23 stores the reference applications in the first leaf node 21 that have a homogeneous relationship with the target service, and the fourth leaf node 24 stores the reference applications in the first leaf node 21 that have a heterogeneous relationship with the target service. The fifth leaf node 25 stores the reference applications in the second leaf node 22 that have a homogeneous relationship with the target service, and the sixth leaf node 26 stores the reference applications in the second leaf node 22 that have a heterogeneous relationship with the target service.
The fourth layer includes a seventh leaf node 27 and an eighth leaf node 28: the seventh leaf node 27 stores the reference applications in the third leaf node 23 whose reference association category is first-level positive correlation, and the eighth leaf node 28 stores the reference applications in the third leaf node 23 whose reference association category is second-level positive correlation. Of course, the reference applications in the fourth leaf node 24, the fifth leaf node 25 and the sixth leaf node 26 may be divided in the same way as those in the third leaf node 23 to obtain more leaf nodes, which is not described again here.
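An application category tree of this kind can be represented with a simple recursive node type. The sketch below builds a three-layer tree in the spirit of FIG. 2a, assuming the third layer subdivides the positive-correlation node; the class name `CategoryNode`, the method `path_to`, and the application names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryNode:
    label: str
    apps: set = field(default_factory=set)
    children: list = field(default_factory=list)

    def add_child(self, label, apps):
        """Attach and return a child node storing a subset of this node's apps."""
        child = CategoryNode(label, set(apps))
        self.children.append(child)
        return child

    def path_to(self, app):
        """Return labels from this node down to the deepest node storing app."""
        if app not in self.apps:
            return []
        for child in self.children:
            sub_path = child.path_to(app)
            if sub_path:
                return [self.label] + sub_path
        return [self.label]


root = CategoryNode("all reference applications", {"app_a", "app_b", "app_c"})
positive = root.add_child("positive correlation", {"app_a", "app_b"})
negative = root.add_child("negative correlation", {"app_c"})
positive.add_child("first-level positive correlation", {"app_a"})
positive.add_child("second-level positive correlation", {"app_b"})

print(root.path_to("app_a"))
```

Looking up a target application then amounts to walking from the root to the deepest node that stores it; an application absent from the root has no association relationship with the target service and yields an empty path.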
The terminal may refer to a user-facing device that a sample user uses to download and install a target application from the server, that is, the target application may refer to any one of the above candidate applications. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, a smart in-vehicle device, a smart television, and the like. Each terminal and the server may be directly or indirectly connected through wired or wireless communication, and the present application is not limited thereto.
The user attributes in the present application may include age, gender, household registration, and the like. A sample user may refer to a user to whom data used for training the user recognition model belongs. Sample users may include positive sample users and negative sample users: positive sample users may refer to the target group, negative sample users may refer to the non-target group, and the target group may refer to the set of users of concern in a business scenario. In the present application, a positive sample user may refer to a user who is interested in a certain service, and a negative sample user refers to a user who is not interested in that service. For example, in a game business scenario, a positive sample user may be a user interested in the game, such as a user who has installed a game application; a negative sample user refers to a user who is not interested in the game, such as a user who has not installed a game application.
For easy understanding, please refer to fig. 3a and fig. 3b, which are schematic diagrams illustrating a data interaction scenario provided in an embodiment of the present application. The application server shown in fig. 3a and 3b may be the server 10, and the terminal shown in fig. 3a and 3b may be any one of the terminals 11 to 14 in the embodiment corresponding to fig. 1, for example, the terminal may be the terminal 11.
As shown in fig. 3a and fig. 3b, an educational service scenario is taken as an example. When a certain programming network course needs to be pushed to users, the target users interested in the programming network course need to be acquired so that the pushing is accurate and inexpensive. Recommending the programming network course only to these target users can improve the pushing effect and reduce the pushing cost.
Specifically, s1, the server obtains a sample user set for the programming network course, where the sample user in the sample user set may refer to a user who downloads and installs the application program in the server, and the sample user set includes a positive sample user and a negative sample user. The positive sample users may refer to users who have purchased programming network courses, users who have installed programming applications, and correspondingly, the negative sample users may refer to users in the sample user set other than the positive sample users.
s2, the server builds an application class tree.
a. The server may obtain the installation shares of the candidate application in the positive and negative sample users. The installation share of the candidate application in the positive sample user refers to a proportion of the positive sample user in which the candidate application is installed, and the installation share of the candidate application in the positive sample user can be calculated according to the following formula (1).
$$P1 = \frac{Z}{R1} \tag{1}$$
In formula (1), P1 represents the installation share of the positive sample user for installing the candidate application, Z represents the number of users of the positive sample user for installing the candidate application, that is, the installation amount of the candidate application in the positive sample user, and R1 represents the number of positive sample users. Similarly, the installation share of the candidate application in the negative sample user refers to the proportion of the negative sample user in which the candidate application is installed, and the installation share of the candidate application in the negative sample user can be calculated according to the following formula (2).
$$P2 = \frac{H}{R2} \tag{2}$$
In the formula (2), P2 represents the installation share of the negative sample user for installing the candidate application, H represents the number of users of the negative sample user for installing the candidate application, that is, the installation amount of the candidate application in the negative sample user, and R2 represents the number of negative sample users.
b. The server may determine, according to the installation shares of the candidate application among the positive sample users and the negative sample users, the installation proportion with which the sample users install the candidate application. This installation proportion may be referred to as a significance index (also called a saliency index), and it may be calculated according to the following formula (3).
Figure BDA0003087377120000101
In formula (3), Q represents the installation proportion with which the sample users install the candidate application.
c. The candidate applications having the significance characteristic are selected as reference applications and added to the root node of the application category tree. The larger the value of Q, the more of the positive sample users install the candidate application; the smaller the value of Q, the fewer of the positive sample users install it. When the significance index is greater than the first installation proportion threshold, the candidate application has a positive influence on the target service; when the significance index is less than the second installation proportion threshold, the candidate application has a negative influence on the target service; when the significance index is less than or equal to the first installation proportion threshold and greater than or equal to the second installation proportion threshold, the candidate application has little or no influence on the target service and can be ignored. Therefore, the candidate applications whose significance index is greater than the first installation proportion threshold and the candidate applications whose significance index is less than the second installation proportion threshold can be screened out as reference applications having the significance characteristic, and these reference applications are added to the root node of the application category tree. The first installation proportion threshold is greater than the second installation proportion threshold. Optionally, the first installation proportion threshold may be greater than 100, for example 120, and the second installation proportion threshold may be a ratio between 100 and the first installation proportion threshold.
d. A second level of the application category tree is configured, the second level including a first leaf node and a second leaf node. Adding the reference application program with the significance index larger than the first installation proportion threshold value into the first leaf node, and determining that the association category corresponding to the association relationship between the reference application program in the first leaf node and the target service is positive association; and adding the reference application program with the significance index smaller than the second installation proportion threshold value into the second leaf node, and determining that the association category corresponding to the association relation between the reference application program in the second leaf node and the target service is negative correlation.
e. On the basis of the second layer of the application category tree, the reference applications are split, according to whether a reference application contains a service keyword of the target service, into reference applications that belong to the same category as the target service and reference applications that belong to a different category from the target service, and the third layer of the application category tree is configured on this basis. The service keywords can be customized; in an educational service scenario, the service keywords may include keywords such as "learning" and "training". The configuration of the second layer can be seen in fig. 2b, and is not described in detail here.
f. A fourth layer of the application category tree is configured. The reference applications in the third leaf node are sorted according to their significance indexes, for example in descending order of significance index. Following this order, the coverage of each reference application is accumulated in turn to obtain a coverage sum, and the reference applications in the third leaf node are divided into a plurality of groups according to the coverage sum, where the sum of the coverages of the reference applications in each group is greater than a coverage threshold. As shown in fig. 2b, the reference applications in the third leaf node are divided into two groups according to the coverage sum, namely a first group and a second group; the reference applications belonging to the first group are added to the seventh leaf node, and the reference applications belonging to the second group are added to the eighth leaf node. The coverage sum corresponding to the reference applications in each group is greater than the first coverage threshold and smaller than the second coverage threshold. The reference association category corresponding to the association relationship between the reference applications in the seventh leaf node and the target service is first-level positive correlation, and the reference association category corresponding to the association relationship between the reference applications in the eighth leaf node and the target service is second-level positive correlation.
The coverage of the reference application may refer to a ratio of an installation amount corresponding to the reference application installed by the sample user to the total number of the sample users.
g. After the application category tree is obtained, the server can query, for each leaf node of the application category tree, the number of target applications downloaded by the sample user that belong to that leaf node.
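The grouping used to configure the fourth layer can be sketched as follows. This is only one possible reading of the description: the applications are sorted by significance index in descending order, their coverages are accumulated, and a group is closed as soon as its running coverage sum exceeds a single coverage threshold. The function name, the tuple layout, and the handling of a trailing group whose sum does not reach the threshold are assumptions made for illustration and are not specified by the source.

```python
from typing import Dict, List, Tuple


def group_by_cumulative_coverage(
    apps: Dict[str, Tuple[float, float]],
    coverage_threshold: float,
) -> List[List[str]]:
    """Split applications into groups by accumulated coverage.

    `apps` maps an application name to (significance index Q, coverage),
    where coverage = installs among sample users / total sample users.
    """
    # Sort by significance index, largest first.
    ordered = sorted(apps, key=lambda name: apps[name][0], reverse=True)

    groups: List[List[str]] = []
    current: List[str] = []
    running = 0.0
    for name in ordered:
        current.append(name)
        running += apps[name][1]
        # Close the group once its coverage sum exceeds the threshold.
        if running > coverage_threshold:
            groups.append(current)
            current, running = [], 0.0

    # Assumption: leftover applications form a final (possibly small) group.
    if current:
        groups.append(current)
    return groups


example_apps = {
    "app_a": (300.0, 0.25),
    "app_b": (250.0, 0.125),
    "app_c": (200.0, 0.25),
    "app_d": (150.0, 0.125),
}
example_groups = group_by_cumulative_coverage(example_apps, coverage_threshold=0.3)
```

Under this reading, the first group would correspond to the seventh leaf node (first-level positive correlation) and the second group to the eighth leaf node (second-level positive correlation).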
s3, the server trains the user recognition model according to the number of application programs and the user attributes of the sample users to obtain the target user recognition model.
s4, the server diffuses the target users. The server can inquire the number of the application programs in leaf nodes belonging to the application program category tree in the historical application programs installed by the candidate user history and the association categories corresponding to the association relationship between the historical application programs and the target services. Identifying the number of historical application programs and the association categories corresponding to the association relationship between the historical application programs and the target service by adopting a target user identification model to obtain the probability of each candidate user, wherein the probability is used for reflecting the association degree between the candidate user and the target service; i.e., the greater the probability, the greater the degree of association; the smaller the probability, the smaller the degree of association. Selecting a target user corresponding to the target service from the candidate users according to the requirement and the probability; the candidate users may refer to all users in the application list, the sample user may refer to a part of users in the application list, and the application list includes users who have downloaded the application in the server.
For example, as in FIG. 3b, a user may configure a target service on the user interface 29 of the server, such as the number of users that need to be spread to configure the target service (i.e., select header magnitude), the type of output user ID (phone number, login account, etc.), the name of the service (i.e., task name), the ID of the user identification model, and so forth. After the target service is created, the server can select a target user belonging to the target service according to the corresponding association category of the historical application program of the target user and the user attribute.
Optionally, after obtaining the diffused target users, the server may output table 1 on the user interface, where table 1 includes the service names, the identification states, the IDs of the user identification models, the delivery levels, and the operation options related to the target services of the target users of the target services. The operation options related to the target service comprise options of downloading, deleting, copying, checking details, re-submitting and the like, and the downloading options are used for downloading a list of target users belonging to the target service. The delete option is used to delete the related information of the target service (such as the list of the target user); the copying option is used for copying the related information of the target service; the resubmission option is used for triggering the target user identification model to reacquire the target user corresponding to the target service; the delivery magnitude refers to the number of users who push the content related to the target service to the target user each time. The table 1 may further include information such as a client name, a client type, a service creator, a header extraction level, and creation time to which the target service belongs; the header fetching magnitude refers to the total number of users corresponding to the target user associated with the target service. As can be seen from table 1, the model ID used in the service scenario 1 is 241, and the model ID used in the service scenario 2 is 242, that is, the user identification models used in different service scenarios are different, which can improve the accuracy of obtaining the target user corresponding to the target service.
Table 1:
Figure BDA0003087377120000121
Optionally, after the server obtains the target users, a user list including the target users may be sent to the terminal, and the terminal pushes content related to the target service to the target users; alternatively, the server may push content related to the target service directly to the target users. As in fig. 3a and 3b, the server may push a programming network course to the target users. Steps s1 to s4 may be implemented by the server or by the terminal; they may also be implemented by the terminal and the server in cooperation, for example, the server may implement steps s1 to s3 and the terminal may implement step s4.
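The selection of target users in step s4 can be illustrated with a short sketch. The source only states that target users are selected from the candidate users "according to the requirement and the probability"; taking the N candidates with the highest probability, where N is the configured number of users to diffuse to, is an assumption made here, as are the function and parameter names.

```python
from typing import Dict, List


def select_target_users(probabilities: Dict[str, float], head_count: int) -> List[str]:
    """Return the `head_count` candidate users with the highest probability.

    `probabilities` maps a candidate user ID to the probability output by
    the target user identification model. Ties are broken by user ID so
    that the result is deterministic.
    """
    if head_count < 0:
        raise ValueError("head_count must be non-negative")
    ranked = sorted(probabilities.items(), key=lambda item: (-item[1], item[0]))
    return [user_id for user_id, _ in ranked[:head_count]]


example_selection = select_target_users({"u1": 0.9, "u2": 0.4, "u3": 0.7}, head_count=2)
```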
Further, please refer to fig. 4, which is a flowchart illustrating a user identification method according to an embodiment of the present application. As shown in fig. 4, the method may be performed by an electronic device. The electronic device may be the server in fig. 1, that is, the method may be performed by the server; alternatively, the electronic device may be the terminal in fig. 1, that is, the method may be performed by the terminal; alternatively, the electronic device may be the terminal and the server in fig. 1, that is, the method may be executed by the terminal and the server together. The method may include at least the following steps S101 to S105:
S101, acquiring a sample user set aiming at a target service, a target application program installed by a sample user in the sample user set, user attributes of the sample user and a reference application program having an association relation with the target service.
Specifically, the electronic device may obtain an application list from an application providing device (such as the server described above), where the application list includes information about users who downloaded the application in the application providing device, target applications downloaded and installed by each user, and the like. A portion of the users may be selected from the application list as sample users, where the sample users include positive sample users and negative sample users, the positive sample users being labeled as sample users associated with the target service, and the negative sample users being labeled as sample users not associated with the target service. And screening out a reference application program having an association relation with the target service from the application programs provided by the application program supplying equipment, wherein the reference application program is an application program with a significant characteristic.
S102, counting a first installation proportion corresponding to the reference application program installed by the sample user in the sample user set.
Specifically, the electronic device may count an installation proportion of the reference application installed by a positive sample user in the sample user set and an installation proportion installed by a negative sample user in the sample user set according to the application installed by the sample user, and determine the first installation proportion according to the installation proportion corresponding to the positive sample user and the installation proportion corresponding to the negative sample user in the sample set.
S103, performing association classification on the association relation between the reference application program and the target service according to the first installation occupation ratio to obtain a reference association class.
Specifically, a larger first installation proportion indicates that a higher proportion of the positive sample users install the reference application, that is, the positive sample users prefer to install the reference application and the reference application has a positive influence on the target service. A smaller first installation proportion indicates that a higher proportion of the negative sample users install the reference application, that is, the negative sample users prefer to install the reference application and the reference application has a negative influence on the target service. Therefore, the association relationship between the reference application and the target service can be classified (segmented) according to the first installation proportion to obtain the reference association category. In other words, the reference association category corresponding to a reference application is determined according to the service scenario and differs between service scenarios, which improves the accuracy of the reference association category obtained for the reference application.
S104, if the target application program is matched with the reference application program, determining that the target application program has an association relation with the target business, and taking the reference association type as a target association type corresponding to the association relation between the target application program and the target business.
Specifically, the name of the target application program is compared with the name of the reference application program, and if the name of the target application program is the same as the name of the reference application program, it is determined that the target application program is matched with the reference application program; the name of the reference application and the name of the target application may both be names of installation packages referring to the applications. Because the same application program is issued to different platforms, and the names corresponding to the application program are inconsistent, when the name of the target application program and the name of the reference application program both indicate the same application program, it is determined that the target application program is matched with the reference application program. The target application program is matched with the reference application program, which indicates that the target application program and the reference application program belong to the same application program, and the reference application program and the target service have an association relationship, so that the association relationship between the target application program and the target service can be determined, and the reference association category is used as a target association category corresponding to the association relationship between the target application program and the target service. 
Because the target association category corresponding to the target application is obtained by matching the target application against the reference application, the first installation proportion does not need to be calculated separately for the target application installed by each sample user. This reduces the amount of calculation and improves the efficiency of obtaining the target association category corresponding to the target application.
For example, the reference application is application a, and the sample users 1 to 3 all install application a, and only need to calculate a first installation proportion for application a once, and determine the corresponding reference association category according to the first installation proportion corresponding to application a. The target association types corresponding to the application programs a installed by the sample users 1 to 3 can be obtained in a manner of matching the application programs a installed by the sample users 1 to 3 with the reference application programs, and the corresponding first installation ratios do not need to be calculated for the application programs a installed by the sample users 1 to 3, so that the calculation amount can be reduced, and the efficiency of obtaining the target association types corresponding to the target application programs can be improved.
And S105, adjusting the user identification model by adopting the target association category and the user attribute of the sample user to obtain a target user identification model for identifying the target user associated with the target service.
Specifically, after the target association category is obtained, the user identification model can be trained by taking the target association category and the user attribute of the sample user as features; the target association type and the user attribute of the sample user are adopted to adjust the parameters of the user identification model, and the adjusted user identification model is used as the target user identification model. The target user identification model is used for identifying a target user associated with a target service, and the target user can be a user interested in the target service; or, the target user refers to a user matched with the service characteristics of the target service; for example, in a credit overdue business scenario, the target user refers to a user with a higher probability of having a credit overdue feature. The method comprises the steps that a target association category corresponding to an association relation between a target application program and a target service is determined according to a service scene, and a user identification model is trained by taking the target association category and user attributes as features; that is to say, the features of the user identification model are dynamically and adaptively constructed according to the service scene, so that the feature accuracy is improved, and further, the identification accuracy of the user identification model is improved.
For example, as shown in fig. 5, in an educational business scenario, a user identification model needs to be trained using user attributes and a target association category corresponding to a target application associated with the educational business scenario; in a game service scenario, a user identification model needs to be trained by using user attributes and a target association category corresponding to a target application program associated with the game service scenario. Due to the fact that corresponding target application programs are inconsistent under different service scenes, sample users are different, and the corresponding association types of the target application programs are also inconsistent; therefore, the target association category corresponding to the target application program can be called as a personalized feature, and the personalized feature is dynamically and adaptively constructed according to the service scene. The user recognition model is trained by adopting the personalized features and the general features (namely the user attributes) in the service scene, so that the recognition accuracy of the user recognition model is improved.
In the present application, the electronic device first counts the first installation proportion with which the sample users install the reference applications (that is, applications on the market), and performs association classification on the association relationship between each reference application and the target service according to that first installation proportion, so as to obtain the reference association categories. Then, when a target application matches a reference application, it is determined that the target application has an association relationship with the target service, and the reference association category is taken as the target association category corresponding to the association relationship between the target application and the target service. Because the target association category is obtained by matching the target application against the reference application, the first installation proportion does not need to be calculated separately for the target application installed by each sample user, which reduces the amount of calculation and improves the efficiency of obtaining the target association category. Further, the target association category and the user attributes of the sample users can be used as features to train the user identification model and obtain the target user identification model; in different application scenarios, the reference association categories corresponding to the reference applications are different, so the target association categories corresponding to the target applications are also different.
That is to say, the method and the device can construct the characteristics of the user identification model in a dynamic self-adaptive manner according to the service scene, can improve the distinguishing degree between application programs, and further improve the identification precision of the user identification model.
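The matching in step S104 and the resulting per-category features can be sketched as a simple lookup. This is a minimal illustration under stated assumptions: the reference association categories are taken as already computed once per reference application, applications are matched by installation-package name, and an optional alias table (a hypothetical helper, not described in detail by the source) maps platform-specific names of the same application to one canonical name. All package names are made up.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def match_target_categories(
    installed_apps: Iterable[str],
    reference_categories: Dict[str, str],
    aliases: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Map each installed application that matches a reference application
    to that reference application's association category."""
    aliases = aliases or {}
    matched: Dict[str, str] = {}
    for app in installed_apps:
        # Resolve platform-specific names to a canonical name, if known.
        canonical = aliases.get(app, app)
        category = reference_categories.get(canonical)
        if category is not None:
            matched[app] = category
    return matched


def category_counts(matched: Dict[str, str]) -> Dict[str, int]:
    """Count matched applications per association category (a model feature)."""
    return dict(Counter(matched.values()))


reference = {
    "com.example.code": "first-level positive",
    "com.example.game": "negative",
}
alias_table = {"com.example.code.hd": "com.example.code"}
installed = ["com.example.code.hd", "com.example.game", "com.example.other"]

example_matched = match_target_categories(installed, reference, alias_table)
example_counts = category_counts(example_matched)
```

The category lookup costs one dictionary access per installed application, which reflects the point made above that the first installation proportion is computed once per reference application rather than once per sample user.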
Optionally, the specific manner of acquiring the reference application program having an association relationship with the target service in step S101 includes the following steps s11 to s14:
s11, acquiring an application installation list, wherein the application installation list comprises candidate applications.
s12, counting second installation proportions corresponding to the candidate applications installed by the sample users in the sample user set.
s13, screening out candidate applications with the significance characteristic from the application installation list according to the second installation proportion.
s14, using the candidate applications with the significance characteristic as reference applications having an association relationship with the target service.
In steps s11 to s14, the electronic device may screen candidate applications with the significance characteristic from the candidate applications on the market as reference applications. Specifically, the electronic device may obtain an application installation list, where the application installation list includes candidate applications available for installation, and count the second installation proportion corresponding to each candidate application installed by the sample users in the sample user set; that is, the second installation proportion is determined based on the installation share of the candidate application among the positive sample users and the installation share of the candidate application among the negative sample users. Further, candidate applications with the significance characteristic are screened out from the application installation list according to the second installation proportion; a candidate application with the significance characteristic is an application that has a positive or negative influence on the target service. The candidate applications with the significance characteristic can then be used as reference applications having an association relationship with the target service. Subsequently, only the reference applications with the significance characteristic need to be analyzed rather than all applications on the market, which improves the efficiency of analyzing the applications and saves cost.
Optionally, the step s13 may include the following step s21 or step s22:
s21, taking the candidate application with the second installation proportion larger than the first installation proportion threshold in the application installation list as the candidate application with the significance characteristic; or
s22, taking the candidate application with the second installation proportion smaller than the second installation proportion threshold in the application installation list as the candidate application with the significance characteristic; the first installation proportion threshold is greater than the second installation proportion threshold.
In steps s21 and s22, when the second installation proportion corresponding to a candidate application is greater than the first installation proportion threshold, the positive sample users prefer to install the candidate application, that is, the candidate application has a positive influence on the target service. When the second installation proportion corresponding to a candidate application is smaller than the second installation proportion threshold, the negative sample users prefer to install the candidate application, that is, the candidate application has a negative influence on the target service. When the second installation proportion corresponding to a candidate application is greater than or equal to the second installation proportion threshold and less than or equal to the first installation proportion threshold, both the positive sample users and the negative sample users install the candidate application to a similar extent, that is, the candidate application has little or no influence on the target service. Therefore, the electronic device may take, from the candidate applications in the application installation list, each candidate application whose second installation proportion is greater than the first installation proportion threshold as a candidate application with the significance characteristic; alternatively, it may take each candidate application whose second installation proportion is smaller than the second installation proportion threshold as a candidate application with the significance characteristic. Candidate applications whose second installation proportion lies between the two thresholds (inclusive) have no significance characteristic and do not need to be analyzed subsequently, which improves the efficiency of analyzing the applications and saves cost.
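The screening rule of steps s21 and s22 can be sketched as follows. The second installation proportion of each candidate application is taken as a given input, because formula (3) is not reproduced here. The threshold values 120 and 80 are purely illustrative (the source only gives 120 as an example of the first threshold), and the function names and the "positive"/"negative" labels are assumptions.

```python
from typing import Dict, Optional


def classify_by_proportion(
    proportion: float,
    first_threshold: float,
    second_threshold: float,
) -> Optional[str]:
    """Return "positive", "negative", or None (no significance characteristic)."""
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must be greater than second threshold")
    if proportion > first_threshold:
        return "positive"   # step s21: positive influence on the target service
    if proportion < second_threshold:
        return "negative"   # step s22: negative influence on the target service
    return None             # between the thresholds (inclusive): ignored


def screen_reference_apps(
    proportions: Dict[str, float],
    first_threshold: float,
    second_threshold: float,
) -> Dict[str, str]:
    """Keep only candidate applications that have the significance characteristic."""
    result: Dict[str, str] = {}
    for app, proportion in proportions.items():
        label = classify_by_proportion(proportion, first_threshold, second_threshold)
        if label is not None:
            result[app] = label
    return result


example_reference_apps = screen_reference_apps(
    {"app_a": 150.0, "app_b": 100.0, "app_c": 60.0},
    first_threshold=120.0,
    second_threshold=80.0,
)
```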
Optionally, the sample users in the sample user set include positive sample users and negative sample users, the positive sample users are the sample users in the sample user set labeled as being associated with the target service, and the negative sample users are the sample users in the sample user set labeled as being not associated with the target service;
optionally, the step S102 may include the following steps S31 to S32:
s31, counting a first installation share of a positive sample user of the sample set of users installing the reference application and a second installation share of a negative sample user of the sample set of users installing the reference application.
s32, determining the first installation proportion based on the first installation share and the second installation share.
In steps s31 to s32, the electronic device may take the proportion of positive sample users who have installed the reference application as the first installation share, and the proportion of negative sample users who have installed the reference application as the second installation share. The first installation proportion can then be determined based on the first installation share and the second installation share; for the specific calculation, refer to formula (3) above, which is not repeated here.
Optionally, the step s31 may include the following steps s41 to s43:
s41, counting the number of positive sample users, the number of negative sample users, the second installation amount of the reference application program installed in the positive sample users, and the third installation amount of the reference application program installed in the negative sample users in the sample user set.
s42, taking the ratio of the second installation amount to the positive sample user number as the first installation share.
s43, taking the ratio of the third installation amount to the negative sample user number as the second installation share.
In steps s41 to s43, the larger the first installation share, the more positive sample users have installed the reference application, i.e., positive sample users prefer to install the reference application; the smaller the first installation share, the fewer positive sample users have installed it, i.e., positive sample users are less inclined to install the reference application. Similarly, the larger the second installation share, the more negative sample users have installed the reference application, i.e., negative sample users prefer to install it; the smaller the second installation share, the fewer negative sample users have installed it, i.e., negative sample users are less inclined to install the reference application.
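Steps s41 to s43 amount to two simple ratios. The sketch below assumes a hypothetical input layout in which each sample user is a pair of a positive/negative flag and a set of installed application names. It stops at the two shares, because formula (3), which combines them into the first installation proportion, is not reproduced in this passage.

```python
def installation_shares(sample_users, reference_app):
    """Compute (first installation share, second installation share).

    sample_users: list of (is_positive, installed_apps) pairs, where
    installed_apps is a set of application names (assumed layout).
    """
    positive_total = negative_total = 0
    positive_installed = negative_installed = 0
    for is_positive, installed_apps in sample_users:
        has_app = reference_app in installed_apps
        if is_positive:
            positive_total += 1
            positive_installed += has_app  # second installation amount
        else:
            negative_total += 1
            negative_installed += has_app  # third installation amount
    if positive_total == 0 or negative_total == 0:
        raise ValueError("both positive and negative sample users are required")
    return positive_installed / positive_total, negative_installed / negative_total


# Illustrative data: 3 of 4 positive users and 1 of 5 negative users
# have installed the hypothetical application "ref_app".
users = (
    [(True, {"ref_app"})] * 3 + [(True, set())]
    + [(False, {"ref_app"})] + [(False, {"other_app"})] * 4
)
first_share, second_share = installation_shares(users, "ref_app")
print(first_share, second_share)
```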
Optionally, the step S103 may include the following steps S51 to S54:
s51, obtaining an application program category tree; the application category tree includes a root node, and a first leaf node and a second leaf node connected to the root node.
s52, adding the reference application to the root node of the application category tree.
s53, if the first installation duty is greater than the first installation duty threshold, adding the reference application to the first leaf node, and determining that the reference association category is a positive association.
s54, if the first installation duty is less than a second installation duty threshold, adding the reference application to the second leaf node, determining that the reference association category is negative correlation; the first installation duty threshold is greater than the second installation duty threshold.
In steps s51 to s54, the electronic device may add the reference application to the application category tree, which comprises a root node and a first leaf node and a second leaf node connected to the root node. The root node stores reference applications that have an association relationship with the target service; the first leaf node stores the reference applications in the root node that are positively correlated with the target service, and the second leaf node stores the reference applications in the root node that are negatively correlated with the target service. Therefore, the reference application may be added to the root node. If the first installation duty is greater than the first installation duty threshold, indicating that positive sample users prefer to install the reference application, that is, the reference application has a positive impact on the target service, the reference application is added to the first leaf node, and the reference association category is determined to be positive association. If the first installation duty is less than the second installation duty threshold, indicating that negative sample users prefer to install the reference application, that is, the reference application has a negative impact on the target service, the reference application is added to the second leaf node, and the reference association category is determined to be negative correlation. By establishing the application category tree, the target association category corresponding to a target application can be queried quickly, which improves the efficiency of determining the target association category of the target application.
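One way to picture steps s51 to s54 is a small container with three sets, one per node. The class name, field names, and threshold values below are assumptions made for illustration; the source does not prescribe a data layout, and it does not say what happens when the value falls between the two thresholds, so this sketch simply leaves such an application in the root node only.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryTree:
    root: set = field(default_factory=set)         # all reference applications
    first_leaf: set = field(default_factory=set)   # positively associated
    second_leaf: set = field(default_factory=set)  # negatively associated


def add_reference_app(tree, app, duty, first_threshold, second_threshold):
    """Insert one reference application and return its association category."""
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must exceed second threshold")
    tree.root.add(app)                              # s52
    if duty > first_threshold:                      # s53
        tree.first_leaf.add(app)
        return "positive"
    if duty < second_threshold:                     # s54
        tree.second_leaf.add(app)
        return "negative"
    return None  # assumption: no category between the thresholds


# Illustrative values only.
tree = CategoryTree()
labels = {
    name: add_reference_app(tree, name, duty, 120.0, 45.0)
    for name, duty in [("app_a", 300.0), ("app_b", 20.0)]
}
print(labels)
```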
In this embodiment, the step S104 may include the following steps S61-S64:
s61, traversing the root node of the application program type tree, if the target application program is matched with the reference application program in the root node, determining that the target application program and the target service have an association relationship, and traversing the first leaf node according to the node path of the application program type tree.
s62, if the target application matches the reference application in the first leaf node, determining that the target association category is a positive association.
s63, if the target application does not match the reference application in the first leaf node, traversing the second leaf node according to the node path of the application class tree.
s64, determining the target association category as negatively correlated if the target application matches a reference application in the second leaf node.
In steps s61 to s64, as shown in fig. 6, the electronic device may query, through the application category tree, the association relationship between the target application and the target service and the corresponding target association category. Specifically, the electronic device may traverse the root node of the application category tree. If the target application does not match any reference application in the root node, the target application has no association relationship with the target service, that is, the target application has no significance, and the process may end. If the target application matches a reference application in the root node, the target application has a significant feature; it is determined that the target application has an association relationship with the target service, and the first leaf node is then traversed according to the node path of the application category tree. If the target application matches a reference application in the first leaf node, the target association category is determined to be positive association; if it does not, the target application does not belong to the first leaf node, and the second leaf node may be traversed according to the node path of the application category tree. If the target application matches a reference application in the second leaf node, the target association category is determined to be negative correlation. The node path of the application category tree may run from top to bottom and from left to right, or may follow another order, which is not limited herein.
Determining the target association category of the target application by querying the application category tree improves the efficiency of that determination, reduces the amount of computation, and lowers cost.
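The query in steps s61 to s64 can be sketched as a top-to-bottom, left-to-right walk over the three nodes. The dictionary layout and key names are assumptions for illustration only.

```python
def lookup_association_category(tree, target_app):
    """Return "positive", "negative", or None for a target application.

    tree: dict with the keys "root", "first_leaf" and "second_leaf",
    each holding a set of reference application names (assumed layout).
    """
    if target_app not in tree["root"]:
        return None            # no association relationship, stop early
    if target_app in tree["first_leaf"]:
        return "positive"      # s62
    if target_app in tree["second_leaf"]:
        return "negative"      # s64
    return None                # in the root node but in neither leaf


# Illustrative tree contents.
example_tree = {
    "root": {"app_a", "app_b"},
    "first_leaf": {"app_a"},
    "second_leaf": {"app_b"},
}
results = [lookup_association_category(example_tree, name)
           for name in ("app_a", "app_b", "app_x")]
print(results)
```

Because each node is a set, every membership check is a constant-time lookup on average, which is consistent with the reduced computation described above.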
Optionally, the positive correlation includes a first-level positive correlation and a second-level positive correlation, and the application program category tree further includes a third leaf node and a fourth leaf node connected to the first leaf node;
the step s53 may include the following steps s71 to s76:
s71, if the first installation duty is greater than the first installation duty threshold, adding the reference application to the first leaf node, and obtaining a first installation amount of the reference application installed by the sample user.
s72, determining the coverage rate of the reference application program according to the first installation amount, and sorting the reference application programs according to the first installation ratio.
s73, sequentially accumulating the coverage of the reference applications according to the ranking order to obtain a sum of coverage.
s74, grouping the reference applications according to the coverage rate sum to obtain a first group and a second group.
s75, if the reference application belongs to the first group, adding the reference application to the third leaf node, and determining the reference association type as a first-level positive association.
s76, if the reference application program belongs to the second group, adding the reference application program to the fourth leaf node, and determining that the reference association category is a second-level positive association, wherein the first installation ratio corresponding to the reference application program in the first group is greater than the first installation ratio corresponding to the reference application program in the second group.
In steps s71 to s76, as shown in fig. 2a, the positive correlation includes a first-level positive correlation and a second-level positive correlation, and the application category tree further includes a third leaf node and a fourth leaf node connected to the first leaf node. The third leaf node stores the reference applications in the first leaf node whose association relationship with the target service belongs to the first-level positive correlation category, and the fourth leaf node stores the reference applications in the first leaf node whose association relationship with the target service belongs to the second-level positive correlation category. The degree of association between a first-level positively correlated reference application and the target service is greater than that between a second-level positively correlated reference application and the target service. Therefore, if the first installation duty is greater than the first installation duty threshold, indicating that the reference application has a positive impact on the target service, the reference application is added to the first leaf node, and the first installation amount of the reference application installed by the sample users is obtained. The ratio of the first installation amount to the total number of positive sample users is taken as the coverage rate of the reference application, and the reference applications are sorted by first installation ratio, either from large to small or from small to large. The coverage rates of the reference applications are then accumulated in the sorted order to obtain a coverage rate sum.
The reference applications are divided into at least two groups, including a first group and a second group, according to the coverage rate sum. The first installation ratios corresponding to the reference applications in the first group are all larger than those corresponding to the reference applications in the second group, and the sum of the coverage rates corresponding to the applications in the first group and the second group is greater than the first coverage rate threshold and less than the second coverage rate threshold. If the reference application belongs to the first group, it is added to the third leaf node, and the reference association category is determined to be a first-level positive association; if the reference application belongs to the second group, it is added to the fourth leaf node, and the reference association category is determined to be a second-level positive association. Subdividing the positive correlation of the reference applications into first-level and second-level positive correlation according to the coverage rate provides finer-grained information for training the user identification model and improves its identification precision; it also preserves both the discriminative power and the coverage of the reference applications.
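The grouping in steps s72 to s74 can be sketched as a sort followed by a running sum. The source does not state the exact rule that separates the first group from the second, so the single cutoff `split_coverage` used here is an assumption, as are the function name, the tuple layout, and the numbers.

```python
def group_by_cumulative_coverage(apps, positive_user_count, split_coverage):
    """Split positively associated reference applications into two groups.

    apps: list of (name, first_installation_ratio, first_installation_amount).
    split_coverage: assumed cutoff on the running coverage rate sum.
    """
    # Sort from the largest first installation ratio to the smallest.
    ranked = sorted(apps, key=lambda item: item[1], reverse=True)
    first_group, second_group = [], []
    coverage_sum = 0.0
    for name, _ratio, amount in ranked:
        coverage_sum += amount / positive_user_count  # coverage rate of this app
        if coverage_sum <= split_coverage:
            first_group.append(name)   # would go to the third leaf node
        else:
            second_group.append(name)  # would go to the fourth leaf node
    return first_group, second_group


# Illustrative values only.
example_apps = [
    ("app_c", 200.0, 30),
    ("app_a", 400.0, 10),
    ("app_d", 150.0, 25),
    ("app_b", 300.0, 20),
]
first_group, second_group = group_by_cumulative_coverage(example_apps, 100, 0.35)
print(first_group, second_group)
```

Because the list is sorted in descending order and the running sum never decreases, every application in the first group has a larger first installation ratio than every application in the second group, which matches the condition stated in s76.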
For example, in a credit overdue business scenario, suppose the reference applications associated with the business scenario include APP1 to APP8; the first installation ratios and reference association categories of the respective reference applications are shown in Table 2. Assume that the first installation ratio threshold is 120 and the second installation ratio threshold is 45. As can be seen from Table 2, the first installation ratios of APP1, APP5, and APP8 are all greater than the first installation ratio threshold, and the corresponding reference association categories are all first-level positive correlations. The first installation ratios of APP2, APP3, APP4, APP6, and APP7 are all smaller than the second installation ratio threshold; the reference association categories corresponding to APP3, APP4, APP6, and APP7 are all negative correlations, and the reference association category corresponding to APP2 is a first-level negative correlation.
Table 2:
Application name | First installation ratio | Reference association category
APP1 | 300 | First-level positive correlation
APP2 | 20 | First-level negative correlation
APP3 | 40 | Negative correlation
APP4 | 40 | Negative correlation
APP5 | 430 | First-level positive correlation
APP6 | 30 | Negative correlation
APP7 | 40 | Negative correlation
APP8 | 3420 | First-level positive correlation
…… | …… | ……
Optionally, the step s62 may include the following steps s81 to s84:
s81, if the target application matches the reference application in the first leaf node, traversing the third leaf node according to the node path of the application class tree.
s82, if the target application program matches the reference application program in the third leaf node, determining that the target association category is the first level positive association.
s83, if the target application does not match the reference application in the third leaf node, traversing the fourth leaf node according to the node path of the application type tree.
s84, if the target application matches the reference application in the fourth leaf node, determining that the target association category is a second-level positive association.
In the above steps s81 to s84, as in fig. 7, when the target association class belongs to the positive correlation, the electronic device may further determine whether the target association class belongs to the first-level positive correlation or the second-level positive correlation. Specifically, if the target application program is matched with the reference application program in the first leaf node, indicating that the target association category between the target application program and the target service belongs to positive correlation, traversing the third leaf node according to the node path of the application program category tree. And if the target application program is matched with the reference application program in the third leaf node, determining that the target association type is the first-level positive association. And traversing the fourth leaf node according to the node path of the application program type tree if the target application program is not matched with the reference application program in the third leaf node. If the target application program is matched with the reference application program in the fourth leaf node, determining that the target association type is the second-level positive association. Through the application program category tree, the grade of the positive correlation corresponding to the target application program can be inquired, the target application program can be subdivided into the target application program belonging to the first-level positive correlation and the target application program belonging to the second-level positive correlation, more detailed characteristic information can be provided for training the user identification model, and the identification precision of the user identification model is improved.
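The refinement in steps s81 to s84 only applies once the target application has been found in the first leaf node. A minimal sketch, assuming a hypothetical dictionary layout with one set per node; the fallback for an application that sits in the first leaf node but in neither child is not specified in the source and is an assumption here.

```python
def lookup_positive_level(tree, target_app):
    """Return the positive-association level of a target application."""
    if target_app not in tree["first_leaf"]:
        return None                       # not positively associated
    if target_app in tree["third_leaf"]:
        return "first-level positive"     # s82
    if target_app in tree["fourth_leaf"]:
        return "second-level positive"    # s84
    return "positive"                     # assumption: level unknown


# Illustrative tree contents.
positive_subtree = {
    "first_leaf": {"app_a", "app_b"},
    "third_leaf": {"app_a"},
    "fourth_leaf": {"app_b"},
}
levels = [lookup_positive_level(positive_subtree, name)
          for name in ("app_a", "app_b", "app_z")]
print(levels)
```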
Optionally, the step S105 may include the following steps S91 to S94:
s91, counting, among the target applications installed by the sample user, the number of applications having the target association category.
s92, obtaining the annotated association label of the sample user, where the annotated association label reflects whether the sample user has an association relationship with the target service.
s93, performing association identification on the number of applications and the user attributes of the sample user by using the user identification model, to obtain a predicted association label.
s94, adjusting the user identification model according to the annotated association label and the predicted association label to obtain the target user identification model.
In steps s91 to s94, as shown in fig. 8, the electronic device may count, among the target applications installed by the sample user, the number of applications having the target association category, and obtain the annotated association label of the sample user; the annotated association label is manually annotated and reflects whether the sample user has an association relationship with the target service. The user identification model can then be used to perform association identification on the number of applications and the user attributes of the sample user to obtain a predicted association label. The closer the annotated association label is to the predicted association label, the lower the identification error of the user identification model; conversely, the larger the difference between them, the higher the identification error. Therefore, the identification error of the user identification model can be calculated from the annotated association label and the predicted association label. When the identification error is smaller than the error threshold, the accuracy of the user identification model is high, and the user identification model is used as the target user identification model. When the identification error is greater than or equal to the error threshold, the accuracy of the user identification model is low; the user identification model is adjusted according to the identification error, and when the error of the adjusted user identification model is smaller than the error threshold, the adjusted user identification model is taken as the target user identification model.
Training the user identification model on the number of target applications installed by the sample users that belong to the target association category, together with the user attributes of the sample users, can improve the identification precision of the user identification model.
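The training loop of steps s91 to s94 can be illustrated with a small stand-in model. The source does not name a model type, so the logistic regression, the log-loss error measure, the learning rate, the single numeric user attribute, and all data below are assumptions chosen only to make the loop concrete.

```python
import math


def count_category_apps(installed_apps, app_categories, category):
    """s91: number of installed applications having the given category."""
    return sum(1 for app in installed_apps if app_categories.get(app) == category)


def predict(weights, features):
    """s93: predicted probability that the user is associated with the service."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def mean_log_loss(weights, samples):
    """Identification error between annotated and predicted labels."""
    eps = 1e-12
    total = 0.0
    for features, label in samples:
        p = min(max(predict(weights, features), eps), 1.0 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(samples)


def train(samples, error_threshold, learning_rate=0.1, max_rounds=2000):
    """s94: adjust the model until the error drops below the threshold."""
    weights = [0.0] * (len(samples[0][0]) + 1)
    for _ in range(max_rounds):
        if mean_log_loss(weights, samples) < error_threshold:
            break
        grads = [0.0] * len(weights)
        for features, label in samples:
            diff = predict(weights, features) - label
            grads[0] += diff
            for i, x in enumerate(features):
                grads[i + 1] += diff * x
        weights = [w - learning_rate * g / len(samples)
                   for w, g in zip(weights, grads)]
    return weights, mean_log_loss(weights, samples)


# Hypothetical categories and sample users: (installed apps, attribute, label).
app_categories = {"app_a": "positive", "app_b": "positive", "app_c": "negative"}
sample_users = [
    ({"app_a", "app_b"}, 0.3, 1),
    ({"app_a"}, 0.5, 1),
    ({"app_c"}, 0.4, 0),
    ({"app_c", "app_x"}, 0.6, 0),
]
samples = [
    ([count_category_apps(apps, app_categories, "positive"),
      count_category_apps(apps, app_categories, "negative"),
      attribute], label)
    for apps, attribute, label in sample_users
]
weights, error = train(samples, error_threshold=0.3)
print(round(error, 3))
```

The stopping rule mirrors the description above: the model is accepted as the target user identification model once its error is smaller than the error threshold, and is adjusted otherwise.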
Optionally, the method may further include the following steps s111 to s113:
s111, receiving an interpretation request for the predicted association label, and extracting the service keyword associated with the target service according to the interpretation request.
s112, comparing the service keyword with the target application program, and determining the service relationship between the target application program and the target service.
s113, generating interpretation information by using the service relationship, wherein the interpretation information is used for explaining how the service relationship influences the predicted association label.
In steps s111 to s113, the electronic device may determine the service relationship between the target application and the target service according to the service keyword. Specifically, the electronic device may receive an interpretation request for the predicted association label and extract the service keyword associated with the target service according to the interpretation request; the service keyword of the target service may be a field included in the name of the target service or a field in the name of the service corresponding to the target service. The service keyword is then compared with the target application to determine the service relationship between the target application and the target service. The service relationship is either a homogeneous relationship, which indicates that the target application and the target service belong to the same service scenario, or a heterogeneous relationship, which indicates that they do not belong to the same service scenario. For example, if the target service is pushing a programming online course and the target application is a training application, both the target application and the target service belong to an education scenario. Generally, target applications in a homogeneous relationship with the target service have a positive impact on the target service, while target applications in a heterogeneous relationship have a negative impact on it. Therefore, the service relationship is used to generate interpretation information, which explains how the service relationship influences the predicted association label.
Through the interpretation information, a user can trace why the user identification model output the predicted association label, which improves the credibility of the user identification model.
Optionally, the step s112 may include the following steps s121 to s122:
s121, if the target application program comprises the service keyword, determining that the service relationship between the target application program and the target service is a homogeneous relationship.
s122, if the target application program does not comprise the service keyword, determining that the service relationship between the target application program and the target service is a heterogeneous relationship.
In steps s121 to s122, the electronic device may compare the name of the target application with the service keyword. If the target application (i.e., the name of the target application) includes the service keyword, the service relationship between the target application and the target service is determined to be a homogeneous relationship; if the target application does not include the service keyword, the service relationship is determined to be a heterogeneous relationship.
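The keyword comparison in steps s121 to s122, together with the interpretation text of step s113, can be sketched as a substring check. The function names, the case-insensitive matching, the wording of the generated message, and the sample names are assumptions for illustration.

```python
def service_relationship(app_name, service_keywords):
    """Return "homogeneous" if the application name contains any keyword."""
    lowered = app_name.lower()  # assumption: matching ignores letter case
    if any(keyword.lower() in lowered for keyword in service_keywords):
        return "homogeneous"
    return "heterogeneous"


def interpretation_info(app_name, service_keywords):
    """Build a short, human-readable explanation (hypothetical wording)."""
    relation = service_relationship(app_name, service_keywords)
    effect = "positive" if relation == "homogeneous" else "negative"
    return (f"{app_name}: {relation} relationship with the target service, "
            f"generally a {effect} influence on the predicted label")


# Illustrative keywords and application names.
keywords = ["programming", "course"]
relation_a = service_relationship("Python Programming Class", keywords)
relation_b = service_relationship("Pocket Loans", keywords)
print(interpretation_info("Python Programming Class", keywords))
```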
Optionally, the method may comprise the following steps s131 to s133:
s131, receiving an identification request for the target user, wherein the identification request comprises the user attributes of the target user and the historical applications previously installed by the target user.
s132, if a historical application program matches the reference application program, taking the reference association category as the historical association category between the historical application program and the target service.
s133, performing association identification on the historical association category and the user attributes of the target user by using the target user identification model, to obtain a target association label indicating whether the target user is associated with the target service.
In steps s131 to s133, after the target user identification model is obtained through training, it may be used to expand the user base of the target service. Specifically, the electronic device may receive an identification request for a target user, where the identification request includes the user attributes of the target user and the historical applications previously installed by the target user. Each historical application is compared with the reference applications. If a historical application does not match any reference application, it is determined to have no significant feature and is filtered out, that is, it is not processed. If a historical application matches a reference application, it is determined to have a significant feature, and the reference association category is used as the historical association category between the historical application and the target service. The target user identification model is then used to perform association identification on the historical association category and the user attributes of the target user, to obtain a target association label indicating whether the target user is associated with the target service.
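The identification flow of steps s131 to s133 can be sketched as a filter followed by a model call. Everything named below is hypothetical: the feature layout, the `stand_in_model` rule (a placeholder for the trained target user identification model), and the sample data.

```python
def identify_target_user(history_apps, user_attributes, reference_categories, model):
    """Return (target association label, matched historical categories)."""
    # s132: keep only historical applications that match a reference
    # application; unmatched ones have no significant feature and are dropped.
    history_categories = {
        app: reference_categories[app]
        for app in history_apps
        if app in reference_categories
    }
    features = {
        "positive_apps": sum(c == "positive" for c in history_categories.values()),
        "negative_apps": sum(c == "negative" for c in history_categories.values()),
    }
    features.update(user_attributes)
    # s133: the model maps the features to an association label.
    return model(features), history_categories


def stand_in_model(features):
    """Placeholder rule, not the trained model described in the source."""
    return features["positive_apps"] > features["negative_apps"]


reference_categories = {"app_a": "positive", "app_c": "negative"}
label, matched = identify_target_user(
    ["app_a", "app_x"], {"age": 30}, reference_categories, stand_in_model
)
print(label, matched)
```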
Please refer to fig. 9, which is a schematic structural diagram of a user identification device according to an embodiment of the present application. The user identification device may be a computer program (including program code) running on a computer device; for example, the user identification device is application software. The device may be used to perform the corresponding steps in the methods provided by the embodiments of the present application. As shown in fig. 9, the user identification device may include: an obtaining module 901, a dividing module 902, a determining module 903, an adjusting module 904, an interpretation module 905, and a user identification module 906.
An obtaining module 901, configured to obtain a sample user set for a target service, a target application installed by a sample user in the sample user set, a user attribute of the sample user, and a reference application having an association relationship with the target service;
a dividing module 902, configured to count a first installation duty ratio at which the sample users in the sample user set install the reference application program, and perform association category division on the association relationship between the reference application program and the target service according to the first installation duty ratio, to obtain a reference association category;
A determining module 903, configured to determine that the target application program and the target service have an association relationship if the target application program is matched with the reference application program, and use the reference association category as a target association category corresponding to the association relationship between the target application program and the target service;
an adjusting module 904, configured to adjust the user identification model by using the target association category and the user attribute of the sample user, so as to obtain a target user identification model for identifying a target user associated with the target service.
Optionally, the dividing module performs association category division on the association relationship between the reference application program and the target service according to the first installation duty ratio to obtain a reference association category, including:
acquiring an application program category tree; the application program category tree comprises a root node, a first leaf node and a second leaf node which are connected with the root node;
adding the reference application to a root node of the application category tree;
if the first installation duty is greater than a first installation duty threshold, adding the reference application to the first leaf node, and determining that the reference association category is positive association;
If the first installation duty is less than a second installation duty threshold, adding the reference application to the second leaf node, and determining that the reference association category is negative correlation; the first installation duty threshold is greater than the second installation duty threshold.
Optionally, if the target application program is matched with the reference application program, the determining module determines that the target application program has an association relationship with the target service, and uses the reference association category as a target association category corresponding to the association relationship between the target application program and the target service, including:
traversing a root node of the application program type tree, if the target application program is matched with a reference application program in the root node, determining that the target application program and the target service have an incidence relation, and traversing the first leaf node according to a node path of the application program type tree;
if the target application program is matched with the reference application program in the first leaf node, determining that the target association type is positive association;
traversing the second leaf node according to the node path of the application program type tree if the target application program is not matched with the reference application program in the first leaf node;
If the target application matches a reference application in the second leaf node, determining that the target association category is negatively correlated.
Optionally, the positive correlation includes a first-level positive correlation and a second-level positive correlation, and the application program category tree further includes a third leaf node and a fourth leaf node connected to the first leaf node;
if the first installation duty is greater than a first installation duty threshold, the dividing module adds the reference application to the first leaf node and determines that the reference association category is a positive association, including:
if the first installation proportion is larger than a first installation proportion threshold value, adding the reference application program into the first leaf node, and acquiring a first installation amount of the reference application program installed by the sample user;
determining the coverage rate of the reference application program according to the first installation amount, and sequencing the reference application program according to the first installation account;
sequentially accumulating the coverage rates of the reference application programs according to the arrangement sequence to obtain a sum of the coverage rates;
grouping the reference application program according to the coverage rate to obtain a first group and a second group;
if the reference application program belongs to the first group, adding the reference application program into the third leaf node, and determining that the reference association type is a first-level positive association;
If the reference application program belongs to the second group, the reference application program is added into the fourth leaf node, the reference association type is determined to be second-level positive association, and the first installation ratio corresponding to the reference application program in the first group is larger than the first installation ratio corresponding to the reference application program in the second group.
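One possible reading of the grouping steps above is sketched below. The text does not define the coverage rate or the grouping rule precisely, so this sketch assumes that the coverage rate of an application is its installation amount divided by the total installation amount, and that applications are placed in the first group until the accumulated coverage reaches a cutoff. The function name and the `cutoff` parameter are hypothetical.

```python
def group_by_cumulative_coverage(install_counts, cutoff=0.8):
    """Split reference applications into (first_group, second_group).

    install_counts maps application name -> first installation amount.
    Applications are sorted by installation amount (descending) and the
    coverage is accumulated in that order; an application joins the first
    group while the coverage accumulated before it is below `cutoff`.
    """
    total = sum(install_counts.values())
    if total <= 0:
        raise ValueError("at least one installation is required")
    ranked = sorted(install_counts, key=install_counts.get, reverse=True)
    first_group, second_group = [], []
    running = 0  # installations accumulated so far (integer, no float drift)
    for app in ranked:
        if running / total < cutoff:
            first_group.append(app)   # candidate for the third leaf node
        else:
            second_group.append(app)  # candidate for the fourth leaf node
        running += install_counts[app]
    return first_group, second_group
```

For example, with installation amounts of 50, 30, 15 and 5 and a cutoff of 0.7, the two most installed applications form the first group and the remaining two form the second group.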
Optionally, the determining module determining, if the target application program matches a reference application program in the first leaf node, that the target association category is positive association includes:
if the target application program matches a reference application program in the first leaf node, traversing the third leaf node according to the node path of the application program category tree;
if the target application program matches a reference application program in the third leaf node, determining that the target association category is first-level positive association;
if the target application program does not match any reference application program in the third leaf node, traversing the fourth leaf node according to the node path of the application program category tree;
if the target application program matches a reference application program in the fourth leaf node, determining that the target association category is second-level positive association.
Optionally, the adjusting module adjusting the user identification model by using the target association category and the user attributes of the sample user, to obtain a target user identification model for identifying a target user associated with the target service, includes:
counting, among the target application programs installed by the sample user, the number of application programs having the target association category;
acquiring an annotated association label of the sample user, wherein the annotated association label reflects whether the sample user has an association relationship with the target service;
performing association identification on the number of application programs and the user attributes of the sample user by using the user identification model, to obtain a predicted association label;
and adjusting the user identification model according to the annotated association label and the predicted association label, to obtain the target user identification model.
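The first step above, counting the installed applications per association category and combining the counts with the user attributes, can be illustrated as follows. This is a minimal sketch: the function name `build_features`, the category names, and the flat list layout of the feature vector are assumptions, and the model itself is left out because the text does not tie the method to a particular learner.

```python
from collections import Counter


def build_features(installed_apps, app_category, user_attributes,
                   categories=("positive", "negative")):
    """Build one feature vector for a sample user.

    installed_apps:  applications installed by the user
    app_category:    reference application -> association category
    user_attributes: numeric user attributes (hypothetical encoding)
    """
    # Only applications that match a reference application contribute.
    counts = Counter(app_category[a] for a in installed_apps if a in app_category)
    return [counts[c] for c in categories] + list(user_attributes)
```

The resulting vector would be passed to the user identification model together with the annotated association label, and the difference between the predicted and annotated labels would drive the adjustment.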
Optionally, the user identification device further includes an interpretation module configured to: receive an interpretation request for the predicted association label; extract service keywords associated with the target service according to the interpretation request;
compare the service keywords with the target application program to determine the service relationship between the target application program and the target service;
and generate interpretation information by using the service relationship, wherein the interpretation information is used to explain the influence of the service relationship on the predicted association label.
Optionally, the interpretation module comparing the service keywords with the target application program to determine the service relationship between the target application program and the target service includes:
if the target application program includes the service keywords, determining that the service relationship between the target application program and the target service is a homogeneous relationship;
and if the target application program does not include the service keywords, determining that the service relationship between the target application program and the target service is a heterogeneous relationship.
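The keyword comparison above could look like the following sketch. It assumes that "the target application program includes the service keywords" means that a keyword occurs in some text describing the application (for example its name or store description) and that a single matching keyword is sufficient; both points, and the function name, are assumptions rather than statements from the text.

```python
def service_relationship(app_text, service_keywords):
    """Classify the relationship between an application and the target service.

    app_text:         descriptive text of the target application (assumed)
    service_keywords: keywords extracted for the target service
    """
    text = app_text.lower()
    # Case-insensitive substring match; one hit is treated as sufficient.
    if any(keyword.lower() in text for keyword in service_keywords):
        return "homogeneous"
    return "heterogeneous"
```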
Optionally, the obtaining module obtaining a reference application program having an association relationship with the target service includes:
acquiring an application program installation list, wherein the application program installation list includes candidate application programs;
counting a second installation proportion corresponding to the candidate application programs installed by the sample users in the sample user set;
screening out candidate application programs having a salient feature from the application program installation list according to the second installation proportion;
and using the candidate application programs having the salient feature as reference application programs having an association relationship with the target service.
Optionally, the obtaining module screening out candidate application programs having the salient feature from the application program installation list according to the second installation proportion includes:
using candidate application programs in the application program installation list whose second installation proportion is greater than the first installation proportion threshold as candidate application programs having the salient feature; or,
using candidate application programs in the application program installation list whose second installation proportion is less than the second installation proportion threshold as candidate application programs having the salient feature; the first installation proportion threshold is greater than the second installation proportion threshold.
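The two-sided screening above amounts to keeping the candidates whose installation proportion falls outside the band between the two thresholds. The sketch below illustrates this; the function name, the dictionary input, and the threshold values used in the example are hypothetical.

```python
def screen_salient(proportions, upper_threshold, lower_threshold):
    """Return the candidate applications that have a salient feature.

    proportions maps candidate application -> second installation proportion.
    A candidate is salient when its proportion is above the first (upper)
    threshold or below the second (lower) threshold.
    """
    if upper_threshold <= lower_threshold:
        raise ValueError("the first threshold must exceed the second threshold")
    return {
        app
        for app, proportion in proportions.items()
        if proportion > upper_threshold or proportion < lower_threshold
    }
```

With thresholds of 0.7 and 0.3, for instance, candidates at 0.9 and 0.1 are kept, while a candidate at 0.5 is discarded because it does not separate the sample users clearly in either direction.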
Optionally, the sample users in the sample user set include a positive sample user and a negative sample user, the positive sample user is a sample user in the sample user set labeled as being associated with the target service, and the negative sample user is a sample user in the sample user set labeled as being not associated with the target service;
the dividing module counting a first installation proportion corresponding to the reference application program installed by the sample users in the sample user set includes:
counting a first installation share of positive sample users in the sample user set who have installed the reference application program, and a second installation share of negative sample users in the sample user set who have installed the reference application program;
determining the first installation proportion according to the first installation share and the second installation share.
Optionally, the dividing module counting the first installation share of positive sample users in the sample user set who have installed the reference application program and the second installation share of negative sample users in the sample user set who have installed the reference application program includes:
counting, in the sample user set, the number of positive sample users, the number of negative sample users, a second installation amount of the reference application program among the positive sample users, and a third installation amount of the reference application program among the negative sample users;
using the ratio of the second installation amount to the number of positive sample users as the first installation share;
and using the ratio of the third installation amount to the number of negative sample users as the second installation share.
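The two ratios above can be computed in a single pass over the sample user set, as in the sketch below. The input layout (a sequence of `(is_positive, installed_apps)` pairs), the function name, and the convention of returning 0.0 for an empty group are assumptions made for the example. How the two shares are then combined into the first installation proportion is not specified in this passage, so the sketch stops at the shares.

```python
def installation_shares(sample_users, reference_app):
    """Return (first_installation_share, second_installation_share).

    sample_users: iterable of (is_positive, installed_apps) pairs, where
                  installed_apps is a set of application names.
    """
    positive_users = negative_users = 0
    positive_installs = negative_installs = 0  # second / third installation amount
    for is_positive, installed_apps in sample_users:
        installed = reference_app in installed_apps
        if is_positive:
            positive_users += 1
            positive_installs += installed
        else:
            negative_users += 1
            negative_installs += installed
    first_share = positive_installs / positive_users if positive_users else 0.0
    second_share = negative_installs / negative_users if negative_users else 0.0
    return first_share, second_share
```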
Optionally, the user identification device further includes a user identification module configured to: receive an identification request for a target user, wherein the identification request includes user attributes of the target user and historical application programs that the target user has installed;
if a historical application program matches the reference application program, use the reference association category as the historical association category between the historical application program and the target service;
and perform association identification on the historical association category and the user attributes of the target user by using the target user identification model, to obtain a target association label indicating whether the target user is associated with the target service.
According to an embodiment of the present application, the steps involved in the user identification method shown in fig. 4 may be performed by various modules in the user identification apparatus shown in fig. 9. For example, step S101 shown in fig. 4 may be performed by the obtaining module 901 in fig. 9, and steps S102 and S103 shown in fig. 4 may be performed by the dividing module 902 in fig. 9; step S104 shown in fig. 4 may be performed by the determination module 905 in fig. 9, and step S105 shown in fig. 4 may be performed by the adjustment module 905 in fig. 9.
According to an embodiment of the present application, each module in the user identification apparatus shown in fig. 9 may be respectively or entirely combined into one or several units to form the user identification apparatus, or some unit(s) may be further split into multiple sub-units with smaller functions, which may implement the same operation without affecting implementation of technical effects of embodiments of the present application. The modules are divided based on logic functions, and in practical application, the functions of one module can be realized by a plurality of units, or the functions of a plurality of modules can be realized by one unit. In other embodiments of the present application, the user identification device may also include other units, and in practical applications, these functions may also be implemented by the assistance of other units, and may be implemented by cooperation of a plurality of units.
According to an embodiment of the present application, the user identification apparatus as shown in fig. 9 may be constructed by running a computer program (including program codes) capable of executing the steps involved in the corresponding method as shown in fig. 4 on a general-purpose computer device such as a computer including a processing element such as a Central Processing Unit (CPU), a random access storage medium (RAM), a read-only storage medium (ROM), and a storage element, and the user identification method of the embodiment of the present application may be implemented. The computer program may be recorded on a computer-readable recording medium, for example, and loaded into and executed by the computing apparatus via the computer-readable recording medium.
In the present application, the electronic device first counts the first installation proportion corresponding to the reference application programs (i.e., application programs on the market) installed by the sample users and, according to the first installation proportion, divides the association relationship between each reference application program and the target service into association categories, obtaining the reference association category. Then, when the target application program matches the reference application program, the device determines that the target application program has an association relationship with the target service and uses the reference association category as the target association category corresponding to that association relationship. Because the target association category is obtained by matching the target application program against the reference application programs, there is no need to calculate a first installation proportion for the target application programs installed by each sample user; this reduces the amount of calculation and improves the efficiency of obtaining the target association category. Further, the target association category and the user attributes of the sample users can be used as features to train the user identification model, obtaining the target user identification model; in different application scenarios, the reference association categories corresponding to the reference application programs differ, and so do the target association categories corresponding to the target application programs.
That is to say, the present application can construct the features of the user identification model dynamically and adaptively according to the service scenario, which improves the degree of discrimination between application programs and in turn the identification accuracy of the user identification model.
Fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 10, the computer device 1000 may include a processor 1001, a network interface 1004, and a memory 1005, and may further include a user interface 1003 and at least one communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a standard wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., at least one disk memory). The memory 1005 may optionally also be at least one storage device located remotely from the processor 1001. As shown in fig. 10, the memory 1005, which is a kind of computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the computer device 1000 shown in fig. 10, the network interface 1004 may provide a network communication function; the user interface 1003 is an interface for providing a user with input; and the processor 1001 may be used to invoke a device control application stored in the memory 1005 to implement:
acquiring a sample user set for a target service, target application programs installed by sample users in the sample user set, user attributes of the sample users, and a reference application program having an association relationship with the target service;
counting a first installation proportion corresponding to the reference application program installed by the sample users in the sample user set, and dividing the association relationship between the reference application program and the target service into association categories according to the first installation proportion, to obtain a reference association category;
if the target application program is matched with the reference application program, determining that the target application program has an association relation with the target service, and taking the reference association category as a target association category corresponding to the association relation between the target application program and the target service;
and adjusting the user identification model by adopting the target association category and the user attribute of the sample user to obtain a target user identification model for identifying the target user associated with the target service.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the dividing of the association relationship between the reference application program and the target service into association categories according to the first installation proportion, to obtain a reference association category, including:
acquiring an application program category tree, wherein the application program category tree includes a root node, and a first leaf node and a second leaf node connected to the root node;
adding the reference application program to the root node of the application program category tree;
if the first installation proportion is greater than a first installation proportion threshold, adding the reference application program to the first leaf node, and determining that the reference association category is positive association;
if the first installation proportion is less than a second installation proportion threshold, adding the reference application program to the second leaf node, and determining that the reference association category is negative association; the first installation proportion threshold is greater than the second installation proportion threshold.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the determining, if the target application program matches the reference application program, that the target application program has an association relationship with the target service, and the using of the reference association category as the target association category corresponding to that association relationship, including:
traversing the root node of the application program category tree and, if the target application program matches a reference application program in the root node, determining that the target application program has an association relationship with the target service, and traversing the first leaf node according to a node path of the application program category tree;
if the target application program matches a reference application program in the first leaf node, determining that the target association category is positive association;
if the target application program does not match any reference application program in the first leaf node, traversing the second leaf node according to the node path of the application program category tree;
if the target application program matches a reference application program in the second leaf node, determining that the target association category is negative association.
Optionally, the positive association includes a first-level positive association and a second-level positive association, and the application program category tree further includes a third leaf node and a fourth leaf node connected to the first leaf node;
the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the adding, if the first installation proportion is greater than a first installation proportion threshold, of the reference application program to the first leaf node and the determining that the reference association category is positive association, including:
if the first installation proportion is greater than the first installation proportion threshold, adding the reference application program to the first leaf node, and acquiring a first installation amount of the reference application program among the sample users;
determining the coverage rate of the reference application program according to the first installation amount, and sorting the reference application programs according to the first installation amount;
accumulating the coverage rates of the reference application programs in the sorted order to obtain a sum of the coverage rates;
grouping the reference application programs according to the coverage rate to obtain a first group and a second group;
if the reference application program belongs to the first group, adding the reference application program to the third leaf node, and determining that the reference association category is first-level positive association;
if the reference application program belongs to the second group, adding the reference application program to the fourth leaf node, and determining that the reference association category is second-level positive association, wherein the first installation proportion corresponding to the reference application programs in the first group is greater than that corresponding to the reference application programs in the second group.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the determining, if the target application program matches a reference application program in the first leaf node, that the target association category is positive association, including:
if the target application program matches a reference application program in the first leaf node, traversing the third leaf node according to the node path of the application program category tree;
if the target application program matches a reference application program in the third leaf node, determining that the target association category is first-level positive association;
if the target application program does not match any reference application program in the third leaf node, traversing the fourth leaf node according to the node path of the application program category tree;
if the target application program matches a reference application program in the fourth leaf node, determining that the target association category is second-level positive association.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the adjusting of the user identification model by using the target association category and the user attributes of the sample user, to obtain a target user identification model for identifying a target user associated with the target service, including:
counting, among the target application programs installed by the sample user, the number of application programs having the target association category;
acquiring an annotated association label of the sample user, wherein the annotated association label reflects whether the sample user has an association relationship with the target service;
performing association identification on the number of application programs and the user attributes of the sample user by using the user identification model, to obtain a predicted association label;
and adjusting the user identification model according to the annotated association label and the predicted association label, to obtain the target user identification model.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement: receiving an interpretation request for the predicted association label; extracting service keywords associated with the target service according to the interpretation request;
comparing the service keywords with the target application program to determine the service relationship between the target application program and the target service;
and generating interpretation information by using the service relationship, wherein the interpretation information is used to explain the influence of the service relationship on the predicted association label.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the comparing of the service keywords with the target application program to determine the service relationship between the target application program and the target service, including:
if the target application program includes the service keywords, determining that the service relationship between the target application program and the target service is a homogeneous relationship;
and if the target application program does not include the service keywords, determining that the service relationship between the target application program and the target service is a heterogeneous relationship.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the obtaining of a reference application program having an association relationship with the target service, including:
acquiring an application program installation list, wherein the application program installation list includes candidate application programs;
counting a second installation proportion corresponding to the candidate application programs installed by the sample users in the sample user set;
screening out candidate application programs having a salient feature from the application program installation list according to the second installation proportion;
and using the candidate application programs having the salient feature as reference application programs having an association relationship with the target service.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the screening out of candidate application programs having the salient feature from the application program installation list according to the second installation proportion, including:
using candidate application programs in the application program installation list whose second installation proportion is greater than the first installation proportion threshold as candidate application programs having the salient feature; or,
using candidate application programs in the application program installation list whose second installation proportion is less than the second installation proportion threshold as candidate application programs having the salient feature; the first installation proportion threshold is greater than the second installation proportion threshold.
Optionally, the sample users in the sample user set include positive sample users and negative sample users, the positive sample users are the sample users in the sample user set labeled as being associated with the target service, and the negative sample users are the sample users in the sample user set labeled as being not associated with the target service;
the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the counting of a first installation proportion corresponding to the reference application program installed by the sample users in the sample user set, including:
counting a first installation share of positive sample users in the sample user set who have installed the reference application program, and a second installation share of negative sample users in the sample user set who have installed the reference application program;
determining the first installation proportion according to the first installation share and the second installation share.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement the counting of the first installation share of positive sample users in the sample user set who have installed the reference application program and the second installation share of negative sample users in the sample user set who have installed the reference application program, including:
counting, in the sample user set, the number of positive sample users, the number of negative sample users, a second installation amount of the reference application program among the positive sample users, and a third installation amount of the reference application program among the negative sample users;
using the ratio of the second installation amount to the number of positive sample users as the first installation share;
and using the ratio of the third installation amount to the number of negative sample users as the second installation share.
Optionally, the processor 1001 may be configured to invoke the device control application program stored in the memory 1005 to implement: receiving an identification request for a target user, wherein the identification request includes user attributes of the target user and historical application programs that the target user has installed;
if a historical application program matches the reference application program, using the reference association category as the historical association category between the historical application program and the target service;
and performing association identification on the historical association category and the user attributes of the target user by using the target user identification model, to obtain a target association label indicating whether the target user is associated with the target service.
It should be understood that the computer device 1000 described in this embodiment of the present application may perform the description of the user identification method in the embodiment corresponding to fig. 3 and fig. 7, and may also perform the description of the user identification apparatus in the embodiment corresponding to fig. 8, which is not described herein again. In addition, the beneficial effects of the same method are not described in detail.
In the present application, the electronic device first counts the first installation proportion corresponding to the reference application programs (i.e., application programs on the market) installed by the sample users and, according to the first installation proportion, divides the association relationship between each reference application program and the target service into association categories, obtaining the reference association category. Then, when the target application program matches the reference application program, the device determines that the target application program has an association relationship with the target service and uses the reference association category as the target association category corresponding to that association relationship. Because the target association category is obtained by matching the target application program against the reference application programs, there is no need to calculate a first installation proportion for the target application programs installed by each sample user; this reduces the amount of calculation and improves the efficiency of obtaining the target association category. Further, the target association category and the user attributes of the sample users can be used as features to train the user identification model, obtaining the target user identification model; in different application scenarios, the reference association categories corresponding to the reference application programs differ, and so do the target association categories corresponding to the target application programs.
That is to say, the present application can construct the features of the user identification model dynamically and adaptively according to the service scenario, which improves the degree of discrimination between application programs and in turn the identification accuracy of the user identification model.
Furthermore, it is to be noted here that: an embodiment of the present invention further provides a computer-readable storage medium, where a computer program executed by the aforementioned user identification apparatus is stored in the computer-readable storage medium, and the computer program includes program instructions, and when the processor executes the program instructions, the description of the user identification method in the embodiment corresponding to fig. 4 can be performed, so that details are not repeated here. In addition, the beneficial effects of the same method are not described in detail. For technical details not disclosed in embodiments of the computer-readable storage medium referred to in the present application, reference is made to the description of embodiments of the method of the present application.
By way of example, the program instructions described above may be executed on one computer device, or on multiple computer devices located at one site, or distributed across multiple sites and interconnected by a communication network, which may comprise a blockchain network.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present application and is not to be construed as limiting its scope; equivalent variations and modifications made to the present application therefore remain covered by it.

Claims (15)

1. A method for identifying a user, comprising:
acquiring a sample user set for a target service, target application programs installed by sample users in the sample user set, user attributes of the sample users, and a reference application program having an association relationship with the target service;
counting a first installation proportion with which the sample users in the sample user set have installed the reference application program, and classifying the association relationship between the reference application program and the target service into an association category according to the first installation proportion, to obtain a reference association category;
if the target application program matches the reference application program, determining that the target application program has an association relationship with the target service, and taking the reference association category as a target association category corresponding to the association relationship between the target application program and the target service; and
adjusting a user identification model by using the target association category and the user attributes of the sample users, to obtain a target user identification model for identifying a target user associated with the target service.
2. The method according to claim 1, wherein the classifying the association relationship between the reference application program and the target service into an association category according to the first installation proportion, to obtain a reference association category, comprises:
acquiring an application program category tree, the application program category tree comprising a root node, and a first leaf node and a second leaf node which are connected to the root node;
adding the reference application program to the root node of the application program category tree;
if the first installation proportion is greater than a first installation proportion threshold, adding the reference application program to the first leaf node, and determining that the reference association category is positive association; and
if the first installation proportion is smaller than a second installation proportion threshold, adding the reference application program to the second leaf node, and determining that the reference association category is negative association, the first installation proportion threshold being greater than the second installation proportion threshold.
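By way of illustration only, the tree construction recited above might be sketched in Python as follows. The `CategoryTree` structure, the `build_tree` helper, and the threshold values 0.6 and 0.2 are assumptions chosen for the example; the claim does not fix any of them, nor does it say what happens to a reference application program whose proportion lies between the two thresholds (here it stays at the root only).

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class CategoryTree:
    root: Set[str] = field(default_factory=set)         # every reference app
    first_leaf: Set[str] = field(default_factory=set)   # positive association
    second_leaf: Set[str] = field(default_factory=set)  # negative association


def build_tree(
    proportions: Dict[str, float],
    first_threshold: float,
    second_threshold: float,
) -> CategoryTree:
    """Place each reference app in the tree by its first installation proportion."""
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must exceed second threshold")
    tree = CategoryTree()
    for app, proportion in proportions.items():
        tree.root.add(app)
        if proportion > first_threshold:
            tree.first_leaf.add(app)
        elif proportion < second_threshold:
            tree.second_leaf.add(app)
        # Assumption: apps between the thresholds remain at the root only.
    return tree


# Hypothetical proportions and thresholds.
tree = build_tree({"app_a": 0.75, "app_b": 0.05, "app_c": 0.4}, 0.6, 0.2)
```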
3. The method of claim 2, wherein the determining that the target application program has an association relationship with the target service if the target application program matches the reference application program, and taking the reference association category as the target association category corresponding to the association relationship between the target application program and the target service, comprises:
traversing the root node of the application program category tree and, if the target application program matches a reference application program in the root node, determining that the target application program has an association relationship with the target service, and traversing the first leaf node according to a node path of the application program category tree;
if the target application program matches a reference application program in the first leaf node, determining that the target association category is positive association;
if the target application program does not match any reference application program in the first leaf node, traversing the second leaf node according to the node path of the application program category tree; and
if the target application program matches a reference application program in the second leaf node, determining that the target association category is negative association.
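By way of illustration only, the traversal order recited above (root node, then first leaf node, then second leaf node) might be sketched as below. The tree contents and the `lookup_category` helper are hypothetical, and the "unclassified" result for an application program matched only at the root is an assumption, since the claim is silent on that case.

```python
from typing import Dict, Optional, Set

# Hypothetical tree: the root holds all reference apps, the leaves hold subsets.
TREE: Dict[str, Set[str]] = {
    "root": {"app_a", "app_b", "app_c"},
    "first_leaf": {"app_a"},   # positive association
    "second_leaf": {"app_b"},  # negative association
}


def lookup_category(target_app: str, tree: Dict[str, Set[str]]) -> Optional[str]:
    """Return the target association category, or None if there is no match."""
    if target_app not in tree["root"]:
        return None  # no association relationship with the target service
    if target_app in tree["first_leaf"]:
        return "positive"
    if target_app in tree["second_leaf"]:
        return "negative"
    return "unclassified"  # assumption: matched at the root only
```

Checking the root first lets unrelated application programs be rejected without visiting any leaf node.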
4. The method of claim 3, wherein the positive association comprises first-level positive association and second-level positive association, and the application program category tree further comprises a third leaf node and a fourth leaf node connected to the first leaf node;
the adding the reference application program to the first leaf node and determining that the reference association category is positive association, if the first installation proportion is greater than the first installation proportion threshold, comprises:
if the first installation proportion is greater than the first installation proportion threshold, adding the reference application program to the first leaf node, and acquiring a first installation amount with which the sample users have installed the reference application program;
determining a coverage rate of the reference application program according to the first installation amount, and sorting the reference application programs according to the first installation proportion;
sequentially accumulating the coverage rates of the reference application programs in the sorted order to obtain a sum of the coverage rates;
grouping the reference application programs according to the coverage rate to obtain a first group and a second group;
if the reference application program belongs to the first group, adding the reference application program to the third leaf node, and determining that the reference association category is first-level positive association; and
if the reference application program belongs to the second group, adding the reference application program to the fourth leaf node, and determining that the reference association category is second-level positive association, wherein the first installation proportion corresponding to a reference application program in the first group is greater than the first installation proportion corresponding to a reference application program in the second group.
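By way of illustration only, one possible reading of the grouping steps above is sketched below. Several details are assumptions not stated in the claim: the coverage rate is taken as the first installation amount divided by the total number of sample users, the sorting is in descending order of first installation proportion, and the groups are split where the accumulated coverage first exceeds a hypothetical cutoff of 0.5.

```python
from typing import Dict, List, Tuple


def split_positive_apps(
    proportion: Dict[str, float],
    install_count: Dict[str, int],
    total_users: int,
    cutoff: float = 0.5,  # assumed cutoff on the accumulated coverage
) -> Tuple[List[str], List[str]]:
    """Split positively associated apps into first-level and second-level groups."""
    # Sort by first installation proportion, highest first (assumed order).
    ordered = sorted(proportion, key=lambda app: proportion[app], reverse=True)
    first_group: List[str] = []
    second_group: List[str] = []
    cumulative = 0.0
    for app in ordered:
        # Assumed coverage rate: installs divided by number of sample users.
        cumulative += install_count[app] / total_users
        if cumulative <= cutoff:
            first_group.append(app)
        else:
            second_group.append(app)
    return first_group, second_group


# Hypothetical data for three positively associated reference apps.
first, second = split_positive_apps(
    proportion={"a": 0.9, "b": 0.8, "c": 0.7},
    install_count={"a": 20, "b": 20, "c": 40},
    total_users=100,
)
```

Because the accumulated coverage never decreases, every application program in the first group precedes every one in the second group in the sorted order, which agrees with the requirement that first-group proportions exceed second-group proportions.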
5. The method of claim 4, wherein the determining that the target association category is positive association if the target application program matches a reference application program in the first leaf node comprises:
if the target application program matches a reference application program in the first leaf node, traversing the third leaf node according to the node path of the application program category tree;
if the target application program matches a reference application program in the third leaf node, determining that the target association category is first-level positive association;
if the target application program does not match any reference application program in the third leaf node, traversing the fourth leaf node according to the node path of the application program category tree; and
if the target application program matches a reference application program in the fourth leaf node, determining that the target association category is second-level positive association.
6. The method according to any one of claims 1-5, wherein the adjusting the user identification model by using the target association category and the user attributes of the sample users, to obtain the target user identification model for identifying the target user associated with the target service, comprises:
counting the number of application programs having the target association category among the target application programs installed by the sample user;
acquiring an annotated association label of the sample user, the annotated association label reflecting whether the sample user has an association relationship with the target service;
performing association identification on the number of application programs and the user attributes of the sample user by using the user identification model, to obtain a predicted association label; and
adjusting the user identification model according to the annotated association label and the predicted association label, to obtain the target user identification model.
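By way of illustration only, the feature construction and model adjustment recited above might be sketched as follows. The claim does not specify the model; a small logistic regression trained by gradient descent is used here purely as a stand-in, and the application names, the single numeric user attribute, the learning rate, and the epoch count are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

CATEGORIES = ("positive", "negative")


def build_features(
    apps: Sequence[str], app_category: Dict[str, str], attributes: Sequence[float]
) -> List[float]:
    """Per-category app counts followed by the user attributes."""
    counts = Counter(app_category[a] for a in apps if a in app_category)
    return [float(counts[c]) for c in CATEGORIES] + list(attributes)


def predict(weights: List[float], x: List[float]) -> float:
    """Predicted probability that the user is associated with the service."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def train(X: List[List[float]], y: List[int], epochs: int = 500, lr: float = 0.1):
    """Adjust the weights from the gap between annotated and predicted labels."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, label in zip(X, y):
            error = predict(weights, x) - label
            weights[0] -= lr * error
            for i, value in enumerate(x):
                weights[i + 1] -= lr * error * value
    return weights


# Hypothetical target association categories and sample users.
app_category = {"bank_app": "positive", "loan_app": "positive", "game_app": "negative"}
samples = [
    (["bank_app", "loan_app"], [1.0], 1),
    (["game_app"], [0.0], 0),
    (["bank_app"], [0.0], 1),
    (["game_app", "weather_app"], [1.0], 0),
]
X = [build_features(apps, app_category, attrs) for apps, attrs, _ in samples]
y = [label for _, _, label in samples]
weights = train(X, y)
```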
7. The method of claim 6, wherein the method further comprises:
receiving an interpretation request for the predicted association label, and extracting a service keyword associated with the target service according to the interpretation request;
comparing the service keyword with the target application program to determine a service relationship between the target application program and the target service; and
generating interpretation information by using the service relationship, the interpretation information being used for explaining the influence factor of the service relationship on the predicted association label.
8. The method of claim 7, wherein the comparing the service keyword with the target application program to determine the service relationship between the target application program and the target service comprises:
if the target application program comprises the service keyword, determining that the service relationship between the target application program and the target service is a homogeneous relationship; and
if the target application program does not comprise the service keyword, determining that the service relationship between the target application program and the target service is a heterogeneous relationship.
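By way of illustration only, the keyword comparison above might be sketched as follows. The claim does not say which part of the target application program is compared with the service keyword; this sketch assumes a case-insensitive substring test against the application name, and the `service_relationship` helper and sample names are hypothetical.

```python
from typing import Iterable


def service_relationship(app_name: str, service_keywords: Iterable[str]) -> str:
    """Return 'homogeneous' if the app name contains any service keyword."""
    name = app_name.lower()
    # Assumption: "comprises the keyword" means a substring of the app name.
    if any(keyword.lower() in name for keyword in service_keywords):
        return "homogeneous"
    return "heterogeneous"
```

For example, under these assumptions an application named "QuickLoan Pro" would be homogeneous with a service whose keyword is "loan", while "Weather Now" would be heterogeneous.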
9. The method of claim 1, wherein obtaining a reference application having an association with the target service comprises:
acquiring an application program installation list, the application program installation list comprising candidate application programs;
counting a second installation proportion with which the sample users in the sample user set have installed each candidate application program;
screening out, from the application program installation list, candidate application programs having a significance feature according to the second installation proportion; and
taking the candidate application programs having the significance feature as reference application programs having an association relationship with the target service.
10. The method of claim 9, wherein the screening out, from the application program installation list, candidate application programs having a significance feature according to the second installation proportion comprises:
taking a candidate application program whose second installation proportion is greater than a first installation proportion threshold in the application program installation list as a candidate application program having the significance feature; or
taking a candidate application program whose second installation proportion is smaller than a second installation proportion threshold in the application program installation list as a candidate application program having the significance feature; wherein the first installation proportion threshold is greater than the second installation proportion threshold.
11. The method of claim 1, wherein the sample users in the sample user set comprise positive sample users and negative sample users, the positive sample users being sample users in the sample user set labeled as associated with the target service, the negative sample users being sample users in the sample user set labeled as not associated with the target service;
the counting a first installation proportion with which the sample users in the sample user set have installed the reference application program comprises:
counting a first installation share with which the positive sample users in the sample user set have installed the reference application program, and a second installation share with which the negative sample users in the sample user set have installed the reference application program; and
determining the first installation proportion based on the first installation share and the second installation share.
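By way of illustration only, the two installation shares and their combination might be computed as sketched below. The claim does not give the combining formula; the ratio pos / (pos + neg) used here is an assumption chosen so that the result lies between 0 and 1 and grows when the reference application program is more common among positive sample users. All names and data are hypothetical.

```python
from typing import List, Set


def installation_share(users: List[Set[str]], app: str) -> float:
    """Fraction of the given users who have installed the app."""
    if not users:
        return 0.0
    return sum(app in installed for installed in users) / len(users)


def first_installation_proportion(
    positive_users: List[Set[str]], negative_users: List[Set[str]], app: str
) -> float:
    """Combine the positive and negative shares (assumed formula)."""
    pos = installation_share(positive_users, app)
    neg = installation_share(negative_users, app)
    if pos + neg == 0:
        return 0.0  # nobody installed the app
    return pos / (pos + neg)  # assumption: not specified by the claim


# Hypothetical installed-app sets for positive and negative sample users.
positives = [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}]
negatives = [{"a"}, {"c"}, {"c"}, {"b"}]
proportion_a = first_installation_proportion(positives, negatives, "a")
```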
12. The method of claim 1, wherein the method further comprises:
receiving an identification request for a target user, the identification request comprising user attributes of the target user and historical application programs that the target user has installed;
if a historical application program matches the reference application program, taking the reference association category as a historical association category between the historical application program and the target service; and
performing association identification on the historical association category and the user attributes of the target user by using the target user identification model, to obtain a target association label indicating whether the target user is associated with the target service.
13. A user identification device, comprising:
an acquisition module, configured to acquire a sample user set for a target service, target application programs installed by sample users in the sample user set, user attributes of the sample users, and a reference application program having an association relationship with the target service;
a dividing module, configured to count a first installation proportion with which the sample users in the sample user set have installed the reference application program, and to classify the association relationship between the reference application program and the target service into an association category according to the first installation proportion, to obtain a reference association category;
a determining module, configured to determine that the target application program has an association relationship with the target service if the target application program matches the reference application program, and to take the reference association category as a target association category corresponding to the association relationship between the target application program and the target service; and
an adjusting module, configured to adjust a user identification model by using the target association category and the user attributes of the sample users, to obtain a target user identification model for identifying a target user associated with the target service.
14. A computer device, comprising:
a processor and a memory;
the processor is coupled to the memory, wherein the memory is configured to store program code and the processor is configured to invoke the program code to perform the method of any of claims 1-12.
15. A computer-readable storage medium, in which a computer program is stored which is adapted to be loaded by a processor and to carry out the method according to any one of claims 1 to 12.
CN202110583938.0A 2021-05-27 2021-05-27 User identification method, device, equipment and storage medium Pending CN114676740A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110583938.0A CN114676740A (en) 2021-05-27 2021-05-27 User identification method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110583938.0A CN114676740A (en) 2021-05-27 2021-05-27 User identification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114676740A true CN114676740A (en) 2022-06-28

Family

ID=82069982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110583938.0A Pending CN114676740A (en) 2021-05-27 2021-05-27 User identification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114676740A (en)

Similar Documents

Publication Publication Date Title
CN107818344B (en) Method and system for classifying and predicting user behaviors
US20230289828A1 (en) Data processing method, computer device, and readable storage medium
CN113011889B (en) Account anomaly identification method, system, device, equipment and medium
CN111625715B (en) Information extraction method and device, electronic equipment and storage medium
CN111258995A (en) Data processing method, device, storage medium and equipment
CN110598070A (en) Application type identification method and device, server and storage medium
CN111159563A (en) Method, device and equipment for determining user interest point information and storage medium
US20200301908A1 (en) Dynamic Document Reliability Formulation
CN111797320A (en) Data processing method, device, equipment and storage medium
CN111522724A (en) Abnormal account determination method and device, server and storage medium
WO2023024408A1 (en) Method for determining feature vector of user, and related device and medium
CN106294406A (en) A kind of method and apparatus accessing data for processing application
CN111192170A (en) Topic pushing method, device, equipment and computer readable storage medium
CN113256335B (en) Data screening method, multimedia data delivery effect prediction method and device
CN112069269B (en) Big data and multidimensional feature-based data tracing method and big data cloud server
CN111597361B (en) Multimedia data processing method, device, storage medium and equipment
CN115131052A (en) Data processing method, computer equipment and storage medium
CN116956183A (en) Multimedia resource recommendation method, model training method, device and storage medium
CN114490673B (en) Data information processing method and device, electronic equipment and storage medium
CN115618024A (en) Multimedia recommendation method and device and electronic equipment
CN114676740A (en) User identification method, device, equipment and storage medium
CN114268625B (en) Feature selection method, device, equipment and storage medium
CN114741540A (en) Multimedia sequence recommendation method, operation prediction model training method, device, equipment and storage medium
US11288322B2 (en) Conversational agents over domain structured knowledge
CN111126503B (en) Training sample generation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination