CN110020162B - User identification method and device - Google Patents

User identification method and device

Info

Publication number
CN110020162B
Authority
CN
China
Prior art keywords
user
similarity
models
model
user model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711337552.1A
Other languages
Chinese (zh)
Other versions
CN110020162A (en
Inventor
柴静伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201711337552.1A
Publication of CN110020162A
Application granted
Publication of CN110020162B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a user identification method and device, and relates to the technical field of computers. One embodiment of the method comprises: classifying the historical data of the virtual account to obtain historical data of a plurality of categories; constructing a user model according to the historical data of each category to obtain a plurality of user models corresponding to the virtual account; and identifying the user according to the similarity among the plurality of user models. According to the embodiment, a plurality of user models can be built under the condition that a plurality of users use the same virtual account, the users are identified according to the similarity among the user models, the interference of the superposition behaviors of the users on the building of the user models and the identification of the users is reduced, and the accuracy of the user models and the accuracy of the identification of the users are improved.

Description

User identification method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a user identification method and apparatus.
Background
A user model is a labeled user profile abstracted from information such as the user's social attributes, living habits, and consumption behaviors. It gives companies a sufficient information basis and helps them quickly identify precise user groups and obtain broader feedback such as user needs.
Currently, in the internet industry, a user model is generally constructed as follows: user data is obtained in various ways; the data is sorted, filtered, cleaned, and superimposed; a user model is built; and a user portrait is drawn.
However, in the process of implementing the present invention, the inventor found that the prior art has at least the following problem:
As shown in FIG. 1, the same virtual account may be used by multiple users, so the obtained user data may be the superimposed behavior data of those users. A user model (or user portrait) obtained by analyzing superimposed user behavior does not accurately represent any real user, and using such a model to identify (or locate) a user for purposes such as personalized recommendation and advertisement delivery may produce large deviations.
Disclosure of Invention
In view of this, embodiments of the present invention provide a user identification method and apparatus, which can construct multiple user models when multiple users use the same virtual account and identify users according to the similarity between the multiple user models, so as to reduce the interference of the users' superimposed behaviors with user model construction and user identification, and to improve the accuracy of both the user models and user identification.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a user identification method including: classifying the historical data of the virtual account to obtain historical data of a plurality of categories; constructing a user model according to the historical data of each category to obtain a plurality of user models corresponding to the virtual account; and identifying the user according to the similarity among the plurality of user models.
Optionally, the classifying the historical data of the virtual account includes: classifying the historical data of the virtual account according to a time period, wherein the historical data in the same time period is a category; or classifying the historical data of the virtual account according to the time period and the unique identifier, wherein the historical data of the same unique identifier in the same time period is of one category.
Optionally, the performing, according to the similarity between the multiple user models, user identification includes: for each user model, calculating the similarity between the user model and other user models; taking other user models with the similarity smaller than a first similarity threshold value as similar user models of the user models; calculating the average user similarity of the user models and the average user similarity of similar user models of the user models; and identifying the user according to the average user similarity of the user model and the average user similarity of the similar user model.
Optionally, for each user model, the similarity between the user model and other user models is calculated according to the following manner:
[Formula (1): equation image not reproduced in this text]
where K[N,M] represents the similarity between user model userN and user model userM, S_userN represents the target weight of user model userN, and S_userM represents the target weight of user model userM.
Optionally, a difference between the target weight of each user model and a preset target weight of a reference user model is less than or equal to a weight threshold.
To achieve the above object, according to another aspect of the embodiments of the present invention, there is provided a user identification apparatus including: the data classification module is used for classifying the historical data of the virtual account to obtain multiple categories of historical data; the model building module is used for building a user model according to the historical data of each category so as to obtain a plurality of user models corresponding to the virtual account; and the user identification module is used for identifying the user according to the similarity among the plurality of user models.
Optionally, the data classification module is further configured to: classifying the historical data of the virtual account according to a time period, wherein the historical data in the same time period is a category; or classifying the historical data of the virtual account according to the time period and the unique identifier, wherein the historical data of the same unique identifier in the same time period is of one category.
Optionally, the user identification module is further configured to: for each user model, calculating the similarity between the user model and other user models; taking other user models with the similarity smaller than a first similarity threshold value as similar user models of the user models; calculating the average user similarity of the user models and the average user similarity of similar user models of the user models; and identifying the user according to the average user similarity of the user model and the average user similarity of the similar user model.
Optionally, for each user model, the similarity between the user model and other user models is calculated according to the following manner:
[Formula (1): equation image not reproduced in this text]
where K[N,M] represents the similarity between user model userN and user model userM, S_userN represents the target weight of user model userN, and S_userM represents the target weight of user model userM.
Optionally, a difference between the target weight of each user model and a preset target weight of a reference user model is less than or equal to a weight threshold.
To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided an electronic apparatus including: one or more processors; a storage device, configured to store one or more programs, which when executed by the one or more processors, cause the one or more processors to implement the user identification method provided in any of the above embodiments.
To achieve the above object, according to a further aspect of the embodiments of the present invention, there is provided a computer readable medium, on which a computer program is stored, the program, when executed by a processor, implementing the user identification method provided by any of the above embodiments.
One embodiment of the above invention has the following advantages or benefits: the historical data of the virtual account is classified to obtain historical data of a plurality of categories; a user model is constructed from the historical data of each category to obtain a plurality of user models corresponding to the virtual account; and users are identified according to the similarity among the plurality of user models. These technical means overcome the technical problem that user models in the prior art are inaccurate, reduce the interference of multiple users' superimposed behaviors with the user models, and thereby improve the accuracy of both the user models and user identification.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a prior art method of building a user model;
FIG. 2 is a schematic diagram of a main flow of a user identification method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of building a user model in a user identification method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a main flow of a user identification method according to another embodiment of the present invention;
FIG. 5 is a schematic diagram of the main modules of a user model building apparatus according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 7 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
FIG. 2 is a schematic diagram of a main flow of a user identification method according to an embodiment of the present invention. As shown in FIG. 2, the method includes:
step S201: classifying the historical data of the virtual account to obtain historical data of a plurality of categories;
step S202: constructing a user model according to the historical data of each category to obtain a plurality of user models corresponding to the virtual account;
step S203: and identifying the user according to the similarity among the plurality of user models.
For step S201, referring to FIG. 4, the historical data may be the user's internet log data and may be obtained by a web crawler. Channels for obtaining internet log data include, but are not limited to, portal sites, video sites, e-commerce sites, travel sites, forums, and social media (e.g., microblogs, WeChat). Taking an e-commerce website as an example, the acquired historical data may include detailed information on browsed commodities (such as commodity type, price, and use), the duration of browsing the same commodity, historical purchase records, records of making a round, discount usage frequency, search records, records of communication with customer service staff (such as communication duration and content), and commodity review records. Taking a video website as an example, the historical data may include search records, browsing records, viewing records, and video content details (e.g., drama, director, actors).
Specifically, the historical data of the virtual account may be classified according to a preset rule. For example, the acquired historical data may be classified according to a time period, and the historical data in the same time period is a category. The time period can be flexibly set according to the requirement, and the invention is not limited herein.
As an example, 30 days of internet log data for a virtual account are obtained as historical data, each of the 30 days is divided into N (N ≥ 2) time periods, and the historical data in the same time period is classified into one category. For instance, each of the 30 days may be divided into 4 time periods, T1, T2, T3, and T4, where T1 is 08:00-14:00, T2 is 14:00-20:00, T3 is 20:00-02:00, and T4 is 02:00-08:00. The acquired historical data is then divided into 4 categories, with the historical data in the same time period forming one category.
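The four-period split in this example can be sketched as follows. This is a minimal illustration, not the patented implementation: the record schema (a dict with `time` and `event` fields) and the function names are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def period_of(ts: datetime) -> str:
    """Map a timestamp to one of the four daily periods from the example."""
    h = ts.hour
    if 8 <= h < 14:
        return "T1"  # 08:00-14:00
    if 14 <= h < 20:
        return "T2"  # 14:00-20:00
    if h >= 20 or h < 2:
        return "T3"  # 20:00-02:00, wraps past midnight
    return "T4"      # 02:00-08:00


def classify_by_period(records):
    """Group log records (hypothetical schema: dicts with a 'time' field) by period."""
    groups = defaultdict(list)
    for rec in records:
        groups[period_of(rec["time"])].append(rec)
    return dict(groups)


logs = [
    {"time": datetime(2017, 12, 1, 9, 30), "event": "browse"},
    {"time": datetime(2017, 12, 1, 21, 5), "event": "search"},
    {"time": datetime(2017, 12, 2, 1, 15), "event": "purchase"},
    {"time": datetime(2017, 12, 2, 3, 0), "event": "browse"},
]
by_period = classify_by_period(logs)
```

Records from different calendar days land in the same category when they fall in the same daily period, which is what lets each category approximate one habitual usage pattern.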
In other alternative embodiments, the acquired historical data may be classified according to a time period and a unique identifier, and the historical data with the same unique identifier in the same time period is a category. Wherein, the unique identification can be a device number or an IP address.
The device number may be an IMEI (International Mobile Equipment Identity), a UUID (Universally Unique Identifier), or a MAC (Media Access Control) address.
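Classification by both time period and unique identifier can be sketched by keying each record on the pair (period, identifier). The field name `device_id`, the equal-length periods starting at 08:00, and the function names are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime


def period_index(ts: datetime, n_periods: int = 4, start_hour: int = 8) -> int:
    """Index of the daily period containing ts.

    ASSUMPTION: the day is split into n_periods equal periods that start
    at start_hour, matching the 4-period example when the defaults are used.
    """
    hours_since_start = (ts.hour - start_hour) % 24
    return hours_since_start * n_periods // 24


def classify_by_period_and_id(records):
    """Group records by (period index, unique identifier)."""
    groups = defaultdict(list)
    for rec in records:
        key = (period_index(rec["time"]), rec["device_id"])
        groups[key].append(rec)
    return dict(groups)


logs = [
    {"time": datetime(2017, 12, 1, 9, 0), "device_id": "imei-A"},
    {"time": datetime(2017, 12, 1, 10, 0), "device_id": "imei-B"},
    {"time": datetime(2017, 12, 2, 9, 30), "device_id": "imei-A"},
    {"time": datetime(2017, 12, 2, 21, 0), "device_id": "imei-A"},
]
categories = classify_by_period_and_id(logs)
```

Two devices active in the same period produce two categories, so activity from different devices is never merged into one user model.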
In an alternative embodiment, referring to FIG. 4, before classifying the historical data, the method further includes cleaning the historical data.
Specifically, the historical data can be processed by incomplete data detection, error value detection, repeated value detection, consistency detection, and similar methods, so that the data quality is improved and the data is more suitable for mining and analysis.
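A minimal cleaning pass covering three of the checks named above (incomplete data, error values, repeated values) might look like this. The required fields and the rule that a price must be non-negative are assumptions chosen for the example; the text does not specify concrete rules.

```python
REQUIRED_FIELDS = ("account", "time", "event")  # hypothetical schema


def clean(records):
    """Drop incomplete, invalid, and repeated records; keep the rest in order."""
    seen = set()
    cleaned = []
    for rec in records:
        # Incomplete-data check: every required field must be present and non-empty.
        if any(rec.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Error-value check (assumed rule): a price, when present, must be a non-negative number.
        price = rec.get("price")
        if price is not None and (not isinstance(price, (int, float)) or price < 0):
            continue
        # Repeated-value check: skip exact duplicates of an earlier record.
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


raw = [
    {"account": "a1", "time": "2017-12-01T09:30", "event": "browse", "price": 10.0},
    {"account": "a1", "time": "2017-12-01T09:30", "event": "browse", "price": 10.0},  # duplicate
    {"account": "a1", "time": "", "event": "search"},                                  # incomplete
    {"account": "a1", "time": "2017-12-01T10:00", "event": "purchase", "price": -5},   # bad value
]
cleaned = clean(raw)
```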
Furthermore, text analysis can be performed on the historical data by using a text mining technology, a natural language processing method, a machine learning algorithm, a prediction algorithm and the like.
For step S202, the step of building a user model from the historical data for each category may include:
according to a preset multi-dimensional label library, matching historical data of each category to obtain multi-dimensional labels;
and constructing a plurality of user models corresponding to the historical data of the plurality of categories according to the multi-dimensional labels.
The multi-dimensional label library can be preset according to application requirements, so that a user model can be constructed for the specific application, product, or project and is therefore more accurate. A multi-dimensional label is a label that reflects a user's characteristics from multiple dimensions. Taking e-commerce websites as an example, the multi-dimensional labels in the embodiment of the present invention include, but are not limited to, basic attribute labels, behavior feature labels, purchasing power labels, hobby labels, psychological feature labels, and social attribute labels. Each label includes a plurality of sub-labels. Basic attribute labels include, but are not limited to, the following sub-labels: gender, age group, height, weight, marital status, household income, and occupation. Behavior feature labels include, but are not limited to, the following sub-labels: online time, login time, purchase frequency, etc. Purchasing power labels include, but are not limited to, the following sub-labels: number of purchases of mid-to-low-end products, high-end products, light luxury goods, and luxury goods. Hobby labels include, but are not limited to, the following sub-labels: outdoor sports, fitness, painting, music, reading, etc. Psychological feature labels include, but are not limited to, the following sub-labels: decisive, indecisive, introverted, extroverted, etc. Social attribute labels include, but are not limited to, the following sub-labels: family gatherings, friend gatherings, telephone communication, and commonly used social applications.
In this embodiment, the user model may also be referred to as a user portrait, that is, a labeled representation of user information: after collecting and analyzing data on main information such as consumers' social attributes, living habits, and consumption behaviors, an enterprise abstracts an overall picture of the user. The user model is a virtual representation of a real user, built on a series of real data, and is a basic way of applying big data technology. It gives enterprises a sufficient information basis and helps them quickly identify precise user groups and obtain broader feedback such as user needs.
The classified historical data is matched against the preset multi-dimensional label library to obtain the multi-dimensional labels of each category of historical data, and a user model corresponding to each category of historical data is constructed from those labels, thereby obtaining a plurality of user models corresponding to the virtual account.
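The label-matching step can be sketched as a lookup against a preset library. Everything concrete here is hypothetical: the label names, their weights, the keyword-based matching rule, and the `item` field are assumptions, since the text does not say how a record is matched to a label.

```python
# Hypothetical label library: label -> (weight, item keywords that trigger the label).
# The weights sum to 100.
LABEL_LIBRARY = {
    "hobby:outdoor_sports": (30, {"tent", "hiking boots"}),
    "hobby:music": (20, {"headphones", "guitar"}),
    "purchasing_power:luxury": (50, {"designer handbag"}),
}


def build_user_model(category_records):
    """Return the set of labels whose keywords appear in one category's records."""
    items = {rec["item"] for rec in category_records}
    return {
        label
        for label, (_weight, keywords) in LABEL_LIBRARY.items()
        if items & keywords
    }


def build_models(history_by_category):
    """Build one user model (a set of matched labels) per category of historical data."""
    return {cat: build_user_model(recs) for cat, recs in history_by_category.items()}


history_by_category = {
    "T1": [{"item": "tent"}, {"item": "guitar"}],
    "T3": [{"item": "designer handbag"}],
}
models = build_models(history_by_category)
```

Each category yields its own model, so one virtual account ends up with as many candidate user models as there are categories.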
For step S203, in an actual application scenario, the same virtual account may be used by one real user or by multiple real users. In this embodiment, a plurality of user models are constructed for the same virtual account, and the similarity between the user models is calculated to identify which user models belong to the same real user.
Specifically, the user identification according to the similarity between the plurality of user models may include the steps of:
for each user model, calculating the similarity between the user model and other user models;
taking other user models with the similarity smaller than a first similarity threshold value as similar user models of the user models;
calculating the average user similarity of the user models and the average user similarity of similar user models of the user models;
and identifying the user according to the average user similarity of the user model and the average user similarity of the similar user model.
Further, for each user model, the similarity between the user model and other user models is calculated according to the following manner:
[Formula (1): equation image not reproduced in this text]
where K[N,M] represents the similarity between user model userN and user model userM, S_userN represents the target weight of user model userN, and S_userM represents the target weight of user model userM.
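Because formula (1) survives only as an image reference, its exact form cannot be read from this text. One form that is consistent with the numbers in the worked example later in this description (target weights 80, 82, 96, 86, 77 and the stated differences between average similarities) is the relative difference of target weights. This is an inference from that example, not the original figure:

```latex
K_{[N,M]} = \frac{\left| S_{userN} - S_{userM} \right|}{S_{userN}} \times 100\%
```

Under this reading a smaller value of K means the two models are closer, which agrees with the rule that models whose similarity is below the first similarity threshold are treated as similar. For target weights 80 and 82, for example, it gives K[1,2] = 2/80 = 2.5%.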
Further, the difference between the target weight of each user model and the target weight of the preset reference user model is less than or equal to the weight threshold.
The multi-dimensional label library includes label weights, and the sum of the label weights is 100. For the historical data of each category, the weights of the matched labels are summed to determine the initial weight (that is, the sum of the weights) of the user model corresponding to that category. It is then judged whether the difference between the initial weight of each user model and the target weight of the preset reference user model is less than or equal to a weight threshold. If this holds for all user models, the initial weight of each user model is taken as its target weight. Otherwise, the label weights are adjusted until the difference between the target weight of every user model and the target weight of the preset reference user model is less than or equal to the weight threshold. Because user behavior changes over time and attribute characteristics change with it, adjusting the label weights when calculating the target weights of the user models optimizes the algorithm and improves accuracy.
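The weight computation and the threshold check can be sketched as follows. The label names, weights, reference weight, and threshold are all made-up values; the text does not say how the label weights are adjusted when the check fails, so that step is deliberately left out.

```python
# Hypothetical label weights; they sum to 100 as the text requires.
LABEL_WEIGHTS = {
    "basic": 20,
    "behavior": 25,
    "purchasing_power": 25,
    "hobby": 15,
    "social": 15,
}


def initial_weight(matched_labels, label_weights=LABEL_WEIGHTS):
    """Initial weight of a user model: the sum of the weights of its matched labels."""
    return sum(label_weights[label] for label in matched_labels)


def all_within_threshold(model_weights, reference_weight, weight_threshold):
    """True if every model's weight is within weight_threshold of the reference weight."""
    return all(
        abs(weight - reference_weight) <= weight_threshold
        for weight in model_weights.values()
    )


matched = {
    "T1": ["basic", "behavior", "purchasing_power", "hobby"],
    "T2": ["basic", "behavior", "purchasing_power"],
}
weights = {cat: initial_weight(labels) for cat, labels in matched.items()}
accepted = all_within_threshold(weights, reference_weight=80, weight_threshold=10)
```

When the check passes, the initial weights become the target weights; when it fails, the label weights would be adjusted and the check repeated.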
The user average similarity of a user model is determined by the similarities between the user model and its similar user models; it may be, for example, the average or the variance of those similarities. In this embodiment, the user average similarity of a user model is the average of the similarities between the user model and its similar user models. The user average similarity of each similar user model is calculated in the same way.
The step of identifying the user according to the user average similarity of the user model and the user average similarity of the similar user model may include:
calculating the difference between the average user similarity of the user model and the average user similarity of the similar user model;
and if the difference is smaller than the user average similarity threshold, determining that the user of the user model and the user of the similar user model are the same real user.
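The two steps above can be sketched as a pair of small functions. The similarity values and the threshold below are made-up numbers used only to exercise the rule, and the dictionary layout is an assumption.

```python
def average_similarity(model, similar, k):
    """Mean of k[(model, other)] over the model's similar user models."""
    others = similar[model]
    return sum(k[(model, other)] for other in others) / len(others)


def same_real_user(a, b, similar, k, avg_threshold):
    """Apply the rule: same real user if the average similarities differ by less than the threshold."""
    diff = abs(average_similarity(a, similar, k) - average_similarity(b, similar, k))
    return diff < avg_threshold


# Hypothetical data: similar-model sets and pairwise similarities in percent.
similar = {"a": ["b", "c"], "c": ["d", "f"]}
k = {("a", "b"): 2.0, ("a", "c"): 3.0, ("c", "d"): 2.5, ("c", "f"): 3.5}

is_same = same_real_user("a", "c", similar, k, avg_threshold=1.0)
```

Here the averages are 2.5 and 3.0, so the difference of 0.5 is below the threshold of 1.0 and the two models are attributed to the same real user.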
As an example, the similar user models of user model user_a include user_b and user_c, and the similar user models of user model user_c include user_d and user_f. Then the user average similarity of user model user_a is
$$S(user_a) = \frac{K_{[a,b]} + K_{[a,c]}}{2}$$
and the user average similarity of user_c, a similar user model of user_a, is
$$S(user_c) = \frac{K_{[c,d]} + K_{[c,f]}}{2}$$
Calculate the difference between S(user_a) and S(user_c). If the difference is less than or equal to the user average similarity threshold, the user of user model user_a and the user of user model user_c are the same real user.
According to the user identification method provided by the embodiment of the invention, the historical data of the virtual account is classified to obtain historical data of a plurality of categories; a user model is constructed from the historical data of each category to obtain a plurality of user models corresponding to the virtual account; and users are identified according to the similarity among the plurality of user models. These technical means overcome the technical problem that user models in the prior art are inaccurate, reduce the interference of multiple users' superimposed behaviors with the user models, and thereby improve the accuracy of both the user models and user identification.
Referring to fig. 3 and 4, for a virtual account, the method according to the embodiment of the present invention first constructs a plurality of user models, and then determines whether users of the plurality of user models are the same real user according to the similarity between the plurality of user models, so as to solve the technical problems of deviation and low accuracy caused by constructing a user portrait by a multi-user superposition behavior in the prior art.
The following describes a user identification method provided by the embodiment of the present invention with reference to a specific example.
For the same virtual account, a plurality of user models corresponding to the virtual account are constructed according to the above steps: user1, user2, user3, user4, and user5. Their target weights are S_user1 = 80, S_user2 = 82, S_user3 = 96, S_user4 = 86, and S_user5 = 77. The first similarity threshold is 5% and the second similarity threshold is 1%. The first similarity threshold and the second similarity threshold may be set according to project requirements or experience, and the present invention is not limited herein.
The similarity between the user models according to the above formula (1) is shown in table 1 below:
Table 1:
[Table image not reproduced in this text]
then, from the above table, the user model user1 and the user model user2 are similar user models, the user model user1 and the user model user5 are similar user models, and the user model user2 and the user model user4 are similar user models.
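Table 1 is not reproduced in this text, but the similar pairs listed above can be recovered from the target weights under one assumption: that the similarity is the relative difference of target weights, K[N,M] = |S_userN - S_userM| / S_userN. That formula is inferred from the numbers in this example rather than read from the original figure. A minimal sketch:

```python
weights = {"user1": 80, "user2": 82, "user3": 96, "user4": 86, "user5": 77}
FIRST_THRESHOLD = 5.0  # percent, from the example


def similarity(s_n, s_m):
    """ASSUMED form of the similarity: relative difference of target weights, in percent."""
    return abs(s_n - s_m) / s_n * 100


# For each model, the other models whose similarity is below the first threshold.
similar = {
    n: [
        m
        for m in weights
        if m != n and similarity(weights[n], weights[m]) < FIRST_THRESHOLD
    ]
    for n in weights
}
```

Under this assumption user1 is similar to user2 and user5, user2 is similar to user1 and user4, and user3 has no similar model, matching the pairs stated in the text.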
Then, the user average similarity between the similar user models is calculated.
The user average similarity of user model user1 is
[Equation image not reproduced in this text]
The user average similarity of user model user2 is
[Equation image not reproduced in this text]
The user average similarity of the user model user4 is 4.6%; the user model user5 has an average similarity of 3.8%.
The difference between the user average similarity of user model user1 and that of user model user2 is 0.533%, so the user of user model user1 and the user of user model user2 are the same real user. The difference between the user average similarity of user model user1 and that of user model user4 is 1.475%, so the user of user model user1 and the user of user model user4 are not the same real user. The difference between the user average similarity of user model user1 and that of user model user5 is 0.675%, so the user of user model user1 and the user of user model user5 are the same real user. In summary, the users of user models user1, user2, and user5 are the same real user, the user of user model user3 is a second real user, and the user of user model user4 is a third real user.
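The three comparisons in this example can be reproduced with a short script. It assumes the similarity is the relative difference of target weights, K[N,M] = |S_userN - S_userM| / S_userN, which is inferred from the numbers here and is not taken from the original formula image. The script performs only the comparisons the text makes, each anchored on user1.

```python
weights = {"user1": 80, "user2": 82, "user3": 96, "user4": 86, "user5": 77}
FIRST_THRESHOLD = 5.0   # percent: below this, two models are "similar"
SECOND_THRESHOLD = 1.0  # percent: user average similarity threshold


def similarity(s_n, s_m):
    """ASSUMED form of the similarity: relative difference of target weights, in percent."""
    return abs(s_n - s_m) / s_n * 100


def similar_models(n):
    """Other models whose similarity to n is below the first threshold."""
    return [
        m
        for m in weights
        if m != n and similarity(weights[n], weights[m]) < FIRST_THRESHOLD
    ]


def average_similarity(n):
    """Mean similarity between n and its similar models (n must have at least one)."""
    others = similar_models(n)
    return sum(similarity(weights[n], weights[m]) for m in others) / len(others)


anchor = "user1"
same_as_anchor = {
    other: abs(average_similarity(anchor) - average_similarity(other)) < SECOND_THRESHOLD
    for other in ("user2", "user4", "user5")
}
```

The unrounded intermediate values differ slightly from the figures in the text, which appear to be truncated (for example, about 4.65% rather than 4.6% for user4), but the three conclusions are the same: user2 and user5 match user1, and user4 does not.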
According to the user identification method provided by the embodiment of the invention, the historical data of the virtual account is classified to obtain historical data of a plurality of categories; a user model is constructed from the historical data of each category to obtain a plurality of user models corresponding to the virtual account; and users are identified according to the similarity among the plurality of user models. These technical means overcome the technical problem that user models in the prior art are inaccurate, reduce the interference of multiple users' superimposed behaviors with the user models, and improve the accuracy of both the user models and user identification. Furthermore, the method can provide a strong data basis for operators and enterprises to carry out fast and accurate advertisement delivery, personalized recommendation, and account-theft risk monitoring.
FIG. 5 is a schematic diagram of the main modules of a user model building apparatus according to an embodiment of the present invention. As shown in FIG. 5, the apparatus includes: a data classification module 501, configured to classify historical data of the virtual account to obtain multiple categories of historical data; a model building module 502, configured to build a user model according to the historical data of each category to obtain a plurality of user models corresponding to the virtual account; and a user identification module 503, configured to identify users according to the similarity between the plurality of user models.
Optionally, the data classification module 501 is further configured to: classifying the historical data of the virtual account according to a time period, wherein the historical data in the same time period is a category; or classifying the historical data of the virtual account according to the time period and the unique identifier, wherein the historical data of the same unique identifier in the same time period is of one category.
Optionally, the user identification module 503 is further configured to: for each user model, calculating the similarity between the user model and other user models; taking other user models with the similarity smaller than a first similarity threshold value as similar user models of the user models; calculating the average user similarity of the user models and the average user similarity of similar user models of the user models; and identifying the user according to the average user similarity of the user model and the average user similarity of the similar user model.
Optionally, for each user model, the similarity between the user model and other user models is calculated according to the following manner:
Figure BDA0001507654970000131
where $K_{[N,M]}$ denotes the similarity between user model userN and user model userM, $S_{userN}$ denotes the target weight of user model userN, and $S_{userM}$ denotes the target weight of user model userM.
Optionally, a difference between the target weight of each user model and a preset target weight of a reference user model is less than or equal to a weight threshold.
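The weight condition can be read as a simple filter. The sketch below assumes that "difference" means absolute difference and that target weights are plain numbers keyed by model name; both are assumptions for illustration, as is the function name.

```python
def within_weight_threshold(target_weights, reference_weight, weight_threshold):
    """Keep the models whose target weight is close enough to the reference weight.

    Assumption: 'difference' is interpreted as the absolute difference.
    """
    return {
        name: weight
        for name, weight in target_weights.items()
        if abs(weight - reference_weight) <= weight_threshold
    }
```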
The device can execute the method provided by the embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present invention.
Fig. 6 shows an exemplary system architecture 600 of a user identification method or user identification apparatus to which embodiments of the invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 provides the medium for communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various types of connections, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 601, 602, 603 to interact with the server 605 via the network 604 to receive or send messages or the like. Various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, and the like, may be installed on the terminal devices 601, 602, and 603.
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 605 may be a server that provides various services, such as a background management server that supports shopping websites browsed by users using the terminal devices 601, 602, and 603. The background management server may analyze and perform other processing on the received data such as the product information query request, and feed back a processing result (e.g., target push information and product information) to the terminal device.
It should be noted that the user identification method provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the user identification apparatus is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, a block diagram is shown of a computer system 700 suitable for implementing a terminal device of an embodiment of the present invention. The terminal device shown in fig. 7 is only an example and should not impose any limitation on the functions or the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 710 as necessary, so that a computer program read from it can be installed into the storage section 708 as needed.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a sending module, an obtaining module, a determining module, and a first processing module. The names of these modules do not form a limitation on the modules themselves in some cases, and for example, the sending module may also be described as a "module sending a picture acquisition request to a connected server".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform:
classifying the historical data of the virtual account to obtain historical data of a plurality of categories;
constructing a user model according to the historical data of each category to obtain a plurality of user models corresponding to the virtual account;
and identifying the user according to the similarity among the plurality of user models.
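Read as a program, these three steps form a small pipeline. The sketch below is an assumption-laden outline rather than the embodiment's code: the four callables (`classify`, `build_model`, `similarity`, `decide`) are hypothetical placeholders for whatever classification rule, model construction, similarity measure, and identification rule an implementation chooses.

```python
def identify_users(history, classify, build_model, similarity, decide):
    """Chain the three steps: classify, build one model per category, identify.

    All four callables are supplied by the caller; none is defined by the text.
    """
    # Step 1: split the account's historical data into categories.
    categories = classify(history)
    # Step 2: build one user model per category.
    models = {key: build_model(records) for key, records in categories.items()}
    # Step 3: compute pairwise similarity and hand it to the identification rule.
    pairwise = {
        a: {b: similarity(models[a], models[b]) for b in models if b != a}
        for a in models
    }
    return decide(pairwise)
```

Keeping the steps as injected functions lets each one be swapped (for example, a different time-period split) without touching the others.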
According to the technical scheme of the embodiment of the invention, the historical data of the virtual account is classified to obtain multiple categories of historical data; a user model is constructed from the historical data of each category to obtain a plurality of user models corresponding to the virtual account; and user identification is performed according to the similarity among the plurality of user models. This technical means solves the technical problem that user models in the prior art are inaccurate, reduces the interference that the superimposed behaviors of multiple users have on a user model, and improves the accuracy of both the user models and the user identification.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. A method for identifying a user, comprising:
classifying the historical data of the virtual account to obtain historical data of a plurality of categories;
constructing a user model according to the historical data of each category to obtain a plurality of user models corresponding to the virtual account;
performing user identification according to the similarity among the plurality of user models;
wherein performing user identification according to the similarity among the plurality of user models comprises:
for each user model, determining similar user models of the user model;
calculating the user average similarity of the user model and the user average similarity of the similar user models of the user model;
performing user identification according to the user average similarity of the user model and the user average similarity of the similar user models;
wherein calculating the user average similarity of the user model and the user average similarity of the similar user models of the user model comprises: determining the user average similarity of the user model according to the similarity between the user model and its corresponding similar user models; and determining the user average similarity of the similar user models of the user model according to the similarity between those similar user models and their own corresponding similar user models.
2. The method of claim 1, wherein classifying the historical data of the virtual account number comprises:
classifying the historical data of the virtual account according to a time period, wherein the historical data in the same time period is a category; or
classifying the historical data of the virtual account according to the time period and the unique identifier, wherein the historical data with the same unique identifier in the same time period is of one category.
3. The method according to claim 1 or 2, wherein for each user model, determining similar user models of the user model comprises:
for each user model, calculating the similarity between the user model and other user models;
and taking other user models with the similarity smaller than the similarity threshold value as the similar user models of the user models.
4. A method according to claim 3, characterized in that for each user model, the similarity between the user model and the other user models is calculated according to:
Figure FDA0003078392670000021
where $K_{[N,M]}$ denotes the similarity between user model userN and user model userM, $S_{userN}$ denotes the target weight of user model userN, and $S_{userM}$ denotes the target weight of user model userM.
5. The method of claim 4, wherein a difference between the target weight of each user model and a preset target weight of a reference user model is less than or equal to a weight threshold.
6. A user identification device, comprising:
the data classification module is used for classifying the historical data of the virtual account to obtain multiple categories of historical data;
the model building module is used for building a user model according to the historical data of each category so as to obtain a plurality of user models corresponding to the virtual account;
the user identification module is used for carrying out user identification according to the similarity among the plurality of user models;
wherein the user identification module is further configured to: for each user model, determine similar user models of the user model; calculate the user average similarity of the user model and the user average similarity of the similar user models of the user model; and perform user identification according to the user average similarity of the user model and the user average similarity of the similar user models;
the user identification module is further configured to: determine the user average similarity of the user model according to the similarity between the user model and its corresponding similar user models; and determine the user average similarity of the similar user models of the user model according to the similarity between those similar user models and their own corresponding similar user models.
7. The apparatus of claim 6, wherein the data classification module is further configured to:
classifying the historical data of the virtual account according to a time period, wherein the historical data in the same time period is a category; or
classifying the historical data of the virtual account according to the time period and the unique identifier, wherein the historical data with the same unique identifier in the same time period is of one category.
8. The apparatus of claim 6 or 7, wherein the user identification module is further configured to:
for each user model, calculating the similarity between the user model and other user models;
and taking other user models with the similarity smaller than the first similarity threshold value as the similar user models of the user models.
9. The apparatus of claim 8, wherein for each user model, the similarity between the user model and other user models is calculated according to:
Figure FDA0003078392670000031
where $K_{[N,M]}$ denotes the similarity between user model userN and user model userM, $S_{userN}$ denotes the target weight of user model userN, and $S_{userM}$ denotes the target weight of user model userM.
10. The apparatus of claim 9, wherein a difference between the target weight of each user model and a preset target weight of a reference user model is less than or equal to a weight threshold.
11. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN201711337552.1A 2017-12-14 2017-12-14 User identification method and device Active CN110020162B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711337552.1A CN110020162B (en) 2017-12-14 2017-12-14 User identification method and device

Publications (2)

Publication Number Publication Date
CN110020162A CN110020162A (en) 2019-07-16
CN110020162B true CN110020162B (en) 2021-09-03

Family

ID=67187027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711337552.1A Active CN110020162B (en) 2017-12-14 2017-12-14 User identification method and device

Country Status (1)

Country Link
CN (1) CN110020162B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765973B (en) * 2019-10-31 2023-07-04 上海掌门科技有限公司 Account type identification method and device
CN112989179B (en) * 2019-12-13 2023-07-28 北京达佳互联信息技术有限公司 Model training and multimedia content recommendation method and device
CN111310028A (en) * 2020-01-19 2020-06-19 浙江连信科技有限公司 Recommendation method and device based on psychological characteristics
CN111310035A (en) * 2020-02-03 2020-06-19 浙江连信科技有限公司 Recommendation method and device based on psychological and behavioral characteristics
CN112464106B (en) * 2020-11-26 2022-12-13 上海哔哩哔哩科技有限公司 Object recommendation method and device
CN112804567B (en) * 2021-01-04 2023-04-21 青岛聚看云科技有限公司 Display equipment, server and video recommendation method
CN112950295B (en) * 2021-04-21 2024-03-19 北京大米科技有限公司 Method and device for mining user data, readable storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102646132A (en) * 2012-03-26 2012-08-22 中国联合网络通信集团有限公司 Method and device for recognizing attributes of broadband users
CN102970587A (en) * 2012-12-02 2013-03-13 北京中科大洋科技发展股份有限公司 Multi-user account realizing method suitable for OTT (Over The Top) internet television
CN105373614A (en) * 2015-11-24 2016-03-02 中国科学院深圳先进技术研究院 Sub-user identification method and system based on user account
CN105430504A (en) * 2015-11-27 2016-03-23 中国科学院深圳先进技术研究院 Family member mix identification method and system based on television watching log mining

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040190688A1 (en) * 2003-03-31 2004-09-30 Timmins Timothy A. Communications methods and systems using voiceprints
US20060077431A1 (en) * 2004-10-08 2006-04-13 Sharp Laboratories Of America, Inc. Methods and systems for imaging device concurrent account use

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
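A personalized recommendation algorithm for shared accounts; Li Wei et al.; Application Research of Computers (《计算机应用研究》); 2017-10-10; main text, pp. 2912-2919 *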

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant