CN112256973A - User portrait correction method, device, medium, and electronic apparatus - Google Patents

User portrait correction method, device, medium, and electronic apparatus

Info

Publication number
CN112256973A
CN112256973A (application CN202011215640.6A; granted as CN112256973B)
Authority
CN
China
Prior art keywords
user
users
group
feature vector
portrait
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011215640.6A
Other languages
Chinese (zh)
Other versions
CN112256973B (en)
Inventor
陈迪
郭凯
李嘉晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seashell Housing Beijing Technology Co Ltd
Original Assignee
Beike Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beike Technology Co Ltd filed Critical Beike Technology Co Ltd
Priority to CN202011215640.6A priority Critical patent/CN112256973B/en
Publication of CN112256973A publication Critical patent/CN112256973A/en
Application granted granted Critical
Publication of CN112256973B publication Critical patent/CN112256973B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0255 Targeted advertisements based on user history
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • G06Q30/0271 Personalized advertisement
    • G06Q50/16 Real estate


Abstract

A user portrait correction method, apparatus, medium, and electronic device are disclosed. The method mainly comprises: acquiring user description information for a plurality of users; mapping each user's description information to a feature vector; for each feature vector segment contained in the feature vectors, selecting the users that share an identical segment and dividing them into one group; determining each user's similar-user group according to the similarity between feature vectors of different users within a group; for each user, determining the confidence of that user's portrait according to the consistency between the user's feature vector and the feature vectors of the users in the similar-user group; and correcting, according to the portraits of the users in the similar-user group, the portrait of any user whose confidence does not meet a preset confidence requirement. The method helps services better match user needs and ultimately raises the probability that target behaviors occur.

Description

User portrait correction method, device, medium, and electronic apparatus
Technical Field
The present disclosure relates to computer technologies, and in particular, to a user portrait correction method, a user portrait correction device, a storage medium, and an electronic apparatus.
Background
A user portrait depicts a target user and describes the user's needs, so it can be used in applications that provide personalized services to users, helping those services better match user requirements. Improving the accuracy of user portraits is therefore a technical problem of wide concern.
Disclosure of Invention
The present disclosure is proposed to solve the above technical problems. Embodiments of the present disclosure provide a user portrait correction method, a user portrait correction apparatus, a storage medium, and an electronic device.
According to one aspect of the embodiments of the present disclosure, there is provided a user portrait correction method, including: acquiring user description information for a plurality of users, where the user description information includes user tags, user behavior features, and a user portrait; mapping each user's description information to a feature vector, where each feature vector comprises a plurality of feature vector segments; for any feature vector segment contained in the feature vectors, selecting from the plurality of users those that have an identical segment, and dividing the selected users into one group; determining each user's similar-user group according to the similarity between feature vectors of different users within a group; for any user, determining the confidence of the user's portrait according to the consistency between the user's feature vector and the feature vectors of the users in the user's similar-user group; and correcting, according to the portraits of the users in the similar-user group, the portrait of any user whose confidence does not meet a preset confidence requirement.
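The grouping step above resembles LSH-style banding: users whose feature vectors share an identical segment are placed in the same candidate group, so similarity need only be computed within groups. The patent gives no code; the following Python sketch is a hypothetical minimal illustration (function and variable names are my own, not from the patent):

```python
from collections import defaultdict

def group_by_segments(vectors, num_segments):
    """Place users that share an identical feature-vector segment in one group.

    vectors: dict mapping user_id -> tuple of feature values.
    Returns a dict keyed by (segment_index, segment_values) -> set of user ids.
    """
    groups = defaultdict(set)
    for user_id, vec in vectors.items():
        seg_len = len(vec) // num_segments
        for i in range(num_segments):
            segment = tuple(vec[i * seg_len:(i + 1) * seg_len])
            groups[(i, segment)].add(user_id)
    # Keep only groups that actually pair up two or more users.
    return {key: users for key, users in groups.items() if len(users) > 1}

# Toy binary feature vectors for three hypothetical users.
vectors = {
    "u1": (1, 0, 1, 1, 0, 0),
    "u2": (1, 0, 0, 1, 0, 1),
    "u3": (0, 1, 1, 1, 0, 0),
}
groups = group_by_segments(vectors, num_segments=3)
```

Singleton groups are dropped because a user alone in a group cannot contribute any similar-user pair; a user may appear in several groups, one per matching segment.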
In an embodiment of the present disclosure, the user tags include a user static-attribute tag, a user value tag, and a user state tag. Acquiring the user description information of the plurality of users includes: acquiring each user's static-attribute tag and behavior features from user operation data; and performing data mining on the users' static-attribute tags and behavior features to obtain each user's value tag and state tag.
In another embodiment of the present disclosure, obtaining each user's static-attribute tag and behavior features from the user operation data includes: according to the user operation data, counting, for each preference enumeration value, the number of times each user performed each kind of behavior; and, for any user, taking these per-enumeration-value behavior counts as that user's behavior features.
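One concrete reading of this embodiment: count, per (behavior kind, preference enumeration value) pair, how many times the user acted, and flatten the counts into a feature list. A hypothetical sketch (behavior and enumeration names are illustrative only):

```python
from collections import Counter

def behavior_features(events, behaviors, enum_values):
    """events: list of (behavior, preference_enum_value) pairs for one user.

    Returns the per-(behavior, enum value) counts as a flat feature list,
    ordered by behavior first, then enumeration value.
    """
    counts = Counter(events)
    return [counts[(b, v)] for b in behaviors for v in enum_values]

# One user browsed new-home pages twice and chatted once about a rental.
events = [("browse", "new_home"), ("browse", "new_home"), ("chat", "rental")]
features = behavior_features(events, ["browse", "chat"], ["new_home", "rental"])
```

Fixing the ordering of behaviors and enumeration values guarantees that every user's feature list has the same dimensions, which the later segment-grouping step relies on.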
In another embodiment of the present disclosure, determining each user's similar-user group according to the similarity between feature vectors of different users in a group includes: for any user in the group, selecting a subset of dimensions from the user's feature vector using a MinHash (minimum hashing) method; computing the similarity between the user's partial-dimension feature vector and the partial-dimension feature vectors of the other users in the user's group; and taking the users whose similarity meets a preset similarity requirement as the members of the user's similar-user group.
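The MinHash method referenced above is commonly realized as short signatures whose slot-wise agreement estimates Jaccard similarity. The patent does not specify the hash family or signature length, so the following is a hedged sketch using stdlib MD5 with per-slot seeds:

```python
import hashlib

def minhash_signature(feature_indices, num_hashes=16):
    """Compress a set of non-zero feature indices into a short signature.

    Each signature slot keeps the minimum hash value over the set under a
    differently seeded hash function.
    """
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int(hashlib.md5(f"{seed}:{idx}".encode()).hexdigest(), 16)
            for idx in feature_indices))
    return sig

def estimated_similarity(sig_a, sig_b):
    """Fraction of matching slots approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a = minhash_signature({1, 4, 7, 9})
b = minhash_signature({1, 4, 7, 9})   # identical set -> identical signature
c = minhash_signature({2, 3, 5, 8})   # disjoint set -> low agreement
```

Comparing 16-slot signatures instead of full high-dimensional sparse vectors is what makes the within-group similarity computation cheap.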
In another embodiment of the present disclosure, determining, for any user, the confidence of the user's portrait according to the consistency between the user's feature vector and the feature vectors of the users in the user's similar-user group includes: computing the single-behavior cross-entropy mean between the user's feature vector and the feature vector of each user in the similar-user group, yielding a plurality of single-behavior cross-entropy means; and calculating, from these single-behavior cross-entropy means, the confidence of the user's portrait.
In another embodiment of the present disclosure, calculating the confidence of the user's portrait from the single-behavior cross-entropy means includes: taking the reciprocal of the sum of all of the user's single-behavior cross-entropy means as the confidence of the user's portrait.
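Under the reading that confidence is the reciprocal of the summed single-behavior cross-entropy means, as the embodiment above states, a hypothetical sketch might look like the following (the epsilon guard and the treatment of a zero sum are my assumptions, not from the patent):

```python
import math

def single_behavior_cross_entropy(p, q, eps=1e-12):
    """Mean cross entropy between two users' per-behavior preference distributions."""
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q)) / len(p)

def portrait_confidence(user_vec, neighbor_vecs):
    """Reciprocal of the summed cross-entropy means over the similar-user group."""
    total = sum(single_behavior_cross_entropy(user_vec, nv) for nv in neighbor_vecs)
    return 1.0 / total if total > 0 else float("inf")

p = [0.5, 0.5]
# A user whose neighbors agree with the portrait scores higher confidence
# than one whose neighbors diverge from it.
consistent = portrait_confidence(p, [[0.5, 0.5], [0.5, 0.5]])
divergent = portrait_confidence(p, [[0.9, 0.1], [0.9, 0.1]])
```

Because cross entropy grows as the distributions diverge, the reciprocal of its sum falls for users whose similar-user group disagrees with their portrait, which matches the intent of the embodiment.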
In yet another embodiment of the present disclosure, correcting, based on the portraits of users in a similar-user group, the portrait of a user whose confidence does not meet the preset confidence requirement includes: determining a confidence threshold corresponding to a given scenario; and, for any user whose confidence does not reach the threshold, obtaining the mean portrait of all users in that user's similar-user group, and determining the user's corrected portrait from the user's portrait adjustment parameter, the user's current portrait, and the mean portrait.
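The correction step can be read as blending the user's portrait toward the group mean, weighted by the per-user adjustment parameter. The exact combination rule is not specified in the patent; a plausible linear-blend sketch (alpha's range is my assumption):

```python
def correct_portrait(portrait, group_mean, alpha):
    """Blend a low-confidence portrait toward the similar-user-group mean.

    alpha is the per-user portrait adjustment parameter, assumed here to lie
    in [0, 1]: alpha=1 keeps the original portrait, alpha=0 adopts the mean.
    """
    return [alpha * p + (1 - alpha) * m for p, m in zip(portrait, group_mean)]

corrected = correct_portrait([0.8, 0.2], [0.4, 0.6], alpha=0.25)
```

With a small alpha, a user with few or low-quality core access behaviors is pulled strongly toward the group consensus, while users with trustworthy portraits (above the threshold) are never corrected at all.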
According to another aspect of the embodiments of the present disclosure, there is provided a user portrait correction apparatus, including: a description-information acquisition module for acquiring user description information of a plurality of users, where the user description information includes user tags, user behavior features, and a user portrait; a mapping module for mapping each user's description information to a feature vector, where each feature vector comprises a plurality of feature vector segments; a group division module for selecting, for any feature vector segment contained in the feature vectors, the users having an identical segment and dividing the selected users into a group; a user-group determination module for determining each user's similar-user group according to the similarity between feature vectors of different users within a group; a confidence determination module for determining, for any user, the confidence of the user's portrait according to the consistency between the user's feature vector and the feature vectors of the users in the similar-user group; and a user portrait correction module for correcting, according to the portraits of the users in the similar-user group, the portrait of any user whose confidence does not meet a preset confidence requirement.
In an embodiment of the present disclosure, the user tags include a user static-attribute tag, a user value tag, and a user state tag; the description-information acquisition module includes: a first sub-module for acquiring each user's static-attribute tag and behavior features from user operation data; and a second sub-module for performing data mining on the users' static-attribute tags and behavior features to obtain each user's value tag and state tag.
In yet another embodiment of the present disclosure, the first sub-module is further configured to: according to the user operation data, count, for each preference enumeration value, the number of times each user performed each kind of behavior; and, for any user, take these per-enumeration-value behavior counts as that user's behavior features.
In still another embodiment of the present disclosure, the user-group determination module includes: a third sub-module for selecting, for any user in a group, a subset of dimensions from the user's feature vector using a MinHash method; a fourth sub-module for computing the similarity between the user's partial-dimension feature vector and the partial-dimension feature vectors of the other users in the user's group; and a fifth sub-module for taking the users whose similarity meets a preset similarity requirement as the members of the user's similar-user group.
In yet another embodiment of the present disclosure, the confidence determination module includes: a sixth sub-module for computing, for any user, the single-behavior cross-entropy mean between the user's feature vector and the feature vector of each user in the similar-user group, yielding a plurality of single-behavior cross-entropy means; and a seventh sub-module for calculating, from these single-behavior cross-entropy means, the confidence of the user's portrait.
In yet another embodiment of the present disclosure, the seventh sub-module is further configured to take the reciprocal of the sum of all of the user's single-behavior cross-entropy means as the confidence of the user's portrait.
In yet another embodiment of the present disclosure, the user portrait correction module is further configured to: determine a confidence threshold corresponding to a given scenario; and, for any user whose confidence does not reach the threshold, obtain the mean portrait of all users in that user's similar-user group and determine the user's corrected portrait from the user's portrait adjustment parameter, the user's current portrait, and the mean portrait.
According to still another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the user portrait correction method.
According to still another aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; and a memory storing instructions executable by the processor, where the processor reads the executable instructions from the memory and executes them to implement the user portrait correction method.
With the user portrait correction method and apparatus provided by the embodiments of the present disclosure, a user's similar-user group is determined from the feature vector mapped from the user description information. Because the user description information (user tags, user behavior features, the user portrait, and so on) describes the user from multiple angles, this helps ensure the accuracy of the similar-user group. When user feature vectors are high-dimensional and sparse, searching all users for a user's similar-user group is computationally very expensive. By dividing each feature vector into several segments and taking the users that share an identical segment as one group, the present disclosure makes group division easy to implement while ensuring that users within a group already have a certain similarity. Determining each user's similar-user group within one group rather than over all users greatly reduces the computation required, which saves computing resources and improves the feasibility of the correction scheme. Measuring the confidence of a user's portrait by the consistency of feature vectors within the similar-user group supports an objective evaluation of that confidence and prevents very different portraits from distorting the evaluation. Finally, correcting low-confidence portraits with the portraits of users in the similar-user group helps portraits reflect users' real characteristics.
The technical solution provided by the present disclosure therefore helps services better match user needs and ultimately raises the probability that target behaviors occur.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
FIG. 1 is a flow diagram of one embodiment of a user representation correction method of the present disclosure;
FIG. 2 is a flow diagram of one embodiment of the present disclosure for determining a similar user group of users;
FIG. 3 is a flow diagram of one embodiment of the present disclosure for determining a confidence level for a user representation of each user;
FIG. 4 is a flow diagram of an embodiment of the present disclosure for revising a user representation of a user whose confidence level does not meet a preset confidence level requirement;
FIG. 5 is a schematic diagram illustrating an embodiment of a user portrayal confidence and CTR relationship according to the present disclosure;
FIG. 6 is a schematic diagram illustrating an embodiment of a user profile correction apparatus according to the present disclosure;
FIG. 7 is a block diagram of an electronic device provided in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure, not all of them, and that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another; they do not imply any particular technical meaning, nor do they imply any necessary logical order between the elements.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more than two and "at least one" may refer to one, two or more than two.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure describes only an association relationship between the associated objects and indicates that three relationships may exist. For example, "A and/or B" may mean: A alone; both A and B; or B alone. The character "/" in the present disclosure generally indicates that the former and latter associated objects are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Embodiments of the present disclosure may be implemented in electronic devices such as terminal devices, computer systems, and servers, which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with an electronic device such as a terminal device, computer system, or server include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment. In a distributed cloud computing environment, tasks may be performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Summary of the disclosure
In implementing the present disclosure, the inventors discovered that a user's network access behavior generally plays a very important role in building the user's portrait. For example, a user portrait is typically set for the user based on the network access behaviors the user performed over the last N days (e.g., 180 days). However, in application scenarios where users access the network infrequently, where the interval between two network accesses is long, or where the user's core access behaviors are few in number or low in quality, the reliability of the portrait set for the user is often poor. A core access behavior may refer to an important, specific network access behavior, such as a user browsing the detail page of a target object on a website, a user generating a business opportunity, or a user conversing with another party. An unreliable user portrait skews the personalized service provided to the user, which can lead to waste of advertising resources, business opportunities, target exhibition slots, and other resources.
Exemplary Application Scenario
An example of an application scenario of the user portrait correction technique provided by the present disclosure is as follows:
in the real estate field, at a fixed time each day (e.g., 1:00 a.m.), user portraits can be set for users who had core access behaviors in the last N days (e.g., the last 180 days), and the confidence of each user's portrait can be determined.
For a property-listing recommendation scenario, a predetermined confidence requirement corresponding to that scenario (e.g., a scenario-specific confidence threshold) can be obtained in advance. For users whose portrait confidence in the detection result meets the requirement (e.g., reaches the threshold corresponding to the scenario), their portraits are stored directly. For users whose portrait confidence does not meet the requirement (e.g., falls below the threshold), their portraits are first corrected, and the corrected portraits are then stored.
Assume that there are currently a number of users accessing a real estate company's website from their respective terminal devices (e.g., computers or smartphones). The network side can check whether each of these users has a user portrait. For a user who has a portrait whose confidence is sufficiently high, the network side can select property listings matching the preferences recorded in the portrait and push them to that user's terminal device. For a user who has no portrait, or whose portrait has low confidence, the network side can instead push currently popular listings to that user's terminal device.
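The dispatch logic just described, personalized listings for confident portraits and popular listings otherwise, can be sketched as follows (all names and the matching function are hypothetical, not from the patent):

```python
def choose_push(portrait, confidence, threshold, popular_listings, match_listings):
    """Push personalized listings only when a sufficiently confident portrait
    exists; otherwise fall back to currently popular listings."""
    if portrait is not None and confidence is not None and confidence >= threshold:
        return match_listings(portrait)
    return popular_listings

popular = ["hot-listing-1", "hot-listing-2"]

def match(portrait):
    # Toy matcher: recommend the listing category the user prefers most.
    return [f"listing-for-{max(portrait, key=portrait.get)}"]

push_a = choose_push({"new_home": 0.9, "rental": 0.1}, 0.8, 0.5, popular, match)
push_b = choose_push(None, None, 0.5, popular, match)
```

The threshold here corresponds to the scenario-specific confidence requirement obtained in advance; only the fallback branch is taken for users without a usable portrait.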
Exemplary method
FIG. 1 is a flowchart illustrating a user portrait correction method according to an embodiment of the present disclosure. The method of the embodiment shown in FIG. 1 comprises the steps S100, S101, S102, S103, S104, and S105, described below.
S100, obtaining user description information of a plurality of users.
The multiple users in the present disclosure may refer to users who had network access behaviors within the last N days (e.g., the last 180 days); for example, users who had core access behaviors within that period. A core access behavior in the present disclosure may refer to an important, specific network access behavior, such as a user browsing the detail page of a subject matter on a website, a user generating a business opportunity, or a user conversing with another party.
The subject matter in the present disclosure may take different forms in different application scenarios. For example, in the real estate field, the subject matter may be a house; in the retail field, it may be a retail product, and so on.
User description information in the present disclosure may refer to information for presenting one user from a variety of different angles. The user description information of the present disclosure may include at least: user tags, user behavior characteristics, user portrayal, and the like.
A user tag in the present disclosure is typically formed by abstracting and classifying one or more features by which all users can be categorized. A particular value of a user tag usually represents a particular category.
User behavior characteristics in the present disclosure may refer to information that is used to describe characteristics of a user's network access behavior. For example, a user behavior feature in the present disclosure may refer to information that is used to describe characteristics of a user's core access behavior.
A user profile in this disclosure may refer to information that is used to describe characteristics of a user. For example, a user profile may describe the user's interests based on various preference enumeration values. Preference enumeration values in this disclosure may include: an enumerated value based on an attribute of the subject, and an enumerated value based on the business. That is, for a target object in an application field, the user profile may reflect a degree of preference or a degree of inclination of the user with respect to each enumerated value of the attribute information of the target object, and for a service in an application field, the user profile may reflect a degree of preference or a degree of inclination of the user with respect to each enumerated value of the service. The attribute information of the subject matter in the present disclosure may refer to information for describing features possessed by the subject matter itself. For the real estate domain, enumerated values for the services in this disclosure may include: a new room service enumeration value, a second-hand room service enumeration value, a lease service enumeration value, and the like. The present disclosure does not limit the number of enumerated values and the specific content of the user representation.
As a more specific example of the enumerated values of the attribute information of the subject matter, assume that the subject matter in the present disclosure is a house. The attribute information of the house may include attribute elements such as house property, house location, house area, hall structure, house type, and building structure.
The house property may include enumerated values such as new house and second-hand house. The house location may include enumerated values such as within the second ring, between the second and third rings, between the third and fourth rings, between the fourth and fifth rings, between the fifth and sixth rings, and outside the sixth ring.
The floor space may include: within 40 square meters, 40-60 square meters, 60-80 square meters, 80-100 square meters, 100-140 square meters, and more than 140 square meters.
The hall structure may include: enumerated values such as a bay, a one room and a one hall, a two room and a one hall, a three room and a one hall, a four room and a one hall, and a five room and a one hall.
The house types may include: enumerated values for ordinary homes and villas.
The building structure may include: enumerated values for brick-concrete structures and non-brick-concrete structures.
With the above assumptions, the enumerated values of the attribute information of the subject matter in the present disclosure may include: new house, second-hand house, within the second ring, second to third ring, third to fourth ring, fourth to fifth ring, fifth to sixth ring, outside the sixth ring, within 40 square meters, 40-60 square meters, 60-80 square meters, 80-100 square meters, 100-140 square meters, more than 140 square meters, open room, one room and one hall, two rooms and one hall, three rooms and one hall, four rooms and one hall, five rooms and one hall, brick-concrete structure, non-brick-concrete structure, ordinary house, and villa. The enumerated values of the subject matter attribute information in the present disclosure may also be referred to as user preference enumeration values.
In different application fields, the enumerated value of the attribute information of the target object in the present disclosure may be different according to the target object provided by the website, and the present disclosure does not limit the specific number, specific content, and the like of the enumerated value of the attribute information of the target object.
In the present disclosure, the user description information of all users can be acquired from user operation data. The user operation data may refer to data describing user behavior that is generated through operations performed on a device such as a computer or a smart mobile phone by the user himself/herself and/or by network-side maintenance personnel. The user operation data in the present disclosure may include an operation log or an access log formed on the server side; the present disclosure is not limited thereto.
S101, mapping user description information of a plurality of users into feature vectors respectively.
The method and the device can map the user description information of a plurality of users into a specific feature space respectively, thereby forming a feature vector of each user. The dimensions of the feature vectors in the present disclosure are typically high, e.g., the dimensions of the feature vectors may reach thousands of dimensions, etc. The feature vectors in this disclosure are typically high-dimensional sparse feature vectors. The user description information can be mapped into the feature vector by adopting various existing mapping modes, and the specific implementation mode of the mapping is not limited by the disclosure.
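As an illustration of S101, the following sketch maps user description information into a high-dimensional sparse feature vector; the field names and the encoding scheme are illustrative assumptions rather than the disclosure's actual mapping:

```python
def to_sparse_vector(user, vocab):
    """Map user description information (user tags, behavior features,
    portrait preference values) into a sparse feature vector stored as
    {dimension_index: value}.  `vocab` assigns each (field, value) pair
    a stable dimension index; both structures are assumptions."""
    vec = {}
    for tag, value in user.get("tags", {}).items():
        dim = vocab.setdefault(("tag", tag, value), len(vocab))
        vec[dim] = 1.0                      # one-hot encoding of categorical tags
    for behavior, count in user.get("behavior", {}).items():
        dim = vocab.setdefault(("behavior", behavior), len(vocab))
        vec[dim] = float(count)             # behavior counts kept as-is
    for pref, score in user.get("portrait", {}).items():
        dim = vocab.setdefault(("pref", pref), len(vocab))
        vec[dim] = score                    # preference enumeration values
    return vec
```

Only the dimensions a user actually touches are stored, which is what makes the resulting thousands-dimensional vector sparse.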
The feature vector in the present disclosure includes a plurality of feature vector segments, that is, the present disclosure can divide the feature vector of the user into a plurality of feature vector segments, each feature vector segment includes at least one-dimensional vector, and the dimensions of the vectors included in different feature vector segments may be the same or different.
S102, aiming at any characteristic vector segment contained in the characteristic vector, selecting users with the same characteristic vector segment from a plurality of users, and dividing the selected users with the same characteristic vector segment into a group.
For any feature vector segment contained in the feature vector of the user, the present disclosure may select users with the same feature vector segment from all users, and divide all the selected users into the same group. One of the groups may be considered a bucket.
All users in the same group in this disclosure are candidates for a similar user group. The feature vectors of all users in the same group in this disclosure are at least partially identical. That is, the feature vectors of all users in the same group may not be identical, but some portion may be identical.
The feature vector of a user in the present disclosure includes b (b is an integer greater than 1) feature vector segments. In one example, all feature vector segments have the same vector dimension, e.g., each feature vector segment includes an r-dimensional vector; that is, the present disclosure can divide the user's feature vector equally into b feature vector segments. In another example, the vector dimensions of different feature vector segments may differ; for example, some feature vector segments include r-dimensional vectors, some include (r + r1)-dimensional vectors (r1 is an integer greater than 1, and r is an integer greater than r1), and some include (r − r1)-dimensional vectors. That is, the present disclosure may also segment the user's feature vector unequally.
In a more specific example, assume that there are n users, and assume that the feature vectors of the users are divided into b feature vector segments. With the above assumption, if m1 users among the n users have the same 1 st eigenvector segment, the m1 users are divided into one group; if m2 users of the n users have the same 2 nd eigenvector segment, the m2 users are divided into one group; … …, and so on, if mb users out of n users have the same b-th eigenvector segment, the mb users are divided into one group.
It should be noted that a user in the present disclosure may belong to multiple groups at the same time. For example, if the 1 st eigenvector segment of the 1 st user is the same as the 1 st eigenvector segment of the 2 nd user, the 1 st user and the 2 nd user belong to the 1 st group. If the 2 nd eigenvector segment of the 1 st user is the same as the 2 nd eigenvector segment of the 3 rd user, the 1 st user and the 3 rd user belong to the 2 nd group.
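The grouping of S102 can be sketched as follows; the equal-length segmentation and the dictionary-of-buckets representation are implementation assumptions, not structures prescribed by the disclosure:

```python
from collections import defaultdict

def bucket_by_segments(vectors, b):
    """Group users that share an identical feature-vector segment.
    `vectors` maps user id -> feature vector (a list).  The vector is
    split evenly into b segments here (any remainder dimensions are
    ignored in this sketch); a user may land in many groups."""
    groups = defaultdict(set)
    for user, vec in vectors.items():
        r = len(vec) // b
        for k in range(b):
            segment = tuple(vec[k * r:(k + 1) * r])
            groups[(k, segment)].add(user)   # bucket key = (segment index, segment values)
    # keep only buckets with at least two users: candidate similar user groups
    return {key: users for key, users in groups.items() if len(users) > 1}
```

In the example below, user 1 shares its first segment with user 2 and its second segment with user 3, so it belongs to two groups at once, matching the note above.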
S103, determining a similar user group of each user according to the similarity between the feature vectors of different users in the group.
The similar user group of the user in the present disclosure may refer to a user set formed by users whose similarity degree with the feature vector of the user meets a certain requirement. The method and the device can determine the similar user group of the users by calculating the similarity degree of the feature vectors of different users. The similar user group of a user in the present disclosure may also be referred to as a seed user group of the user, etc.
In one example, assume that the similarity between the feature vectors of two users is t, and that each feature vector segment includes an r-dimensional vector, i.e., each feature vector segment comprises r rows. The probability that all r rows (i.e., the r-dimensional vector) in any given feature vector segment of the two users are identical is t^r; the probability that at least one row (i.e., one dimension) in that segment differs is (1 − t^r); the probability that every one of the b feature vector segments of the two users differs is (1 − t^r)^b; and the probability that at least one feature vector segment of the two users is identical is 1 − (1 − t^r)^b. The present disclosure may therefore control the time taken to determine the similar user group of users, i.e., the efficiency of determining the similar user group, by controlling the values of b and r.
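The probability that two users become candidates in at least one bucket can be computed directly; the sketch below (parameter values are illustrative) shows how b and r trade off:

```python
def candidate_probability(t, r, b):
    """Probability that two users with per-dimension similarity t share
    at least one of b segments of r dimensions each: 1 - (1 - t**r)**b."""
    return 1 - (1 - t ** r) ** b
```

With, say, r = 5 and b = 20, highly similar pairs (t = 0.8) are almost certain to share a bucket, while weakly similar pairs (t = 0.5) mostly are not, which is exactly the filtering effect the grouping step relies on.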
S104, aiming at any user, determining the confidence of the user portrait of the user according to the consistency of the feature vector of the user and the feature vectors of the users in the similar user group of the user.
Confidence in a user representation in this disclosure may refer to the accuracy of user interests (e.g., user preferences) reflected in the user representation. For example, the confidence level of a user representation in the present disclosure may be a parameter used to measure whether the user representation truly and accurately reflects the interests of the user. For example, the confidence level of the user representation may be a parameter for measuring whether the user representation truly and accurately reflects the preference of the user on each enumerated value of the attribute information of the object and the preference of the user on the service enumerated value. Confidence in a user representation in this disclosure may also be referred to as accuracy or reliability of the user representation, or the like.
For a user, the consistency of the feature vector of the user with the feature vectors of the users in the similar user group of the user may refer to an index for measuring the difference between the feature vector of the user and the feature vectors of the users in the similar user group of the user.
The present disclosure may represent the consistency using a feature vector calculation such as the difference between feature vectors, cross entropy, or squared loss, and the consistency may then serve as the confidence of each user's user portrait.
S105, correcting the user portrait of a user whose confidence does not meet a preset confidence requirement according to the user portraits of the users in that user's similar user group.
By judging the confidence of each currently obtained user portrait, the present disclosure can screen out the users whose portrait confidence does not meet the preset requirement, and correct the user portraits of those users using the user portraits of the users in their respective similar user groups. For example, a specific value for each preference enumeration value is obtained by a calculation such as a weighted average over the user portraits of the users in the similar user group, and the value of each preference enumeration value in the portrait of the corresponding user is updated with the obtained value, thereby realizing correction of the user portrait.
In the present disclosure, the similar user group of a user is determined using the feature vector mapped from the user description information. Since the user tags, user behavior features, user portrait, and other contents included in the user description information describe the user from multiple angles, this helps ensure the accuracy of the similar user group. When the feature vector of the user is a high-dimensional sparse feature vector, obtaining the similar user group of the user from all users directly on the basis of the feature vector involves a very large amount of calculation. By dividing the feature vector into a plurality of segments and taking users with an identical feature vector segment as one group, the present disclosure not only makes group division convenient, but also ensures that the users within a group have a certain degree of similarity. By determining the similar user group of a user from within a group rather than from all users, the amount of calculation can be reduced to a great extent, which saves computing resources and improves the feasibility of the user portrait correction scheme. Measuring the confidence of a user's portrait by the consistency of the feature vectors of the users in the similar user group helps to evaluate the confidence objectively and avoids the influence of widely differing user portraits on the confidence evaluation. Correcting the user portraits of users with lower confidence using the user portraits of the users in the similar user group helps the user portrait reflect the characteristics of the user more truly.
Therefore, the technical scheme provided by the disclosure is beneficial to better meeting the user requirements and finally improving the occurrence probability of the target behaviors.
In one optional example, the user tags of the present disclosure include at least: a user static attribute tag, a user value tag, and a user status tag.
The user static attribute tag may refer to a tag indicating the category to which a static attribute of the user belongs. The static attributes of a user may include attributes of the user himself/herself, and may also include attributes of the user's belongings (e.g., terminal devices). In one example, the user static attribute tags of the present disclosure may include user social attribute tags, user device environment tags, and the like. The user social attribute tags may include: the geographic location of the user (e.g., the city in which the user is located), the age of the user, the gender of the user, the profession of the user, and the like. The user device environment tags may include: the number of applications (e.g., APPs) installed on the user's device, the user's registration time (e.g., the time the user registered in an APP provided by the present disclosure), the type of device the user habitually uses (e.g., a smart mobile phone, a tablet computer, or a desktop computer), the application types of the APPs provided by the present disclosure that are installed on the user's device (e.g., where the present disclosure provides APP1 and APP2, this tag may indicate that the device has APP1, APP2, or both installed), the services involved in competitor APPs installed on the user's device (e.g., in the real estate field, such services may include new house services, second-hand house services, rental services, etc.), and the like. The user device environment tag in the present disclosure may also be referred to as user behavior attribute information. The present disclosure does not limit the specific content included in the user social attribute tags and the user device environment tags.
The user value tag may refer to a tag describing the likelihood that a user will perform a target behavior. The user value tags in the present disclosure may be embodied using the value of various types of behaviors performed by the user. As an example, the behavior types in the present disclosure may include: detail page browsing behavior, house source searching behavior, house source sharing behavior, pushed-information clicking behavior, house source following behavior, user business opportunity generation behavior, user hotline dialing behavior, user house viewing behavior, user commissioning behavior, and the like. The present disclosure may set a corresponding value tag for each of the above behavior types of the user; for example, each behavior type may correspond to three levels of tags: a high value tag, a medium value tag, and a low value tag. The target behavior of the present disclosure may be set according to the specific requirements of the actual application scenario; for example, the target behavior may be a subject matter transaction behavior, a successful commission behavior, or a behavior of a user leaving contact information, which is not limited by the present disclosure. The present disclosure likewise does not limit the number of behavior types, the specific expression of the behavior types, the number of levels of the value tags, and the like.
The user status tag may refer to a tag indicating the current status of the user. The present disclosure generally presets a plurality of statuses, each corresponding to a tag, and a user status tag identifies one of these statuses. A plurality of user statuses can be preset according to the practical application scenario. For example, for the real estate field, the present disclosure can set at least six statuses: an online active status, an online mature status, an offline active status, an offline mature status, a deal status, a post-deal silent status, and the like. The present disclosure does not limit the number of preset user statuses or the specific expression form of each status.
Optionally, the static attribute tags of each user can be obtained from the user operation data. For example, the user operation data of a user is retrieved using the user identifier, the content of predetermined fields in that data (such as a gender field or an age field) is identified, and the static attribute tags of the user are obtained according to the identification result. Likewise, the user behavior features of each user (also referred to as user dynamic behavior features, etc.) can be obtained from the user operation data. The user behavior features may refer to information that changes with the operations performed by the user. For example, the behavior information of a user is obtained from the user operation data using the user identifier and behavior identifiers, and the user behavior features of the user are obtained by correspondingly processing that behavior information. Then, data mining can be performed on the user static attribute tags and the user behavior features of each user to obtain the user value tag and the user status tag of each user. That is, the user value tags and user status tags in the present disclosure belong to mining-class tags. The present disclosure does not limit the specific implementation of the data mining process.
By mining the user value tag and the user status tag from the user static attribute tags and the user behavior features, the present disclosure can classify users from different angles such as static attributes, user value, and user status, which helps depict users accurately and thus helps determine the similar user group of a user accurately.
In an optional example, for any user, the implementation manner of the present disclosure to obtain the user behavior feature of the user is as follows: according to the user operation data, carrying out behavior quantity statistics on various behaviors which are executed by the user respectively aiming at all preference enumeration values in the user portrait, thereby obtaining the behavior times of the user respectively aiming at all preference enumeration values to execute various behaviors; the present disclosure may use the number of times that the user performs various types of behaviors with respect to all preference enumeration values as the user behavior feature of the user.
Optionally, for any user, the disclosure may obtain all behaviors of the user in a period of time (for example, all behaviors of the last N days) from the user operation data, and then, the disclosure may perform statistics of behavior amounts on all behaviors of the user in units of preference enumeration values, for example, if all behaviors of the user relate to N1 different preference enumeration values in total and all behaviors of the user relate to N2 types of behaviors in total, the disclosure may obtain N1 × N2 behavior times by performing N1 × N2 times of statistics, and each behavior time corresponds to a behavior amount statistical result of a type of behavior on one preference enumeration value.
As a more specific example, assume that all behaviors of a user include m1+ m2 total behaviors; wherein m1 actions each relate to a preference enumeration value a (e.g., new house), and m1 actions relate to house detail page view-like actions and user-generated business-like actions, assuming that m11 actions belong to house detail page view-like actions, and m12 actions belong to user-generated business-like actions; where m2 actions all relate to a preference enumeration value b (e.g., second-hand house), and m2 actions relate to both house detail page view class actions and user-generated business opportunity class actions, assuming that m21 actions belong to the house detail page view class actions, and m22 actions belong to the user-generated business opportunity class actions. Under the above assumption, the present disclosure may perform 4 behavior amount statistics on m1+ m2 behaviors of the user to obtain 4 behavior times, where the first behavior time is: and (3) the behavior quantity statistical result of the house detail page browsing type behaviors of the user on the preference enumeration value a, wherein the second behavior frequency is as follows: the user generates a behavior quantity statistical result of the merchant behavior on the preference enumeration value a, wherein the third behavior frequency is as follows: and (3) the behavior quantity statistical result of the house detail page browsing type behaviors of the user on the preference enumeration value b, wherein the fourth behavior frequency is as follows: the user on the preference enumeration value b generates a behavior quantity statistic result of the business machine type behavior.
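The behavior-amount statistics above can be sketched as follows; the preference and behavior names are hypothetical:

```python
from collections import Counter

def behavior_counts(events):
    """Tally, per (preference enumeration value, behavior type) pair,
    how many times the user performed that behavior -- the N1 x N2
    statistics described above.  `events` is a list of
    (preference, behavior) pairs extracted from the operation data."""
    return Counter(events)
```

Replaying the example above with 3 detail-page views and 1 business-opportunity behavior on the "new house" value, and 2 views and 4 business opportunities on the "second-hand house" value, yields exactly four behavior counts.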
By obtaining the user behavior features of each user, the present disclosure can classify users from the angle of user behavior, which helps depict users more accurately and thus determine the similar user group of a user more accurately.
In an alternative example, one example of the present disclosure determining a similar user group of users is illustrated in FIG. 2.
In fig. 2, S200, for any user in the group, a feature vector of a partial dimension is selected from feature vectors of the user.
Optionally, the present disclosure may select a feature vector of a partial dimension from the feature vectors of the user by using a minimum hash (MinHashing) method, etc. The present disclosure may also adopt other methods to select a feature vector of a partial dimension from the feature vectors of the user, which is not limited by the present disclosure.
S201, calculating the similarity between the feature vector of the partial dimension of the user and the feature vectors of the partial dimensions of other users in the group where the user is located.
Optionally, the present disclosure may utilize the Jaccard coefficient to measure the similarity between the selected feature vectors of the two users. Specifically, the similarity between the selected feature vectors of the two users can be represented by the following formula (1):
J(A, B) = |A ∩ B| / |A ∪ B|    (1)
In the above formula (1), J(A, B) represents the Jaccard coefficient of the two users; A represents the selected feature vector of one of the two users; B represents the selected feature vector of the other of the two users.
S202, taking the user with the similarity meeting the preset similarity requirement as the user in the similar user group of the user.
Optionally, for any user, the present disclosure may rank the multiple similarities calculated for the user, and form a similar user group for the user by using the top N users with the highest similarity in the ranking.
Optionally, for any user, the present disclosure may form a similar user group of the user by using all users whose similarity is greater than a predetermined similarity.
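S200 through S201 can be sketched as below, assuming the high-dimensional sparse feature vector is represented by the set of its non-zero dimension indices; the affine hash functions standing in for MinHash row permutations are an implementation assumption:

```python
import random

def make_hash_funcs(k, seed=0, prime=2_147_483_647):
    """k random affine hash functions used as stand-ins for row permutations."""
    rng = random.Random(seed)
    params = [(rng.randrange(1, prime), rng.randrange(prime)) for _ in range(k)]
    return [lambda x, a=a, b=b: (a * x + b) % prime for a, b in params]

def minhash_signature(feature_indices, hash_funcs):
    """MinHash (S200): keep, per hash function, the minimum hash over the
    user's non-zero feature dimensions -- a low-dimensional stand-in
    for the full sparse feature vector."""
    return tuple(min(h(x) for x in feature_indices) for h in hash_funcs)

def jaccard(a, b):
    """Formula (1) (S201): |A intersect B| / |A union B| over sets."""
    return len(a & b) / len(a | b)
```

The probability that two MinHash signature entries agree equals the Jaccard similarity of the underlying sets, which is why the short signature can replace the full vector in S201.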
By selecting a partial feature vector from the feature vector of the user and determining the similar user group with that partial vector, the amount of calculation for obtaining the similar user group can be reduced to a large extent; in particular, using the minimum hash method to select the partial feature vector helps guarantee the reasonableness of the selection. On this basis, the present disclosure helps save computing resources and thus improves the feasibility of the user portrait correction scheme.
In an alternative example, one example of the present disclosure determining the confidence of a user's user portrait is shown in FIG. 3.
In fig. 3, S300, for any user, a single-behavior cross entropy mean value of the feature vector of the user and the feature vector of each user in the similar user group is calculated, and a plurality of single-behavior cross entropy mean values of the user are obtained.
Optionally, the single-behavior cross entropy mean in the present disclosure may be regarded as a cross entropy that takes into account the contributions of both feature similarity and behavior counts, and may also be referred to as a behavior-count-based cross entropy. One single-behavior cross entropy mean is taken as the similarity between a user and the user portrait of one user in its similar user group. The present disclosure may calculate the single-behavior cross entropy mean of the feature vectors of two users using the following formula (2):
H(p, q) = -(1/n) Σ_{i=1}^{n} [p(x_i) · log q(x_i)] / w_i    (2)
In the above formula (2), H(p, q) represents the single-behavior cross entropy mean of the feature vectors of a user and a user in its similar user group, that is, the similarity between the user portraits of the two users; n represents the dimension of the feature vector of a user; w_i represents the behavior count corresponding to the i-th dimension of the vector; p(x_i) represents the i-th dimension of the feature vector of one of the two users; q(x_i) represents the i-th dimension of the feature vector of the other of the two users.
S301, calculating the mean value of the cross entropies of the plurality of single behaviors of the user to obtain the confidence of the user portrait of the user.
Optionally, the disclosure may use the reciprocal of the sum of all single-behavior cross entropy means of the user as the confidence of the user representation of the user. For example, for any user, the confidence level of the user representation of the user obtained by the present disclosure may be represented by the following formula (3):
Score = 1 / Σ_{y=1}^{m} H(p, q_y)    (3)
In the above formula (3), Score represents the confidence of the user portrait of a user; m represents the number of users included in the similar user group of the user; H(p, q_y) represents the similarity between the user and the user portrait of the y-th user in its similar user group.
As can be seen from the above formulas (2) and (3), the greater the number of behaviors of the user, and the higher the similarity between the user's portrait and those of other users, the smaller the single-behavior cross entropy mean between the user's feature vector and that of a user in the similar user group, and hence the higher the confidence of the user's portrait.
It should be particularly noted that some of the dimension vectors in the feature vector of the user may not have actual behavior times, for example, the corresponding vectors of geographic location, age, gender, etc. do not have actual behavior times, and the disclosure may set the behavior times of the vectors without actual behavior times to 1.
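A sketch of S300 and S301 follows; because the formula image is not reproduced in the source text, the exact placement of the behavior count w_i inside formula (2) is an assumption, chosen so that more behaviors and closer vectors both lower the mean, consistent with the surrounding description:

```python
import math

def single_behavior_cross_entropy(p, q, w):
    """One reading of formula (2): cross entropy between the feature
    vectors p and q of two users, with each dimension's term discounted
    by its behavior count w[i] (w[i] = 1 where no real count exists,
    e.g. for geographic location, age, or gender dimensions)."""
    eps = 1e-12                               # guard against log(0)
    n = len(p)
    return -sum(p[i] * math.log(q[i] + eps) / w[i] for i in range(n)) / n

def portrait_confidence(p, similar_group, w):
    """Formula (3): the reciprocal of the summed single-behavior cross
    entropy means over the m users of the similar user group."""
    return 1.0 / sum(single_behavior_cross_entropy(p, q, w) for q in similar_group)
```

With this reading, doubling the behavior counts halves each term of the mean, so heavily observed users receive higher portrait confidence, as the passage above states.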
By determining the confidence of the user portrait using the single-behavior cross entropy mean, both the similarity of the feature vectors and the user behavior counts become parameters for measuring the confidence of the user portrait; that is, the confidence considers not only the values of the vector dimensions but also the behavior counts corresponding to those dimensions, which helps improve the accuracy of the determined confidence.
In an alternative example, an example of the present disclosure correcting a user representation of a user whose confidence level does not meet a preset confidence level requirement is shown in FIG. 4.
In fig. 4, a confidence threshold corresponding to a predetermined scene is determined S400.
Optionally, the present disclosure may set a confidence threshold for each predetermined scenario, where the confidence thresholds corresponding to different predetermined scenarios may be the same or different. In addition, the predetermined scenarios in the present disclosure are generally related to the actual application field. For example, in the housing field, the predetermined scene is a house information recommendation scene or the like, and further, the house information recommendation scene may include: a new house information recommendation scene, a second-hand house information recommendation scene, a house leasing information recommendation scene and the like. The present disclosure does not limit the predetermined scene.
Optionally, the present disclosure may set the confidence threshold for each predetermined scenario in a dynamic manner. For example, for any predetermined scenario, the present disclosure may derive the confidence threshold of that scenario from its fall-back strategy.
For example, for an information recommendation scenario (e.g., a house information recommendation scenario), consider all users to whom house information is pushed using the fall-back strategy. The present disclosure may divide these users into a plurality of user groups according to the value interval to which the confidence of each user's portrait belongs (e.g., the value range 0-1 is divided into 10 intervals with a step of 0.1), and obtain the CTR (click-through rate) of each user group; plotting the CTR of each user group against the confidence interval of that group may form the straight line 500 in fig. 5. For all users to whom house information is pushed without using the fall-back strategy, the present disclosure may likewise divide the users into a plurality of user groups by confidence interval and obtain the CTR of each group, forming the polyline 501 in fig. 5. The present disclosure may use the intersection of the straight line 500 and the polyline 501 as the confidence threshold of the information recommendation scenario; for example, as shown in fig. 5, 0.3 may be taken as the confidence threshold.
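The intersection rule of fig. 5 can be sketched as below; the bin edges and CTR values are purely illustrative, and picking the first bin where personalised pushes match the flat strategy is a simplifying assumption about how the intersection is read off:

```python
def confidence_threshold(bin_edges, ctr_fallback, ctr_personalized):
    """Return the left edge of the first confidence interval where
    personalised pushes achieve at least the CTR of the fall-back
    strategy; that edge serves as the scenario's confidence threshold."""
    for edge, flat, personal in zip(bin_edges, ctr_fallback, ctr_personalized):
        if personal >= flat:
            return edge
    return bin_edges[-1]
```

With the illustrative numbers below, personalised CTR first overtakes the flat fall-back CTR in the 0.3 bin, matching the 0.3 threshold in the example above.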
S401, selecting, from all users, the users whose user portrait confidence is lower than the confidence threshold.
S402, for any selected user, obtaining the user portrait mean of the users in that user's similar user group, and determining the user portrait of the user by using the user portrait adjustment parameter of the user, the user portrait of the user, and the user portrait mean.
Optionally, a user portrait adjustment parameter for a user may be obtained based on the hyper-parameter and a confidence level of the user portrait of the user, e.g., the present disclosure may take the product of the hyper-parameter and the confidence level of the user portrait of the user as the user portrait adjustment parameter for the user. The present disclosure may obtain a user representation of a user using the following equation (4):
A = αs · A1 + (1 − αs) · A2 (4)
In the above formula (4), α represents the hyper-parameter; s represents the confidence of the user portrait of the user whose portrait is to be corrected (hereinafter referred to simply as the user to be corrected); A1 represents the user portrait of the user to be corrected, such as the specific value of each enumerated value in the user portrait; A2 represents the user portrait mean of the similar user group of the user to be corrected, e.g., the mean of the specific values of each enumerated value in the user portraits of all users in the similar user group.
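Assuming formula (4) combines the two portraits as a confidence-weighted interpolation, with the adjustment parameter αs weighting the user's own portrait against the group mean, a minimal sketch might look like this (the blend form, the default α, and the list-of-scores representation are assumptions):

```python
# Illustrative sketch of the confidence-weighted portrait correction.
# a1: the user's own portrait (one score per enumerated value);
# a2: the mean portrait of the user's similar user group;
# s:  the confidence of the user's portrait; alpha: a hyper-parameter.

def correct_portrait(a1, a2, s, alpha=0.8):
    """Blend each enumerated-value score with the group mean; a low
    confidence s shifts the corrected portrait toward the group."""
    w = alpha * s  # the user portrait adjustment parameter
    return [w * x1 + (1 - w) * x2 for x1, x2 in zip(a1, a2)]
```

For example, with s = 0.5 and α = 0.8 the user's own scores carry weight 0.4 and the group mean carries weight 0.6, so a low-confidence portrait is pulled toward its similar user group.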
According to the present disclosure, a confidence threshold is set for each predetermined scenario, so that the scheme of the present disclosure can be better adapted to various predetermined scenarios. In the process of adjusting the user portrait, both the user portrait of the user to be corrected and the user portrait mean of the similar user group of the user to be corrected are considered, so that the adjusted user portrait better conforms to the real situation of the user.
Exemplary devices
FIG. 6 is a schematic structural diagram of an embodiment of a user portrait correction apparatus according to the present disclosure. The apparatus of this embodiment may be used to implement the method embodiments of the present disclosure described above.
As shown in fig. 6, the apparatus of the present embodiment may include: a get description information module 600, a map module 601, a group partition module 602, a determine user group module 603, a determine confidence module 604, and a revise user representation module 605.
The obtain description information module 600 is used to obtain user description information of a plurality of users. Wherein the user description information includes: user tags, user behavior characteristics, and user portraits.
The mapping module 601 is configured to map the user description information of the multiple users into feature vectors respectively; wherein the feature vector comprises a plurality of feature vector segments.
The group division module 602 is configured to select, for any feature vector segment included in the feature vector, users having the same feature vector segment from the multiple users, and divide the selected users having the same feature vector segment into a group.
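The segment-based grouping performed by the group division module 602 might be sketched as follows; the segment length, data shapes, and returned dict layout are illustrative assumptions:

```python
from collections import defaultdict

# Hypothetical sketch of the grouping step: users whose feature vectors
# share an identical segment (a fixed slice of the vector) land in the
# same candidate group; one user may appear in several groups.

def group_by_segments(feature_vectors, segment_len=4):
    """feature_vectors: {user: tuple of numbers}.
    Returns {(segment_index, segment_values): [users]} for groups
    containing at least two users."""
    groups = defaultdict(list)
    for user, vec in feature_vectors.items():
        for i in range(0, len(vec), segment_len):
            seg = vec[i:i + segment_len]
            groups[(i // segment_len, seg)].append(user)
    return {k: v for k, v in groups.items() if len(v) > 1}
```

Restricting the later similarity computation to users inside the same group keeps it from running over every pair of users.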
The determine user group module 603 is configured to determine a similar user group of each user according to the similarity between the feature vectors of different users in the group.
The determine confidence module 604 is configured to determine, for any user, a confidence of the user representation of the user based on the correspondence of the feature vector of the user with the feature vectors of users in the similar user group of the user.
The revise user representation module 605 is configured to revise, based on the user portraits of the users in the similar user group, the user portraits of users whose confidence does not meet the preset confidence requirement.
Optionally, the user tag of the present disclosure may include: the user static attribute tag, the user value tag and the user state tag. And the obtain description information module 600 may include: a first sub-module 6001 and a second sub-module 6002. The first sub-module 6001 is configured to obtain, according to the user operation data, the user static attribute tags of the users and the user behavior characteristics of the users. The second sub-module 6002 is configured to perform data mining on the user static attribute tags of the users and the user behavior characteristics of the users, and obtain user value tags and user status tags of the users.
Optionally, the first sub-module 6001 is further configured to: according to the user operation data, carrying out behavior quantity statistics on various behaviors which are executed by each user according to each preference enumeration value respectively, and obtaining the behavior times of each user for executing various behaviors according to each preference enumeration value respectively; and aiming at any user, taking the behavior times of the user executing various behaviors aiming at each preference enumeration value as the user behavior characteristics of the user.
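The behavior-count feature construction described above could be sketched like this; the event format, behavior names, and preference enumeration values are hypothetical:

```python
from collections import Counter

# Hypothetical sketch of building user behavior features: count how many
# times each user performed each behavior type against each preference
# enumeration value (e.g. browse/follow across price-range values), and
# lay the counts out in a fixed order so they form a feature vector.

def behavior_features(events, behaviors, enum_values):
    """events: iterable of (user, behavior, enum_value) triples.
    Returns {user: [count per (behavior, enum_value) pair]}."""
    events = list(events)
    counts = Counter(events)
    users = {u for u, _, _ in events}
    return {
        u: [counts[(u, b, v)] for b in behaviors for v in enum_values]
        for u in users
    }
```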
Optionally, the module for determining a user group 603 includes: a third sub-module 6031, a fourth sub-module 6032, and a fifth sub-module 6033. The third sub-module 6031 is configured to select, for any user in the group, a feature vector of a partial dimension from the feature vectors of the user according to a minimum hash method. The fourth sub-module 6032 is configured to calculate similarities between the feature vectors of the partial dimensions of the user and the feature vectors of the partial dimensions of other users in the group where the user is located. The fifth sub-module 6033 is configured to use the users whose similarity meets the predetermined similarity requirement as the users in the similar user group of the user.
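A rough sketch of MinHash-based similarity as performed by sub-modules 6031-6033 follows; the hash function, number of signature slots, and the set-of-non-zero-dimensions input are illustrative assumptions:

```python
import hashlib

# Hypothetical MinHash sketch: derive a fixed-length signature per user
# from the indices of the user's non-zero feature dimensions, then
# estimate similarity as the fraction of matching signature slots
# (an approximation of the Jaccard similarity of the dimension sets).

def minhash_signature(nonzero_dims, num_hashes=16):
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int(hashlib.md5(f"{seed}:{d}".encode()).hexdigest(), 16)
            for d in nonzero_dims
        ))
    return sig

def estimated_similarity(sig_a, sig_b):
    """Fraction of equal slots approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Users whose estimated similarity meets the predetermined requirement would then be placed into the similar user group.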
Optionally, the determining confidence module 604 includes: a sixth sub-module 6041 and a seventh sub-module 6042. The sixth sub-module 6041 is configured to, for any user, calculate a single-behavior cross entropy average value of the feature vector of the user and the feature vector of each user in the similar user group of the user, and obtain a plurality of single-behavior cross entropy average values. The seventh sub-module 6042 is configured to calculate the mean of the single-behavior cross entropies, and obtain the confidence of the user representation of the user.
Optionally, the seventh sub-module 6042 is further configured to: and taking the reciprocal of the sum of all single-behavior cross entropy averages of the user as the confidence of the user portrait of the user.
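The confidence computation of the sixth and seventh sub-modules might be sketched as follows; the reciprocal-of-sum rule comes from the text, while treating each feature vector as a per-behavior distribution and the epsilon guard are assumptions:

```python
import math

# Illustrative sketch: compute, for each neighbor in the similar user
# group, the mean single-behavior cross entropy between the user's
# feature vector and the neighbor's, then take the reciprocal of the
# sum of those means as the confidence of the user's portrait.

def single_behavior_ce_mean(p, q, eps=1e-12):
    """Mean cross entropy between two per-behavior distributions."""
    return sum(-pi * math.log(qi + eps) for pi, qi in zip(p, q)) / len(p)

def portrait_confidence(user_vec, neighbor_vecs):
    total = sum(single_behavior_ce_mean(user_vec, nv) for nv in neighbor_vecs)
    return 1.0 / total if total > 0 else float("inf")
```

A user whose behavior closely matches the similar user group accumulates a small cross-entropy sum and therefore a high confidence; a divergent user gets a low confidence.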
Optionally, the revise user representation module 605 is further configured to: determine a confidence threshold corresponding to a predetermined scenario; and for any user whose confidence does not reach the confidence threshold, obtain the user portrait mean of the users in the similar user group of the user, and determine the user portrait of the user according to the user portrait adjustment parameter of the user, the user portrait of the user, and the user portrait mean.
The operations specifically executed by the modules and the sub-modules and units included in the modules may be referred to in the description of the method embodiments with reference to fig. 1 to 5, and are not described in detail here.
Exemplary electronic device
An electronic device according to an embodiment of the present disclosure is described below with reference to fig. 7. FIG. 7 shows a block diagram of an electronic device in accordance with an embodiment of the disclosure. As shown in fig. 7, the electronic device 71 includes one or more processors 711 and memory 712.
The processor 711 may be a Central Processing Unit (CPU) or other form of processing unit having user portrait correction capabilities and/or instruction execution capabilities, and may control other components in the electronic device 71 to perform desired functions.
Memory 712 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory, for example, may include: random Access Memory (RAM) and/or cache memory (cache), etc. The nonvolatile memory, for example, may include: read Only Memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 711 to implement the user representation correction methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 71 may further include: an input device 713 and an output device 714, among other components, interconnected by a bus system and/or other form of connection mechanism (not shown). The input device 713 may include, for example, a keyboard, a mouse, and the like. The output device 714 can output various information to the outside and may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto.
Of course, for simplicity, only some of the components of the electronic device 71 relevant to the present disclosure are shown in fig. 7, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 71 may include any other suitable components, depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in a user representation correction method according to various embodiments of the present disclosure as described in the "exemplary methods" section above of this specification.
The computer program product may write program code for carrying out operations for embodiments of the present disclosure in any combination of one or more programming languages, including an object oriented programming language such as Java or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in a user representation correction method according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium may include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The block diagrams of devices, apparatuses, and systems referred to in this disclosure are only given as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including but not limited to" and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the word "and/or," unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to."
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects, and the like, will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. A user portrait correction method, comprising:
acquiring user description information of a plurality of users; wherein the user description information includes: user tags, user behavioral characteristics, and user portraits;
mapping the user description information of the plurality of users into feature vectors respectively; wherein the feature vector comprises a plurality of feature vector segments;
selecting users with the same feature vector segment from the plurality of users aiming at any feature vector segment contained in the feature vector, and dividing the selected users with the same feature vector segment into a group;
determining a similar user group of each user according to the similarity between the feature vectors of different users in the group;
for any user, determining the confidence of the user portrait of the user according to the consistency of the feature vector of the user and the feature vectors of the users in the similar user group of the user;
and correcting, according to the user portraits of the users in the similar user group, the user portrait of a user whose confidence does not meet a preset confidence requirement.
2. The method of claim 1, wherein the user tag comprises: the user static attribute tag, the user value tag and the user state tag;
the acquiring user description information of a plurality of users includes:
acquiring a user static attribute label of each user and user behavior characteristics of each user according to the user operation data;
and performing data mining processing on the user static attribute labels of the users and the user behavior characteristics of the users to obtain user value labels and user state labels of the users.
3. The method according to claim 2, wherein the obtaining the user static attribute tag of each user and the user behavior feature of each user according to the user operation data comprises:
according to the user operation data, carrying out behavior quantity statistics on various behaviors which are executed by each user according to each preference enumeration value respectively, and obtaining the behavior times of each user for executing various behaviors according to each preference enumeration value respectively;
and aiming at any user, taking the behavior times of the user executing various behaviors aiming at each preference enumeration value as the user behavior characteristics of the user.
4. The method according to any one of claims 1 to 3, wherein the determining a similar user group for each user according to the similarity between the feature vectors of different users in the group comprises:
selecting a part of dimensional feature vectors from the feature vectors of any user in the group according to a minimum hash method;
calculating the similarity between the feature vector of the partial dimension of the user and the feature vectors of the partial dimensions of other users in the group of the user;
and taking the user with the similarity meeting the preset similarity requirement as the user in the similar user group of the user.
5. The method of any of claims 1 to 4, wherein the determining, for any user, a confidence level for the user representation of the user based on the correspondence of the feature vector of the user with the feature vectors of users in a similar user group of the user comprises:
aiming at any user, calculating the single-behavior cross entropy mean value of the feature vector of the user and the feature vector of each user in the similar user group of the user respectively to obtain a plurality of single-behavior cross entropy mean values;
and calculating the mean value of the cross entropies of the plurality of single behaviors to obtain the confidence coefficient of the user portrait of the user.
6. The method of claim 5, wherein the calculating for the plurality of single-behavior cross-entropy means to obtain a confidence level of the user representation of the user comprises:
and taking the reciprocal of the sum of all single-behavior cross entropy averages of the user as the confidence of the user portrait of the user.
7. The method of any of claims 1 to 6, wherein said modifying, from user representations of users in a similar user group, user representations of users with confidence levels that do not meet a preset confidence requirement comprises:
determining a confidence threshold corresponding to a preset scene;
and for any user with the confidence coefficient not reaching the confidence coefficient threshold value, acquiring user portrait mean values of all users in the similar user group of the user, and determining the user portrait of the user according to the user portrait adjustment parameter of the user, the user portrait of the user and the user portrait mean values.
8. A user profile correction apparatus, wherein the apparatus comprises:
the acquisition description information module is used for acquiring user description information of a plurality of users; wherein the user description information includes: user tags, user behavioral characteristics, and user portraits;
the mapping module is used for mapping the user description information of the users into feature vectors respectively; wherein the feature vector comprises a plurality of feature vector segments;
a group division module, configured to select, for any feature vector segment included in the feature vector, users having the same feature vector segment from the multiple users, and divide the selected users having the same feature vector segment into a group;
a user group determining module for determining a similar user group of each user according to the similarity between the feature vectors of different users in the group;
the confidence determining module is used for determining the confidence of the user portrait of any user according to the consistency of the feature vector of the user and the feature vectors of the users in the similar user group of the user;
and the user portrait correcting module is used for correcting, according to the user portraits of the users in the similar user group, the user portrait of a user whose confidence does not meet the preset confidence requirement.
9. A computer-readable storage medium, the storage medium storing a computer program for performing the method of any of the preceding claims 1-7.
10. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method of any one of claims 1-7.
CN202011215640.6A 2020-11-04 2020-11-04 User portrait correction method, device, medium, and electronic apparatus Active CN112256973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011215640.6A CN112256973B (en) 2020-11-04 2020-11-04 User portrait correction method, device, medium, and electronic apparatus

Publications (2)

Publication Number Publication Date
CN112256973A true CN112256973A (en) 2021-01-22
CN112256973B CN112256973B (en) 2021-09-10

Family

ID=74267693

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011215640.6A Active CN112256973B (en) 2020-11-04 2020-11-04 User portrait correction method, device, medium, and electronic apparatus

Country Status (1)

Country Link
CN (1) CN112256973B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113407827A (en) * 2021-06-11 2021-09-17 广州三七极创网络科技有限公司 Information recommendation method, device, equipment and medium based on user value classification
CN113609409A (en) * 2021-07-21 2021-11-05 深圳供电局有限公司 Method and system for recommending browsing information, electronic device and storage medium
CN115994267A (en) * 2023-02-15 2023-04-21 北京欧拉认知智能科技有限公司 Real-time user image depicting method, device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408712A (en) * 2018-09-30 2019-03-01 重庆誉存大数据科技有限公司 A kind of construction method of travel agency user multidimensional information portrait
CN109918409A (en) * 2019-03-04 2019-06-21 珠海格力电器股份有限公司 A kind of equipment portrait construction method, device, storage medium and equipment
CN110084657A (en) * 2018-01-25 2019-08-02 北京京东尚科信息技术有限公司 A kind of method and apparatus for recommending dress ornament
US20190364123A1 (en) * 2017-04-13 2019-11-28 Tencent Technology (Shenzhen) Company Limited Resource push method and apparatus
CN111026977A (en) * 2019-12-17 2020-04-17 腾讯科技(深圳)有限公司 Information recommendation method and device and storage medium

Also Published As

Publication number Publication date
CN112256973B (en) 2021-09-10

Similar Documents

Publication Publication Date Title
CN112256973B (en) User portrait correction method, device, medium, and electronic apparatus
US8583471B1 (en) Inferring household income for users of a social networking system
US9607273B2 (en) Optimal time to post for maximum social engagement
US9727882B1 (en) Predicting and classifying network activity events
US20160132904A1 (en) Influence score of a brand
CN109903086B (en) Similar crowd expansion method and device and electronic equipment
US20140089400A1 (en) Inferring target clusters based on social connections
US10116758B2 (en) Delivering notifications based on prediction of user activity
WO2018161940A1 (en) Method and device for pushing media file, data storage medium, and electronic apparatus
US10827014B1 (en) Adjusting pacing of notifications based on interactions with previous notifications
US10891678B1 (en) Personalized network content generation and redirection according to time intervals between repeated instances of behavior based on entity size
CN112514403B (en) Distribution of embedded content items by an online system
US20130179418A1 (en) Search ranking features
KR101639656B1 (en) Method and server apparatus for advertising
JP2015166989A (en) information processing apparatus and information analysis method
CN111626898B (en) Method, device, medium and electronic equipment for realizing attribution of events
CN111523032A (en) Method, device, medium and electronic equipment for determining user preference
CN112100511A (en) Preference degree data obtaining method and device and electronic equipment
CN111753208B (en) Method, device, medium and electronic equipment for determining convergence of comparable attributes of users
CN110020129B (en) Click rate correction method, prediction method, device, computing equipment and storage medium
US11663620B2 (en) Customized merchant price ratings
JP6660168B2 (en) Information providing apparatus, information providing method, and program
KR102323424B1 (en) Rating Prediction Method for Recommendation Algorithm Based on Observed Ratings and Similarity Graphs
US20110295723A1 (en) Advertisement inventory management
US11973841B2 (en) System and method for user model based on app behavior

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210319

Address after: 100085 Floor 101 102-1, No. 35 Building, No. 2 Hospital, Xierqi West Road, Haidian District, Beijing

Applicant after: Seashell Housing (Beijing) Technology Co.,Ltd.

Address before: Unit 05, room 112, 1st floor, office building, Nangang Industrial Zone, economic and Technological Development Zone, Binhai New Area, Tianjin 300457

Applicant before: BEIKE TECHNOLOGY Co.,Ltd.

GR01 Patent grant