CN111831894A - Information matching method and device - Google Patents


Info

Publication number
CN111831894A
Authority
CN
China
Prior art keywords
label
classified
user
attribute
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910330274.XA
Other languages
Chinese (zh)
Inventor
兰红云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd filed Critical Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN201910330274.XA priority Critical patent/CN111831894A/en
Publication of CN111831894A publication Critical patent/CN111831894A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Abstract

The embodiment of the application provides an information matching method and device, wherein the method comprises the following steps: obtaining label values of a plurality of users to be classified under at least one preset attribute label respectively; determining a relation weight between every two users to be classified according to the label value of each user to be classified under at least one attribute label and the label weight corresponding to the label value of each user to be classified; grouping a plurality of users to be classified according to the relation weight between every two users to be classified to form at least one user group; and for each user group, matching service information for the user group. According to the method and the device, the users can be grouped with higher precision based on the weight of the label of each user to be classified, and the service information is matched for each user group, so that the user group selected during service information pushing has higher pertinence, and the waste of network resources and pushing resources is reduced.

Description

Information matching method and device
Technical Field
The present application relates to the field of data processing technologies, and in particular, to an information matching method and apparatus.
Background
Pushing information to specific user groups in a targeted manner can improve the accuracy and efficiency of information pushing; to do so, the users need to be grouped.
Users are generally grouped based on user profiles. A user profile is a tagged user model abstracted from information such as a user's social attributes, living habits, and consumption behaviors. Tags are attributes of a user, and a set of tags describes a user. When users are grouped based on user profiles, the attributes of the target type of user are first determined to define screening conditions; users are then screened according to the screening conditions and the user profiles to determine the target users that meet the requirements. However, this grouping method considers only the surface characteristics of users, so the accuracy of the grouping result is low and the user groups selected for information pushing are poorly targeted, which wastes network resources and pushing resources.
Disclosure of Invention
In view of this, an object of the present application is to provide an information matching method and apparatus, which can perform grouping on users with higher precision based on the weight of the tag of each user to be classified, so that the user group selected during information pushing has higher pertinence, and waste of network resources and pushing resources is reduced.
In a first aspect, an information matching method is provided, including:
obtaining label values of a plurality of users to be classified under at least one preset attribute label respectively;
determining a relation weight between every two users to be classified according to the label value of each user to be classified under at least one attribute label and the label weight corresponding to the label value of each user to be classified;
grouping a plurality of users to be classified according to the relation weight between every two users to be classified to form at least one user group;
and for each user group, matching service information for the user group.
In a second aspect, another information matching method is provided, including:
acquiring a label value of a user to be classified under at least one preset attribute label and a label value of each target user in at least one user group under at least one preset attribute label;
determining the relation weight between the user to be classified and each target user in each user group according to the label value and the corresponding label weight of the user to be classified under each attribute label and the label value and the corresponding label weight of the target user in each user group under the at least one preset attribute label;
determining the aggregation degree between the user to be classified and each user group according to the relation weight between the user to be classified and each target user in each user group;
when the number of user groups whose aggregation degree is greater than a preset aggregation degree threshold is greater than a preset number, determining the probability that the user to be classified belongs to each user group according to the label value of the user to be classified under each attribute label;
determining the grouping result of the user to be classified according to the probability that the user to be classified belongs to each user group;
and matching service information for the user to be classified according to the grouping result of the user to be classified.
In a third aspect, an information matching apparatus is provided, including:
the first acquisition module is used for acquiring label values of a plurality of users to be classified under at least one preset attribute label;
the first determining module is used for determining the relation weight between every two users to be classified according to the label value of each user to be classified under at least one attribute label acquired by the first acquiring module and the label weight corresponding to the label value of each user to be classified;
the first grouping module is used for grouping the users to be classified according to the relationship weight between every two users to be classified determined by the first determining module to form at least one user group;
a first matching module, configured to match service information for each user group formed by the first grouping module.
In a fourth aspect, an information matching apparatus is provided, including:
the second obtaining module is used for obtaining the label value of the user to be classified under at least one preset attribute label and the label value of each target user in at least one user group under at least one preset attribute label;
a second determining module, configured to determine, according to the tag value and the corresponding tag weight of the user to be classified under each attribute tag obtained by the second obtaining module, and the tag value and the corresponding tag weight of the target user in each user group under the at least one preset attribute tag, a relationship weight between the user to be classified and each target user in each user group;
a third determining module, configured to determine, according to the relationship weight between the user to be classified and each target user in each user group determined by the second determining module, an aggregation degree between the user to be classified and each user group;
the second grouping module is configured to, when the number of user groups whose aggregation degree, as determined by the third determining module, is greater than the preset aggregation degree threshold is greater than the preset number, determine, according to the tag values of the user to be classified under the attribute tags, the probabilities that the user to be classified belongs to the user groups, and determine, according to those probabilities, the grouping result of the user to be classified;
and the second matching module is used for matching service information for the user to be classified according to the grouping result of the user to be classified determined by the second grouping module.
In a fifth aspect, an embodiment of the present application further provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions being executable by the processor to perform the steps of the information matching method in the embodiments of the first aspect or to perform the steps of the information matching method in the embodiments of the second aspect.
In a sixth aspect, the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to perform the steps of the information matching method in the embodiment of the first aspect or the steps of the information matching method in the embodiment of the second aspect.
In the embodiments of the application, the relation weight between every two users to be classified is determined from the label values of the plurality of users to be classified under at least one preset attribute label and the label weights corresponding to those label values; the relation weight represents the degree of similarity between the two users to be classified. The plurality of users to be classified are then grouped according to the relation weight between every two users to be classified to form at least one user group, and corresponding service information is matched for each user group. Compared with the prior-art approach of grouping by user profiles, the embodiments of the application group users with higher precision based on the weights of the labels of each user to be classified, so that the user groups selected during information pushing are better targeted and the waste of network resources and pushing resources is reduced.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and it will be apparent to those skilled in the art that other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a schematic diagram illustrating an architecture of a service system provided in an embodiment of the present application;
Fig. 2 is a flowchart illustrating an information matching method provided in an embodiment of the present application;
Fig. 3 is a flowchart illustrating a specific method for determining a relationship weight between every two users to be classified in the information matching method provided in the embodiment of the present application;
Fig. 4 is a flowchart illustrating a specific method for grouping a plurality of users to be classified to form at least one user group according to a relationship weight between every two users to be classified in the information matching method provided in the embodiment of the present application;
Fig. 5 is a flowchart illustrating a specific method for determining a label weight corresponding to each label value under each basic attribute label in the information matching method provided in the embodiment of the present application;
Fig. 6 is a flowchart illustrating a specific method for determining a reconstructed tag in an information matching method provided in an embodiment of the present application;
Fig. 7 is a flowchart illustrating a specific method for adjusting label weights corresponding to label values under each attribute label in an information matching method according to an embodiment of the present application;
Fig. 8 is a flowchart of another information matching method provided by the embodiments of the present application;
Fig. 9 is a schematic structural diagram illustrating an information matching apparatus provided in an embodiment of the present application;
Fig. 10 shows a schematic structural diagram of a computer device 100 provided by an embodiment of the present application;
Fig. 11 is a schematic structural diagram illustrating another information matching apparatus provided in an embodiment of the present application;
Fig. 12 shows a schematic structural diagram of a computer device 200 according to an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flowcharts may be performed out of order, and steps that have no logical dependency on one another may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowcharts.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
To enable those skilled in the art to use the present disclosure, the following embodiments are presented in conjunction with a specific application scenario, "online car-hailing". It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Although the present application is described primarily in the context of grouping users of online car-hailing, it should be understood that this is merely an exemplary embodiment and that the method may be used to group users or groups of people in any field.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
One aspect of the present application relates to an application system of an information matching method. The system can determine the relation weight between every two users to be classified from the label values of the users to be classified under at least one preset attribute label and the label weights corresponding to those label values; the relation weight represents the degree of similarity between the two users to be classified. The system can then group the plurality of users to be classified according to the relation weight between every two users to be classified to form at least one user group, and match corresponding service information for each user group. Compared with the prior-art approach of grouping by user profiles, the embodiments of the application group users with higher precision based on the weights of the labels of each user to be classified, so that the user groups selected during information pushing are better targeted and the waste of network resources and pushing resources is reduced.
Fig. 1 is a schematic architecture diagram of a service system 100 of an information matching method according to an embodiment of the present application. For example, the service system 100 may be an online transportation service platform for transportation services such as taxi, designated driving, express, carpooling, bus, driver rental, or regular services, or any combination thereof; it may also be a shopping platform, a video playing platform, an advertisement placement platform, and the like. The service system 100 may include one or more of a server 110, a network 120, and a database 130.
In some embodiments, the server 110 may include a processor. The processor may process information and/or data of the users to be classified to perform one or more of the functions described herein. For example, the processor may obtain the tag values of the users to be classified under at least one preset attribute tag, group the plurality of users to be classified based on those tag values, and match service information for each user group formed. In some embodiments, a processor may include one or more processing cores (e.g., single-core processor(s) or multi-core processor(s)). Merely by way of example, a processor may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physics Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
In some embodiments, database 130 may be connected to network 120 to communicate with server 110. Server 110 may access data or instructions stored in database 130 via network 120. In some embodiments, database 130 may be directly connected to server 110, or database 130 may be part of server 110.
The information matching method provided by the embodiment of the present application is described in detail below with reference to the content described in the service system 100 shown in fig. 1.
According to the information matching method provided by the embodiment of the application, the prominent attributes of the sample users which are already grouped are mined, so that users with the same or similar attributes as the sample users are found from other users.
Example one
Referring to fig. 2, an embodiment of the present application provides an information matching method, including S201 to S204. Wherein:
S201: Acquire label values of a plurality of users to be classified under at least one preset attribute label.
S202: Determine the relation weight between every two users to be classified according to the label value of each user to be classified under at least one attribute label and the label weight corresponding to the label value of each user to be classified.
S203: Group the plurality of users to be classified according to the magnitude of the relation weight between every two users to be classified to form at least one user group.
S204: For each user group, match service information for the user group.
This embodiment groups a plurality of users to be classified who have not yet been grouped and matches service information for them; before the user groups of the users to be classified are determined, no previously generated user group exists.
The following is a detailed description of the above S201 to S204.
I: In the above S201, the attribute tags may differ between application scenarios. For example, when the application scenario is online car-hailing, the attribute tags may be: ride-hailing frequency, the area to which the trip origin belongs, the area to which the trip destination belongs, average order amount, vehicle brand, average order distance, and the like. When the application scenario is a video website, the attribute tags may be: type of video watched, viewing frequency, pay-per-view, membership, and the like. When the application scenario is a shopping website, the attribute tags may be: the types of operation performed on various commodities (such as clicking, adding to a shopping cart, collecting, and purchasing), the types of the commodities operated on, and the prices of the commodities operated on.
II: In the above S202, the attribute tags include basic attribute tags and/or reconstructed attribute tags. A reconstructed attribute tag is a tag constructed by combining basic attribute tags. For example, in the field of online car-hailing, the basic attribute tags include "order distance" and "vehicle brand", and the reconstructed attribute tags include "order distance and vehicle brand".
The basic attribute tags represent the characteristics of the user to be classified. The reconstructed attribute tags, being formed from basic attribute tags, represent both the characteristics of the user to be classified and the degree of association between different basic attribute tags of the same user, so that the characteristics of the user to be classified are mined more deeply.
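The relationship between basic and reconstructed attribute tags can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the tag names, the interval boundaries in `INTERVALS`, and the choice to build one reconstructed tag for every pair of basic tags (the text above only states that reconstructed tags combine basic tags, not which combinations are used).

```python
from bisect import bisect_right
from itertools import combinations

# Hypothetical interval boundaries for continuous basic attribute tags.
# A tag that is absent from this mapping is treated as discrete.
INTERVALS = {"order_distance_km": [5.0, 15.0, 30.0]}


def value_key(tag, value):
    """Return a comparable key: the raw value for a discrete tag,
    or the index of the value interval for a continuous tag."""
    bounds = INTERVALS.get(tag)
    return value if bounds is None else bisect_right(bounds, value)


def add_reconstructed_tags(basic):
    """Return the basic tag keys plus one reconstructed tag per pair of basic tags.

    Pairing every two basic tags is an assumption of this sketch.
    """
    keys = {tag: value_key(tag, value) for tag, value in basic.items()}
    result = dict(keys)
    for tag_a, tag_b in combinations(sorted(keys), 2):
        # The reconstructed tag value is the pair of basic tag keys.
        result[f"{tag_a}+{tag_b}"] = (keys[tag_a], keys[tag_b])
    return result


user = {"order_distance_km": 12.4, "vehicle_brand": "BrandX"}
tags = add_reconstructed_tags(user)
```

With this representation, two users share a reconstructed tag value only when both underlying basic tag keys agree, which matches the idea that a reconstructed tag also captures the association between basic tags.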
Specifically, a specific way of determining the label weight corresponding to each label value of each to-be-classified user under each attribute label may be as shown in the following embodiment two, and details are not described here.
Referring to Fig. 3, an embodiment of the present application provides a specific method for determining the relationship weight between every two users to be classified, including, for every two users to be classified, executing:
S301: Determine target attribute labels from the attribute labels according to the label values of the two users to be classified under the attribute labels. A target attribute label is an attribute label under which the two users to be classified have the same label value, and/or an attribute label under which the label values of the two users to be classified belong to the same label value interval.
Here, for the case where the attribute tag is a basic attribute tag:
If the label values under the target attribute label are discrete values, the target attribute label is a basic attribute label under which the two users to be classified have the same label value.
If the label values under the target attribute label are continuous values, the target attribute label is a basic attribute label under which the label values of the two users to be classified belong to the same label value interval.
For the case that the attribute tag is a reconstructed attribute tag, taking an example that the reconstructed attribute tag is composed of a first basic attribute tag and a second basic attribute tag:
if the label value under the first basic attribute label and the label value under the second basic attribute label which form the reconstructed attribute label are discrete values, the target attribute label means that the label values under the first basic attribute label and the label values under the second basic attribute label of the two users to be classified are the same.
If the label value under the first basic attribute label forming the reconstructed attribute label is a discrete value and the label value under the second basic attribute label is a continuous value, the target attribute label means that the label values under the first basic attribute label of the two users to be classified are the same, and the label values under the second basic attribute label belong to the same label value interval.
If the label value under the first basic attribute label and the label value under the second basic attribute label which form the reconstructed attribute label are continuous values, the target attribute label means that the label values under the first basic attribute label of the two users to be classified belong to the same label value interval under the first basic attribute label, and the label values under the second basic attribute label belong to the same label value interval under the second basic attribute label.
S302: Determine the relation weight between the two users to be classified according to the label weights corresponding to the label values of the two users to be classified under the target attribute labels.
Here, the relation weight between the two users to be classified may be determined by summing the label weights corresponding to the label values of the users to be classified under the target attribute labels.
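Steps S301 and S302 can be sketched in Python as follows. This is an illustrative reading, not the applicant's implementation: the tag names, interval boundaries, and weight values are hypothetical, and the sketch assumes that each target attribute label contributes its label weight once to the sum.

```python
from bisect import bisect_right


def value_key(tag, value, intervals):
    """Discrete tags compare by value; continuous tags by interval index."""
    bounds = intervals.get(tag)
    return value if bounds is None else bisect_right(bounds, value)


def relation_weight(user_a, user_b, label_weight, intervals):
    """Sum the label weights over the target attribute labels of two users."""
    total = 0.0
    for tag in user_a.keys() & user_b.keys():
        key_a = value_key(tag, user_a[tag], intervals)
        key_b = value_key(tag, user_b[tag], intervals)
        if key_a == key_b:  # S301: same value, or same value interval
            # S302: add the weight of the shared label value (0 if unknown)
            total += label_weight.get((tag, key_a), 0.0)
    return total


# Hypothetical data: one continuous tag with interval boundaries, two discrete tags.
intervals = {"avg_order_amount": [20.0, 50.0]}
label_weight = {
    ("vehicle_brand", "BrandX"): 0.5,
    ("avg_order_amount", 1): 0.25,  # interval index 1 covers [20, 50)
    ("start_area", "A"): 0.125,
}
user_a = {"vehicle_brand": "BrandX", "avg_order_amount": 25.0, "start_area": "A"}
user_b = {"vehicle_brand": "BrandX", "avg_order_amount": 48.0, "start_area": "B"}
weight_ab = relation_weight(user_a, user_b, label_weight, intervals)  # 0.5 + 0.25
```

In this example the two users share the vehicle brand and fall into the same order-amount interval, so those two labels are the target attribute labels; the start area differs and contributes nothing.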
III: In the above S203, the magnitude of the relation weight between users to be classified represents the similarity between every two users to be classified: the larger the relation weight, the higher the probability that the two users to be classified belong to the same user group. The plurality of users to be classified can therefore be grouped according to the relation weight between every two users to be classified.
Specifically, referring to fig. 4, an embodiment of the present application further provides a specific method for grouping a plurality of users to be classified according to a relationship weight between every two users to be classified to form at least one user group, including:
S401: Randomly select one target user to be classified from the users to be classified who have not been classified, select another target user to be classified that has the largest relation weight with the first, and form an initial group from the two selected target users to be classified.
S402: Determine a first aggregation degree according to the relation weight between the two target users to be classified in the initial group, and determine a second aggregation degree according to the relation weight between every two of the unclassified users to be classified other than the target users to be classified.
S403: Traverse each unclassified user to be classified other than the target users to be classified and, for the currently traversed user to be classified, execute:
S4031: Determine a third aggregation degree according to the relation weight between the currently traversed user to be classified and each target user to be classified in the initial group and the relation weight between every two target users to be classified in the initial group; determine a fourth aggregation degree according to the relation weight between every two of the unclassified users to be classified other than the target users to be classified and the currently traversed user to be classified.
S4032: Determine a first aggregation index according to the first aggregation degree and the second aggregation degree; determine a second aggregation index according to the third aggregation degree and the fourth aggregation degree.
S4033: Detect whether the second aggregation index is greater than the first aggregation index; if yes, jump to S4034; if not, jump to S4035.
S4034: Add the currently traversed user to be classified to the initial group as a new target user to be classified, forming a new initial group and completing the traversal of the currently traversed user to be classified; jump to S4035.
S4035: Detect whether any user to be classified has not been traversed; if not, jump to S404; if yes, take the next user to be classified as the currently traversed user to be classified and jump to S4031.
S404: Take the obtained initial group as a user group, and take the users to be classified in the initial group as classified users.
S405: Detect whether any user to be classified has not yet been classified; if yes, jump to S401; if not, jump to S406.
S406: End the grouping process.
In the grouping process, for example, the users to be classified who have not yet been classified comprise A, B and $S_1$ to $S_5$. A is the randomly determined target user to be classified, and B is the other target user to be classified: among the unclassified users other than A, B has the largest relation weight with A. The relation weight between A and B is $c_{ab}$; the relation weights between A and $S_1$ to $S_5$ are $c_{a1}$ to $c_{a5}$ respectively; the relation weights between B and $S_1$ to $S_5$ are $c_{b1}$ to $c_{b5}$ respectively. A and B form an initial group.
Among $S_1$ to $S_5$, the relation weights between $S_1$ and $S_2$ to $S_5$ are $b_{12}$, $b_{13}$, $b_{14}$ and $b_{15}$ respectively;
the relation weights between $S_2$ and $S_3$ to $S_5$ are $b_{23}$, $b_{24}$ and $b_{25}$ respectively;
the relation weights between $S_3$ and $S_4$, $S_5$ are $b_{34}$ and $b_{35}$ respectively;
the relation weight between $S_4$ and $S_5$ is $b_{45}$.
According to the relation weight $c_{ab}$ between A and B, the first degree of polymerization is determined as $E_{in}^{1} = c_{ab}$.
According to the relation weights between every two of $S_1$ to $S_5$, the second degree of polymerization is determined as $E_{out}^{1} = b_{12} + b_{13} + b_{14} + b_{15} + b_{23} + b_{24} + b_{25} + b_{34} + b_{35} + b_{45}$.
$S_1$ is first taken as the currently traversed user to be classified, and for $S_1$:
the calculated third degree of polymerization is $E_{in}^{2} = c_{ab} + c_{a1} + c_{b1}$;
the calculated fourth degree of polymerization is
$E_{out}^{2} = b_{23} + b_{24} + b_{25} + b_{34} + b_{35} + b_{45}$.
The first polymerization index:
Figure BDA0002037475060000081
The second polymerization index:
Figure BDA0002037475060000082
If the second polymerization index is greater than the first polymerization index, it is considered that after $S_1$ is added to the initial group as a target user to be classified, the aggregation of the initial group becomes stronger, while the aggregation among the other users to be classified outside the initial group becomes weaker. $S_1$ is therefore added to the initial group as a target user to be classified, forming a new initial group.
Then $S_2$ is taken as the currently traversed user to be classified and the same process is executed, until $S_1$ to $S_5$ have all been traversed. The finally formed initial group is used as a user group, and all target users to be classified in it are regarded as classified. The same process is then performed for the other users to be classified that are not included in the user group, forming new user groups.
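The four sums in the example above can be reproduced with a short sketch. The weight values are hypothetical and chosen only so that the arithmetic is easy to follow, and the helper name `pair_sum` is an assumption rather than part of the application. The sketch stops at the degrees of polymerization, because the polymerization index formulas appear only in figures that are not reproduced in this text.

```python
from itertools import combinations

def pair_sum(users, weight):
    """Sum the relation weights over every unordered pair of `users`."""
    return sum(weight[frozenset(pair)] for pair in combinations(users, 2))

# Hypothetical relation weights: every pair gets 1.0, except A-B which gets 5.0.
users = ["A", "B", "S1", "S2", "S3", "S4", "S5"]
weight = {frozenset(p): 1.0 for p in combinations(users, 2)}
weight[frozenset(("A", "B"))] = 5.0

group = ["A", "B"]                          # initial group
rest = ["S1", "S2", "S3", "S4", "S5"]       # users outside the initial group

e_in_1 = pair_sum(group, weight)            # c_ab
e_out_1 = pair_sum(rest, weight)            # b12 + ... + b45 (10 pairs)
e_in_2 = pair_sum(group + ["S1"], weight)   # c_ab + c_a1 + c_b1
e_out_2 = pair_sum(rest[1:], weight)        # b23 + ... + b45 (6 pairs)
```

With these weights the first and second degrees are 5.0 and 10.0, and the third and fourth degrees, computed as if S1 had joined the group, are 7.0 and 6.0.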
It should be noted that after the grouping process is performed, the finally formed initial group may contain only a small number of target users to be classified, for example only 2 or 3. In another embodiment, the target users to be classified in such an initial group may be treated again as users to be classified who have not yet been classified, and the grouping process may be performed again on them together with the other unclassified users to be classified. When the grouping process is performed this time, a user who already served as the target user to be classified in the previous grouping process is not randomly chosen as the target user to be classified again.
In the embodiment of the application, the relation weight between every two users to be classified is determined from the label values of a plurality of users to be classified under at least one preset attribute label and the label weights corresponding to the different label values of each attribute label. The relation weight represents the degree of similarity between two users to be classified. The users to be classified are then grouped according to the relation weights between every two of them to form at least one user group, and corresponding server information is matched for each user group. Compared with the prior art of grouping through user portraits, the embodiment of the application groups users with higher precision based on the label weights of each user to be classified, so that the user group selected during information pushing is more targeted and the waste of network resources and pushing resources is reduced.
Example two
First: for the case where the attribute labels include basic attribute labels:
Referring to fig. 5, the second embodiment of the present application provides a specific method for determining the label weight corresponding to each label value under each basic attribute label. The method includes performing the following for each basic attribute label:
S501: Acquire the label values of a plurality of sample users under the basic attribute label.
Here, a sample user is a user whose grouping has already been completed. For sample users belonging to different user groups, the label weights of the label values under the corresponding attribute labels also differ. The embodiment of the application describes the method for determining label weights using sample users who all belong to the same user group.
S502: Determine the number of sample users corresponding to each label value under the basic attribute label.
Here, for example, the basic attribute label is "vehicle brand", and the label values of the sample users under it are "AA brand", "BB brand" and "CC brand". Determining the number of sample users corresponding to each label value under the basic attribute label then means determining the number of sample users whose label value is "AA brand", the number whose label value is "BB brand", and the number whose label value is "CC brand".
S503: Determine the concentration of the basic attribute label according to the number of sample users corresponding to each label value under the basic attribute label.
S504: Determine the label weight corresponding to each label value under the basic attribute label according to the concentration of the basic attribute label, the number of sample users corresponding to each label value under the basic attribute label, and the total number of sample users.
The concentration of a basic attribute label represents how concentrated the label values taken by different users under that label are. The larger the concentration, the more significant the basic attribute label is for the user group to which the sample users belong, that is, the better it characterizes the sample users belonging to the same user group; correspondingly, the larger the label weights corresponding to the label values under it.
The tag value of the base attribute tag may be a continuous value or a discrete value. For example, the label value of the attribute label "order distance" is a continuous value, and the label value of the "vehicle brand" is a discrete value.
A: For the case that the label values of the basic attribute label are discrete values, the concentration of the basic attribute label may be determined as follows:
Determine a preset number of target label values from the label values under the basic attribute label, in descending order of the number of sample users corresponding to each label value. Then determine the concentration of the basic attribute label according to the number of sample users corresponding to each target label value and the total number of sample users.
Illustratively, the concentration CRn of base attribute tags satisfies the following formula:
Figure BDA0002037475060000091
In the formula, n is the preset number and can be set according to actual needs, for example 3, 4 or 5; $X$ represents the total number of sample users; $X_i$ represents the number of sample users corresponding to the i-th target label value.
For example, when the label values of the basic attribute label are discrete values, the label weight corresponding to each label value under the basic attribute label satisfies the following formula:
$p(X_r) = CR_n \cdot X_r / X$;
where $p(X_r)$ represents the label weight corresponding to the r-th label value; $X_r$ represents the number of sample users corresponding to the r-th label value; $X$ represents the total number of sample users.
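A minimal sketch of steps S501 to S504 for discrete label values follows. It assumes that the concentration CRn is the share of sample users covered by the n most frequent label values, that is, the sum of the $X_i$ divided by $X$. The formula itself is in a figure that is not reproduced here, so this reading is an assumption based on the symbol definitions. The function name and the sample data are hypothetical.

```python
from collections import Counter

def discrete_label_weights(values, n=3):
    """Return (CRn, {label value: label weight}) for one basic attribute label.

    values: one label value per sample user.
    """
    counts = Counter(values)            # S502: sample users per label value
    total = len(values)                 # X
    top = counts.most_common(n)         # the n target label values
    # Assumed form of CRn: top-n share of the sample users.
    cr_n = sum(count for _, count in top) / total
    # From the text: p(X_r) = CRn * X_r / X
    weights = {value: cr_n * count / total for value, count in counts.items()}
    return cr_n, weights

# Hypothetical sample: 5 users with "AA", 3 with "BB", 2 with "CC".
sample = ["AA"] * 5 + ["BB"] * 3 + ["CC"] * 2
cr_n, weights = discrete_label_weights(sample, n=2)
```

For this sample and n = 2, the assumed concentration is 0.8 and the label weights are about 0.4, 0.24 and 0.16.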
B: For the case that the label values of the basic attribute label are continuous values, the concentration of the basic attribute label may be determined as follows:
Based on the label values of the sample users under the basic attribute label, cluster the label values under the basic attribute label with a preset clustering algorithm to form a plurality of label value intervals under the basic attribute label; determine a preset number of target label value intervals from the plurality of label value intervals, in descending order of the number of sample users corresponding to each label value interval; and determine the concentration of the basic attribute label according to the number of sample users corresponding to each target label value interval and the total number of sample users.
Here, it should be noted that the process of determining the tag value interval actually clusters the sample users according to the tag values of the sample users under the basic attribute tag, and then determines a plurality of tag value intervals according to the tag values of the sample users under the basic attribute tag included in each class. In the plurality of tag value intervals, two adjacent tag value intervals may be numerically continuous or discontinuous.
For example, the label value intervals formed are [1, 5], [6.2, 10.4] and [11.3, 20]. That is, for each label value interval formed, the minimum label value, under the basic attribute label, of the sample users included in the interval is used as the lower bound of the interval, and the maximum such label value is used as the upper bound.
In addition, for the convenience of calculation, each formed tag value interval can be expanded, and two adjacent tag value intervals which are not continuous in numerical value are adjusted to be continuous in numerical value.
For example, the formed label value intervals are expanded so that the shared boundary of two adjacent intervals takes the average of their original boundary values; the expanded label value intervals are then [1, 5.6], (5.6, 10.85] and (10.85, 20].
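The expansion just described can be sketched as follows. The function name is hypothetical, and the sketch returns plain (lower, upper) pairs without tracking which ends are open or closed.

```python
def expand_intervals(intervals):
    """Make adjacent intervals meet at the mean of their facing boundaries.

    intervals: list of (lower, upper) pairs sorted by lower bound.
    """
    expanded = [list(interval) for interval in intervals]
    for left, right in zip(expanded, expanded[1:]):
        midpoint = (left[1] + right[0]) / 2
        left[1] = midpoint      # move the left interval's upper bound
        right[0] = midpoint     # move the right interval's lower bound
    return [tuple(interval) for interval in expanded]

result = expand_intervals([(1, 5), (6.2, 10.4), (11.3, 20)])
```

Applied to the intervals from the example, the shared boundaries come out at about 5.6 and 10.85, matching the text.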
Here, the clustering algorithm is, for example, a partition-based, hierarchy-based, density-based, grid-based, model-based or fuzzy clustering method, and may be selected according to actual needs.
In the clustering, clustering is performed based on each label value of each sample user under the basic attribute label, and a plurality of label value intervals under the basic attribute label can be formed. Each tag value interval corresponds to at least one sample user. And then determining a preset number of target label value intervals from the plurality of label value intervals according to the sequence of the number of sample users corresponding to each label value interval from large to small.
Illustratively, in this case, the concentration CRn of base attribute tags satisfies the following formula:
Figure BDA0002037475060000101
In the formula, n is the preset number and can be set according to actual needs, for example 3, 4 or 5; $X$ represents the total number of sample users; $X_i$ represents the number of sample users corresponding to the i-th target label value interval.
In connection with S303, the specific method for determining the label weight corresponding to each label value under the basic attribute label provided in the embodiment of the present application further includes the following for the case that the label values of the basic attribute label are continuous values.
For example, when the label values of the basic attribute label are continuous values, the label weight corresponding to each label value interval under the basic attribute label satisfies the following formula:
$p(X_s) = CR_n \cdot X_s / X$;
where $p(X_s)$ represents the label weight corresponding to the s-th label value interval; $X_s$ represents the number of sample users corresponding to the s-th label value interval; $X$ represents the total number of sample users.
The number of sample users corresponding to the s-th label value interval is the total number of sample users whose label values fall within the s-th label value interval.
The label weight corresponding to each label value of the basic attribute label is the label weight corresponding to the label value interval to which that label value belongs.
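The continuous case can be sketched in the same way. The text leaves the clustering algorithm open, so the sketch below substitutes a simple gap-based split of the sorted values as a stand-in; that choice, the `max_gap` parameter and the function names are all assumptions. CRn is again assumed to be the share of sample users covered by the n largest intervals.

```python
def gap_clusters(values, max_gap):
    """Stand-in 1-D clustering: start a new cluster when the gap exceeds max_gap."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > max_gap:
            clusters.append([value])
        else:
            clusters[-1].append(value)
    return clusters

def interval_weights(values, max_gap, n=3):
    """Return [((lower, upper), label weight)] for each label value interval."""
    clusters = gap_clusters(values, max_gap)
    total = len(values)                                   # X
    sizes = sorted((len(c) for c in clusters), reverse=True)
    cr_n = sum(sizes[:n]) / total                         # assumed form of CRn
    # From the text: p(X_s) = CRn * X_s / X
    return [((c[0], c[-1]), cr_n * len(c) / total) for c in clusters]

# Hypothetical label values, e.g. order distances.
weights = interval_weights([1, 2, 3, 4, 10, 11, 12, 30], max_gap=3, n=2)
```

A label value then takes the weight of the interval it falls into, as stated above.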
Second: for the case where the attribute labels include reconstructed attribute labels. A reconstructed attribute label is constructed from basic attribute labels and is intended to mine the characteristics of the users to be classified more deeply. Reconstructed attribute labels are therefore not formed from every combination of different basic attribute labels; instead, combinations of different basic attribute labels are verified, and the reconstructed attribute labels that can represent deeper characteristics of the users to be classified are determined from those combinations.
Specifically, referring to fig. 6, an embodiment of the present application further provides a specific method for determining reconstructed attribute labels, including:
S601: Determine the concentration of each basic attribute label and the label weight corresponding to each label value under each basic attribute label.
Here, the determination manner of the label weight corresponding to each label value under each basic attribute label is similar to the method in the embodiment corresponding to fig. 3, and is not described herein again.
S602: Compare the label weight corresponding to each label value under each basic attribute label with a preset label weight threshold, and determine each basic attribute label whose label weights are all smaller than the label weight threshold as an attribute label to be verified.
Here, for example, a basic attribute label M includes 10 label values m1 to m10. If the label weights corresponding to m1 to m10 are all smaller than the preset label weight threshold k, the basic attribute label M is determined as an attribute label to be verified; if at least one of the label weights corresponding to m1 to m10 is greater than or equal to k, the basic attribute label M is not an attribute label to be verified.
S603: For every two attribute labels to be verified, combine the two into a to-be-selected reconstructed attribute label, and determine the concentration of the to-be-selected reconstructed attribute label according to the label values of a plurality of sample users under the two attribute labels to be verified.
S604: Compare the concentration of the to-be-selected reconstructed attribute label with the concentrations of the two attribute labels to be verified.
S605: If the concentration of the to-be-selected reconstructed attribute label is greater than the concentration of any attribute label to be verified in the two, determine the to-be-selected reconstructed attribute label as a selected reconstructed attribute label.
Specifically, for example, the two attribute labels to be verified are "automobile brand" and "whether the car appointment time is in rush hour". The label values under "automobile brand" are "AA brand", "BB brand" and "CC brand"; the label values under "whether the car appointment time is in rush hour" are "yes" and "no". The label values of the corresponding to-be-selected reconstructed attribute label "automobile brand and whether the car appointment time is in rush hour" are then: "AA brand and yes", "AA brand and no", "BB brand and yes", "BB brand and no", "CC brand and yes", "CC brand and no".
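Steps S602 to S605 can be sketched as a filter followed by a pairwise comparison. The sketch takes the concentrations as precomputed inputs and does not commit to how they are calculated. It reads S605 as requiring the combined concentration to exceed both single-label concentrations, which is one possible reading of the text. All names and numbers are hypothetical.

```python
from itertools import combinations

def labels_to_verify(label_weights, k):
    """S602: keep the labels whose label weights are all below threshold k."""
    return [name for name, weights in label_weights.items()
            if all(w < k for w in weights.values())]

def select_reconstructed(label_weights, concentration, pair_concentration, k):
    """S603 to S605: test every pair of attribute labels to be verified."""
    selected = []
    for a, b in combinations(labels_to_verify(label_weights, k), 2):
        combined = pair_concentration[frozenset((a, b))]
        # Assumed reading of S605: the pair must beat both single labels.
        if combined > concentration[a] and combined > concentration[b]:
            selected.append((a, b))
    return selected

# Hypothetical inputs.
label_weights = {
    "brand": {"AA": 0.10, "BB": 0.05},
    "rush_hour": {"yes": 0.12, "no": 0.08},
    "city": {"X": 0.60},                      # 0.60 >= k, so not to be verified
}
concentration = {"brand": 0.50, "rush_hour": 0.55, "city": 0.90}
pair_concentration = {frozenset(("brand", "rush_hour")): 0.70}
selected = select_reconstructed(label_weights, concentration, pair_concentration, k=0.2)
```

Here "city" is excluded in S602, and the remaining pair is selected because 0.70 exceeds both 0.50 and 0.55.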
(1) For the case that the label values under both attribute labels to be verified are discrete values, each label value under the resulting to-be-selected reconstructed attribute label is also a discrete value. The concentration of the to-be-selected reconstructed attribute label can be determined as follows:
Determine the number of sample users corresponding to each label value under the to-be-selected reconstructed attribute label, and determine the concentration of the to-be-selected reconstructed attribute label according to these numbers.
For example, in the above example, the number of sample users corresponding to each label value of the to-be-selected reconstructed attribute label "automobile brand and whether the car appointment time is in rush hour" can be determined from the label values of the sample users under "automobile brand" and under "whether the car appointment time is in rush hour".
In this case, the concentration is determined in a manner similar to that for basic attribute labels provided in the above embodiment, including:
Determine a preset number of target label values from the label values of the to-be-selected reconstructed attribute label, in descending order of the number of sample users corresponding to each label value. Then determine the concentration of the to-be-selected reconstructed attribute label according to the number of sample users corresponding to each target label value and the total number of sample users.
Correspondingly, the concentration CRn of the to-be-selected reconstructed attribute label satisfies the following formula:
Figure BDA0002037475060000121
In the formula, n is the preset number and can be set according to actual needs, for example 3, 4 or 5; $X$ represents the total number of sample users; $X_i$ represents the number of sample users corresponding to the i-th target label value.
(2) For the case that the label values under one of the two attribute labels to be verified are discrete values and those under the other are continuous values, the two attribute labels to be verified comprise a first attribute label to be verified, whose label values are discrete values, and a second attribute label to be verified, whose label values are continuous values.
The concentration of the to-be-selected reconstructed attribute label may be determined as follows:
Cluster the label values under the second attribute label to be verified with a preset clustering algorithm, based on each sample user's label value under the second attribute label to be verified, to form a plurality of label value intervals under the second attribute label to be verified;
Determine the weight corresponding to each label value interval according to the number of sample users in each label value interval and the total number of sample users;
For each label value interval, determine the concentration corresponding to the interval according to the number of sample users corresponding to each label value, under the first attribute label to be verified, of the sample users in that interval;
Determine the concentration of the to-be-selected reconstructed attribute label according to the concentration and the weight corresponding to each label value interval.
Here, the concentration of the to-be-selected reconstructed attribute tag may be determined in a manner of performing weighted summation on the concentrations corresponding to each tag value interval. And the weight corresponding to each label value interval is the corresponding weighted weight in weighted summation.
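The weighted summation for case (2) can be sketched as follows. The label value intervals are passed in as given closed ranges rather than produced by a clustering step. The per-interval concentration is assumed to be the share of the interval's sample users covered by its n most frequent discrete label values, since the text does not spell out that formula. The names and data are hypothetical.

```python
from collections import Counter

def mixed_concentration(samples, intervals, n=3):
    """Weighted sum of per-interval concentrations.

    samples: list of (discrete_value, continuous_value), one per sample user.
    intervals: list of closed (lower, upper) label value intervals.
    """
    total = len(samples)
    result = 0.0
    for lower, upper in intervals:
        inside = [d for d, c in samples if lower <= c <= upper]
        if not inside:
            continue
        weight = len(inside) / total          # interval weight
        top = Counter(inside).most_common(n)
        # Assumed per-interval concentration: top-n share within the interval.
        concentration = sum(count for _, count in top) / len(inside)
        result += weight * concentration
    return result

# Hypothetical sample: brand (discrete) and order distance (continuous).
samples = [("A", 1), ("A", 2), ("B", 3), ("C", 4),
           ("B", 21), ("B", 22), ("B", 23), ("B", 24)]
value = mixed_concentration(samples, [(0, 10), (20, 30)], n=1)
```

With n = 1 the first interval contributes 0.5 × 0.5 and the second 0.5 × 1.0, giving 0.75.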
(3) For the case that the label values under both attribute labels to be verified are continuous values, the two attribute labels to be verified comprise a first attribute label to be verified and a second attribute label to be verified, and the label values under both of them are continuous values.
The concentration of the to-be-selected reconstructed attribute label can be determined as follows:
Cluster the label values under the first attribute label to be verified with a preset clustering algorithm, based on each sample user's label value under the first attribute label to be verified, to form a plurality of first label value intervals under the first attribute label to be verified;
Determine the weight corresponding to each first label value interval according to the number of sample users in each first label value interval and the total number of sample users;
For each first label value interval, cluster the label values under the second attribute label to be verified with a preset clustering algorithm, according to the label values, under the second attribute label to be verified, of the sample users in that first label value interval, to form a plurality of second label value intervals under the second attribute label to be verified;
Determine a preset number of target label value intervals from the plurality of second label value intervals, in descending order of the number of sample users corresponding to each second label value interval;
Determine the concentration corresponding to the first label value interval according to the number of sample users corresponding to each target label value interval and the total number of sample users.
Determine the concentration of the to-be-selected reconstructed attribute label according to the concentration and the weight corresponding to each first label value interval.
Here, the concentration of the to-be-selected reconstructed attribute label may be determined by a weighted summation of the concentrations corresponding to the first label value intervals, where the weight corresponding to each first label value interval is its weighting coefficient in the summation.
Through the embodiment corresponding to fig. 4, the concentration of each attribute tag to be selected can be determined. And then determining the selected reconstruction attribute tags from the reconstruction attribute tags to be selected according to the concentration ratio of the reconstruction attribute tags to be selected.
After each reconstructed attribute label is determined, the label weight corresponding to each label value under each reconstructed attribute label can be determined based on the concentration of each reconstructed attribute label. Specifically, the method comprises the following steps:
(1) For the case that the label values under both attribute labels to be verified that constitute the reconstructed attribute label are discrete values, the label values under the reconstructed attribute label are also discrete values, so one label weight can be determined for each label value of the reconstructed attribute label.
For example, of the two attribute labels to be verified constituting the reconstructed attribute label, the label values under one are A, B and C, and the label values under the other are D and E. The label values of the reconstructed attribute label are then:
A and D, A and E, B and D, B and E, C and D, C and E, and the label weights corresponding to these combinations are e1 to e6 respectively.
(2) For the case that, of the two attribute labels to be verified constituting the reconstructed attribute label, the label values under the first are discrete values and those under the second are continuous values, one label weight can be determined for each label value interval of the second attribute label to be verified under each label value of the first attribute label to be verified.
For example, the label value intervals determined under the second attribute label to be verified include (a1, a2), (a3, a4) and (a5, a6), and the label values under the first attribute label to be verified are A and B. The corresponding combinations are:
(a1, a2) and A, (a1, a2) and B, (a3, a4) and A, (a3, a4) and B, (a5, a6) and A, (a5, a6) and B.
These correspond to label weights e1 to e6 respectively.
(3) For the case that, of the two attribute labels to be verified constituting the reconstructed attribute label, the label values under both the first and the second are continuous values, each first label value interval has a plurality of second label value intervals corresponding to it. For different first label value intervals, the number of corresponding second label value intervals may differ, and the start and end values of those intervals may also differ.
Therefore, if there are α second label value intervals corresponding to any first label value interval M, each of those second label value intervals corresponds to one label weight.
Example three
The label weights corresponding to the label values under each attribute label are determined from the label values, under each attribute label, of sample users whose grouping has been completed. Some attribute labels are determined by user behavior, which is contingent and random, so the label weights determined from the sample users' label values deviate to some extent from the true values. To reduce this deviation, the third embodiment of the present application further provides another information matching method. On the basis of the foregoing embodiments, the label weights corresponding to the label values under each attribute label can be adjusted according to the result of grouping verification users and the actual grouping result of those verification users, thereby reducing the error between the label weights and the true values.
Specifically, referring to fig. 7, in the information matching method provided in the embodiment of the present application, a process of adjusting label weights respectively corresponding to label values under each attribute label includes:
S701: Acquire the label value of a verification user under at least one preset attribute label, the label value of each user in at least one user group under the at least one preset attribute label, and the actual grouping result of the verification user.
S702: Determine the relation weight between the verification user and each user in each user group according to the label value and corresponding label weight of the verification user under each attribute label, and the label value and corresponding label weight of the users in each user group under the at least one preset attribute label.
Here, the manner of determining the relation weight between the verification user and each user in each user group is similar to the manner of determining the relation weight between every two users to be classified in the first embodiment, and details are not repeated here.
S703: Determine the polymerization degree between the verification user and each user group according to the relation weights between the verification user and the users in that user group.
Here, for each user group, the relationship weights between the verified user and the users in the user group may be weighted and summed to obtain the aggregation degree between the verified user and the user group.
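As a small illustration of the weighted summation, the sketch below treats the weighting coefficients as an optional input and falls back to equal coefficients of 1.0 when none are given. The text does not say how the coefficients are chosen, so both the default and the function name are assumptions.

```python
def group_aggregation(relation_weights, coefficients=None):
    """Weighted sum of the relation weights between one user and a group's users."""
    if coefficients is None:
        # Assumed default: every user in the group counts equally.
        coefficients = [1.0] * len(relation_weights)
    return sum(c * w for c, w in zip(coefficients, relation_weights))

# Hypothetical relation weights between a verification user and three group members.
plain = group_aggregation([0.2, 0.4, 0.6])
weighted = group_aggregation([0.2, 0.4, 0.6], coefficients=[0.5, 0.5, 0.0])
```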
S704: When the number of user groups whose polymerization degree is greater than a preset polymerization degree threshold is greater than a preset number, determine the probability that the verification user belongs to each user group according to the label value of the verification user under each attribute label.
Here, the probability that the verification user belongs to each user group may be determined in the following manner, performed for each user group:
(1) For each attribute label M, determine first target users from the users in the user group according to the label value mi of the verification user under the attribute label; the label value of a first target user under the attribute label is the same as that of the verification user. According to the number of first target users in the user group and the total number of users in the user group, determine a first probability that the label value of a user in the user group under the attribute label is the same as the label value of the verification user under the attribute label, expressed as p(x|A).
(2) Determine the proportion of the users in the user group among the users in all user groups according to the number of users in the user group and the total number of users in all user groups, expressed as p(A).
(3) For each attribute label M, determine second target users from the users in all user groups according to the label value of the verification user under the attribute label; the label value of a second target user under the attribute label is the same as that of the verification user. According to the number of second target users in all user groups and the total number of users in all user groups, determine a second probability that the label value of a user in all user groups under the attribute label is the same as the label value of the verification user under the attribute label, expressed as p(x).
(4) Calculate the probability p(A|x) that the verification user belongs to the user group according to the first probability p(x|A), the proportion p(A) and the second probability p(x), where p(A|x) satisfies:
Figure BDA0002037475060000151
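The three quantities defined in (1) to (3) are the ingredients of Bayes' rule, so a plausible reading of the formula, which is given only as a figure, is $p(A|x) = p(x|A) \cdot p(A) / p(x)$. The sketch below applies that reading to a single attribute label. How the per-label results are combined across several attribute labels is not stated in this text and is not attempted. The function name and data are hypothetical.

```python
def group_probability(value, group, all_groups):
    """p(A|x) for one attribute label, assuming Bayes' rule.

    value: the verification user's label value under the attribute label.
    group: label values of the users in the user group A.
    all_groups: list of such lists, one per user group (including `group`).
    """
    everyone = [v for g in all_groups for v in g]
    p_x_given_a = group.count(value) / len(group)    # (1) first probability
    p_a = len(group) / len(everyone)                 # (2) proportion
    p_x = everyone.count(value) / len(everyone)      # (3) second probability
    if p_x == 0:
        return 0.0          # the value never occurs, so no group is supported
    return p_x_given_a * p_a / p_x                   # (4) assumed Bayes form

# Hypothetical user groups, described by one attribute label.
group_1 = ["AA", "AA", "BB"]
group_2 = ["BB", "BB", "BB", "CC", "AA"]
p1 = group_probability("AA", group_1, [group_1, group_2])
p2 = group_probability("AA", group_2, [group_1, group_2])
```

For a verification user with label value "AA", the two probabilities come out at about 2/3 and 1/3.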
S705: Adjust the label weights corresponding to the label values under each attribute label according to the probability that the verification user belongs to each user group and the actual grouping result of the verification user.
Here, the probability that the verification user belongs to each user group may be compared with a preset probability threshold. If the probability that the verification user belongs to a certain user group is greater than the probability threshold, the verification user is determined to belong to that user group. It is then checked whether this grouping result of the verification user is consistent with the actual grouping result. If they are consistent, the grouping result of the verification user is considered correct; if they are inconsistent, the grouping result is considered incorrect, and the label weights corresponding to the label values under each attribute label are adjusted.
Specifically, the following methods may be adopted to adjust the label weights corresponding to the label values under each attribute label:
(1) Take the verification user as a sample user, form a new sample user set together with the original sample users, and re-determine the label weights corresponding to the label values under each attribute label based on the sample users in the new sample user set, using the method provided in the second embodiment.
(2) For each attribute label, determine the label weight of the label value under the attribute label using the following formula:
Figure BDA0002037475060000152
where:
Figure BDA0002037475060000153
represents the label weight of label value A in the correct classification after adjustment;
Figure BDA0002037475060000154
represents the label weight of label value A in the correct classification before adjustment;
Figure BDA0002037475060000155
represents the label weight of label value A in the error classification after adjustment;
Figure BDA0002037475060000156
represents the sum of the numbers of samples having label value A in the error classification after adjustment;
Figure BDA0002037475060000157
represents the sum of the numbers of samples having label value A in the error classification before adjustment.
The third embodiment of the present application can thereby further improve the grouping precision of the information matching method.
Example four
Based on the same inventive concept, the fourth embodiment of the present application further provides another information matching method. Before the user to be classified is grouped, at least one pre-generated user group already exists; the user group can be obtained by using the information matching method provided in the foregoing embodiments.
Referring to fig. 8, the information matching method provided in the fourth embodiment of the present application includes:
S801: Obtaining a label value of a user to be classified under at least one preset attribute label, a label value of each user in at least one user group under at least one preset attribute label, and an actual grouping result of the user to be classified.
S802: Determining the relation weight between the user to be classified and each target user in each user group according to the label value and the corresponding label weight of the user to be classified under each attribute label and the label value and the corresponding label weight of the target user in each user group under the at least one preset attribute label.
S803: Determining the aggregation degree between the user to be classified and each user group according to the relation weight between the user to be classified and each user in each user group.
S804: When the number of user groups whose aggregation degree is greater than the preset aggregation degree threshold is greater than the preset number, determining the probability that the user to be classified belongs to each user group according to the label value of the user to be classified under each attribute label.
S805: Determining the grouping result of the user to be classified according to the probability that the user to be classified belongs to each user group.
S806: Matching service information for the user to be classified according to the grouping result of the user to be classified.
For the specific implementation of S801 to S804, refer to the embodiment corresponding to fig. 7; details are not repeated here.
According to the method and the device, the users can be grouped with higher precision based on the weight of the label of each user to be classified, so that the users to be classified can be matched with push information with higher pertinence, and the waste of network resources and push resources is reduced.
Based on the same inventive concept, the embodiments of the present application also provide an information matching device corresponding to the information matching method. Because the principle by which the device solves the problem is similar to that of the information matching method in the embodiments of the present application, the implementation of the device can refer to the implementation of the method, and repeated details are omitted.
Example five
Referring to fig. 9, which is a schematic view of an information matching apparatus provided in the fifth embodiment of the present application, the apparatus includes: a first obtaining module 91, a first determining module 92, a first grouping module 93 and a first matching module 94; wherein:
the first obtaining module 91 is configured to obtain tag values of a plurality of users to be classified under at least one preset attribute tag;
a first determining module 92, configured to determine a relationship weight between each two users to be classified according to the label value of each user to be classified under at least one attribute label acquired by the first acquiring module and the label weight corresponding to the label value of each user to be classified;
a first grouping module 93, configured to group the multiple users to be classified according to the magnitude of the relationship weight between every two users to be classified determined by the first determining module, so as to form at least one user group;
a first matching module 94, configured to match service information for each user group formed by the first grouping module.
In the embodiment of the present application, the relation weight between every two users to be classified is determined from the label values of the plurality of users to be classified under at least one preset attribute label and the label weights corresponding to those label values. The relation weight represents the degree of similarity between the two users to be classified. The plurality of users to be classified are then grouped according to the relation weight between every two of them to form at least one user group, and corresponding service information is matched for each user group. Compared with the prior art, which groups users by user profile, the embodiment of the present application groups users with higher precision based on the weights of the labels of each user to be classified, so that the user group selected during information pushing is more targeted and the waste of network resources and push resources is reduced.
The first determining module 92 is configured to determine the relationship weight between every two users to be classified in the following manner:
for every two users to be classified, executing:
determining a target attribute label from the attribute labels according to the label values of the two users to be classified under the attribute labels; the target attribute label is an attribute label under which the two users to be classified have the same label value, or under which their label values belong to the same label value interval;
and determining the relation weight between the two users to be classified according to the label weights corresponding to the label values of the two users to be classified under the target attribute labels.
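The pairwise step above can be sketched in Python as follows. The application states only that the relation weight is determined "according to" the label weights under the target attribute labels; summing those weights is an assumption made here for illustration, as are the function name, the data layout, and the sample labels. Continuous values are assumed to have been mapped to interval identifiers beforehand.

```python
def relation_weight(user_a, user_b, label_weights):
    """Sum the label weights over target attribute labels.

    A target attribute label is one under which both users hold the same
    label value (or the same interval identifier). Summation is an assumed
    way of combining the weights.
    """
    total = 0.0
    for label, value in user_a.items():
        if label in user_b and user_b[label] == value:
            total += label_weights.get(label, {}).get(value, 0.0)
    return total

# Hypothetical users and label weights.
user_a = {"city": "Beijing", "age_band": "20-30", "owns_car": "yes"}
user_b = {"city": "Beijing", "age_band": "30-40", "owns_car": "yes"}
label_weights = {
    "city": {"Beijing": 0.2},
    "age_band": {"20-30": 0.3},
    "owns_car": {"yes": 0.5},
}
weight_ab = relation_weight(user_a, user_b, label_weights)
```

Only "city" and "owns_car" are target attribute labels for this pair, so the age band contributes nothing.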
In an optional embodiment, the first grouping module 93 is configured to group the plurality of users to be classified according to the relation weight between every two users to be classified to form at least one user group in the following manner:
randomly selecting a target user to be classified from the users to be classified that have not yet been grouped, selecting another target user to be classified that has the largest relation weight with the first, and forming an initial group from the two selected target users to be classified;
determining a first aggregation degree according to the relation weight between the two target users to be classified in the initial group; determining a second aggregation degree according to the relation weight between every two of the ungrouped users to be classified other than the target users to be classified;
traversing each ungrouped user to be classified other than the target users to be classified and, for the currently traversed user to be classified, executing:
determining a third aggregation degree according to the relation weight between the currently traversed user to be classified and each target user to be classified in the initial group and the relation weight between every two target users to be classified in the initial group; determining a fourth aggregation degree according to the relation weight between every two of the ungrouped users to be classified other than the target users to be classified and the currently traversed user to be classified;
determining a first aggregation index according to the first aggregation degree and the second aggregation degree; determining a second aggregation index according to the third aggregation degree and the fourth aggregation degree, and detecting whether the second aggregation index is greater than the first aggregation index;
if so, adding the currently traversed user to be classified to the initial group as a new target user to be classified to form a new initial group;
and after the traversal is finished, taking the resulting initial group as a user group, treating the users to be classified in it as grouped, and returning to the step of randomly selecting a target user to be classified from the ungrouped users to be classified, until all users to be classified have been grouped.
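The grouping loop above can be sketched in Python as follows. This text does not give the formulas for the aggregation degree or the aggregation index, so the sketch assumes the mean pairwise relation weight as the aggregation degree and the sum of the two degrees (group plus remaining users) as the index; treating a single leftover user as its own group is also an assumption. All names and weights are hypothetical.

```python
import random
from itertools import combinations

def cohesion(members, w):
    """Assumed aggregation degree: mean pairwise relation weight of a set."""
    pairs = list(combinations(sorted(members), 2))
    if not pairs:
        return 0.0
    return sum(w[frozenset(p)] for p in pairs) / len(pairs)

def aggregation_index(group, rest, w):
    """Assumed index: cohesion of the group plus cohesion of the other users."""
    return cohesion(group, w) + cohesion(rest, w)

def group_users(users, w, seed=0):
    rng = random.Random(seed)
    ungrouped = set(users)
    groups = []
    while ungrouped:
        if len(ungrouped) == 1:          # assumption: a leftover user forms a group
            groups.append(set(ungrouped))
            break
        first = rng.choice(sorted(ungrouped))
        partner = max(sorted(ungrouped - {first}),
                      key=lambda u: w[frozenset((first, u))])
        group = {first, partner}         # the initial group
        for candidate in sorted(ungrouped - group):
            current = aggregation_index(group, ungrouped - group, w)
            enlarged = group | {candidate}
            proposed = aggregation_index(enlarged, ungrouped - enlarged, w)
            if proposed > current:       # second index exceeds first index
                group = enlarged
        groups.append(group)
        ungrouped -= group
    return groups

# Hypothetical relation weights: a-b and c-d are close, cross pairs are not.
weights = {frozenset(p): 0.1 for p in combinations("abcd", 2)}
weights[frozenset("ab")] = 0.9
weights[frozenset("cd")] = 0.8
result = group_users("abcd", weights)
```

With these weights the loop separates the users into the pairs a-b and c-d whichever user is drawn first, because adding a weakly related user lowers the index.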
In an alternative embodiment, the attribute tag comprises: a base attribute tag and/or a reconstructed attribute tag;
wherein the reconstructed attribute tag is constructed using the base attribute tag.
In an alternative embodiment, the apparatus further comprises: a tag weight determining module 95, configured to determine, for a case that the attribute tag includes a basic attribute tag, a tag weight corresponding to each tag value under each basic attribute tag in the following manner:
for each of the base attribute tags, performing:
obtaining label values of a plurality of sample users under the basic attribute labels;
determining the number of sample users corresponding to each label value under the basic attribute label;
determining the concentration of the basic attribute labels according to the number of sample users corresponding to each label value under the basic attribute labels; the concentration degree of the basic attribute label is used for representing the concentration degree of label values taken by different users under the basic attribute label;
and determining label weights corresponding to each label value under the basic attribute label according to the concentration of the basic attribute label, the number of sample users corresponding to each label value under the basic attribute label and the total number of the sample users.
In an optional implementation manner, for a case that the tag value of the basic attribute tag is a discrete value, the tag weight determining module 95 is configured to determine the concentration of the basic attribute tag according to the number of sample users respectively corresponding to each tag value under the basic attribute tag in the following manner:
determining a preset number of target label values from the label values under the basic attribute label according to the sequence of the number of sample users corresponding to each label value under the basic attribute label from large to small;
and determining the concentration of the basic attribute labels according to the number of the sample users respectively corresponding to each target label value and the total number of the sample users.
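For the discrete case, one way to read these two steps is as a "top-k share": the fraction of sample users covered by the k most frequent label values. The exact concentration formula is not given in this text, so the ratio below is an assumption, and the function name and sample data are hypothetical.

```python
from collections import Counter

def discrete_concentration(values, top_k):
    """Assumed concentration: share of sample users in the top_k label values."""
    if not values:
        raise ValueError("at least one sample user is required")
    counts = Counter(values)
    covered = sum(n for _, n in counts.most_common(top_k))
    return covered / len(values)

# Ten hypothetical sample users under one basic attribute label.
sample_values = ["a"] * 6 + ["b"] * 3 + ["c"]
concentration = discrete_concentration(sample_values, top_k=2)
```

Here the two most common values cover nine of the ten sample users.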
In an optional implementation manner, for a case that the tag values of the basic attribute tags are continuous values, the tag weight determining module 95 is configured to determine the concentration of the basic attribute tags according to the number of sample users corresponding to each tag value under the basic attribute tags in the following manner:
based on the label values of the sample users under the basic attribute labels, clustering the label values under the basic attribute labels by adopting a preset clustering algorithm to form a plurality of label value intervals under the basic attribute labels;
determining a preset number of target label value intervals from a plurality of label value intervals according to the sequence of the number of sample users corresponding to each label value interval under the basic attribute label from large to small;
and determining the concentration of the basic attribute labels according to the number of the sample users respectively corresponding to each target label value interval and the total number of the sample users.
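For the continuous case, the sketch below substitutes simple equal-width binning for the "preset clustering algorithm", which the text does not name, and then applies the same assumed top-k share to the resulting intervals. The binning choice, function names, and data are all illustrative assumptions.

```python
from collections import Counter

def interval_index(value, lo, hi, n_bins):
    """Map a value to an equal-width interval index in [0, n_bins - 1]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)          # the maximum falls in the last interval

def continuous_concentration(values, n_bins, top_k):
    """Assumed concentration: share of sample users in the top_k intervals."""
    lo, hi = min(values), max(values)
    counts = Counter(interval_index(v, lo, hi, n_bins) for v in values)
    covered = sum(n for _, n in counts.most_common(top_k))
    return covered / len(values)

# Hypothetical label values of seven sample users.
sample_values = [0, 1, 2, 3, 4, 50, 100]
concentration = continuous_concentration(sample_values, n_bins=4, top_k=1)
```

Five of the seven values fall into the first interval, so the concentration under this reading is 5/7.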
In an optional embodiment, the tag weight determining module 95 is configured to determine, according to the concentration of the basic attribute tags, the number of sample users respectively corresponding to each tag value under the basic attribute tags, and the total number of the sample users, a tag weight respectively corresponding to each tag value by:
determining the label weight corresponding to each label value interval under the basic attribute label according to the concentration of the basic attribute label, the number of sample users corresponding to each label value interval under the basic attribute label, and the total number of the sample users;
and the label weight corresponding to each label value of the basic attribute label is the label weight corresponding to the label value interval to which each label value belongs.
In an optional embodiment, the tag weight determining module 95 is further configured to, for a case that the attribute tag includes a reconstructed attribute tag, determine the reconstructed attribute tag by:
determining the concentration of each basic attribute label and the label weight corresponding to each label value under each basic attribute label;
comparing the label weight corresponding to each label value under each basic attribute label with a preset label weight threshold, and determining each basic attribute label whose label weight for every label value is smaller than the label weight threshold as an attribute label to be verified;
for every two attribute tags to be verified, forming a candidate reconstructed attribute tag from the two attribute tags to be verified, and determining the concentration of the candidate reconstructed attribute tag according to the tag values of a plurality of sample users under the two attribute tags to be verified;
comparing the concentration of the candidate reconstructed attribute tag with the concentrations of the two attribute tags to be verified;
and if the concentration of the candidate reconstructed attribute tag is greater than the concentration of either of the two attribute tags to be verified, determining the candidate reconstructed attribute tag as a selected reconstructed attribute tag.
In an alternative embodiment, the apparatus further comprises a verification module 96 for:
obtaining a label value of a verification user under at least one preset attribute label, a label value of each user in at least one user group under at least one preset attribute label, and an actual grouping result of the verification user;
determining a relationship weight between the verification user and each user in each user group according to the label value and the corresponding label weight of the verification user under each attribute label, and the label value and the corresponding label weight of the user in each user group under at least one preset attribute label;
determining the aggregation degree between the verification user and each user group according to the relationship weight between the verification user and each user in each user group;
when the number of user groups whose aggregation degree is greater than a preset aggregation degree threshold is greater than a preset number, determining the probability that the verification user belongs to each user group according to the label value of the verification user under each attribute label;
and adjusting the label weight corresponding to the label value under each attribute label according to the probability that the verification user belongs to each user group and the actual grouping result of the verification user.
In an alternative embodiment, the verification module 96 is configured to determine the aggregation degree between the verification user and a user group according to the relationship weight between the verification user and each user in the user group in the following manner:
performing a weighted summation of the relationship weights between the verification user and the users in the user group to obtain the aggregation degree between the verification user and the user group.
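The weighted summation and the threshold test that decides whether the probability step is needed can be sketched in Python as follows. The summation coefficients are not specified in this text, so equal coefficients are assumed by default; the function names, thresholds, and values are hypothetical.

```python
def aggregation_degree(relation_weights, coefficients=None):
    """Weighted sum of the relation weights between one user and a group.

    Equal coefficients (a plain average) are assumed when none are given.
    """
    if not relation_weights:
        raise ValueError("the user group must contain at least one user")
    if coefficients is None:
        coefficients = [1.0 / len(relation_weights)] * len(relation_weights)
    return sum(c * w for c, w in zip(coefficients, relation_weights))

def needs_probability_step(degrees_by_group, degree_threshold, preset_number):
    """True when more than preset_number groups exceed the degree threshold."""
    above = [g for g, d in degrees_by_group.items() if d > degree_threshold]
    return len(above) > preset_number

# Hypothetical relation weights between a verification user and three groups.
degrees = {
    "group_1": aggregation_degree([0.4, 0.6]),
    "group_2": aggregation_degree([0.3, 0.6]),
    "group_3": aggregation_degree([0.1, 0.1]),
}
ambiguous = needs_probability_step(degrees, degree_threshold=0.4, preset_number=1)
```

In this example two groups exceed the threshold, so the user cannot be assigned by aggregation degree alone and the probability step would be used.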
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
Example six
An embodiment of the present application further provides a computer device 100. Fig. 10 is a schematic structural diagram of the computer device 100, which includes a processor 11, a memory 12, and a bus 13. The memory 12 stores machine-readable instructions executable by the processor 11 (for example, the execution instructions corresponding to the first obtaining module 91, the first determining module 92, the first grouping module 93, and the first matching module 94 of the apparatus in fig. 9). When the computer device 100 runs, the processor 11 communicates with the memory 12 through the bus 13, and the machine-readable instructions, when executed by the processor 11, perform the following process:
obtaining label values of a plurality of users to be classified under at least one preset attribute label respectively;
determining a relation weight between every two users to be classified according to the label value of each user to be classified under at least one attribute label and the label weight corresponding to the label value of each user to be classified;
grouping a plurality of users to be classified according to the relation weight between every two users to be classified to form at least one user group;
and for each user group, matching service information for the user group.
In a possible implementation manner, in the instructions executed by the processor 11, the determining, according to the label value of each user to be classified under at least one attribute label and the label weight corresponding to the label value of each user to be classified, a relationship weight between each two users to be classified includes:
for every two users to be classified, executing:
determining a target attribute label from the attribute labels according to the label values of the two users to be classified under the attribute labels; the target attribute label is an attribute label under which the two users to be classified have the same label value, or under which their label values belong to the same label value interval;
and determining the relation weight between the two users to be classified according to the label weights corresponding to the label values of the two users to be classified under the target attribute labels.
In a possible implementation manner, in the instructions executed by the processor 11, grouping the plurality of users to be classified according to the relation weight between every two users to be classified to form at least one user group includes:
randomly selecting a target user to be classified from the users to be classified that have not yet been grouped, selecting another target user to be classified that has the largest relation weight with the first, and forming an initial group from the two selected target users to be classified;
determining a first aggregation degree according to the relation weight between the two target users to be classified in the initial group; determining a second aggregation degree according to the relation weight between every two of the ungrouped users to be classified other than the target users to be classified;
traversing each ungrouped user to be classified other than the target users to be classified and, for the currently traversed user to be classified, executing:
determining a third aggregation degree according to the relation weight between the currently traversed user to be classified and each target user to be classified in the initial group and the relation weight between every two target users to be classified in the initial group; determining a fourth aggregation degree according to the relation weight between every two of the ungrouped users to be classified other than the target users to be classified and the currently traversed user to be classified;
determining a first aggregation index according to the first aggregation degree and the second aggregation degree; determining a second aggregation index according to the third aggregation degree and the fourth aggregation degree, and detecting whether the second aggregation index is greater than the first aggregation index;
if so, adding the currently traversed user to be classified to the initial group as a new target user to be classified to form a new initial group;
and after the traversal is finished, taking the resulting initial group as a user group, treating the users to be classified in it as grouped, and returning to the step of randomly selecting a target user to be classified from the ungrouped users to be classified, until all users to be classified have been grouped.
In a possible implementation, in the instructions executed by the processor 11, the attribute tag includes: a base attribute tag and/or a reconstructed attribute tag;
wherein the reconstructed attribute tag is constructed using the base attribute tag.
In a possible embodiment, in the instruction executed by the processor 11, for a case that the attribute tag includes a basic attribute tag, the following manner is adopted to determine a tag weight corresponding to each tag value under each basic attribute tag:
for each of the base attribute tags, performing:
obtaining label values of a plurality of sample users under the basic attribute labels;
determining the number of sample users corresponding to each label value under the basic attribute label;
determining the concentration of the basic attribute labels according to the number of sample users corresponding to each label value under the basic attribute labels; the concentration degree of the basic attribute label is used for representing the concentration degree of label values taken by different users under the basic attribute label;
and determining label weights corresponding to each label value under the basic attribute label according to the concentration of the basic attribute label, the number of sample users corresponding to each label value under the basic attribute label and the total number of the sample users.
In one possible embodiment, the instructions executed by the processor 11, for a case that the tag value of the basic attribute tag is a discrete value, the determining the concentration of the basic attribute tag according to the number of sample users respectively corresponding to each tag value under the basic attribute tag includes:
determining a preset number of target label values from the label values under the basic attribute label according to the sequence of the number of sample users corresponding to each label value under the basic attribute label from large to small;
and determining the concentration of the basic attribute labels according to the number of the sample users respectively corresponding to each target label value and the total number of the sample users.
In one possible embodiment, the instructions executed by the processor 11, for a case that the tag values of the basic attribute tag are continuous values, the determining the concentration of the basic attribute tag according to the number of sample users respectively corresponding to each tag value under the basic attribute tag includes:
based on the label values of the sample users under the basic attribute labels, clustering the label values under the basic attribute labels by adopting a preset clustering algorithm to form a plurality of label value intervals under the basic attribute labels;
determining a preset number of target label value intervals from a plurality of label value intervals according to the sequence of the number of sample users corresponding to each label value interval under the basic attribute label from large to small;
and determining the concentration of the basic attribute labels according to the number of the sample users respectively corresponding to each target label value interval and the total number of the sample users.
In a possible embodiment, the determining, by the processor 11, the label weight corresponding to each label value according to the concentration of the basic attribute labels, the number of sample users corresponding to each label value under the basic attribute labels, and the total number of the sample users includes:
determining the label weight corresponding to each label value interval under the basic attribute label according to the concentration of the basic attribute label, the number of sample users corresponding to each label value interval under the basic attribute label, and the total number of the sample users;
and the label weight corresponding to each label value of the basic attribute label is the label weight corresponding to the label value interval to which each label value belongs.
In one possible embodiment, the processor 11 executes instructions that determine the reconstructed attribute tag in the following manner for the case that the attribute tag includes a reconstructed attribute tag:
determining the concentration of each basic attribute label and the label weight corresponding to each label value under each basic attribute label;
comparing the label weight corresponding to each label value under each basic attribute label with a preset label weight threshold, and determining each basic attribute label whose label weight for every label value is smaller than the label weight threshold as an attribute label to be verified;
for every two attribute tags to be verified, forming a candidate reconstructed attribute tag from the two attribute tags to be verified, and determining the concentration of the candidate reconstructed attribute tag according to the tag values of a plurality of sample users under the two attribute tags to be verified;
comparing the concentration of the candidate reconstructed attribute tag with the concentrations of the two attribute tags to be verified;
and if the concentration of the candidate reconstructed attribute tag is greater than the concentration of either of the two attribute tags to be verified, determining the candidate reconstructed attribute tag as a selected reconstructed attribute tag.
In a possible implementation, the process performed by the instructions executed by the processor 11 further includes:
obtaining a label value of a verification user under at least one preset attribute label, a label value of each user in at least one user group under at least one preset attribute label, and an actual grouping result of the verification user;
determining a relationship weight between the verification user and each user in each user group according to the label value and the corresponding label weight of the verification user under each attribute label, and the label value and the corresponding label weight of the user in each user group under at least one preset attribute label;
determining the aggregation degree between the verification user and each user group according to the relationship weight between the verification user and each user in each user group;
when the number of user groups whose aggregation degree is greater than a preset aggregation degree threshold is greater than a preset number, determining the probability that the verification user belongs to each user group according to the label value of the verification user under each attribute label;
and adjusting the label weight corresponding to the label value under each attribute label according to the probability that the verification user belongs to each user group and the actual grouping result of the verification user.
In one possible embodiment, in the instructions executed by the processor 11, determining the aggregation degree between the verification user and a user group according to the relationship weight between the verification user and each user in the user group includes:
performing a weighted summation of the relationship weights between the verification user and the users in the user group to obtain the aggregation degree between the verification user and the user group.
The embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by the processor 11, the steps of the information matching method are performed.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the information matching method can be executed, which addresses the problem that the user group selected during information pushing is poorly targeted and wastes network resources and push resources, so that the selected user group is better targeted and the waste of network resources and push resources is reduced.
Example seven
Referring to fig. 11, which is a schematic view of an information matching apparatus provided in the seventh embodiment of the present application, the apparatus includes: a second obtaining module 111, a second determining module 112, a third determining module 113, a second grouping module 114 and a second matching module 115; wherein:
a second obtaining module 111, configured to obtain a tag value of a user to be classified under at least one preset attribute tag, and a tag value of each target user in at least one user group under at least one preset attribute tag;
a second determining module 112, configured to determine, according to the tag value and the corresponding tag weight of the user to be classified under each attribute tag obtained by the second obtaining module, and the tag value and the corresponding tag weight of the target user in each user group under the at least one preset attribute tag, a relationship weight between the user to be classified and each target user in each user group;
a third determining module 113, configured to determine, according to the relationship weight between the user to be classified and each target user in each user group determined by the second determining module, an aggregation degree between the user to be classified and each user group;
a second grouping module 114, configured to, when the number of user groups whose aggregation degree, as determined by the third determining module, is greater than the preset aggregation degree threshold is greater than the preset number, determine, according to the tag values of the user to be classified under the attribute tags, the probability that the user to be classified belongs to each user group, and determine, according to those probabilities, the grouping result of the user to be classified;
and a second matching module 115, configured to match service information for the user to be classified according to the grouping result of the user to be classified determined by the second grouping module.
According to the method and the device, the users can be grouped with higher precision based on the weight of the label of each user to be classified, so that the users to be classified can be matched with push information with higher pertinence, and the waste of network resources and push resources is reduced.
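As an illustration of what the second and third determining modules compute, the following Python sketch derives a relationship weight between two users from their tag values and then an aggregation degree between a user and a user group. This passage does not fix the formulas, so the sketch assumes that the relationship weight is the sum of the tag weights of the attribute tags on which the two users share a value, and that the aggregation degree is the mean relationship weight to the members of the group. The names `relation_weight` and `aggregation_degree`, and the sample tags and weights, are hypothetical.

```python
from typing import Dict, List, Tuple

User = Dict[str, str]                      # attribute tag -> tag value
TagWeights = Dict[Tuple[str, str], float]  # (attribute tag, tag value) -> tag weight


def relation_weight(a: User, b: User, weights: TagWeights) -> float:
    """Assumed formula: sum the weights of the tags on which both users share a value."""
    total = 0.0
    for tag, value in a.items():
        if b.get(tag) == value:
            total += weights.get((tag, value), 0.0)
    return total


def aggregation_degree(user: User, group: List[User], weights: TagWeights) -> float:
    """Assumed formula: mean relationship weight between the user and the group members."""
    if not group:
        return 0.0
    return sum(relation_weight(user, member, weights) for member in group) / len(group)


# Hypothetical tag weights and users, for illustration only.
weights = {
    ("city", "Beijing"): 0.2,
    ("city", "Chengdu"): 0.6,
    ("car", "yes"): 0.5,
    ("car", "no"): 0.1,
}
new_user = {"city": "Chengdu", "car": "yes"}
group_a = [{"city": "Chengdu", "car": "yes"}, {"city": "Chengdu", "car": "no"}]
group_b = [{"city": "Beijing", "car": "no"}]

degree_a = aggregation_degree(new_user, group_a, weights)
degree_b = aggregation_degree(new_user, group_b, weights)
```

Under these assumptions the new user is noticeably closer to `group_a` than to `group_b`, because rarer shared tag values carry larger weights.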
Example eight
An embodiment of the present application further provides a computer device 200. Fig. 12 is a schematic structural diagram of the computer device 200, which includes a processor 21, a memory 22, and a bus 23. The memory 22 stores machine-readable instructions executable by the processor 21 (for example, the execution instructions corresponding to the second obtaining module 111, the second determining module 112, the third determining module 113, the second grouping module 114, and the second matching module 115 of the apparatus in fig. 11). When the computer device 200 runs, the processor 21 communicates with the memory 22 through the bus 23, and the machine-readable instructions, when executed by the processor 21, perform the following processes:
acquiring a label value of a user to be classified under at least one preset attribute label and a label value of each target user in at least one user group under at least one preset attribute label;
determining the relation weight between the user to be classified and each target user in each user group according to the label value and the corresponding label weight of the user to be classified under each attribute label and the label value and the corresponding label weight of the target user in each user group under the at least one preset attribute label;
determining the aggregation degree between the user to be classified and each user group according to the relation weight between the user to be classified and each target user in each user group;
when the number of user groups whose aggregation degree is greater than a preset aggregation degree threshold is greater than a preset number, determining the probability that the user to be classified belongs to each user group according to the label value of the user to be classified under each attribute label;
determining the grouping result of the user to be classified according to the probability that the user to be classified belongs to each user group;
and matching service information for the user to be classified according to the grouping result of the user to be classified.
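The decision logic in the steps above can be sketched in Python as follows. The sketch covers only the control flow: how the probabilities are computed from the label values is not specified in this passage, so they are supplied by a hypothetical callable `group_probabilities`. Two further points are assumptions of the sketch, not statements of the method: the final choice is restricted to the groups that passed the threshold, and when the threshold does not yield more than the preset number of groups, the group with the highest aggregation degree is used. The `SERVICE_INFO` table is likewise hypothetical.

```python
from typing import Callable, Dict, Optional


def assign_group(
    degrees: Dict[str, float],        # user group id -> aggregation degree
    degree_threshold: float,          # preset aggregation degree threshold
    preset_number: int,               # preset number of candidate groups
    group_probabilities: Callable[[], Dict[str, float]],  # hypothetical model
) -> Optional[str]:
    """Return the id of the group chosen for the user, or None if no group qualifies."""
    candidates = [g for g, d in degrees.items() if d > degree_threshold]
    if len(candidates) > preset_number:
        # Too many plausible groups: fall back on per-group probabilities.
        probs = group_probabilities()
        return max(candidates, key=lambda g: probs.get(g, 0.0))
    if candidates:
        # Assumed fallback: take the group with the highest aggregation degree.
        return max(candidates, key=lambda g: degrees[g])
    return None


# Hypothetical mapping from user group to the service information to push.
SERVICE_INFO = {"A": "commuter discount", "B": "airport transfer offer"}

degrees = {"A": 0.85, "B": 0.70, "C": 0.10}
chosen = assign_group(degrees, 0.5, 1, lambda: {"A": 0.3, "B": 0.6, "C": 0.1})
matched_info = SERVICE_INFO.get(chosen)
```

Here two groups exceed the threshold while the preset number is one, so the probabilities decide and the user is placed in group `B`.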
The embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by the processor 21, the steps of the information matching method are performed.
Specifically, the storage medium may be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is executed, the information matching method described above is performed. This addresses the problem that the user group selected for information push is not targeted, which wastes network resources and push resources: the selected user group becomes targeted, and the waste of network resources and push resources is reduced.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to corresponding processes in the method embodiments, and are not described in detail in this application. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion of it that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (15)

1. An information matching method, comprising:
obtaining label values of a plurality of users to be classified under at least one preset attribute label respectively;
determining a relation weight between every two users to be classified according to the label value of each user to be classified under at least one attribute label and the label weight corresponding to the label value of each user to be classified;
grouping a plurality of users to be classified according to the relation weight between every two users to be classified to form at least one user group;
and for each user group, matching service information for the user group.
2. The information matching method according to claim 1, wherein the determining a relationship weight between each two users to be classified according to the label value of each user to be classified under at least one attribute label and the label weight corresponding to the label value of each user to be classified respectively comprises:
for every two users to be classified, executing:
determining a target attribute label from the attribute labels according to the label values of the two users to be classified under the attribute labels, wherein the target attribute label is an attribute label under which the two users to be classified have the same label value, or under which the label values of the two users to be classified fall within the same label value interval;
and determining the relation weight between the two users to be classified according to the label weights corresponding to the label values of the two users to be classified under the target attribute labels.
3. The information matching method according to claim 1, wherein the grouping a plurality of users to be classified according to the relationship weight between every two users to be classified to form at least one user group comprises:
randomly selecting one target user to be classified from the users to be classified that have not yet been classified, selecting another target user to be classified that has the largest relation weight with it, and forming an initial group from the two selected target users to be classified;
determining a first aggregation degree according to the relation weight between the two target users to be classified in the initial group; determining a second aggregation degree according to the relation weight between every two of the users to be classified that have not yet been classified, other than the target users to be classified;
traversing each user to be classified that has not yet been classified, other than the target users to be classified, and for the currently traversed user to be classified, executing:
determining a third aggregation degree according to the relation weight between the currently traversed user to be classified and each target user to be classified in the initial group and the relation weight between every two target users to be classified in the initial group; determining a fourth aggregation degree according to the relation weight between every two of the users to be classified that have not yet been classified, other than the target users to be classified and the currently traversed user to be classified;
determining a first aggregation index according to the first aggregation degree and the second aggregation degree; determining a second aggregation index according to the third aggregation degree and the fourth aggregation degree, and detecting whether the second aggregation index is greater than the first aggregation index;
if so, adding the currently traversed user to be classified to the initial group as a new target user to be classified, to form a new initial group;
and after the traversal is finished, taking the resulting initial group as a user group, marking the users to be classified in the initial group as classified, and returning to the step of randomly selecting a target user to be classified from the users to be classified that have not yet been classified, until all the users to be classified have been classified.
4. The information matching method according to claim 1, wherein the attribute tag includes: a base attribute tag and/or a reconstructed attribute tag;
wherein the reconstructed attribute tag is constructed using the base attribute tag.
5. The information matching method according to claim 4, wherein, for a case that the attribute tag includes a basic attribute tag, the following method is adopted to determine the tag weight corresponding to each tag value under each basic attribute tag:
for each of the base attribute tags, performing:
obtaining label values of a plurality of sample users under the basic attribute labels;
determining the number of sample users corresponding to each label value under the basic attribute label;
determining the concentration of the basic attribute labels according to the number of sample users corresponding to each label value under the basic attribute labels; the concentration degree of the basic attribute label is used for representing the concentration degree of label values taken by different users under the basic attribute label;
and determining label weights corresponding to each label value under the basic attribute label according to the concentration of the basic attribute label, the number of sample users corresponding to each label value under the basic attribute label and the total number of the sample users.
6. The information matching method according to claim 5, wherein, for a case that the label value of the basic attribute label is a discrete value, the determining the concentration of the basic attribute label according to the number of sample users respectively corresponding to each label value under the basic attribute label comprises:
determining a preset number of target label values from the label values under the basic attribute label according to the sequence of the number of sample users corresponding to each label value under the basic attribute label from large to small;
and determining the concentration of the basic attribute labels according to the number of the sample users respectively corresponding to each target label value and the total number of the sample users.
7. The information matching method according to claim 5, wherein, for a case that the tag values of the basic attribute tag are continuous values, the determining the concentration of the basic attribute tag according to the number of sample users respectively corresponding to each tag value under the basic attribute tag comprises:
based on the label values of the sample users under the basic attribute labels, clustering the label values under the basic attribute labels by adopting a preset clustering algorithm to form a plurality of label value intervals under the basic attribute labels;
determining a preset number of target label value intervals from a plurality of label value intervals according to the sequence of the number of sample users corresponding to each label value interval under the basic attribute label from large to small;
and determining the concentration of the basic attribute labels according to the number of the sample users respectively corresponding to each target label value interval and the total number of the sample users.
8. The information matching method according to claim 7, wherein the determining, according to the concentration of the basic attribute labels, the number of sample users respectively corresponding to each label value under the basic attribute label, and the total number of the sample users, the label weight respectively corresponding to each label value comprises:
determining the label weight corresponding to each label value interval under the basic attribute label according to the concentration of the basic attribute label, the number of sample users respectively corresponding to each label value interval under the basic attribute label, and the total number of the sample users;
and the label weight corresponding to each label value of the basic attribute label is the label weight corresponding to the label value interval to which each label value belongs.
9. The information matching method according to claim 4, wherein, for a case where the attribute tag includes a reconstructed attribute tag, the reconstructed attribute tag is determined in the following manner:
determining the concentration of each basic attribute label and the label weight corresponding to each label value under each basic attribute label;
comparing the label weight corresponding to each label value under each basic attribute label with a preset label weight threshold value, and determining the basic attribute label of which the label weight corresponding to each label value is smaller than the label weight threshold value as an attribute label to be verified;
for every two attribute tags to be verified, forming a candidate reconstructed attribute tag from the two attribute tags to be verified, and determining the concentration of the candidate reconstructed attribute tag according to the tag values of a plurality of sample users under each of the two attribute tags to be verified;
comparing the concentration of the candidate reconstructed attribute tag with the concentrations respectively corresponding to the two attribute tags to be verified;
and if the concentration of the candidate reconstructed attribute tag is greater than the concentration of any one of the two attribute tags to be verified, determining the candidate reconstructed attribute tag as a selected reconstructed attribute tag.
10. The information matching method according to claim 1, characterized in that the method further comprises:
obtaining a label value of a verification user under at least one preset attribute label, a label value of each user in at least one user group under at least one preset attribute label, and an actual grouping result of the verification user;
determining a relationship weight between the verification user and each user in each user group according to the label value and the corresponding label weight of the verification user under each attribute label, and the label value and the corresponding label weight of the user in each user group under at least one preset attribute label;
determining the aggregation degree between the verification user and each user group according to the relationship weight between the verification user and each user in each user group;
when the number of user groups whose aggregation degree is greater than a preset aggregation degree threshold is greater than a preset number, determining the probability that the verification user belongs to each user group according to the label value of the verification user under each attribute label;
and adjusting the label weight corresponding to the label value under each attribute label according to the probability that the verification user belongs to each user group and the actual grouping result of the verification user.
11. An information matching method, comprising:
acquiring a label value of a user to be classified under at least one preset attribute label and a label value of each target user in at least one user group under at least one preset attribute label;
determining the relation weight between the user to be classified and each target user in each user group according to the label value and the corresponding label weight of the user to be classified under each attribute label and the label value and the corresponding label weight of the target user in each user group under the at least one preset attribute label;
determining the aggregation degree between the user to be classified and each user group according to the relation weight between the user to be classified and each target user in each user group;
when the number of user groups whose aggregation degree is greater than a preset aggregation degree threshold is greater than a preset number, determining the probability that the user to be classified belongs to each user group according to the label value of the user to be classified under each attribute label;
determining the grouping result of the user to be classified according to the probability that the user to be classified belongs to each user group;
and matching service information for the user to be classified according to the grouping result of the user to be classified.
12. An information matching apparatus, comprising:
a first obtaining module, configured to obtain label values of a plurality of users to be classified under at least one preset attribute label;
a first determining module, configured to determine the relation weight between every two users to be classified according to the label value, obtained by the first obtaining module, of each user to be classified under at least one attribute label and the label weight corresponding to the label value of each user to be classified;
a first grouping module, configured to group the users to be classified according to the relationship weight between every two users to be classified determined by the first determining module, to form at least one user group;
a first matching module, configured to match service information for each user group formed by the first grouping module.
13. An information matching apparatus, comprising:
the second obtaining module is used for obtaining the label value of the user to be classified under at least one preset attribute label and the label value of each target user in at least one user group under at least one preset attribute label;
a second determining module, configured to determine, according to the tag value and the corresponding tag weight of the user to be classified under each attribute tag obtained by the second obtaining module, and the tag value and the corresponding tag weight of the target user in each user group under the at least one preset attribute tag, a relationship weight between the user to be classified and each target user in each user group;
a third determining module, configured to determine, according to the relationship weight between the user to be classified and each target user in each user group determined by the second determining module, an aggregation degree between the user to be classified and each user group;
a second grouping module, configured to, when the number of user groups whose aggregation degree, as determined by the third determining module, is greater than a preset aggregation degree threshold exceeds a preset number, determine, according to the tag value of the user to be classified under each attribute tag, the probability that the user to be classified belongs to each user group, and determine the grouping result of the user to be classified according to those probabilities;
and the second matching module is used for matching service information for the user to be classified according to the grouping result of the user to be classified determined by the second grouping module.
14. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the information matching method according to any one of claims 1 to 11.
15. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the information matching method according to any one of claims 1 to 11.
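
The grouping loop recited in claim 3 can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the claimed implementation: the claims do not give the formulas, so the aggregation degree of a set of users is assumed here to be the total pairwise relation weight inside the set, and the aggregation index is assumed to be the sum of the aggregation degree of the initial group and that of the remaining unclassified users. The function names and the sample weight matrix are hypothetical.

```python
import random
from itertools import combinations
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]  # symmetric matrix of pairwise relation weights


def aggregation_degree(members: Sequence[int], w: Matrix) -> float:
    """Assumed formula: total relation weight over all pairs inside the set."""
    return sum(w[i][j] for i, j in combinations(members, 2))


def aggregation_index(inside: Sequence[int], outside: Sequence[int], w: Matrix) -> float:
    """Assumed formula: degree of the group plus degree of the remaining users."""
    return aggregation_degree(inside, w) + aggregation_degree(outside, w)


def greedy_grouping(w: Matrix, seed: int = 0) -> List[List[int]]:
    rng = random.Random(seed)
    unclassified = set(range(len(w)))
    groups: List[List[int]] = []
    while unclassified:
        # Randomly pick a target user, then pair it with its strongest partner.
        first = rng.choice(sorted(unclassified))
        group = [first]
        others = sorted(unclassified - {first})
        if others:
            group.append(max(others, key=lambda u: w[first][u]))
        # Traverse the remaining unclassified users once.
        for u in sorted(unclassified - set(group)):
            rest = sorted(unclassified - set(group))
            first_index = aggregation_index(group, rest, w)
            second_index = aggregation_index(group + [u], [x for x in rest if x != u], w)
            if second_index > first_index:
                group.append(u)  # u becomes a new target user of the initial group
        groups.append(sorted(group))
        unclassified -= set(group)
    return groups


# Hypothetical relation weights: users 0-2 are mutually related, as are users 3-4.
w = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
groups = greedy_grouping(w)
```

With these assumed formulas, a user joins the initial group only when its relation weights to the group outweigh its relation weights to the users left outside, so the two blocks of the sample matrix are recovered whichever user is picked first.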
CN201910330274.XA 2019-04-23 2019-04-23 Information matching method and device Pending CN111831894A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910330274.XA CN111831894A (en) 2019-04-23 2019-04-23 Information matching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910330274.XA CN111831894A (en) 2019-04-23 2019-04-23 Information matching method and device

Publications (1)

Publication Number Publication Date
CN111831894A (en) 2020-10-27

Family

ID=72912478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910330274.XA Pending CN111831894A (en) 2019-04-23 2019-04-23 Information matching method and device

Country Status (1)

Country Link
CN (1) CN111831894A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106157083A (en) * 2015-04-14 2016-11-23 阿里巴巴集团控股有限公司 The method and apparatus excavating potential customers
WO2017080398A1 (en) * 2015-11-12 2017-05-18 阿里巴巴集团控股有限公司 Method and apparatus for dividing user group
CN107247786A (en) * 2017-06-15 2017-10-13 北京小度信息科技有限公司 Method, device and server for determining similar users
CN109165975A (en) * 2018-08-09 2019-01-08 平安科技(深圳)有限公司 Label recommendation method, device, computer equipment and storage medium
CN109245996A (en) * 2018-09-18 2019-01-18 平安科技(深圳)有限公司 Mail push method, device, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
许长清; 赵华东; 宋晓辉: "基于大数据的电力用户群体识别与分析方法研究", 郑州大学学报(理学版), vol. 48, no. 3, 17 October 2016 (2016-10-17) *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112581161A (en) * 2020-12-04 2021-03-30 上海明略人工智能(集团)有限公司 Object selection method and device, storage medium and electronic equipment
CN112581161B (en) * 2020-12-04 2024-01-19 上海明略人工智能(集团)有限公司 Object selection method and device, storage medium and electronic equipment
CN112529082A (en) * 2020-12-15 2021-03-19 建信金融科技有限责任公司 System portrait construction method, device and equipment
CN112732755A (en) * 2020-12-30 2021-04-30 招商局金融科技有限公司 Client grouping-based label value matching joint verification method and device and computer equipment
CN112732755B (en) * 2020-12-30 2024-03-22 招商局金融科技有限公司 Label value matching joint verification method and device based on customer grouping and computer equipment
CN115511014A (en) * 2022-11-23 2022-12-23 联仁健康医疗大数据科技股份有限公司 Information matching method, device, equipment and storage medium
CN115511014B (en) * 2022-11-23 2023-04-07 联仁健康医疗大数据科技股份有限公司 Information matching method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111831894A (en) Information matching method and device
CN110910180B (en) Information pushing method and device, electronic equipment and storage medium
CN112380859A (en) Public opinion information recommendation method and device, electronic equipment and computer storage medium
KR101565957B1 (en) Discovering spam merchants using product feed similarity
Lehmussola et al. Evaluating the performance of microarray segmentation algorithms
CN112380449B (en) Information recommendation method, model training method and related device
CN111966886A (en) Object recommendation method, object recommendation device, electronic equipment and storage medium
CN111061979A (en) User label pushing method and device, electronic equipment and medium
CN116310656B (en) Training sample determining method and device and computer equipment
CN115293332A (en) Method, device and equipment for training graph neural network and storage medium
CN113256007A (en) Multi-mode-oriented new product sales forecasting method and device
CN111159481A (en) Edge prediction method and device of graph data and terminal equipment
CN113327132A (en) Multimedia recommendation method, device, equipment and storage medium
US8577814B1 (en) System and method for genetic creation of a rule set for duplicate detection
CN111325614B (en) Recommendation method and device of electronic object and electronic equipment
Bhavani et al. Feature selection using correlation fractal dimension: Issues and applications in binary classification problems
CN111274471B (en) Information pushing method, device, server and readable storage medium
CN107391728B (en) Data mining method and data mining device
CN111833080A (en) Information pushing method and device, electronic equipment and computer-readable storage medium
CN113591881B (en) Intention recognition method and device based on model fusion, electronic equipment and medium
CN115393903A (en) Method, device and equipment for updating image base and storage medium
CN115146729A (en) Abnormal shop identification method and device, computer equipment and storage medium
CN113052222A (en) Feature binning method, electronic device and storage medium
CN111861538A (en) Information pushing method and device, electronic equipment and storage medium
CN112766995A (en) Article recommendation method and device, terminal device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination