CN115455298A - Target object determination method and device, electronic equipment and storage medium - Google Patents

Target object determination method and device, electronic equipment and storage medium

Info

Publication number
CN115455298A
Authority
CN
China
Prior art keywords
data set
target
classifier
dimension
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211189057.1A
Other languages
Chinese (zh)
Inventor
胡正迪
闻连臣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Bank of China
Priority to CN202211189057.1A
Publication of CN115455298A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a target object determining method, a target object determining device, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring an original characteristic data set comprising at least three target users, and constructing a trust relationship matrix based on the original characteristic data set; aggregating the original characteristic data set based on preset aggregation dimensions to obtain influence factors corresponding to the aggregation dimensions, and forming an influence factor set based on the influence factors; processing the original characteristic data set based on the trust relationship matrix to obtain an unmarked data set; the unlabeled data set is classified based on the target classifier and the set of impact factors to obtain a classification result, and at least one target item is determined based on the classification result. Based on the technical scheme, the classification of the users is completed according to the trust relationship among the users, and then the corresponding target object is determined based on the classification result, so that the target object can be recommended to the users more quickly, and the object recommendation accuracy is improved.

Description

Target object determination method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for determining a target object, an electronic device, and a storage medium.
Background
With the rapid development of computer technology, the server of the application program can recommend corresponding articles to the user according to the current hot spot information.
However, the existing item recommendation method recommends the hot item for the user according to the current hot spot, or directly recommends similar items for the user, and cannot perform personalized recommendation according to the requirements of the user, so that the accuracy of item recommendation is reduced.
Disclosure of Invention
The invention provides a target object determining method, a target object determining device, electronic equipment and a storage medium, which are used for determining a corresponding target object based on a classification result, so that the target object can be recommended to a user more quickly, and the object recommending accuracy is improved.
In a first aspect, the present invention provides a target item determination method, including: acquiring an original characteristic data set comprising at least three target users, and constructing a trust relationship matrix based on the original characteristic data set;
aggregating the original feature data set based on preset aggregation dimensions to obtain influence factors corresponding to the aggregation dimensions, and forming an influence factor set based on the influence factors; wherein the set of impact factors includes a similarity dimension, an activity dimension, and an attention dimension;
processing the original characteristic data set based on the trust relationship matrix to obtain an unmarked data set;
classifying the unlabeled data set based on the target classifier to obtain a classification result, and determining at least one target item based on the classification result; wherein the target classifier is trained based on the set of influence factors, the labeled data set, the set of confidence labels and the unlabeled data set in the training sample.
In a second aspect, an embodiment of the present invention further provides a target item determination apparatus, where the apparatus includes:
the trust relationship matrix construction module is used for acquiring an original characteristic data set comprising at least three target users and constructing a trust relationship matrix based on the original characteristic data set;
the influence factor set forming module is used for aggregating the original characteristic data set based on preset aggregation dimensions to obtain influence factors corresponding to the aggregation dimensions, and forming an influence factor set based on the influence factors; wherein the set of impact factors includes a similarity dimension, an activity dimension, and an attention dimension;
a data set acquisition module, configured to process the original feature data set based on the trust relationship matrix to obtain an unmarked data set;
the target article determining module is used for classifying the unmarked data set based on the target classifier to obtain a classification result and determining at least one target article based on the classification result; wherein the target classifier is trained based on the set of influence factors, the labeled data set, the set of confidence labels and the unlabeled data set in the training sample.
In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
one or more processors; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to implement a target item determination method according to any of the embodiments of the present invention.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the target item determination method according to any one of the embodiments of the present invention.
According to the technical scheme, the original characteristic data set comprising at least three target users is obtained, the trust relationship matrix is built based on the original characteristic data set, the original characteristic data set is aggregated based on the preset aggregation dimension to obtain the influence factors corresponding to the aggregation dimensions, the influence factor set is formed based on the influence factors, the original characteristic data set is processed based on the trust relationship matrix to obtain the unmarked data set, the unmarked data set is classified based on the target classifier to obtain the classification result, and at least one target object is determined based on the classification result. Based on the technical scheme, the classification of the users is completed according to the trust relationship between the users, and then the corresponding target object is determined based on the classification result, so that the target object can be recommended to the users more quickly, and the object recommendation accuracy is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a target item determination method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a trust relationship matrix provided by an embodiment of the present invention;
Fig. 3 is a flow chart of a method for determining a target item provided by an embodiment of the present invention;
Fig. 4 is a block diagram of a target item determining apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It is understood that before the technical solutions disclosed in the embodiments of the present invention are used, the type, the use range, the use scene, etc. of the personal information related to the present invention should be informed to the user and authorized by the user in a proper manner according to the relevant laws and regulations.
For example, in response to receiving an active request from a user, a prompt message is sent to the user to state explicitly that the requested operation will require the acquisition and use of the user's personal information. The user can then decide, according to the prompt information, whether to provide personal information to the software or hardware (such as an electronic device, an application program, a server, or a storage medium) that executes the operations of the technical solution of the present invention.
As an optional but non-limiting implementation, the prompt information sent in response to the user's active request may be presented, for example, as text in a pop-up window. The pop-up window may also carry a selection control with which the user chooses "agree" or "disagree" to providing personal information to the electronic device.
It is understood that the above notification and user authorization process is only illustrative and not limiting, and other ways of satisfying relevant laws and regulations may be applied to the implementation of the present disclosure.
It will be appreciated that the data involved in the subject technology, including but not limited to the data itself, the acquisition or use of the data, should comply with the requirements of the corresponding laws and regulations and related regulations.
Example one
Fig. 1 is a schematic flowchart of a target item determining method according to an embodiment of the present invention. This embodiment is applicable to the case in which an impact factor set and an unlabeled data set are determined from the feature data of users, the unlabeled data set is classified by a target classifier to obtain a classification result, and a target item is determined based on the classification result.
As shown in fig. 1, the method includes:
S110, obtaining an original characteristic data set comprising at least three target users, and constructing a trust relationship matrix based on the original characteristic data set.
The target user may be a user whose feature data needs to be obtained; for example, a user whose features need to be collected while using application A may be a target user. The original feature data set can be understood as a data set composed of the feature data of the target users. The feature data may be data generated when a user uses the application program, or data generated by interaction between users, such as a user's historical item click data or interaction data between users. The trust relationship matrix may be understood as a matrix constructed based on the trust relationships between users.
Specifically, the feature data set of the target users may be retrieved from a preset database. For example, all data stored in the database may be obtained and used directly as the original feature data set; alternatively, feature data corresponding to a preset data type may be obtained from the database according to a preset screening manner and used as the original feature data set, where the preset data type may be a data type related to the trust relationship. After the original feature data set is obtained, a trust relationship matrix between users can be constructed based on the data in it. Trust can be divided into three states, namely trust, distrust, and an uncertain trust relationship, which are represented by 1, -1, and 0 respectively, so that the trust relationship matrix between the users can be constructed.
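The three-state encoding described above can be illustrated with a minimal Python sketch. The function name `build_trust_matrix`, the `statements` input format, and the convention that rows are the trusting user and columns the trusted user are assumptions made for illustration; the source does not specify how the matrix is stored.

```python
from typing import Dict, List, Tuple

# Trust states as described in the text: trust, distrust, uncertain.
TRUST, DISTRUST, UNKNOWN = 1, -1, 0

def build_trust_matrix(users: List[str],
                       statements: Dict[Tuple[str, str], int]) -> List[List[int]]:
    """Return a matrix t where t[i][j] is the trust state of users[i] toward users[j].

    `statements` is a hypothetical input: (truster, trustee) -> state.
    Pairs that are not listed default to UNKNOWN (0).
    """
    index = {user: i for i, user in enumerate(users)}
    size = len(users)
    matrix = [[UNKNOWN] * size for _ in range(size)]
    for (truster, trustee), state in statements.items():
        if state not in (TRUST, DISTRUST, UNKNOWN):
            raise ValueError(f"invalid trust state: {state}")
        matrix[index[truster]][index[trustee]] = state
    return matrix

# Illustrative data only: A trusts B, B distrusts C.
users = ["A", "B", "C"]
statements = {("A", "B"): TRUST, ("B", "C"): DISTRUST}
trust_matrix = build_trust_matrix(users, statements)
```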
S120, aggregating the original characteristic data set based on preset aggregation dimensions to obtain influence factors corresponding to the aggregation dimensions, and forming an influence factor set based on the influence factors.
Wherein the set of impact factors includes a similarity dimension, an activity dimension, and an attention dimension. The preset aggregation dimension may be understood as a dimension, set in advance, along which the data is aggregated; for example, information such as the similarity, activity, and attention of users may be obtained by aggregating the users' historical data. The influence factor may be a factor for representing a trust relationship between users, and accordingly, the influence factor set may be understood as a set of a plurality of influence factors. The similarity dimension may be the similarity between users determined from the feature data in the original feature data set; correspondingly, the activity dimension may be the activity information of the users obtained from the original feature data set, and the attention dimension may be understood as the attention between users determined from the original feature data set.
Specifically, the data in the original feature data set may be aggregated based on a preset aggregation dimension; for example, different types of feature data may be aggregated according to the similarity of the feature data, or the data in the original feature data set may be aggregated based on the relationships between different users. After the aggregation is completed, the influence factors corresponding to the aggregation dimensions are obtained and combined into a set of influence factors. For example, the influence factors obtained for the similarity dimension, the activity dimension, and the attention dimension may be combined into the corresponding influence factor set, and the trust relationship between users may then be further determined through the influence factor set.
On the basis of the above technical solution, the aggregating the original feature data set based on the preset aggregation dimension to obtain the influence factors corresponding to each aggregation dimension includes: acquiring historical article data of at least two target users; and determining the similarity between the at least two target users based on the historical item data, and taking the similarity as an influence factor corresponding to the similar dimension.
The historical item data may be item data purchased by the target user in the historical time, or item data clicked by the target user in a preset time period. The similarity may be understood as data indicating a degree of similarity between at least two target users.
Specifically, historical item data of at least two target users are obtained from the original feature data set, the similarity between the at least two target users is determined based on their historical item data, and the similarity is used as the influence factor corresponding to the similarity dimension. For example, the similarity between users can be determined by calculating the Jaccard similarity coefficient between them, as follows:
$$\mathrm{jcd}(u, v) = \frac{|I_u \cap I_v|}{|I_u \cup I_v|}$$
where $I_u$ and $I_v$ represent the sets of products purchased by user u and user v, respectively, $I_u \cap I_v$ is the intersection of the products purchased by user u and user v, $I_u \cup I_v$ is the union of the products purchased by user u and user v, and $\mathrm{jcd}(u, v)$ represents the Jaccard similarity coefficient between user u and user v.
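As a minimal sketch of the similarity factor, the Jaccard coefficient of two users' purchased-item sets can be computed as below. The function name `jaccard` and the handling of two empty purchase histories (returning 0.0) are assumptions; the source does not address the empty case.

```python
from typing import Set

def jaccard(items_u: Set[str], items_v: Set[str]) -> float:
    """Jaccard coefficient: size of the intersection divided by size of the union."""
    union = items_u | items_v
    if not union:
        # Assumption: two users with no purchase history are treated as dissimilar.
        return 0.0
    return len(items_u & items_v) / len(union)

# Illustrative item identifiers only.
similarity = jaccard({"fund_a", "fund_b", "fund_c"}, {"fund_b", "fund_c", "fund_d"})
```

Here the two users share two of four distinct items, so the coefficient is 0.5.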
On the basis of the above technical solution, the aggregating the original feature data set based on the preset aggregation dimension to obtain the influence factors corresponding to each aggregation dimension includes: acquiring historical evaluation data of a current target user; if the historical evaluation data is larger than a preset quantity threshold, determining the activity of the current target user to be a preset value; if the historical evaluation data are smaller than a preset quantity threshold value, determining the activity of the current user based on the historical evaluation data and the quantity threshold value, and taking the activity as an influence factor corresponding to the active dimension.
The historical evaluation data may be the target user's evaluation data on commodities within a preset time, and it can be understood that the preset time may be set according to requirements. The quantity threshold may be understood as a preset number of evaluations for an active user. The activity may be data used to assess how active a target user is.
Specifically, historical evaluation data of the target user can be acquired from the original characteristic data set and compared with a preset quantity threshold. If the historical evaluation data of the target user reaches the quantity threshold, it is determined that the current target user meets the preset activity standard, and the activity of the current target user is set to a preset value; if the historical evaluation data of the target user does not reach the quantity threshold, the activity of the current target user is determined based on the historical evaluation data of the current target user and the quantity threshold. For example, when the historical evaluation data of the target user is greater than or equal to the preset quantity threshold, the activity of the target user is set to 1; if the historical evaluation data of the target user is smaller than the preset quantity threshold, the activity of the user is determined based on the ratio of the historical evaluation data to the quantity threshold. The activity can be calculated as follows:
$$l(v) = \begin{cases} 1, & N \ge N_n \\ \dfrac{N}{N_n}, & N < N_n \end{cases}$$
where $N$ is the historical evaluation data of the target user, $N_n$ is the quantity threshold, and $l(v)$ is the activity of user v.
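The activity rule can be sketched in a few lines of Python, assuming the historical evaluation data is an evaluation count. The function name `activity` and the rejection of a non-positive threshold are illustrative assumptions.

```python
def activity(num_evaluations: int, threshold: int) -> float:
    """Activity is 1 at or above the threshold, otherwise the ratio N / N_n."""
    if threshold <= 0:
        # Assumption: a non-positive threshold is treated as a configuration error.
        raise ValueError("threshold must be positive")
    if num_evaluations >= threshold:
        return 1.0
    return num_evaluations / threshold

# Illustrative values: 5 evaluations against a threshold of 10.
partial = activity(5, 10)
capped = activity(12, 10)
```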
On the basis of the above technical solution, the aggregating the original feature data set based on preset aggregation dimensions to obtain influence factors corresponding to the aggregation dimensions includes: acquiring the number of users with trust relationship with the current target user based on the trust relationship matrix; and taking the number of the users with trust relationship with the current target user as the attention of the current user, and taking the attention as an influence factor corresponding to the attention dimension.
Wherein, the attention may be used to indicate the degree to which other users trust the current user.
Specifically, the number of users having a trust relationship with the current target user is determined based on the trust relationship matrix and is used directly as the attention of the current user. For example, suppose there are users A, B, C, and D, and the trust matrix constructed from their original feature data set is as shown in Fig. 2. Based on Fig. 2, users B, C, and D all trust user A, so the number of users having a trust relationship with user A is 3 and the attention of user A is 3; users A and C both trust user B, so the number of users having a trust relationship with user B is 2 and the attention of user B is 2.
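The attention count can be sketched as a column count over the trust matrix. The matrix below is hypothetical: it encodes only the relationships named in the text (B, C, and D trust A; A and C trust B), treats every other pair as uncertain (0), and assumes rows are the trusting user and columns the trusted user. It is not a reproduction of Fig. 2.

```python
from typing import List

def attention(trust_matrix: List[List[int]], user: int) -> int:
    """Count the other users whose row holds a 1 (trust) in the given user's column."""
    return sum(1 for truster, row in enumerate(trust_matrix)
               if truster != user and row[user] == 1)

# Index order: A=0, B=1, C=2, D=3 (hypothetical matrix, see lead-in).
example_matrix = [
    [0, 1, 0, 0],  # A trusts B
    [1, 0, 0, 0],  # B trusts A
    [1, 1, 0, 0],  # C trusts A and B
    [1, 0, 0, 0],  # D trusts A
]
attention_a = attention(example_matrix, 0)
attention_b = attention(example_matrix, 1)
```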
S130, processing the original characteristic data set based on the trust relationship matrix to obtain an unmarked data set.
Wherein an unmarked data set may be understood as a data set for which no trust relationship can be established.
Specifically, the trust relationship may be transferred based on the trust relationship matrix. After the transfer is completed, the feature data having a trust relationship is extracted from the original feature data set as the marked data set, and the remaining data, which has not been marked with a trust relationship, is used as the unmarked data set. For example, if $t_{u,v} = 1$, the trust relationship between user u and user v is 1, and if $t_{v,x} = 1$, the trust relationship between user v and user x is 1; by the transfer property of trust, a trust relationship is then considered to exist between user u and user x, i.e. $t_{u,x} = t_{u,v} \cap t_{v,x}$.
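The transfer step and the resulting split can be sketched as follows. This is one possible reading, under stated assumptions: trust is transferred in a single pass, only uncertain (0) entries are filled in, and every pair with a non-zero state counts as marked. The function names `propagate_trust` and `split_pairs` are hypothetical.

```python
from typing import List, Tuple

def propagate_trust(t: List[List[int]]) -> List[List[int]]:
    """One pass of trust transfer: if u trusts v and v trusts x, mark u as trusting x."""
    size = len(t)
    result = [row[:] for row in t]
    for u in range(size):
        for v in range(size):
            if t[u][v] != 1:
                continue
            for x in range(size):
                # Only fill entries that are still uncertain; never overwrite -1 or 1.
                if x != u and t[v][x] == 1 and result[u][x] == 0:
                    result[u][x] = 1
    return result

def split_pairs(t: List[List[int]]) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Split ordered user pairs into marked (state != 0) and unmarked (state == 0)."""
    marked, unmarked = [], []
    for u in range(len(t)):
        for v in range(len(t)):
            if u != v:
                (marked if t[u][v] != 0 else unmarked).append((u, v))
    return marked, unmarked

# Illustrative chain: user 0 trusts user 1, user 1 trusts user 2.
initial = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
propagated = propagate_trust(initial)
marked, unmarked = split_pairs(propagated)
```

A full transitive closure would repeat the pass until no entry changes; whether the method intends one pass or a closure is not stated in the source.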
S140, classifying the unlabeled data set based on the target classifier and the influence factor set to obtain a classification result, and determining at least one target object based on the classification result.
The target classifier is obtained by training based on the influence factor set, the labeled data set, the trust label set (also rendered as the confidence label set), and the unlabeled data set in the training sample. A target classifier may be understood as a classification model for classifying unlabeled data sets. The classification result may be the result obtained after inputting unlabeled data to the target classifier. The target item may be understood as an item that needs to be pushed. The labeled data set may be a data set whose data has already been given labels. The trust label set may be understood as the set of trust labels that characterize the individual data items in the labeled data set.
Specifically, the unmarked data may be further screened through the impact factor set. For example, each impact factor is compared with a preset threshold; when the preset threshold condition is not met, the feature data corresponding to the current impact factor set is marked as untrusted, and the marked data is added to the marked data set and deleted from the unmarked data set. The unmarked data set is thereby updated based on the impact factor set, the updated unmarked data set is input to the target classification model to obtain a classification result, and the corresponding target object is determined based on the classification result. For example, the target classification model may divide users by risk tendency, with classification results such as conservative or aggressive. After the classification result is determined, at least one target object is determined for the target user corresponding to that result; for example, when the classification result is conservative, the corresponding target object is a conservative commodity.
On the basis of the above technical solution, before classifying the unlabeled dataset based on the target classifier to obtain a classification result, and determining at least one target article based on the classification result, the method includes: determining a trust label for partial data in the unmarked data set based on each influence factor in the set of influence factors; and updating the marked data set according to the trust label of the partial data in the unmarked data set to obtain a target data set so as to obtain the target classifier based on the target data set.
The trust label may be label information for representing a trust relationship between users, and may be represented by a vector between users. The target data set may be understood as a data set resulting from adding data after determining the trust tag to the marked data set.
Specifically, for data whose influence factors in the influence factor set are poor, the trust label is marked as distrust; this part of the data is deleted from the unmarked data set and added to the marked data set, and the processed marked data set is used as the target data set, so that the target classifier can be determined based on the target data set. For example, threshold information corresponding to each influence factor may be preset, and each influence factor may be compared with the corresponding threshold information; if none of the influence factors satisfies its preset threshold, it is determined that the sample data corresponding to the influence factor set cannot satisfy the requirement, the trust label is set as untrusted, and the data is added to the marked data set.
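The threshold screening can be sketched as below, reading the rule as: a sample is labeled untrusted (-1) only when every influence factor falls below its threshold. The function name `screen_unlabeled`, the dictionary layout, and the threshold values are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]

def screen_unlabeled(unlabeled: Dict[Pair, Dict[str, float]],
                     thresholds: Dict[str, float]):
    """Move samples whose factors all fall below their thresholds to the labeled side.

    Returns (newly_labeled, still_unlabeled); newly labeled samples get -1 (untrusted).
    """
    newly_labeled: Dict[Pair, int] = {}
    still_unlabeled: Dict[Pair, Dict[str, float]] = {}
    for pair, factors in unlabeled.items():
        if all(factors[name] < limit for name, limit in thresholds.items()):
            newly_labeled[pair] = -1
        else:
            still_unlabeled[pair] = factors
    return newly_labeled, still_unlabeled

# Illustrative factor values and thresholds only.
thresholds = {"similarity": 0.3, "activity": 0.5, "attention": 1}
unlabeled = {
    ("u1", "u2"): {"similarity": 0.1, "activity": 0.2, "attention": 0},
    ("u1", "u3"): {"similarity": 0.6, "activity": 0.2, "attention": 0},
}
newly_labeled, still_unlabeled = screen_unlabeled(unlabeled, thresholds)
```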
On the basis of the above technical solution, the obtaining the target classifier based on the target data set includes: inputting an unlabeled data set into a classifier to be trained, and acquiring an output result of the classifier to be trained; screening the output result based on the trust label set to obtain a target output result; and iterating the loss function of the classifier to be trained according to the target output result to obtain a target classifier.
The classifier to be trained may be a classifier constructed based on a target data set, and it should be noted that the classifier constructed based on the target data set is not yet trained, so that the classification performance of the classifier to be trained cannot meet the requirement of a user, and the classifier to be trained needs to be further trained to obtain the target classifier. The output result can be understood as a classification result obtained by classifying the unlabeled data set by the classifier to be trained. The target output result may be an output result determined after the output result is filtered based on the trust tag set.
Specifically, the processed unmarked data set can be input into the classifier to be trained to obtain a corresponding output result. Because the classifier to be trained has not yet been trained and cannot classify data accurately, its output result is inaccurate. Therefore, after the output result is obtained, it is screened according to the trust label set to obtain an accurate target output result, and the loss function of the classifier to be trained is solved iteratively based on the target output result to obtain the target classifier.
On the basis of the above technical solution, iterating the loss function of the classifier to be trained according to the target output result to obtain the target classifier includes: detecting the weight coefficients of the unlabeled data set and the labeled data set in the loss function; if the weight coefficient of the unlabeled data set is smaller than that of the labeled data set, increasing the weight coefficient of the unlabeled data set and then iterating the loss function again; and if the weight coefficient of the unlabeled data set is equal to that of the labeled data set, taking the current classifier as the target classifier.
Here, the weight coefficient can be understood as the share that each data set contributes during classifier training.
Specifically, while the loss function is being iterated, the weight coefficients of the unlabeled and labeled data sets in the loss function are monitored in real time. If the weight coefficient of the unlabeled data set is smaller than that of the labeled data set, it is increased and the loss function is iterated again. By continuously increasing the weight coefficient of the unlabeled data set, the two coefficients eventually become equal; this equality is the condition for ending the iteration, and the classifier obtained at that point is taken as the target classifier. Note that the target output result changes from one iteration to the next, so the target output result in the loss function must be adjusted dynamically, while the weight of the unlabeled data is continuously increased to raise its influence in training. When the weight of the unlabeled data equals that of the labeled data, iteration stops, and the final classifier is used to classify the unlabeled data set.
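The stopping rule described above can be sketched as a small loop. The text does not say how the unlabeled-data coefficient is increased, so the doubling factor, the function name `anneal_unlabeled_weight`, and the `refit` callback (standing in for one round of solving the loss function) are all assumptions made for illustration.

```python
from typing import Callable, List


def anneal_unlabeled_weight(c_l: float,
                            c_u_init: float,
                            refit: Callable[[float, float], None],
                            growth: float = 2.0) -> List[float]:
    """Re-solve the loss while raising the unlabeled weight up to c_l.

    c_l      -- weight coefficient of the labeled data set
    c_u_init -- starting weight coefficient of the unlabeled data set
    refit    -- hypothetical callback that re-solves the loss for (c_l, c_u)
    growth   -- assumed multiplicative increase per round (not in the source)
    """
    if not 0 < c_u_init <= c_l:
        raise ValueError("expected 0 < c_u_init <= c_l")
    if growth <= 1:
        raise ValueError("growth must be greater than 1")

    c_u = c_u_init
    history = []
    while True:
        refit(c_l, c_u)          # one iteration of the loss function
        history.append(c_u)
        if c_u == c_l:           # equal weights: stop, keep this classifier
            return history
        c_u = min(growth * c_u, c_l)  # raise the unlabeled weight, capped at c_l


calls = []
schedule = anneal_unlabeled_weight(1.0, 0.125, lambda cl, cu: calls.append((cl, cu)))
```

Capping the update at `c_l` guarantees that the equality test is eventually met exactly, so the loop always terminates.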
According to the technical solution of this embodiment, an original feature data set comprising at least three target users is acquired, and a trust relationship matrix is constructed based on the original feature data set; the original feature data set is aggregated based on preset aggregation dimensions to obtain the influence factor corresponding to each aggregation dimension, and an influence factor set is formed from the influence factors; the original feature data set is processed based on the trust relationship matrix to obtain an unlabeled data set; the unlabeled data set is classified based on the target classifier to obtain a classification result, and at least one target item is determined based on the classification result. With this technical solution, users are classified according to the trust relationships among them, and the corresponding target item is then determined based on the classification result, so that the target item can be recommended to users more quickly and the accuracy of item recommendation is improved.
Example two
Fig. 3 is a flowchart of a target item determination method according to an embodiment of the present invention. This embodiment further optimizes the target item determination method on the basis of the above embodiments; for the specific implementation, refer to the technical solution of this embodiment. Technical terms that are the same as or correspond to those in the above embodiments are not repeated here.
As shown in fig. 3, the method of this embodiment specifically includes:
Acquiring an original feature data set: the feature data set of the target users can be retrieved from a preset database. For example, all data stored in the database can be obtained and used directly as the original feature data set; alternatively, feature data matching a preset data type can be obtained from the database according to a preset screening method and used as the original feature data set, where the preset data type may be a predefined data type related to trust relationships.
Constructing a trust relationship matrix: specifically, the trust relationship matrix is established according to the user data features, and trust can be divided into three states: "1" represents trust, "-1" represents distrust, and "0" represents that the trust relationship is unknown. This description of the trust relationship not only matches the objective state of trust in a recommendation-system scenario, but also facilitates subsequent mining and recommendation.
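A minimal sketch of such a three-state matrix follows. It assumes that distrust is encoded as -1, matching the label set {-1, +1} used later for the classifier, and that the raw data can be reduced to (truster, trustee, value) triples; the function name `build_trust_matrix` and the sample users are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def build_trust_matrix(users: Sequence[str],
                       statements: Iterable[Tuple[str, str, int]]) -> List[List[int]]:
    """Build an m x m matrix whose entry [u][v] is 1, -1 or 0.

    1 means u trusts v, -1 means u distrusts v, and 0 means the
    relationship is unknown (the default for every pair).
    """
    index = {user: i for i, user in enumerate(users)}
    size = len(users)
    matrix = [[0] * size for _ in range(size)]
    for truster, trustee, value in statements:
        if value not in (1, -1, 0):
            raise ValueError(f"invalid trust value: {value}")
        matrix[index[truster]][index[trustee]] = value
    return matrix


users = ["u", "v", "x"]
statements = [("u", "v", 1), ("v", "x", 1), ("x", "u", -1)]
trust_matrix = build_trust_matrix(users, statements)
```

Most entries stay at 0 in practice, which is why the later steps speak of sparse trust relationships.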
Determining an unlabeled data set: specifically, trust is propagated within the trust matrix. $t_{u,v}=1$ indicates that the trust relationship between user u and user v is 1. If $t_{v,x}=1$, i.e., the trust relationship between user v and user x is 1, then by the transitivity of trust we consider that a trust relationship exists between user u and user x, i.e., $t_{u,x}=t_{u,v}\cap t_{v,x}$. Note that at this stage trust is propagated through only one intermediate entity, yielding a new set of trust relationships $D_k=\{(x_k,y_k),(x_{k+1},y_{k+1}),\ldots,(x_{k+l},y_{k+l})\}$ and the set of predicted trust labels
Figure BDA0003868620130000141
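The one-hop propagation rule can be sketched as follows. The sketch assumes the matrix layout from the construction step (entry [u][v] is 1, -1 or 0, with distrust encoded as -1) and only fills in pairs whose relationship is currently unknown; the function name `propagate_trust` and the sample matrix are hypothetical.

```python
from typing import Dict, List, Tuple


def propagate_trust(matrix: List[List[int]]) -> Dict[Tuple[int, int], int]:
    """Predict trust for unknown pairs through exactly one intermediate user.

    If u trusts v and v trusts x, and the relationship between u and x is
    unknown (0), the pair (u, x) receives the predicted trust label 1.
    """
    size = len(matrix)
    predicted = {}
    for u in range(size):
        for x in range(size):
            if u == x or matrix[u][x] != 0:
                continue  # skip self-pairs and already known relationships
            if any(matrix[u][v] == 1 and matrix[v][x] == 1 for v in range(size)):
                predicted[(u, x)] = 1
    return predicted


# Rows and columns are users 0, 1, 2: user 0 trusts 1, user 1 trusts 2,
# and user 2 distrusts 0.
matrix = [[0, 1, 0],
          [0, 0, 1],
          [-1, 0, 0]]
predicted_labels = propagate_trust(matrix)
```

In this example only the pair (0, 2) gains a predicted label, because user 1 is the single intermediate entity linking them.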
Determining a feature factor set: specifically, to make trust prediction more accurate, features among users are aggregated so that they are better suited to mining trust relationships. Feature factors can be aggregated along three dimensions: a similarity dimension, an activity dimension, and an attention dimension. For example, the Jaccard similarity coefficient may be used to calculate the similarity between users; the activity of a user is determined based on the user's historical evaluation data and a preset threshold; and the attention of a user is determined according to the trust data among users.
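The three factors can be sketched in a few lines. Several details are assumptions because the text leaves them open: activity below the threshold is taken as the ratio of the rating count to the threshold, the preset value is taken as 1.0, and attention counts incoming trust (how many users trust the given user). All function names are hypothetical.

```python
from typing import List, Set


def jaccard_similarity(items_u: Set[str], items_v: Set[str]) -> float:
    """Jaccard coefficient of two users' historical item sets."""
    union = items_u | items_v
    if not union:
        return 0.0
    return len(items_u & items_v) / len(union)


def activity(rating_count: int, threshold: int, preset: float = 1.0) -> float:
    """Preset value at or above the threshold, a scaled ratio below it (assumed)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if rating_count >= threshold:
        return preset
    return preset * rating_count / threshold


def attention(matrix: List[List[int]], user: int) -> int:
    """Number of users whose trust entry toward `user` is 1 (assumed direction)."""
    return sum(1 for row in matrix if row[user] == 1)


similarity = jaccard_similarity({"i1", "i2", "i3"}, {"i2", "i3", "i4"})
low_activity = activity(5, 20)
high_activity = activity(30, 20)
followers = attention([[0, 1, 0], [0, 0, 1], [-1, 0, 0]], 1)
```

For a pair of users, the three values would then be combined into the influence-factor vector that serves as the classifier input.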
Determining a target classifier: specifically, a labeled training set $D_l=\{(x_1,y_1),(x_2,y_2),\ldots,(x_l,y_l)\}$ and an unlabeled set $D_u=\{x_{l+1},x_{l+2},\ldots,x_{l+u}\}$ may be given, where $x_i$ represents the i-th sparse trust relationship (here, the i-th sparse trust relationship is the vector of influence factors of the i-th pair obtained by combining the users pairwise); $y_i\in\{-1,+1\}$ represents the trust label of the i-th user sparse trust relationship, where $l<u$, $l+u=k$, and $k=m\times m$. For example, assume that the i-th user sparse trust relationship is the trust relationship directed from user u to user v. The goal of the target classifier is to assign pseudo-labels to the unlabeled trust data in $D_u$:
Figure BDA0003868620130000142
It should be noted that the loss function of the classifier is as follows:
Figure BDA0003868620130000151
s.t. $y_i(w^{T}x_i+b)\geq 1-\xi_i,\quad i=1,2,\ldots,l,$
Figure BDA0003868620130000152
$\xi_i\geq 0,\quad i=1,2,\ldots,m$
where $\xi$ is a slack variable, and $C_l$ and $C_u$ are manually set weight coefficients used to balance the weights of the labeled data set and the unlabeled data set in the model.
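The objective and the second constraint survive here only as figure references. The visible constraints and the roles described for the slack variable and for $C_l$ and $C_u$ are consistent with the standard transductive SVM formulation, so, purely as an assumption and not as a transcription of the figures, the full problem may take a form like the following. Here $\hat{y}_i$ denotes the pseudo-labels assigned to the samples in $D_u$, and the upper index $l+u$ is used where the last constraint above writes $m$.

```latex
\begin{aligned}
\min_{w,\,b,\,\hat{y},\,\xi}\quad
  & \frac{1}{2}\lVert w\rVert_2^2
    + C_l \sum_{i=1}^{l} \xi_i
    + C_u \sum_{i=l+1}^{l+u} \xi_i \\
\text{s.t.}\quad
  & y_i\,(w^{T}x_i + b) \geq 1 - \xi_i, && i = 1, 2, \ldots, l, \\
  & \hat{y}_i\,(w^{T}x_i + b) \geq 1 - \xi_i, && i = l+1, \ldots, l+u, \\
  & \xi_i \geq 0, && i = 1, 2, \ldots, l+u .
\end{aligned}
```

Under this reading, raising $C_u$ toward $C_l$ makes margin violations on the pseudo-labeled samples count as much as violations on the labeled ones.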
First, unlabeled samples on which all influence factors in the influence factor set perform poorly can be labeled as untrusted, i.e., $y_i=-1$, and then added to the labeled data set $D_l$. During mining, we use the unlabeled trust relationships to assist in training the mining model. After obtaining the untrusted data, we use the labeled data set $D_l$ to train a weak classifier $SVM_l$, and use $SVM_l$ to predict the unlabeled data, obtaining the output result for the unlabeled data
Figure BDA0003868620130000153
Then we take the trust label set obtained by trust propagation in the earlier step
Figure BDA0003868620130000154
and intersect it with the predicted output result to obtain the target output result
Figure BDA0003868620130000155
The target output result
Figure BDA0003868620130000156
Figure BDA0003868620130000157
is then added into the iterative solution of the loss function. The target output result is adjusted during iteration, and the weight of the unlabeled data is continuously increased so as to increase its influence in the training process. Iteration stops when the unlabeled data and the labeled data have equal weight.
Determining a target item: specifically, the final classifier is used to classify the unlabeled data set $D_u=\{x_{l+1},x_{l+2},\ldots,x_{l+u}\}$ to obtain a classification result, and at least one target item is determined based on the classification result.
According to the technical solution of this embodiment, an original feature data set comprising at least three target users is acquired, and a trust relationship matrix is constructed based on the original feature data set; the original feature data set is aggregated based on preset aggregation dimensions to obtain the influence factor corresponding to each aggregation dimension, and an influence factor set is formed from the influence factors; the original feature data set is processed based on the trust relationship matrix to obtain an unlabeled data set; the unlabeled data set is classified based on the target classifier to obtain a classification result, and at least one target item is determined based on the classification result. With this technical solution, users are classified according to the trust relationships among them, and the corresponding target item is then determined based on the classification result, so that the target item can be recommended to users more quickly and the accuracy of item recommendation is improved.
Example three
Fig. 4 is a block diagram of a target item determination apparatus according to an embodiment of the present invention. The apparatus includes: a trust relationship matrix construction module 410, an influence factor set forming module 420, a data set acquisition module 430, and a target item determination module 440.
The trust relationship matrix construction module is used for acquiring an original characteristic data set comprising at least three target users and constructing a trust relationship matrix based on the original characteristic data set;
the influence factor set forming module is used for aggregating the original characteristic data set based on preset aggregation dimensions to obtain influence factors corresponding to the aggregation dimensions, and forming an influence factor set based on the influence factors; wherein the set of impact factors includes a similarity dimension, an activity dimension, and an attention dimension;
a data set acquisition module, configured to process the original feature data set based on the trust relationship matrix to obtain an unmarked data set;
a target item determination module, configured to classify the unlabeled data set based on the target classifier and the influence factor set to obtain a classification result, and determine at least one target item based on the classification result; wherein the target classifier is trained based on the influence factor set, the labeled data set, the trust label set, and the unlabeled data set in the training sample.
On the basis of the above technical solution, the influence factor set forming module is used for acquiring historical item data of at least two target users; and determining the similarity between the at least two target users based on the historical item data, and taking the similarity as the influence factor corresponding to the similarity dimension.
On the basis of the above technical solution, the influence factor set forming module is used for acquiring historical evaluation data of the current target user; if the amount of historical evaluation data is greater than a preset quantity threshold, determining the activity of the current target user to be a preset value; and if the amount of historical evaluation data is smaller than the preset quantity threshold, determining the activity of the current target user based on the historical evaluation data and the quantity threshold, and taking the activity as the influence factor corresponding to the activity dimension.
On the basis of the technical scheme, the influence factor set forming module is used for acquiring the number of users with trust relationships with the current target user based on the trust relationship matrix; and taking the number of the users with trust relationship with the current target user as the attention of the current user, and taking the attention as an influence factor corresponding to the attention dimension.
On the basis of the above technical solution, the apparatus further includes:
a target classifier determination module configured to determine a trust label for partial data in the unmarked data set based on each influence factor in the set of influence factors; and updating the marked data set according to the trust label of the partial data in the unmarked data set to obtain a target data set so as to obtain the target classifier based on the target data set.
On the basis of the technical scheme, the target classifier determining module is used for inputting an unmarked data set into the classifier to be trained and acquiring an output result of the classifier to be trained; screening the output result based on the trust label set to obtain a target output result; and iterating the loss function of the classifier to be trained according to the target output result to obtain a target classifier.
On the basis of the above technical solution, the target classifier determination module is used for detecting the weight coefficients of the unlabeled data set and the labeled data set in the loss function; if the weight coefficient of the unlabeled data set is smaller than that of the labeled data set, increasing the weight coefficient of the unlabeled data set and then iterating the loss function again; and if the weight coefficient of the unlabeled data set is equal to that of the labeled data set, taking the current classifier as the target classifier.
According to the technical solution of this embodiment, an original feature data set comprising at least three target users is acquired, and a trust relationship matrix is constructed based on the original feature data set; the original feature data set is aggregated based on preset aggregation dimensions to obtain the influence factor corresponding to each aggregation dimension, and an influence factor set is formed from the influence factors; the original feature data set is processed based on the trust relationship matrix to obtain an unlabeled data set; the unlabeled data set is classified based on the target classifier to obtain a classification result, and at least one target item is determined based on the classification result. With this technical solution, users are classified according to the trust relationships among them, and the corresponding target item is then determined based on the classification result, so that the target item can be recommended to users more quickly and the accuracy of item recommendation is improved.
The target object determining device provided by the embodiment of the invention can execute the target object determining method provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the executing method.
It should be noted that, the units and modules included in the apparatus are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the embodiments of the present disclosure.
Example four
FIG. 5 illustrates a block diagram of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM12, and the RAM13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 11 performs the various methods and processes described above, such as the target item determination method.
In some embodiments, the target item determination method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM12 and/or the communication unit 19. When the computer program is loaded into RAM13 and executed by processor 11, one or more steps of the target item determination method described above may be performed. Alternatively, in other embodiments, processor 11 may be configured to perform the target item determination method in any other suitable manner (e.g., by way of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the drawbacks of difficult management and weak service scalability found in traditional physical hosts and VPS services.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired result of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A target item determination method, comprising:
acquiring an original characteristic data set comprising at least three target users, and constructing a trust relationship matrix based on the original characteristic data set;
aggregating the original feature data set based on preset aggregation dimensions to obtain influence factors corresponding to the aggregation dimensions, and forming an influence factor set based on the influence factors; wherein the set of impact factors includes a similarity dimension, an activity dimension, and an attention dimension;
processing the original characteristic data set based on the trust relationship matrix to obtain an unmarked data set;
classifying the unlabeled data set based on the target classifier and the influence factor set to obtain a classification result, and determining at least one target item based on the classification result; wherein the target classifier is trained based on the influence factor set, the labeled data set, the trust label set, and the unlabeled data set in the training sample.
2. The method according to claim 1, wherein the aggregating the original feature data set based on a preset aggregation dimension to obtain an influence factor corresponding to each aggregation dimension comprises:
acquiring historical item data of at least two target users;
and determining the similarity between the at least two target users based on the historical item data, and taking the similarity as the influence factor corresponding to the similarity dimension.
3. The method according to claim 1, wherein the aggregating the original feature data set based on a preset aggregation dimension to obtain an influence factor corresponding to each aggregation dimension comprises:
acquiring historical evaluation data of a current target user;
if the amount of historical evaluation data is greater than a preset quantity threshold, determining the activity of the current target user to be a preset value;
and if the amount of historical evaluation data is smaller than the preset quantity threshold, determining the activity of the current target user based on the historical evaluation data and the quantity threshold, and taking the activity as the influence factor corresponding to the activity dimension.
4. The method according to claim 1, wherein the aggregating the original feature data set based on a preset aggregation dimension to obtain an influence factor corresponding to each aggregation dimension comprises:
acquiring the number of users with trust relationship with the current target user based on the trust relationship matrix;
and taking the number of the users having trust relationship with the current target user as the attention degree of the current user, and taking the attention degree as an influence factor corresponding to the attention dimension.
5. The method of claim 1, further comprising, prior to classifying the unlabeled data set based on the target classifier to obtain a classification result and determining at least one target item based on the classification result:
determining a trust label for partial data in the unmarked data set based on each influence factor in the set of influence factors;
and updating the marked data set according to the trust label of the partial data in the unmarked data set to obtain a target data set so as to obtain the target classifier based on the target data set.
6. The method of claim 5, wherein the deriving the target classifier based on the target dataset comprises:
inputting an unlabeled data set into the classifier to be trained, and acquiring an output result of the classifier to be trained;
screening the output result based on the trust label set to obtain a target output result;
and iterating the loss function of the classifier to be trained according to the target output result to obtain a target classifier.
7. The method of claim 6, wherein iterating the loss function of the classifier to be trained according to the target output result to obtain a target classifier comprises:
detecting the weight coefficients of the unlabeled data set and the labeled data set in the loss function;
if the weight coefficient of the unlabeled data set is smaller than that of the labeled data set, increasing the weight coefficient of the unlabeled data set and then iterating the loss function again;
and if the weight coefficient of the unlabeled data set is equal to that of the labeled data set, taking the current classifier as the target classifier.
8. A target item determination apparatus, comprising:
the trust relationship matrix construction module is used for acquiring an original characteristic data set comprising at least three target users and constructing a trust relationship matrix based on the original characteristic data set;
the influence factor set forming module is used for carrying out aggregation on the original characteristic data set based on preset aggregation dimensions to obtain influence factors corresponding to the aggregation dimensions, and forming an influence factor set based on the influence factors; wherein the set of impact factors includes a similarity dimension, an activity dimension, and an attention dimension;
a data set acquisition module, configured to process the original feature data set based on the trust relationship matrix to obtain an unmarked data set;
a target item determination module, configured to classify the unlabeled data set based on the target classifier to obtain a classification result, and determine at least one target item based on the classification result; wherein the target classifier is trained based on the influence factor set, the labeled data set, the trust label set, and the unlabeled data set in the training sample.
9. An electronic device, characterized in that the electronic device comprises:
one or more processors; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the target item determination method of any one of claims 1-7.
10. A computer-readable storage medium storing computer instructions for causing a processor to perform the target item determination method of any one of claims 1-7 when executed.
CN202211189057.1A 2022-09-28 2022-09-28 Target object determination method and device, electronic equipment and storage medium Pending CN115455298A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211189057.1A CN115455298A (en) 2022-09-28 2022-09-28 Target object determination method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211189057.1A CN115455298A (en) 2022-09-28 2022-09-28 Target object determination method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115455298A (en) 2022-12-09

Family

ID=84306889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211189057.1A Pending CN115455298A (en) 2022-09-28 2022-09-28 Target object determination method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115455298A (en)

Similar Documents

Publication Publication Date Title
CN109558951B (en) Method and device for detecting fraud account and storage medium thereof
CN114265979B (en) Method for determining fusion parameters, information recommendation method and model training method
CN109993627B (en) Recommendation method, recommendation model training device and storage medium
CN111783039B (en) Risk determination method, risk determination device, computer system and storage medium
CN111667024B (en) Content pushing method, device, computer equipment and storage medium
CN115983900A (en) Method, apparatus, device, medium, and program product for constructing user marketing strategy
CN116883181B (en) Financial service pushing method based on user portrait, storage medium and server
CN114037059A (en) Pre-training model, model generation method, data processing method and data processing device
CN114169418B (en) Label recommendation model training method and device and label acquisition method and device
CN115827994A (en) Data processing method, device, equipment and storage medium
CN115619245A (en) Portrait construction and classification method and system based on data dimension reduction method
CN112905885B (en) Method, apparatus, device, medium and program product for recommending resources to user
CN115455298A (en) Target object determination method and device, electronic equipment and storage medium
CN111325350A (en) Suspicious tissue discovery system and method
CN112434083A (en) Event processing method and device based on big data
CN114547448B (en) Data processing method, model training method, device, equipment, storage medium and program
CN115146725B (en) Method for determining object classification mode, object classification method, device and equipment
CN114037058B (en) Pre-training model generation method and device, electronic equipment and storage medium
CN114066278B (en) Method, apparatus, medium, and program product for evaluating article recall
CN115221421A (en) Data processing method and device, electronic equipment and storage medium
CN117651167A (en) Resource recommendation method, device, equipment and storage medium
CN116150499A (en) Preference measurement method, device, equipment and medium for resource quality
CN117668209A (en) Document recommendation method, device, equipment and storage medium
CN112966210A (en) Method and device for storing user data
CN117743693A (en) Information recommendation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination