CN112508256B - User demand active prediction method and system based on crowdsourcing - Google Patents

User demand active prediction method and system based on crowdsourcing

Info

Publication number
CN112508256B
Authority
CN
China
Prior art keywords
user
demand
crowdsourcing
information
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011387991.5A
Other languages
Chinese (zh)
Other versions
CN112508256A (en)
Inventor
张以文
储蓓
王庆人
沈书泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University
Priority to CN202011387991.5A
Publication of CN112508256A
Application granted
Publication of CN112508256B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06315 Needs-based resource requirements planning or analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Educational Administration (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a crowdsourcing-based method and system for actively predicting user demand, comprising the following steps: S1: determining the annotators participating in a crowdsourcing task, the annotators receiving and completing the task; S2: constructing a heterogeneous information network according to the user preference information; S3: generating a user demand data space; S4: learning the representation vectors of the users and the demand objects respectively through a graph convolutional neural network; S5: predicting demand. In the invention, users participate directly in information production and knowledge sharing through crowdsourcing, so the preference information fed back by crowdsourcing annotators better reflects users' real needs, and combining this information in demand prediction can improve the accuracy of the result. The user preference information acquired by crowdsourcing enriches users' attribute features and completes the attributes of newly registered users who lack historical behavior data, so that each user can be represented more accurately and the recommendation results are more personalized.

Description

User demand active prediction method and system based on crowdsourcing
Technical Field
The invention relates to the technical field of computers, in particular to a user demand active prediction method and system based on crowdsourcing.
Background
With the development of the internet and big data technology, information overload has become increasingly serious. A recommendation system can offer commodities or services of interest to a user according to the user's demand information, helping the user process information effectively. Whether user demand can be predicted accurately, comprehensively and actively has therefore become key to improving a service provider's recommendation performance and maximizing commercial profit.
(1) Heterogeneous information network recommendation
Existing recommendation techniques mainly analyze user demand from users' network behavior data, which in the big data era is often multi-source and heterogeneous. A heterogeneous information network can integrate different types of objects and the complex interactions between them, and has been widely used in recommendation as an effective information fusion method. Application CN106503028A discloses a recommendation method comprising the following steps: modeling the objects in a recommendation data set and the relationships between them as a heterogeneous information network; acquiring the meta-paths connecting two objects in the network; calculating similarity data between the objects according to those meta-paths; constructing an objective function from the similarity data and training on the recommendation data set with it to obtain each user's predicted score for each item; and recommending items to the user according to the predicted scores. By modeling the recommendation data as a heterogeneous information network, that method effectively alleviates data sparsity and improves the recommendation effect.
(2) Crowdsourcing recommendations
Crowdsourcing is a distributed problem-solving and production mode that uses some mechanism to let a group jointly participate in a task and achieve a goal. It tackles difficult problems through the intelligence of the group, distributing them to a group of workers through an open call. Incorporating crowdsourcing task information has become a hot research topic in crowdsourcing recommendation.
For example, application No. 202010464312.3 discloses a crowdsourcing task recommendation method comprising the following steps: updating the user portraits and user portrait grades of crowdsourcing workers according to worker data and historical tasks on a crowdsourcing platform; screening the workers according to the requirements of the task to be processed to obtain a crowdsourcing worker list; determining the completion time and price of the task according to its requirements and the worker list; determining a recommendation probability list for the workers through a task recommendation model according to the completion time and price; and recommending the task to the workers in the list according to the task and the recommendation probability list. That method builds user portraits of crowdsourcing workers, grades their skills according to the portrait attributes, and generates the recommendation probability list, so that tasks are pushed to workers automatically. It focuses on task recommendation on a crowdsourcing platform. Because users participate directly, however, crowdsourcing can also feed back information reflecting users' true needs to a task requester such as a service provider. Combining crowdsourced data with recommendation data can give a service platform more accurate and complete data for user demand prediction, and in particular can alleviate the missing-data problem for newly registered users.
Therefore, there is a practical need for a heterogeneous information network modeling method that fuses crowdsourced data with recommendation data, together with a corresponding demand prediction method.
Disclosure of Invention
The invention aims to provide an active demand prediction method with a high matching degree for new users lacking historical data.
The invention solves the above technical problem through the following technical means:
A crowdsourcing-based user demand active prediction method comprises the following steps:
S1: determining the annotators participating in the crowdsourcing task, designing the crowdsourcing task and issuing it to a crowdsourcing task platform, the annotators receiving and completing the task;
S2: constructing a heterogeneous information network according to the user social relationships, the historical behavior data and the user preference information acquired by crowdsourcing;
S3: uniformly representing the different types of entities in the heterogeneous information network to generate a user demand data space;
S4: extracting the interaction semantics of users and demand objects using meta-paths in the heterogeneous information network, and learning the representation vectors of the users and the demand objects respectively through a graph convolutional neural network;
S5: aggregating the neighbor information of a target user according to the social relationships of users in the heterogeneous information network, obtaining the representation vector of the target user from the data space, and predicting demand.
Further, the step S1 comprises:
S11: acquiring the user set of a service provider as target users, acquiring user social relationship data from social networks and the service provider platform to obtain the set of social neighbor users of the target users, and taking all of these users as annotators who receive the crowdsourcing task;
S12: designing a user preference questionnaire covering demographic information, social needs that reflect interests shared within social relationships, and enjoyment needs that reflect individual preferences, wherein the questionnaire contains multiple-choice questions presented as text and as images and allows an annotator to submit auxiliary information independently; the questionnaire is published on the crowdsourcing platform as a crowdsourcing task and issued to the annotators obtained in S11.
Further, the step S2 includes:
S21: taking the multi-modal data collected by crowdsourcing in S1 as the attribute set of each user, and taking users and demand objects as nodes;
S22: establishing connecting edges among users and demand objects according to the following relationships:
Relationship R1: direct relationships such as friendship and following exist among users $U$; $L$ and $L^{-1}$ denote the relationships between users, i.e. $U \xrightarrow{L} U$ and $U \xrightarrow{L^{-1}} U$;
Relationship R2: some users have historical behavior information, for example the user has bought a certain article or used a certain service; $B$ and $B^{-1}$ denote the relationships between a user $U$ and a demand object $O_k$, i.e. $U \xrightarrow{B} O_k$ and $O_k \xrightarrow{B^{-1}} U$, where $k$ denotes the kth class of demand object;
S23: establishing a multi-modal heterogeneous information network from the attribute sets, the nodes and the relationships between the nodes.
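Steps S21 to S23 can be pictured as building a small typed graph. The following is a minimal sketch under stated assumptions, not the patented implementation: the class name `HeteroNetwork`, the relation labels `"L"` and `"B"`, the `"^-1"` suffix for inverse relations and the sample identifiers are all hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


class HeteroNetwork:
    """Toy heterogeneous information network with typed nodes and edges."""

    def __init__(self):
        self.node_type = {}              # node id -> "user" or an object class
        self.attrs = {}                  # node id -> attribute dict (S21)
        self.edges = defaultdict(set)    # (relation, source) -> set of targets

    def add_node(self, node, node_type, **attrs):
        self.node_type[node] = node_type
        self.attrs[node] = attrs

    def add_edge(self, src, relation, dst):
        # Store the relation and its inverse, mirroring L / L^-1 and B / B^-1.
        self.edges[(relation, src)].add(dst)
        self.edges[(relation + "^-1", dst)].add(src)

    def neighbors(self, node, relation):
        return sorted(self.edges[(relation, node)])


net = HeteroNetwork()
net.add_node("u1", "user", age_group="25-34")   # hypothetical questionnaire attribute
net.add_node("u2", "user", age_group="35-44")
net.add_node("i1", "item")
net.add_edge("u1", "L", "u2")    # R1: u1 follows u2
net.add_edge("u1", "B", "i1")    # R2: u1 bought item i1

print(net.neighbors("u2", "L^-1"))   # users who follow u2
print(net.neighbors("i1", "B^-1"))   # users who bought i1
```

Storing each edge together with its inverse keeps lookups in both directions cheap, which is convenient when meta-paths such as user-object-user are traversed later.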
Further, the step S3 includes:
S31: uniformly representing the user information and the text and image attribute information of demand objects collected by crowdsourcing:
text-type information is converted to vector representations using the word2vec method [formula image not reproduced], where $e_u$ denotes the text attribute vector of a user, $e_o$ denotes the text attribute vector of a demand object, and $N$ is the number of demand object categories;
picture-type information is converted to vector representations using a convolutional neural network [formula image not reproduced], where $g_u$ denotes the picture attribute vector of a user and $g_o$ denotes the picture attribute vector of a demand object;
S32: fusing the uniformly represented multi-modal attribute information: the outer product of the user attribute vectors $e_u$ and $g_u$ obtained in S31 is computed to realize feature crossing, and the resulting matrix is flattened row by row and input into a multilayer perceptron to obtain the initial vector representation $Z$ of the user node; the same operation is repeated on the text and picture attribute vectors of the demand objects to obtain the initial vector representation $O_k$ of the demand object nodes; the vector representations of all users and demand objects form the user demand data space.
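The fusion in S32 (outer product, row-wise flattening, multilayer perceptron) can be sketched as follows. This is an illustration only: the function name `fuse`, the two-layer perceptron with a ReLU hidden layer, the dimensions and the random weights are assumptions, and the random vectors merely stand in for real word2vec and CNN outputs.

```python
import numpy as np

rng = np.random.default_rng(0)


def fuse(text_vec, image_vec, w1, b1, w2, b2):
    """Outer-product feature crossing followed by a two-layer perceptron."""
    crossed = np.outer(text_vec, image_vec)      # shape (d_text, d_image)
    flat = crossed.reshape(-1)                   # row-major (row by row) flattening
    hidden = np.maximum(0.0, flat @ w1 + b1)     # ReLU hidden layer (assumed)
    return hidden @ w2 + b2                      # initial node representation


d_text, d_image, d_hidden, d_out = 4, 3, 8, 5
w1 = rng.normal(size=(d_text * d_image, d_hidden))   # untrained, illustrative weights
b1 = np.zeros(d_hidden)
w2 = rng.normal(size=(d_hidden, d_out))
b2 = np.zeros(d_out)

e_u = rng.normal(size=d_text)    # stand-in for a word2vec text vector
g_u = rng.normal(size=d_image)   # stand-in for a CNN picture vector
z_u = fuse(e_u, g_u, w1, b1, w2, b2)
print(z_u.shape)
```

The same function would be applied to a demand object's text and picture vectors to obtain its initial representation.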
Further, the step S4 includes:
S41: establishing several user-demand object co-occurrence matrices $T_k$ according to the historical behavior information of users: a user-item co-occurrence matrix $T_I \in \mathbb{R}^{|U| \times |I|}$ and a user-service co-occurrence matrix $T_S \in \mathbb{R}^{|U| \times |S|}$, where $|I|$ is the number of articles and $|S|$ is the number of services; if a user has bought a certain article or used a certain service, a 1 is placed at the corresponding position of the corresponding co-occurrence matrix;
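As a small illustration of S41, a co-occurrence matrix can be filled from a list of interaction records. The helper name `co_occurrence` and the sample users, items and purchases are hypothetical.

```python
import numpy as np


def co_occurrence(interactions, users, objects):
    """Binary user-object matrix: entry (u, o) is 1 if u interacted with o."""
    u_index = {u: r for r, u in enumerate(users)}
    o_index = {o: c for c, o in enumerate(objects)}
    t = np.zeros((len(users), len(objects)), dtype=int)
    for user, obj in interactions:
        t[u_index[user], o_index[obj]] = 1
    return t


users = ["u1", "u2", "u3"]
items = ["i1", "i2"]
purchases = [("u1", "i1"), ("u2", "i1"), ("u2", "i2")]   # toy purchase history
t_item = co_occurrence(purchases, users, items)
print(t_item)
```

A user with no history, such as `u3` here, simply has an all-zero row, which is the cold-start case the crowdsourced attributes are meant to compensate for.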
S42: in the scenario of actively predicting a user's kth class of demand, the meta-path $U O_k U$ is extracted from the heterogeneous information network constructed in step S2; it carries the semantic that two users have both used the same kth class demand object. Multiplying the co-occurrence matrix $T_k$ by its transpose, i.e. $T_k T_k^{\top}$, yields the inter-user relationship matrix under this semantic. The meta-path $O_k U O_k$ is likewise extracted from the heterogeneous information network constructed in step S2; it carries the semantic that two kth class demand objects have been used by the same user, and $T_k^{\top} T_k$ yields the relationship matrix between the kth class demand objects under this semantic;
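The two meta-path matrices of S42 are plain matrix products of the co-occurrence matrix with its transpose. A minimal sketch on a toy matrix (the variable names `user_rel` and `object_rel` are chosen for this example):

```python
import numpy as np

# Toy co-occurrence matrix T_k: 3 users (rows) by 2 demand objects (columns).
t_k = np.array([[1, 0],
                [1, 1],
                [0, 0]])

user_rel = t_k @ t_k.T      # U-O_k-U: number of objects shared by each pair of users
object_rel = t_k.T @ t_k    # O_k-U-O_k: number of users shared by each pair of objects

print(user_rel)
print(object_rel)
```

Entry (i, j) of `user_rel` counts the demand objects that users i and j have both used, so the diagonal holds each user's own usage count.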
S43: the inter-user relationship matrix and the demand object relationship matrix obtained in S42 are each normalized [formula images not reproduced], where the matrices used for the normalization are diagonal matrices, namely the degree matrices of the inter-user relationship matrix and of the demand object relationship matrix, respectively;
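The exact normalization formulas of S43 are not available in this text; the description only states that diagonal degree matrices are used. The sketch below therefore assumes the symmetric form $D^{-1/2} A D^{-1/2}$ that is common for graph convolution, which may differ from the patented formula. The function name `normalize` and the handling of zero-degree rows are also assumptions.

```python
import numpy as np


def normalize(adj):
    """Symmetric normalization D^(-1/2) A D^(-1/2); zero-degree rows stay zero."""
    degree = adj.sum(axis=1).astype(float)       # diagonal of the degree matrix
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = degree[nonzero] ** -0.5  # avoid dividing by zero
    return inv_sqrt[:, None] * adj * inv_sqrt[None, :]


user_rel = np.array([[1, 1, 0],
                     [1, 2, 0],
                     [0, 0, 0]])
norm_rel = normalize(user_rel)
print(np.round(norm_rel, 3))
```

Scaling by the degrees keeps heavily connected users from dominating the propagation in the subsequent graph convolution.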
S44: using a graph convolutional neural network, the vector representation of the users and the vector representation of the kth class demand objects are each learned layer by layer [formula images not reproduced]. The formulas involve the layer-$l$ vector representations of the users and of the kth class demand objects; when $l$ is 0 these are $Z$ and $O_k$, respectively. $P$ and $W$ are weight parameters, the formulas use element-by-element multiplication, $\sigma$ is an activation function, and $\phi$ converts a vector into a diagonal matrix;
S45: repeating the operation of S44, alternately updating the vector representations of the users and of the kth class demand objects until the last convolution layer has finished, to obtain the vector representations of all users and all kth class demand objects.
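Because the propagation formulas of S44 are not reproduced here, the following sketch uses the generic graph convolution step $\sigma(\hat{A} H W)$ rather than the patented rule, which additionally involves the weights $P$, an element-wise product and the diagonalization $\phi$. The function name `gcn_layer`, the ReLU activation, the identity placeholder matrices and the random untrained weights are all assumptions.

```python
import numpy as np


def gcn_layer(norm_adj, features, weight):
    """One generic propagation step: ReLU(norm_adj @ features @ weight)."""
    return np.maximum(0.0, norm_adj @ features @ weight)


rng = np.random.default_rng(0)
n_users, n_objects, dim, n_layers = 3, 2, 4, 2

a_user = np.eye(n_users)        # placeholder for the normalized user-user matrix
a_object = np.eye(n_objects)    # placeholder for the normalized object-object matrix
h_user = rng.normal(size=(n_users, dim))       # layer-0 input, Z in the text
h_object = rng.normal(size=(n_objects, dim))   # layer-0 input, O_k in the text

# Update both sets of representations once per layer, as in S45.
for _ in range(n_layers):
    w_user = rng.normal(size=(dim, dim))       # untrained, illustrative weights
    w_object = rng.normal(size=(dim, dim))
    h_user = gcn_layer(a_user, h_user, w_user)
    h_object = gcn_layer(a_object, h_object, w_object)

print(h_user.shape, h_object.shape)
```

In this simplified version the user and object updates do not exchange information; the text indicates that the patented formulas couple them, so this should be read only as the skeleton of the layer-wise loop.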
Further, the step S5 includes:
S51: taking the vector representation of target user $i$ obtained in S4 [symbol not reproduced] and the neighbor users $j \in N(i)$ of user $i$, the neighbor user information is aggregated with an attention mechanism to obtain the final vector representation of the target user. First the weight coefficient of each neighbor with respect to the target user is calculated [formula image not reproduced]; then the vector representation of the target user is updated [formula image not reproduced], where $\alpha$ and $W$ are weight parameters, $\sigma$ is an activation function, and $\|$ is the concatenation operation;
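The attention formulas of S51 are not reproduced in this text. The sketch below assumes a common graph-attention pattern that matches the listed ingredients (weights $\alpha$ and $W$, an activation, concatenation): each neighbor is scored from the concatenated projections of target and neighbor, the scores are normalized with a softmax, and the weighted neighbor projections are summed. The function name `attention_aggregate` and the choice of `tanh` as activation are assumptions.

```python
import numpy as np


def attention_aggregate(h_target, h_neighbors, w, alpha):
    """Softmax attention over neighbors, scored on concatenated projections."""
    proj_t = w @ h_target                              # shape (d_out,)
    proj_n = h_neighbors @ w.T                         # shape (n_neighbors, d_out)
    pairs = np.hstack([np.tile(proj_t, (len(proj_n), 1)), proj_n])
    scores = np.tanh(pairs @ alpha)                    # one score per neighbor
    weights = np.exp(scores - scores.max())            # numerically stable softmax
    weights /= weights.sum()                           # coefficients sum to 1
    return np.tanh(weights @ proj_n), weights


rng = np.random.default_rng(0)
d_in, d_out = 4, 3
w = rng.normal(size=(d_out, d_in))       # untrained, illustrative weights
alpha = rng.normal(size=2 * d_out)
h_i = rng.normal(size=d_in)              # target user representation
h_nbrs = rng.normal(size=(5, d_in))      # representations of 5 social neighbors

h_i_new, coeffs = attention_aggregate(h_i, h_nbrs, w, alpha)
print(h_i_new.shape, coeffs.shape)
```

The softmax makes the coefficients comparable across users with different numbers of social neighbors.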
S52: for each target user, calculating the relevance prediction score between the user and each kth class demand object [formula images not reproduced];
S53: the loss function is the binary cross entropy function [formula image not reproduced], where $Y$ and $Y^{-}$ are the positive and negative examples in the data set: $Y$ is the set of demand objects the user has used, and $Y^{-}$ is sampled from the demand objects in the data set that the user has not used; the label indicating whether the user has interacted with the demand object is 1 if an interaction exists and 0 otherwise. The loss function is optimized using stochastic gradient descent; the kth class demand objects are then sorted from high to low by the prediction score calculated in step S52, and the first $n$ demand objects are selected as the user's kth class demand list;
S54: by repeating the operations of S42-S53, a demand object list for every category can be obtained for each user, thereby realizing active prediction of user demand.
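Steps S52 and S53 can be sketched as scoring, a binary cross entropy loss and a top-n selection. The scoring rule is not reproduced in this text, so a sigmoid of the user-object inner product is assumed here; the loss is averaged over the samples, which may differ from the patented formula by a constant factor. The function names and toy vectors are hypothetical.

```python
import numpy as np


def predict_scores(user_vec, object_vecs):
    """Assumed scoring rule: sigmoid of the user-object inner product."""
    return 1.0 / (1.0 + np.exp(-(object_vecs @ user_vec)))


def bce_loss(scores, labels, eps=1e-12):
    """Mean binary cross entropy over positive (1) and sampled negative (0) pairs."""
    scores = np.clip(scores, eps, 1.0 - eps)     # keep log() finite
    return float(-np.mean(labels * np.log(scores)
                          + (1 - labels) * np.log(1 - scores)))


def top_n(scores, n):
    """Indices of the n highest-scoring demand objects, best first."""
    return np.argsort(-scores)[:n].tolist()


user_vec = np.array([1.0, 0.0])
object_vecs = np.array([[2.0, 0.0], [-1.0, 0.0], [0.5, 0.0]])
labels = np.array([1, 0, 1])     # 1 = used by the user, 0 = sampled negative

scores = predict_scores(user_vec, object_vecs)
print(top_n(scores, 2), bce_loss(scores, labels))
```

In training, the loss would be minimized with stochastic gradient descent over the model weights; only the forward computation is shown here.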
The invention also provides a crowdsourcing-based user demand active prediction system, which comprises:
a crowdsourcing task issuing module: determining the annotators participating in the crowdsourcing task, designing the crowdsourcing task and issuing it to a crowdsourcing task platform, the annotators receiving and completing the task;
a heterogeneous information network construction module: constructing a heterogeneous information network according to the user social relationships, the historical behavior data and the user preference information acquired by crowdsourcing;
a user demand data space generation module: uniformly representing the different types of entities in the heterogeneous information network to generate a user demand data space;
a user and demand object representation vector learning module: extracting the interaction semantics of users and demand objects using meta-paths in the heterogeneous information network, and learning the representation vectors of the users and the demand objects respectively;
a demand prediction module: aggregating the neighbor information of a target user according to the social relationships of users in the heterogeneous information network to obtain the representation vector of the target user, and predicting demand.
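The five modules form a linear pipeline in which each module consumes the output of the previous one. A minimal structural sketch follows; the class name `DemandPredictionSystem`, the stage names and the trace-recording toy stages are hypothetical and stand in for the real module logic.

```python
from typing import Callable, List


class DemandPredictionSystem:
    """Chains stage callables in the order the modules are listed."""

    def __init__(self, stages: List[Callable]):
        self.stages = stages

    def run(self, data):
        for stage in self.stages:
            data = stage(data)    # each module consumes the previous output
        return data


def make_stage(name):
    """Toy stage that only records its own name in the running trace."""
    def stage(trace):
        return trace + [name]
    return stage


MODULES = [
    "issue_crowdsourcing_task",
    "build_hetero_network",
    "generate_data_space",
    "learn_representations",
    "predict_demand",
]
system = DemandPredictionSystem([make_stage(name) for name in MODULES])
print(system.run([]))
```

Keeping each module behind a plain callable makes it possible to test or replace one stage, for example the representation learner, without touching the others.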
Further, the specific execution process of the crowdsourcing task issuing module is as follows:
S11: acquiring the user set of a service provider as target users, acquiring user social relationship data from social networks and the service provider platform to obtain the set of social neighbor users of the target users, and taking all of these users as annotators who receive the crowdsourcing task;
S12: designing a user preference questionnaire covering demographic information, social needs that reflect interests shared within social relationships, and enjoyment needs that reflect individual preferences, wherein the questionnaire contains multiple-choice questions presented as text and as images and allows an annotator to submit auxiliary information independently; the questionnaire is published on the crowdsourcing platform as a crowdsourcing task and issued to the annotators obtained in S11.
Further, the heterogeneous information network construction module performs the following steps:
S21: taking the multi-modal data collected by crowdsourcing as the attribute set of each user, and taking users and demand objects as nodes;
S22: establishing connecting edges among users and demand objects according to the following relationships:
Relationship R1: direct relationships such as friendship and following exist among users $U$; $L$ and $L^{-1}$ denote the relationships between users, i.e. $U \xrightarrow{L} U$ and $U \xrightarrow{L^{-1}} U$;
Relationship R2: some users have historical behavior information, for example the user has bought a certain article or used a certain service; $B$ and $B^{-1}$ denote the relationships between a user $U$ and a demand object $O_k$, i.e. $U \xrightarrow{B} O_k$ and $O_k \xrightarrow{B^{-1}} U$, where $k$ denotes the kth class of demand object;
S23: establishing a multi-modal heterogeneous information network from the attribute sets, the nodes and the relationships between the nodes.
Further, the specific execution process of the user demand data space generation module includes:
S31: uniformly representing the user information and the text and image attribute information of demand objects collected by crowdsourcing:
text-type information is converted to vector representations using the word2vec method [formula image not reproduced], where $e_u$ denotes the text attribute vector of a user, $e_o$ denotes the text attribute vector of a demand object, and $N$ is the number of demand object categories;
picture-type information is converted to vector representations using a convolutional neural network [formula image not reproduced], where $g_u$ denotes the picture attribute vector of a user and $g_o$ denotes the picture attribute vector of a demand object;
S32: fusing the uniformly represented multi-modal attribute information: the outer product of the user attribute vectors $e_u$ and $g_u$ obtained in S31 is computed to realize feature crossing, and the resulting matrix is flattened row by row and input into a multilayer perceptron to obtain the initial vector representation $Z$ of the user node; the same operation is repeated on the text and picture attribute vectors of the demand objects to obtain the initial vector representation $O_k$ of the demand object nodes; the vector representations of all users and demand objects form the user demand data space.
Further, the specific implementation process of the expression vector learning module for the user and the demand object includes:
S41: establishing several user-demand object co-occurrence matrices $T_k$ according to the historical behavior information of users: a user-item co-occurrence matrix $T_I \in \mathbb{R}^{|U| \times |I|}$ and a user-service co-occurrence matrix $T_S \in \mathbb{R}^{|U| \times |S|}$, where $|I|$ is the number of articles and $|S|$ is the number of services; if a user has bought a certain article or used a certain service, a 1 is placed at the corresponding position of the corresponding co-occurrence matrix;
S42: in the scenario of actively predicting a user's kth class of demand, the meta-path $U O_k U$ is extracted from the constructed heterogeneous information network; it carries the semantic that two users have both used the same kth class demand object. Multiplying the co-occurrence matrix $T_k$ by its transpose, i.e. $T_k T_k^{\top}$, yields the inter-user relationship matrix under this semantic. The meta-path $O_k U O_k$ is likewise extracted from the constructed heterogeneous information network; it carries the semantic that two kth class demand objects have been used by the same user, and $T_k^{\top} T_k$ yields the relationship matrix between the kth class demand objects under this semantic;
S43: the inter-user relationship matrix and the demand object relationship matrix obtained in S42 are each normalized [formula images not reproduced], where the matrices used for the normalization are diagonal matrices, namely the degree matrices of the inter-user relationship matrix and of the demand object relationship matrix, respectively;
S44: using a graph convolutional neural network, the vector representation of the users and the vector representation of the kth class demand objects are each learned layer by layer [formula images not reproduced]. The formulas involve the layer-$l$ vector representations of the users and of the kth class demand objects; when $l$ is 0 these are $Z$ and $O_k$, respectively. $P$ and $W$ are weight parameters, the formulas use element-by-element multiplication, $\sigma$ is an activation function, and $\phi$ converts a vector into a diagonal matrix;
S45: repeating the operation of S44, alternately updating the vector representations of the users and of the kth class demand objects until the last convolution layer has finished, to obtain the vector representations of all users and all kth class demand objects.
Further, the specific implementation process of the demand prediction module includes:
S51: taking the vector representation of target user $i$ obtained by the user and demand object representation vector learning module [symbol not reproduced] and the neighbor users $j \in N(i)$ of user $i$, the neighbor user information is aggregated with an attention mechanism to obtain the final vector representation of the target user. First the weight coefficient of each neighbor with respect to the target user is calculated [formula image not reproduced]; then the vector representation of the target user is updated [formula image not reproduced], where $\alpha$ and $W$ are weight parameters, $\sigma$ is an activation function, and $\|$ is the concatenation operation;
S52: for each target user, calculating the relevance prediction score between the user and each kth class demand object [formula images not reproduced];
S53: the loss function is the binary cross entropy function [formula image not reproduced], where $Y$ and $Y^{-}$ are the positive and negative examples in the data set: $Y$ is the set of demand objects the user has used, and $Y^{-}$ is sampled from the demand objects in the data set that the user has not used; the label indicating whether the user has interacted with the demand object is 1 if an interaction exists and 0 otherwise. The loss function is optimized using stochastic gradient descent; the kth class demand objects are then sorted from high to low by the prediction score calculated in step S52, and the first $n$ demand objects are selected as the user's kth class demand list;
S54: by repeating the operations of S42-S53, a demand object list for every category can be obtained for each user, thereby realizing active prediction of user demand.
The invention has the following advantages:
In the invention, users participate directly in information production and knowledge sharing through crowdsourcing, so the preference information fed back by crowdsourcing annotators better reflects users' real needs, and combining this information in demand prediction can improve the accuracy of the result. The user preference information acquired by crowdsourcing enriches users' attribute features and completes the attributes of newly registered users who lack historical behavior data, so that each user can be represented more accurately and the recommendation results are more personalized.
Social relationships, users' historical behavior data and user preference information collected by crowdsourcing are generally multi-source, multi-type and multi-relational. A heterogeneous information network can effectively model multiple types of entities and the complex relationships among them, and meta-paths extract the implicit interactions among entities of different types, so that the information is fully used, the individual characteristics of users and demand objects are described more comprehensively, and users' potential demands are mined.
Drawings
Fig. 1 is a flow chart of a user demand active prediction method based on crowdsourcing in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, the present embodiment provides a method for actively predicting user demand based on crowdsourcing, which includes the following steps:
S1: determining the annotators participating in the crowdsourcing task, designing the crowdsourcing task, and issuing it to a crowdsourcing task platform; specifically comprising:
S11: acquiring a user set of a service provider as a target user, acquiring user social relationship data from Twitter, a service provider platform and the like to obtain a social neighbor user set of the target user, and taking all the users as annotators for receiving crowdsourcing tasks;
S12: designing a user preference survey questionnaire from the aspects of demographic information, social requirements reflecting interests shared within social relationships, enjoyment requirements reflecting personal preferences, and the like. The questionnaire contains selection questions expressed in words and displayed graphically (for example, selecting the content of interest from the options given), and annotators are allowed to submit auxiliary information such as text, videos, and pictures on their own initiative. The questionnaire is published to the crowdsourcing platform in the form of a crowdsourcing task, and the task is issued to the annotators obtained in S11;
S2: constructing a heterogeneous information network according to the user social relationships, the historical behavior data, and the user preference information acquired through crowdsourcing; specifically comprising:
S21: taking the multi-modal data collected through crowdsourcing in S1 as the attribute set of each user (such as age, gender, and favorite movie posters) and as the attributes of each demand object (such as manufacturer and market launch date). Users and demand objects (including goods, services, etc.) are taken as nodes.
S22: establishing a connection edge between a user and a demand object according to the following relation:
Relationship R1: direct relationships such as friendship and following exist among users; L and L^{-1} are used to represent the relationships between users (U), i.e.
Figure BDA0002809935100000101
and
Figure BDA0002809935100000102
Relationship R2: some users have historical behavior information, for example a user has bought a certain item or used a certain service; B and B^{-1} are used to represent the relationships between a user (U) and a demand object (O_k), i.e.
Figure BDA0002809935100000103
and
Figure BDA0002809935100000104
wherein k denotes the k-th class of demand object, such as goods or services;
S23: establishing a multi-modal heterogeneous information network according to the attribute sets, the nodes, and the relationships among the nodes.
S3: uniformly representing different types of entities in a heterogeneous information network to generate a user demand data space;
S31: the user information and the attribute information of demand objects collected through crowdsourcing generally take different forms, including text and pictures, and need to be uniformly represented by adopting different representation learning methods for different modalities. Vector representations of text-type information are obtained with the word2vec method:
Figure BDA0002809935100000111
wherein e_u is the text attribute vector representation of a user, e_o is the text attribute vector representation of a demand object, and N is the number of demand object categories. Vector representations of picture-type information are obtained with a convolutional neural network:
Figure BDA0002809935100000112
wherein g_u is the picture attribute vector representation of a user and g_o is the picture attribute vector representation of a demand object;
S32: in order to learn the embedded representation of the nodes, the uniformly represented multi-modal attribute information needs to be fused. An outer product operation is performed on the user attribute vectors e_u and g_u obtained in S31 to realize feature crossing, the resulting matrix is flattened row by row, and the flattened vector is input into a multilayer perceptron to obtain the initial vector representation Z of the user nodes. The same operation is repeated on the demand object attribute representation vectors
Figure BDA0002809935100000113
and
Figure BDA0002809935100000114
to obtain the initial vector representation O_k of the demand object nodes. The vector representations of all users and demand objects form the user demand data space.
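By way of non-limiting illustration, the fusion described in S32 can be sketched in Python with NumPy. The function name `fuse_modalities`, the layer sizes, the ReLU hidden activation, and the random parameters are assumptions made for this sketch only; the embodiment does not specify the structure of the multilayer perceptron.

```python
import numpy as np

def fuse_modalities(e, g, weights, biases):
    """Outer-product feature crossing followed by a small MLP (illustrative).

    e: text attribute vector, shape (d_e,)
    g: picture attribute vector, shape (d_g,)
    weights, biases: hypothetical MLP parameters, one pair per layer.
    """
    crossed = np.outer(e, g)        # (d_e, d_g) matrix of pairwise feature products
    h = crossed.reshape(-1)         # flatten row by row (C order)
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = W @ h + b
        if i < len(weights) - 1:    # ReLU on hidden layers only (an assumption)
            h = np.maximum(h, 0.0)
    return h

rng = np.random.default_rng(0)
d_e, d_g, hidden, out = 4, 3, 8, 5
weights = [rng.normal(size=(hidden, d_e * d_g)), rng.normal(size=(out, hidden))]
biases = [np.zeros(hidden), np.zeros(out)]

e_u = rng.normal(size=d_e)          # stand-in for a word2vec text vector
g_u = rng.normal(size=d_g)          # stand-in for a CNN picture vector
z_u = fuse_modalities(e_u, g_u, weights, biases)   # initial vector of one user node
```

The same function can be applied to the text and picture vectors of a demand object to obtain its initial node vector.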
S4: extracting the interaction semantics between users and demand objects, and learning the representation vectors of users and demand objects respectively; specifically comprising:
S41: establishing a plurality of user-demand object co-occurrence matrices T_k according to the historical behavior information of users, for example a user-item co-occurrence matrix
Figure BDA0002809935100000115
and a user-service co-occurrence matrix
Figure BDA0002809935100000116
wherein |I| is the number of items and |S| is the number of services; if a user has bought a certain item or used a certain service, a 1 is placed at the corresponding position of the corresponding co-occurrence matrix;
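By way of non-limiting illustration, a co-occurrence matrix as described in S41 can be built from a list of (user, object) interaction records. The function name `build_cooccurrence` and the sample purchase records are hypothetical.

```python
import numpy as np

def build_cooccurrence(interactions, num_users, num_objects):
    """Return a binary |U| x |O_k| matrix with 1 where a user used an object."""
    T = np.zeros((num_users, num_objects), dtype=np.int64)
    for user, obj in interactions:
        T[user, obj] = 1            # repeated interactions still yield 1
    return T

# Hypothetical purchase log: (user index, item index); the last record is a repeat.
purchases = [(0, 0), (0, 2), (1, 2), (2, 1), (1, 2)]
T_item = build_cooccurrence(purchases, num_users=3, num_objects=4)
```

One such matrix would be built per demand object class, for example one for items and one for services.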
S42: in the scenario of actively predicting the user's k-th class demand, the UO_kU meta-path is extracted from the heterogeneous information network constructed in S2; it expresses the semantics that two users have used the same k-th class demand object. The co-occurrence matrix T_k is multiplied by its transpose (T_k)^T, i.e. T_k × (T_k)^T, to obtain the inter-user relationship matrix under this semantics:
Figure BDA0002809935100000119
The O_kUO_k meta-path is likewise extracted from the heterogeneous information network constructed in S2; it expresses the semantics that two k-th class demand objects have been used by the same user. The relationship matrix between k-th class demand objects under this semantics is obtained by (T_k)^T × T_k:
Figure BDA00028099351000001111
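By way of non-limiting illustration, the two meta-path relationship matrices of S42 are plain matrix products of the co-occurrence matrix with its transpose. The names `A_user` and `A_obj` and the sample matrix are chosen for this sketch only.

```python
import numpy as np

# Hypothetical co-occurrence matrix T_k: 3 users (rows) x 4 demand objects (columns).
T_k = np.array([
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
])

# UO_kU meta-path: entry (i, j) counts the objects used by both user i and user j.
A_user = T_k @ T_k.T

# O_kUO_k meta-path: entry (p, q) counts the users who used both object p and object q.
A_obj = T_k.T @ T_k
```

Here users 0 and 1 share one object (object 2), so `A_user[0, 1]` is 1, while the diagonal of each matrix holds the number of interactions of that user or object.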
S43: the inter-user relationship matrix obtained in S42
Figure BDA00028099351000001112
and the relationship matrix between demand objects
Figure BDA00028099351000001113
are normalized according to the following formulas, respectively:
Figure BDA00028099351000001114
Figure BDA00028099351000001115
wherein
Figure BDA00028099351000001116
and
Figure BDA00028099351000001117
are both diagonal matrices, and
Figure BDA00028099351000001118
and
Figure BDA00028099351000001119
are the degree matrices of
Figure BDA00028099351000001120
and
Figure BDA00028099351000001121
respectively;
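By way of non-limiting illustration, one common way to normalize a relationship matrix with its degree matrix is the symmetric form D^{-1/2} A D^{-1/2}. This specific form is an assumption of the sketch, since the normalization formulas of S43 appear only as images; the function name `symmetric_normalize` is likewise hypothetical.

```python
import numpy as np

def symmetric_normalize(A):
    """Return D^{-1/2} A D^{-1/2}, where D is the diagonal degree matrix of A.

    Assumed normalization form; rows with zero degree are left as zeros.
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)                     # diagonal entries of the degree matrix
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    # Scale row i by d_i^{-1/2} and column j by d_j^{-1/2}.
    return inv_sqrt[:, None] * A * inv_sqrt[None, :]

A_user = np.array([[2, 1, 0], [1, 1, 0], [0, 0, 1]])   # sample inter-user matrix
A_user_norm = symmetric_normalize(A_user)
```

With degrees 3, 2, and 1, the entry between users 0 and 1 becomes 1/sqrt(3 × 2), and the normalized matrix stays symmetric.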
S44: using a graph convolutional neural network, the user vector representation is learned according to the following formula,
Figure BDA0002809935100000121
and the vector representation of the k-th class demand object is learned according to the following formula,
Figure BDA0002809935100000122
wherein
Figure BDA0002809935100000123
are the vector representations of the users and of the k-th class demand objects at the l-th layer, respectively; when l is 0,
Figure BDA0002809935100000124
is Z and
Figure BDA0002809935100000125
is O_k; P and W are weight parameters, the product operator in the formulas denotes element-by-element multiplication, σ is an activation function, and φ denotes converting a vector into a diagonal matrix;
S45: repeating the operation in S44, alternately updating the vector representations of the users and of the k-th class demand objects until the last convolution layer is finished, to obtain the vector representations of all users and all k-th class demand objects.
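By way of non-limiting illustration, the layer-by-layer updating of S44 and S45 can be sketched with a generic graph convolution rule H^{(l+1)} = σ(Â H^{(l)} W). This generic rule is a stand-in chosen for the sketch: the formulas of the embodiment appear only as images and additionally involve the parameter P, element-by-element multiplication, and the diagonalizing operator φ, none of which is modelled here. The function names and the ReLU activation are assumptions.

```python
import numpy as np

def gcn_layer(A_norm, H, W):
    """One generic graph convolution step: ReLU(A_norm @ H @ W)."""
    return np.maximum(A_norm @ H @ W, 0.0)

def run_layers(A_user_norm, A_obj_norm, Z, O_k, user_weights, obj_weights):
    """Update user and object representations in turn, one layer at a time."""
    H_u, H_o = Z, O_k                       # layer 0 uses the initial vectors from S32
    for W_u, W_o in zip(user_weights, obj_weights):
        H_u = gcn_layer(A_user_norm, H_u, W_u)
        H_o = gcn_layer(A_obj_norm, H_o, W_o)
    return H_u, H_o

# Tiny example: 3 users, 4 objects, 2-dimensional vectors, one layer.
A_user_norm = np.eye(3)
A_obj_norm = np.eye(4)
Z = np.ones((3, 2))
O_k = -np.ones((4, 2))
H_u, H_o = run_layers(A_user_norm, A_obj_norm, Z, O_k, [np.eye(2)], [np.eye(2)])
```

In this simplified stand-in the user and object updates do not depend on each other; in the embodiment the two updates are described as alternating.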
S5: aggregating the neighbor information of the target user according to the user social relationships in the heterogeneous information network to obtain the representation vector of the target user and perform demand prediction; specifically comprising:
S51: the vector representation of target user i obtained in S4 is
Figure BDA0002809935100000126
and the neighbor users of user i are j ∈ N(i); the neighbor user information is aggregated with an attention mechanism to obtain the final vector representation of the target user. The weight coefficient of each neighbor with respect to the target user is calculated as
Figure BDA0002809935100000127
and the vector representation of the target user is updated as
Figure BDA0002809935100000128
wherein α and W are weight parameters, σ is an activation function, and ‖ is the concatenation operation.
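By way of non-limiting illustration, the attention-based aggregation of S51 can be sketched in the style of a graph attention layer: each neighbor is scored from the concatenation of the transformed target and neighbor vectors, the scores are turned into weights by a softmax, and the weighted neighbor vectors are summed. The scoring vector `a`, the LeakyReLU slope of 0.2, the tanh output activation, and the function name are assumptions of this sketch; the exact formulas of the embodiment appear only as images.

```python
import numpy as np

def attention_aggregate(h_i, neighbors, W, a):
    """Aggregate neighbor vectors into an updated target-user vector.

    h_i: target user vector, shape (d,)
    neighbors: neighbor user vectors, shape (n, d), with n >= 1
    W: shared linear map, shape (d_out, d); a: scoring vector, shape (2 * d_out,)
    """
    Wh_i = W @ h_i
    Wh_n = neighbors @ W.T                                   # (n, d_out)
    # Concatenate the transformed target vector with each transformed neighbor.
    concat = np.concatenate([np.tile(Wh_i, (len(Wh_n), 1)), Wh_n], axis=1)
    scores = concat @ a
    scores = np.where(scores > 0, scores, 0.2 * scores)      # LeakyReLU (assumed)
    scores = scores - scores.max()                           # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()            # softmax over neighbors
    h_new = np.tanh(alpha @ Wh_n)                            # weighted sum, then activation
    return h_new, alpha

h_i = np.array([1.0, 0.0])
neighbors = np.array([[0.0, 1.0], [0.0, 1.0]])
h_new, alpha = attention_aggregate(h_i, neighbors, np.eye(2), np.ones(4))
```

Because the two sample neighbors are identical, they receive equal weights of 0.5 each.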
S52: for each target user, calculating its relevance prediction score to each kth class demand object
Figure BDA0002809935100000129
Figure BDA00028099351000001210
S53: the loss function is a binary cross entropy function:
Figure BDA00028099351000001211
wherein Y and Y^- are the positive and negative examples in the data set: Y is the set of demand objects the user has used, and Y^- is sampled from the demand objects in the data set that the user has not used;
Figure BDA00028099351000001212
indicates whether the user has interacted with the demand object: when an interaction exists,
Figure BDA00028099351000001213
is 1, and otherwise it is 0. The loss function is minimized by stochastic gradient descent, the k-th class demand objects are sorted from high to low by the prediction score calculated in S52, and the first n demand objects are selected as the user's k-th class demand list.
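By way of non-limiting illustration, the scoring, loss, and ranking of S52 and S53 can be sketched as follows. The score is assumed here to be the sigmoid of the inner product between the user vector and each demand object vector, and the loss is averaged over the samples; both are assumptions of the sketch, since the corresponding formulas appear only as images. All function names and sample values are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(y_true, y_pred, eps=1e-12):
    """Binary cross entropy averaged over positive and sampled negative examples."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)     # avoid log(0)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))

def top_n(scores, n):
    """Indices of the n highest-scoring demand objects, best first."""
    order = np.argsort(-scores, kind="stable")
    return order[:n].tolist()

user_vec = np.array([1.0, 0.0])
object_vecs = np.array([[2.0, 0.0], [-1.0, 0.0], [0.5, 0.0], [0.0, 3.0]])
scores = sigmoid(object_vecs @ user_vec)         # logits 2, -1, 0.5, 0
labels = np.array([1.0, 0.0, 1.0, 0.0])          # 1 = used by the user, 0 = sampled negative

loss = bce_loss(labels, scores)
demand_list = top_n(scores, n=2)                 # objects 0 and 2 score highest
```

In training, the loss would be minimized with stochastic gradient descent over the model parameters; the sketch only evaluates it for fixed vectors.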
S54: by repeating the operations of S42-S53, a list of all the categories of demand objects for each user can be obtained, thereby realizing active prediction of user demands.
This embodiment also provides a crowdsourcing-based user demand active prediction system, which comprises:
A crowdsourcing task issuing module: determining the annotators participating in the crowdsourcing task, designing the crowdsourcing task and issuing it to a crowdsourcing task platform, the annotators receiving and completing the crowdsourcing task; specifically:
S11: acquiring a user set of a service provider as a target user, acquiring user social relationship data from Twitter, a service provider platform and the like to obtain a social neighbor user set of the target user, and taking all users as annotators for receiving crowdsourcing tasks;
S12: designing a user preference survey questionnaire from the aspects of demographic information, social requirements reflecting interests shared within social relationships, enjoyment requirements reflecting personal preferences, and the like. The questionnaire contains selection questions expressed in words and displayed graphically (for example, selecting the content of interest from the options given), and annotators are allowed to submit auxiliary information such as text, videos, and pictures on their own initiative. The questionnaire is published to the crowdsourcing platform in the form of a crowdsourcing task, and the task is issued to the annotators obtained in S11.
Heterogeneous information network construction module: constructing a heterogeneous information network according to the user social relationships, the historical behavior data, and the user preference information acquired through crowdsourcing; specifically:
S21: taking the multi-modal data collected through crowdsourcing in S1 as the attribute set of each user (such as age, gender, and favorite movie posters) and as the attributes of each demand object (such as manufacturer and market launch date). Users and demand objects (including goods, services, etc.) are taken as nodes.
S22: establishing a connecting edge between the user and the demand object according to the following relation:
Relationship R1: direct relationships such as friendship and following exist among users U; L and L^{-1} are used to represent the relationships between users U, i.e.
Figure BDA0002809935100000131
and
Figure BDA0002809935100000132
Relationship R2: some users have historical behavior information, for example a user has bought a certain item or used a certain service; B and B^{-1} are used to represent the relationships between a user U and a demand object O_k, i.e.
Figure BDA0002809935100000133
and
Figure BDA0002809935100000134
wherein k denotes the k-th class of demand object;
S23: establishing a multi-modal heterogeneous information network according to the attribute sets, the nodes, and the relationships among the nodes.
User demand data space generation module: uniformly representing the different types of entities in the heterogeneous information network to generate a user demand data space; specifically:
S31: user information and attribute information of a demand object collected based on a crowdsourcing mode generally have different expression forms, including types such as texts and pictures, and need to be uniformly expressed by adopting different expression learning methods according to different modalities:
obtaining vector representation of text type information by adopting word2vec method
Figure BDA0002809935100000141
wherein e_u is the text attribute vector representation of a user, e_o is the text attribute vector representation of a demand object, and N is the number of demand object categories;
vector representations of picture-type information are obtained with a convolutional neural network:
Figure BDA0002809935100000142
wherein g_u is the picture attribute vector representation of a user and g_o is the picture attribute vector representation of a demand object;
S32: in order to learn the embedded representation of the nodes, the uniformly represented multi-modal attribute information needs to be fused. An outer product operation is performed on the user attribute vectors e_u and g_u obtained in S31 to realize feature crossing, the resulting matrix is flattened row by row, and the flattened vector is input into a multilayer perceptron to obtain the initial vector representation Z of the user nodes. The same operation is repeated on the demand object attribute representation vectors
Figure BDA0002809935100000143
and
Figure BDA0002809935100000144
to obtain the initial vector representation O_k of the demand object nodes. The vector representations of all users and demand objects form the user demand data space.
User and demand object representation vector learning module: extracting the interaction semantics between users and demand objects by using meta-paths in the heterogeneous information network, and learning the representation vectors of users and demand objects respectively; specifically:
S41: establishing a plurality of user-demand object co-occurrence matrices T_k according to the historical behavior information of users, for example a user-item co-occurrence matrix
Figure BDA0002809935100000145
and a user-service co-occurrence matrix
Figure BDA0002809935100000146
wherein |I| is the number of items and |S| is the number of services; if a user has bought a certain item or used a certain service, a 1 is placed at the corresponding position of the corresponding co-occurrence matrix;
S42: in the scenario of actively predicting the user's k-th class demand, the UO_kU meta-path is extracted from the heterogeneous information network constructed in S2; it expresses the semantics that two users have used the same k-th class demand object. The co-occurrence matrix T_k is multiplied by its transpose (T_k)^T, i.e. T_k × (T_k)^T, to obtain the inter-user relationship matrix under this semantics:
Figure BDA0002809935100000153
The O_kUO_k meta-path is likewise extracted from the heterogeneous information network constructed in S2; it expresses the semantics that two k-th class demand objects have been used by the same user. The relationship matrix between k-th class demand objects under this semantics is obtained by (T_k)^T × T_k:
Figure BDA0002809935100000155
S43: the inter-user relationship matrix obtained in S42
Figure BDA0002809935100000156
and the relationship matrix between demand objects
Figure BDA0002809935100000157
are normalized according to the following formulas, respectively:
Figure BDA0002809935100000158
Figure BDA0002809935100000159
wherein
Figure BDA00028099351000001510
and
Figure BDA00028099351000001511
are both diagonal matrices, and
Figure BDA00028099351000001512
and
Figure BDA00028099351000001513
are the degree matrices of
Figure BDA00028099351000001514
and
Figure BDA00028099351000001515
respectively;
S44: using a graph convolutional neural network, the user vector representation is learned according to the following formula,
Figure BDA00028099351000001516
and the vector representation of the k-th class demand object is learned according to the following formula,
Figure BDA00028099351000001517
wherein
Figure BDA00028099351000001518
are the vector representations of the users and of the k-th class demand objects at the l-th layer, respectively; when l is 0,
Figure BDA00028099351000001519
is Z and
Figure BDA00028099351000001520
is O_k; P and W are weight parameters, the product operator in the formulas denotes element-by-element multiplication, σ is an activation function, and φ denotes converting a vector into a diagonal matrix;
S45: repeating the operation in S44, alternately updating the vector representations of the users and of the k-th class demand objects until the last convolution layer is finished, to obtain the vector representations of all users and all k-th class demand objects.
Demand prediction module: aggregating the neighbor information of the target user according to the user social relationships in the heterogeneous information network to obtain the representation vector of the target user and perform demand prediction; specifically:
S51: the vector representation of target user i obtained by the user and demand object representation vector learning module is
Figure BDA00028099351000001521
and the neighbor users of user i are j ∈ N(i); the neighbor user information is aggregated with an attention mechanism to obtain the final vector representation of the target user. The weight coefficient of each neighbor with respect to the target user is calculated as
Figure BDA00028099351000001522
and the vector representation of the target user is updated as
Figure BDA00028099351000001523
wherein α and W are weight parameters, σ is an activation function, and ‖ is the concatenation operation;
S52: for each target user, calculating the relevance prediction score between the target user and each k-th class demand object
Figure BDA0002809935100000161
Figure BDA0002809935100000162
S53: the loss function is a binary cross entropy function:
Figure BDA0002809935100000163
wherein Y and Y^- are the positive and negative examples in the data set: Y is the set of demand objects the user has used, and Y^- is sampled from the demand objects in the data set that the user has not used;
Figure BDA0002809935100000164
indicates whether the user has interacted with the demand object: when an interaction exists,
Figure BDA0002809935100000165
is 1, and otherwise it is 0; the loss function is minimized by stochastic gradient descent, the k-th class demand objects are sorted from high to low by the prediction score calculated in S52, and the first n demand objects are selected as the user's k-th class demand list;
S54: by repeating the operations of S42-S53, a list of demand objects of every category can be obtained for each user, thereby realizing the active prediction of user demand.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A user demand active prediction method based on crowdsourcing, characterized by comprising the following steps:
S1: determining the annotators participating in the crowdsourcing task, designing the crowdsourcing task and issuing it to a crowdsourcing task platform, the annotators receiving and completing the crowdsourcing task; specifically comprising:
S11: acquiring a user set of a service provider as target users, acquiring user social relationship data from a social network and a service provider platform to obtain a social neighbor user set of the target users, and taking all of these users as the annotators who receive the crowdsourcing task;
S12: designing a user preference survey questionnaire from the perspectives of demographic information, social requirements reflecting interests shared within social relationships, and enjoyment requirements reflecting personal preferences, wherein the content of the questionnaire comprises selection questions expressed in words and displayed graphically, and annotators are allowed to submit auxiliary information on their own initiative; publishing the questionnaire to a crowdsourcing platform in the form of a crowdsourcing task, and issuing the task to the annotators obtained in S11;
S2: constructing a heterogeneous information network according to the user social relationships, the historical behavior data, and the user preference information acquired through crowdsourcing; specifically comprising:
S21: taking the multi-modal data collected through crowdsourcing in S1 as the attribute set of each user, and taking users and demand objects as nodes;
S22: establishing connecting edges between users and demand objects according to the following relationships:
Relationship R1: direct relationships of friendship and following exist among users U; L and L^{-1} are used to represent the relationships between users U, i.e.
Figure FDA0004073731800000011
and
Figure FDA0004073731800000012
Relationship R2: some users have historical behavior information; B and B^{-1} are used to represent the relationships between a user U and a demand object O_k, i.e.
Figure FDA0004073731800000013
and
Figure FDA0004073731800000014
wherein k denotes the k-th class of demand object;
S23: establishing a multi-modal heterogeneous information network according to the attribute sets, the nodes, and the relationships among the nodes;
S3: uniformly representing the different types of entities in the heterogeneous information network to generate a user demand data space;
S4: extracting the interaction semantics between users and demand objects by using meta-paths in the heterogeneous information network, and learning the representation vectors of users and demand objects respectively;
S5: aggregating the neighbor information of the target user according to the user social relationships in the heterogeneous information network to obtain the representation vector of the target user and perform demand prediction.
2. The user demand active prediction method based on crowdsourcing of claim 1, wherein the step S3 includes:
S31: uniformly representing the user information and the text attribute information and picture attribute information of demand objects collected through crowdsourcing:
vector representations of text-type information are obtained with the word2vec method:
Figure FDA0004073731800000021
wherein e_u is the text attribute vector representation of a user, e_o is the text attribute vector representation of a demand object, and N is the number of demand object categories;
Figure FDA0004073731800000022
is the attribute representation vector of a demand object;
vector representations of picture-type information are obtained with a convolutional neural network:
Figure FDA0004073731800000023
wherein g_u is the picture attribute vector representation of a user and g_o is the picture attribute vector representation of a demand object;
Figure FDA0004073731800000024
is the attribute representation vector of a demand object;
S32: fusing the uniformly represented multi-modal attribute information: an outer product operation is performed on the user attribute vectors e_u and g_u obtained in S31 to realize feature crossing, the resulting matrix is flattened row by row, and the flattened vector is input into a multilayer perceptron to obtain the initial vector representation Z of the user nodes; the same operation is repeated on the demand object attribute representation vectors
Figure FDA0004073731800000025
and
Figure FDA0004073731800000026
to obtain the initial vector representation O_k of the demand object nodes; the vector representations of all users and demand objects form the user demand data space.
3. The user demand active prediction method based on crowdsourcing of claim 1, wherein the step S4 includes:
S41: establishing a plurality of user-demand object co-occurrence matrices T_k according to the historical behavior information of users: a user-item co-occurrence matrix
Figure FDA0004073731800000027
and a user-service co-occurrence matrix
Figure FDA0004073731800000028
wherein |I| is the number of items and |S| is the number of services; if a user has bought a certain item or used a certain service, a 1 is placed at the corresponding position of the corresponding co-occurrence matrix;
S42: in the scenario of actively predicting the user's k-th class demand, extracting the UO_kU meta-path from the heterogeneous information network constructed in S2, which expresses the semantics that two users have used the same k-th class demand object, and multiplying the co-occurrence matrix T_k by its transpose (T_k)^T, i.e. T_k × (T_k)^T, to obtain the inter-user relationship matrix under this semantics
Figure FDA0004073731800000029
extracting the O_kUO_k meta-path from the heterogeneous information network constructed in S2, which expresses the semantics that two k-th class demand objects have been used by the same user, and obtaining the relationship matrix between k-th class demand objects under this semantics by (T_k)^T × T_k
Figure FDA0004073731800000031
S43: the inter-user relationship matrix obtained in S42
Figure FDA0004073731800000032
and the relationship matrix between demand objects
Figure FDA0004073731800000033
are normalized according to the following formulas, respectively:
Figure FDA0004073731800000034
Figure FDA0004073731800000035
wherein
Figure FDA0004073731800000036
and
Figure FDA0004073731800000037
are both diagonal matrices, and
Figure FDA0004073731800000038
and
Figure FDA0004073731800000039
are the degree matrices of
Figure FDA00040737318000000310
and
Figure FDA00040737318000000311
respectively;
S44: using a graph convolutional neural network, the user vector representation is learned according to the following formula,
Figure FDA00040737318000000312
and the vector representation of the k-th class demand object is learned according to the following formula,
Figure FDA00040737318000000313
wherein
Figure FDA00040737318000000314
are the vector representations of the users and of the k-th class demand objects at the l-th layer, respectively; when l is 0,
Figure FDA00040737318000000315
is Z and
Figure FDA00040737318000000316
is O_k; P and W are weight parameters, the product operator in the formulas denotes element-by-element multiplication, σ is an activation function, and φ denotes converting a vector into a diagonal matrix;
S45: repeating the operation in S44, alternately updating the vector representations of the users and of the k-th class demand objects until the last convolution layer is finished, to obtain the vector representations of all users and all k-th class demand objects.
4. The user demand active prediction method based on crowdsourcing of claim 3, wherein the step S5 includes:
S51: the vector representation of target user i obtained in S4 is
Figure FDA00040737318000000317
and the neighbor users of user i are j ∈ N(i); the neighbor user information is aggregated with an attention mechanism to obtain the final vector representation of the target user; the weight coefficient of each neighbor with respect to the target user is calculated as
Figure FDA00040737318000000318
and the vector representation of the target user is updated as
Figure FDA00040737318000000319
wherein α and W are weight parameters, σ is an activation function, and ‖ is the concatenation operation;
S52: for each target user, calculating the relevance prediction score between the target user and each k-th class demand object
Figure FDA0004073731800000041
Figure FDA0004073731800000042
S53: the loss function is a binary cross entropy function:
Figure FDA0004073731800000043
wherein Y and Y^- are the positive and negative examples in the data set: Y is the set of demand objects the user has used, and Y^- is sampled from the demand objects in the data set that the user has not used;
Figure FDA0004073731800000044
indicates whether the user has interacted with the demand object: when an interaction exists,
Figure FDA0004073731800000045
is 1, and otherwise it is 0; the loss function is minimized by stochastic gradient descent, the k-th class demand objects are sorted from high to low by the prediction score calculated in S52, and the first n demand objects are selected as the user's k-th class demand list;
S54: by repeating the operations of S42-S53, a list of demand objects of every category can be obtained for each user, thereby realizing the active prediction of user demand.
5. A crowdsourcing-based active user demand prediction system, characterized by comprising:
a crowdsourcing task issuing module: determining the annotators participating in the crowdsourcing task, designing the crowdsourcing task and issuing it to a crowdsourcing platform, where the annotators receive and complete the task;
the specific execution process of the crowdsourcing task issuing module is as follows:
S11: acquiring the user set of a service provider as the target users, acquiring user social relationship data from social networks and the service provider platform to obtain the set of social neighbor users of the target users, and taking all of these users as the annotators who receive the crowdsourcing task;
S12: designing a user preference questionnaire covering demographic information, social needs that reflect common interests within social relationships, and enjoyment needs that reflect individual preferences, wherein the questionnaire contains multiple-choice questions presented as text and as graphics, and annotators may additionally submit auxiliary information of their own; the questionnaire is issued to the crowdsourcing platform as a crowdsourcing task, and the task is distributed to the annotators obtained in S11;
a heterogeneous information network construction module: constructing a heterogeneous information network from the user social relationships, the historical behavior data and the user preference information acquired through crowdsourcing;
the execution process of the heterogeneous information network construction module is as follows:
S21: taking the multimodal data collected through crowdsourcing as the attribute set of each user, and taking the users and the demand objects as nodes;
S22: establishing edges among users and demand objects according to the following relationships:
relationship R1: direct relationships such as friendship and following exist among users U; L and L^-1 are used to represent the relationships between users U, i.e.
Figure FDA0004073731800000051
and
Figure FDA0004073731800000052
relationship R2: some users have historical behavior information, for example having bought a certain item or used a certain service; B and B^-1 are used to represent the relationships between users U and demand objects O_k, i.e.
Figure FDA0004073731800000053
and
Figure FDA0004073731800000054
where k denotes the kth class of demand objects;
S23: building a multimodal heterogeneous information network from the attribute sets, the nodes and the relationships among the nodes;
a user demand data space generation module: uniformly representing the different types of entities in the heterogeneous information network to generate a user demand data space;
a user and demand object representation vector learning module: extracting the interaction semantics between users and demand objects using meta-paths in the heterogeneous information network, and learning the representation vectors of the users and the demand objects respectively;
a demand prediction module: aggregating the neighbor information of the target user according to the user social relationships in the heterogeneous information network to obtain the representation vector of the target user, and performing demand prediction.
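As a rough illustration of steps S21 to S23, the sketch below stores the heterogeneous information network as typed adjacency sets, with one forward and one inverse edge per relationship (L and L^-1 for R1, B and B^-1 for R2). The function name `build_hin`, the tuple layouts and the relation labels are hypothetical choices for this example; the claim does not prescribe a storage format.

```python
from collections import defaultdict

def build_hin(social_edges, behavior_edges):
    """Build typed adjacency sets keyed by (relation, source node).

    social_edges: iterable of (user, user) pairs (relationship R1).
    behavior_edges: iterable of (user, object_class, object_id) triples (relationship R2).
    """
    adj = defaultdict(set)
    for u, v in social_edges:
        adj[("L", u)].add(v)            # forward social relation
        adj[("L^-1", v)].add(u)         # inverse social relation
    for u, k, obj in behavior_edges:
        node = (k, obj)                 # demand object node of class k
        adj[(f"B_{k}", u)].add(node)        # user -> demand object
        adj[(f"B_{k}^-1", node)].add(u)     # demand object -> user
    return adj

# Hypothetical data: two users, one item, one service
hin = build_hin(
    social_edges=[("alice", "bob")],
    behavior_edges=[("alice", "item", 7), ("bob", "item", 7), ("bob", "service", 3)],
)
```

Keeping the inverse relations explicit makes meta-paths such as user, demand object, user easy to walk: follow `B_item` from a user, then `B_item^-1` from the object reached.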
6. The crowdsourcing-based active user demand prediction system of claim 5, wherein the execution process of the user demand data space generation module comprises:
S31: uniformly representing the text attribute information and image attribute information of the users and the demand objects collected through crowdsourcing:
vector representations of text-type information are obtained with the word2vec method:
Figure FDA0004073731800000055
where e_u is the text attribute vector representation of a user, e_o is the text attribute vector representation of a demand object, and N is the number of demand object categories;
vector representations of picture-type information are obtained with a convolutional neural network:
Figure FDA0004073731800000056
where g_u is the picture attribute vector representation of a user and g_o is the picture attribute vector representation of a demand object;
S32: fusing the uniformly represented multimodal attribute information: an outer product is taken of the user attribute vectors e_u and g_u obtained in S31 to realize feature crossing, the resulting matrix is flattened row by row and fed into a multilayer perceptron to obtain the initial vector representation Z of the user node; for the demand object attribute vectors
Figure FDA0004073731800000061
and
Figure FDA0004073731800000062
the same operation is repeated to obtain the initial vector representation O_k of the demand object node; the vector representations of all users and demand objects form the user demand data space.
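The fusion in S32 (outer product, row-wise flattening, multilayer perceptron) can be sketched as follows. The two-layer perceptron, the ReLU activation and the parameter names `W1`, `b1`, `W2`, `b2` are assumptions for illustration; the claim does not state the perceptron's depth or activation.

```python
import numpy as np

def fuse_attributes(e, g, W1, b1, W2, b2):
    """Fuse a text vector e and a picture vector g into one node vector."""
    cross = np.outer(e, g)                 # feature crossing: shape (len(e), len(g))
    flat = cross.reshape(-1)               # flatten row by row (row-major order)
    hidden = np.maximum(0.0, W1 @ flat + b1)   # hidden layer with ReLU (assumed)
    return W2 @ hidden + b2                # initial node representation

# Toy example: the outer product of [1, 2] and [3, 4, 5] flattens to [3, 4, 5, 6, 8, 10]
e_u = np.array([1.0, 2.0])
g_u = np.array([3.0, 4.0, 5.0])
Z = fuse_attributes(e_u, g_u, np.eye(6), np.zeros(6), np.ones((1, 6)), np.zeros(1))
```

With an identity first layer and an all-ones second layer, the toy output is simply the sum of the six crossed features, which makes the flattening order easy to check by hand.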
7. The crowdsourcing-based active user demand prediction system of claim 6, wherein the execution process of the user and demand object representation vector learning module comprises:
S41: establishing multiple user-demand object co-occurrence matrices T_k from the historical behavior information of the users: the user-item co-occurrence matrix
Figure FDA0004073731800000063
and the user-service co-occurrence matrix
Figure FDA0004073731800000064
where |I| is the number of items and |S| is the number of services; if a user has bought a certain item or used a certain service, a 1 is placed at the corresponding position of the corresponding co-occurrence matrix;
S42: in the scenario of actively predicting the user's kth-class demand, extracting the meta-path UO_kU from the constructed heterogeneous information network, which carries the semantics that two users have used the same kth-class demand object; multiplying the co-occurrence matrix T_k by its transpose (T_k)^T, i.e. T_k × (T_k)^T, gives the relationship matrix between users under this semantics
Figure FDA0004073731800000065
extracting the meta-path O_kUO_k from the constructed heterogeneous information network, which carries the semantics that two kth-class demand objects have been used by the same user; (T_k)^T × T_k gives the relationship matrix between kth-class demand objects under this semantics
Figure FDA0004073731800000066
S43: the relationship matrix between users obtained in S42
Figure FDA0004073731800000067
and the relationship matrix between demand objects
Figure FDA0004073731800000068
are normalized according to the following formulas, respectively:
Figure FDA0004073731800000069
Figure FDA00040737318000000610
where
Figure FDA00040737318000000611
and
Figure FDA00040737318000000612
are both diagonal matrices;
Figure FDA00040737318000000613
and
Figure FDA00040737318000000614
are the degree matrices of
Figure FDA00040737318000000615
and
Figure FDA00040737318000000616
respectively;
S44: using a graph convolutional neural network, the user vector representation is learned according to the following formula:
Figure FDA00040737318000000617
and the vector representation of the kth-class demand objects is learned according to the following formula:
Figure FDA00040737318000000618
where
Figure FDA0004073731800000071
are the vector representations of the users and of the kth-class demand objects at layer l, respectively; when l is 0,
Figure FDA0004073731800000072
is Z and
Figure FDA0004073731800000073
is O_k; P and W are weight parameters, the multiplication symbol in the formulas denotes element-wise multiplication, σ is an activation function, and φ denotes converting a vector into a diagonal matrix;
S45: repeating the operation in S44, alternately updating the vector representations of the users and of the kth-class demand objects until the last convolution layer is finished, to obtain the vector representations of all users and all kth-class demand objects.
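Steps S41 to S43 can be sketched with a small co-occurrence matrix. The products T_k × (T_k)^T and (T_k)^T × T_k follow the claim text directly. The normalization formula survives only as a formula image, so the sketch assumes the symmetric form D^(-1/2) A D^(-1/2) that is common in graph convolution, with D the diagonal degree matrix; the function names and the toy data are hypothetical.

```python
import numpy as np

def metapath_matrices(T):
    """Return the user-user (UO_kU) and object-object (O_kUO_k) relation matrices."""
    T = np.asarray(T, dtype=float)
    return T @ T.T, T.T @ T

def sym_normalize(A):
    """Symmetric degree normalization D^(-1/2) A D^(-1/2) (assumed form)."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)                    # diagonal of the degree matrix
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5   # isolated nodes keep a zero row
    return inv_sqrt[:, None] * A * inv_sqrt[None, :]

# Toy user-item co-occurrence matrix: 3 users (rows), 3 items (columns)
T_item = np.array([[1, 1, 0],
                   [1, 0, 0],
                   [0, 1, 1]])
A_user, A_item = metapath_matrices(T_item)
A_user_norm = sym_normalize(A_user)
```

Entry (i, j) of `A_user` counts the items that users i and j have both used, and the diagonal counts each user's own items. Under the assumed normalization these counts are rescaled by the degrees of both endpoints, so heavy users do not dominate the convolution in S44.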
8. The crowdsourcing-based active user demand prediction system of claim 7, wherein the execution process of the demand prediction module comprises:
S51: the vector representation of the target user i obtained by the user and demand object representation vector learning module is
Figure FDA0004073731800000074
where each neighbor user j of user i belongs to N(i); neighbor user information is aggregated with an attention mechanism to obtain the final vector representation of the target user. The weight coefficient of a neighbor with respect to the target user is calculated as
Figure FDA0004073731800000075
and the vector representation of the target user is updated as
Figure FDA0004073731800000076
where α and W are weight parameters, σ is an activation function, and || denotes the concatenation operation;
S52: for each target user, calculating the relevance prediction score between the user and each kth-class demand object:
Figure FDA0004073731800000077
Figure FDA0004073731800000078
S53: the loss function is a binary cross entropy function:
Figure FDA0004073731800000079
where Y and Y^- are the positive and negative examples in the data set, respectively: Y is the set of demand objects the user has used, and Y^- is sampled from the demand objects in the data set that the user has not used;
Figure FDA00040737318000000710
indicates whether the user has interacted with the demand object: when there is an interaction,
Figure FDA00040737318000000711
is 1, and otherwise it is 0. The loss function is optimized with stochastic gradient descent; the kth-class demand objects are then sorted from high to low by the prediction score calculated in step S52, and the first n demand objects are selected as the user's kth-class demand list;
S54: repeating the operations of S42 to S53 yields, for each user, the demand lists for all categories of demand objects, thereby realizing active prediction of user demands.
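The loss and ranking in S53 can be sketched as below. The claim names a binary cross-entropy over the positive set Y and the sampled negative set Y^-, but the exact formula is a formula image here, so two details are assumptions: raw scores are mapped to probabilities with a sigmoid, and the loss is averaged rather than summed. The function names `bce_loss` and `top_n` are hypothetical.

```python
import numpy as np

def bce_loss(scores, labels, eps=1e-12):
    """Binary cross-entropy over positive (label 1) and sampled negative (label 0) pairs."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    p = 1.0 / (1.0 + np.exp(-scores))      # sigmoid turns scores into probabilities
    p = np.clip(p, eps, 1.0 - eps)         # guard against log(0)
    return float(-np.mean(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p)))

def top_n(scores, n):
    """Indices of the n highest-scoring demand objects, best first."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    return order[:n].tolist()

# Toy scores of one user for three kth-class demand objects
demand_list = top_n([0.1, 0.9, 0.5], n=2)
```

In practice the loss would be minimized with stochastic gradient descent in an automatic differentiation framework; the NumPy version only shows what is being minimized and how the final list is cut from the sorted scores.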
CN202011387991.5A 2020-12-01 2020-12-01 User demand active prediction method and system based on crowdsourcing Active CN112508256B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011387991.5A CN112508256B (en) 2020-12-01 2020-12-01 User demand active prediction method and system based on crowdsourcing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011387991.5A CN112508256B (en) 2020-12-01 2020-12-01 User demand active prediction method and system based on crowdsourcing

Publications (2)

Publication Number Publication Date
CN112508256A CN112508256A (en) 2021-03-16
CN112508256B true CN112508256B (en) 2023-04-14

Family

ID=74969199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011387991.5A Active CN112508256B (en) 2020-12-01 2020-12-01 User demand active prediction method and system based on crowdsourcing

Country Status (1)

Country Link
CN (1) CN112508256B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361928B (en) * 2021-06-07 2023-08-25 南京大学 Crowd-sourced task recommendation method based on heterogram attention network
CN113378051B (en) * 2021-06-16 2024-03-22 南京大学 User-task association crowdsourcing task recommendation method based on graph neural network
CN113393056B (en) * 2021-07-08 2022-11-25 山东大学 Crowdsourcing service supply and demand gap prediction method and system based on time sequence
CN114445043B (en) * 2022-01-26 2022-12-16 安徽大学 Open ecological cloud ERP-based heterogeneous graph user demand accurate discovery method and system
CN114470790A (en) * 2022-02-09 2022-05-13 腾讯科技(深圳)有限公司 Virtual resource processing method, device, equipment, computer program and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191081A (en) * 2019-12-17 2020-05-22 安徽大学 Developer recommendation method and device based on heterogeneous information network
CN111626616A (en) * 2020-05-27 2020-09-04 深圳莫比嗨客数据智能科技有限公司 Crowdsourcing task recommendation method
CN111881342A (en) * 2020-06-23 2020-11-03 北京工业大学 Recommendation method based on graph twin network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11463472B2 (en) * 2018-10-24 2022-10-04 Nec Corporation Unknown malicious program behavior detection using a graph neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
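"Heterogeneous Information Network Embedding with Convolutional Graph Attention Networks"; Meng Cao et al.; 2020 International Joint Conference on Neural Networks (IJCNN); IEEE; 20200928; vol. 2020; pp. 1-8 *
"Research and Implementation of Recommendation Algorithms Based on Heterogeneous Information Network Representation Learning" (基于异质信息网络表示学习的推荐算法研究与实现); Hu Binbin; China Masters' Theses Full-text Database, Information Science and Technology; 20190815; vol. 2019 (no. 08); pp. I138-1361 *
"Heterogeneous Network Representation Learning Based on Graph Convolution with Fused Meta-paths" (基于融合元路径图卷积的异质网络表示学习); Jiang Zongli et al.; Computer Science (计算机科学); 20200731; vol. 47 (no. 07); pp. 231-235 *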

Also Published As

Publication number Publication date
CN112508256A (en) 2021-03-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant