CN113312514A - Grouping method, device, equipment and medium combining Deepwalk and community discovery technology - Google Patents

Info

Publication number
CN113312514A
Authority
CN
China
Prior art keywords
user
video
node
sub
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110868902.7A
Other languages
Chinese (zh)
Other versions
CN113312514B (en)
Inventor
许丹 (Xu Dan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202110868902.7A priority Critical patent/CN113312514B/en
Publication of CN113312514A publication Critical patent/CN113312514A/en
Application granted granted Critical
Publication of CN113312514B publication Critical patent/CN113312514B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/735 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance
    • G06Q50/2057 Career enhancement or continuing education service

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of artificial intelligence and provides a grouping method, device, equipment and medium combining Deepwalk and community discovery technologies. An improved Deepwalk algorithm performs random walks on a user-video relation graph, so that each resulting user vector carries both user information and video information, and relationships between users are established through videos. In the process of building the relation graph and walking, videos and users form a context relationship similar to that of sentences, so that the context contains not only the video watching habits of the users but also the relationships between users established from video watching behavior. Combining the improved Deepwalk algorithm with community discovery technology realizes the automatic generation of online learning groups, with efficient and more accurate division. Meanwhile, in each established learning group, the strong association between members lets them communicate more effectively, improving the learning effect. The invention also relates to blockchain technology: the learning groups may be stored on blockchain nodes.

Description

Grouping method, device, equipment and medium combining Deepwalk and community discovery technology
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a grouping method, a grouping device, grouping equipment and a grouping medium combining Deepwalk and community discovery technologies.
Background
The various stages of enterprise employee training are gradually transitioning from offline to online. The differences in learning experience caused by this change of form are becoming increasingly apparent. For example, online learning resources are far richer than those of offline courses, so students have the means to cultivate diverse interests and to find development paths better suited to their own characteristics.
However, interaction in online learning relies mainly on messages and bullet-screen comments (barrage) between the instructor and the audience, and interaction among students is difficult. Meanwhile, students differ greatly in learning interests and habits, and effective learning discussion requires a reasonably similar learning background. Without the common ground naturally shared in the offline learning scene, such as the same course or region, communication among online students becomes more difficult. Looking back from online communication, weaknesses in communication among students in offline courses also become visible. For example, the membership of an offline class stays fixed for long periods with little turnover, so it is difficult to widen the range of communication; the diversity of an offline class is also far lower than that of online students, and long-term co-located teaching to some extent reduces the chance for students to develop diverse interests. Besides instruction by teachers, discussion among students is a very important part of learning and consolidating knowledge: by complementing each other's weaknesses and sharing resources, students can improve collective learning efficiency to everyone's benefit.
However, in the prior art, students in online training generally need to be grouped by means of questionnaires and manually labeled models. This takes a long time, lacks timeliness, and has low accuracy, so how to group students quickly and accurately has become an urgent problem in the field of online training.
Disclosure of Invention
The embodiment of the invention provides a grouping method, a grouping device, grouping equipment and a grouping medium combining Deepwalk and community discovery technologies. By combining an improved Deepwalk algorithm with a community discovery technology, online learning groups can be generated automatically, with efficient and accurate division. Meanwhile, in each established learning group, the strong association among members lets them communicate more effectively, improving the learning effect.
In a first aspect, an embodiment of the present invention provides a grouping method combining Deepwalk and community discovery technologies, including:
acquiring watching data of a user on a training video, and constructing a user-video relation graph according to the watching data;
adopting an improved Deepwalk algorithm to carry out random walk on the user-video relation graph to generate at least one path sequence;
vectorizing the at least one path sequence to obtain at least one path vector;
identifying a user vector from the at least one path vector;
calculating the similarity between every two users according to the user vector;
constructing a user relation graph according to the similarity between every two users;
and adopting a community discovery technology to divide the user groups based on the user relation graph to obtain at least one learning group.
According to a preferred embodiment of the present invention, the constructing the user-video relationship graph according to the viewing data includes:
when a record that a user watches a video is detected in the watching data, connecting the detected user with the corresponding video to obtain an initial relation graph comprising at least one sub-connection, wherein each sub-connection comprises a user and a video;
acquiring a pre-configured scoring algorithm;
calculating a score for each sub-connection based on the scoring algorithm by acquiring data from the viewing data;
sorting the sub-connections corresponding to each user according to the order of the scores from high to low;
and deleting from the initial relationship graph the sub-connections ranked after a preset position to obtain the user-video relationship graph.
According to a preferred embodiment of the present invention, the randomly walking on the user-video relationship graph by using the improved Deepwalk algorithm, and generating at least one path sequence includes:
in the user-video relation graph, normalizing the edge formed by each node and the adjacent node to obtain the wandering probability of each node relative to the adjacent node;
determining each node in the user-video relationship graph as an initial node;
and according to the wandering probability of each node relative to the adjacent nodes, starting to wander on the user-video relation graph from each initial node until a cycle termination condition is met, stopping wandering, and generating the at least one path sequence.
According to the preferred embodiment of the present invention, the normalizing the edge formed by each node and the adjacent node to obtain the wandering probability of each node relative to the adjacent node includes:
acquiring the score of each sub-connection corresponding to each user;
calculating a first sum of scores of each sub-connection corresponding to each user;
calculating the quotient of the score of each sub-connection and the first sum value to obtain the wandering probability of the node of the corresponding user wandering to the node of the corresponding video in each sub-connection;
acquiring the score of each sub-connection corresponding to each video;
calculating a second sum of scores for each sub-connection corresponding to each video;
calculating the quotient of the score of each sub-connection and the second sum value to obtain the wandering probability of the node of the corresponding video wandering to the node of the corresponding user in each sub-connection;
and integrating the wandering probability of each sub-connection from the node of the corresponding user to the node of the corresponding video and the wandering probability of each sub-connection from the node of the corresponding video to the node of the corresponding user to obtain the wandering probability of each node relative to the adjacent node.
According to a preferred embodiment of the present invention, the satisfying of the cycle termination condition includes:
acquiring a preset walking step number threshold value, and determining that the cycle termination condition is met when the walking step number is detected to reach the walking step number threshold value; and/or
when it is detected during the walk that, after wandering from a first node to a second node, the walk returns directly from the second node to the first node, determining that an in-place loop is generated and that the cycle termination condition is met.
According to the preferred embodiment of the present invention, the constructing the user relationship graph according to the similarity between each two users includes:
deleting the nodes corresponding to the videos from the user-video relation graph to obtain an initial user relation graph;
and determining the similarity between every two users as the weight of each edge corresponding to the initial user relationship graph to obtain the user relationship graph.
According to a preferred embodiment of the invention, after obtaining at least one learning group, the method further comprises:
at intervals of a first preset time interval, sorting the videos watched by the users in each learning group by number of views from high to low, acquiring the videos ranked within the top preset positions as target videos of each learning group, extracting keywords of the target videos of each learning group, and determining the keywords of the target videos of each learning group as labels of each learning group; and/or
Acquiring the access probability of a node corresponding to a user in each learning group, determining the user corresponding to the node with the highest access probability as a core user in each learning group, and establishing a first position label for the core user; and/or
Acquiring the performance level of the users in each learning group, determining the user with the highest performance level as a target user, and establishing a second position label for the target user; and/or
Updating the viewing data at intervals of a second preset time interval, sending a grouping confirmation request to a corresponding user, and updating the learning group according to the updated viewing data and the grouping confirmation information when receiving the grouping confirmation information fed back by the corresponding user; and/or
When the number of members of a learning group is detected to be less than a threshold number of people, the detected learning group is deleted.
In a second aspect, an embodiment of the present invention provides a grouping apparatus combining Deepwalk and community discovery technologies, including:
the building unit is used for obtaining the watching data of the user on the training video and building a user-video relation graph according to the watching data;
a walking unit, configured to perform random walking on the user-video relationship diagram by using an improved Deepwalk algorithm, and generate at least one path sequence;
the vectorization unit is used for vectorizing the at least one path sequence to obtain at least one path vector;
an identifying unit for identifying a user vector from the at least one path vector;
the calculating unit is used for calculating the similarity between every two users according to the user vector;
the construction unit is also used for constructing a user relationship graph according to the similarity between every two users;
and the dividing unit is used for dividing the user groups based on the user relationship graph by adopting a community discovery technology to obtain at least one learning group.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the grouping method combining Deepwalk and community discovery technologies described in the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, causes the processor to execute the grouping method combining Deepwalk and community discovery technologies described in the first aspect.
The embodiment of the invention provides a grouping method, a grouping device, grouping equipment and a grouping medium combining Deepwalk and community discovery technologies. Watching data of users on training videos is acquired, and a user-video relation graph is constructed according to the watching data. An improved Deepwalk algorithm performs random walks on the user-video relation graph to generate at least one path sequence, in which the traditional uniform sampling is replaced with a "preference" distribution that represents how closely nodes (a user and a video) are related. During walking, the next node after a video is more likely to be a user who prefers that video, and the next node after a user is more likely to be a video that the user prefers. The at least one path sequence is vectorized to obtain at least one path vector, and the user vectors are identified from the at least one path vector. The resulting user vectors contain not only user information but also video information, and the relationships between users are established through videos. In the process of building the relation graph and walking, videos and users form a context relationship similar to that of sentences: the context of each user is the videos that the user has watched and prefers strongly, and the context of each video is the users who have watched it and prefer it.
Each user vector carries information about the videos the user prefers. It contains not only the video watching habits of the user but also the relationships between users established from video watching behavior, so that users with similar interests, that is, users whose vectors are close, can be found. The similarity between every two users is calculated according to the user vectors, and a user relationship graph is constructed according to these similarities. Because the constructed user relationship graph contains information in multiple dimensions, the relationships between users are closer and more accurate. A community discovery technology then divides the users into groups based on the user relationship graph to obtain at least one learning group. Combining the improved Deepwalk algorithm with the community discovery technology thus realizes the automatic generation of online learning groups, with efficient and more accurate division. Meanwhile, in each established learning group, the strong association among members lets them communicate more effectively, improving the learning effect.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1 is a schematic flowchart of a grouping method combining Deepwalk and community discovery technologies according to an embodiment of the present invention;
Fig. 2 is a schematic block diagram of a grouping apparatus combining Deepwalk and community discovery technologies according to an embodiment of the present invention;
Fig. 3 is a schematic block diagram of a computer device provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Fig. 1 is a schematic flow chart of a grouping method combining Deepwalk and community discovery technologies according to an embodiment of the present invention.
S10, acquiring the watching data of the user on the training video, and constructing a user-video relation graph according to the watching data.
In at least one embodiment of the present invention, the viewing data includes, but is not limited to, a combination of one or more of the following:
the number of views, the number of bullet-screen comments (barrage), the comment content, the number of comments, the number of favorites, the number of likes and the number of rewards of each user for each training video.
Wherein the training videos may include training videos of any field, such as: sales training videos, learning videos for students, and the like.
In at least one embodiment of the present invention, the constructing the user-video relationship graph according to the viewing data comprises:
when a record that a user watches a video is detected in the watching data, connecting the detected user with the corresponding video to obtain an initial relation graph comprising at least one sub-connection, wherein each sub-connection comprises a user and a video;
acquiring a pre-configured scoring algorithm;
calculating a score for each sub-connection based on the scoring algorithm by acquiring data from the viewing data;
sorting the sub-connections corresponding to each user according to the order of the scores from high to low;
and deleting from the initial relationship graph the sub-connections ranked after a preset position to obtain the user-video relationship graph.
The preset position (the number of top-ranked sub-connections kept for each user) may be configured according to the amount of data actually required, for example 100.
By configuring the preset position, data with low influence can be deleted from the initial relation graph, thereby reducing the complexity of the model.
Specifically, when the user-video relation graph is constructed, if the user A is identified to watch the video B, the user A and the video B are connected. Obviously, in practical applications, each user may have viewed a plurality of videos, and each video may also have been viewed by a plurality of users, so that a plurality of sub-connections including one user and one video can be established. Each sub-connection represents that the user watches the corresponding video, and further the relationship between the user and the video is established.
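The graph construction and pruning described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_user_video_graph`, the tuple layout, and the sample scores are assumptions, and the scores are taken as already computed because the scoring formula itself is not reproduced in this text.

```python
from collections import defaultdict


def build_user_video_graph(scored_views, top_k):
    """Keep, for each user, only the top_k highest-scoring sub-connections.

    scored_views: iterable of (user, video, score) tuples, one per
    sub-connection (hypothetical input layout).
    Returns a dict mapping (user, video) -> score.
    """
    per_user = defaultdict(list)
    for user, video, score in scored_views:
        per_user[user].append((video, score))

    graph = {}
    for user, links in per_user.items():
        # Sort this user's sub-connections by score, from high to low.
        links.sort(key=lambda item: item[1], reverse=True)
        # Drop everything ranked after the preset position top_k.
        for video, score in links[:top_k]:
            graph[(user, video)] = score
    return graph


# Illustrative data only: user A watched three videos, user B watched one.
views = [
    ("A", "M", 2.0), ("A", "N", 5.0), ("A", "P", 0.5),
    ("B", "M", 1.0),
]
graph = build_user_video_graph(views, top_k=2)
```

With `top_k=2`, the lowest-scoring sub-connection of user A (to video P) is removed, while user B keeps its single sub-connection.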
Further, the preconfigured scoring algorithm may include:
[Formula image DEST_PATH_IMAGE001 is not reproduced in the source text; it defines the score Y in terms of X1 to X6.]
where Y represents the score of each sub-connection, X1 represents the number of views, X2 represents the number of positive bullet-screen comments, X3 represents the number of negative bullet-screen comments, X4 represents the number of positive comments, X5 represents the number of negative comments, and X6 represents the number of rewards.
Specifically, the bullet-screen comments and the comments may be classified as positive or negative by applying naive Bayes or SVM (support vector machine) models.
Of course, in other embodiments, the sentiment may be ignored, and the number of bullet-screen comments and the number of comments may be added directly to the score of each sub-connection.
The score for each sub-connection may indicate the user's preference for the corresponding viewed video.
Through the embodiment, the user-video relation graph can be automatically constructed, and the closeness degree of the relation between the user and the video is reflected.
S11, performing random walk on the user-video relation graph by adopting an improved Deepwalk algorithm to generate at least one path sequence.
In at least one embodiment of the present invention, the randomly walking on the user-video relationship graph by using the improved Deepwalk algorithm, and generating at least one path sequence includes:
in the user-video relation graph, normalizing the edge formed by each node and the adjacent node to obtain the wandering probability of each node relative to the adjacent node;
determining each node in the user-video relationship graph as an initial node;
and according to the wandering probability of each node relative to the adjacent nodes, starting to wander on the user-video relation graph from each initial node until a cycle termination condition is met, stopping wandering, and generating the at least one path sequence.
It should be noted that, when the conventional Deepwalk algorithm performs a random walk, the next node is sampled uniformly from the neighbors of the current node, that is, every neighbor has the same walk probability.
In this embodiment, the traditional uniform sampling is replaced by a "preference" distribution that represents how closely the nodes (a user and a video) are related. During walking, the next node after a video is more likely to be a user who prefers that video, and the next node after a user is more likely to be a video that the user prefers.
Specifically, the normalizing the edge formed by each node and the adjacent node to obtain the wandering probability of each node relative to the adjacent node includes:
acquiring the score of each sub-connection corresponding to each user;
calculating a first sum of scores of each sub-connection corresponding to each user;
calculating the quotient of the score of each sub-connection and the first sum value to obtain the wandering probability of the node of the corresponding user wandering to the node of the corresponding video in each sub-connection;
acquiring the score of each sub-connection corresponding to each video;
calculating a second sum of scores for each sub-connection corresponding to each video;
calculating the quotient of the score of each sub-connection and the second sum value to obtain the wandering probability of the node of the corresponding video wandering to the node of the corresponding user in each sub-connection;
and integrating the wandering probability of each sub-connection from the node of the corresponding user to the node of the corresponding video and the wandering probability of each sub-connection from the node of the corresponding video to the node of the corresponding user to obtain the wandering probability of each node relative to the adjacent node.
It will be appreciated that, in each sub-connection, the node of the user and the node of the video are adjacent to each other.
For example, suppose the node corresponding to user A is connected to the node corresponding to video M and the node corresponding to video N, the score of the sub-connection between user A and video M is 2, and the score of the sub-connection between user A and video N is 5. Then the wandering probability from the node of user A to the node of video M is 2/(2 + 5) = 2/7, and the wandering probability from the node of user A to the node of video N is 5/(2 + 5) = 5/7. Similarly, suppose the node corresponding to video M is further connected to the node corresponding to user B, the score of the sub-connection between video M and user A is 2, and the score of the sub-connection between video M and user B is 1. Then the wandering probability from the node of video M to the node of user A is 2/(2 + 1) = 2/3, and the wandering probability from the node of video M to the node of user B is 1/(2 + 1) = 1/3.
Through the embodiment, the wandering probability among the nodes can be subjected to normalized processing, and the calculation dimensionality is unified.
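The normalization steps and the worked example above can be reproduced with a short sketch. The function name `walk_probabilities` and the `("user", ...)` / `("video", ...)` node tags are illustrative assumptions, and integer scores are assumed so that exact fractions can be shown.

```python
from collections import defaultdict
from fractions import Fraction


def walk_probabilities(edges):
    """edges: dict mapping (user, video) -> positive integer score.

    Returns a dict mapping each node to {neighbor: wandering probability}.
    Nodes are tagged as ("user", name) or ("video", name) so that a user
    and a video sharing the same name cannot collide (an assumption of
    this sketch).
    """
    user_total = defaultdict(int)   # first sum value, one per user
    video_total = defaultdict(int)  # second sum value, one per video
    for (user, video), score in edges.items():
        user_total[user] += score
        video_total[video] += score

    probs = defaultdict(dict)
    for (user, video), score in edges.items():
        u, v = ("user", user), ("video", video)
        probs[u][v] = Fraction(score, user_total[user])    # user -> video
        probs[v][u] = Fraction(score, video_total[video])  # video -> user
    return dict(probs)


# Scores taken from the example: A-M is 2, A-N is 5, B-M is 1.
edges = {("A", "M"): 2, ("A", "N"): 5, ("B", "M"): 1}
probs = walk_probabilities(edges)
```

Here `probs[("user", "A")]` holds the probabilities 2/7 and 5/7 for videos M and N, and `probs[("video", "M")]` holds 2/3 and 1/3 for users A and B, matching the figures in the example.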
Specifically, the satisfying of the cycle termination condition includes:
acquiring a preset walking step number threshold value, and determining that the cycle termination condition is met when the walking step number is detected to reach the walking step number threshold value; and/or
when it is detected during the walk that, after wandering from a first node to a second node, the walk returns directly from the second node to the first node, determining that an in-place loop is generated and that the cycle termination condition is met.
The threshold value of the number of walking steps can be configured in a user-defined manner, such as 9 steps.
For example: with a threshold of 9 steps, when it is detected that 9 steps have been taken from the node X, following X → a → Y → C → Z → B → Y → a → X → B, a path of 10 nodes has been formed, and it is determined that the cycle termination condition is satisfied.
The case of the in-place loop is illustrated by the following example.
For example: when it is detected that a loop similar to B → Y → B occurs during the walk, it is determined that the in-place loop is generated, that is, it is determined that the cycle termination condition is satisfied.
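A weighted walk with both termination conditions can be sketched as below. The function name `weighted_walk`, the plain string node names, and the choice to keep the returning node in the path before stopping are assumptions of this sketch; the text does not state whether the node that closes the in-place loop is recorded.

```python
import random


def weighted_walk(probs, start, max_steps, rng):
    """Walk from `start`, choosing each next node by its wandering probability.

    Stops when max_steps steps have been taken, or when an in-place loop
    (first -> second -> first) is detected.
    """
    path = [start]
    while len(path) - 1 < max_steps:
        neighbors = list(probs[path[-1]])
        if not neighbors:  # isolated node: nothing to walk to
            break
        weights = [probs[path[-1]][n] for n in neighbors]
        path.append(rng.choices(neighbors, weights=weights, k=1)[0])
        # In-place loop: the walk returned directly to the previous node.
        if len(path) >= 3 and path[-1] == path[-3]:
            break
    return path


# Two-node graph: the only possible walk is A -> M -> A, an in-place loop.
tiny = {"A": {"M": 1.0}, "M": {"A": 1.0}}
loop_path = weighted_walk(tiny, "A", max_steps=9, rng=random.Random(0))
short_path = weighted_walk(tiny, "A", max_steps=1, rng=random.Random(0))

# Small illustrative graph with users A, B and videos M, N.
demo_probs = {
    "A": {"M": 2 / 7, "N": 5 / 7},
    "B": {"M": 1.0},
    "M": {"A": 2 / 3, "B": 1 / 3},
    "N": {"A": 1.0},
}
demo_path = weighted_walk(demo_probs, "A", max_steps=9, rng=random.Random(42))
```

On this small graph most walks end early through the in-place loop condition; on a larger graph the step threshold would typically be the condition that fires, giving paths of up to `max_steps + 1` nodes.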
S12, vectorizing the at least one path sequence to obtain at least one path vector.
In at least one embodiment of the present invention, a word2vec model may be adopted to perform vectorization processing on the at least one path sequence, so as to obtain the at least one path vector.
For example: the word2vec model may be used to convert each path sequence into a vector of the same dimension, such as 64 dimensions.
S13, identifying a user vector from the at least one path vector.
It is understood that the at least one generated path vector includes the following two types of vectors:
(1) user vector: a vector obtained by walking with the node corresponding to a user as the initial node;
(2) video vector: a vector obtained by walking with the node corresponding to a video as the initial node.
Since the present embodiment is to group users, the user vector needs to be identified from the at least one path vector.
The user vectors obtained in this embodiment contain both user information and video information. Relationships between users are established through videos; in the process of building the relationship graph and walking, videos and users form a context relationship similar to that of words in a sentence: the context of each user is the videos that the user has watched and likes, and the context of each video is the users who have watched and like it. Each user vector therefore carries information about the videos the user likes, including both the user's viewing habits and the relationships between users established through viewing behavior, so that users with similar interests, i.e., users whose vectors are close, can be found.
S14, calculating the similarity between every two users according to the user vector.
In this embodiment, the similarity between every two users may be calculated by using a cosine similarity calculation method, and the calculation method of the similarity between every two users is not limited by the present invention.
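As an illustration of the cosine similarity option, the following sketch computes the similarity for every pair of users. The names `cosine_similarity` and `pairwise_similarity` and the `user_vectors` dictionary are assumptions made for this example.

```python
import math
from itertools import combinations

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

def pairwise_similarity(user_vectors):
    """user_vectors: dict user -> vector. Returns dict (user_a, user_b) -> similarity."""
    return {
        (a, b): cosine_similarity(user_vectors[a], user_vectors[b])
        for a, b in combinations(sorted(user_vectors), 2)
    }

# Hypothetical 2-dimensional vectors; real user vectors would have e.g. 64 dimensions.
vectors = {"A": [1.0, 2.0], "B": [2.0, 4.0], "C": [0.0, 1.0]}
print(pairwise_similarity(vectors))
```

Each unordered pair of users appears once in the result, keyed in sorted order.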
S15, constructing a user relation graph according to the similarity between every two users.
In at least one embodiment of the present invention, the constructing the user relationship graph according to the similarity between every two users includes:
deleting the nodes corresponding to the videos from the user-video relation graph to obtain an initial user relation graph;
and determining the similarity between every two users as the weight of each edge corresponding to the initial user relationship graph to obtain the user relationship graph.
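The two steps above can be sketched as follows. This is a simplified illustration: the function `build_user_graph`, the `is_video` predicate and the adjacency-dictionary representation are assumptions, not structures named in the source.

```python
def build_user_graph(nodes, is_video, similarities):
    """Drop video nodes, then weight each user-user edge by the pair's similarity.

    nodes: iterable of node ids from the user-video relationship graph.
    is_video: hypothetical predicate telling whether a node id is a video.
    similarities: dict (user_a, user_b) -> similarity.
    Returns an adjacency dict: user -> {other user: weight}.
    """
    # Step 1: delete the video nodes to obtain the initial user relationship graph.
    graph = {node: {} for node in nodes if not is_video(node)}
    # Step 2: use the similarity of every two users as the weight of their edge.
    for (a, b), weight in similarities.items():
        if a in graph and b in graph and a != b:
            graph[a][b] = weight
            graph[b][a] = weight
    return graph

nodes = ["u1", "u2", "u3", "v1", "v2"]
sims = {("u1", "u2"): 0.8, ("u1", "u3"): 0.1, ("u2", "u3"): 0.4}
print(build_user_graph(nodes, lambda n: n.startswith("v"), sims))
```

The resulting graph is undirected, so each weight is stored in both directions.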
Through the embodiment, the established user relation graph simultaneously contains information of multiple dimensions, so that the relation among users is more compact and accurate.
S16, carrying out group division on the users based on the user relation graph by adopting a community discovery technology to obtain at least one learning group.
In at least one embodiment of the present invention, the dividing the group of the users based on the user relationship graph by using the community discovery technology to obtain at least one learning group includes:
calculating the access probability of each node in the user relationship graph;
dividing all users in the user relationship graph into non-overlapping user communities according to the access probability of each node;
and adjusting the non-overlapping user communities into overlapping communities by adopting a Fuzzy Infomap algorithm to obtain the at least one learning group.
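The source does not specify how the access probability of each node is computed. One common choice in Infomap-style methods is the stationary visit rate of a random walker on the weighted graph; the sketch below estimates it by power iteration with teleportation. The function name, the `damping` value and the assumption of non-negative edge weights are all choices made for this illustration, and the community division itself is not shown.

```python
def visit_probabilities(graph, damping=0.85, iterations=100):
    """Estimate how often a random walker visits each node.

    graph: adjacency dict user -> {other user: non-negative weight}.
    Returns dict user -> visit probability (values sum to 1).
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    strength = {u: sum(graph[u].values()) for u in nodes}
    for _ in range(iterations):
        # Teleportation share received by every node.
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            if strength[u] == 0:
                # Isolated node: spread its probability uniformly.
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
            else:
                for v, w in graph[u].items():
                    new_rank[v] += damping * rank[u] * w / strength[u]
        rank = new_rank
    return rank

# Hypothetical user graph: "a" is similar to both "b" and "c".
graph = {"a": {"b": 1.0, "c": 1.0}, "b": {"a": 1.0}, "c": {"a": 1.0}}
print(visit_probabilities(graph))
```

In this small example the central user "a" receives the highest visit probability.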
It should be noted that the community discovery technology belongs to a relatively mature technical means, which is not described herein.
It can be understood that the commonly adopted scheme of manual labeling and grouping is not only inefficient but also inaccurate: the labeling personnel cannot fully keep track of the dynamic changes of each sub-discipline as each major field develops, and the knowledge scope covered by each learning group also changes over time, which makes the scheme difficult to implement.
Through the implementation mode, the automatic generation of the online learning groups can be realized by combining the improved Deepwalk algorithm and the community discovery technology, the efficiency is high, the division is more accurate, meanwhile, in each established learning group, the members can communicate more effectively due to the strong association among the members, and the learning effect is improved.
In at least one embodiment of the invention, after obtaining the at least one learning group, the method further comprises:
at intervals of a first preset time interval, sorting the videos watched by the users in each learning group by number of views from high to low, taking the videos ranked in the top preset positions as the target videos of each learning group, extracting keywords of the target videos of each learning group, and determining the keywords as the labels of each learning group; and/or
Acquiring the access probability of a node corresponding to a user in each learning group, determining the user corresponding to the node with the highest access probability as a core user in each learning group, and establishing a first position label for the core user; and/or
Acquiring the performance level of the users in each learning group, determining the user with the highest performance level as a target user, and establishing a second position label for the target user; and/or
Updating the viewing data at intervals of a second preset time interval, sending a grouping confirmation request to a corresponding user, and updating the learning group according to the updated viewing data and the grouping confirmation information when receiving the grouping confirmation information fed back by the corresponding user; and/or
When the number of members of a learning group is detected to be less than the number-of-people threshold, the detected learning group is deleted.
The first preset time interval and the second preset time interval may be configured in a user-defined manner, which is not limited in the present invention.
The number of top-ranked positions may be user-defined, for example the top 10.
The first position label and the second position label may also be user-defined; for example, the first position label may be class monitor and the second position label may be study committee member.
The number-of-people threshold may also be user-defined, for example 25 people.
In this embodiment, techniques such as TF-IDF may be adopted to extract the keywords of the target video of each learning group.
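A minimal TF-IDF sketch is given below for illustration. It assumes that the text of each group's target videos (for example titles or descriptions) has already been tokenized; the function name `tfidf_keywords` and the input layout are hypothetical.

```python
import math
from collections import Counter

def tfidf_keywords(documents, top_k=3):
    """documents: dict group name -> list of tokens from its target videos.

    Returns dict group name -> up to top_k tokens with the highest TF-IDF score.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents.values():
        doc_freq.update(set(tokens))  # count each token once per document

    keywords = {}
    for name, tokens in documents.items():
        counts = Counter(tokens)
        scores = {
            token: (count / len(tokens)) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        }
        # Highest score first; ties broken alphabetically for reproducibility.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords[name] = ranked[:top_k]
    return keywords

docs = {
    "group1": ["sales", "pitch", "sales", "training"],
    "group2": ["python", "code", "training"],
}
print(tfidf_keywords(docs, top_k=2))
```

Tokens that occur in every group, such as "training" here, score zero and are therefore ranked last.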
In this embodiment, the performance level of the user in each learning group may be obtained from the enterprise database corresponding to each user, or may be uploaded by each user.
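The maintenance operations listed above (selecting target videos, choosing the core user, and deleting undersized groups) can be sketched as below. All function names and data layouts are assumptions made for this example; the defaults of 10 and 25 follow the example values given in the text.

```python
from collections import Counter

def top_videos(members, view_counts, top_n=10):
    """Return the top_n videos by total views among a group's members.

    view_counts: dict (user, video) -> number of views (hypothetical layout).
    """
    totals = Counter()
    for (user, video), views in view_counts.items():
        if user in members:
            totals[video] += views
    return [video for video, _ in totals.most_common(top_n)]

def core_user(members, access_prob):
    """Return the member whose node has the highest access probability."""
    return max(members, key=lambda user: access_prob[user])

def prune_small_groups(groups, min_members=25):
    """Delete groups whose member count is below the threshold."""
    return {name: m for name, m in groups.items() if len(m) >= min_members}

groups = {"g1": {"u1", "u2"}, "g2": {"u3"}}
views = {("u1", "v1"): 3, ("u2", "v1"): 1, ("u1", "v2"): 2, ("u3", "v3"): 9}
print(top_videos(groups["g1"], views, top_n=1))            # ['v1']
print(core_user(groups["g1"], {"u1": 0.2, "u2": 0.5}))     # u2
print(sorted(prune_small_groups(groups, min_members=2)))   # ['g1']
```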
In the above embodiment, establishing a label for each learning group makes the attributes of each group clearer and provides a reference for other users who are considering joining; designating the user with the highest access probability as the core user and assigning the corresponding position helps sustain the growth of the learning group to a certain extent; assigning a position to the best-performing user can stimulate the learning enthusiasm of the other members; the learning groups are periodically updated and rebuilt, and the users' opinions are fully considered during the update, which encourages members to play to their strengths, breaks ingrained habits, and improves the user experience; and dissolving learning groups with too few members avoids redundant group data from unused learning groups.
It should be noted that, in order to further ensure the security of the data and avoid malicious tampering of the data, the learning group may be stored on the blockchain node.
According to the above technical scheme, viewing data of users on training videos is acquired, a user-video relationship graph is constructed according to the viewing data, and an improved Deepwalk algorithm is used to perform random walks on the user-video relationship graph to generate at least one path sequence. The traditional uniform sampling is replaced with a "preference" distribution that represents how closely nodes (users and videos) are related, so that during the walk the next node after each video is more likely to be a user who likes that video, and the next node after each user is more likely to be a video that the user likes. The at least one path sequence is vectorized to obtain at least one path vector, and the user vectors are identified from the at least one path vector. The resulting user vectors contain both user information and video information: relationships between users are established through videos, and in the process of building the relationship graph and walking, videos and users form a context relationship similar to that of words in a sentence, in which the context of each user is the videos that the user has watched and likes, and the context of each video is the users who have watched and like it.
Each user vector therefore carries information about the videos the user likes, including both the user's viewing habits and the relationships between users established through viewing behavior, so that users with similar interests, i.e., users whose vectors are close, can be found. The similarity between every two users is calculated according to the user vectors, and a user relationship graph is constructed according to that similarity; the resulting graph contains information of multiple dimensions, so that the relationships between users are captured more closely and accurately. A community discovery technology is then used to divide the users into groups based on the user relationship graph to obtain at least one learning group. Combining the improved Deepwalk algorithm with the community discovery technology thus enables automatic generation of online learning groups with higher efficiency and more accurate division; moreover, within each learning group, the strong association among members allows them to communicate more effectively, which improves the learning effect.
An embodiment of the present invention further provides a grouping apparatus combining Deepwalk and community discovery technology, which is configured to implement any of the aforementioned embodiments of the grouping method combining Deepwalk and community discovery technology. Specifically, referring to fig. 2, fig. 2 is a schematic block diagram of a grouping apparatus combining Deepwalk and community discovery technology according to an embodiment of the present invention.
As shown in fig. 2, the grouping apparatus 100 combining Deepwalk and community discovery technology includes: a construction unit 101, a walking unit 102, a vectorization unit 103, an identification unit 104, a calculation unit 105 and a division unit 106.
The construction unit 101 acquires users' viewing data for training videos, and constructs a user-video relationship graph according to the viewing data.
In at least one embodiment of the present invention, the viewing data includes, but is not limited to, a combination of one or more of the following:
the number of views, the number of bullet-screen comments (barrages), the comment content, the number of comments, the number of favorites, the number of likes, and the number of rewards for each training video by each user.
Wherein the training videos may include training videos of any field, such as: sales training videos, learning videos for students, and the like.
In at least one embodiment of the present invention, the constructing unit 101 constructs the user-video relationship graph according to the viewing data, including:
when a record that a user watches a video is detected in the watching data, connecting the detected user with the corresponding video to obtain an initial relation graph comprising at least one sub-connection, wherein each sub-connection comprises a user and a video;
acquiring a pre-configured scoring algorithm;
calculating a score for each sub-connection based on the scoring algorithm by acquiring data from the viewing data;
sorting the sub-connections corresponding to each user according to the order of the scores from high to low;
and deleting the sub-connections ranked after a preset position from the initial relationship graph to obtain the user-video relationship graph.
The preset position may be configured according to the amount of data actually required, for example 100.
By configuring the preset position, data with low influence can be deleted from the initial relationship graph, which reduces the complexity of the model.
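The sorting and deletion steps can be sketched as keeping, for each user, only the highest-scoring sub-connections. The function name `prune_sub_connections` and the `scores` layout are assumptions; the default of 100 follows the example value above.

```python
from collections import defaultdict

def prune_sub_connections(scores, keep_top=100):
    """Keep only each user's keep_top highest-scoring sub-connections.

    scores: dict mapping (user, video) -> sub-connection score.
    Returns a dict with the same layout containing only the kept sub-connections.
    """
    by_user = defaultdict(list)
    for (user, video), score in scores.items():
        by_user[user].append((score, video))

    kept = {}
    for user, items in by_user.items():
        # Sort this user's sub-connections by score from high to low.
        items.sort(key=lambda item: item[0], reverse=True)
        for score, video in items[:keep_top]:
            kept[(user, video)] = score
    return kept

scores = {("A", "M"): 2, ("A", "N"): 5, ("A", "P"): 1, ("B", "M"): 1}
print(prune_sub_connections(scores, keep_top=2))
```

With `keep_top=2`, user A's lowest-scoring sub-connection (to video P) is dropped while user B's single sub-connection is kept.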
Specifically, when the user-video relation graph is constructed, if it is identified that user A has watched video B, user A and video B are connected. In practical applications, each user may have watched a plurality of videos, and each video may have been watched by a plurality of users, so that a plurality of sub-connections, each including one user and one video, can be established. Each sub-connection indicates that the user has watched the corresponding video, thereby establishing the relationship between the user and the video.
Further, the preconfigured scoring algorithm may include:
[Formula image not reproduced in the text: the score Y of each sub-connection as a function of X1 to X6]
where Y represents the score of each sub-connection, X1 represents the number of views, X2 represents the number of positive bullet-screen comments, X3 represents the number of negative bullet-screen comments, X4 represents the number of positive comments, X5 represents the number of negative comments, and X6 represents the number of rewards.
Specifically, the positive bullet-screen comments, the negative bullet-screen comments, the positive comments and the negative comments may be classified by applying Naive Bayes or SVM (support vector machine) models.
Of course, in other embodiments, sentiment may be ignored, and the number of bullet-screen comments and the number of comments may be added directly to the score of each sub-connection.
The score for each sub-connection may indicate the user's preference for the corresponding viewed video.
Through the embodiment, the user-video relation graph can be automatically constructed, and the closeness degree of the relation between the user and the video is reflected.
The walking unit 102 performs random walking on the user-video relationship graph by using an improved Deepwalk algorithm to generate at least one path sequence.
In at least one embodiment of the present invention, the walking unit 102 performing random walking on the user-video relationship graph by using the improved Deepwalk algorithm to generate at least one path sequence includes:
in the user-video relation graph, normalizing the edge formed by each node and the adjacent node to obtain the wandering probability of each node relative to the adjacent node;
determining each node in the user-video relationship graph as an initial node;
and according to the wandering probability of each node relative to the adjacent nodes, starting to wander on the user-video relation graph from each initial node until a cycle termination condition is met, stopping wandering, and generating the at least one path sequence.
It should be noted that, when the conventional Deepwalk algorithm performs a random walk, the next node is sampled uniformly from the neighbors of the current node, that is, every neighbor is selected with the same probability.
In this embodiment, the traditional uniform sampling is replaced by a "preference" distribution that represents how closely nodes (users and videos) are related. During the walk, the next node after each video is more likely to be a user who likes that video, and the next node after each user is more likely to be a video that the user likes.
Specifically, the normalizing the edge formed by each node and the adjacent node to obtain the wandering probability of each node relative to the adjacent node includes:
acquiring the score of each sub-connection corresponding to each user;
calculating a first sum of scores of each sub-connection corresponding to each user;
calculating the quotient of the score of each sub-connection and the first sum value to obtain the wandering probability of the node of the corresponding user wandering to the node of the corresponding video in each sub-connection;
acquiring the score of each sub-connection corresponding to each video;
calculating a second sum of scores for each sub-connection corresponding to each video;
calculating the quotient of the score of each sub-connection and the second sum value to obtain the wandering probability of the node of the corresponding video wandering to the node of the corresponding user in each sub-connection;
and integrating the wandering probability of each sub-connection from the node of the corresponding user to the node of the corresponding video and the wandering probability of each sub-connection from the node of the corresponding video to the node of the corresponding user to obtain the wandering probability of each node relative to the adjacent node.
It will be appreciated that the node of a user is adjacent only to nodes of videos, and the node of a video is adjacent only to nodes of users.
For example, suppose the node of user A is connected to the nodes of video M and video N, the score of the sub-connection between user A and video M is 2, and the score of the sub-connection between user A and video N is 5. Then the wandering probability from the node of user A to the node of video M is 2/(2 + 5) = 2/7, and the wandering probability from the node of user A to the node of video N is 5/(2 + 5) = 5/7. Similarly, suppose the node of video M is also connected to the node of user B, the score of the sub-connection between video M and user A is 2, and the score of the sub-connection between video M and user B is 1. Then the wandering probability from the node of video M to the node of user A is 2/(2 + 1) = 2/3, and the wandering probability from the node of video M to the node of user B is 1/(2 + 1) = 1/3.
In this embodiment, the wandering probabilities between nodes are normalized, so that the probabilities leaving each node sum to 1 and are computed on a common scale.
Specifically, satisfying the cycle termination condition includes:
acquiring a preset walking step number threshold, and determining that the cycle termination condition is met when the number of walking steps is detected to reach the threshold; and/or
during the walk, when it is detected that the walk moves from a first node to a second node and then returns directly from the second node to the first node, determining that an in-place loop has been generated and that the cycle termination condition is met.
The walking step number threshold may be user-defined, for example 9 steps.
For example: when it is detected that a total of 9 steps have been taken starting from node X, following X → A → Y → C → Z → B → Y → A → X → B and forming a path of 10 nodes, it is determined that the cycle termination condition is satisfied.
The in-place loop case is illustrated by the following example.
For example: when a loop such as B → Y → B is detected during the walk, it is determined that an in-place loop has been generated, i.e., that the cycle termination condition is satisfied.
The vectorization unit 103 performs vectorization processing on the at least one path sequence to obtain at least one path vector.
In at least one embodiment of the present invention, a word2vec model may be adopted to perform vectorization processing on the at least one path sequence, so as to obtain the at least one path vector.
For example: the word2vec model may be used to convert each path sequence into a vector of the same dimension, such as 64 dimensions.
The identifying unit 104 identifies a user vector from the at least one path vector.
It is understood that the at least one generated path vector includes the following two types of vectors:
(1) user vector: a vector obtained by walking with the node corresponding to a user as the initial node;
(2) video vector: a vector obtained by walking with the node corresponding to a video as the initial node.
Since the present embodiment is to group users, the user vector needs to be identified from the at least one path vector.
The user vectors obtained in this embodiment contain both user information and video information. Relationships between users are established through videos; in the process of building the relationship graph and walking, videos and users form a context relationship similar to that of words in a sentence: the context of each user is the videos that the user has watched and likes, and the context of each video is the users who have watched and like it. Each user vector therefore carries information about the videos the user likes, including both the user's viewing habits and the relationships between users established through viewing behavior, so that users with similar interests, i.e., users whose vectors are close, can be found.
The calculating unit 105 calculates the similarity between each two users according to the user vector.
In this embodiment, the similarity between every two users may be calculated by using a cosine similarity calculation method, and the calculation method of the similarity between every two users is not limited by the present invention.
The construction unit 101 constructs a user relationship graph according to the similarity between every two users.
In at least one embodiment of the present invention, the constructing unit 101, according to the similarity between each two users, constructs the user relationship graph, including:
deleting the nodes corresponding to the videos from the user-video relation graph to obtain an initial user relation graph;
and determining the similarity between every two users as the weight of each edge corresponding to the initial user relationship graph to obtain the user relationship graph.
Through the embodiment, the established user relation graph simultaneously contains information of multiple dimensions, so that the relation among users is more compact and accurate.
The dividing unit 106 performs group division on the users based on the user relationship graph by using a community discovery technology to obtain at least one learning group.
In at least one embodiment of the present invention, the dividing unit 106 performs group division on the users based on the user relationship graph by using a community discovery technology, and obtaining at least one learning group includes:
calculating the access probability of each node in the user relationship graph;
dividing all users in the user relationship graph into non-overlapping user communities according to the access probability of each node;
and adjusting the non-overlapping user communities into overlapping communities by adopting a Fuzzy Infomap algorithm to obtain the at least one learning group.
It should be noted that the community discovery technology belongs to a relatively mature technical means, which is not described herein.
It can be understood that the commonly adopted scheme of manual labeling and grouping is not only inefficient but also inaccurate: the labeling personnel cannot fully keep track of the dynamic changes of each sub-discipline as each major field develops, and the knowledge scope covered by each learning group also changes over time, which makes the scheme difficult to implement.
Through the implementation mode, the automatic generation of the online learning groups can be realized by combining the improved Deepwalk algorithm and the community discovery technology, the efficiency is high, the division is more accurate, meanwhile, in each established learning group, the members can communicate more effectively due to the strong association among the members, and the learning effect is improved.
In at least one embodiment of the invention, after the at least one learning group is obtained, at intervals of a first preset time interval, the videos watched by the users in each learning group are sorted by number of views from high to low, the videos ranked in the top preset positions are taken as the target videos of each learning group, keywords of the target videos of each learning group are extracted, and the keywords are determined as the labels of each learning group; and/or
Acquiring the access probability of a node corresponding to a user in each learning group, determining the user corresponding to the node with the highest access probability as a core user in each learning group, and establishing a first position label for the core user; and/or
Acquiring the performance level of the users in each learning group, determining the user with the highest performance level as a target user, and establishing a second position label for the target user; and/or
Updating the viewing data at intervals of a second preset time interval, sending a grouping confirmation request to a corresponding user, and updating the learning group according to the updated viewing data and the grouping confirmation information when receiving the grouping confirmation information fed back by the corresponding user; and/or
When the number of members of a learning group is detected to be less than the number-of-people threshold, the detected learning group is deleted.
The first preset time interval and the second preset time interval may be configured in a user-defined manner, which is not limited in the present invention.
The number of top-ranked positions may be user-defined, for example the top 10.
The first position label and the second position label may also be user-defined; for example, the first position label may be class monitor and the second position label may be study committee member.
The number-of-people threshold may also be user-defined, for example 25 people.
In this embodiment, techniques such as TF-IDF may be adopted to extract the keywords of the target video of each learning group.
In this embodiment, the performance level of the user in each learning group may be obtained from the enterprise database corresponding to each user, or may be uploaded by each user.
In the above embodiment, establishing a label for each learning group makes the attributes of each group clearer and provides a reference for other users who are considering joining; designating the user with the highest access probability as the core user and assigning the corresponding position helps sustain the growth of the learning group to a certain extent; assigning a position to the best-performing user can stimulate the learning enthusiasm of the other members; the learning groups are periodically updated and rebuilt, and the users' opinions are fully considered during the update, which encourages members to play to their strengths, breaks ingrained habits, and improves the user experience; and dissolving learning groups with too few members avoids redundant group data from unused learning groups.
It should be noted that, in order to further ensure the security of the data and avoid malicious tampering of the data, the learning group may be stored on the blockchain node.
According to the above technical scheme, viewing data of users on training videos is acquired, a user-video relationship graph is constructed according to the viewing data, and an improved Deepwalk algorithm is used to perform random walks on the user-video relationship graph to generate at least one path sequence. The traditional uniform sampling is replaced with a "preference" distribution that represents how closely nodes (users and videos) are related, so that during the walk the next node after each video is more likely to be a user who likes that video, and the next node after each user is more likely to be a video that the user likes. The at least one path sequence is vectorized to obtain at least one path vector, and the user vectors are identified from the at least one path vector. The resulting user vectors contain both user information and video information: relationships between users are established through videos, and in the process of building the relationship graph and walking, videos and users form a context relationship similar to that of words in a sentence, in which the context of each user is the videos that the user has watched and likes, and the context of each video is the users who have watched and like it.
Each user vector therefore carries information about the videos the user likes, including both the user's viewing habits and the relationships between users established through viewing behavior, so that users with similar interests, i.e., users whose vectors are close, can be found. The similarity between every two users is calculated according to the user vectors, and a user relationship graph is constructed according to that similarity; the resulting graph contains information of multiple dimensions, so that the relationships between users are captured more closely and accurately. A community discovery technology is then used to divide the users into groups based on the user relationship graph to obtain at least one learning group. Combining the improved Deepwalk algorithm with the community discovery technology thus enables automatic generation of online learning groups with higher efficiency and more accurate division; moreover, within each learning group, the strong association among members allows them to communicate more effectively, which improves the learning effect.
The above grouping apparatus combining Deepwalk and community discovery technology may be implemented in the form of a computer program, which may be run on a computer device as shown in fig. 3.
Referring to fig. 3, fig. 3 is a schematic block diagram of a computer device according to an embodiment of the present invention. The computer device 500 is a server, and the server may be an independent server or a server cluster composed of a plurality of servers.
Referring to fig. 3, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a storage medium 503 and an internal memory 504.
The storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032, when executed, causes the processor 502 to perform the grouping method combining Deepwalk and community discovery technology.
The processor 502 is used to provide computing and control capabilities that support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the operation of the computer program 5032 in the storage medium 503, and when the computer program 5032 is executed by the processor 502, the processor 502 may perform the grouping method combining deepwater and community discovery technology.
The network interface 505 is used for network communication, such as the transmission of data information. Those skilled in the art will appreciate that the configuration shown in fig. 3 is a block diagram of only the part of the configuration relevant to aspects of the present invention and does not limit the computer device 500 to which aspects of the present invention may be applied; a particular computer device 500 may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory to implement the grouping method combining Deepwalk and community discovery technology disclosed in the embodiment of the present invention.
Those skilled in the art will appreciate that the embodiment of a computer device illustrated in fig. 3 does not constitute a limitation on the specific construction of the computer device, and in other embodiments a computer device may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components. For example, in some embodiments, the computer device may only include a memory and a processor, and in such embodiments, the structures and functions of the memory and the processor are consistent with those of the embodiment shown in fig. 3, and are not described herein again.
It should be understood that, in the embodiment of the present invention, the processor 502 may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. The general-purpose processor may be a microprocessor or any conventional processor.
In another embodiment of the invention, a computer-readable storage medium is provided. The computer-readable storage medium may be a nonvolatile computer-readable storage medium or a volatile computer-readable storage medium. The computer-readable storage medium stores a computer program, wherein the computer program, when executed by a processor, implements the grouping method combining Deepwalk and community discovery technology disclosed in embodiments of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, devices and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both, and that the components and steps of the examples have been described above generally in terms of their functions in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. The above-described apparatus embodiments are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in an actual implementation: units having the same function may be grouped into one unit, a plurality of units or components may be combined or integrated into another system, and some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electrical, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A grouping method combining Deepwalk and community discovery technologies is characterized by comprising the following steps:
acquiring watching data of a user on a training video, and constructing a user-video relation graph according to the watching data;
adopting an improved Deepwalk algorithm to carry out random walk on the user-video relation graph to generate at least one path sequence;
vectorizing the at least one path sequence to obtain at least one path vector;
identifying a user vector from the at least one path vector;
calculating the similarity between every two users according to the user vector;
constructing a user relation graph according to the similarity between every two users;
and dividing the users into groups based on the user relation graph by adopting a community discovery technology, to obtain at least one learning group.
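The final step of claim 1 can be illustrated with a minimal sketch. The claim does not name a particular community discovery algorithm; the example below swaps in weighted label propagation as one possible technique, and the function name, the edge format and the sample weights are assumptions made for illustration.

```python
from collections import defaultdict


def label_propagation(edges, max_rounds=100):
    """Group the nodes of a weighted undirected graph by weighted label propagation.

    edges: dict mapping (node_a, node_b) -> positive weight.
    Returns a sorted list of sorted member lists, one per group.
    """
    neighbors = defaultdict(dict)
    for (a, b), weight in edges.items():
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    labels = {node: node for node in neighbors}  # every node starts in its own group
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):  # fixed order keeps the result deterministic
            totals = defaultdict(float)
            for other, weight in neighbors[node].items():
                totals[labels[other]] += weight
            # Adopt the label with the largest total weight; ties go to the smaller label.
            best = min(totals, key=lambda label: (-totals[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break

    groups = defaultdict(list)
    for node, label in labels.items():
        groups[label].append(node)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical user relation graph: two tightly linked trios joined by one weak edge.
user_graph = {
    ("u1", "u2"): 0.9, ("u1", "u3"): 0.9, ("u2", "u3"): 0.9,
    ("u4", "u5"): 0.9, ("u4", "u6"): 0.9, ("u5", "u6"): 0.9,
    ("u3", "u4"): 0.1,
}
learning_groups = label_propagation(user_graph)
# learning_groups == [["u1", "u2", "u3"], ["u4", "u5", "u6"]]
```

The weak edge between u3 and u4 is outweighed by the strong edges inside each trio, so the two trios end up as separate learning groups.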
2. The grouping method combining Deepwalk and community discovery technologies as claimed in claim 1, wherein said constructing a user-video relationship graph from said viewing data comprises:
when a record that a user watches a video is detected in the watching data, connecting the detected user with the corresponding video to obtain an initial relation graph comprising at least one sub-connection, wherein each sub-connection comprises a user and a video;
acquiring a pre-configured scoring algorithm;
calculating a score for each sub-connection based on the scoring algorithm by acquiring data from the viewing data;
sorting the sub-connections corresponding to each user according to the order of the scores from high to low;
and deleting the sub-connections ranked after a preset position from the initial relationship graph to obtain the user-video relationship graph.
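The sorting and deletion steps of claim 2 can be sketched as follows. The scoring algorithm is pre-configured and not defined in this claim, so the sketch takes already computed scores as input; the function name, the `keep_top` parameter and the sample scores are hypothetical.

```python
from collections import defaultdict


def prune_sub_connections(scores, keep_top):
    """Keep, for each user, only the keep_top highest-scoring (user, video) sub-connections.

    scores: dict mapping (user, video) -> score.
    """
    by_user = defaultdict(list)
    for (user, video), score in scores.items():
        by_user[user].append((score, video))

    kept = {}
    for user, scored in by_user.items():
        # Sort by score from high to low; the video id breaks ties deterministically.
        scored.sort(key=lambda item: (-item[0], item[1]))
        for score, video in scored[:keep_top]:
            kept[(user, video)] = score
    return kept


# Hypothetical scores for an initial relation graph with four sub-connections.
scores = {
    ("u1", "v1"): 0.9,
    ("u1", "v2"): 0.2,
    ("u1", "v3"): 0.6,
    ("u2", "v1"): 0.4,
}
user_video_graph = prune_sub_connections(scores, keep_top=2)
# user_video_graph == {("u1", "v1"): 0.9, ("u1", "v3"): 0.6, ("u2", "v1"): 0.4}
```

Only the lowest-scoring sub-connection of u1 is removed, because u2 has fewer sub-connections than the preset position.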
3. The grouping method combining Deepwalk and community discovery technologies according to claim 1, wherein the performing of a random walk on the user-video relationship graph by using the improved Deepwalk algorithm to generate at least one path sequence comprises:
in the user-video relation graph, normalizing the edge formed by each node and the adjacent node to obtain the wandering probability of each node relative to the adjacent node;
determining each node in the user-video relationship graph as an initial node;
and according to the wandering probability of each node relative to the adjacent nodes, starting to wander on the user-video relation graph from each initial node until a cycle termination condition is met, stopping wandering, and generating the at least one path sequence.
4. The grouping method combining Deepwalk and community discovery technologies as claimed in claim 3, wherein said normalizing the edges formed by each node and neighboring nodes to obtain the wandering probability of each node relative to neighboring nodes comprises:
acquiring the score of each sub-connection corresponding to each user;
calculating a first sum of scores of each sub-connection corresponding to each user;
calculating the quotient of the score of each sub-connection and the first sum value to obtain the wandering probability of the node of the corresponding user wandering to the node of the corresponding video in each sub-connection;
acquiring the score of each sub-connection corresponding to each video;
calculating a second sum of scores for each sub-connection corresponding to each video;
calculating the quotient of the score of each sub-connection and the second sum value to obtain the wandering probability of the node of the corresponding video wandering to the node of the corresponding user in each sub-connection;
and integrating the wandering probability of each sub-connection from the node of the corresponding user to the node of the corresponding video and the wandering probability of each sub-connection from the node of the corresponding video to the node of the corresponding user to obtain the wandering probability of each node relative to the adjacent node.
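The normalization in claim 4 can be sketched as below: each sub-connection score is divided by the sum of scores at the user end (the first sum) and by the sum of scores at the video end (the second sum). The function name and the sample scores are hypothetical, and the sketch assumes that user ids and video ids never collide.

```python
from collections import defaultdict


def walk_probabilities(scores):
    """Normalize sub-connection scores into per-node wandering (walk) probabilities.

    scores: dict mapping (user, video) -> positive score.
    Returns dict: node -> {adjacent node: probability}; each inner dict sums to 1.
    """
    user_total = defaultdict(float)   # first sum: all sub-connections of a user
    video_total = defaultdict(float)  # second sum: all sub-connections of a video
    for (user, video), score in scores.items():
        user_total[user] += score
        video_total[video] += score

    probabilities = defaultdict(dict)
    for (user, video), score in scores.items():
        probabilities[user][video] = score / user_total[user]    # user -> video
        probabilities[video][user] = score / video_total[video]  # video -> user
    return dict(probabilities)


# Hypothetical scores: u1 watched v1 and v2, u2 watched v1.
probabilities = walk_probabilities({
    ("u1", "v1"): 3.0,
    ("u1", "v2"): 1.0,
    ("u2", "v1"): 1.0,
})
# probabilities["u1"] == {"v1": 0.75, "v2": 0.25}
# probabilities["v1"] == {"u1": 0.75, "u2": 0.25}
```

The same edge can carry different probabilities in the two directions, since the user end and the video end are normalized separately.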
5. The grouping method combining Deepwalk and community discovery techniques as claimed in claim 3, wherein said satisfying a loop termination condition comprises:
acquiring a preset walking step number threshold value, and determining that the loop termination condition is met when the walking step number is detected to reach the walking step number threshold value; and/or
in the wandering process, when it is detected that, after wandering from a first node to a second node, the walk returns directly from the second node to the first node, determining that an in-place loop has been generated and that the loop termination condition is met.
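A walk that honors both termination conditions of claim 5, started once from every node as in claim 3, might look like the following sketch. It assumes that the returning step of an in-place loop is dropped from the path sequence, which the claim leaves open; the function names, the `max_steps` parameter and the seed are hypothetical.

```python
import random


def random_walk(probabilities, start, max_steps, rng):
    """Walk from start until max_steps moves are made or an in-place loop appears.

    probabilities: dict node -> {adjacent node: probability}.
    Returns the path sequence as a list of nodes.
    """
    path = [start]
    for _ in range(max_steps):  # condition 1: walking step number threshold
        adjacent = probabilities.get(path[-1])
        if not adjacent:
            break  # no adjacent node to move to
        targets = list(adjacent)
        weights = [adjacent[target] for target in targets]
        nxt = rng.choices(targets, weights=weights, k=1)[0]
        if len(path) >= 2 and nxt == path[-2]:
            break  # condition 2: direct return to the previous node (in-place loop)
        path.append(nxt)
    return path


def generate_paths(probabilities, max_steps, seed=0):
    """Start one walk from every node and collect the resulting path sequences."""
    rng = random.Random(seed)
    return [random_walk(probabilities, node, max_steps, rng)
            for node in sorted(probabilities)]


# Hypothetical graph in which every move is forced, so the result is deterministic.
forced = {"u1": {"v1": 1.0}, "v1": {"u1": 1.0}}
paths = generate_paths(forced, max_steps=10)
# paths == [["u1", "v1"], ["v1", "u1"]]
```

Passing a seeded `random.Random` instance keeps the walks reproducible, which helps when comparing groupings across runs.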
6. The grouping method combining Deepwalk and community discovery technologies as claimed in claim 1, wherein said constructing a user relationship graph according to the similarity between each two users comprises:
deleting the nodes corresponding to the videos from the user-video relation graph to obtain an initial user relation graph;
and determining the similarity between every two users as the weight of each edge corresponding to the initial user relationship graph to obtain the user relationship graph.
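The two steps of claim 6 can be sketched as follows: the video nodes are discarded, and each remaining pair of users is connected by an edge weighted with their similarity. The `min_similarity` cutoff is an assumption added for illustration and is not part of the claim; the function name and the sample data are hypothetical.

```python
def build_user_graph(user_video_edges, similarities, min_similarity=0.0):
    """Drop video nodes and connect users with similarity-weighted edges.

    user_video_edges: iterable of (user, video) sub-connections.
    similarities: dict mapping (user_a, user_b) -> similarity.
    Returns dict mapping (user_a, user_b), with user_a < user_b, to the edge weight.
    """
    # Only the user end of each sub-connection is kept; video nodes are discarded.
    users = {user for user, _video in user_video_edges}
    graph = {}
    for (a, b), similarity in similarities.items():
        # min_similarity is a hypothetical cutoff, not taken from the claim.
        if a in users and b in users and a != b and similarity > min_similarity:
            graph[tuple(sorted((a, b)))] = similarity
    return graph


# Hypothetical inputs: three users, two videos, and precomputed similarities.
user_graph = build_user_graph(
    [("u1", "v1"), ("u2", "v1"), ("u3", "v2")],
    {("u2", "u1"): 0.8, ("u1", "u3"): 0.0, ("u2", "u3"): 0.3},
)
# user_graph == {("u1", "u2"): 0.8, ("u2", "u3"): 0.3}
```

With the default cutoff of 0.0, pairs with no similarity get no edge, which keeps the graph sparse before community discovery is applied.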
7. The grouping method combining Deepwalk and community discovery technologies as claimed in claim 1, wherein after obtaining at least one learning group, said method further comprises:
at intervals of a first preset time interval, sorting the videos watched by the users in each learning group by number of views from high to low, acquiring the videos ranked within a preset number of top positions as target videos of each learning group, extracting keywords of the target videos of each learning group, and determining the keywords of the target videos of each learning group as labels of each learning group; and/or
acquiring the access probability of the node corresponding to each user in each learning group, determining the user corresponding to the node with the highest access probability as a core user of each learning group, and establishing a first position label for the core user; and/or
acquiring the performance level of the users in each learning group, determining the user with the highest performance level as a target user, and establishing a second position label for the target user; and/or
updating the viewing data at intervals of a second preset time interval, sending a grouping confirmation request to a corresponding user, and updating the learning group according to the updated viewing data and the grouping confirmation information when receiving the grouping confirmation information fed back by the corresponding user; and/or
when it is detected that the number of members of a learning group is less than a threshold number of members, deleting the detected learning group.
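Two of the optional steps in claim 7 lend themselves to a short sketch: selecting the most viewed videos of a learning group as target videos, and deleting groups that fall below the member threshold. Keyword extraction from the target videos is not shown. The function names, the view-log format and the sample data are hypothetical.

```python
from collections import Counter


def top_videos(view_log, members, top_n):
    """Return the top_n most viewed videos among a group's members.

    view_log: iterable of (user, video) records, one record per viewing.
    """
    counts = Counter(video for user, video in view_log if user in members)
    # Sort by view count from high to low; the video id breaks ties deterministically.
    ranked = sorted(counts, key=lambda video: (-counts[video], video))
    return ranked[:top_n]


def drop_small_groups(groups, min_members):
    """Delete learning groups that have fewer members than min_members."""
    return [group for group in groups if len(group) >= min_members]


# Hypothetical view log and groups, for illustration only.
view_log = [
    ("u1", "v1"), ("u2", "v1"), ("u1", "v2"),
    ("u3", "v3"), ("u2", "v2"), ("u1", "v1"),
]
target_videos = top_videos(view_log, members={"u1", "u2"}, top_n=1)
remaining = drop_small_groups([["u1", "u2", "u3"], ["u4"]], min_members=2)
# target_videos == ["v1"]
# remaining == [["u1", "u2", "u3"]]
```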
8. A grouping apparatus combining Deepwalk and community discovery technologies, comprising:
a construction unit, configured to acquire watching data of a user on a training video and construct a user-video relation graph according to the watching data;
a walking unit, configured to perform a random walk on the user-video relation graph by using an improved Deepwalk algorithm and generate at least one path sequence;
a vectorization unit, configured to vectorize the at least one path sequence to obtain at least one path vector;
an identifying unit, configured to identify a user vector from the at least one path vector;
a calculating unit, configured to calculate the similarity between every two users according to the user vector;
the construction unit being further configured to construct a user relationship graph according to the similarity between every two users;
and a dividing unit, configured to divide the users into groups based on the user relationship graph by adopting a community discovery technology to obtain at least one learning group.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the grouping method combining Deepwalk and community discovery technologies as claimed in any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, causes the processor to carry out the grouping method combining Deepwalk and community discovery technologies as claimed in any one of claims 1 to 7.
CN202110868902.7A 2021-07-30 2021-07-30 Grouping method, device, equipment and medium combining Deepwalk and community discovery technology Active CN113312514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110868902.7A CN113312514B (en) 2021-07-30 2021-07-30 Grouping method, device, equipment and medium combining Deepwalk and community discovery technology

Publications (2)

Publication Number Publication Date
CN113312514A (en) 2021-08-27
CN113312514B (en) 2021-11-09

Family

ID=77382144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110868902.7A Active CN113312514B (en) 2021-07-30 2021-07-30 Grouping method, device, equipment and medium combining Deepwalk and community discovery technology

Country Status (1)

Country Link
CN (1) CN113312514B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170075908A1 (en) * 2015-09-10 2017-03-16 Adobe Systems Incorporated Incorporating Social-Network Connections Information into Estimated User-Ratings of Videos for Video Recommendations
CN107609063A (en) * 2017-08-29 2018-01-19 重庆邮电大学 A kind of the mobile phone application commending system and its method of multi-tag classification
US20190379628A1 (en) * 2018-06-07 2019-12-12 Arizona Board Of Regents On Behalf Of Arizona State University Method and apparatus for detecting fake news in a social media network
CN110866838A (en) * 2019-11-06 2020-03-06 西安邮电大学 Network representation learning algorithm based on transition probability preprocessing
CN111246256A (en) * 2020-02-21 2020-06-05 华南理工大学 Video recommendation method based on multi-mode video content and multi-task learning
CN112231579A (en) * 2019-12-30 2021-01-15 北京邮电大学 Social video recommendation system and method based on implicit community discovery
CN112487110A (en) * 2020-12-07 2021-03-12 中国船舶重工集团公司第七一六研究所 Overlapped community evolution analysis method and system based on network structure and node content
CN112910680A (en) * 2020-12-30 2021-06-04 重庆邮电大学 Network embedding method for fusing multi-granularity community information
CN113094594A (en) * 2021-03-22 2021-07-09 北京海致星图科技有限公司 Similar financial community network mining algorithm based on graph partitioning algorithm and graph embedding algorithm

Also Published As

Publication number Publication date
CN113312514B (en) 2021-11-09

Similar Documents

Publication Publication Date Title
US11893514B2 (en) Contextual-based method and system for identifying and revealing selected objects from video
US20210141814A1 (en) Concept-level user intent profile extraction and applications
Macedo et al. Context-aware event recommendation in event-based social networks
Peng et al. Retweet modeling using conditional random fields
US20160180402A1 (en) Method for recommending products based on a user profile derived from metadata of multimedia content
US20120072283A1 (en) Mobile application recommendation system and method
CN105378720A (en) Media content discovery and character organization techniques
CN111885399A (en) Content distribution method, content distribution device, electronic equipment and storage medium
CN109272390A (en) The personalized recommendation method of fusion scoring and label information
Jia et al. Multi-modal learning for video recommendation based on mobile application usage
Lei et al. Social diffusion analysis with common-interest model for image annotation
Xu et al. Towards annotating media contents through social diffusion analysis
JP2022531410A (en) Digital anthropology and ethnographic system
CN113312514B (en) Grouping method, device, equipment and medium combining Deepwalk and community discovery technology
Sabet Social Media Posts Popularity Prediction During Long-Running Live Events A case study on Fashion Week
Mahalle et al. Foundations of data science for engineering problem solving
Liu et al. On the influence propagation of web videos
CN114596108A (en) Object recommendation method and device, electronic equipment and storage medium
Elhishi et al. Perspectives on the evolution of online communities
Ma Modeling users for online advertising
JAVADIAN SABET Social media posts popularity prediction during long-running live events. A case study on fashion week
Widisinghe et al. picSEEK: Collaborative filtering for context-based image recommendation
Sargar Recommender system using reinforcement learning
Ntalianis et al. Multiresolution organization of social media users’ profiles: Fast detection and efficient transmission of characteristic profiles
CN114610905B (en) Data processing method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant