CN115344787B - Multi-granularity recommendation method, system, device and storage medium

Multi-granularity recommendation method, system, device and storage medium

Info

Publication number
CN115344787B
Authority
CN
China
Prior art keywords
recommendation
feature vector
word
sequence
similarity
Prior art date
Legal status
Active
Application number
CN202211011337.3A
Other languages
Chinese (zh)
Other versions
CN115344787A (en)
Inventor
宋宇
Current Assignee
South China Normal University
Original Assignee
South China Normal University
Priority date
Filing date
Publication date
Application filed by South China Normal University
Priority to CN202211011337.3A
Publication of CN115344787A
Application granted
Publication of CN115344787B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a multi-granularity recommendation method, a system, a device and a storage medium.

Description

Multi-granularity recommendation method, system, device and storage medium
Technical Field
The application relates to the technical field of computers, in particular to a multi-granularity recommendation method, a system, a device and a storage medium.
Background
Compared with traditional lecture-based teaching, classroom dialogue plays an important role in motivating students to participate, strengthening higher-order thinking skills and improving overall literacy. Developing high-quality classroom dialogue is therefore of great significance for raising the quality of basic education and teaching and for cultivating innovative talent.
However, classroom dialogue spans multiple subjects, the underlying processes of knowledge exploration and construction are complex, and the patterns of thinking development and cognitive evolution are implicit. In real classrooms a large amount of low-quality dialogue exists, and dialogues from different subjects, grades and teaching units differ in content and dialogue function, so the effectiveness of classroom dialogue is low. In addition, traditional classroom dialogue recommendation relies solely on expert guidance; it is mostly single-point, lacks pertinence, and can hardly provide teachers with rich recommendation content.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art to a certain extent.
Therefore, the embodiment of the invention provides a multi-granularity recommendation method, a multi-granularity recommendation system, a multi-granularity recommendation device and a storage medium, and the effectiveness, pertinence and richness of classroom dialogue recommendation are improved.
In order to achieve the technical purpose, the technical scheme adopted by the embodiment of the invention comprises the following steps:
in one aspect, the embodiment of the invention provides a multi-granularity recommendation method, which comprises the following steps:
establishing a dialogue recommendation data set according to the corpus, wherein the dialogue recommendation data set comprises a course recommendation data set, a dialogue sequence recommendation data set, a unilateral speaking recommendation data set and a word recommendation data set;
generating a first recommendation result according to the course recommendation data set, wherein the first recommendation result is a recommendation result of a similar course;
generating a second recommendation result according to the first recommendation result and the dialogue sequence recommendation data set, wherein the second recommendation result is a recommendation result of a similar dialogue sequence;
generating a third recommendation result according to the second recommendation result and the single-party speech recommendation data set, wherein the third recommendation result is a recommendation result of a similar single-party speech;
generating a fourth recommendation result according to the third recommendation result, the word recommendation data set and a vocabulary set of a plurality of pre-trained word vector models, wherein the fourth recommendation result is a recommendation result of similar words;
and generating a multi-granularity recommendation report according to the first recommendation result, the second recommendation result, the third recommendation result and the fourth recommendation result.
According to the multi-granularity recommendation method, a dialogue recommendation data set comprising a course recommendation data set, a dialogue sequence recommendation data set, a single-side speech recommendation data set and a word recommendation data set is constructed according to a corpus, a first recommendation result is generated according to the course recommendation data set, a second recommendation result is generated according to the first recommendation result and the dialogue sequence recommendation data set, a third recommendation result is generated according to the second recommendation result and the single-side speech recommendation data set, and a fourth recommendation result is generated according to the third recommendation result, the word recommendation data set and word list sets of a plurality of pre-trained word vector models, so that the effectiveness, pertinence and richness of the classroom dialogue recommendation result are improved.
In addition, the multi-granularity recommendation method according to the above embodiment of the present invention may further have the following additional technical features:
further, in the multi-granularity recommendation method of the embodiment of the invention, the corpus includes classroom corpus metadata and classroom corpus content data, the classroom corpus metadata is used for describing basic information of a classroom, the classroom corpus content data is used for describing specific information of the classroom, and the classroom corpus content data includes course keywords and dialogue sequences;
the construction of the dialogue recommendation data set according to the corpus comprises the following steps:
constructing the course recommendation data set according to the classroom corpus metadata and the course keywords;
constructing the dialogue sequence recommendation data set and the unilateral speaking recommendation data set according to the dialogue sequence;
and carrying out text preprocessing on the speech content in the unilateral speech recommendation data set to obtain the word recommendation data set.
Further, in one embodiment of the present invention, the generating a first recommendation from the course recommendation dataset includes:
acquiring a first recommendation list from the course recommendation data set according to a first preset rule, wherein the first recommendation list is a recommendation list of candidate courses;
generating a first feature vector according to the first recommendation list, wherein the first feature vector comprises a first course meta information feature vector and a first keyword sense feature vector;
generating a first feature sequence according to the first feature vector, wherein the first feature sequence is the feature sequence of the first recommendation list;
acquiring a second course meta information feature vector and a second keyword sense feature vector according to the first feature sequence, wherein the second course meta information feature vector is a course meta information feature vector of a source course in the first recommendation list, and the second keyword sense feature vector is a keyword sense feature vector of the source course;
calculating a first similarity according to the second course meta-information feature vector, wherein the first similarity is cosine similarity between the second course meta-information feature vector and a course meta-information feature vector of any course in a first target course set, and the first target course set is a set formed by other courses except the source course in the first recommendation list;
calculating second similarity according to the second keyword semantic feature vector, wherein the second similarity is cosine similarity between the second keyword semantic feature vector and the keyword semantic feature vector of any course in the first target course set;
calculating a first comprehensive similarity according to the first similarity and the second similarity;
acquiring courses with the first comprehensive similarity greater than a first similarity threshold value from the first target course set as a second target course set;
and sequencing courses in the second target course set according to the first comprehensive similarity, and generating the first recommendation result.
Further, in an embodiment of the present invention, the generating a second recommendation result according to the first recommendation result and the dialog sequence recommendation data set includes:
acquiring a second recommendation list from the dialog sequence recommendation data set according to the first recommendation result and a second preset rule, wherein the second preset rule comprises a starting tag set, a middle tag set and an ending tag set, and the second recommendation list is a recommendation list of candidate dialog sequences;
extracting a second feature vector by adopting a pre-trained first language model according to the second recommendation list, wherein the second feature vector comprises a high-order semantic feature vector of a dialogue sequence;
generating a second feature sequence according to the second feature vector, wherein the second feature sequence is the feature sequence of the second recommendation list;
acquiring a first high-order semantic feature vector according to the second feature sequence, wherein the first high-order semantic feature vector is a high-order semantic feature vector of a source dialogue sequence in the second recommendation list;
calculating third similarity according to the first high-order semantic feature vector, wherein the third similarity is cosine similarity of the first high-order semantic feature vector and a second high-order semantic feature vector, the second high-order semantic feature vector is a high-order semantic feature vector of any dialog sequence in a first target dialog sequence set, and the first target dialog sequence set is a set formed by other dialog sequences except the source dialog sequence in the second recommendation list;
obtaining a dialogue sequence with the third similarity larger than a second similarity threshold value from the first target dialogue sequence set as a second target dialogue sequence set;
and sequencing the dialogue sequences in the second target dialogue sequence set according to the third similarity, and generating the second recommendation result.
Further, in an embodiment of the present invention, the generating a third recommendation result according to the second recommendation result and the single-party speech recommendation data set includes:
acquiring a third recommendation list from the single-party utterance recommendation data set according to the second recommendation result and a third preset rule, wherein the third preset rule comprises a target tag set, and the third recommendation list is a recommendation list of candidate single-party utterances;
extracting a third feature vector by adopting a pre-trained second language model according to the third recommendation list, wherein the third feature vector comprises high-order semantic feature vectors of single-party utterances;
generating a third feature sequence according to the third feature vector, wherein the third feature sequence is the feature sequence of the third recommendation list;
acquiring a third high-order semantic feature vector according to the third feature sequence, wherein the third high-order semantic feature vector is a high-order semantic feature vector of a source single-party utterance in the third recommendation list;
calculating a fourth similarity according to the third high-order semantic feature vector, wherein the fourth similarity is cosine similarity of the third high-order semantic feature vector and a fourth high-order semantic feature vector, the fourth high-order semantic feature vector is a high-order semantic feature vector of any single-party utterance in a first target single-party utterance set, and the first target single-party utterance set is a set formed by other single-party utterances except the source single-party utterance in the third recommendation list;
acquiring the single-party utterances with the fourth similarity greater than a third similarity threshold value from the first target single-party utterance set as a second target single-party utterance set;
and sequencing the single-party utterances in the second target single-party utterance set according to the fourth similarity, and generating the third recommendation result.
Further, in an embodiment of the present invention, the generating the fourth recommendation according to the third recommendation, the word recommendation data set, and the vocabulary set of the plurality of pre-trained word vector models includes:
acquiring words which are simultaneously present in the third recommendation result and the word recommendation data set according to a fourth preset rule, and generating a candidate word recommendation list;
obtaining a target part-of-speech set and a target word length interval according to the word vector model and the vocabulary set;
acquiring word lists conforming to the target part-of-speech set and the target word length section from the word list set as candidate word sets of the word vector models;
combining the candidate word sets of each word vector model and the candidate word recommendation list to generate a fourth recommendation list;
carrying out semantic vector characterization on the source words in the fourth recommendation list by adopting each word vector model to generate a fourth feature sequence;
carrying out semantic vector characterization on any word in a first target word set by adopting each word vector model to generate a fifth feature sequence, wherein the first target word set is a set formed by other words except the source word in the fourth recommendation list;
calculating a fifth similarity according to the fourth feature sequence and the fifth feature sequence, wherein the fifth similarity is the cosine similarity between the source word and any word in the first target word set under the semantic vector space of each word vector model;
calculating a second comprehensive similarity according to the fifth similarity;
acquiring words with the second comprehensive similarity greater than a fourth similarity threshold value from the first target word set as a second target word set;
and sorting the words in the second target word set according to the second comprehensive similarity, and generating the fourth recommendation result.
Further, in one embodiment of the present invention, after the generating the multi-granularity recommendation report according to the first recommendation result, the second recommendation result, the third recommendation result, and the fourth recommendation result, the multi-granularity recommendation method further includes:
displaying the multi-granularity recommendation report.
In another aspect, an embodiment of the present invention provides a multi-granularity recommendation system, including:
the first module is used for constructing a dialogue recommendation data set according to the corpus, wherein the dialogue recommendation data set comprises a course recommendation data set, a dialogue sequence recommendation data set, a unilateral speaking recommendation data set and a word recommendation data set;
the second module is used for generating a first recommendation result according to the course recommendation data set, wherein the first recommendation result is a recommendation result of a similar course;
a third module, configured to generate a second recommendation result according to the first recommendation result and the dialog sequence recommendation data set, where the second recommendation result is a recommendation result of a similar dialog sequence;
a fourth module, configured to generate a third recommendation result according to the second recommendation result and the single-party speech recommendation data set, where the third recommendation result is a recommendation result of a similar single-party speech;
a fifth module, configured to generate a fourth recommendation result according to the third recommendation result, the word recommendation data set, and a vocabulary set of a plurality of pre-trained word vector models, where the fourth recommendation result is a recommendation result of a similar word;
and a sixth module, configured to generate a multi-granularity recommendation report according to the first recommendation result, the second recommendation result, the third recommendation result, and the fourth recommendation result.
In another aspect, an embodiment of the present invention provides a multi-granularity recommendation apparatus, including:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the multi-granularity recommendation method.
In another aspect, embodiments of the present invention provide a storage medium having stored therein a processor-executable program, which when executed by a processor, is configured to implement the multi-granularity recommendation method.
The invention has the advantages and beneficial effects that:
the embodiment of the invention can be widely applied to the technical field of multi-granularity recommendation, and the effectiveness, pertinence and richness of the classroom dialogue recommendation result are improved by constructing the dialogue recommendation data set comprising the course recommendation data set, the dialogue sequence recommendation data set, the single-party speech recommendation data set and the word recommendation data set according to the corpus, generating the first recommendation result according to the course recommendation data set, generating the second recommendation result according to the first recommendation result and the dialogue sequence recommendation data set, generating the third recommendation result according to the second recommendation result and the single-party speech recommendation data set, and generating the fourth recommendation result according to the third recommendation result, the word recommendation data set and the word list set of a plurality of pre-trained word vector models.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the following description is made with reference to the accompanying drawings of the embodiments of the present application or the related technical solutions in the prior art, it should be understood that, in the following description, the drawings are only for convenience and clarity to describe some embodiments in the technical solutions of the present application, and other drawings may be obtained according to these drawings without any inventive effort for those skilled in the art.
FIG. 1 is a flow chart of an embodiment of a multi-granularity recommendation method according to the present invention;
FIG. 2 is a schematic diagram of a multi-granularity recommendation system according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an embodiment of a multi-granularity recommendation device according to the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application. The step numbers in the following embodiments are set for convenience of illustration only, and the order between the steps is not limited in any way, and the execution order of the steps in the embodiments may be adaptively adjusted according to the understanding of those skilled in the art.
The terms "first," "second," "third," and "fourth" and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Classroom dialogue spans multiple subjects, the underlying processes of knowledge exploration and construction are complex, and the patterns of thinking development and cognitive evolution are implicit. In real classrooms a large amount of low-quality dialogue exists, and dialogues from different subjects, grades and teaching units differ in content and dialogue function, so the effectiveness of classroom dialogue is low. In addition, traditional classroom dialogue recommendation relies solely on expert guidance; it is mostly single-point, lacks pertinence, and can hardly provide teachers with rich recommendation content. Therefore, the invention provides a multi-granularity recommendation method, system, device and storage medium, which constructs a dialogue recommendation data set comprising a course recommendation data set, a dialogue sequence recommendation data set, a single-party speech recommendation data set and a word recommendation data set according to a corpus, generates a first recommendation result according to the course recommendation data set, generates a second recommendation result according to the first recommendation result and the dialogue sequence recommendation data set, generates a third recommendation result according to the second recommendation result and the single-party speech recommendation data set, and generates a fourth recommendation result according to the third recommendation result, the word recommendation data set and a word list set of a plurality of pre-trained word vector models, thereby improving the effectiveness, pertinence and richness of the classroom dialogue recommendation result.
The following describes in detail a multi-granularity recommendation method, system, device and storage medium according to an embodiment of the present invention with reference to the accompanying drawings, and first describes a multi-granularity recommendation method according to an embodiment of the present invention with reference to the accompanying drawings.
Referring to fig. 1, a multi-granularity recommendation method is provided in the embodiment of the present invention, and the multi-granularity recommendation method in the embodiment of the present invention may be applied to a terminal, a server, or software running in the terminal or the server. The terminal may be, but is not limited to, a tablet computer, a notebook computer, a desktop computer, etc. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content Delivery Networks (CDNs), basic cloud computing services such as big data and artificial intelligence platforms, and the like. The multi-granularity recommendation method in the embodiment of the invention mainly comprises the following steps:
s101, constructing a dialogue recommendation data set according to a corpus;
Wherein the dialogue recommendation data set comprises a course recommendation data set, a dialogue sequence recommendation data set, a unilateral utterance recommendation data set and a word recommendation data set.
In the embodiment of the invention, the corpus comprises classroom corpus metadata and classroom corpus content data, wherein the classroom corpus metadata is used for describing basic information of a classroom, the classroom corpus content data is used for describing specific information of the classroom, and the classroom corpus content data comprises course keywords and dialogue sequences.
Optionally, the class corpus metadata includes course names, belonging subjects, teaching stages, teaching grades, teaching teachers, schools and evaluation grades.
Optionally, the course keywords are composed of a user preset keyword set and an automatic extraction keyword set, wherein the automatic extraction keyword set is generated by extracting keywords from the speech content.
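Purely for illustration (not part of the patent text), the classroom corpus metadata and course keywords described above could be modeled with simple data structures such as the following Python sketch; all class and field names are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ClassroomMetadata:
    """Basic information of a classroom session (fields follow the description above)."""
    course_name: str
    subject: str          # belonging subject
    teaching_stage: str   # e.g. primary / junior / senior
    teaching_grade: str
    teacher: str
    school: str
    evaluation_grade: str # quality rating of the lesson

@dataclass
class CourseKeywords:
    """Course keywords: user-preset keywords plus keywords auto-extracted from utterances."""
    preset: List[str] = field(default_factory=list)
    auto_extracted: List[str] = field(default_factory=list)

    def all(self) -> List[str]:
        # union of the two keyword sources, preserving order and removing duplicates
        seen, merged = set(), []
        for w in self.preset + self.auto_extracted:
            if w not in seen:
                seen.add(w)
                merged.append(w)
        return merged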
S101 may be further divided into the following steps S1011-S1013:
step S1011, constructing a course recommendation data set according to the classroom corpus metadata and the course keywords;
specifically, in the embodiment of the present invention, the course recommendation data set is composed of the classroom corpus metadata and the course keywords in the classroom corpus content data.
Step S1012, constructing a dialogue sequence recommendation data set and a unilateral speaking recommendation data set according to the dialogue sequence;
the dialogue sequence consists of a plurality of single-side sentences, and each single-side sentence consists of an utterance main body, utterance content and category labels thereof. Optionally, the speech subject is a teacher or a student; the speaking content is speaking text and is transcribed from a classroom video file or a classroom audio file; category labels are generated by encoding the speech content manually or by a machine.
Specifically, in the embodiment of the present invention, the dialog sequence recommendation data set is composed of a plurality of single-side sentences in the dialog sequence, and the single-side utterance recommendation data set is composed of an utterance subject, an utterance content, and a category label thereof in each single-side sentence.
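As a hedged sketch (the class and field names are hypothetical, not taken from the patent), a single-party utterance and a dialogue sequence could be represented like this, with the two data sets derived from them:
from dataclasses import dataclass
from typing import List

@dataclass
class Utterance:
    """One single-party utterance: speaker, transcribed text, and its category label."""
    subject: str   # "teacher" or "student"
    content: str   # transcribed utterance text
    label: str     # category label produced by manual or machine coding

@dataclass
class DialogueSequence:
    """A dialogue sequence is an ordered list of single-party utterances."""
    utterances: List[Utterance]

# The dialogue sequence recommendation data set is then a list of DialogueSequence
# objects, and the single-party utterance recommendation data set is the flat list
# of all Utterance objects they contain.
def flatten(sequences: List[DialogueSequence]) -> List[Utterance]:
    return [u for seq in sequences for u in seq.utterances]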
Step S1013, performing text preprocessing on the speech content in the unilateral speech recommendation data set to obtain a word recommendation data set.
The text preprocessing comprises word segmentation, part-of-speech tagging, removal of words with specific parts of speech, and removal of stop words.
Specifically, in the embodiment of the invention, text preprocessing is performed on the speech content of each single-party speech in the single-party speech recommendation data set to obtain a word recommendation data set.
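A minimal preprocessing sketch is given below; it assumes a Chinese segmenter with part-of-speech tagging such as jieba.posseg, and the stop-word list and dropped parts of speech are placeholders rather than values prescribed by the patent.
import jieba.posseg as pseg  # third-party Chinese segmenter with POS tagging

STOP_WORDS = {"的", "了", "吗", "呢"}   # placeholder stop-word list
DROP_POS = {"x", "u", "p", "c"}        # placeholder parts of speech to remove

def preprocess(utterance: str) -> list[tuple[str, str]]:
    """Segment an utterance, tag parts of speech, and drop unwanted POS and stop words."""
    tokens = []
    for word, flag in pseg.cut(utterance):
        if flag in DROP_POS or word in STOP_WORDS or not word.strip():
            continue
        tokens.append((word, flag))
    return tokens

# Each single-party utterance is preprocessed in turn to build the word recommendation data set.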
S102, generating a first recommendation result according to a course recommendation data set;
the first recommendation result is a recommendation result of a similar course.
S102 may be further divided into the following steps S1021-S1029:
s1021, acquiring a first recommendation list from the course recommendation data set according to a first preset rule;
the first recommendation list is a recommendation list of candidate courses.
The first preset rule comprises fuzzy matching conditions and accurate matching rules for screening different attributes in the class corpus metadata of the course recommendation data set.
Specifically, in one embodiment of the invention, the input items entered by the user, such as the course name, the teaching teacher and the teacher's school, are obtained and fuzzy-matched against the course recommendation data set to obtain a preliminary course recommendation list; then, according to the options selected by the user, such as subject, teaching stage, teaching grade and evaluation grade, exact matching is performed on the preliminary course recommendation list to obtain the first recommendation list.
Alternatively, in another embodiment of the invention, fuzzy matching is performed on the course recommendation data set according to the options selected by the user, such as subject, teaching stage, teaching grade and evaluation grade, to obtain a preliminary course recommendation list; then exact matching is performed on the preliminary course recommendation list according to the input items entered by the user, such as the course name, the teaching teacher and the teacher's school, to obtain the first recommendation list.
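The following sketch illustrates one possible reading of the two matching stages, with a substring test for fuzzy matching and an equality test for exact matching; the field names and example values are hypothetical and not prescribed by the patent.
# Tiny illustrative course records (field names are hypothetical).
course_dataset = [
    {"course_name": "Meaning of Fractions", "teacher": "Teacher Wang", "school": "School A",
     "subject": "math", "teaching_stage": "primary", "teaching_grade": "grade 5", "evaluation_grade": "excellent"},
    {"course_name": "Photosynthesis", "teacher": "Teacher Li", "school": "School B",
     "subject": "biology", "teaching_stage": "junior", "teaching_grade": "grade 8", "evaluation_grade": "good"},
]

def fuzzy_match(courses, text_inputs):
    """Keep courses whose name, teacher or school contains any non-empty user text input."""
    hits = []
    for c in courses:
        fields = (c["course_name"], c["teacher"], c["school"])
        if any(q and q in f for q in text_inputs for f in fields):
            hits.append(c)
    return hits

def exact_match(courses, options):
    """Keep courses whose selected attributes equal the user's chosen options (empty = ignore)."""
    return [c for c in courses
            if all(not v or c.get(k) == v for k, v in options.items())]

# First embodiment: fuzzy match on free-text inputs, then exact match on selected options.
preliminary = fuzzy_match(course_dataset, ["Fractions", "", ""])
first_recommendation_list = exact_match(preliminary, {"subject": "math", "teaching_stage": "primary"})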
Step S1022, generating a first feature vector according to the first recommendation list;
the first feature vector comprises a first course meta information feature vector and a first keyword sense feature vector.
Specifically, the different attributes in the classroom corpus metadata of each course entry in the first recommendation list are one-hot encoded, dimension-reduced and embedded into a feature vector of fixed dimension to obtain the first course meta-information feature vector w_m; a pre-trained word vector model is used to vectorize the course keywords and sum the vectors to obtain the first keyword semantic feature vector w_k.
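One possible realization of these two feature vectors is sketched below with scikit-learn and a generic word-vector lookup; the choice of truncated SVD for dimension reduction and the vector sizes are illustrative assumptions, not specified by the patent.
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.decomposition import TruncatedSVD

def meta_feature_vectors(meta_rows, dim=16):
    """One-hot encode metadata attributes, then reduce to a fixed-dimension embedding w_m."""
    onehot = OneHotEncoder(handle_unknown="ignore").fit_transform(meta_rows)
    svd = TruncatedSVD(n_components=min(dim, onehot.shape[1] - 1))
    return svd.fit_transform(onehot)          # one row per course entry

def keyword_feature_vector(keywords, word_vectors, dim=100):
    """Vectorize each keyword with a pre-trained word-vector model and add the vectors to get w_k."""
    vec = np.zeros(dim)
    for kw in keywords:
        if kw in word_vectors:                # word_vectors: mapping word -> np.ndarray
            vec += word_vectors[kw]
    return vec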
Step S1023, generating a first feature sequence according to the first feature vector;
the first feature sequence is a feature sequence of the first recommendation list.
Specifically, in an embodiment of the present invention, the course meta-information feature vector w_mi and the keyword semantic feature vector w_ki of each course entry in the first recommendation list are assembled into the first feature sequence:
[(w_m1, w_k1), ..., (w_mi, w_ki), ..., (w_mn, w_kn)]
wherein each course entry is a binary tuple comprising a course meta-information feature vector and a keyword semantic feature vector, so the first recommendation list may be expressed as:
C = {(w_m1, w_k1), ..., (w_mi, w_ki), ..., (w_mn, w_kn)}, w_mi ∈ R^(d_m), w_ki ∈ R^(d_k)
step S1024, obtaining a second course meta information feature vector and a second key word sense feature vector according to the first feature sequence;
wherein the second course meta-information feature vector w'_m is the course meta-information feature vector of the source course in the first recommendation list, and the second keyword semantic feature vector w'_k is the keyword semantic feature vector of the source course.
Optionally, the source course is a course selected by the user in the first recommendation list.
Step S1025, calculating a first similarity according to the second course meta-information feature vector;
wherein the first similarity is the cosine similarity between the second course meta-information feature vector w'_m and the course meta-information feature vector w_m of any course in the first target course set, and the first target course set is the set of courses in the first recommendation list other than the source course.
First similarity:
s_m = CosSimilarity(w'_m, w_m)
step S1026, calculating a second similarity according to the second keyword semantic feature vector;
wherein the second similarity is the cosine similarity between the second keyword semantic feature vector w'_k and the keyword semantic feature vector w_k of any course in the first target course set.
Second similarity:
s_k = CosSimilarity(w'_k, w_k)
step S1027, calculating a first comprehensive similarity according to the first similarity and the second similarity;
specifically, in the embodiment of the present invention, the first comprehensive similarity is calculated according to the first similarity and the second similarity based on the weight coefficient a of the course meta information feature vector and the weight coefficient b of the keyword sense feature vector.
First comprehensive similarity: the weighted combination of the first similarity s_m and the second similarity s_k under the weight coefficients a and b (the exact formula is given only as an image in the original publication).
Step S1028, acquiring courses with the first comprehensive similarity greater than the first similarity threshold from the first target course set as a second target course set;
the first similarity threshold is a similarity threshold preset by a user.
Step S1029, sorting courses in the second target course set according to the first comprehensive similarity, and generating a first recommendation result.
Optionally, in an embodiment of the present invention, the courses in the second set of target courses are ordered from high to low according to the first comprehensive similarity, and a first recommendation is generated.
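Steps S1025 to S1029 can be illustrated in a few lines of NumPy; the weights a and b, the threshold, and the exact form of the weighted combination are assumptions for this sketch, since the patent gives the combination formula only as an image.
import numpy as np

def cos_sim(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def recommend_courses(source, candidates, a=0.5, b=0.5, threshold=0.6):
    """source/candidates: (course_id, w_m, w_k) tuples; returns candidates ranked by
    the weighted combination of meta-information and keyword similarities."""
    _, wm_src, wk_src = source
    scored = []
    for cid, wm, wk in candidates:
        s_m = cos_sim(wm_src, wm)             # first similarity
        s_k = cos_sim(wk_src, wk)             # second similarity
        s = a * s_m + b * s_k                 # assumed form of the first comprehensive similarity
        if s > threshold:
            scored.append((cid, s))
    return sorted(scored, key=lambda x: x[1], reverse=True)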
S103, generating a second recommendation result according to the first recommendation result and the dialogue sequence recommendation data set;
wherein the second recommendation is a recommendation of a similar dialog sequence.
S103 may be further divided into the following steps S1031-S1037:
step S1031, obtaining a second recommendation list from the dialogue sequence recommendation data set according to the first recommendation result and a second preset rule;
the second preset rule comprises a starting tag set, a middle tag set and an ending tag set, and the second recommendation list is a recommendation list of the candidate conversation sequence. It will be appreciated that the start tag set is the set of tags that appear in the beginning of the dialog sequence, the middle tag set is the set of tags that appear in the middle of the dialog sequence, and the end tag set is the set of tags that appear in the end of the dialog sequence.
Specifically, in the embodiment of the invention, the dialogue sequence recommendation data set is screened according to the first recommendation result to obtain a preliminary dialogue sequence recommendation list; then, according to the second preset rule, the preliminary dialogue sequence recommendation list is permuted and combined, and the dialogue sequences whose utterance labels conform to the position rules of the start tag set, the middle tag set and the end tag set are screened out to form the second recommendation list.
It will be appreciated that, for each dialogue sequence entry s_i of the second recommendation list S, the category labels of its utterances satisfy a position constraint (given as a formula image in the original): the label of the opening utterance belongs to Y_s, the labels of the intermediate utterances belong to Y_m, and the label of the closing utterance belongs to Y_e, where Y_s denotes the start tag set, Y_m the middle tag set, and Y_e the end tag set.
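One plausible reading of this position rule, written as code (the tag values in the comment are purely illustrative and not taken from the patent):
def satisfies_position_rule(labels, start_tags, middle_tags, end_tags):
    """Check that a dialogue sequence's utterance labels respect the start/middle/end tag sets."""
    if len(labels) < 2:
        return False
    head, *middle, tail = labels
    return (head in start_tags
            and tail in end_tags
            and all(lab in middle_tags for lab in middle))

# Example with hypothetical dialogue-act codes:
# satisfies_position_rule(["question", "response", "feedback"],
#                         {"question"}, {"response", "uptake"}, {"feedback"})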
S1032, extracting a second feature vector by adopting the pre-trained first language model according to the second recommendation list;
wherein the second feature vector comprises a high-order semantic feature vector of the dialog sequence.
Specifically, the pre-trained first language model is used to extract features from the utterance content s_i of each dialogue sequence entry in the second recommendation list, obtaining the context-dependent high-order semantic feature vector w_i of the dialogue sequence, where w_i = PLM(s_i).
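The patent does not name the first language model; a common choice would be a BERT-style encoder with mean pooling, as in this hedged sketch based on the Hugging Face Transformers library (the model name is an assumption).
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-chinese"   # assumed model; any pre-trained encoder could serve as the PLM
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

@torch.no_grad()
def plm_embed(text: str) -> torch.Tensor:
    """Return a high-order semantic feature vector w_i = PLM(s_i) via mean pooling."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    hidden = encoder(**inputs).last_hidden_state      # (1, seq_len, hidden_size)
    mask = inputs["attention_mask"].unsqueeze(-1)     # (1, seq_len, 1)
    return (hidden * mask).sum(1).squeeze(0) / mask.sum()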
Step S1033, generating a second feature sequence according to the second feature vector;
the second feature sequence is a feature sequence of the second recommendation list.
As known from step S1032, the second recommendation list may be expressed as the feature sequence [w_1, ..., w_i, ..., w_n] of its dialogue sequence entries (the expression is given as a formula image in the original).
step S1034, obtaining a first high-order semantic feature vector according to the second feature sequence;
wherein the first high-order semantic feature vector w'_i is the high-order semantic feature vector of the source dialogue sequence in the second recommendation list.
Optionally, the source dialog sequence is a dialog sequence selected by the user in the second recommendation list.
Step S1035, calculating a third similarity according to the first high-order semantic feature vector;
wherein the third similarity is the cosine similarity between the first high-order semantic feature vector w'_i and a second high-order semantic feature vector w_i, the second high-order semantic feature vector w_i being the high-order semantic feature vector of any dialogue sequence in the first target dialogue sequence set, and the first target dialogue sequence set being the set of dialogue sequences in the second recommendation list other than the source dialogue sequence.
Third similarity:
SeqSim_i = CosineSimilarity(w'_i, w_i)
step S1036, obtaining a dialogue sequence with a third similarity greater than a second similarity threshold value from the first target dialogue sequence set as a second target dialogue sequence set;
the second similarity threshold is a preset similarity threshold.
Step S1037, sorting the dialogue sequences in the second target dialogue sequence set according to the third similarity, and generating a second recommendation result.
Optionally, in an embodiment of the present invention, the dialog sequences in the second set of target dialog sequences are ordered from high to low according to a third similarity, and a second recommendation is generated.
S104, generating a third recommendation result according to the second recommendation result and the unilateral speech recommendation data set;
the third recommendation result is a recommendation result similar to the single-party words.
S104 may be further divided into the following steps S1041-S1047:
step S1041, obtaining a third recommendation list from the unilateral speech recommendation data set according to the second recommendation result and a third preset rule;
the third preset rule comprises a target label set, and is a condition or rule for screening class labels of single-side utterances of the single-side utterance recommendation data set; the third recommendation list is a recommendation list of candidate single-party utterances.
Specifically, in the embodiment of the invention, the second recommendation result is split into single-party utterances of a teacher or a student, and these utterances are filtered and matched against the single-party utterance recommendation data set to obtain a preliminary single-party utterance recommendation list; the utterances whose category labels fall within the target tag set are then screened out of the preliminary list according to the third preset rule to form the third recommendation list.
It can be appreciated that, for each single-party utterance entry u_i of the third recommendation list U, the category label of the utterance belongs to the target tag set Y_t (the constraint is given as a formula image in the original).
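For illustration only, the split-and-filter step of S1041 might look like the following sketch; the dictionary layout of the utterance records is an assumption.
def candidate_single_utterances(dialogue_sequences, utterance_dataset, target_tags):
    """Split recommended dialogue sequences into single-party utterances, keep those that
    also appear in the single-party utterance data set and whose label is in Y_t."""
    known = {(u["subject"], u["content"]) for u in utterance_dataset}
    result = []
    for seq in dialogue_sequences:
        for u in seq:                           # each u: {"subject", "content", "label"}
            if (u["subject"], u["content"]) in known and u["label"] in target_tags:
                result.append(u)
    return result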
Step S1042, extracting a third feature vector by adopting a pre-trained second language model according to a third recommendation list;
wherein the third feature vector comprises a higher-order semantic feature vector of the single utterance.
Specifically, in an embodiment of the present invention, the pre-trained second language model is used to extract features from the utterance content u_l of each single-party utterance entry in the third recommendation list, obtaining the context-dependent high-order semantic feature vector w_l of the single-party utterance, where w_l = PLM(u_l).
Step S1043, generating a third feature sequence according to the third feature vector;
the third feature sequence is a feature sequence of a third recommendation list.
As can be seen from step S1042, the third recommendation list can be expressed as the feature sequence [w_1, ..., w_l, ..., w_n] of its single-party utterance entries (the expression is given as a formula image in the original).
step S1044, obtaining a third high-order semantic feature vector according to the third feature sequence;
the third higher-order semantic feature vector w' is a higher-order semantic feature vector of the source single-party utterance in the third recommendation list.
Optionally, the source unilateral utterance is a unilateral utterance selected by the user in the third recommendation list.
Step S1045, calculating a fourth similarity according to the third high-order semantic feature vector;
wherein the fourth similarity is the cosine similarity between the third high-order semantic feature vector w'_l and a fourth high-order semantic feature vector w_l, the fourth high-order semantic feature vector w_l being the high-order semantic feature vector of any single-party utterance in the first target single-party utterance set, and the first target single-party utterance set being the set of single-party utterances in the third recommendation list other than the source single-party utterance.
Fourth similarity:
UttSim_l = CosineSimilarity(w'_l, w_l)
step S1046, obtaining a single-party utterance with a fourth similarity greater than a third similarity threshold from the first target single-party utterance set as a second target single-party utterance set;
the third similarity threshold is a preset similarity threshold.
Step S1047, sorting the single-side utterances in the second target single-side utterances set according to the fourth similarity, and generating a third recommendation result.
Optionally, in an embodiment of the present invention, the single utterances in the second target single utterance set are ordered from high to low according to a fourth similarity, and a third recommendation result is generated.
S105, generating a fourth recommendation result according to the third recommendation result, the word recommendation data set and the vocabulary sets of the plurality of pre-trained word vector models;
The fourth recommendation result is a recommendation result of similar words.
S105 may further divide the following steps S1051-S10510:
step S1051, obtaining the words which are simultaneously present in the third recommendation result and the word recommendation data set according to a fourth preset rule, and generating a candidate word recommendation list;
the fourth preset rule is a screening condition for recommended target words, and comprises parts of speech, word length and word frequency.
It will be appreciated that all words in the candidate word recommendation list meet the fourth preset rule and come from words that occur in the third recommendation result, while words that do not meet the fourth preset rule are culled even if they appear in the third recommendation result or the word recommendation data set.
Step S1052, obtaining a target part-of-speech set and a target word length interval according to the word vector model and the vocabulary set;
Specifically, in an embodiment of the present invention, given the set of word vector models M = {m_1, ..., m_i, ..., m_p} and the vocabulary set D = {d_1, ..., d_i, ..., d_p}, the target part-of-speech set P = {p_1, ..., p_j, ..., p_q} and the target word length interval [a, b] are obtained according to the fourth preset rule (the accompanying constraint is given as a formula image in the original).
Step S1053, obtaining a vocabulary which accords with a target part-of-speech set and the target word length section from a vocabulary set as a candidate vocabulary set of each word vector model;
wherein each vocabulary is d_i = {w_1, ..., w_k, ..., w_r}. The candidate word set of each word vector model consists of the words in its vocabulary whose part of speech belongs to P and whose length lies in [a, b] (the formal definition is given as a formula image in the original).
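A short sketch of this vocabulary filtering step; it assumes each vocabulary word has an associated part-of-speech tag available, which the patent does not spell out.
def candidate_vocab(vocab, pos_of, target_pos, length_interval):
    """Keep vocabulary words whose part of speech is in P and whose length lies in [a, b]."""
    a, b = length_interval
    return {w for w in vocab
            if pos_of.get(w) in target_pos and a <= len(w) <= b}

# candidate_vocab({"教师", "的", "光合作用"}, {"教师": "n", "的": "u", "光合作用": "n"},
#                 {"n", "v"}, (2, 4))  ->  {"教师", "光合作用"}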
step S1054, combining the candidate word sets of each word vector model and the candidate word recommendation list to generate a fourth recommendation list;
step S1055, semantic vector characterization is carried out on the source words in the fourth recommendation list by adopting each word vector model, and a fourth feature sequence is generated;
the source words are words selected by the user in the fourth recommendation list.
Fourth feature sequence: the sequence of semantic vectors obtained by characterizing the source word with each word vector model (the expression is given as a formula image in the original).
step S1056, semantic vector characterization is carried out on any word in the first target word set by adopting each word vector model, and a fifth feature sequence is generated;
the first target word set is a set formed by words except the source word in the fourth recommendation list.
Fifth feature sequence: the sequence of semantic vectors obtained by characterizing the target word with each word vector model (the expression is given as a formula image in the original).
step S1057, calculating a fifth similarity according to the fourth feature sequence and the fifth feature sequence;
the fifth similarity is cosine similarity of any word in the source word and the first target word set under the semantic vector space of each word vector model.
Fifth similarity:
s_mi = CosineSimilarity(w'_i, w'_mi)
step S1058, calculating a second comprehensive similarity according to the fifth similarity;
Specifically, the cosine similarities between the source word and any word in the first target word set under the semantic vector space of each word vector model are weighted and summed to obtain the second comprehensive similarity.
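Steps S1057 and S1058 can be illustrated as follows; the per-model weights are an assumption, since the patent only states that the per-model cosine similarities are weighted and summed.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def second_comprehensive_similarity(source_word, target_word, models, weights):
    """models: list of mappings word -> vector, one per pre-trained word-vector model.
    The fifth similarity is computed in each model's semantic space and then weight-summed."""
    sims = [cosine(m[source_word], m[target_word]) for m in models]   # fifth similarities s_mi
    return sum(w * s for w, s in zip(weights, sims))                  # second comprehensive similarity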
Step S1059, obtaining words with second comprehensive similarity greater than a fourth similarity threshold from the first target word set as a second target word set;
the fourth similarity threshold is a preset similarity threshold.
Step S10510, sorting the words in the second target word set according to the second comprehensive similarity, and generating a fourth recommendation result.
Optionally, in an embodiment of the present invention, the words in the second target word set are ranked from high to low according to the second comprehensive similarity, and a fourth recommendation result is generated.
S106, generating a multi-granularity recommendation report according to the first recommendation result, the second recommendation result, the third recommendation result and the fourth recommendation result.
In the embodiment of the invention, the first recommendation result is a macro-level recommendation: similar courses are recommended according to the course meta-information feature vector and the keyword semantic feature vector. The second, third and fourth recommendation results are finer-grained recommendations. The second recommendation result recommends dialogues with similar patterns and modes according to the utterance coding features of multiple teacher-student exchanges and the semantic features of the utterance texts; its recommended items are teacher-student dialogue fragments, that is, several consecutive single-party utterances between teacher and students. The third recommendation result takes a single utterance turn in the teacher-student dialogue as the recommendation target and recommends similar utterances according to the coding features and semantic features of the utterance. The fourth recommendation result takes the words within a single utterance turn of the teacher-student dialogue as the recommendation target and recommends similar words according to the words themselves and the semantic features of their context.
In an embodiment of the present invention, after generating the multi-granularity recommendation report in step S106, the method further includes: a multi-granularity recommendation report is shown.
According to the multi-granularity recommendation method described in steps S101-S106, the invention constructs the dialogue recommendation data set comprising the course recommendation data set, the dialogue sequence recommendation data set, the single-party speech recommendation data set and the word recommendation data set according to the corpus, generates the first recommendation result according to the course recommendation data set, generates the second recommendation result according to the first recommendation result and the dialogue sequence recommendation data set, generates the third recommendation result according to the second recommendation result and the single-party speech recommendation data set, and generates the fourth recommendation result according to the third recommendation result, the word recommendation data set and the word list set of the plurality of pre-trained word vector models, thereby improving the effectiveness, pertinence and richness of the classroom dialogue recommendation result.
FIG. 2 is a schematic diagram of a multi-granularity recommendation system according to one embodiment of the present application.
The system specifically comprises:
a first module 201, configured to construct a dialogue recommendation data set according to a corpus, where the dialogue recommendation data set includes a course recommendation data set, a dialogue sequence recommendation data set, a unilateral utterance recommendation data set, and a word recommendation data set;
A second module 202, configured to generate a first recommendation result according to the course recommendation data set, where the first recommendation result is a recommendation result of a similar course;
a third module 203, configured to generate a second recommendation result according to the first recommendation result and the dialog sequence recommendation data set, where the second recommendation result is a recommendation result of a similar dialog sequence;
a fourth module 204, configured to generate a third recommendation result according to the second recommendation result and the single-party speech recommendation data set, where the third recommendation result is a recommendation result similar to the single-party speech;
a fifth module 205, configured to generate a fourth recommendation result according to the third recommendation result, the word recommendation data set, and a vocabulary set of a plurality of pre-trained word vector models, where the fourth recommendation result is a recommendation result of a similar word;
a sixth module 206, configured to generate a multi-granularity recommendation report according to the first recommendation result, the second recommendation result, the third recommendation result, and the fourth recommendation result.
It can be seen that the content in the above method embodiment is applicable to the system embodiment, and the functions specifically implemented by the system embodiment are the same as those of the method embodiment, and the beneficial effects achieved by the method embodiment are the same as those achieved by the method embodiment.
Referring to fig. 3, an embodiment of the present application provides a multi-granularity recommendation apparatus, including:
at least one processor 301;
at least one memory 302 for storing at least one program;
the at least one program, when executed by the at least one processor 301, causes the at least one processor 301 to implement a multi-granular recommendation method as described in steps S101-S106.
Similarly, the content in the above method embodiment is applicable to the embodiment of the present device, and the functions specifically implemented by the embodiment of the present device are the same as those of the embodiment of the above method, and the beneficial effects achieved by the embodiment of the above method are the same as those achieved by the embodiment of the above method.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of this application are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the present application is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the functions and/or features may be integrated in a single physical device and/or software module or one or more of the functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present application. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Thus, those of ordinary skill in the art will be able to implement the present application as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the application, which is to be defined by the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, including several programs for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., a ordered listing of executable programs for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with a program execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the programs from the program execution system, apparatus, or device and execute the programs. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the program execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, as the program can be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable program execution system. If implemented in hardware, as in another embodiment, they may be implemented using any one of, or a combination of, the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the foregoing description of the present specification, descriptions of the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like, are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the principles and spirit of the application, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the embodiments described above, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present invention, and these equivalent modifications and substitutions are intended to be included in the scope of the present invention as defined in the appended claims.

Claims (6)

1. A multi-granularity recommendation method, comprising the steps of:
constructing a dialogue recommendation data set according to a corpus, wherein the dialogue recommendation data set comprises a course recommendation data set, a dialogue sequence recommendation data set, a single-party utterance recommendation data set and a word recommendation data set;
generating a first recommendation result according to the course recommendation data set, wherein the first recommendation result is a recommendation result of a similar course;
generating a second recommendation result according to the first recommendation result and the dialogue sequence recommendation data set, wherein the second recommendation result is a recommendation result of a similar dialogue sequence;
generating a third recommendation result according to the second recommendation result and the single-party utterance recommendation data set, wherein the third recommendation result is a recommendation result of a similar single-party utterance;
generating a fourth recommendation result according to the third recommendation result, the word recommendation data set and a vocabulary set of a plurality of pre-trained word vector models, wherein the fourth recommendation result is a recommendation result of similar words;
generating a multi-granularity recommendation report according to the first recommendation result, the second recommendation result, the third recommendation result and the fourth recommendation result;
the generating a first recommendation result according to the course recommendation data set comprises the following steps:
acquiring a first recommendation list from the course recommendation data set according to a first preset rule, wherein the first recommendation list is a recommendation list of candidate courses;
generating a first feature vector according to the first recommendation list, wherein the first feature vector comprises a first course meta information feature vector and a first keyword semantic feature vector;
generating a first feature sequence according to the first feature vector, wherein the first feature sequence is the feature sequence of the first recommendation list;
acquiring a second course meta information feature vector and a second keyword semantic feature vector according to the first feature sequence, wherein the second course meta information feature vector is the course meta information feature vector of a source course in the first recommendation list, the second keyword semantic feature vector is the keyword semantic feature vector of the source course, and the source course is used for representing a course selected by a user in the first recommendation list;
calculating a first similarity according to the second course meta information feature vector, wherein the first similarity is cosine similarity between the second course meta information feature vector and the course meta information feature vector of any course in a first target course set, and the first target course set is a set formed by other courses except the source course in the first recommendation list;
calculating second similarity according to the second keyword semantic feature vector, wherein the second similarity is cosine similarity between the second keyword semantic feature vector and the keyword semantic feature vector of any course in the first target course set;
calculating a first comprehensive similarity according to the first similarity and the second similarity;
acquiring courses with the first comprehensive similarity greater than a first similarity threshold value from the first target course set as a second target course set;
sorting the courses in the second target course set according to the first comprehensive similarity, and generating the first recommendation result;
the generating a second recommendation result according to the first recommendation result and the dialogue sequence recommendation data set includes:
obtaining a second recommendation list from the dialogue sequence recommendation data set according to the first recommendation result and a second preset rule, wherein the second preset rule comprises a starting tag set, an intermediate tag set and an ending tag set, the second recommendation list is a recommendation list of candidate dialogue sequences, the starting tag set is used for representing a set formed by tags appearing in a head element of the dialogue sequences, the intermediate tag set is used for representing a set formed by tags appearing in a middle element of the dialogue sequences, and the ending tag set is used for representing a set formed by tags appearing in a tail element of the dialogue sequences;
extracting a second feature vector by adopting a pre-trained first language model according to the second recommendation list, wherein the second feature vector comprises a high-order semantic feature vector of a dialogue sequence;
generating a second feature sequence according to the second feature vector, wherein the second feature sequence is the feature sequence of the second recommendation list;
acquiring a first high-order semantic feature vector according to the second feature sequence, wherein the first high-order semantic feature vector is a high-order semantic feature vector of a source dialogue sequence in the second recommendation list, and the source dialogue sequence is used for representing a dialogue sequence selected by a user in the second recommendation list;
calculating third similarity according to the first high-order semantic feature vector, wherein the third similarity is cosine similarity of the first high-order semantic feature vector and a second high-order semantic feature vector, the second high-order semantic feature vector is a high-order semantic feature vector of any dialogue sequence in a first target dialogue sequence set, and the first target dialogue sequence set is a set formed by other dialogue sequences except the source dialogue sequence in the second recommendation list;
obtaining a dialogue sequence with the third similarity larger than a second similarity threshold value from the first target dialogue sequence set as a second target dialogue sequence set;
sorting the dialogue sequences in the second target dialogue sequence set according to the third similarity, and generating the second recommendation result;
the generating a third recommendation result according to the second recommendation result and the single-party utterance recommendation data set includes:
acquiring a third recommendation list from the single-party utterance recommendation data set according to the second recommendation result and a third preset rule, wherein the third preset rule comprises a target tag set, and the third recommendation list is a recommendation list of candidate single-party utterances;
extracting a third feature vector by adopting a pre-trained second language model according to the third recommendation list, wherein the third feature vector comprises high-order semantic feature vectors of single-party utterances;
generating a third feature sequence according to the third feature vector, wherein the third feature sequence is the feature sequence of the third recommendation list;
acquiring a third high-order semantic feature vector according to the third feature sequence, wherein the third high-order semantic feature vector is a high-order semantic feature vector of a source single-party utterance in the third recommendation list, and the source single-party utterance is used for representing the single-party utterance selected by the user in the third recommendation list;
calculating a fourth similarity according to the third high-order semantic feature vector, wherein the fourth similarity is cosine similarity of the third high-order semantic feature vector and a fourth high-order semantic feature vector, the fourth high-order semantic feature vector is a high-order semantic feature vector of any single-party utterance in a first target single-party utterance set, and the first target single-party utterance set is a set formed by other single-party utterances except the source single-party utterance in the third recommendation list;
acquiring the single-party utterances with the fourth similarity greater than a third similarity threshold value from the first target single-party utterance set as a second target single-party utterance set;
sorting the single-party utterances in the second target single-party utterance set according to the fourth similarity, and generating the third recommendation result;
the generating a fourth recommendation result according to the third recommendation result, the word recommendation data set and the vocabulary set of the plurality of pre-trained word vector models includes:
acquiring words which are simultaneously present in the third recommendation result and the word recommendation data set according to a fourth preset rule, and generating a candidate word recommendation list;
obtaining a target part-of-speech set and a target word length interval according to the word vector models and the vocabulary set;
acquiring word lists conforming to the target part-of-speech set and the target word length interval from the vocabulary set as the candidate word sets of the word vector models;
combining the candidate word sets of each word vector model and the candidate word recommendation list to generate a fourth recommendation list;
performing semantic vector characterization on source words in the fourth recommendation list by adopting each word vector model to generate a fourth feature sequence, wherein the source words are used for characterizing words selected by a user in the fourth recommendation list;
carrying out semantic vector characterization on any word in a first target word set by adopting each word vector model to generate a fifth feature sequence, wherein the first target word set is a set formed by other words except the source word in the fourth recommendation list;
calculating fifth similarity according to the fourth feature sequence and the fifth feature sequence, wherein the fifth similarity is the cosine similarity between the source word and any word in the first target word set in the semantic vector space of each word vector model;
calculating a second comprehensive similarity according to the fifth similarity;
acquiring words with the second comprehensive similarity greater than a fourth similarity threshold value from the first target word set as a second target word set;
and sorting the words in the second target word set according to the second comprehensive similarity, and generating the fourth recommendation result.
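The similarity-ranking procedure that claim 1 applies at the course level (and, with a single high-order semantic feature vector per item, at the dialogue-sequence and single-party-utterance levels) can be summarised as: embed the source item and every other candidate, compute cosine similarities, combine them into a comprehensive score, keep the candidates above a threshold, and sort. The following is a minimal Python sketch of that pipeline; the equal weighting of the two course similarities, the 0.5 threshold, and all function and field names are illustrative assumptions, since the claim fixes none of them.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_courses(source, candidates, threshold=0.5, meta_weight=0.5):
    """Rank candidate courses against the source course selected by the user.

    `source` and every candidate are assumed to be dicts holding an 'id',
    a 'meta_vec' (course meta information feature vector) and a 'kw_vec'
    (keyword semantic feature vector); the field names, the equal weighting
    and the 0.5 threshold are illustrative choices, not taken from the claim.
    """
    kept = []
    for cand in candidates:
        sim_meta = cosine(source["meta_vec"], cand["meta_vec"])   # first similarity
        sim_kw = cosine(source["kw_vec"], cand["kw_vec"])         # second similarity
        combined = meta_weight * sim_meta + (1.0 - meta_weight) * sim_kw  # first comprehensive similarity
        if combined > threshold:  # keep only courses above the first similarity threshold
            kept.append((cand["id"], combined))
    # sort by comprehensive similarity in descending order -> first recommendation result
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

For the dialogue-sequence and single-party-utterance steps, the candidate lists are first narrowed by the tag-set rules of the second and third preset rules, after which a single cosine similarity per candidate feeds the same threshold-and-sort step.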
2. The multi-granularity recommendation method according to claim 1, wherein the corpus comprises classroom corpus metadata and classroom corpus content data, the classroom corpus metadata is used for describing basic information of a classroom, the classroom corpus content data is used for describing specific information of the classroom, and the classroom corpus content data comprises course keywords and dialogue sequences;
the construction of the dialogue recommendation data set according to the corpus comprises the following steps:
constructing the course recommendation data set according to the classroom corpus metadata and the course keywords;
constructing the dialogue sequence recommendation data set and the single-party utterance recommendation data set according to the dialogue sequence;
and carrying out text preprocessing on the utterance content in the single-party utterance recommendation data set to obtain the word recommendation data set.
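Claim 2 derives the word recommendation data set by text preprocessing of the single-party utterance content, but does not name the preprocessing steps or tools. The sketch below is therefore only one plausible reading: punctuation stripping, Chinese word segmentation (here with the jieba library, an assumption), stop-word removal, and de-duplication.

```python
import re
import jieba  # Chinese word segmentation; the claim does not name a specific tool

STOP_WORDS = {"的", "了", "是", "这个", "我们"}  # illustrative stop-word list only

def build_word_dataset(utterances):
    """Derive a word recommendation data set from single-party utterance text.

    `utterances` is assumed to be an iterable of utterance strings taken from
    the single-party utterance recommendation data set; punctuation stripping,
    stop-word removal and de-duplication are one plausible reading of the
    'text preprocessing' step, not the patent's exact recipe.
    """
    words = set()
    for text in utterances:
        text = re.sub(r"\W+", " ", text)  # drop punctuation, keep word characters (incl. CJK)
        for token in jieba.lcut(text):    # segment the cleaned utterance into words
            token = token.strip()
            if token and token not in STOP_WORDS:
                words.add(token)
    return sorted(words)
```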
3. The multi-granularity recommendation method according to claim 1, wherein after the generating a multi-granularity recommendation report according to the first recommendation result, the second recommendation result, the third recommendation result and the fourth recommendation result, the multi-granularity recommendation method further comprises:
the multi-granularity recommendation report is displayed.
4. A multi-granularity recommendation system, comprising:
the first module is used for constructing a dialogue recommendation data set according to a corpus, wherein the dialogue recommendation data set comprises a course recommendation data set, a dialogue sequence recommendation data set, a single-party utterance recommendation data set and a word recommendation data set;
the second module is used for generating a first recommendation result according to the course recommendation data set, wherein the first recommendation result is a recommendation result of a similar course;
a third module, configured to generate a second recommendation result according to the first recommendation result and the dialog sequence recommendation data set, where the second recommendation result is a recommendation result of a similar dialog sequence;
a fourth module, configured to generate a third recommendation result according to the second recommendation result and the single-party utterance recommendation data set, where the third recommendation result is a recommendation result of a similar single-party utterance;
a fifth module, configured to generate a fourth recommendation result according to the third recommendation result, the word recommendation data set, and a vocabulary set of a plurality of pre-trained word vector models, where the fourth recommendation result is a recommendation result of a similar word;
a sixth module, configured to generate a multi-granularity recommendation report according to the first recommendation result, the second recommendation result, the third recommendation result, and the fourth recommendation result;
the generating a first recommendation result according to the course recommendation data set comprises the following steps:
acquiring a first recommendation list from the course recommendation data set according to a first preset rule, wherein the first recommendation list is a recommendation list of candidate courses;
generating a first feature vector according to the first recommendation list, wherein the first feature vector comprises a first course meta information feature vector and a first keyword semantic feature vector;
generating a first feature sequence according to the first feature vector, wherein the first feature sequence is the feature sequence of the first recommendation list;
acquiring a second course meta information feature vector and a second keyword semantic feature vector according to the first feature sequence, wherein the second course meta information feature vector is the course meta information feature vector of a source course in the first recommendation list, the second keyword semantic feature vector is the keyword semantic feature vector of the source course, and the source course is used for representing a course selected by a user in the first recommendation list;
calculating a first similarity according to the second course meta information feature vector, wherein the first similarity is cosine similarity between the second course meta information feature vector and the course meta information feature vector of any course in a first target course set, and the first target course set is a set formed by other courses except the source course in the first recommendation list;
calculating second similarity according to the second keyword semantic feature vector, wherein the second similarity is cosine similarity between the second keyword semantic feature vector and the keyword semantic feature vector of any course in the first target course set;
calculating a first comprehensive similarity according to the first similarity and the second similarity;
acquiring courses with the first comprehensive similarity greater than a first similarity threshold value from the first target course set as a second target course set;
sorting the courses in the second target course set according to the first comprehensive similarity, and generating the first recommendation result;
the generating a second recommendation result according to the first recommendation result and the dialogue sequence recommendation data set includes:
obtaining a second recommendation list from the dialogue sequence recommendation data set according to the first recommendation result and a second preset rule, wherein the second preset rule comprises a starting tag set, an intermediate tag set and an ending tag set, the second recommendation list is a recommendation list of candidate dialogue sequences, the starting tag set is used for representing a set formed by tags appearing in a head element of the dialogue sequences, the intermediate tag set is used for representing a set formed by tags appearing in a middle element of the dialogue sequences, and the ending tag set is used for representing a set formed by tags appearing in a tail element of the dialogue sequences;
extracting a second feature vector by adopting a pre-trained first language model according to the second recommendation list, wherein the second feature vector comprises a high-order semantic feature vector of a dialogue sequence;
generating a second feature sequence according to the second feature vector, wherein the second feature sequence is the feature sequence of the second recommendation list;
acquiring a first high-order semantic feature vector according to the second feature sequence, wherein the first high-order semantic feature vector is a high-order semantic feature vector of a source dialogue sequence in the second recommendation list, and the source dialogue sequence is used for representing a dialogue sequence selected by a user in the second recommendation list;
calculating third similarity according to the first high-order semantic feature vector, wherein the third similarity is cosine similarity of the first high-order semantic feature vector and a second high-order semantic feature vector, the second high-order semantic feature vector is a high-order semantic feature vector of any dialogue sequence in a first target dialogue sequence set, and the first target dialogue sequence set is a set formed by other dialogue sequences except the source dialogue sequence in the second recommendation list;
obtaining a dialogue sequence with the third similarity larger than a second similarity threshold value from the first target dialogue sequence set as a second target dialogue sequence set;
sorting the dialogue sequences in the second target dialogue sequence set according to the third similarity, and generating the second recommendation result;
the generating a third recommendation result according to the second recommendation result and the single-party utterance recommendation data set includes:
acquiring a third recommendation list from the single-party utterance recommendation data set according to the second recommendation result and a third preset rule, wherein the third preset rule comprises a target tag set, and the third recommendation list is a recommendation list of candidate single-party utterances;
extracting a third feature vector by adopting a pre-trained second language model according to the third recommendation list, wherein the third feature vector comprises high-order semantic feature vectors of single-party utterances;
generating a third feature sequence according to the third feature vector, wherein the third feature sequence is the feature sequence of the third recommendation list;
acquiring a third high-order semantic feature vector according to the third feature sequence, wherein the third high-order semantic feature vector is a high-order semantic feature vector of a source single-party utterance in the third recommendation list, and the source single-party utterance is used for representing the single-party utterance selected by the user in the third recommendation list;
calculating a fourth similarity according to the third high-order semantic feature vector, wherein the fourth similarity is cosine similarity of the third high-order semantic feature vector and a fourth high-order semantic feature vector, the fourth high-order semantic feature vector is a high-order semantic feature vector of any single-party utterance in a first target single-party utterance set, and the first target single-party utterance set is a set formed by other single-party utterances except the source single-party utterance in the third recommendation list;
acquiring the single-party utterances with the fourth similarity greater than a third similarity threshold value from the first target single-party utterance set as a second target single-party utterance set;
sorting the single-party utterances in the second target single-party utterance set according to the fourth similarity, and generating the third recommendation result;
the generating a fourth recommendation result according to the third recommendation result, the word recommendation data set and the vocabulary set of the plurality of pre-trained word vector models includes:
acquiring words which are simultaneously present in the third recommendation result and the word recommendation data set according to a fourth preset rule, and generating a candidate word recommendation list;
obtaining a target part-of-speech set and a target word length interval according to the word vector models and the vocabulary set;
acquiring word lists conforming to the target part-of-speech set and the target word length interval from the vocabulary set as the candidate word sets of the word vector models;
combining the candidate word sets of each word vector model and the candidate word recommendation list to generate a fourth recommendation list;
performing semantic vector characterization on source words in the fourth recommendation list by adopting each word vector model to generate a fourth feature sequence, wherein the source words are used for characterizing words selected by a user in the fourth recommendation list;
carrying out semantic vector characterization on any word in a first target word set by adopting each word vector model to generate a fifth feature sequence, wherein the first target word set is a set formed by other words except the source word in the fourth recommendation list;
calculating fifth similarity according to the fourth feature sequence and the fifth feature sequence, wherein the fifth similarity is the cosine similarity between the source word and any word in the first target word set in the semantic vector space of each word vector model;
calculating a second comprehensive similarity according to the fifth similarity;
acquiring words with the second comprehensive similarity greater than a fourth similarity threshold value from the first target word set as a second target word set;
and sorting the words in the second target word set according to the second comprehensive similarity, and generating the fourth recommendation result.
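The word-level step shared by claims 1 and 4 computes a fifth similarity in the semantic vector space of each pre-trained word vector model and merges the per-model values into a second comprehensive similarity. A minimal sketch of that merge follows; representing each model as a plain word-to-vector dictionary, averaging the per-model similarities with equal weight, and the 0.6 threshold are all assumptions for illustration, since the claims leave these choices open.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two word vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_similar_words(source_word, candidate_words, models, threshold=0.6):
    """Rank candidate words by a comprehensive similarity over several word vector models.

    `models` is assumed to be a list of dicts, one per pre-trained word vector
    model, each mapping a word to its embedding (a numpy array). For every
    candidate, the cosine similarity to the source word is computed in each
    model's semantic vector space (the fifth similarity) and the per-model
    values are averaged into the second comprehensive similarity; the equal
    average and the 0.6 threshold are illustrative, not fixed by the claims.
    """
    ranked = []
    for cand in candidate_words:
        sims = [cosine(m[source_word], m[cand])
                for m in models
                if source_word in m and cand in m]  # skip models missing either word
        if not sims:
            continue
        combined = sum(sims) / len(sims)            # second comprehensive similarity
        if combined > threshold:
            ranked.append((cand, combined))
    # sort by comprehensive similarity in descending order -> fourth recommendation result
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```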
5. A multi-granularity recommendation device, comprising:
at least one processor;
at least one memory for storing at least one program;
when said at least one program is executed by said at least one processor, said at least one processor is caused to implement the multi-granularity recommendation method as claimed in any one of claims 1 to 3.
6. A storage medium having stored therein a processor-executable program which, when executed by a processor, is for implementing the multi-granularity recommendation method as claimed in any one of claims 1 to 3.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211011337.3A CN115344787B (en) 2022-08-23 2022-08-23 Multi-granularity recommendation method, system, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211011337.3A CN115344787B (en) 2022-08-23 2022-08-23 Multi-granularity recommendation method, system, device and storage medium

Publications (2)

Publication Number Publication Date
CN115344787A CN115344787A (en) 2022-11-15
CN115344787B true CN115344787B (en) 2023-07-04

Family

ID=83954033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211011337.3A Active CN115344787B (en) 2022-08-23 2022-08-23 Multi-granularity recommendation method, system, device and storage medium

Country Status (1)

Country Link
CN (1) CN115344787B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112989112A (en) * 2021-04-27 2021-06-18 北京世纪好未来教育科技有限公司 Online classroom content acquisition method and device
WO2021134524A1 (en) * 2019-12-31 2021-07-08 深圳市欢太科技有限公司 Data processing method, apparatus, electronic device, and storage medium
US11410653B1 (en) * 2020-09-25 2022-08-09 Amazon Technologies, Inc. Generating content recommendation based on user-device dialogue

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108829822B (en) * 2018-06-12 2023-10-27 腾讯科技(深圳)有限公司 Media content recommendation method and device, storage medium and electronic device
CN110717340B (en) * 2019-09-29 2023-11-21 百度在线网络技术(北京)有限公司 Recommendation method, recommendation device, electronic equipment and storage medium
CN111522937B (en) * 2020-05-15 2023-04-28 支付宝(杭州)信息技术有限公司 Speaking recommendation method and device and electronic equipment
CN112434151A (en) * 2020-11-26 2021-03-02 重庆知识产权大数据研究院有限公司 Patent recommendation method and device, computer equipment and storage medium
CN112632385B (en) * 2020-12-29 2023-09-22 中国平安人寿保险股份有限公司 Course recommendation method, course recommendation device, computer equipment and medium
CN112732892B (en) * 2020-12-30 2022-09-20 平安科技(深圳)有限公司 Course recommendation method, device, equipment and storage medium
CN114896498A (en) * 2022-05-16 2022-08-12 咪咕文化科技有限公司 Course recommendation method, system, terminal and storage medium

Also Published As

Publication number Publication date
CN115344787A (en) 2022-11-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant