WO2016179938A1 - Method and device for question recommendation - Google Patents

Method and device for question recommendation

Publication number
WO2016179938A1
Authority
WO
WIPO (PCT)
Prior art keywords
topic
user
information
search
attribute information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2015/090002
Other languages
French (fr)
Chinese (zh)
Inventor
侯建彬
陈恭明
杨帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd
Publication of WO2016179938A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Definitions

  • the present invention relates to the field of Internet technologies, and in particular, to a topic recommendation method and a topic recommendation device.
  • the relationship between the topics is established, and the user is presented with a list, and the list is sorted according to the strength of the association.
  • the present invention aims to solve at least one of the technical problems in the related art to some extent.
  • an object of the present invention is to provide a method for recommending a topic, which can improve the correlation between a recommended topic and a search topic, and improve the recommendation effect.
  • Another object of the present invention is to provide a topic recommendation device.
  • the method for recommending a topic according to the first aspect of the present invention includes: receiving a search topic; acquiring topic attribute information of the search topic, and obtaining a preliminary search result according to the topic attribute information; acquiring user description information of the user, and sorting the preliminary search results according to the user description information to obtain a sorted result; and selecting a preset number of results from the sorted result and determining them as the recommended topic.
  • the topic recommendation method proposed by the first aspect of the present invention acquires the topic attribute information and obtains the preliminary search result according to the topic attribute information. Because the topic attribute information is referenced in addition to text similarity, the correlation between the recommended topic and the search topic can be improved. In addition, by obtaining the user description information and sorting the preliminary search results according to the user description information, the user information can be referred to in the recommendation, which improves the relevance to the user and improves the recommendation effect.
  • a topic recommendation device includes: a receiving module, configured to receive a search topic; an obtaining module, configured to acquire topic attribute information of the search topic and obtain a preliminary search result according to the topic attribute information; a sorting module, configured to obtain user description information of the user and sort the preliminary search results according to the user description information to obtain a sorted result; and a determining module, configured to select a preset number of results from the sorted result and determine them as the recommended topic.
  • the topic recommendation device acquires the topic attribute information and obtains the preliminary search result according to the topic attribute information. Because the topic attribute information is referenced in addition to text similarity, the correlation between the recommended topic and the search topic can be improved. In addition, by obtaining the user description information and sorting the preliminary search results according to the user description information, the user information can be referred to in the recommendation, which improves the relevance to the user and improves the recommendation effect.
  • An embodiment of the present invention further provides an electronic device, including: one or more processors; a memory; and one or more programs stored in the memory, which, when executed by the one or more processors, perform the method according to any embodiment of the first aspect of the present invention.
  • Embodiments of the present invention also provide a non-volatile computer storage medium storing one or more modules which, when executed, perform the method according to the first aspect of the present invention.
  • FIG. 1 is a schematic flow chart of a method for recommending a topic according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of acquiring topic attribute information in an embodiment of the present invention
  • FIG. 3 is a schematic diagram of an online implementation process in an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of obtaining user description information in an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of an offline implementation process in an embodiment of the present invention.
  • FIG. 7 is a schematic flowchart of establishing a topic structured information base in an embodiment of the present invention.
  • FIG. 8 is a schematic flowchart of establishing a user model in an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a title recommendation apparatus according to another embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a title recommendation apparatus according to another embodiment of the present invention.
  • FIG. 1 is a schematic flowchart of a method for recommending a topic according to an embodiment of the present invention, where the method includes:
  • a search question can be entered in the search box.
  • the embodiment can be executed by the server.
  • the search question can be sent to the server, and the server receives the search question sent by the browser. Alternatively,
  • the embodiment may also be implemented by a web product having a search function, where the web product includes a front end that interacts with the user and a background processing part.
  • the search question input by the user may be received by the front end of the web product, for example through the search box.
  • S12 Obtain topic attribute information of the search topic, and obtain a preliminary search result according to the topic attribute information.
  • the acquiring the topic attribute information of the search topic includes:
  • the image may be identified by Optical Character Recognition (OCR) to obtain the recognition result.
  • a topic that is the same as or similar to the recognition result is then searched for, and the identification information (id) of that same or similar topic is determined as the identification information (id) of the current search topic.
  • S22 Acquire, in a pre-established topic structured information base, the topic attribute information corresponding to the identification information, where the topic structured information base stores the identification information of each topic in correspondence with its topic attribute information.
  • the topic attribute information corresponding to the id of the search topic may be acquired in the pre-established topic structured information base.
  • the topic structured information base stores the topic attribute information corresponding to the identification information of each topic.
  • the topic attribute information includes, for example, the topic type, the topic difficulty, the topic structure, the topic knowledge points, the answer quality, and the normalized topic description.
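  • As a minimal sketch of the id-to-attributes correspondence described above, the topic structured information base can be modeled as a keyed store. The class names, field names, and value encodings below are illustrative assumptions; the source only lists which kinds of attributes are stored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TopicAttributes:
    """Attribute record for one topic (all field names are hypothetical)."""
    topic_type: str                       # e.g. "multiple_choice"
    difficulty: int                       # difficulty level, e.g. 1 (easy) to 5 (hard)
    structure: Dict[str, str] = field(default_factory=dict)    # stem, question, options, blanks
    knowledge_points: List[str] = field(default_factory=list)  # knowledge point labels
    answer_quality: str = "general"       # "high", "general", or "low"
    normalized_description: str = ""


class TopicInfoBase:
    """In-memory stand-in for the topic structured information base."""

    def __init__(self) -> None:
        self._by_id: Dict[str, TopicAttributes] = {}

    def save(self, topic_id: str, attrs: TopicAttributes) -> None:
        # Store the attributes in correspondence with the topic id.
        self._by_id[topic_id] = attrs

    def lookup(self, topic_id: str) -> Optional[TopicAttributes]:
        # Return the attributes for an id, or None if the id is unknown.
        return self._by_id.get(topic_id)


info_base = TopicInfoBase()
info_base.save(
    "q-001",
    TopicAttributes(topic_type="multiple_choice", difficulty=3,
                    knowledge_points=["quadratic equations"]),
)
attrs = info_base.lookup("q-001")
```

  • A production system would presumably back this with a database; the dictionary only illustrates the lookup by identification information.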
  • the online system may include a topic feature acquisition module 31.
  • the topic attribute information may be obtained from the topic structured information library.
  • the online system may further include a text retrieval module 32. After the topic feature acquisition module 31 acquires the topic attribute information, the text retrieval module 32 can obtain the preliminary search result according to the topic attribute information.
  • the obtaining the preliminary search result according to the attribute information of the topic includes:
  • a general word segmentation technique can be used to segment a search term, and then a keyword can be obtained from the obtained segmentation according to a preset rule.
  • the preset rules are based on, for example, the position of the segmented word in the search topic, its degree of importance in the question bank, whether it is a subject word, and the like.
  • the keyword can be used as a query to retrieve the relevant topic in the existing database, and the text retrieval result related to the text description is obtained.
  • S42 Perform weight adjustment on the text retrieval result according to the attribute information of the topic, and obtain a search result after the weight adjustment;
  • the text retrieval result can be adjusted according to the knowledge point, type, difficulty, and answer quality of the search topic.
  • the text retrieval results may be limited to those whose knowledge points are the same as or similar to those of the search topic; text retrieval results whose type and difficulty are similar to those of the search topic are weighted, and text retrieval results with high answer quality are weighted.
  • the specific weighted values can be preset according to actual needs.
  • the knowledge points used in this embodiment may be fine-grained knowledge points. For details, refer to related descriptions in the subsequent topic knowledge point extraction.
  • if the search topic involves a single knowledge point, the text retrieval results for that single knowledge point are weighted; or, if the search topic involves mixed knowledge points, each knowledge point among the mixed knowledge points and its corresponding weight are determined, the weight of each text retrieval result is then determined according to each knowledge point and its corresponding weight, and text retrieval results close to the search topic are weighted.
  • the correlation between the recommended topic and the current search topic can be improved.
  • a preset number of search results having a larger weight is selected as the preliminary search result, and the preset number is, for example, 50.
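  • The weight adjustment in S42 and the selection of the preliminary search result can be sketched as follows. This is only an illustration under stated assumptions: the boost factors, the dictionary keys, the "difficulty within one level" test, and the use of exact label matches in place of "same or similar" knowledge points are all hypothetical, since the source says only that the weighted values are preset according to actual needs.

```python
from typing import Dict, List

# Hypothetical boost factors; the source does not give concrete values.
TYPE_BOOST = 1.2
DIFFICULTY_BOOST = 1.1
QUALITY_BOOST = 1.3


def adjust_weights(query: Dict, results: List[Dict], top_n: int = 50) -> List[Dict]:
    """Re-weight text retrieval results by topic attributes and keep the top_n."""
    adjusted = []
    for r in results:
        # Sum the query's per-knowledge-point weights that the candidate covers.
        # A single knowledge point is simply one entry with weight 1.0.
        overlap = sum(w for kp, w in query["knowledge_points"].items()
                      if kp in r["knowledge_points"])
        if overlap == 0:
            continue  # restrict results to matching knowledge points
        weight = r["text_score"] * overlap
        if r["type"] == query["type"]:
            weight *= TYPE_BOOST
        if abs(r["difficulty"] - query["difficulty"]) <= 1:
            weight *= DIFFICULTY_BOOST
        if r["answer_quality"] == "high":
            weight *= QUALITY_BOOST
        adjusted.append({**r, "weight": weight})
    # Larger weights first, then cut to the preset number (50 in the example above).
    adjusted.sort(key=lambda r: r["weight"], reverse=True)
    return adjusted[:top_n]


query = {"type": "short_answer", "difficulty": 3,
         "knowledge_points": {"momentum": 0.7, "energy": 0.3}}
results = [
    {"id": "a", "text_score": 1.0, "type": "short_answer", "difficulty": 3,
     "answer_quality": "high", "knowledge_points": {"momentum"}},
    {"id": "b", "text_score": 1.0, "type": "choice", "difficulty": 5,
     "answer_quality": "low", "knowledge_points": {"energy"}},
    {"id": "c", "text_score": 2.0, "type": "choice", "difficulty": 3,
     "answer_quality": "low", "knowledge_points": {"optics"}},
]
preliminary = adjust_weights(query, results)
```

  • In this example, result "c" is dropped because it shares no knowledge point with the search topic, even though its text score is the highest.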
  • the acquiring user description information of the user includes:
  • the user can be assigned a corresponding identification information when the user is registered.
  • the login information carries the user's own identification information, and the system can obtain the user identification information (id) in the login information.
  • S52 The user description information corresponding to the identifier information of the user is obtained in the preset user model, where the user model correspondingly saves the identifier information of the user and the user description information.
  • the online system may further include: a user feature obtaining module 33, and the user feature obtaining module 33 may obtain user description information corresponding to the user's identification information in the user model according to the user's identification information.
  • the user description information includes, for example, the user's preferred difficulty, the user's preferred topic type, the user's textbook version, and the user's browsing, click, and collection status for topics.
  • the online system further includes an advanced sorting module 34, which is configured to sort the preliminary search results according to the user description information.
  • the advanced sorting process may include: weighting preliminary search results that are consistent with the user description information, and sorting the preliminary search results according to the weighted weights.
  • weighting is performed
  • weighting is performed
  • if the source of the topic is consistent with the version of the user's textbook, weighting is performed.
  • if the grade information of the topic is consistent with the user's current grade, weighting is performed.
  • based on an analysis of the user's historical behavior for the current knowledge point, an adjustment is made according to the difficulty and number of topics on the current knowledge point that the user has browsed; for example, topics slightly more difficult than the user's current difficulty level are weighted.
  • the search results are sorted in descending order of weights, for example, 50 sorted results are obtained.
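  • The advanced sorting step can be sketched as adding a bonus for each match between a candidate and the user description information, then sorting in descending order of weight. The fixed additive bonus and all key names are assumptions, and matching on preferred difficulty and preferred type is inferred from the fields listed in the user description information rather than stated explicitly.

```python
from typing import Dict, List


def rank_for_user(user: Dict, candidates: List[Dict], bonus: float = 0.5) -> List[Dict]:
    """Add a fixed bonus per user-description match, then sort by weight descending."""
    ranked = []
    for c in candidates:
        weight = c["weight"]
        if c["difficulty"] == user["preferred_difficulty"]:
            weight += bonus
        if c["type"] == user["preferred_type"]:
            weight += bonus
        if c["textbook_version"] == user["textbook_version"]:
            weight += bonus
        if c["grade"] == user["grade"]:
            weight += bonus
        ranked.append({**c, "weight": weight})
    # sorted() is stable, so candidates with equal weight keep their preliminary order.
    return sorted(ranked, key=lambda c: c["weight"], reverse=True)


user = {"preferred_difficulty": 3, "preferred_type": "choice",
        "textbook_version": "A", "grade": 8}
candidates = [
    {"id": "x", "weight": 1.0, "difficulty": 5, "type": "short_answer",
     "textbook_version": "B", "grade": 9},
    {"id": "y", "weight": 0.5, "difficulty": 3, "type": "choice",
     "textbook_version": "A", "grade": 8},
]
ranked = rank_for_user(user, candidates)
```

  • Here candidate "y" starts with the lower weight but matches the user on all four fields, so it is ranked first.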
  • the preset number may be set by the user or set by the system by default. After that, the result obtained by sorting in the previous step may be selected according to the order from the front to the back.
  • the web product can also display the recommended topic to the user.
  • the advanced ranking can show the user the recommendation results associated with the search topic.
  • the above topic structured information base and user model can be established offline.
  • the offline system may include a feature extraction distribution module 61 and a topic feature extraction module 62. The topic feature extraction module 62 may specifically include: a topic type classification module 621, a topic difficulty classification module 622, a topic structure splitting module 623, a topic knowledge point extraction module 624, an answer quality ranking module 625, and a topic description normalization module 626.
  • the establishing a topic structured information base includes:
  • a historical topic is a topic that is input when the topic structured information base is created. Because it is input before the current search topic, it is called a historical topic to distinguish it from the current search topic.
  • each newly added historical topic is referred to as a newly added topic.
  • the feature extraction distribution module distributes each newly added topic to a classification module such as a topic type classification module and a topic difficulty classification module.
  • S73 Perform topic feature extraction on the historical topic, and obtain topic attribute information corresponding to each classification module.
  • the process of extracting topic features may include:
  • the topic attribute information that can be obtained includes: the subject type and the question type of the topic.
  • a classification model built using a Support Vector Machine (SVM) may be used: n-gram-based features are used to classify the topic type, including the subject type of the topic (language, mathematics, physics, chemistry, etc.) and the question type of the topic (multiple choice, fill in the blank, short answer, etc.).
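  • To illustrate what n-gram-based features might look like before they are passed to a classifier, the sketch below counts character n-grams in a topic's text. Whether the original system uses character or word n-grams, and which n, is not stated; character bigrams are an assumption here, and the SVM itself is not shown.

```python
from collections import Counter
from typing import Dict, List


def char_ngrams(text: str, n: int = 2) -> List[str]:
    """Return all overlapping character n-grams of the lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_features(text: str, n: int = 2) -> Dict[str, int]:
    """Count each n-gram; the counts form a sparse feature vector."""
    return dict(Counter(char_ngrams(text, n)))


features = ngram_features("abab")
```

  • Each distinct n-gram becomes one dimension of the feature vector, and a classifier such as an SVM would then be trained on these vectors with the subject type or question type as the label.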
  • the obtained topic attribute information may include: the difficulty level to which each newly added topic belongs, and the number of difficulty levels may be preset.
  • a classification model constructed with a Gradient Boosting Decision Tree (GBDT) may be used to divide the difficulty value of the topic and generate a difficulty level classification.
  • the main features adopted include: the description keywords of the analysis content, the length of the answer, the ratio of the number of times the knowledge point is asked about to the number of times it is resolved, and, for the topic on a User Generated Content (UGC) platform, the number of answers obtained, the level of the respondents, and the time taken.
  • the obtained topic attribute information may include: a structure such as a stem, a question, an option, and a blank fill.
  • the topic can be classified at sentence granularity and then divided, according to the order of the sentences, into structures such as the stem (background description fragment, conditional fragment), questions, options, and blanks.
  • the obtained topic attribute information includes: knowledge points, and the knowledge points can be represented by tags.
  • the knowledge point label can be obtained by fusing the following two main processes.
  • (I) Topic keyword extraction: on the basis of topic structure splitting, keywords are extracted mainly from the conditional fragment and the question fragment. Keyword extraction is done using an SVM classification model. The features used include the part of speech, the position in the sentence, the importance in the question bank, and whether the word is a subject word. Words are classified as keywords or non-keywords. In each topic, keywords that reach the threshold can be selected, and the maximum number of selected keywords can be limited; for example, up to 5 keywords are selected.
  • the tag can be directly obtained from the keyword, for example, the keyword is determined as a tag or the like.
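  • The selection rule at the end of keyword extraction (keep words that reach a threshold, capped at a maximum count) can be sketched as below. The scores stand in for the output of the keyword classifier, and the threshold value of 0.5 is an assumption; only the cap of 5 comes from the text.

```python
from typing import Dict, List


def select_keywords(scores: Dict[str, float],
                    threshold: float = 0.5,
                    max_keywords: int = 5) -> List[str]:
    """Keep words whose score reaches the threshold, best first, up to max_keywords."""
    kept = [(word, score) for word, score in scores.items() if score >= threshold]
    kept.sort(key=lambda ws: ws[1], reverse=True)
    return [word for word, _ in kept[:max_keywords]]


scores = {"velocity": 0.9, "acceleration": 0.8, "force": 0.7, "mass": 0.65,
          "friction": 0.6, "incline": 0.55, "the": 0.1}
keywords = select_keywords(scores)
```

  • Six words pass the threshold in this example, but only the five highest-scoring ones are kept.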
  • KNN K-Nearest Neighbor
  • (III) Label fusion: the first label and the second label are merged, for example by de-duplicating the first label and the second label to form a set, where the first label is a label extracted from keywords and the second label is a label obtained by comparing similar fragments.
  • the confidence level corresponding to each label may be determined, and finally, according to the confidence level, a preset number of labels with a larger degree of confidence is selected as the knowledge point label of the topic, and the preset number is, for example, ten.
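  • Label fusion followed by confidence-based selection can be sketched as follows. How the confidence of each label is computed is not described in the source, so the sketch assumes each label already carries a confidence value and, as a further assumption, keeps the higher value when a label appears in both sets.

```python
from typing import Dict, List


def fuse_labels(first: Dict[str, float],
                second: Dict[str, float],
                top_n: int = 10) -> List[str]:
    """Merge two label sets without duplicates and keep the top_n by confidence."""
    merged = dict(first)
    for label, confidence in second.items():
        # De-duplicate: a label present in both sets keeps its higher confidence.
        merged[label] = max(confidence, merged.get(label, 0.0))
    ranked = sorted(merged.items(), key=lambda lc: lc[1], reverse=True)
    return [label for label, _ in ranked[:top_n]]


first_labels = {"newton's second law": 0.9, "friction": 0.4}   # from keyword extraction
second_labels = {"friction": 0.7, "inclined plane": 0.5}        # from similar fragment comparison
knowledge_point_labels = fuse_labels(first_labels, second_labels)
```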
  • the obtained topic attribute information includes the level of the answer quality, for example, high quality answer, general quality answer, low quality answer, and the like.
  • the SVM model can be used to rank the quality of the responses, which are divided into high quality, general quality, and low quality answers.
  • the features used in answer quality grading include: the semantic relevance between the answer and the topic, the source, the length, the formatting information, and the user's click-through behavior.
  • the topic attribute information obtained by topic description normalization includes, for example, a normalized description.
  • the descriptions of different topics are inconsistent; after the description style is normalized according to a defined format, the topic is added to the database.
  • the classification models above use SVM and GBDT as examples; other classification models may also be used, for example logistic regression, linear regression, random forest, neural networks, naive Bayes, and other algorithms with classification ability.
  • S74 Acquire identification information of the historical topic, and correspondingly save the identification information of the topic and the topic attribute information.
  • Each historical topic may be assigned unique identification information. After the topic attribute information of a historical topic is obtained, the identification information and the topic attribute information of the historical topic may be saved in correspondence in the topic structured information base.
  • the prior art acquires a recommended topic according to the description text of the current topic. Because the structure of the topic is not analyzed in enough depth, there is no effective way to distinguish the topic type, the conditions, the semantic scene of the question content, and so on, which results in low search relevance.
  • different topic attribute information, such as the topic type, the topic difficulty, the topic structure, and the topic knowledge points, is saved in structured form, so that the attribute information of each topic has hierarchical relationships and interconnections, which improves search relevance.
  • the offline system further includes: a user modeling module 63, configured to establish a user model according to the user behavior log, the user attribute information, and the topic structured information library.
  • the establishing a user model includes:
  • the user behavior log can record the user's browsing, clicking, and collecting questions in the question bank.
  • User attribute information refers to some meta attributes about the user, such as gender, type (parent/student/teacher), region, grade, school, etc.
  • S82 Perform user modeling according to the topic structured information base, and the user behavior log and user attribute information, to obtain a user model.
  • the user can browse, click, or collect topics, and the topic attribute information of the corresponding topics can be obtained from the topic structured information base, so that user preference information can be obtained; for example, the subject, knowledge point information, topic information, knowledge point difficulty level, topic type information, and the like that the user is concerned with can be learned. This information can also be recorded in the user model.
  • the user identification information may be allocated when the user registers, and the user identification information and the user preference information and the user attribute information may be saved in the user model.
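  • User modeling in S82 can be sketched as joining the behavior log with the topic attribute information and summarizing the result per user. Taking the most frequent type and difficulty as the "preference" is an assumption made for illustration, as are the log format and key names; the source does not specify how preferences are derived.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def build_user_model(behavior_log: List[Tuple[str, str, str]],
                     user_attrs: Dict[str, Dict],
                     topic_attrs: Dict[str, Dict]) -> Dict[str, Dict]:
    """Map each user id to preference information plus the user's meta attributes."""
    counts = defaultdict(lambda: {"type": Counter(), "difficulty": Counter()})
    for user_id, topic_id, _action in behavior_log:   # action: browse, click, collect
        attrs = topic_attrs.get(topic_id)
        if attrs is None:
            continue  # topic not present in the structured information base
        counts[user_id]["type"][attrs["type"]] += 1
        counts[user_id]["difficulty"][attrs["difficulty"]] += 1

    model = {}
    for user_id, c in counts.items():
        model[user_id] = {
            "preferred_type": c["type"].most_common(1)[0][0],
            "preferred_difficulty": c["difficulty"].most_common(1)[0][0],
            **user_attrs.get(user_id, {}),            # e.g. grade, textbook version
        }
    return model


log = [("u1", "q1", "browse"), ("u1", "q2", "click"), ("u1", "q3", "collect")]
topics = {"q1": {"type": "choice", "difficulty": 2},
          "q2": {"type": "choice", "difficulty": 3},
          "q3": {"type": "short_answer", "difficulty": 3}}
users = {"u1": {"grade": 8, "textbook_version": "A"}}
user_model = build_user_model(log, users, topics)
```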
  • the correlation between the recommended topic and the search topic can be improved;
  • by acquiring the user description information and sorting the preliminary search results according to the user description information, the user information can be referred to in the recommendation, thereby improving the relevance to the user and improving the recommendation effect.
  • the fine-grained processing is performed on the knowledge point extraction, and the difference between the single knowledge point and the mixed knowledge point is considered, and the correlation between the recommended related topic and the currently retrieved topic is effectively improved.
  • This embodiment introduces more topic attribute information, such as difficulty, type, etc., which cannot be processed from the perspective of text similarity. The introduction of these attribute information improves the correlation between the recommended topic and the search topic.
  • personalized recommendation for the user can be supported, so that different users see different related topics, thereby improving the user experience.
  • FIG. 9 is a schematic structural diagram of a title recommendation apparatus according to another embodiment of the present invention.
  • the apparatus 90 includes a receiving module 91, an obtaining module 92, a sorting module 93, and a determining module 94.
  • the receiving module 91 is configured to receive a search question
  • a search question can be entered in the search box.
  • the embodiment may be a server.
  • the receiving module at this time is configured to receive a search question sent by the browser, and the browser may obtain the search question input by the user from the search box. or,
  • the embodiment may also be a web product device with a search function, and the receiving module at this time is configured to receive a search question input by the user.
  • the obtaining module 92 is configured to acquire the topic attribute information of the search topic, and obtain a preliminary search result according to the topic attribute information;
  • the obtaining module 92 is configured to obtain the topic attribute information of the search topic, including:
  • the topic attribute information corresponding to the identification information is obtained, where the topic structured information base stores the identification information of each topic in correspondence with its topic attribute information.
  • the image may be identified by Optical Character Recognition (OCR) to obtain the recognition result.
  • a topic that is the same as or similar to the recognition result is then searched for, and the identification information (id) of that same or similar topic is determined as the identification information (id) of the current search topic.
  • the topic attribute information corresponding to the id of the search topic may be acquired in the pre-established topic structured information base.
  • the topic structured information base stores the topic attribute information corresponding to the identification information of each topic.
  • the topic attribute information includes, for example, the topic type, the topic difficulty, the topic structure, the topic knowledge points, the answer quality, and the normalized topic description.
  • the apparatus 90 further includes: a first establishing module 95, configured to establish a topic structured information base, where the first establishing module 95 is specifically configured to:
  • the first establishing module 95 is configured to perform topic feature extraction on the historical topic, and obtain topic attribute information corresponding to each classification module, including:
  • when the classification module is a topic knowledge point extraction module, topic keyword extraction and similar segment comparison are performed; the first label is extracted according to the topic keywords, the second label is obtained according to the similar segment comparison, the first label and the second label are merged, and a preset number of labels are selected from the merged labels and determined as knowledge point labels.
  • the obtaining module 92 is configured to obtain a preliminary search result according to the topic attribute information, including:
  • the knowledge point information includes: a single knowledge point or a mixed knowledge point;
  • a general word segmentation technique can be used to segment a search term, and then a keyword can be obtained from the obtained segmentation according to a preset rule.
  • the preset rules are based on, for example, the position of the segmented word in the search topic, its degree of importance in the question bank, whether it is a subject word, and the like.
  • the keyword can be used as a query to retrieve the relevant topic in the existing database, and the text retrieval result related to the text description is obtained.
  • the text retrieval result can be adjusted according to the knowledge point, type, difficulty, and answer quality of the search topic.
  • the text retrieval results may be limited to those whose knowledge points are the same as or similar to those of the search topic; text retrieval results whose type and difficulty are similar to those of the search topic are weighted, and text retrieval results with high answer quality are weighted.
  • the specific weighted values can be preset according to actual needs.
  • the knowledge points used in this embodiment may be fine-grained knowledge points. For details, refer to related descriptions in the subsequent topic knowledge point extraction.
  • if the search topic involves a single knowledge point, the text retrieval results for that single knowledge point are weighted; or, if the search topic involves mixed knowledge points, each knowledge point among the mixed knowledge points and its corresponding weight are determined, the weight of each text retrieval result is then determined according to each knowledge point and its corresponding weight, and text retrieval results close to the search topic are weighted.
  • the correlation between the recommended topic and the current search topic can be improved.
  • a preset number of search results having a larger weight is selected as the preliminary search result, and the preset number is, for example, 50.
  • the sorting module 93 is configured to obtain user description information of the user, and sort the preliminary search results according to the user description information to obtain a sorted result;
  • the sorting module 93 is configured to obtain user description information of the user, including:
  • the user can be assigned a corresponding identification information when the user is registered.
  • the login information carries the user's own identification information, and the system can obtain the user identification information (id) in the login information.
  • the user description information includes, for example, the user's preferred difficulty, the user's preferred topic type, the user's textbook version, and the user's browsing, click, and collection status for topics.
  • the apparatus 90 further includes: a second establishing module 96, configured to establish a user model, where the second establishing module 96 is specifically configured to:
  • according to the topic structured information base, the user behavior log, and the user attribute information, user modeling is performed to obtain a user model.
  • the sorting module 93 is configured to sort the preliminary search results according to the user description information, including:
  • the preliminary search results that are consistent with the user description information are weighted, and the preliminary search results are sorted according to the weighted weights.
  • weighting is performed
  • weighting is performed
  • if the source of the topic is consistent with the version of the user's textbook, weighting is performed.
  • if the grade information of the topic is consistent with the user's current grade, weighting is performed.
  • based on an analysis of the user's historical behavior for the current knowledge point, an adjustment is made according to the difficulty and number of topics on the current knowledge point that the user has browsed; for example, topics slightly more difficult than the user's current difficulty level are weighted.
  • the search results are sorted in descending order of weights, for example, 50 sorted results are obtained.
  • the determining module 94 is configured to select a preset number of results from the sorted result and determine them as the recommended topic.
  • the preset number may be set by the user or set by the system by default. After that, the result obtained by sorting in the previous step may be selected according to the order from the front to the back.
  • the search topic is input by the user, and the device includes:
  • the display module 97 is configured to display the recommended topic to the user.
  • the user information may be referred to during the recommendation, thereby improving the relevance with the user and improving the recommendation effect.
  • knowledge points are extracted at a fine granularity, and single knowledge points are distinguished from mixed knowledge points, which effectively improves the relevance of the recommended topics to the current search topic.
  • This embodiment introduces additional topic attribute information, such as difficulty and type, which cannot be handled from the perspective of text similarity alone. Introducing this attribute information improves the relevance of the recommended topics to the search topic.
  • personalized recommendation is supported: different users can see different related topics, which improves the user experience.
  • An embodiment of the present invention further provides an electronic device, including: one or more processors; a memory; and one or more programs stored in the memory which, when executed by the one or more processors, perform: receiving a search topic; acquiring topic attribute information of the search topic, and obtaining preliminary search results according to the topic attribute information; acquiring user description information of the user, and sorting the preliminary search results according to the user description information to obtain sorted results; and selecting a preset number of results from the sorted results and determining them as the recommended topics.
  • An embodiment of the present invention also provides a non-volatile computer storage medium storing one or more modules which, when executed, perform: receiving a search topic; acquiring topic attribute information of the search topic, and obtaining preliminary search results according to the topic attribute information; acquiring user description information of the user, and sorting the preliminary search results according to the user description information to obtain sorted results; and selecting a preset number of results from the sorted results and determining them as the recommended topics.
  • portions of the invention may be implemented in hardware, software, firmware or a combination thereof.
  • multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system.
  • For example, if implemented in hardware, as in another embodiment, they can be implemented by any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and so on.
  • each functional unit in each embodiment of the present invention may be integrated into one processing module, or each unit may exist physically separately, or two or more units may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • the integrated modules, if implemented in the form of software functional modules and sold or used as stand-alone products, may also be stored in a computer readable storage medium.
  • the above mentioned storage medium may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and device for question recommendation. The question recommendation method comprises: receiving a search question (S11); acquiring question attribute information of the search question, and acquiring an initial search result according to the question attribute information (S12); acquiring user description information of a user, and sorting the initial search result according to the user description information to obtain a sorted result (S13); and selecting a preset number of questions from the sorted result, and determining the same as recommended questions (S14). The question recommendation method is able to determine the relevance between the recommended questions and the search questions, thus improving the recommendation accuracy.

Description

Question recommendation method and question recommendation device

Cross-reference to related applications

This application claims priority to Chinese patent application No. 201510246727.2, filed by Baidu Online Network Technology (Beijing) Co., Ltd. on May 14, 2015 and entitled "Question recommendation method and question recommendation device".

Technical field

The present invention relates to the field of Internet technologies, and in particular to a question recommendation method and a question recommendation device.

Background

Existing methods for recommending related questions take three main approaches. The first uses the description text of the current question to find other questions with similar descriptions and presents them to the user as a list sorted by text similarity. The second uses the knowledge point information of the current question to find other questions with the same or similar knowledge points. The third uses users' search and click behavior to find two or more questions that users frequently view in sequence, mines frequent itemsets from this behavior to establish associations between questions, and presents the user with a list sorted by association strength.

However, the questions recommended by these methods have poor relevance, and the recommendation results are unsatisfactory.

Summary of the invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art.

To this end, one object of the present invention is to provide a question recommendation method that can improve the relevance of the recommended questions to the search question and improve the recommendation effect.

Another object of the present invention is to provide a question recommendation device.

To achieve the above objects, the question recommendation method according to embodiments of the first aspect of the present invention includes: receiving a search question; acquiring question attribute information of the search question, and obtaining preliminary search results according to the question attribute information; acquiring user description information of a user, and sorting the preliminary search results according to the user description information to obtain sorted results; and selecting a preset number of results from the sorted results and determining them as recommended questions.

The question recommendation method according to embodiments of the first aspect acquires question attribute information and obtains the preliminary search results according to it. Because question attribute information is taken into account, and not merely text similarity, the relevance of the recommended questions to the search question can be improved. In addition, by acquiring user description information and sorting the preliminary search results according to it, user information is taken into account at recommendation time, which improves relevance to the user and the recommendation effect.

To achieve the above objects, the question recommendation device according to embodiments of the second aspect of the present invention includes: a receiving module, configured to receive a search question; an acquisition module, configured to acquire question attribute information of the search question and obtain preliminary search results according to the question attribute information; a sorting module, configured to acquire user description information of a user and sort the preliminary search results according to the user description information to obtain sorted results; and a determining module, configured to select a preset number of results from the sorted results and determine them as recommended questions.

The question recommendation device according to embodiments of the second aspect acquires question attribute information and obtains the preliminary search results according to it. Because question attribute information is taken into account, and not merely text similarity, the relevance of the recommended questions to the search question can be improved. In addition, by acquiring user description information and sorting the preliminary search results according to it, user information is taken into account at recommendation time, which improves relevance to the user and the recommendation effect.

An embodiment of the present invention further provides an electronic device, including: one or more processors; a memory; and one or more programs stored in the memory which, when executed by the one or more processors, perform the method according to any embodiment of the first aspect of the present invention.

An embodiment of the present invention further provides a non-volatile computer storage medium storing one or more modules which, when executed, perform the method according to any embodiment of the first aspect of the present invention. Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of the invention.

Brief description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:

FIG. 1 is a schematic flowchart of a question recommendation method according to an embodiment of the present invention;

FIG. 2 is a schematic flowchart of acquiring question attribute information in an embodiment of the present invention;

FIG. 3 is a schematic diagram of the online implementation flow in an embodiment of the present invention;

FIG. 4 is a schematic flowchart of obtaining preliminary search results in an embodiment of the present invention;

FIG. 5 is a schematic flowchart of acquiring user description information in an embodiment of the present invention;

FIG. 6 is a schematic diagram of the offline implementation flow in an embodiment of the present invention;

FIG. 7 is a schematic flowchart of building the question structured information base in an embodiment of the present invention;

FIG. 8 is a schematic flowchart of building the user model in an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of a question recommendation device according to another embodiment of the present invention;

FIG. 10 is a schematic structural diagram of a question recommendation device according to another embodiment of the present invention.

Detailed description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar modules or modules having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting it. On the contrary, the embodiments of the present invention include all changes, modifications, and equivalents falling within the spirit and scope of the appended claims.

FIG. 1 is a schematic flowchart of a question recommendation method according to an embodiment of the present invention. The method includes:

S11: Receive a search question.

When a user needs to search for a question, the user can enter the search question in a search box.

It can be understood that this embodiment may be executed by a server. In that case, after the search box in the browser receives the search question entered by the user, the browser sends the search question to the server, and the server receives it.

Alternatively, this embodiment may be implemented by a web product with a search function, which includes a front end that interacts with the user and a back-end processing part. In that case, the front end of the web product receives the search question entered by the user, for example through a search box.

S12: Acquire question attribute information of the search question, and obtain preliminary search results according to the question attribute information.

Optionally, referring to FIG. 2, acquiring the question attribute information of the search question includes:

S21: Acquire identification information of the search question.

For example, when the search question entered by the user is in the form of an image, optical character recognition (OCR) can first be performed on the image to obtain a recognition result. A pre-stored original question bank is then searched for a question that is the same as or similar to the recognition result, and the identification information (id) of the question found is used as the id of the current search question.

S22: In a pre-established question structured information base, acquire the question attribute information corresponding to the identification information, where the question structured information base stores the identification information of questions in association with their question attribute information.

After the id of the search question is obtained, the question attribute information corresponding to that id can be acquired from the pre-established question structured information base.

The question structured information base stores question attribute information in association with each question's identification information. The question attribute information includes, for example: question type, question difficulty, question structure, question knowledge points, answer quality, and the normalized question description.
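The lookup in S22 can be pictured with a minimal sketch. It assumes the question structured information base behaves like a key-value store keyed by question id. The names `QuestionAttributes`, `QUESTION_BASE`, and `get_question_attributes` are hypothetical, the field layout is an assumption based on the attribute examples above, and the question structure field is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class QuestionAttributes:
    """Attribute fields named after the examples in the text (layout is assumed)."""
    question_type: str                 # e.g. "multiple_choice"
    difficulty: int                    # one of the preset difficulty levels
    knowledge_points: List[str] = field(default_factory=list)
    answer_quality: str = "average"    # "high" / "average" / "low"
    normalized_text: str = ""          # normalized question description


# Hypothetical in-memory stand-in for the question structured information base.
QUESTION_BASE: Dict[str, QuestionAttributes] = {
    "q-001": QuestionAttributes(
        "multiple_choice", 2, ["quadratic equations"], "high",
        "solve x^2 - 5x + 6 = 0",
    ),
}


def get_question_attributes(question_id: str) -> Optional[QuestionAttributes]:
    """S22: return the attributes stored for this id, or None if the id is unknown."""
    return QUESTION_BASE.get(question_id)
```

A production system would presumably back this with a database or index rather than a dictionary; the sketch only shows the id-to-attributes association.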

Specifically, obtaining recommended questions from the search question entered by the user is done by an online system. Referring to FIG. 3, the online system may include a question feature acquisition module 31, which, after obtaining the search question, can acquire the question attribute information from the question structured information base.

Referring to FIG. 3, the online system may further include a text retrieval module 32. After the question feature acquisition module 31 has acquired the question attribute information, the text retrieval module 32 can obtain the preliminary search results according to it.

Optionally, referring to FIG. 4, obtaining the preliminary search results according to the question attribute information includes:

S41: Extract keywords from the search question and perform text retrieval with the keywords to obtain text retrieval results.

For example, a general-purpose word segmentation technique can be used to segment the search question, and keywords are then selected from the resulting tokens according to preset rules. The preset rules consider, for example, a token's position in the search question, its importance in the question bank, and whether it is a subject term.

After the keywords are obtained, they can be used as the query to retrieve related questions from an existing database, yielding text retrieval results that are related in their text descriptions.
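As an illustration of S41, the sketch below selects keywords and retrieves candidates from a toy question bank. It is not the patented implementation: word segmentation is assumed to have been done already, inverse document frequency stands in for "importance in the question bank", the position and subject-term rules are left out, and `BANK`, `idf`, `pick_keywords`, and `text_search` are hypothetical names.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical question bank: id -> already-segmented tokens.
BANK: Dict[str, List[str]] = {
    "q1": ["solve", "the", "quadratic", "equation"],
    "q2": ["find", "the", "area", "of", "the", "triangle"],
    "q3": ["factor", "the", "quadratic", "expression"],
}


def idf(token: str) -> float:
    """Inverse document frequency, an assumed proxy for importance in the bank."""
    df = sum(1 for tokens in BANK.values() if token in tokens)
    return math.log((1 + len(BANK)) / (1 + df))


def pick_keywords(tokens: List[str], k: int = 3) -> List[str]:
    """Keep the k distinct tokens with the highest importance."""
    unique = list(dict.fromkeys(tokens))          # de-duplicate, keep order
    return sorted(unique, key=idf, reverse=True)[:k]


def text_search(keywords: List[str]) -> List[Tuple[str, float]]:
    """Score each bank question by the summed importance of the keywords it contains."""
    hits = []
    for qid, tokens in BANK.items():
        score = sum(idf(kw) for kw in keywords if kw in tokens)
        if score > 0:
            hits.append((qid, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

For the tokens `["solve", "the", "quadratic", "equation"]`, the word "the" appears in every bank question, so it receives zero importance and is not selected as a keyword.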

S42: Adjust the weights of the text retrieval results according to the question attribute information to obtain weight-adjusted retrieval results.

For example, the text retrieval results can be re-weighted using the search question's knowledge points, type, difficulty, answer quality, and similar information. Specifically, the text retrieval results can be restricted to those whose knowledge points are the same as or similar to those of the search question; results whose type and difficulty are close to those of the search question are given extra weight, as are results with high answer quality. The specific weighting values can be preset according to actual needs.

Through this weight adjustment, text retrieval results with different weights are obtained.
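The weight adjustment in S42 might look like the following sketch. The boost constants are assumptions (the text only says the values are preset according to actual needs), "same or similar knowledge points" is approximated as sharing at least one knowledge point, "close difficulty" as differing by at most one level, and the `Candidate` type is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Candidate:
    qid: str
    weight: float              # text-retrieval score before adjustment
    question_type: str
    difficulty: int
    answer_quality: str        # "high" / "average" / "low"
    knowledge_points: Set[str]


# Assumed boost values, chosen only for illustration.
TYPE_BOOST, DIFFICULTY_BOOST, QUALITY_BOOST = 0.2, 0.2, 0.1


def adjust_weights(query: Candidate, results: List[Candidate]) -> List[Candidate]:
    """Keep results sharing a knowledge point with the query, then boost close matches.

    Weights are updated in place on the Candidate objects that are kept.
    """
    kept = []
    for r in results:
        if not (r.knowledge_points & query.knowledge_points):
            continue                                   # unrelated knowledge points: drop
        if r.question_type == query.question_type:
            r.weight += TYPE_BOOST
        if abs(r.difficulty - query.difficulty) <= 1:
            r.weight += DIFFICULTY_BOOST
        if r.answer_quality == "high":
            r.weight += QUALITY_BOOST
        kept.append(r)
    return kept
```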

In addition, the knowledge points used in this embodiment may be fine-grained knowledge points; for details, see the description of question knowledge point extraction below.

Using fine-grained knowledge points can improve the relevance of the recommended questions to the current search question.

S43: Acquire the knowledge point information in the question attribute information and adjust the weights of the weight-adjusted retrieval results again, where the knowledge point information includes a single knowledge point or mixed knowledge points.

For example, if the search question involves a single knowledge point, text retrieval results with that single knowledge point are given extra weight. If the search question involves mixed knowledge points, each knowledge point in the mix and its corresponding weight are determined, the weight of each text retrieval result is then determined from these knowledge points and weights, and text retrieval results close to the search question are given extra weight.

In this embodiment, distinguishing single knowledge points from mixed knowledge points can improve the relevance of the recommended questions to the current search question.
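One possible reading of S43 is sketched below. In the single-knowledge-point case, only results about exactly that knowledge point are boosted; in the mixed case, a result's extra weight is the sum of the weights of the query's knowledge points that it covers. Both rules, the constant `SINGLE_BOOST`, and the function name are assumptions; the text does not fix a formula.

```python
from typing import Dict, Set

SINGLE_BOOST = 0.3   # assumed value; the text does not give one


def knowledge_point_boost(query_points: Dict[str, float], result_points: Set[str]) -> float:
    """Return the extra weight for one retrieval result.

    query_points maps each knowledge point of the search question to its weight;
    a single-entry dict is the single-knowledge-point case.
    """
    if len(query_points) == 1:
        (point,) = query_points                      # the only knowledge point
        # Boost only results that are themselves about exactly that one point.
        return SINGLE_BOOST if result_points == {point} else 0.0
    # Mixed case: add up the weights of the query's knowledge points the result covers.
    return sum(w for point, w in query_points.items() if point in result_points)
```

With this rule, a search question weighted 0.6 on "fractions" and 0.4 on "ratios" gives more extra weight to a result that covers both knowledge points than to one that covers only "ratios".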

S44: From the retrieval results after the second weight adjustment, select a preset number of retrieval results and determine them as the preliminary search results.

For example, according to the weight information, a preset number of retrieval results with the largest weights are selected as the preliminary search results; the preset number is, for example, 50.

S13: Acquire user description information of the user, and sort the preliminary search results according to the user description information to obtain sorted results.

Optionally, referring to FIG. 5, acquiring the user description information of the user includes:

S51: Acquire identification information of the user.

Each user can be assigned identification information at registration. When the user logs in, the login information carries the user's identification information, and the system can obtain the user's identification information (id) from the login information.

S52: In a pre-established user model, acquire the user description information corresponding to the user's identification information, where the user model stores users' identification information in association with their user description information.

Referring to FIG. 3, the online system may further include a user feature acquisition module 33, which can use the user's identification information to acquire the corresponding user description information from the user model.

The user description information includes, for example: the user's preferred difficulty, preferred question type, and textbook version, and the user's browsing, click, and bookmarking history for questions.

After the user description information is obtained, the preliminary search results can be sorted. For example, referring to FIG. 3, the online system further includes an advanced sorting module 34, which sorts the preliminary search results according to the user's attribute information.

Specifically, the advanced sorting flow may include: giving extra weight to preliminary search results that are consistent with the user description information, and sorting the preliminary search results by the adjusted weights.

For example, the following preliminary search results are given extra weight:

results whose question difficulty level matches the user's preferred difficulty level;

results whose question type matches the user's preferred type;

results whose source matches the user's textbook version;

results whose grade information matches the user's current grade.

The weights are also adjusted by analyzing the user's historical behavior for the current knowledge point, that is, the difficulty and number of questions the user has browsed for it; for example, questions slightly above the user's current difficulty level are given extra weight.

After the above weighting, the search results are sorted in descending order of weight, yielding, for example, 50 sorted results.
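The advanced sorting flow above can be sketched as follows, under stated assumptions: every matching attribute adds the same constant `BOOST`, the history-based adjustment is left out, and `UserProfile`, `Result`, and `advanced_sort` are hypothetical names rather than anything defined in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserProfile:
    preferred_difficulty: int
    preferred_type: str
    textbook_version: str
    grade: int


@dataclass
class Result:
    qid: str
    weight: float      # weight carried over from the preliminary search
    difficulty: int
    question_type: str
    source: str        # textbook version the question comes from
    grade: int


BOOST = 0.1  # assumed uniform boost per matching attribute


def advanced_sort(user: UserProfile, results: List[Result]) -> List[Result]:
    """Boost results that match the user description, then sort by weight, largest first."""
    def score(r: Result) -> float:
        matches = [
            r.difficulty == user.preferred_difficulty,
            r.question_type == user.preferred_type,
            r.source == user.textbook_version,
            r.grade == user.grade,
        ]
        return r.weight + BOOST * sum(matches)
    return sorted(results, key=score, reverse=True)
```

Because the boosts are additive, a result with a lower retrieval weight can overtake a higher-weighted one when it matches more of the user's attributes.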

S14: Select a preset number of results from the sorted results and determine them as the recommended questions.

The preset number may be specified by the user or set by system default. The preset number of results is then selected, in front-to-back order, from the sorted results obtained in the previous step.

Further, when this embodiment is executed by a web product, the web product can also display the recommended questions to the user.

For example, referring to FIG. 3, after advanced sorting, recommendation results related to the search question can be displayed to the user.

The question structured information base and the user model described above may be built offline.

To build the question structured information base, referring to FIG. 6, the offline system may include a feature extraction distribution module 61 and a question feature extraction module 62. The question feature extraction module 62 may specifically include: a question type classification module 621, a question difficulty grading module 622, a question structure splitting module 623, a question knowledge point extraction module 624, an answer quality grading module 625, and a question description normalization module 626.

Referring to FIG. 7, building the question structured information base includes:

S71: Acquire historical questions.

Here, historical questions are the questions input when the question structured information base is built. Because they are input before the current search question, they are called historical questions to distinguish them from it.

Referring to FIG. 6, each newly added historical question is shown as a new question.

S72: Distribute the historical questions to the different classification modules.

For example, referring to FIG. 6, the feature extraction distribution module distributes each new question to the classification modules, such as the question type classification module and the question difficulty grading module.
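The distribution step can be pictured as a registry of extractor functions that each receive the same new question. This is only a sketch: the two extractors below are toy stand-ins for the classification modules of FIG. 6, not the SVM or GBDT models described in the text, and all names are hypothetical.

```python
from typing import Callable, Dict


def classify_type(text: str) -> str:
    """Toy stand-in for the question type classification module."""
    return "fill_in_blank" if "____" in text else "short_answer"


def grade_difficulty(text: str) -> int:
    """Toy stand-in for the question difficulty grading module."""
    return 1 if len(text) < 40 else 2


# Registry mapping an attribute name to the module that produces it.
EXTRACTORS: Dict[str, Callable[[str], object]] = {
    "question_type": classify_type,
    "difficulty": grade_difficulty,
}


def extract_attributes(text: str) -> Dict[str, object]:
    """Send one new question to every registered module and collect the outputs."""
    return {name: extractor(text) for name, extractor in EXTRACTORS.items()}
```

Keeping the modules behind a registry means a further module, such as answer quality grading, can be added without changing the distribution code.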

S73: Perform question feature extraction on the historical questions to obtain the question attribute information corresponding to each classification module.

Specifically, the question feature extraction flow may include:

(1) Question type classification

Question type classification yields the following question attribute information: the subject and the question type.

For example, a classification model built with a support vector machine (SVM) is used. Using mainly n-gram features, it classifies the question's subject (Chinese, mathematics, physics, chemistry, etc.) and its question type (multiple choice, fill-in-the-blank, short answer, etc.).
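To make the n-gram features concrete, the sketch below counts character n-grams, which would then be turned into a feature vector for a classifier such as an SVM. Whether the embodiment uses character-level or word-level n-grams is not stated, so the character-level choice, the default range of 1 to 2, and the function name `char_ngrams` are assumptions.

```python
from collections import Counter


def char_ngrams(text: str, n_min: int = 1, n_max: int = 2) -> Counter:
    """Count every character n-gram with n_min <= n <= n_max."""
    counts: Counter = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts
```

The classifier itself is omitted here; any model that accepts sparse count features could consume this output.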

(2) Question difficulty grading

Question difficulty grading yields the difficulty level to which each new question belongs; the number of difficulty levels can be preset.

Specifically, a classification model built with a gradient boosting decision tree (GBDT) can be used to grade the difficulty of the question and output a difficulty level. The main features used in the classification module include: descriptive keywords in the solution content, the length of the answer, the ratio of the number of times the knowledge point has been asked to the number of times it has been solved, the number of answers the question received on a user generated content (UGC) platform, the answerers' levels, and the time taken to answer.

(3) Question structure splitting

Question structure splitting yields structures such as the stem, the question asked, the options, and the blanks.

Specifically, sentences can first be classified individually, and segments are then delimited according to the sentence sequence, splitting the question into structures such as the stem (background description segment and condition segment), the question asked, the options, and the blanks.

(4) Question knowledge point extraction

Question knowledge point extraction yields knowledge points, which can be represented by tags.

Knowledge point tags can be obtained by fusing the results of the following two main processes.

(I) Question keyword extraction: once the question structure has been split, keywords are extracted mainly from the condition segment and the question segment. Keyword extraction is done with an SVM classification model. The features used include part of speech, position in the sentence, importance in the question bank, and whether the word is a subject term. Words are classified as keywords or non-keywords. For each question, the keywords that reach a threshold are selected, and the number of selected keywords can be capped, for example at no more than 5 keywords.

After the keywords are extracted, tags can be obtained directly from the keywords, for example by taking the keywords as tags.
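The threshold-and-cap selection in step (I) can be sketched as below. The per-word scores are assumed to come from the keyword classifier; the threshold of 0.5 and the function name `select_keywords` are illustrative assumptions, while the cap of 5 follows the example in the text.

```python
from typing import Dict, List


def select_keywords(scores: Dict[str, float], threshold: float = 0.5,
                    max_keywords: int = 5) -> List[str]:
    """Keep words whose score reaches the threshold, best first, at most max_keywords."""
    passing = [(word, s) for word, s in scores.items() if s >= threshold]
    passing.sort(key=lambda item: item[1], reverse=True)
    return [word for word, _ in passing[:max_keywords]]
```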

(II) Similar segment comparison: the important parts of the current question (condition segment, question segment, option segment) are compared for similarity against the important segments of other questions in the question bank that have already been tagged. The k-nearest-neighbor (KNN) method is then used to assign the corresponding tags to the question being processed.
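A minimal sketch of KNN-style tagging follows. It assumes that segments are represented as token sets and compared by Jaccard similarity; the text does not name a similarity measure, so that choice is an assumption, and `jaccard` and `knn_tags` are hypothetical names.

```python
from collections import Counter
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-set overlap, used here as an assumed similarity measure."""
    return len(a & b) / len(a | b) if a | b else 0.0


def knn_tags(fragment: Set[str], tagged: List[Tuple[Set[str], str]], k: int = 3) -> List[str]:
    """Return the tags of the k most similar tagged segments, most frequent first."""
    nearest = sorted(tagged, key=lambda item: jaccard(fragment, item[0]), reverse=True)[:k]
    votes = Counter(tag for _, tag in nearest)
    return [tag for tag, _ in votes.most_common()]
```

The vote counts could also serve as a rough confidence for each tag, although the text does not say how confidence is computed.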

(III) Tag fusion: the first tags and the second tags are fused, for example by de-duplicating them and combining them into one set, where the first tags are those obtained through keyword extraction and the second tags are those obtained through similar segment comparison. In addition, a confidence can be determined for each tag when the tags are determined; finally, a preset number of tags with the highest confidence are selected as the question's knowledge point tags, the preset number being, for example, 10.
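Tag fusion can be sketched as merging two tag-to-confidence maps. Keeping the higher confidence when a tag appears in both sources is an assumption, since the text only says that duplicates are removed; the function name `fuse_tags` is hypothetical, and the default cap of 10 follows the example in the text.

```python
from typing import Dict, List


def fuse_tags(first: Dict[str, float], second: Dict[str, float],
              max_tags: int = 10) -> List[str]:
    """Merge two tag->confidence maps, de-duplicate, and keep the most confident tags."""
    merged: Dict[str, float] = dict(first)
    for tag, conf in second.items():
        merged[tag] = max(conf, merged.get(tag, 0.0))   # assumed rule for duplicates
    ranked = sorted(merged, key=lambda tag: merged[tag], reverse=True)
    return ranked[:max_tags]
```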

(5) Answer quality grading

Answer quality grading yields question attribute information that includes the quality level of the answer, for example high-quality, average-quality, or low-quality.

Specifically, an SVM model can be used to grade answers as high, average, or low quality. The main features used are the semantic relatedness between answer and question, the source, the length, formatting information, and user click and browsing behavior.

(6) Question description normalization

The question attribute information obtained through description normalization includes, for example, the normalized description.

Specifically, questions from different sources describe their content (formulas in particular) inconsistently. The descriptions are normalized according to a defined convention and then added to the database.

It can be understood that SVM and GBDT are used above only as examples of classification models. Other algorithm models with classification capability may also be used, for example logistic regression, linear regression, random forests, neural networks, or naive Bayes.

S74: Obtain the identification information of the historical question, and store the identification information in correspondence with the question attribute information.

A unique identifier may be assigned to each historical question. After the question attribute information of a historical question is obtained, the identifier and the attribute information can be stored in correspondence in the structured question information base.
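As a rough illustration of storing identifiers in correspondence with attribute information, the sketch below keeps an in-memory mapping from a generated id to an attribute record. The class names, field names, and the sequential id scheme are all assumptions; the text only requires that each historical question has a unique identifier.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class QuestionAttributes:
    """Hypothetical attribute record for one historical question."""
    question_type: str
    difficulty: int
    knowledge_points: List[str] = field(default_factory=list)
    answer_quality: str = "average"


class StructuredQuestionStore:
    """Maps a unique question id to its attribute record."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)  # assumed id scheme: 1, 2, 3, ...
        self._attrs: Dict[int, QuestionAttributes] = {}

    def add(self, attrs: QuestionAttributes) -> int:
        qid = next(self._next_id)
        self._attrs[qid] = attrs
        return qid

    def get(self, qid: int) -> QuestionAttributes:
        return self._attrs[qid]


store = StructuredQuestionStore()
qid = store.add(QuestionAttributes("multiple choice", 2, ["triangle area"]))
```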

In the prior art, when recommended questions are obtained from the description text of the current question, the question is not parsed structurally enough: the question type, the conditions, the semantic context of what is asked, and so on are not effectively distinguished, so retrieval relevance is low. In this embodiment, the different kinds of question attribute information (for example question type, difficulty, structure, and knowledge points) are stored as structured information, so that the attributes have hierarchical relationships and mutual associations, which improves retrieval relevance.

For building the user model, referring to FIG. 6, the offline system further includes a user modeling module 63, which builds the user model from the user behavior log, the user attribute information, and the structured question information base.

Optionally, referring to FIG. 8, building the user model includes:

S81: Obtain the user behavior log and the user attribute information.

The user behavior log may record the questions the user has browsed, clicked, or bookmarked in the question bank.

User attribute information refers to meta-attributes of the user, such as gender, type (parent/student/teacher), region, grade, and school.

S82: Perform user modeling according to the structured question information base, the user behavior log, and the user attribute information, to obtain the user model.

For example, the user behavior log shows which questions the user has browsed, clicked, or bookmarked, and the structured question information base gives the attribute information of those questions. Together they yield user preference information, for example the subjects, knowledge points, questions, knowledge-point difficulty levels, and question types the user pays attention to. The user attribute information, once obtained, can also be recorded in the user model.

In addition, user identification information may be assigned when the user registers, and the user model may store the user identification information in correspondence with the user preference information and user attribute information described above.
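Steps S81 and S82 can be sketched as a join of the behavior log against the question attributes, followed by simple counting. Everything named here (`build_user_model`, the dictionary keys, the choice of "most frequent value" as the preference) is an assumption for illustration; the text does not prescribe how preferences are derived.

```python
from collections import Counter
from typing import Dict, List


def build_user_model(user_id: str,
                     behavior_log: List[int],
                     question_attrs: Dict[int, dict],
                     user_attrs: dict) -> dict:
    """Derive preference information from the ids of questions the user touched."""
    knowledge_points, difficulties, types = Counter(), Counter(), Counter()
    for qid in behavior_log:
        attrs = question_attrs.get(qid)
        if attrs is None:  # question missing from the structured information base
            continue
        knowledge_points.update(attrs["knowledge_points"])
        difficulties[attrs["difficulty"]] += 1
        types[attrs["type"]] += 1
    return {
        "user_id": user_id,
        "attributes": dict(user_attrs),
        "preferred_knowledge_points": [kp for kp, _ in knowledge_points.most_common(3)],
        "preferred_difficulty": difficulties.most_common(1)[0][0] if difficulties else None,
        "preferred_type": types.most_common(1)[0][0] if types else None,
    }


# Hypothetical structured information base and behavior log.
question_attrs = {
    1: {"type": "multiple choice", "difficulty": 2, "knowledge_points": ["triangle area"]},
    2: {"type": "fill in the blank", "difficulty": 2,
        "knowledge_points": ["triangle area", "Pythagorean theorem"]},
    3: {"type": "multiple choice", "difficulty": 3, "knowledge_points": ["circle area"]},
}
model = build_user_model("u1", [1, 2, 3, 99], question_attrs,
                         {"type": "student", "grade": 8})
```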

In this embodiment, question attribute information is obtained and the preliminary retrieval results are obtained according to it. Because question attributes, and not only text similarity, are taken into account, the relevance of the recommended questions to the search question is improved. In addition, user description information is obtained and used to sort the preliminary retrieval results, so that user information is considered at recommendation time, which improves relevance to the user and the recommendation effect. This embodiment extracts knowledge points at a fine granularity and distinguishes single from mixed knowledge points, which effectively improves the relevance of the recommended questions to the question being searched. It also introduces further question attributes, such as difficulty and type, that cannot be handled from the perspective of text similarity; introducing them improves the relevance of the recommended questions to the search question. By referring to user attribute information, this embodiment supports personalized recommendation, so that different users see different related questions, which improves the user experience.

FIG. 9 is a schematic structural diagram of a question recommendation apparatus according to another embodiment of the present invention. The apparatus 90 includes a receiving module 91, an obtaining module 92, a sorting module 93, and a determining module 94.

The receiving module 91 is configured to receive a search question.

When a user needs to search for a question, the user can enter the search question in a search box.

It can be understood that this embodiment may be a server, in which case the receiving module receives the search question sent by a browser, and the browser obtains the search question the user entered in the search box. Alternatively, this embodiment may be a web product apparatus with a search function, in which case the receiving module receives the search question entered by the user.

The obtaining module 92 is configured to obtain question attribute information of the search question and to obtain preliminary retrieval results according to the question attribute information.

Optionally, the obtaining module 92 being configured to obtain the question attribute information of the search question includes:

obtaining identification information of the search question;

obtaining, from a pre-built structured question information base, the question attribute information corresponding to the identification information, where the structured question information base stores question identification information in correspondence with question attribute information.

For example, when the search question entered by the user is an image, optical character recognition (OCR) can first be performed on the image to obtain a recognition result. The pre-stored original question bank is then searched for a question identical or similar to the recognition result, and the identifier (id) of the question found is used as the identifier (id) of the current search question.

After the id of the search question is obtained, the question attribute information corresponding to that id can be obtained from the pre-built structured question information base.
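The lookup from recognized text to a question id might look like the sketch below. It assumes the OCR step has already produced a string and uses the standard-library `difflib.SequenceMatcher` as a stand-in for whatever similarity search the system actually uses; the function name `find_question_id`, the 0.8 cutoff, and the sample bank are hypothetical.

```python
import difflib
from typing import Dict, Optional


def find_question_id(recognized_text: str,
                     question_bank: Dict[int, str],
                     min_ratio: float = 0.8) -> Optional[int]:
    """Return the id of the most similar stored question, or None if none is close."""
    best_id, best_ratio = None, 0.0
    for qid, text in question_bank.items():
        ratio = difflib.SequenceMatcher(None, recognized_text, text).ratio()
        if ratio > best_ratio:
            best_id, best_ratio = qid, ratio
    return best_id if best_ratio >= min_ratio else None


# Hypothetical original question bank keyed by question id.
original_bank = {
    101: "Find the area of a triangle with base 6 and height 4.",
    102: "A car travels 120 km in 2 hours. What is its average speed?",
}
# Simulated OCR output with a misread character ("l" read as "1").
ocr_text = "Find the area of a triang1e with base 6 and height 4"
matched_id = find_question_id(ocr_text, original_bank)
```

The returned id would then be used as the key into the structured question information base. A linear scan like this would not scale to a large question bank; it only illustrates the id-resolution step.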

The structured question information base stores question attribute information in correspondence with question identification information. The question attribute information includes, for example, question type, question difficulty, question structure, question knowledge points, answer quality, and the normalized question description.

Optionally, referring to FIG. 10, the apparatus 90 further includes a first establishing module 95 configured to build the structured question information base. The first establishing module 95 is specifically configured to:

obtain historical questions;

distribute the historical questions to different classification modules;

perform question feature extraction on the historical questions to obtain the question attribute information corresponding to each classification module;

obtain identification information of the historical questions, and store the question identification information in correspondence with the question attribute information.

For the process of building the structured question information base, see the related description in the method embodiment; details are not repeated here.

Optionally, the first establishing module 95 being configured to perform question feature extraction on the historical questions and obtain the question attribute information corresponding to each classification module includes:

when the classification module is the question knowledge-point extraction module, performing question keyword extraction and similar-segment comparison, obtaining first tags from the keyword extraction and second tags from the similar-segment comparison, fusing the first tags and the second tags, and selecting a preset number of the fused tags as knowledge-point tags.

Optionally, the obtaining module 92 being configured to obtain preliminary retrieval results according to the question attribute information includes:

obtaining keywords of the search question and performing text retrieval according to the keywords to obtain text retrieval results;

re-weighting the text retrieval results according to the question attribute information to obtain re-weighted retrieval results;

obtaining the knowledge-point information in the question attribute information and re-weighting the re-weighted retrieval results again, where the knowledge-point information includes a single knowledge point or mixed knowledge points;

selecting a preset number of retrieval results from the results after the second re-weighting as the preliminary retrieval results.

For example, a general-purpose word segmentation technique can be used to segment the search question, and keywords are then taken from the resulting words according to preset rules. The preset rules are based, for example, on a word's position in the search question, its importance in the question bank, and whether it is a subject term.

Once the keywords are obtained, they can be used as the query to retrieve related questions from the existing database, giving text retrieval results whose text descriptions are related.

For example, the text retrieval results can be re-weighted according to the knowledge points, type, difficulty, answer quality, and similar information of the search question. Specifically, the text retrieval results can be restricted to those whose knowledge points are the same as or close to those of the search question; results whose type and difficulty are close to those of the search question are up-weighted, as are results with high answer quality. The specific weight values can be preset according to actual needs.

Re-weighting yields text retrieval results with different weights.
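The attribute-based re-weighting can be sketched as a filter on shared knowledge points followed by multiplicative boosts. The boost values, the "difficulty within one level" notion of closeness, and all field names are assumptions; the text says only that the weight values are preset according to actual needs.

```python
from typing import Dict, List


def reweight_by_attributes(query: Dict,
                           results: List[Dict],
                           type_boost: float = 1.2,
                           difficulty_boost: float = 1.1,
                           quality_boost: float = 1.3) -> List[Dict]:
    """Drop results sharing no knowledge point with the query, then boost the rest."""
    reweighted = []
    for result in results:
        if not set(result["knowledge_points"]) & set(query["knowledge_points"]):
            continue  # restrict to results with the same knowledge points
        weight = result["score"]
        if result["type"] == query["type"]:
            weight *= type_boost
        if abs(result["difficulty"] - query["difficulty"]) <= 1:  # assumed closeness
            weight *= difficulty_boost
        if result["answer_quality"] == "high":
            weight *= quality_boost
        reweighted.append({**result, "weight": weight})
    return reweighted


# Hypothetical search question attributes and text retrieval results.
query = {"type": "multiple choice", "difficulty": 2, "knowledge_points": ["triangle area"]}
text_results = [
    {"id": 1, "score": 1.0, "type": "multiple choice", "difficulty": 2,
     "answer_quality": "high", "knowledge_points": ["triangle area"]},
    {"id": 2, "score": 1.0, "type": "fill in the blank", "difficulty": 5,
     "answer_quality": "low", "knowledge_points": ["triangle area"]},
    {"id": 3, "score": 1.0, "type": "multiple choice", "difficulty": 2,
     "answer_quality": "high", "knowledge_points": ["circle area"]},
]
weighted = reweight_by_attributes(query, text_results)
```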

In addition, the knowledge points used in this embodiment may be fine-grained knowledge points; for details, see the related description of question knowledge-point extraction.

Using fine-grained knowledge points improves the relevance of the recommended questions to the current search question.

For example, if the search question involves a single knowledge point, the text retrieval results for that single knowledge point are up-weighted. Alternatively, if the search question involves mixed knowledge points, each knowledge point in the mix and its corresponding weight are determined, the weight of each text retrieval result is determined from those knowledge points and weights, and the text retrieval results close to the search question are up-weighted.
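One way to treat single and mixed knowledge points uniformly is to represent the search question as a map from knowledge point to weight, with a single knowledge point being the one-entry case. The sketch below makes that assumption; the additive boost formula and the names are illustrative and are not taken from the text.

```python
from typing import Dict, List


def knowledge_point_boost(query_kp_weights: Dict[str, float],
                          result_kps: List[str]) -> float:
    """Sum the query-side weights of the knowledge points a result covers."""
    return sum(w for kp, w in query_kp_weights.items() if kp in result_kps)


def reweight_by_knowledge_points(query_kp_weights: Dict[str, float],
                                 results: List[Dict]) -> List[Dict]:
    """Scale each result's weight by 1 + its covered knowledge-point weight."""
    return [
        {**r, "weight": r["weight"] * (1.0 + knowledge_point_boost(
            query_kp_weights, r["knowledge_points"]))}
        for r in results
    ]


# Hypothetical mixed-knowledge-point search question: two points, equal weights.
mixed_query = {"triangle area": 0.5, "Pythagorean theorem": 0.5}
candidates = [
    {"id": 1, "weight": 1.0, "knowledge_points": ["triangle area", "Pythagorean theorem"]},
    {"id": 2, "weight": 1.0, "knowledge_points": ["triangle area"]},
    {"id": 3, "weight": 1.0, "knowledge_points": ["circle area"]},
]
kp_weighted = reweight_by_knowledge_points(mixed_query, candidates)
```

With this representation, a result covering every knowledge point of a mixed question ends up ahead of one covering only part of it.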

In this embodiment, distinguishing single from mixed knowledge points improves the relevance of the recommended questions to the current search question.

For example, according to the weight information, a preset number of retrieval results with the largest weights (for example, 50) are selected as the preliminary retrieval results.

The sorting module 93 is configured to obtain user description information of the user and to sort the preliminary retrieval results according to the user description information to obtain sorted results.

Optionally, the sorting module 93 being configured to obtain the user description information of the user includes:

obtaining identification information of the user;

obtaining, from a pre-built user model, the user description information corresponding to the identification information of the user, where the user model stores user identification information in correspondence with user description information.

Corresponding identification information can be assigned to each user at registration. When the user logs in, the login information carries the user's own identification information, and the system can obtain the user's identifier (id) from the login information.

The user description information includes, for example, the user's preferred difficulty, preferred question type, textbook edition, and browsing, click, and bookmark history for questions.

Optionally, referring to FIG. 10, the apparatus 90 further includes a second establishing module 96 configured to build the user model. The second establishing module 96 is specifically configured to:

obtain the user behavior log and the user attribute information;

perform user modeling according to the structured question information base, the user behavior log, and the user attribute information, to obtain the user model.

For the process of building the user model, see the related description in the method embodiment; details are not repeated here.

Optionally, the sorting module 93 being configured to sort the preliminary retrieval results according to the user description information includes:

up-weighting the preliminary retrieval results that are consistent with the user description information, and sorting the preliminary retrieval results according to the resulting weights.

For example, the following preliminary retrieval results are up-weighted:

results whose question difficulty level matches the user's preferred difficulty level;

results whose question type matches the user's preferred type;

results whose question source matches the user's textbook edition;

results whose grade information matches the user's current grade;

results favored by an analysis of the user's historical behavior on the current knowledge point: the weighting is adjusted according to the difficulty and number of questions the user has browsed for that knowledge point, for example by up-weighting questions slightly above the user's current difficulty level.

After the above weighting, the retrieval results are sorted in descending order of weight, giving, for example, 50 sorted results.
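The user-based weighting and sorting listed above can be sketched as follows. The uniform boost factor of 1.2 and the field names are assumptions, and the historical-behavior adjustment is left out to keep the example short; only the four match rules and the descending sort are shown.

```python
from typing import Dict, List


def rank_for_user(user: Dict, results: List[Dict],
                  boost: float = 1.2, top_n: int = 50) -> List[Dict]:
    """Boost results matching the user's description, then sort by weight."""
    def user_weight(result: Dict) -> float:
        weight = result["weight"]
        if result["difficulty"] == user["preferred_difficulty"]:
            weight *= boost
        if result["type"] == user["preferred_type"]:
            weight *= boost
        if result["source"] == user["textbook_edition"]:
            weight *= boost
        if result["grade"] == user["grade"]:
            weight *= boost
        return weight

    ranked = sorted(results, key=user_weight, reverse=True)  # largest weight first
    return ranked[:top_n]


# Hypothetical user description and preliminary retrieval results.
user = {"preferred_difficulty": 2, "preferred_type": "multiple choice",
        "textbook_edition": "Edition A", "grade": 8}
preliminary = [
    {"id": 2, "weight": 1.5, "difficulty": 4, "type": "fill in the blank",
     "source": "Edition B", "grade": 9},
    {"id": 3, "weight": 1.0, "difficulty": 2, "type": "fill in the blank",
     "source": "Edition B", "grade": 8},
    {"id": 1, "weight": 1.0, "difficulty": 2, "type": "multiple choice",
     "source": "Edition A", "grade": 8},
]
ranked_ids = [r["id"] for r in rank_for_user(user, preliminary)]
```

Passing a smaller `top_n` would correspond to choosing the preset number of recommended questions from the front of the sorted list.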

The determining module 94 is configured to select a preset number of results from the sorted results as the recommended questions.

The preset number may be specified by the user or set by system default. The preset number of results is then selected, in order from first to last, from the sorted results obtained in the previous step.

Further, when the apparatus is a web product apparatus, referring to FIG. 10, the search question is entered by the user, and the apparatus includes:

a display module 97 configured to display the recommended questions to the user.

In this embodiment, question attribute information is obtained and the preliminary retrieval results are obtained according to it. Because question attributes, and not only text similarity, are taken into account, the relevance of the recommended questions to the search question is improved. In addition, user description information is obtained and used to sort the preliminary retrieval results, so that user information is considered at recommendation time, which improves relevance to the user and the recommendation effect. This embodiment extracts knowledge points at a fine granularity and distinguishes single from mixed knowledge points, which effectively improves the relevance of the recommended questions to the question being searched. It also introduces further question attributes, such as difficulty and type, that cannot be handled from the perspective of text similarity; introducing them improves the relevance of the recommended questions to the search question. By referring to user attribute information, this embodiment supports personalized recommendation, so that different users see different related questions, which improves the user experience.

An embodiment of the present invention further provides an electronic device, including: one or more processors; a memory; and one or more programs stored in the memory which, when executed by the one or more processors, perform the following: receiving a search question; obtaining question attribute information of the search question, and obtaining preliminary retrieval results according to the question attribute information; obtaining user description information of a user, and sorting the preliminary retrieval results according to the user description information to obtain sorted results; and selecting a preset number of results from the sorted results as recommended questions.

An embodiment of the present invention further provides a non-volatile computer storage medium storing one or more modules which, when executed, perform the following: receiving a search question; obtaining question attribute information of the search question, and obtaining preliminary retrieval results according to the question attribute information; obtaining user description information of a user, and sorting the preliminary retrieval results according to the user description information to obtain sorted results; and selecting a preset number of results from the sorted results as recommended questions.

It should be noted that, in the description of the present invention, the terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality of" means two or more unless otherwise specified.

Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in reverse order depending on the functions involved, as will be understood by those skilled in the art to which the embodiments of the present invention pertain.

It should be understood that the parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.

A person of ordinary skill in the art can understand that all or some of the steps of the above method embodiments can be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described above, it can be understood that these embodiments are exemplary and are not to be construed as limiting the present invention. A person of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (21)

一种题目推荐方法,其特征在于,包括:A method for recommending a topic, characterized in that it comprises: 接收检索题目;Receiving a search question; 获取所述检索题目的题目属性信息,并根据所述题目属性信息获取初步检索结果;Obtaining topic attribute information of the search topic, and obtaining preliminary search results according to the topic attribute information; 获取用户的用户描述信息,并根据所述用户描述信息对所述初步检索结果进行排序,得到排序后的结果;Obtaining user description information of the user, and sorting the preliminary search results according to the user description information to obtain a sorted result; 从所述排序后的结果后选择预设个数的结果,确定为推荐题目。A result of selecting a preset number from the sorted result is determined as a recommended title. 根据权利要求1所述的方法,其特征在于,所述获取所述检索题目的题目属性信息,包括:The method according to claim 1, wherein the obtaining the topic attribute information of the search topic comprises: 获取所述检索题目的标识信息;Obtaining identification information of the search topic; 在预先建立的题目结构化信息库中,获取与所述标识信息对应的题目属性信息,其中,所述题目结构化信息库中对应保存题目的标识信息与题目属性信息。In the pre-established topic structured information database, the title attribute information corresponding to the identifier information is obtained, wherein the title structured information database corresponds to the identification information of the saved topic and the topic attribute information. 根据权利要求2所述的方法,其特征在于,还包括:建立题目结构化信息库,所述建立题目结构化信息库,包括:The method of claim 2, further comprising: establishing a topic structured information base, the establishing the topic structured information base, comprising: 获取历史题目;Obtain historical topics; 将所述历史题目分发到不同的分类模块中;Distributing the historical title to different classification modules; 对所述历史题目进行题目特征提取,获取每个分类模块对应的题目属性信息;Performing topic feature extraction on the historical topic, and acquiring topic attribute information corresponding to each classification module; 获取所述历史题目的标识信息,并对应保存题目的标识信息与题目属性信息。Obtaining the identification information of the historical topic, and correspondingly storing the identification information of the topic and the topic attribute information. 
根据权利要求3所述的方法,其特征在于,所述分类模块包括如下项中的至少一项:The method of claim 3 wherein said classification module comprises at least one of the following: 题目类型分类模块,题目难度分级模块,题目结构拆分模块,题目知识点抽取模块,回答质量分级模块,题目描述归一模块。The topic type classification module, the topic difficulty classification module, the topic structure split module, the topic knowledge point extraction module, the answer quality classification module, and the topic description normalization module. 根据权利要求4所述的方法,其特征在于,所述对所述历史题目进行题目特征提取,获取每个分类模块对应的题目属性信息,包括:The method of claim 4, wherein the topic feature extraction is performed on the historical topic, and the topic attribute information corresponding to each classification module is obtained, including: 当所述分类模块是题目知识点抽取模块时,进行题目关键词抽取,以及,相似片断比对,并根据题目关键词抽取获取第一标签,根据相似片断比对得到第二标签,以及,对所述第一标签和所述第二标签进行融合,从融合后的标签中,选择预设个数的标签,确定为知识点标签。When the classification module is a topic knowledge point extraction module, the topic keyword extraction is performed, and the similar segment comparison is performed, and the first label is extracted according to the topic keyword, and the second label is obtained according to the similar segment comparison, and The first label and the second label are merged, and a preset number of labels are selected from the merged labels to be determined as a knowledge point label. 
根据权利要求1-5任一项所述的方法,其特征在于,所述根据所述题目属性信息获取初步检索结果,包括:The method according to any one of claims 1 to 5, wherein the obtaining the preliminary search result according to the title attribute information comprises: 获取所述检索题目的关键词,并根据所述关键词进行文本检索,得到文本检索结果;Obtaining a keyword of the search topic, and performing a text search according to the keyword to obtain a text search result; 根据所述题目属性信息对所述文本检索结果进行调权,得到调权后的检索结果; And adjusting the weight of the text search result according to the attribute information of the topic to obtain a search result after the weight adjustment; 获取所述题目属性信息中的知识点信息,对所述调权后的检索结果进行再次调权,其中,所述知识点信息包括:单一知识点或者混合知识点;Obtaining the knowledge point information in the attribute information of the topic, and re-adjusting the weighted search result, where the knowledge point information includes: a single knowledge point or a mixed knowledge point; 从再次调权后的检索结果中,选择预设个数的检索结果,确定为所述初步检索结果。From the search result after the weight adjustment again, a preset number of search results are selected and determined as the preliminary search result. 根据权利要求1-6任一项所述的方法,其特征在于,所述获取用户的用户描述信息,包括:The method according to any one of claims 1-6, wherein the obtaining user description information of the user comprises: 获取用户的标识信息;Obtain identification information of the user; 在预设建立的用户模型中,获取与所述用户的标识信息对应的用户描述信息,其中,所述用户模型中对应保存用户的标识信息与用户描述信息。The user description information corresponding to the identifier information of the user is obtained in a preset user model, where the user model correspondingly stores the identifier information of the user and the user description information. 
The method according to claim 7, further comprising establishing a user model, wherein establishing the user model comprises:
obtaining a user behavior log, and obtaining user attribute information;
performing user modeling according to the topic structured information base, the user behavior log, and the user attribute information, to obtain the user model.

The method according to any one of claims 1 to 8, wherein sorting the preliminary search results according to the user description information comprises:
weighting the preliminary search results that are consistent with the user description information, and sorting the preliminary search results according to the resulting weights.

The method according to any one of claims 1 to 9, wherein the search topic is input by the user, the method further comprising:
presenting the recommended topic to the user.

A topic recommendation device, comprising:
a receiving module, configured to receive a search topic;
an obtaining module, configured to obtain topic attribute information of the search topic, and obtain a preliminary search result according to the topic attribute information;
a sorting module, configured to obtain user description information of the user, and sort the preliminary search results according to the user description information to obtain sorted results;
a determining module, configured to select a preset number of results from the sorted results and determine them as recommended topics.
The device according to claim 11, wherein the obtaining module being configured to obtain the topic attribute information of the search topic comprises:
obtaining identification information of the search topic;
obtaining, from a pre-established topic structured information base, the topic attribute information corresponding to the identification information, wherein the topic structured information base stores identification information of topics in correspondence with topic attribute information.

The device according to claim 12, further comprising a first establishing module, configured to establish the topic structured information base, wherein the first establishing module is specifically configured to:
obtain historical topics;
distribute the historical topics to different classification modules;
perform topic feature extraction on the historical topics, and obtain the topic attribute information corresponding to each classification module;
obtain identification information of the historical topics, and store the identification information of each topic in correspondence with its topic attribute information.
The device according to claim 13, wherein the first establishing module being configured to perform topic feature extraction on the historical topics and obtain the topic attribute information corresponding to each classification module comprises:
when the classification module is the topic knowledge point extraction module, performing topic keyword extraction and similar-fragment comparison, obtaining a first label from the topic keyword extraction and a second label from the similar-fragment comparison, fusing the first label and the second label, and selecting a preset number of labels from the fused labels as knowledge point labels.

The device according to any one of claims 11 to 14, wherein the obtaining module being configured to obtain the preliminary search result according to the topic attribute information comprises:
obtaining keywords of the search topic, and performing a text search according to the keywords to obtain text search results;
adjusting the weights of the text search results according to the topic attribute information to obtain weight-adjusted search results;
obtaining knowledge point information from the topic attribute information, and adjusting the weights of the weight-adjusted search results again, wherein the knowledge point information comprises a single knowledge point or mixed knowledge points;
selecting a preset number of search results from the re-adjusted search results, and determining them as the preliminary search result.
The device according to any one of claims 11 to 15, wherein the sorting module being configured to obtain the user description information of the user comprises:
obtaining identification information of the user;
obtaining, from a pre-established user model, the user description information corresponding to the identification information of the user, wherein the user model stores identification information of users in correspondence with user description information.

The device according to claim 16, further comprising a second establishing module, configured to establish a user model, wherein the second establishing module is specifically configured to:
obtain a user behavior log, and obtain user attribute information;
perform user modeling according to the topic structured information base, the user behavior log, and the user attribute information, to obtain the user model.

The device according to any one of claims 11 to 17, wherein the sorting module being configured to sort the preliminary search results according to the user description information comprises:
weighting the preliminary search results that are consistent with the user description information, and sorting the preliminary search results according to the resulting weights.

The device according to any one of claims 11 to 18, wherein the search topic is input by the user, the device further comprising:
a display module, configured to present the recommended topic to the user.
An electronic device, comprising:
one or more processors;
a memory;
one or more programs, the one or more programs being stored in the memory and, when executed by the one or more processors:
performing the method according to any one of claims 1 to 10.

A non-volatile computer storage medium, wherein the computer storage medium stores one or more modules, and when the one or more modules are executed:
the method according to any one of claims 1 to 10 is performed.
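As an informal illustration only (not part of the claims), the claimed flow of text search, weight adjustment by topic attributes, a second weight adjustment by knowledge points, and user-based re-ranking can be sketched as below. Everything concrete in the sketch is an assumption: the `Topic` and `UserProfile` fields, the keyword-overlap text score, the multipliers 1.5, 0.5 and 2.0, and the use of preferred difficulty as the only piece of user description information. The claims do not fix any of these.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    """Hypothetical record from a topic structured information base."""
    topic_id: str
    text: str
    topic_type: str
    difficulty: int
    knowledge_points: frozenset


@dataclass
class UserProfile:
    """Hypothetical user description information from a user model."""
    user_id: str
    preferred_difficulty: int


def text_score(query_keywords, topic):
    # Stand-in for a real text search: count of shared lowercase words.
    words = set(topic.text.lower().split())
    return float(len(set(query_keywords) & words))


def preliminary_search(query, candidates, keep=3):
    """Text search, then two rounds of weight adjustment, then top `keep`."""
    keywords = query.text.lower().split()
    scored = []
    for cand in candidates:
        score = text_score(keywords, cand)
        if score == 0:
            continue  # not a text search hit
        # First adjustment: topic attribute (here, only the topic type).
        if cand.topic_type == query.topic_type:
            score *= 1.5
        # Second adjustment: shared knowledge points.
        overlap = len(cand.knowledge_points & query.knowledge_points)
        score *= 1.0 + 0.5 * overlap
        scored.append((score, cand))
    scored.sort(key=lambda pair: (-pair[0], pair[1].topic_id))
    return scored[:keep]


def recommend(query, candidates, user, keep=2):
    """Re-rank preliminary results by the user profile and keep `keep`."""
    reranked = []
    for score, cand in preliminary_search(query, candidates):
        # Boost results consistent with the user description information.
        if cand.difficulty == user.preferred_difficulty:
            score *= 2.0
        reranked.append((score, cand))
    reranked.sort(key=lambda pair: (-pair[0], pair[1].topic_id))
    return [cand.topic_id for _, cand in reranked[:keep]]


query = Topic("q0", "solve the quadratic equation", "calculation", 2,
              frozenset({"quadratic equation"}))
bank = [
    Topic("t1", "solve this quadratic equation by factoring", "calculation",
          1, frozenset({"quadratic equation"})),
    Topic("t2", "prove the quadratic formula", "proof",
          2, frozenset({"quadratic equation"})),
    Topic("t3", "solve the linear equation", "calculation",
          2, frozenset({"linear equation"})),
    Topic("t4", "name three rivers", "short answer",
          1, frozenset({"geography"})),
]
user = UserProfile("u1", preferred_difficulty=2)

print(recommend(query, bank, user))  # ['t3', 't1']
```

In this toy data the preliminary order is t1, t3, t2, and the user-based boost moves t3 ahead of t1, which shows how the final ranking can differ from the purely topic-based one.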
PCT/CN2015/090002 2015-05-14 2015-09-18 Method and device for question recommendation Ceased WO2016179938A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510246727.2A CN104834729B (en) 2015-05-14 2015-05-14 Topic recommends method and topic recommendation apparatus
CN201510246727.2 2015-05-14

Publications (1)

Publication Number Publication Date
WO2016179938A1 (en) 2016-11-17

Family

ID=53812615

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/090002 Ceased WO2016179938A1 (en) 2015-05-14 2015-09-18 Method and device for question recommendation

Country Status (2)

Country Link
CN (1) CN104834729B (en)
WO (1) WO2016179938A1 (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222678A (en) * 2019-04-30 2019-09-10 宜春宜联科技有限公司 A kind of item analysis method, system, readable storage medium storing program for executing and electronic equipment
CN110347791A (en) * 2019-06-20 2019-10-18 广东工业大学 A kind of topic recommended method based on multi-tag classification convolutional neural networks
CN110737698A (en) * 2019-10-15 2020-01-31 重庆浪尖至简物联网科技有限公司 question-related information recommendation method based on question description
CN111507550A (en) * 2019-01-30 2020-08-07 广州泰迪智能科技有限公司 Automatic recommendation method for optimal solution of work order problem
CN111797222A (en) * 2020-06-29 2020-10-20 平安国际智慧城市科技股份有限公司 Course knowledge graph construction method, device, terminal and storage medium
CN111898343A (en) * 2020-08-03 2020-11-06 北京师范大学 A Phrase Structure Tree-Based Similar Topic Recognition Method and System
CN111931875A (en) * 2020-10-10 2020-11-13 北京世纪好未来教育科技有限公司 Data processing method, electronic device and computer readable medium
CN112069295A (en) * 2020-09-18 2020-12-11 科大讯飞股份有限公司 Similar question recommendation method and device, electronic equipment and storage medium
CN112100341A (en) * 2020-04-13 2020-12-18 上海迷因网络科技有限公司 Intelligent question classification and recommendation method for rapid expressive force test
CN112256743A (en) * 2020-10-22 2021-01-22 北京猿力未来科技有限公司 Adaptive question setting method, equipment and storage medium
CN112256869A (en) * 2020-10-12 2021-01-22 浙江大学 Same-knowledge-point test question grouping system and method based on question meaning text
CN112424763A (en) * 2019-04-30 2021-02-26 北京字节跳动网络技术有限公司 Object recommendation method and device, storage medium and terminal equipment
CN112487183A (en) * 2020-11-10 2021-03-12 江苏乐易学教育科技有限公司 Labeled test question knowledge point classification method and system
CN112686052A (en) * 2020-12-28 2021-04-20 科大讯飞股份有限公司 Test question recommendation method, test question training method, electronic equipment and storage device
CN113051886A (en) * 2021-03-25 2021-06-29 科大讯飞股份有限公司 Test question duplicate checking method and device, storage medium and equipment
CN113254611A (en) * 2021-05-18 2021-08-13 北京小米移动软件有限公司 Question recommendation method and device, electronic equipment and storage medium
CN113538188A (en) * 2021-07-27 2021-10-22 北京世纪好未来教育科技有限公司 Test paper generation method, device, electronic device and computer-readable storage medium
CN113590961A (en) * 2021-08-03 2021-11-02 浙江工商大学 Personalized exercise recommendation method and device based on cognition and state evaluation and intelligent terminal
CN113934922A (en) * 2020-07-14 2022-01-14 中移(成都)信息通信科技有限公司 Intelligent recommendation method, device, equipment and computer storage medium
CN114329181A (en) * 2021-11-29 2022-04-12 腾讯科技(深圳)有限公司 Method, device and electronic device for topic recommendation
CN114492638A (en) * 2022-01-26 2022-05-13 第四范式(北京)技术有限公司 Feature extraction method and device, electronic device, and storage medium
CN114817545A (en) * 2022-05-07 2022-07-29 上海交通大学宁波人工智能研究院 A similar topic recommendation system and method
CN115080724A (en) * 2021-03-15 2022-09-20 广州视源电子科技股份有限公司 Exercise recommendation method and device, electronic equipment and storage medium
CN115374255A (en) * 2021-05-18 2022-11-22 腾讯科技(深圳)有限公司 Topic recommendation method, device, equipment and storage medium
CN115481328A (en) * 2022-10-12 2022-12-16 杭州摩西科技发展有限公司 Method, device, computer equipment and storage medium for generating customized question bank
CN116166956A (en) * 2021-11-19 2023-05-26 北京猿力未来科技有限公司 Training method and device for topic processing model
CN116610782A (en) * 2023-04-28 2023-08-18 北京百度网讯科技有限公司 Text retrieval method, device, electronic equipment and medium
CN118428360A (en) * 2024-07-05 2024-08-02 卓世智星(青田)元宇宙科技有限公司 Question quality detection method and device

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834729B (en) * 2015-05-14 2018-08-10 作业帮教育科技(北京)有限公司 Topic recommends method and topic recommendation apparatus
CN105138653B (en) * 2015-08-28 2018-08-21 天津大学 It is a kind of that method and its recommendation apparatus are recommended based on typical degree and the topic of difficulty
CN105678575B (en) * 2015-12-31 2020-11-13 华南师范大学 Personalized recommendation method and system based on user attribute knowledge base
CN105912654A (en) * 2016-04-08 2016-08-31 南方电网科学研究院有限责任公司 Case retrieval method and system for power equipment defect fault
CN107301163B (en) * 2016-04-14 2020-11-17 科大讯飞股份有限公司 Formula-containing text semantic parsing method and device
CN106599054B (en) * 2016-11-16 2019-12-24 福建天泉教育科技有限公司 Method and system for classifying and pushing questions
CN106776724B (en) * 2016-11-16 2020-09-08 福建天泉教育科技有限公司 Question classification method and system
CN106651696B (en) * 2016-11-16 2020-10-27 福建天泉教育科技有限公司 Approximate question pushing method and system
CN106777328B (en) * 2017-01-11 2020-11-20 广东小天才科技有限公司 Method and device for topic recommendation for mobile terminal
CN107562769A (en) * 2017-05-24 2018-01-09 广东工业大学 A kind of online answer topic recommends method and device
CN107301169B (en) * 2017-06-16 2021-02-05 科大讯飞股份有限公司 Method and device for detecting off-topic composition and terminal equipment
CN107292785A (en) * 2017-06-27 2017-10-24 北京粉笔蓝天科技有限公司 One kind is set a question method and system
CN108090119A (en) * 2017-11-08 2018-05-29 广东小天才科技有限公司 Method, device, mobile terminal and storage medium for displaying answers to questions
CN108171629A (en) * 2017-12-28 2018-06-15 北京中税网控股股份有限公司 A kind of course recommends method and device
CN108376132B (en) * 2018-03-16 2020-08-28 中国科学技术大学 A method and system for judging similar test questions
CN108984702A (en) * 2018-07-06 2018-12-11 深圳市卓帆技术有限公司 Examination question comparison method and system
CN109635100A (en) * 2018-12-24 2019-04-16 上海仁静信息技术有限公司 A kind of recommended method, device, electronic equipment and the storage medium of similar topic
CN111723231B (en) * 2019-03-20 2023-10-17 北京百舸飞驰科技有限公司 Question prediction method and device
CN110362671B (en) * 2019-07-16 2022-04-19 安徽知学科技有限公司 Topic recommendation method, device and storage medium
CN110689275A (en) * 2019-10-10 2020-01-14 江苏曲速教育科技有限公司 Method and system for analyzing and quantitatively scoring wrong questions
CN111654516A (en) * 2020-03-06 2020-09-11 厦门区块链云科技有限公司 Block chain original content cochain and distribution system
CN111625631B (en) * 2020-04-14 2022-06-17 西南大学 Method for generating option of choice question
CN111222076B (en) * 2020-04-16 2020-08-07 江西软云科技股份有限公司 Topic pushing method, system, readable storage medium and computer equipment
CN111914176B (en) * 2020-08-07 2023-10-27 腾讯科技(深圳)有限公司 Question recommendation method and device
CN111859094A (en) * 2020-08-10 2020-10-30 广州驰兴通用技术研究有限公司 Information analysis method and system based on cloud computing
CN112216161B (en) * 2020-10-23 2022-02-22 新维畅想数字科技(北京)有限公司 Digital work teaching method and device
CN112614034A (en) * 2021-03-05 2021-04-06 北京世纪好未来教育科技有限公司 Test question recommendation method and device, electronic equipment and readable storage medium
CN113360631A (en) * 2021-05-26 2021-09-07 医声医事(北京)科技有限公司 Intelligent volume assembling method and device
CN113409635B (en) * 2021-06-17 2025-02-11 上海松鼠课堂人工智能科技有限公司 Interactive teaching method and system based on virtual reality scene
US11729068B2 (en) 2021-09-09 2023-08-15 International Business Machines Corporation Recommend target systems for operator to attention in monitor tool
CN114254615B (en) * 2021-12-14 2025-06-10 科大讯飞股份有限公司 Method, device, electronic equipment and storage medium for winding
CN115344686A (en) * 2022-08-22 2022-11-15 北京有竹居网络技术有限公司 A topic recommendation method, device, computer equipment and storage medium
CN118170945B (en) * 2024-05-15 2024-08-13 北京无忧创想信息技术有限公司 Post-class problem generation method and device for community video courses
CN118760758B (en) * 2024-09-05 2025-08-22 光合新知(北京)科技有限公司 A label matching method and system for smart teaching

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102034373A (en) * 2009-09-29 2011-04-27 新技网路科技股份有限公司 Auxiliary learning method and system thereof
CN103136302A (en) * 2011-12-05 2013-06-05 北大方正集团有限公司 Method and device of test question repeat output
US20130262459A1 (en) * 2012-04-03 2013-10-03 Python4Fun Identifying social profiles in a social network having relevance to a first file
CN103577507A (en) * 2012-08-10 2014-02-12 俞晓鸿 Intelligent question bank system with real-time detection and self-adaptive evolution mechanism and method
CN104834729A (en) * 2015-05-14 2015-08-12 百度在线网络技术(北京)有限公司 Title recommendation method and title recommendation device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8200661B1 (en) * 2008-12-18 2012-06-12 Google Inc. Dynamic recommendations based on user actions
CN103955525A (en) * 2014-05-09 2014-07-30 北京奇虎科技有限公司 Method and client for searching answer to test question
CN104063443A (en) * 2014-06-13 2014-09-24 百度在线网络技术(北京)有限公司 Method and device for providing search result

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507550A (en) * 2019-01-30 2020-08-07 广州泰迪智能科技有限公司 Automatic recommendation method for optimal solution of work order problem
CN110222678A (en) * 2019-04-30 2019-09-10 宜春宜联科技有限公司 A kind of item analysis method, system, readable storage medium storing program for executing and electronic equipment
CN112424763B (en) * 2019-04-30 2023-09-12 抖音视界有限公司 Object recommendation method and device, storage medium and terminal equipment
CN112424763A (en) * 2019-04-30 2021-02-26 北京字节跳动网络技术有限公司 Object recommendation method and device, storage medium and terminal equipment
CN110347791A (en) * 2019-06-20 2019-10-18 广东工业大学 A kind of topic recommended method based on multi-tag classification convolutional neural networks
CN110737698A (en) * 2019-10-15 2020-01-31 重庆浪尖至简物联网科技有限公司 question-related information recommendation method based on question description
CN112100341A (en) * 2020-04-13 2020-12-18 上海迷因网络科技有限公司 Intelligent question classification and recommendation method for rapid expressive force test
CN112100341B (en) * 2020-04-13 2023-07-07 上海擅择教育科技有限公司 Intelligent question classification and recommendation method for rapid expressive force test
CN111797222A (en) * 2020-06-29 2020-10-20 平安国际智慧城市科技股份有限公司 Course knowledge graph construction method, device, terminal and storage medium
CN111797222B (en) * 2020-06-29 2023-12-22 平安国际智慧城市科技股份有限公司 Course knowledge graph construction method, device, terminal and storage medium
CN113934922A (en) * 2020-07-14 2022-01-14 中移(成都)信息通信科技有限公司 Intelligent recommendation method, device, equipment and computer storage medium
CN111898343A (en) * 2020-08-03 2020-11-06 北京师范大学 A Phrase Structure Tree-Based Similar Topic Recognition Method and System
CN112069295A (en) * 2020-09-18 2020-12-11 科大讯飞股份有限公司 Similar question recommendation method and device, electronic equipment and storage medium
CN111931875A (en) * 2020-10-10 2020-11-13 北京世纪好未来教育科技有限公司 Data processing method, electronic device and computer readable medium
CN111931875B (en) * 2020-10-10 2021-10-08 北京世纪好未来教育科技有限公司 Data processing method, electronic device and computer readable medium
CN112256869A (en) * 2020-10-12 2021-01-22 浙江大学 Same-knowledge-point test question grouping system and method based on question meaning text
CN112256743B (en) * 2020-10-22 2024-06-04 北京猿力未来科技有限公司 Self-adaptive question setting method, device and storage medium
CN112256743A (en) * 2020-10-22 2021-01-22 北京猿力未来科技有限公司 Adaptive question setting method, equipment and storage medium
CN112487183A (en) * 2020-11-10 2021-03-12 江苏乐易学教育科技有限公司 Labeled test question knowledge point classification method and system
CN112686052A (en) * 2020-12-28 2021-04-20 科大讯飞股份有限公司 Test question recommendation method, test question training method, electronic equipment and storage device
CN112686052B (en) * 2020-12-28 2023-12-01 科大讯飞股份有限公司 Test question recommendation and related model training method, electronic equipment and storage device
CN115080724B (en) * 2021-03-15 2025-08-12 广州视源电子科技股份有限公司 Problem recommendation method and device, electronic equipment and storage medium
CN115080724A (en) * 2021-03-15 2022-09-20 广州视源电子科技股份有限公司 Exercise recommendation method and device, electronic equipment and storage medium
CN113051886A (en) * 2021-03-25 2021-06-29 科大讯飞股份有限公司 Test question duplicate checking method and device, storage medium and equipment
CN113051886B (en) * 2021-03-25 2023-12-01 科大讯飞股份有限公司 Test question duplicate checking method, device, storage medium and equipment
CN113254611A (en) * 2021-05-18 2021-08-13 北京小米移动软件有限公司 Question recommendation method and device, electronic equipment and storage medium
CN115374255A (en) * 2021-05-18 2022-11-22 腾讯科技(深圳)有限公司 Topic recommendation method, device, equipment and storage medium
CN113538188A (en) * 2021-07-27 2021-10-22 北京世纪好未来教育科技有限公司 Test paper generation method, device, electronic device and computer-readable storage medium
CN113538188B (en) * 2021-07-27 2024-03-01 北京世纪好未来教育科技有限公司 Test paper generation method and device, electronic equipment and computer readable storage medium
CN113590961A (en) * 2021-08-03 2021-11-02 浙江工商大学 Personalized exercise recommendation method and device based on cognition and state evaluation and intelligent terminal
CN113590961B (en) * 2021-08-03 2023-06-23 浙江工商大学 Personalized exercise recommendation method, device and intelligent terminal based on cognition and state evaluation
CN116166956A (en) * 2021-11-19 2023-05-26 北京猿力未来科技有限公司 Training method and device for topic processing model
CN114329181A (en) * 2021-11-29 2022-04-12 腾讯科技(深圳)有限公司 Method, device and electronic device for topic recommendation
CN114492638A (en) * 2022-01-26 2022-05-13 第四范式(北京)技术有限公司 Feature extraction method and device, electronic device, and storage medium
CN114817545A (en) * 2022-05-07 2022-07-29 上海交通大学宁波人工智能研究院 A similar topic recommendation system and method
CN115481328A (en) * 2022-10-12 2022-12-16 杭州摩西科技发展有限公司 Method, device, computer equipment and storage medium for generating customized question bank
CN116610782A (en) * 2023-04-28 2023-08-18 北京百度网讯科技有限公司 Text retrieval method, device, electronic equipment and medium
CN116610782B (en) * 2023-04-28 2024-03-15 北京百度网讯科技有限公司 Text retrieval method, device, electronic equipment and medium
CN118428360A (en) * 2024-07-05 2024-08-02 卓世智星(青田)元宇宙科技有限公司 Question quality detection method and device

Also Published As

Publication number Publication date
CN104834729B (en) 2018-08-10
CN104834729A (en) 2015-08-12

Similar Documents

Publication Publication Date Title
CN104834729B (en) Topic recommends method and topic recommendation apparatus
CN110888990B (en) Text recommendation method, device, equipment and medium
CN108009228B (en) Method, device and storage medium for setting content label
US11023523B2 (en) Video content retrieval system
CN106649818B (en) Application search intent identification method, device, application search method and server
US8930288B2 (en) Learning tags for video annotation using latent subtags
US8787683B1 (en) Image classification
Marujo et al. Supervised topical key phrase extraction of news stories using crowdsourcing, light filtering and co-reference normalization
US9348900B2 (en) Generating an answer from multiple pipelines using clustering
US9230009B2 (en) Routing of questions to appropriately trained question and answer system pipelines using clustering
CN109271574A (en) A kind of hot word recommended method and device
US20140358928A1 (en) Clustering Based Question Set Generation for Training and Testing of a Question and Answer System
US20080319973A1 (en) Recommending content using discriminatively trained document similarity
US20120259801A1 (en) Transfer of learning for query classification
US20150095300A1 (en) System and method for mark-up language document rank analysis
CN112307336B (en) Hot spot information mining and previewing method and device, computer equipment and storage medium
CN106547864B (en) A Personalized Information Retrieval Method Based on Query Expansion
CN106649761A (en) Search result display method and device based on profound questioning and answering
CN105975639B (en) Search result ordering method and device
TW201337814A (en) Product information publishing method and device
Sharma et al. NIRMAL: Automatic identification of software relevant tweets leveraging language model
CN106462644B (en) Identify the preferred results page identified from multiple results pages
CN109299277A (en) Public opinion analysis method, server and computer-readable storage medium
CN108038099B (en) A low-frequency keyword recognition method based on word clustering
US20160299891A1 (en) Matching of an input document to documents in a document collection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15891641

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15891641

Country of ref document: EP

Kind code of ref document: A1