CN117149984B - Customized training method and device based on a large-model chain of thought - Google Patents

Customized training method and device based on a large-model chain of thought

Info

Publication number
CN117149984B
CN117149984B (application CN202311414847.XA; earlier publication CN117149984A)
Authority
CN
China
Prior art keywords: training; knowledge base; keywords; input prompt; questions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311414847.XA
Other languages: Chinese (zh)
Other versions: CN117149984A (en)
Inventor
赵策
王亚
屠静
周勤民
张玥
雷媛媛
孙岩
潘亮亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuoshi Future Beijing technology Co ltd
Original Assignee
Zhuoshi Future Beijing technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Zhuoshi Future Beijing technology Co ltd
Priority to CN202311414847.XA
Publication of CN117149984A
Application granted
Publication of CN117149984B
Legal status: Active
Anticipated expiration

Classifications

    • G06F16/3329 — Natural language query formulation or dialogue systems
    • G06F16/367 — Ontology (creation of semantic tools, e.g. ontology or thesauri)
    • G06F40/295 — Named entity recognition
    • G06F40/44 — Statistical methods, e.g. probability models (processing or translation of natural language)
    • G06N5/022 — Knowledge engineering; knowledge acquisition
    • G06Q50/2057 — Career enhancement or continuing education service


Abstract

The invention provides a customized training method and device based on a large-model chain of thought, applied to the technical field of training informatization. The method comprises the following steps: acquiring training text data and building a training knowledge base; extracting the keywords of the training knowledge base; acquiring the historical questions of all trainees and the current question of a target trainee; constructing an input prompt sequence whose length equals a preset number of iterations; calculating the likelihood of each input prompt; and feeding the input prompt with the highest likelihood into a pre-trained large model to obtain customized training information. The invention addresses the low efficiency of manual instructor replies in the question-and-answer stage of online training and the lack of domain expertise in robot question answering, and the chain-of-thought-based generation method can provide high-quality customized training material.

Description

Customized training method and device based on a large-model chain of thought
Technical Field
The invention relates to the technical field of training informatization, and in particular to a customized training method and device based on a large-model chain of thought.
Background
Training is an important part of enterprise management, and online training has become a common training modality: flexible in time and place, it has great advantages. However, the interaction between instructors and trainees has remained a problem, especially in the question-and-answer stage. With the development of artificial intelligence, dialogue robots can provide some help on top of traditional manual question answering and reduce training costs to a certain extent.
Currently, however, such AI-based dialogue robots can only make simple small talk and have no way of answering professional questions.
A large language model (LLM) is a deep-learning-based natural language processing technology that has developed significantly in recent years. It can generate high-quality text, answer questions, hold conversations, and so on. Large models can extend traditional training approaches: whereas traditional training typically relies on instructor guidance and is limited by textbooks, a large model can generate rich and varied training information according to trainee needs. Through dialogue with a large model, trainees can obtain real-time problem solving and learning support, unconstrained by time and place.
However, a large model requires trainees to pose high-quality questions in order to trigger correspondingly high-quality replies. Trainees differ in background, so expecting every trainee to pose high-quality questions is a challenge.
First, the accuracy of a question's wording is a key factor in obtaining an accurate answer: a precise question clearly expresses the requirement and supplies enough context to make the answer more accurate and targeted. Second, the rationality of the question is an important factor in obtaining a useful answer: a reasonable question should be logical, with a clear cause-and-effect structure. In addition, the relevance of the question to the training material is an important consideration: when asking, it is necessary to ensure that the question is on topic and answerable. In summary, the quality of trainee questions is critical when querying a large model, and posing accurate, reasonable and relevant questions improves the quality and accuracy of the model's answers.
Disclosure of Invention
The embodiments of the invention provide a customized training method and device based on a large-model chain of thought, aimed at the low efficiency of manual instructor replies in the question-and-answer stage of online training and the lack of expertise in robot question answering. A method that generates customized training material in the question-and-answer stage is provided on the basis of the large-model chain of thought: the large language model interacts with trainees through dialogue, providing real-time support for solving professional problems and for learning, and trainees can communicate with it anytime and anywhere, unconstrained by time and place. The technical scheme is as follows:
In a first aspect, an embodiment of the present application provides a customized training method based on a large-model chain of thought, comprising the following steps:
S1: acquiring training text data, and preprocessing the training text data to obtain a training knowledge base;
S2: extracting keywords from the training text data, calculating the importance of each keyword to obtain its weight, and adding each keyword with its weight to the training knowledge base as the keywords of the training knowledge base;
S3: acquiring the text of the historical questions of all trainees, and preprocessing it to obtain a historical question list;
S4: adding each question in the historical question list to the training knowledge base based on the keyword weights, to obtain a complete training knowledge base;
S5: acquiring the text of the current question of the target trainee, and preprocessing it to obtain a word vector of the current question;
S6: based on the association between the word vector of the current question and the keywords of the training knowledge base, selecting the associated keywords and all associated questions from the complete training knowledge base according to a preset threshold, to obtain a candidate question list;
S7: adding the current question to the candidate question list and, for a preset number of iterations, selecting a preset number of questions from the candidate question list each time and filling them into a preset language template to obtain an input prompt, yielding an input prompt sequence with the preset number of elements; wherein step S7 comprises:
S71: acquiring a preset language template, which contains slots for n consecutive questions together with connectives;
S72: in each of the preset number of iterations, selecting n questions from the candidate question list;
S73: extracting the nodes corresponding to the n questions from the complete training knowledge base and obtaining the shortest path through those nodes; ordering the n questions from the start to the end of the path using a path-computation method;
S74: filling the n ordered questions into the n consecutive question slots in sequence to obtain an input prompt;
S75: repeating steps S72-S74 until the preset number of input prompts has been obtained, forming an input prompt sequence with the preset number of elements;
S8: calculating the likelihood of each input prompt based on the input prompt sequence and the training knowledge base, and obtaining the input prompt with the highest likelihood;
S9: feeding the input prompt with the highest likelihood into a pre-trained large model to obtain the model's output for the current question of the target trainee, and combining the keywords of the associated training knowledge base with that output to obtain customized training information.
Preferably, step S1 of acquiring training text data and preprocessing it to obtain a training knowledge base comprises:
S11: preprocessing the training text data to obtain key knowledge points, and identifying the labels of named entities as key concepts, where the preprocessing comprises word segmentation, stop-word removal, entity recognition and part-of-speech analysis;
S12: performing relation extraction on the key knowledge points and key concepts based on the training text data, to obtain key-concept relations;
S13: constructing a preliminary knowledge graph from the key knowledge points, the key concepts and the key-concept relations;
S14: adding labels and links to the key knowledge points in the preliminary knowledge graph based on the training text data, to obtain the training knowledge graph.
Preferably, step S2 of extracting keywords from the training text data, calculating the importance of each keyword to obtain its weight, and adding each keyword with its weight to the training knowledge base as the keywords of the training knowledge base comprises:
S21: preprocessing the training text data to obtain preprocessed training text data, where the preprocessing comprises word segmentation, stop-word removal and stemming;
S22: constructing a vocabulary from the preprocessed training text data, recording every word that appears and its frequency;
S23: selecting the words whose occurrence count exceeds a preset threshold, to obtain the keywords of the training knowledge base;
S24: constructing an inverted index from the preprocessed training text data;
S25: calculating the weight of each keyword according to formula (1), to obtain the weight of each keyword of the training knowledge base:
$w_k = \mathrm{tf}_k \cdot \log\frac{N}{n_k}$ (1)
where $w_k$ is the weight of keyword $k$, $\mathrm{tf}_k$ is the number of occurrences of keyword $k$ in the training text data, $n_k$ is the number of training documents containing keyword $k$, and $N$ is the total number of training documents;
S26: adding the weight of each keyword of the training knowledge base to the training knowledge base.
Preferably, step S3 of acquiring the text of the historical questions of all trainees and preprocessing it to obtain a historical question list comprises:
S31: preprocessing the text of the historical questions of all trainees to obtain a phrase set;
S32: inputting each sentence in the phrase set into a pre-trained phrase classifier, which divides all phrases into interrogative and non-interrogative sentences;
S33: obtaining the set of interrogative sentences and converting it into the historical question list.
Preferably, step S4 of adding each question in the historical question list to the training knowledge base to obtain the complete training knowledge base comprises:
S41: extracting the keywords of each phrase in the historical question list;
S42: based on the keywords of each phrase and their synonyms, establishing a match between the phrase and the nodes in the training knowledge base that carry the same keywords; if a phrase has several keywords, matching it to the node whose keyword has the highest weight;
S43: based on the matching relation, adding each phrase to the training knowledge base as a label, to obtain the complete training knowledge base.
Preferably, step S8 of calculating the likelihood of each input prompt based on the input prompt sequence and the training knowledge base and obtaining the input prompt with the highest likelihood comprises:
S81: extracting the attributes of each input prompt based on the training knowledge base, initializing the hyperparameters, and substituting them into formula (2) to obtain an initial predicted likelihood:
$P = \alpha F + \beta D + \gamma H + b$ (2)
where $P$ is the likelihood, $F$ is the sum of the word frequencies of the keywords associated with the input prompt, $D$ is the geometric length of the input prompt's shortest path in the training knowledge base, $H$ is the hierarchical depth of the nodes traversed by that shortest path, $b$ is a hyperparameter, and $\alpha$, $\beta$ and $\gamma$ are the weights of $F$, $D$ and $H$ respectively;
S82: solving the hyperparameters of formula (2) based on formula (3) and the initial predicted likelihoods:
$L = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2$ (3)
where $L$ is the loss function, $m$ is the number of samples, $\hat{y}_i$ is the predicted likelihood and $y_i$ is the true likelihood;
S83: obtaining the likelihood of each input prompt from formula (2) with the solved hyperparameters;
S84: sorting in descending order of likelihood to obtain the input prompt with the highest likelihood.
In a second aspect, an embodiment of the present application provides a customized training apparatus based on a large-model chain of thought, comprising:
a text data unit: configured to acquire training text data and preprocess it to obtain a training knowledge base;
a keyword unit: configured to extract keywords from the training text data, calculate the importance of each keyword to obtain its weight, and add each keyword with its weight to the training knowledge base as the keywords of the training knowledge base;
a historical question unit: configured to acquire the text of the historical questions of all trainees and preprocess it to obtain a historical question list;
a knowledge base unit: configured to add each question in the historical question list to the training knowledge base based on the keyword weights, to obtain the complete training knowledge base;
a current question unit: configured to acquire the text of the current question of the target trainee, preprocess it, and compute a word vector for the current question;
a candidate question unit: configured to select, based on the association between the word vector of the current question and the keywords of the training knowledge base, the associated keywords and all associated questions from the complete training knowledge base according to a preset threshold, to obtain a candidate question list;
an input prompt unit: configured to add the current question to the candidate question list and, for a preset number of iterations, select n questions from the candidate question list each time and fill them into a preset language template to obtain an input prompt, yielding an input prompt sequence with the preset number of elements; specifically:
S71: acquiring a preset language template, which contains slots for n consecutive questions together with connectives;
S72: in each of the preset number of iterations, selecting n questions from the candidate question list;
S73: extracting the nodes corresponding to the n questions from the complete training knowledge base and obtaining the shortest path through those nodes; ordering the n questions from the start to the end of the path using a path-computation method;
S74: filling the n ordered questions into the n consecutive question slots in sequence to obtain an input prompt;
S75: repeating steps S72-S74 until the preset number of input prompts has been obtained, forming an input prompt sequence with the preset number of elements;
a likelihood unit: configured to calculate the likelihood of each input prompt based on the input prompt sequence and the training knowledge base, and obtain the input prompt with the highest likelihood;
a training information unit: configured to feed the input prompt with the highest likelihood into the pre-trained large model to obtain the model's output for the current question of the target trainee, and combine the keywords of the associated training knowledge base with that output to obtain customized training information.
In a third aspect, an embodiment of the present application provides an electronic device, comprising: a housing, a processor, a memory, a circuit board and a power supply circuit, wherein the circuit board is arranged in the space enclosed by the housing, and the processor and the memory are arranged on the circuit board; the power supply circuit supplies power to each circuit or device of the electronic device; the memory stores executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform any of the foregoing methods.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing one or more programs executable by one or more processors to implement any of the foregoing methods.
Compared with the prior art, the technical scheme provided by the embodiments of the invention has at least the following beneficial effects: based on the large model, the amount of information provided to trainees exceeds the scope of traditional training materials, realizing the generation of customized training materials for each trainee; on this basis, the diversity of questioning levels caused by trainees' different backgrounds is fully taken into account, trainee questions are optimized, and the output quality of the large model is improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the invention more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a customized training method based on a large-model chain of thought provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of consecutive question slots provided by an embodiment of the invention;
FIG. 3 is a block diagram of a customized training device based on a large-model chain of thought provided by an embodiment of the invention;
FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the invention.
Detailed Description
In order to make the technical problems to be solved, the technical solutions and the advantages clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
The embodiment of the invention provides a customized training method based on a large-model chain of thought. The method may be implemented by an electronic device, which may be a terminal or a server. As shown in the flow chart of FIG. 1, the processing flow of the method may comprise the following steps:
S1: acquiring training text data, and preprocessing the training text data to obtain a training knowledge base;
Preferably, S1 comprises:
S11: preprocessing the training text data to obtain key knowledge points, and identifying the labels of named entities as key concepts, where the preprocessing comprises word segmentation, stop-word removal, entity recognition and part-of-speech analysis;
S12: performing relation extraction on the key knowledge points and key concepts based on the training text data, to obtain key-concept relations;
S13: constructing a preliminary knowledge graph from the key knowledge points, the key concepts and the key-concept relations;
S14: adding labels and links to the key knowledge points in the preliminary knowledge graph based on the training text data, to obtain the training knowledge graph.
In some embodiments, named entity recognition (NER) is a natural language processing technique that identifies named entities with specific meanings, such as person names, place names and organization names, from text. Its goal is to accurately determine entity boundaries in the text and classify the entities into predefined categories. NER plays a key role in many tasks, such as information extraction, question-answering systems and machine translation. A NER workflow comprises corpus collection, data preprocessing, feature engineering, model selection and training, feature extraction and recognition, post-processing, and evaluation. With a suitable machine learning model and feature representation, trained on a large amount of labeled data, accurate and robust named entity recognition can be achieved.
Key knowledge points are obtained by organizing the training text data into a content framework and then extracting from that framework; they may be nouns or short sentences defining a concept. Named entity recognition is used because the training data contains a large number of named entities, such as person names, place names and organization names; recognizing them enables automatic processing and analysis of the training data and the construction of a richer, more accurate knowledge graph.
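For illustration only, step S11 may be realized with an off-the-shelf NLP pipeline. The following is a minimal sketch assuming spaCy with its small Chinese pipeline; the pipeline name and token filters are illustrative assumptions, and any NER-capable pipeline for the corpus language would serve:

    # Illustrative sketch of S11: segmentation, stop-word removal and
    # named entity recognition via spaCy. "zh_core_web_sm" is an assumed
    # pipeline choice, not part of the claimed method.
    import spacy

    nlp = spacy.load("zh_core_web_sm")

    def extract_key_concepts(text):
        doc = nlp(text)
        # Named entities (person, place, organization, ...) become key concepts.
        key_concepts = [(ent.text, ent.label_) for ent in doc.ents]
        # Non-stop, non-punctuation tokens are candidate key knowledge points.
        candidates = [t.text for t in doc if not t.is_stop and not t.is_punct]
        return key_concepts, candidates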
S2: extracting keywords from the training text data, calculating the importance of each keyword to obtain its weight, and adding each keyword with its weight to the training knowledge base as the keywords of the training knowledge base;
Preferably, S2 comprises:
S21: preprocessing the training text data to obtain preprocessed training text data, where the preprocessing comprises word segmentation, stop-word removal and stemming;
S22: constructing a vocabulary from the preprocessed training text data, recording every word that appears and its frequency;
S23: selecting the words whose occurrence count exceeds a preset threshold, to obtain the keywords of the training knowledge base;
S24: constructing an inverted index from the preprocessed training text data;
S25: calculating the weight of each keyword according to formula (1), to obtain the weight of each keyword of the training knowledge base:
$w_k = \mathrm{tf}_k \cdot \log\frac{N}{n_k}$ (1)
where $w_k$ is the weight of keyword $k$, $\mathrm{tf}_k$ is the number of occurrences of keyword $k$ in the training text data, $n_k$ is the number of training documents containing keyword $k$, and $N$ is the total number of training documents;
S26: adding the weight of each keyword of the training knowledge base to the training knowledge base.
In some embodiments, the weights may also be calculated with Word2Vec. Word2Vec is a technique that maps words into a vector space that captures the semantic relationships between them; by computing similarities between word vectors, each word can be given a weight.
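As a sketch of this alternative using gensim (the corpus variable and the choice to average pairwise similarities are assumptions, not part of the claimed method):

    # Sketch: train Word2Vec on the preprocessed corpus and weight each
    # keyword by its mean similarity to the other keywords. The averaging
    # scheme is an illustrative assumption.
    from gensim.models import Word2Vec

    def word2vec_weights(tokenized_docs, keywords):
        model = Word2Vec(sentences=tokenized_docs, vector_size=100,
                         window=5, min_count=1, workers=4)
        weights = {}
        for kw in keywords:
            others = [k for k in keywords if k != kw and k in model.wv]
            if kw in model.wv and others:
                sims = [model.wv.similarity(kw, o) for o in others]
                weights[kw] = float(sum(sims) / len(sims))
            else:
                weights[kw] = 0.0
        return weights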
It should be noted that the inverted index is a data structure for fast document retrieval: it maps each word to the list of documents containing that word. This indexing approach effectively supports both full-text and keyword search. The inverted index can be used to calculate the weight of each word; one common approach is the TF-IDF (term frequency-inverse document frequency) algorithm, which evaluates a word's importance in a document collection from its term frequency and inverse document frequency. Term frequency (TF) is how often a word occurs in a document and can be computed as the word's occurrence count divided by the total number of words in the document. Inverse document frequency (IDF) reflects a word's importance across the whole collection and can be computed as the logarithm of the total number of documents divided by the number of documents containing the word. Multiplying TF by IDF yields each word's weight; such weights help determine which words in a document are more important and relevant.
It should further be noted that, for training materials, these weights may not accurately reflect importance within the training documents. For example, certain important keywords may appear infrequently yet have a significant impact on the training content, in which case the inverted index cannot evaluate their importance correctly.
Preferably, this shortcoming of the inverted index is corrected by giving higher weight to the words that appear in the content framework (schema).
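A minimal sketch of steps S22-S25 together with the schema correction just described (the occurrence threshold and the boost factor are illustrative assumptions):

    # Sketch of S22-S25: vocabulary, inverted index, and keyword weights in
    # the spirit of formula (1); schema terms receive a boost as suggested
    # above. Threshold and boost factor are illustrative assumptions.
    import math
    from collections import Counter, defaultdict

    def keyword_weights(docs, min_count=2, schema_terms=(), boost=2.0):
        # docs: list of token lists from the preprocessed training texts.
        tf = Counter(tok for doc in docs for tok in doc)          # S22 vocabulary
        keywords = {w for w, c in tf.items() if c > min_count}    # S23 threshold
        inverted = defaultdict(set)                               # S24 inverted index
        for i, doc in enumerate(docs):
            for tok in set(doc):
                inverted[tok].add(i)
        n_docs = len(docs)
        weights = {}
        for kw in keywords:                                       # S25, formula (1)
            idf = math.log(n_docs / len(inverted[kw]))
            weights[kw] = tf[kw] * idf * (boost if kw in schema_terms else 1.0)
        return weights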
S3: acquiring the text of the historical questions of all trainees, and preprocessing it to obtain a historical question list;
Preferably, S3 comprises:
S31: preprocessing the text of the historical questions of all trainees to obtain a phrase set;
S32: inputting each sentence in the phrase set into a pre-trained phrase classifier, which divides all phrases into interrogative and non-interrogative sentences;
S33: obtaining the set of interrogative sentences and converting it into the historical question list.
In some embodiments, the phrase classifier may be built with naive Bayes, a support vector machine, a deep learning model, or similar methods.
It should further be noted that interrogative and non-interrogative sentences generally exhibit different clustering characteristics in the word vector space. Interrogative sentences typically contain question words (e.g., "who", "what", "where"), which form distinct clusters with other words in the vector space because they relate to questioning; interrogative sentences also contain characteristic vocabulary and phrase patterns. Since the method mainly needs to identify interrogative sentences, the classifier's accuracy is evaluated chiefly as the true-positive rate on interrogative sentences. At the same time, the application needs to guarantee the diversity of the questions: if the questions are too concentrated, a large model can automatically generate additional questions, amended so that all knowledge points are covered.
Preferably, the historical question list ultimately covers all knowledge points. If the same knowledge point has many questions, the questions can be clustered by word vectors and then merged.
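For illustration, a sketch of the naive Bayes option named above using scikit-learn (the two labeled examples and the character n-gram features are placeholders; a real classifier would be trained on a labeled corpus of trainee sentences):

    # Sketch of S32: a phrase classifier separating interrogative from
    # non-interrogative sentences. Labeled examples are placeholders.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    labeled = [("what is gradient descent", 1),            # interrogative
               ("gradient descent converged quickly", 0)]  # non-interrogative
    texts, labels = zip(*labeled)

    classifier = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
        MultinomialNB())
    classifier.fit(texts, labels)

    def is_question(sentence):
        return bool(classifier.predict([sentence])[0])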
S4: adding each question in the historical question list to the training knowledge base based on the keyword weights, to obtain a complete training knowledge base;
Preferably, S4 comprises:
S41: extracting the keywords of each phrase in the historical question list;
S42: based on the keywords of each phrase and their synonyms, establishing a match between the phrase and the nodes in the training knowledge base that carry the same keywords; if a phrase has several keywords, matching it to the node whose keyword has the highest weight;
S43: based on the matching relation, adding each phrase to the training knowledge base as a label, to obtain the complete training knowledge base.
S5: acquiring the text of the current question of the target trainee, and preprocessing it to obtain a word vector of the current question;
S6: based on the association between the word vector of the current question and the keywords of the training knowledge base, selecting the associated keywords and all associated questions from the complete training knowledge base according to a preset threshold, to obtain a candidate question list;
In some embodiments, the association relation can be used to pull in all questions about knowledge points at different levels. In this way better coverage of the knowledge plane is achieved, resolving the problem that trainees pose questions at inconsistent levels.
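In some embodiments, the selection of step S6 can be sketched as follows (cosine similarity over pre-computed embeddings; the embedding source and the threshold value are illustrative assumptions):

    # Sketch of S6: keep knowledge-base keywords (and the questions attached
    # to them) whose vector is close enough to the current question's vector.
    import numpy as np

    def select_candidates(question_vec, keyword_vecs, questions_by_keyword,
                          threshold=0.6):
        # keyword_vecs: {keyword: vector}; questions_by_keyword: {keyword: [str]}
        related_keywords, candidates = [], []
        q = question_vec / np.linalg.norm(question_vec)
        for kw, vec in keyword_vecs.items():
            sim = float(q @ (vec / np.linalg.norm(vec)))  # cosine similarity
            if sim >= threshold:
                related_keywords.append(kw)
                candidates.extend(questions_by_keyword.get(kw, []))
        return related_keywords, candidates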
S7: adding the current question to the candidate question list and, for a preset number of iterations, selecting a preset number of questions from the candidate question list each time and filling them into a preset language template to obtain an input prompt, yielding an input prompt sequence with the preset number of elements;
Preferably, step S7 comprises:
S71: acquiring a preset language template which, as shown in FIG. 2, contains slots for n consecutive questions together with connectives;
S72: in each of the preset number of iterations, selecting n questions from the candidate question list;
S73: extracting the nodes corresponding to the n questions from the complete training knowledge base and obtaining the shortest path through those nodes; ordering the n questions from the start to the end of the path using a path-computation method;
S74: filling the n ordered questions into the n consecutive question slots in sequence to obtain an input prompt;
S75: repeating steps S72-S74 until the preset number of input prompts has been obtained, forming an input prompt sequence with the preset number of elements.
In some embodiments, the candidate questions can be regarded as a discrete space, and the search for the best prompt takes place in this discrete space. The application essentially proposes a local search method that uses the complete training knowledge base to compute distance measures between different questions; the input prompt sequence with a preset number of elements establishes the search space.
It should be noted that the application generates the n groups of input prompts by sampling from the candidate questions, so ensuring the diversity of the candidate questions is particularly important.
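For illustration, steps S73-S74 may be sketched over a NetworkX view of the knowledge base (the greedy nearest-neighbour ordering is a simple stand-in for the path-computation method, the connective string stands in for the language template, and a connected graph is assumed):

    # Sketch of S73-S74: order the n sampled questions along a path through
    # their knowledge-base nodes, then join them into the template slots.
    import networkx as nx

    def build_prompt(graph, node_of, questions, connective=", and then "):
        nodes = [node_of[q] for q in questions]
        dist = {(a, b): nx.shortest_path_length(graph, a, b)
                for a in nodes for b in nodes if a != b}
        # Start from the node farthest from the others, then repeatedly hop
        # to the nearest unvisited node (greedy stand-in for S73).
        start = max(nodes,
                    key=lambda n: sum(d for (a, _), d in dist.items() if a == n))
        order, remaining = [start], set(nodes) - {start}
        while remaining:
            nxt = min(remaining, key=lambda n: dist[(order[-1], n)])
            order.append(nxt)
            remaining.remove(nxt)
        ordered = sorted(questions, key=lambda q: order.index(node_of[q]))
        return connective.join(ordered)   # S74: fill the consecutive slots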
S8: calculating the likelihood of each input prompt based on the input prompt sequence and the training knowledge base, and obtaining the input prompt with the highest likelihood;
Preferably, step S8 comprises:
S81: extracting the attributes of each input prompt based on the training knowledge base, initializing the hyperparameters, and substituting them into formula (2) to obtain an initial predicted likelihood:
$P = \alpha F + \beta D + \gamma H + b$ (2)
where $P$ is the likelihood, $F$ is the sum of the word frequencies of the keywords associated with the input prompt, $D$ is the geometric length of the input prompt's shortest path in the training knowledge base, $H$ is the hierarchical depth of the nodes traversed by that shortest path, $b$ is a hyperparameter, and $\alpha$, $\beta$ and $\gamma$ are the weights of $F$, $D$ and $H$ respectively;
S82: solving the hyperparameters of formula (2) based on formula (3) and the initial predicted likelihoods:
$L = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2$ (3)
where $L$ is the loss function, $m$ is the number of samples, $\hat{y}_i$ is the predicted likelihood and $y_i$ is the true likelihood;
S83: obtaining the likelihood of each input prompt from formula (2) with the solved hyperparameters;
S84: sorting in descending order of likelihood to obtain the input prompt with the highest likelihood.
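Putting formulas (2) and (3) together, a minimal fitting and ranking sketch (ordinary least squares via NumPy; the linear form mirrors the reconstruction of formula (2) above, and the labeled training pairs are assumptions):

    # Sketch of S81-S84: fit the weights of P = aF + bD + cH + d by
    # minimizing the mean squared error of formula (3), then rank prompts.
    import numpy as np

    def fit_weights(features, true_likelihoods):
        # features: (m, 3) array of [F, D, H] per labeled prompt.
        X = np.hstack([features, np.ones((len(features), 1))])  # bias column
        params, *_ = np.linalg.lstsq(X, true_likelihoods, rcond=None)
        return params  # [alpha, beta, gamma, b]

    def best_prompt(prompts, features, params):
        X = np.hstack([features, np.ones((len(features), 1))])
        scores = X @ params               # formula (2) for every prompt
        top = int(np.argmax(scores))      # S84: highest likelihood
        return prompts[top], float(scores[top])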
S9: feeding the input prompt with the highest likelihood into the pre-trained large model to obtain the model's output for the current question of the target trainee, and combining the keywords of the associated training knowledge base with that output to obtain customized training information.
In some embodiments, the customized training information accounts both for the weak points in the trainee's knowledge and for the correlations between knowledge points. The information provided can therefore answer the trainee's question while letting the trainee fully understand the knowledge system behind the answer.
It should be noted that large models often suffer from data bias and knowledge inaccuracy. The large model used for training material generation may therefore be trained on text content associated with the training field, such as related books, published literature and wikis.
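Finally, a sketch of the assembly in step S9 (the generate callable is a hypothetical stand-in for whatever pre-trained large model is deployed; the output format is illustrative):

    # Sketch of S9: query the large model with the best prompt and combine
    # the reply with the associated knowledge-base keywords.
    def customized_training_info(best_prompt, related_keywords, generate):
        answer = generate(best_prompt)    # hypothetical LLM call
        note = "Related knowledge points: " + ", ".join(related_keywords)
        return answer + "\n\n" + note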
The foregoing describes the method embodiments; the device embodiments are described below.
As shown in FIG. 3, an embodiment of the present application provides a customized training apparatus 300 based on a large-model chain of thought, comprising:
a text data unit 310: configured to acquire training text data and preprocess it to obtain a training knowledge base;
a keyword unit 320: configured to extract keywords from the training text data, calculate the importance of each keyword to obtain its weight, and add each keyword with its weight to the training knowledge base as the keywords of the training knowledge base;
a historical question unit 330: configured to acquire the text of the historical questions of all trainees and preprocess it to obtain a historical question list;
a knowledge base unit 340: configured to add each question in the historical question list to the training knowledge base based on the keyword weights, to obtain the complete training knowledge base;
a current question unit 350: configured to acquire the text of the current question of the target trainee, preprocess it, and compute a word vector for the current question;
a candidate question unit 360: configured to select, based on the association between the word vector of the current question and the keywords of the training knowledge base, the associated keywords and all associated questions from the complete training knowledge base according to a preset threshold, to obtain a candidate question list;
an input prompt unit 370: configured to add the current question to the candidate question list and, for a preset number of iterations, select n questions from the candidate question list each time and fill them into a preset language template to obtain an input prompt, yielding an input prompt sequence with the preset number of elements; specifically:
S71: acquiring a preset language template, which contains slots for n consecutive questions together with connectives;
S72: in each of the preset number of iterations, selecting n questions from the candidate question list;
S73: extracting the nodes corresponding to the n questions from the complete training knowledge base and obtaining the shortest path through those nodes; ordering the n questions from the start to the end of the path using a path-computation method;
S74: filling the n ordered questions into the n consecutive question slots in sequence to obtain an input prompt;
S75: repeating steps S72-S74 until the preset number of input prompts has been obtained, forming an input prompt sequence with the preset number of elements;
a likelihood unit 380: configured to calculate the likelihood of each input prompt based on the input prompt sequence and the training knowledge base, and obtain the input prompt with the highest likelihood;
a training information unit 390: configured to feed the input prompt with the highest likelihood into the pre-trained large model to obtain the model's output for the current question of the target trainee, and combine the keywords of the associated training knowledge base with that output to obtain customized training information.
An embodiment of the invention further provides an electronic device for the customized training method based on a large-model chain of thought, comprising: a housing, a processor, a memory, a circuit board and a power supply circuit, wherein the circuit board is arranged in the space enclosed by the housing, and the processor and the memory are arranged on the circuit board; the power supply circuit supplies power to each circuit or device of the electronic device; the memory stores executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform any of the foregoing methods.
An embodiment of the invention further provides a computer-readable storage medium for the customized training method based on a large-model chain of thought, the computer-readable storage medium storing one or more programs executable by one or more processors to implement any of the foregoing methods.
Aimed at the lack of professional and customization capability in existing online training question-answering systems, the customized training method based on a large-model chain of thought can support the generation of customized training materials for trainees; it is not affected by trainee background, can accurately identify the weak points in a trainee's knowledge, and provides high-quality customized training information that effectively supplements traditional training materials.
Fig. 4 is a schematic structural diagram of an electronic device 400 provided by an embodiment of the invention. The electronic device 400 may vary considerably in configuration and performance, and may include one or more processors (central processing units, CPU) 401 and one or more memories 402, where the memory 402 stores at least one instruction that is loaded and executed by the processor 401 to implement the steps of the foregoing customized training method based on a large-model chain of thought.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory including instructions executable by a processor in a terminal to perform the customized training method described above. For example, the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The following points need to be described:
(1) The drawings of the embodiments of the present invention relate only to the structures related to the embodiments of the present invention, and other structures may refer to the general designs.
(2) In the drawings for describing embodiments of the present invention, the thickness of layers or regions is exaggerated or reduced for clarity, i.e., the drawings are not drawn to actual scale. It will be understood that when an element such as a layer, film, region or substrate is referred to as being "on" or "under" another element, it can be "directly on" or "under" the other element or intervening elements may be present.
(3) The embodiments of the invention and the features of the embodiments can be combined with each other to give new embodiments without conflict.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims (9)

1. A customized training method based on a large-model chain of thought, characterized by comprising the following steps:
S1: acquiring training text data, and preprocessing the training text data to obtain a training knowledge base;
S2: extracting keywords from the training text data, calculating the importance of each keyword to obtain its weight, and adding each keyword with its weight to the training knowledge base as the keywords of the training knowledge base;
S3: acquiring the text of the historical questions of all trainees, and preprocessing it to obtain a historical question list;
S4: adding each question in the historical question list to the training knowledge base based on the keyword weights, to obtain a complete training knowledge base;
S5: acquiring the text of the current question of the target trainee, and preprocessing it to obtain a word vector of the current question;
S6: based on the association between the word vector of the current question and the keywords of the training knowledge base, selecting the associated keywords and all associated questions from the complete training knowledge base according to a preset threshold, to obtain a candidate question list;
S7: adding the current question to the candidate question list and, for a preset number of iterations, selecting a preset number of questions from the candidate question list each time and filling them into a preset language template to obtain an input prompt, yielding an input prompt sequence with the preset number of elements; wherein step S7 comprises:
S71: acquiring a preset language template, which contains slots for n consecutive questions together with connectives;
S72: in each of the preset number of iterations, selecting n questions from the candidate question list;
S73: extracting the nodes corresponding to the n questions from the complete training knowledge base and obtaining the shortest path through those nodes; ordering the n questions from the start to the end of the path using a path-computation method;
S74: filling the n ordered questions into the n consecutive question slots in sequence to obtain an input prompt;
S75: repeating steps S72-S74 until the preset number of input prompts has been obtained, forming an input prompt sequence with the preset number of elements;
S8: calculating the likelihood of each input prompt based on the input prompt sequence and the training knowledge base, and obtaining the input prompt with the highest likelihood;
S9: feeding the input prompt with the highest likelihood into a pre-trained large model to obtain the model's output for the current question of the target trainee, and combining the keywords of the associated training knowledge base with that output to obtain customized training information.
2. The customized training method based on a large-model chain of thought according to claim 1, characterized in that step S1 of acquiring training text data and preprocessing it to obtain a training knowledge base comprises:
S11: preprocessing the training text data to obtain key knowledge points, and identifying the labels of named entities as key concepts, where the preprocessing comprises word segmentation, stop-word removal, entity recognition and part-of-speech analysis;
S12: performing relation extraction on the key knowledge points and key concepts based on the training text data, to obtain key-concept relations;
S13: constructing a preliminary knowledge graph from the key knowledge points, the key concepts and the key-concept relations;
S14: adding labels and links to the key knowledge points in the preliminary knowledge graph based on the training text data, to obtain the training knowledge graph.
3. The customized training method based on a large-model chain of thought according to claim 1, characterized in that step S2 of extracting keywords from the training text data, calculating the importance of each keyword to obtain its weight, and adding each keyword with its weight to the training knowledge base as the keywords of the training knowledge base comprises:
S21: preprocessing the training text data to obtain preprocessed training text data, where the preprocessing comprises word segmentation, stop-word removal and stemming;
S22: constructing a vocabulary from the preprocessed training text data, recording every word that appears and its frequency;
S23: selecting the words whose occurrence count exceeds a preset threshold, to obtain the keywords of the training knowledge base;
S24: constructing an inverted index from the preprocessed training text data;
S25: calculating the weight of each keyword according to formula (1), to obtain the weight of each keyword of the training knowledge base:
$w_k = \mathrm{tf}_k \cdot \log\frac{N}{n_k}$ (1)
where $w_k$ is the weight of keyword $k$, $\mathrm{tf}_k$ is the number of occurrences of keyword $k$ in the training text data, $n_k$ is the number of training documents containing keyword $k$, and $N$ is the total number of training documents;
S26: adding the weight of each keyword of the training knowledge base to the training knowledge base.
4. The customized training method based on a large-model chain of thought according to claim 1, characterized in that step S3 of acquiring the text of the historical questions of all trainees and preprocessing it to obtain a historical question list comprises:
S31: preprocessing the text of the historical questions of all trainees to obtain a phrase set;
S32: inputting each sentence in the phrase set into a pre-trained phrase classifier, which divides all phrases into interrogative and non-interrogative sentences;
S33: obtaining the set of interrogative sentences and converting it into the historical question list.
5. The customized training method based on a large-model chain of thought according to claim 1, characterized in that step S4 of adding each question in the historical question list to the training knowledge base to obtain the complete training knowledge base comprises:
S41: extracting the keywords of each phrase in the historical question list;
S42: based on the keywords of each phrase and their synonyms, establishing a match between the phrase and the nodes in the training knowledge base that carry the same keywords; if a phrase has several keywords, matching it to the node whose keyword has the highest weight;
S43: based on the matching relation, adding each phrase to the training knowledge base as a label, to obtain the complete training knowledge base.
6. The customized training method based on a large-model chain of thought according to claim 1, characterized in that step S8 of calculating the likelihood of each input prompt based on the input prompt sequence and the training knowledge base and obtaining the input prompt with the highest likelihood comprises:
S81: extracting the attributes of each input prompt based on the training knowledge base, initializing the hyperparameters, and substituting them into formula (2) to obtain an initial predicted likelihood:
$P = \alpha F + \beta D + \gamma H + b$ (2)
where $P$ is the likelihood, $F$ is the sum of the word frequencies of the keywords associated with the input prompt, $D$ is the geometric length of the input prompt's shortest path in the training knowledge base, $H$ is the hierarchical depth of the nodes traversed by that shortest path, $b$ is a hyperparameter, and $\alpha$, $\beta$ and $\gamma$ are the weights of $F$, $D$ and $H$ respectively;
S82: solving the hyperparameters of formula (2) based on formula (3) and the initial predicted likelihoods:
$L = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2$ (3)
where $L$ is the loss function, $m$ is the number of samples, $\hat{y}_i$ is the predicted likelihood and $y_i$ is the true likelihood;
S83: obtaining the likelihood of each input prompt from formula (2) with the solved hyperparameters;
S84: sorting in descending order of likelihood to obtain the input prompt with the highest likelihood.
7. A custom training device based on a large model thought chain, characterized in that the device is adapted for use in the method of any of the preceding claims 1-6, the device comprising:
a text data unit: used for acquiring training text data and preprocessing the training text data to obtain a training knowledge base;
a keyword unit: used for extracting keywords from the training text data, calculating the importance of each keyword to obtain the weight of each keyword, and adding each keyword and its corresponding weight into the training knowledge base as the keywords of the training knowledge base;
a history question unit: used for acquiring the text information of the historical questions of all students and preprocessing it to obtain a history question list;
a knowledge base unit: used for adding each question in the history question list to the training knowledge base based on the keyword weights, to obtain a complete training knowledge base;
a current question unit: used for acquiring the text information of the current question of the target student, preprocessing it, and performing word-vector processing on the current question;
an alternative question unit: used for selecting, from the complete training knowledge base according to a preset threshold, the associated keywords of the training knowledge base and all associated questions, based on the association between the word vector of the current question and the keywords of the training knowledge base, to obtain an alternative question list;
an input prompting unit: used for adding the current question to the alternative question list and, a preset number of times, selecting n questions at a time from the alternative question list to fill a preset language template and obtain an input prompt, so as to obtain an input prompt sequence whose number of elements equals the preset number (see the sketch after this claim); this specifically comprises:
S71: acquiring a preset language template, the preset language template comprising n question slots and their connecting words;
S72: selecting n questions from the alternative question list, once for each of the preset number of times;
S73: extracting the nodes corresponding to the n questions based on the complete training knowledge base, acquiring the shortest path through those nodes, and arranging the n questions from the starting point to the end point of that path using a path calculation method;
S74: filling the n arranged questions, in order, into the n question slots to obtain an input prompt;
S75: repeating steps S72 to S74 until the preset number of input prompts is obtained, i.e. the input prompt sequence whose number of elements equals the preset number;
a likelihood unit: used for calculating the likelihood of each input prompt based on the input prompt sequence and the training knowledge base, and obtaining the input prompt with the highest likelihood;
a training information unit: used for inputting the input prompt with the highest likelihood into the pre-trained large model to obtain the large-model output for the current question of the target student, and combining the associated keywords of the training knowledge base with that output to obtain the customized training information.
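To make the input prompting unit concrete, the sketch below implements S71-S75 for n = 3, modeling the complete training knowledge base as a networkx graph; the question-to-node mapping, the connecting-word template, and the choice of the two most distant nodes as path endpoints are all hypothetical.

```python
# Sketch of S71-S75: order n questions along the shortest knowledge-base
# path between their outermost nodes, then fill the template slots.
import itertools
import networkx as nx

# Hypothetical knowledge-base graph.
kb = nx.Graph()
kb.add_edges_from([("algebra", "equations"), ("equations", "derivatives"),
                   ("derivatives", "integrals")])

# Hypothetical mapping from alternative-list questions to their nodes.
question_node = {
    "How do I isolate x?": "algebra",
    "What is a derivative?": "derivatives",
    "How do I solve 2x + 1 = 5?": "equations",
}

# S71: a preset template with n = 3 question slots and connecting words.
TEMPLATE = "First consider: {q1} Then: {q2} Finally: {q3}"

def build_prompt(questions):
    """S73-S74: arrange the questions from the start to the end of the
    shortest path through their nodes, then fill the slots in order."""
    nodes = {question_node[q] for q in questions}
    # Take the most distant node pair as the path's start and end point.
    start, end = max(itertools.combinations(nodes, 2),
                     key=lambda pair: nx.shortest_path_length(kb, *pair))
    path = nx.shortest_path(kb, start, end)
    ordered = sorted(questions, key=lambda q: path.index(question_node[q]))
    return TEMPLATE.format(q1=ordered[0], q2=ordered[1], q3=ordered[2])

# S72/S75 would repeat this for each selection of n questions.
print(build_prompt(list(question_node)))
```

For other values of n the template and slot filling would be generated accordingly; this toy example assumes all question nodes lie on the chosen path.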
8. An electronic device, comprising: a housing, a processor, a memory, a circuit board and a power supply circuit, wherein the circuit board is arranged in a space enclosed by the housing, and the processor and the memory are arranged on the circuit board; the power supply circuit is used for supplying power to each circuit or device of the electronic device; the memory is used for storing executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the method of any one of the preceding claims 1 to 6.
9. A computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the method of any one of the preceding claims 1-6.
CN202311414847.XA 2023-10-30 2023-10-30 Customization training method and device based on large model thinking chain Active CN117149984B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311414847.XA CN117149984B (en) 2023-10-30 2023-10-30 Customization training method and device based on large model thinking chain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311414847.XA CN117149984B (en) 2023-10-30 2023-10-30 Customization training method and device based on large model thinking chain

Publications (2)

Publication Number Publication Date
CN117149984A CN117149984A (en) 2023-12-01
CN117149984B true CN117149984B (en) 2023-12-29

Family

ID=88901102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311414847.XA Active CN117149984B (en) 2023-10-30 2023-10-30 Customization training method and device based on large model thinking chain

Country Status (1)

Country Link
CN (1) CN117149984B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117453898B (en) * 2023-12-25 2024-03-22 中国科学院自动化研究所 Cross-modal question-answering processing method and device based on thinking chain
CN117493531B (en) * 2023-12-29 2024-04-09 深圳星网信通科技股份有限公司 Training material generation method, training material generation equipment and storage medium
CN117786099B (en) * 2024-02-27 2024-04-26 中建安装集团有限公司 Engineering technical data informatization management system and method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116739110A (en) * 2023-06-21 2023-09-12 山东慧智博视数字科技有限公司 Large language model distillation method based on thinking chain
CN116824933A (en) * 2023-05-31 2023-09-29 上海深至信息科技有限公司 Medical training system based on large language model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230244938A1 (en) * 2022-02-02 2023-08-03 Google Llc Using Chains of Thought to Prompt Machine-Learned Models Pre-Trained on Diversified Objectives

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116824933A (en) * 2023-05-31 2023-09-29 上海深至信息科技有限公司 Medical training system based on large language model
CN116739110A (en) * 2023-06-21 2023-09-12 山东慧智博视数字科技有限公司 Large language model distillation method based on thinking chain

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Testing the concurrent behavior of systems based on CPN; 李华; 孙涛; 王显荣; 邢熠; 李颖杰; 夏兴行; Computer Science (计算机科学), No. 01; pp. 218-225 *

Also Published As

Publication number Publication date
CN117149984A (en) 2023-12-01

Similar Documents

Publication Publication Date Title
CN110427463B (en) Search statement response method and device, server and storage medium
CN109885672B (en) Question-answering type intelligent retrieval system and method for online education
CN107329949B (en) Semantic matching method and system
CN117149984B (en) Customization training method and device based on large model thinking chain
CN110737758A (en) Method and apparatus for generating a model
CN110457708B (en) Vocabulary mining method and device based on artificial intelligence, server and storage medium
Haller et al. Survey on automated short answer grading with deep learning: from word embeddings to transformers
CN116166782A (en) Intelligent question-answering method based on deep learning
CN111666376B (en) Answer generation method and device based on paragraph boundary scan prediction and word shift distance cluster matching
CN110678882A (en) Selecting answer spans from electronic documents using machine learning
CN114218379B (en) Attribution method for question answering incapacity of intelligent question answering system
CN110795544B (en) Content searching method, device, equipment and storage medium
CN112328800A (en) System and method for automatically generating programming specification question answers
US12008473B2 (en) Augmenting machine learning language models using search engine results
CN113221530A (en) Text similarity matching method and device based on circle loss, computer equipment and storage medium
Dumal et al. Adaptive and automated online assessment evaluation system
CN111666374A (en) Method for integrating additional knowledge information into deep language model
Alshammari et al. TAQS: an Arabic question similarity system using transfer learning of BERT with BILSTM
Sukkarieh et al. Auto-marking 2: An update on the UCLES-Oxford University research into using computational linguistics to score short, free text responses
CN113705207A (en) Grammar error recognition method and device
CN117370190A (en) Test case generation method and device, electronic equipment and storage medium
CN111858860B (en) Search information processing method and system, server and computer readable medium
CN117034135A (en) API recommendation method based on prompt learning and double information source fusion
Sen et al. Support-BERT: predicting quality of question-answer pairs in MSDN using deep bidirectional transformer
Lee Natural Language Processing: A Textbook with Python Implementation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant