WO2023279631A1 - Speech manuscript evaluation method and device - Google Patents

Speech manuscript evaluation method and device

Info

Publication number
WO2023279631A1
WO2023279631A1 (PCT/CN2021/133041, CN2021133041W)
Authority
WO
WIPO (PCT)
Prior art keywords
speech
preset
subsections
data
neural network
Prior art date
Application number
PCT/CN2021/133041
Other languages
French (fr)
Chinese (zh)
Inventor
张�林
王晔
李东朔
Original Assignee
北京优幕科技有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京优幕科技有限责任公司 filed Critical 北京优幕科技有限责任公司
Publication of WO2023279631A1 publication Critical patent/WO2023279631A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Definitions

  • The embodiments of the present invention relate to the technical field of information processing, and in particular to a method and device for evaluating speech manuscripts.
  • A speech is a way of speaking and communicating in a special setting, and professional speech training requires repeated practice. Scoring and evaluating in time during training to find deficiencies can speed up progress. Speech scoring covers facial expressions and posture, speech speed, and speech content.
  • The most common scoring method for speech content is to build a regression model: a massive data set is established to collect speech content across different score ranges, features are designed manually or extracted automatically by machine, the contribution of each feature to the score is computed, and effective features are extracted to establish the relationship between features and scores.
  • Training the regression model extracts features from the speech manuscript data set, establishes the relationship between features and scores, and stores it in the form of a weight matrix.
  • However, this method relies on a large amount of data, and the samples must cover every score range, topic, and so on; otherwise the scoring results become randomly distributed, harming the validity and fairness of the scoring.
  • In practice, only a few excellent samples exist at the start of speech training, and low- and medium-score samples are extremely scarce. The same is true of other speech data sets among open data resources, which retain only the best speeches and therefore cannot be learned from directly through transfer learning.
  • The embodiments of the present invention provide a speech manuscript evaluation method and device, whose main purpose is to solve the problems of traditional evaluation methods: large required sample data, few sample types, and poor validity and fairness of evaluation results.
  • the embodiments of the present invention mainly provide the following technical solutions:
  • An embodiment of the present invention provides a speech manuscript evaluation method, the method comprising: acquiring a plurality of speech manuscripts; dividing each manuscript into several subsections; and using a neural network model to identify all the subsections and a plurality of different preset questions.
  • The evaluation result of each speech manuscript is determined according to the ranking information of all the subsections for each of the preset questions.
  • Before using the neural network model to identify all the subsections and the multiple different preset questions, the method further includes:
  • Training the neural network model with the plurality of training data: the network outputs ranking information from a plurality of sample answers and a preset question, and the model parameters are optimized according to the difference between the output ranking information and the ranking information in the training data.
  • obtaining multiple training data specifically includes:
  • Contexts related to the preset question and the corresponding answer contents are crawled from several specified webpages; the ranking information is obtained according to the ranking of each answer content within the webpages.
  • Determining the evaluation result of each speech manuscript according to the ranking information of all the subsections for each preset question specifically includes:
  • obtaining, for each preset question, the highest rank achieved by the subsections belonging to the same speech manuscript; and
  • obtaining the evaluation result for that speech manuscript according to its highest ranks.
  • the ranking information corresponds to preset scores; the evaluation result is a score obtained according to each preset score.
  • The speech manuscript is text data obtained by performing speech recognition on a recording of the speech, and the lengths of pauses are recorded during speech recognition; in the step of dividing each manuscript into several subsections, the manuscript is divided according to its semantics and the pause lengths in the recorded speech.
  • the ranking information includes empty ranking and/or tied ranking information.
  • While the neural network model extracts feature data from the input data, an attention mechanism is used to process the text data from the subsections and the text data from the preset question, and the ranking information is output based on the resulting feature data.
  • An embodiment of the present invention provides a speech manuscript evaluation device, which includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor performs the above speech evaluation method.
  • an embodiment of the present invention provides a computer program product containing instructions, which, when run on a computer, cause the computer to execute the above method for evaluating a speech.
  • The task of evaluating an entire speech manuscript is converted into evaluating the recognition of question-answer pairs, so there is no need to provide a large number of speech manuscripts of varying quality as learning samples for the neural network. It suffices to preset questions related to the topic of the speech and prepare corresponding answers with different degrees of recognition; the neural network model can then be trained and used to evaluate multiple speech manuscripts. This solves the difficulty of evaluating speech manuscripts caused by the lack of samples in the prior art, and the scheme has high accuracy.
  • Fig. 1 shows a flowchart of a speech manuscript evaluation method in an embodiment of the present invention;
  • Fig. 2 shows a flowchart of another speech evaluation method in an embodiment of the present invention;
  • Fig. 3 shows a schematic diagram of the per-paragraph evaluation results obtained in an embodiment of the present invention;
  • Fig. 4 shows a schematic diagram of the working process of the neural network model in an embodiment of the present invention.
  • The present invention provides a speech manuscript evaluation method, which can be executed by electronic devices such as computers and servers. As shown in Figure 1, the method includes the following steps:
  • A subsection is a passage expressing a particular argument; it may be one natural paragraph or several. For example, if multiple natural paragraphs all discuss the advantages of a product, they should be treated as one subsection.
  • C11...C1n represent the subsections of the first speech
  • C21...C2n represent the subsections of the second speech
  • Cn1...Cnn represent the n subsections of the nth speech.
  • The manuscript may be segmented in ways including but not limited to the following: using existing semantic recognition technology, the whole manuscript is divided into several subsections according to the semantics of the text; alternatively, subsections may be formed from the speech recognition results by combining pauses and semantics. The specific method is not limited.
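The pause-based variant above can be sketched as follows. This is a minimal illustration only: the input format (sentences paired with the pause length recorded after each one) and the 1.5-second threshold are assumptions, since the text does not fix a concrete algorithm.

```python
# Minimal sketch: merge recognized sentences into subsections using the
# pause length recorded after each sentence. The input format and the
# 1.5 s threshold are illustrative assumptions, not from the patent.

def segment_by_pauses(sentences, pause_threshold=1.5):
    """sentences: list of (text, pause_after_seconds) tuples."""
    subsections, current = [], []
    for text, pause in sentences:
        current.append(text)
        if pause >= pause_threshold:   # a long pause closes a subsection
            subsections.append(" ".join(current))
            current = []
    if current:                        # flush the trailing subsection
        subsections.append(" ".join(current))
    return subsections

recognized = [
    ("Our product saves time.", 0.4),
    ("It also reduces cost.", 2.0),    # long pause: subsection boundary
    ("Now let me tell a story.", 0.3),
    ("It happened last year.", 0.0),
]
print(segment_by_pauses(recognized))
```

In practice the pause signal would be combined with semantic similarity between adjacent sentences, as the text suggests, rather than used alone.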
  • Recognition, which can also be interpreted as popularity, is learned by the neural network from the training data. For example, when training the network, questions and answers can be prepared manually and the ranking of the answers to each question (that is, the recognition/popularity of each answer) can be assigned manually; alternatively, data can be migrated from other question-answer databases as training samples.
  • Fig. 4 shows a schematic diagram of the working process of the neural network.
  • the trained neural network is used to identify the segmented sections and output sorting information.
  • The preset question and the segmented subsections are used as the network input. For example, question 1 + C11…Cnn is used as input, and the ranking information of C11…Cnn for preset question 1 is output; similarly, question 2 + C11…Cnn is used as input, and the ranking information of C11…Cnn for preset question 2 is output.
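The pairing loop above can be sketched as below. The `model_rank` function is a hypothetical stand-in for the trained neural network (here it just ranks by word overlap with the question); only the shape of the loop, each question paired in turn with all subsections, reflects the text.

```python
# Sketch of the inference loop: each preset question is paired in turn
# with all subsections C11..Cnn and fed to the model, which returns
# ranking information for that question. `model_rank` is a hypothetical
# stand-in for the trained network, not the patent's actual model.

def model_rank(question, subsections):
    # Stand-in: rank subsections by word overlap with the question.
    q_words = set(question.lower().split())
    overlap = lambda s: len(q_words & set(s.lower().split()))
    order = sorted(subsections, key=overlap, reverse=True)
    return {sub: i + 1 for i, sub in enumerate(order)}

subsections = ["the product saves cost", "a story about last year"]
questions = ["What are the advantages of the product?"]

# One ranking per preset question, over all subsections.
rankings = {q: model_rank(q, subsections) for q in questions}
print(rankings)
```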
  • The preset questions are set according to the content of the speech; specifically, questions related to the theme of the speech can be set as preset questions. The higher a subsection ranks, the higher its recognition/popularity as an answer to the preset question.
  • the evaluation result is a score
  • the corresponding relationship between the ranking and the score needs to be set, for example, the first ranking corresponds to 10 points, the second ranking corresponds to 8 points, and so on.
  • Paragraphs C11…C1n each have a rank for these two questions, and the highest rank is taken here. For example, C11 ranks second (the highest) for question 1, so its score is 8; C14 ranks third (the highest) for question 2, with a score of 6.
  • The method of calculating the total score over the two questions of the speech may include but is not limited to the following: direct addition, weighted addition, weighted averaging, etc.
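The scoring step, take the best rank per question among a manuscript's subsections, map it to a preset score, and combine, can be sketched as below. The rank-to-score table and the example weights are illustrative assumptions; the text only fixes the pattern (rank 1 = 10 points, rank 2 = 8 points, and so on).

```python
# Sketch of the scoring step: for each preset question, take the best
# (lowest-numbered) rank achieved by any subsection of the manuscript,
# map it to a preset score, and combine the per-question scores.
# The rank->score table and the weights are illustrative assumptions.

RANK_SCORES = {1: 10, 2: 8, 3: 6}      # preset score for each rank

def evaluate(ranks_per_question, weights=None):
    """ranks_per_question: {question: {subsection: rank}}"""
    total = 0.0
    for q, sub_ranks in ranks_per_question.items():
        best = min(sub_ranks.values())          # highest rank achieved
        score = RANK_SCORES.get(best, 0)
        total += (weights or {}).get(q, 1.0) * score
    return total

ranks = {
    "q1": {"C11": 2, "C12": 5},   # best rank 2 -> 8 points
    "q2": {"C11": 4, "C14": 3},   # best rank 3 -> 6 points
}
print(evaluate(ranks))                                   # direct addition
print(evaluate(ranks, weights={"q1": 1.2, "q2": 0.8}))   # weighted addition
```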
  • The evaluation result can also be a classification result; for example, categories such as "excellent", "good", "medium", and "poor" can be preset, and the ranking information output by the neural network is classified to obtain the category the speech belongs to.
  • the evaluation task for the entire speech can be converted into the evaluation of the degree of acceptance of questions and answers, and there is no need to provide a large number of speeches with different qualities as learning samples for the neural network.
  • The embodiment of the present invention also provides a speech evaluation method; as shown in Figure 2, the method comprises:
  • a set of training data includes a preset question and corresponding multiple candidate answers.
  • The preset question is "What basic facts are described below?"
  • There are 40 corresponding candidate answers, and the labels are the rankings of the 40 answers, given based on people's subjective preferences. The higher the rank, the higher the quality of the answer, which can be interpreted as higher recognition and higher popularity: people prefer the sample answers ranked higher.
  • The contexts related to the preset question and the corresponding answer contents can be crawled from several specified webpages. It should be noted that when answer contents are duplicated, the sample answers must be obtained after merging; the ranking information is obtained according to the ranking of each answer content within the webpages.
  • For the preset questions of a speech on a certain topic, matching questions can be found on the Internet (such as in a question-and-answer system) along with their answers. One answer is selected by the questioner as the best answer, and the other answers are ranked after it, for example according to the interaction between the questioner and the answerers, such as popularity. This ranking can be used directly as the label of the training data.
  • The number of ranks need not equal the number of candidate answers; for example, there may be 40 candidate answers but only 10 ranks.
  • Ranks can be tied or empty. For example, there may be no answer at rank one or rank two, while two or three answers tie at rank three; therefore the ranking information includes empty ranks and/or tied ranks.
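One way to hold such labels, an assumed representation, since the patent does not specify one, is a mapping from each answer to its rank number; empty ranks then simply have no entries, and tied ranks have several:

```python
# Sketch: rank labels in which ranks may be empty or tied, as the text
# allows. Here no answer holds ranks 1, 2 or 5, and two answers tie at 3.
# The answer->rank mapping is an illustrative assumption.
from collections import defaultdict

labels = {"ans_a": 3, "ans_b": 3, "ans_c": 4, "ans_d": 6}

by_rank = defaultdict(list)
for ans, rank in labels.items():
    by_rank[rank].append(ans)

tied  = [r for r, answers in by_rank.items() if len(answers) > 1]
empty = [r for r in range(1, max(labels.values()) + 1) if r not in by_rank]
print(tied)    # tied ranks
print(empty)   # empty ranks
```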
  • the preset questions are set according to the content of the speech, and each assessment corresponds to a speech on the same topic.
  • Some questions can be set according to the theme of the speech, for example: "What basic facts are described below?", "What are the advantages compared with other products?", "What benefits can users get?", "What did the protagonist do?", "What advanced deeds does the protagonist have?", and so on.
  • The above example questions and candidate answers are illustrative only and are not intended to be limiting.
  • The neural network model is trained with the plurality of training data: the network outputs ranking information from multiple sample answers and a preset question, and the model parameters are optimized according to the difference between the output ranking information and the ranking information in the training data.
  • A two-layer feed-forward neural network is selected; the input is a preset question and several candidate answers, and the output is the ranking labels of those candidate answers.
  • The above multiple training data are used to train the network, and the difference between the ranking the network outputs and the label determines the loss, which is used to optimize the network parameters.
  • A ∈ R^{m×d} and B ∈ R^{1×m} are the optimized weight-matrix parameters;
  • b1 ∈ R^m and b2 ∈ R are linear bias vectors.
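A NumPy sketch of a two-layer feed-forward scorer with exactly the parameter shapes named above. The d-dimensional feature vector x for a question-answer pair, the ReLU nonlinearity, and the random toy parameters are assumptions, since the text does not spell out the featurization or activation.

```python
import numpy as np

# Two-layer feed-forward scorer with the shapes named in the text:
# A in R^{m x d}, b1 in R^m, B in R^{1 x m}, b2 in R.
# The d-dim feature vector x and the ReLU activation are assumptions.

rng = np.random.default_rng(0)
d, m = 8, 4
A  = rng.standard_normal((m, d))
b1 = rng.standard_normal(m)
B  = rng.standard_normal((1, m))
b2 = rng.standard_normal(1)

def score(x):
    h = np.maximum(0.0, A @ x + b1)   # hidden layer, in R^m
    return float(B @ h + b2)          # scalar relevance score

# Score each candidate answer's feature vector and rank by score.
candidates = rng.standard_normal((3, d))
scores = [score(x) for x in candidates]
ranking = sorted(range(3), key=lambda i: scores[i], reverse=True)
print(ranking)
```

During training, A, b1, B, b2 would be optimized against the difference between the output ranking and the labeled ranking, as the text describes.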
  • The speech manuscript is text data obtained by speech recognition of the speech recording, and the lengths of pauses are recorded during speech recognition; in the step of dividing each manuscript into several subsections, the manuscript is divided according to its semantics and the pause lengths in the recording.
  • The neural network model is used to identify all the subsections and the multiple different preset questions: each preset question together with all the subsections is used in turn as input data, the model extracts feature data from the input data, and outputs ranking information of all the subsections for the preset question according to the feature data, indicating the recognition/popularity of each subsection as an answer to that question.
  • In the process of extracting feature data from the input data, an attention mechanism is used to process the text data from the subsections and the text data from the preset question, and the ranking information is output based on the feature data obtained after processing.
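A minimal sketch of such an attention step, using scaled dot-product attention where question token embeddings attend over subsection token embeddings. The embedding size, the mean pooling, and the random toy embeddings are assumptions; the patent only states that attention processes the two text inputs.

```python
import numpy as np

# Sketch of the attention step: question token embeddings attend over
# subsection token embeddings (scaled dot-product attention), and the
# attended features are pooled into one vector for the ranking head.
# Embedding size and the random toy embeddings are assumptions.

def attend(question_emb, section_emb):
    """question_emb: (Lq, d); section_emb: (Ls, d)."""
    d = question_emb.shape[-1]
    scores = question_emb @ section_emb.T / np.sqrt(d)     # (Lq, Ls)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # row softmax
    attended = weights @ section_emb                       # (Lq, d)
    return attended.mean(axis=0)                           # pooled feature

rng = np.random.default_rng(1)
q = rng.standard_normal((5, 16))   # 5 question tokens, dim 16
s = rng.standard_normal((9, 16))   # 9 subsection tokens, dim 16
feat = attend(q, s)
print(feat.shape)
```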
  • Each rank corresponds to a preset score; for example, rank one corresponds to 10 points, rank two to 8 points, and so on. The correspondence between ranks and scores is set according to actual needs.
  • The most relevant answer to each question is found in the speech content to determine the score.
  • The algorithm for the comprehensive score can sum the scores of the individual questions directly; for example, as shown in Figure 1, the final score can be obtained by adding the scores of the answers: 35 + 30.
  • Weights can also be set according to the importance of each question. For example, if the weight of question 1 is 1.2 and the weight of question 2 is 0.8, the final score is 1.2*35 + 0.8*30.
  • the specific calculation method of comprehensive score and weight setting are not limited.
  • The embodiment of the present invention migrates the technology and data of a knowledge question-answering system to construct an evaluation method for speech manuscripts, a setting that lacks data and is difficult to score.
  • The evaluation model has high accuracy and strong interpretability: it can not only give scores but also provide similar reference cases for the corresponding scoring points.
  • An embodiment of the present invention also provides a speech evaluation device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor performs the method described in the foregoing embodiments.
  • An embodiment of the present invention also provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the method described in the above-mentioned embodiments.
  • the embodiments of the present invention may be provided as methods, systems, or computer program products. Accordingly, the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed are a speech manuscript evaluation method and device, which relate to the technical field of information processing, and which mainly aim to solve the problems in conventional evaluation methods of large required sample data, few sample types and the effectiveness and fairness of evaluation results being poor. The technical solution comprises: acquiring a plurality of speech manuscripts; segmenting each speech manuscript into a plurality of sections; using a neural network model to identify all of the sections and a plurality of different preset questions, sequentially using each preset question and all of the sections as input data, the neural network model extracting feature data from the input data, and outputting sorting information of all of the sections for a preset question according to the feature data, the sorting information being used for representing the degree of recognition of each section for answering the preset question; and determining an evaluation result of each speech manuscript according to the sorting information of all of the sections for each preset question.

Description

Speech Manuscript Evaluation Method and Device

Technical Field

The embodiments of the present invention relate to the technical field of information processing, and in particular to a method and device for evaluating speech manuscripts.

Background Art

A speech is a way of speaking and communicating in a special setting, and professional speech training requires repeated practice. Scoring and evaluating in time during training to find deficiencies can speed up progress. Speech scoring covers facial expressions and posture, speech speed, and speech content.

The most common scoring method for speech content is to build a regression model: a massive data set is established to collect speech content across different score ranges, features are designed manually or extracted automatically by machine, the contribution of each feature to the score is computed, and effective features are extracted to establish the relationship between features and scores. Training the regression model extracts features from the speech manuscript data set, establishes the feature-score relationship, and stores it in the form of a weight matrix. However, this method relies on a large amount of data, and the samples must cover every score range, topic, and so on; otherwise the scoring results become randomly distributed, harming the validity and fairness of the scoring. In practice, only a few excellent samples exist at the start of speech training, and low- and medium-score samples are extremely scarce. The same is true of other speech data sets among open data resources, which retain only the best speeches and therefore cannot be learned from directly through transfer learning.
Summary of the Invention

In view of this, the embodiments of the present invention provide a speech manuscript evaluation method and device, whose main purpose is to solve the problems of traditional evaluation methods: large required sample data, few sample types, and poor validity and fairness of evaluation results.

In order to solve the above problems, the embodiments of the present invention mainly provide the following technical solutions:

In a first aspect, an embodiment of the present invention provides a speech manuscript evaluation method, the method comprising:

acquiring a plurality of speech manuscripts;

dividing each of the speech manuscripts into several subsections;

using a neural network model to identify all the subsections and a plurality of different preset questions, wherein each preset question together with all the subsections is used in turn as input data, the neural network model extracts feature data from the input data, and outputs ranking information of all the subsections for the preset question according to the feature data, the ranking information indicating the recognition of each subsection as an answer to the preset question; and

determining the evaluation result of each speech manuscript according to the ranking information of all the subsections for each preset question.
Optionally, before using the neural network model to identify all the subsections and the plurality of different preset questions, the method further includes:

acquiring a plurality of training data, each of which includes a plurality of sample answers, a preset question, and ranking information of each sample answer for the preset question; and

training the neural network model with the plurality of training data, wherein the neural network outputs ranking information from the plurality of sample answers and the preset question, and the model parameters are optimized according to the difference between the output ranking information and the ranking information in the training data.

Optionally, acquiring the plurality of training data specifically includes:

crawling, from several specified webpages, contexts related to the preset question and the corresponding answer contents; and

obtaining the ranking information according to the ranking of each answer content within the webpages.

Optionally, determining the evaluation result of each speech manuscript according to the ranking information of all the subsections for each preset question specifically includes:

from the ranking information of all the subsections for each preset question, obtaining the highest rank achieved, for each preset question, by the subsections belonging to the same speech manuscript; and

obtaining the evaluation result for that speech manuscript according to its highest ranks.

Optionally, the rank information corresponds to preset scores, and the evaluation result is a score obtained from the preset scores.

Optionally, the speech manuscript is text data obtained by performing speech recognition on a recording of the speech, and the lengths of pauses are recorded during speech recognition; in the step of dividing each manuscript into several subsections, the manuscript is divided according to its semantics and the pause lengths in the recorded speech.

Optionally, the ranking information includes empty ranks and/or tied ranks.

Optionally, when the neural network model extracts feature data from the input data, an attention mechanism is applied to the text data from the subsections and the text data from the preset question, and the ranking information is output based on the resulting feature data.

In a second aspect, an embodiment of the present invention provides a speech manuscript evaluation device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor performs the above speech manuscript evaluation method.

In a third aspect, an embodiment of the present invention provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the above speech manuscript evaluation method.
According to the speech manuscript evaluation method and device provided by the present invention, the task of evaluating an entire speech manuscript is converted into evaluating the recognition of question-answer pairs. There is no need to provide a large number of speech manuscripts of varying quality as learning samples for the neural network; it suffices to preset questions related to the topic of the speech and prepare corresponding answers with different degrees of recognition. The neural network model can then be trained and used to evaluate multiple speech manuscripts, which solves the difficulty of evaluating speech manuscripts caused by the lack of samples in the prior art, and the scheme has high accuracy.
Brief Description of the Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting. Throughout the drawings, the same reference numerals designate the same components. In the drawings:

Fig. 1 shows a flowchart of a speech manuscript evaluation method in an embodiment of the present invention;

Fig. 2 shows a flowchart of another speech evaluation method in an embodiment of the present invention;

Fig. 3 shows a schematic diagram of the per-paragraph evaluation results obtained in an embodiment of the present invention; and

Fig. 4 shows a schematic diagram of the working process of the neural network model in an embodiment of the present invention.
具体实施方式detailed description
下面将参照附图更详细地描述本公开的示例性实施例。虽然附图中显示了本公开的示例性实施例,然而应当理解,可以以各种形式实现本公开 而不应被这里阐述的实施例所限制。相反,提供这些实施例是为了能够更透彻地理解本公开,并且能够将本公开的范围完整的传达给本领域的技术人员。Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.
The present invention provides a speech evaluation method that can be executed by an electronic device such as a computer or a server. As shown in Fig. 1, the method comprises the following steps:
101. Obtain multiple speeches.
102. Divide each of the speeches into several subsections.
In an embodiment of the present invention, a subsection refers to a passage expressing a particular argument, which may consist of one natural paragraph or of several. For example, if several consecutive paragraphs all discuss the advantages of a product, those paragraphs are treated as one subsection. Let C11...C1n denote the subsections of the first speech, C21...C2n those of the second speech, and Cn1...Cnn the n subsections of the n-th speech. Ways of segmenting a speech include, but are not limited to, the following: using existing semantic recognition techniques, the whole manuscript is divided into subsections according to the semantics of the text; alternatively, subsections may be formed from speech recognition results by combining pause information with semantics. The specific approach is not limited.
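Purely as an illustrative, non-limiting sketch of the pause-based segmentation variant, assuming the speech recognizer supplies each text segment together with the length of the pause that follows it (the function name and threshold below are hypothetical):

```python
def split_into_sections(segments, pause_threshold=1.5):
    """Merge ASR segments into subsections.

    segments: list of (text, pause_after_seconds) pairs; a pause of at
    least pause_threshold seconds is treated as a subsection boundary.
    """
    sections, current = [], []
    for text, pause in segments:
        current.append(text)
        if pause >= pause_threshold:
            sections.append(" ".join(current))
            current = []
    if current:  # flush the trailing subsection
        sections.append(" ".join(current))
    return sections
```

A semantics-based segmenter could replace or refine these boundaries, for example by merging adjacent sections that discuss the same argument.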
103. Use a neural network model to recognize all the subsections together with multiple different preset questions, wherein each preset question in turn is combined with all the subsections as input data; the neural network model extracts feature data from the input data and, based on the feature data, outputs ranking information of all the subsections with respect to the preset question, which indicates the degree of acceptance of each subsection with respect to that question.
The degree of acceptance, which can also be interpreted as popularity, is learned by the neural network from the training data. For example, when training the network, answers and questions can be prepared manually, with the ranking of the answers to each question - that is, the popularity or acceptance of each answer - also given manually; alternatively, data can be migrated from other question-answering databases as training samples. There are various methods for training the network and various ways of constructing and obtaining the sample data, which are described in subsequent embodiments. It follows that the neural network in this solution is not used to identify shallow semantic associations between answers and questions; rather, it ranks multiple answers based on learned knowledge so as to simulate human judgment. A higher-ranked answer indicates a higher degree of acceptance or popularity, i.e., one that people would be expected to prefer. From a purely semantic point of view, however, a higher-ranked answer is not necessarily more relevant to the preset question than a lower-ranked one.
Fig. 4 shows a schematic diagram of the working process of the neural network. In this solution, the trained neural network recognizes the segmented subsections and outputs ranking information. Specifically, a preset question together with the segmented subsections serves as the network input: for example, with question 1 plus C11...Cnn as input, the network outputs the ranking of C11...Cnn with respect to preset question 1; likewise, with question 2 plus C11...Cnn as input, it outputs the ranking of C11...Cnn with respect to preset question 2. The preset questions are set according to the content of the speeches; in particular, questions related to the topic of the speeches can be set as preset questions. The higher a subsection ranks, the higher its degree of acceptance or popularity as an answer to the preset question.
104. Determine the evaluation result of each speech according to the ranking information of all the subsections with respect to each preset question.
In one embodiment, the evaluation result is a score. Before a score is derived from the rankings, a correspondence between rank and score is set; for example, first place corresponds to 10 points, second place to 8 points, and so on. Suppose there are two preset questions; the subsections C11...C1n each have a ranking for both questions, and the highest ranking is taken for each question. For example, if C11 ranks second (the speech's best) for question 1, the score is 8; if C14 ranks third (the speech's best) for question 2, the score is 6. The total score of the speech over the two questions can be computed in ways that include, but are not limited to, direct addition, weighted addition, and weighted averaging.
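As a minimal sketch of the rank-to-score mapping described above (the mapping table and the direct-addition variant are illustrative assumptions, not fixed by the method):

```python
RANK_SCORES = {1: 10, 2: 8, 3: 6, 4: 4, 5: 2}  # hypothetical rank-to-score table

def question_score(ranks):
    """Score one preset question: take the speech's best (lowest) rank
    among its subsections and look up the corresponding score."""
    return RANK_SCORES.get(min(ranks), 0)

def speech_score(ranks_per_question):
    """Direct-addition variant: sum the per-question scores."""
    return sum(question_score(r) for r in ranks_per_question)
```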
In other embodiments, the evaluation result may also be a classification result: for example, categories such as "excellent", "good", "average" and "poor" are set in advance, and the ranking information output by the neural network is classified to obtain the category to which each speech belongs.
According to the speech evaluation method and device provided by the present invention, the task of evaluating an entire speech can be converted into evaluating the degree of acceptance of answers to questions, so there is no need to supply a large number of speeches of varying quality as learning samples for the neural network. It is only necessary to preset questions related to the topic of the speech and to prepare corresponding answers with different degrees of acceptance; the neural network model can then be trained and used to evaluate multiple speeches. This solves the problem in the prior art that speeches are difficult to evaluate for lack of samples, and the present solution achieves high accuracy.
An embodiment of the present invention further provides a speech evaluation method. As shown in Fig. 2, the method comprises:
201. Obtain multiple sets of training data, each of which includes multiple sample answers, one preset question, and ranking information of each sample answer with respect to the preset question.
In the embodiment of the present invention, the neural network model needs to be trained. One set of training data includes a preset question and multiple corresponding candidate answers. For example, the preset question may be "What basic facts are described below?" with 40 corresponding candidate answers; the label is the ranking of the 40 answers, given on the basis of human subjective preference. A higher rank indicates a higher-quality answer, which can be interpreted as a sample answer with a higher degree of acceptance and popularity, i.e., one that people like better.
In the embodiment of the present invention, multiple sets of training data may be obtained by crawling, from several specified web pages, the context related to the preset question and the corresponding answer content. It should be noted that when duplicate answers exist, they must first be merged to obtain the sample answers; the ranking information is then derived from the order of the answers on the web pages. For example, for the preset question of a speech on a given topic, a matching question can often be found on the Internet (for instance in a question-answering system), together with answers to that question, one of which may have been selected by the asker as the best answer, with the others ranked after it. The answers can also be ranked by asker-answerer interaction metrics such as popularity, and this ranking can be used directly as the label of the training data.
It should be noted that the number of ranks need not equal the number of candidate answers; for example, there may be 40 candidate answers but only 10 ranks. Ties and empty ranks can occur: for instance, zero answers may be ranked first, zero ranked second, and two or three ranked third. The ranking information may therefore include empty ranks and/or tied ranks.
The preset questions are set according to the content of the speeches; each evaluation corresponds to speeches on the same topic, and questions can be set for that topic, for example: "What basic facts are described below?", "What are the advantages over other products?", "What benefits can users obtain?", "What did the protagonist do?", "What notable deeds does the protagonist have?", and so on. Following the common practice of answer ranking in question-answering systems, k candidate answers are selected and placed alongside the speeches. A top-k procedure first retrieves the top-n (for example, n = 10) relevant documents for a given question, using an algorithm such as tf-idf or BM25. The n documents are then split into paragraphs, yielding a candidate answer pool much larger than n, from which the top k candidate answers (for example, k = 40) are selected. The numbers of relevant documents and candidate answers above are merely examples and are not intended to limit those quantities.
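The top-n retrieval step can be sketched with a simple tf-idf scorer (BM25 would be the more common choice in practice; this minimal standard-library version only illustrates the idea, and the whitespace tokenization and idf smoothing are assumptions):

```python
import math
from collections import Counter

def tfidf_rank(question, docs, top_n=2):
    """Return the indices of the top_n docs ranked by a simple
    tf-idf dot product against the question terms."""
    q_terms = question.lower().split()
    doc_tokens = [d.lower().split() for d in docs]
    n = len(docs)
    df = Counter(t for toks in doc_tokens for t in set(toks))
    idf = {t: math.log((n + 1) / (df[t] + 1)) + 1 for t in df}

    def score(toks):
        tf = Counter(toks)
        return sum(tf[t] * idf.get(t, 0.0) for t in q_terms)

    ranked = sorted(range(n), key=lambda i: score(doc_tokens[i]), reverse=True)
    return ranked[:top_n]
```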
202. Train the neural network model with the multiple sets of training data; the neural network outputs ranking information from the multiple sample answers and one preset question, and the model parameters are optimized according to the difference between the output ranking information and the ranking information in the training data.
In the embodiment of the present invention, a two-layer feed-forward neural network is used; the input is a preset question and several candidate answers, and the output is the labels of the candidate answers. The network is trained with the training data described above, and the difference between the network's output ranking and the labels determines the loss, from which the network parameters are optimized.
The network is expressed as f(x_i) = ReLU(x_i A^T + b_1) B^T + b_2,
where x_i denotes the feature after attention, A ∈ R^{m×d} and B ∈ R^{1×m} are the weight matrix parameters to be optimized, and b_1 ∈ R^m and b_2 ∈ R are linear bias terms.
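A dependency-free sketch of this two-layer feed-forward scorer (dimensions, initialization, and the helper names are illustrative assumptions):

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def ff_score(x, A, b1, B, b2):
    """f(x) = ReLU(x A^T + b1) B^T + b2, returning a scalar score
    for one attended feature vector x."""
    h = relu([u + b for u, b in zip(matvec(A, x), b1)])  # hidden layer, dim m
    return matvec(B, h)[0] + b2                          # scalar output
```

In practice the parameters A, b1, B, b2 would be learned by backpropagating a ranking loss, as described in step 202.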
203. Obtain multiple speeches.
204. Divide each of the speeches into several subsections.
In a specific implementation, the speech manuscript is text data obtained by performing speech recognition on a recording of the speech, with the lengths of pauses recorded during speech recognition. In the step of dividing each speech into subsections, the speech is segmented according to its semantics and the pause lengths in the recording.
205. Use the neural network model to recognize all the subsections together with multiple different preset questions, wherein each preset question in turn is combined with all the subsections as input data; the neural network model extracts feature data from the input data and, based on the feature data, outputs ranking information of all the subsections with respect to the preset question, which indicates the degree of acceptance or popularity of each subsection as an answer to that question.
The evaluation process of the neural network model is shown in Fig. 3. In a preferred embodiment, when the neural network model extracts feature data from the input data, an attention mechanism is applied to the text data from the subsections and the text data from the preset question, and the ranking information is output based on the resulting feature data.
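The attention step can be sketched as plain dot-product attention between question-token vectors and subsection-token vectors (the embedding lookup is assumed to happen elsewhere; this sketch is illustrative only):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(question_vecs, section_vecs):
    """For each question-token vector, pool the subsection-token vectors
    weighted by dot-product attention; returns one attended feature per
    question token."""
    dim = len(section_vecs[0])
    out = []
    for q in question_vecs:
        weights = softmax([sum(a * b for a, b in zip(q, k)) for k in section_vecs])
        out.append([sum(w * k[d] for w, k in zip(weights, section_vecs))
                    for d in range(dim)])
    return out
```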
206. From the ranking information of all the subsections with respect to each preset question, obtain the highest rank achieved, for each preset question, by the subsections belonging to the same speech.
In the embodiment of the present invention, each rank corresponds to a preset score; for example, first place corresponds to 10 points and second place to 8 points, with the correspondence between rank and score set according to actual needs.
207. Obtain the evaluation result for the speech according to the highest-rank information of that speech.
In the embodiment of the present invention, for each question, the most relevant answer within the speech content should be found to determine the score. For example, suppose there are two preset questions, and the subsections C11...C1n each have a ranking for both; the highest ranking is taken for each question. If C11 ranks second (the speech's best) for question 1, the score is 8; if C14 ranks third (the speech's best) for question 2, the score is 6.
It should be noted that the overall score may be computed by directly adding the per-question scores, or by weighting them according to the importance of each question. For example, as in Fig. 1, the answer scores can be added directly to give a final score of 35 + 30. Weights can also be assigned by question importance: with a weight of 1.2 for question 1 and 0.8 for question 2, the final score is 1.2 × 35 + 0.8 × 30. The specific method of computing the overall score and the weight settings are not limited.
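The weighted variant of the overall score described above can be sketched as follows (the weights 1.2 and 0.8 are the example values from the text, not prescribed by the method):

```python
def weighted_total(scores, weights):
    """Combine per-question scores with importance weights."""
    if len(scores) != len(weights):
        raise ValueError("one weight per question is required")
    return sum(s * w for s, w in zip(scores, weights))
```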
The embodiment of the present invention migrates the technology and data of knowledge question-answering systems to construct an evaluation method for speeches, which are otherwise difficult to score for lack of data. The evaluation model has high accuracy and strong interpretability: it can give not only a score but also similar cases for the corresponding score.
An embodiment of the present invention further provides a speech evaluation device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to perform the method described in the above embodiments.
An embodiment of the present invention further provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the method described in the above embodiments.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing equipment produce an apparatus for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, causing a series of operational steps to be performed thereon to produce computer-implemented processing, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, the above embodiments are merely examples given for clarity of description and are not intended to limit the implementations. Those of ordinary skill in the art may make other changes or variations of different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here, and the obvious changes or variations derived therefrom remain within the scope of protection of the present invention.

Claims (9)

  1. A speech evaluation method, characterized by comprising:
    obtaining multiple speeches;
    dividing each of the speeches into several subsections;
    using a neural network model to recognize all the subsections together with multiple different preset questions, wherein each preset question in turn is combined with all the subsections as input data, the neural network model extracts feature data from the input data and, based on the feature data, outputs ranking information of all the subsections with respect to the preset question, which indicates the degree of acceptance of each subsection as an answer to the preset question; and
    determining the evaluation result of each speech according to the ranking information of all the subsections with respect to each preset question.
  2. The method according to claim 1, characterized in that, before using the neural network model to recognize all the subsections and multiple different preset questions, the method further comprises:
    obtaining multiple sets of training data, each of which includes multiple sample answers, one preset question, and ranking information of each sample answer with respect to the preset question; and
    training the neural network model with the multiple sets of training data, wherein the neural network outputs ranking information from the multiple sample answers and the preset question, and the model parameters are optimized according to the difference between the output ranking information and the ranking information in the training data.
  3. The method according to claim 2, characterized in that obtaining multiple sets of training data specifically comprises:
    crawling, from several specified web pages, the context related to the preset question and the corresponding answer content; and
    obtaining the ranking information according to the order of the answer contents on the web pages.
  4. The method according to any one of claims 1-3, characterized in that determining the evaluation result of each speech according to the ranking information of all the subsections with respect to each preset question specifically comprises:
    from the ranking information of all the subsections with respect to each preset question, obtaining the highest rank achieved, for each preset question, by the subsections belonging to the same speech; and
    obtaining the evaluation result for the speech according to the highest-rank information of that speech.
  5. The method according to claim 4, wherein the ranking information corresponds to preset scores, and the evaluation result is a score obtained from the respective preset scores.
  6. The method according to claim 1, characterized in that the speech manuscript is text data obtained by performing speech recognition on a recording of the speech, with the lengths of pauses recorded during speech recognition; and
    in the step of dividing each of the speeches into several subsections, the speech is divided into subsections according to its semantics and the pause lengths in the spoken delivery.
  7. The method according to claim 1 or 2, characterized in that the ranking information includes empty ranks and/or tied ranks.
  8. The method according to claim 1 or 2, characterized in that, when the neural network model extracts feature data from the input data, an attention mechanism is applied to the text data from the subsections and the text data from the preset question, and the ranking information is output based on the resulting feature data.
  9. A speech evaluation device, characterized by comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to perform the method according to any one of claims 1-8.
PCT/CN2021/133041 2021-07-06 2021-11-25 Speech manuscript evaluation method and device WO2023279631A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110759496.0 2021-07-06
CN202110759496.0A CN113255843B (en) 2021-07-06 2021-07-06 Speech manuscript evaluation method and device

Publications (1)

Publication Number Publication Date
WO2023279631A1 true WO2023279631A1 (en) 2023-01-12

Family

ID=77190758

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/133041 WO2023279631A1 (en) 2021-07-06 2021-11-25 Speech manuscript evaluation method and device

Country Status (2)

Country Link
CN (1) CN113255843B (en)
WO (1) WO2023279631A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255843B (en) * 2021-07-06 2021-09-21 北京优幕科技有限责任公司 Speech manuscript evaluation method and device
CN115545042B (en) * 2022-11-25 2023-04-28 北京优幕科技有限责任公司 Lecture draft quality assessment method and lecture draft quality assessment equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304587A (en) * 2018-03-07 2018-07-20 中国科学技术大学 A kind of community's answer platform answer sort method
CN108604240A (en) * 2016-03-17 2018-09-28 谷歌有限责任公司 The problem of based on contextual information and answer interface
US20180365220A1 (en) * 2017-06-15 2018-12-20 Microsoft Technology Licensing, Llc Method and system for ranking and summarizing natural language passages
CN110210301A (en) * 2019-04-26 2019-09-06 平安科技(深圳)有限公司 Method, apparatus, equipment and storage medium based on micro- expression evaluation interviewee
CN110874716A (en) * 2019-09-23 2020-03-10 平安科技(深圳)有限公司 Interview evaluation method and device, electronic equipment and storage medium
CN113255843A (en) * 2021-07-06 2021-08-13 北京优幕科技有限责任公司 Speech manuscript evaluation method and device


Also Published As

Publication number Publication date
CN113255843B (en) 2021-09-21
CN113255843A (en) 2021-08-13

Similar Documents

Publication Publication Date Title
Banks et al. A review of best practice recommendations for text analysis in R (and a user-friendly app)
US10719664B2 (en) Cross-media search method
WO2020253503A1 (en) Talent portrait generation method, apparatus and device, and storage medium
WO2023279631A1 (en) Speech manuscript evaluation method and device
KR20190125153A (en) An apparatus for predicting the status of user's psychology and a method thereof
US11409964B2 (en) Method, apparatus, device and storage medium for evaluating quality of answer
CN110175229B (en) Method and system for on-line training based on natural language
US11531928B2 (en) Machine learning for associating skills with content
CN117009490A (en) Training method and device for generating large language model based on knowledge base feedback
CN111046941A (en) Target comment detection method and device, electronic equipment and storage medium
US20200250212A1 (en) Methods and Systems for Searching, Reviewing and Organizing Data Using Hierarchical Agglomerative Clustering
US20200192921A1 (en) Suggesting text in an electronic document
CN110321421B (en) Expert recommendation method for website knowledge community system and computer storage medium
CN111552773A (en) Method and system for searching key sentence of question or not in reading and understanding task
CN112989033B (en) Microblog emotion classification method based on emotion category description
CN117480543A (en) System and method for automatically generating paragraph-based items for testing or evaluation
CN112036705A (en) Quality inspection result data acquisition method, device and equipment
CN117076693A (en) Method for constructing digital human teacher multi-mode large language model pre-training discipline corpus
CN112231491A (en) Similar test question identification method based on knowledge structure
CN115617960A (en) Post recommendation method and device
Pentland et al. Does accuracy matter? Methodological considerations when using automated speech-to-text for social science research
CN113515935B (en) Title generation method, device, terminal and medium
Karlgren et al. Text mining for processing interview data in computational social science
CN112989001A (en) Question and answer processing method, device, medium and electronic equipment
CN114547435A (en) Content quality identification method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21949116

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023577794

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE