WO2022088602A1 - Method, apparatus and electronic device for predicting similar pair problems (相似对问题预测的方法、装置及电子设备) - Google Patents

Method, apparatus and electronic device for predicting similar pair problems

Info

Publication number
WO2022088602A1
WO2022088602A1 (PCT/CN2021/083022)
Authority
WO
WIPO (PCT)
Prior art keywords
prediction
training sample
sample set
similar pair
pair problem
Prior art date
Application number
PCT/CN2021/083022
Other languages
English (en)
French (fr)
Inventor
常德杰
刘邦长
谷书锋
赵红文
罗晓斌
张一坤
武云召
刘朝振
王海
张航飞
季科
Original Assignee
北京妙医佳健康科技集团有限公司
Priority date
Filing date
Publication date
Application filed by 北京妙医佳健康科技集团有限公司
Priority to US17/238,169 priority Critical patent/US20210241147A1/en
Publication of WO2022088602A1 publication Critical patent/WO2022088602A1/zh

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/259 Fusion by voting

Definitions

  • the present invention relates to the technical field of neural network models, and in particular, to a method, an apparatus and an electronic device for predicting similar pair problems.
  • The purpose of the present invention is to provide a method, apparatus and electronic device for similar pair problem prediction, so as to alleviate the above technical problems.
  • In a first aspect, an embodiment of the present invention provides a method for predicting similar pair problems, wherein the method includes: inputting the similar pair problem to be predicted into multiple different prediction models, and obtaining the prediction result output by each prediction model, wherein random disturbance parameters are added to the embedding layer of at least one prediction model; and performing a voting operation on the multiple prediction results to obtain the final prediction result of the similar pair problem to be predicted.
  • Each prediction model includes multiple prediction sub-models, and each prediction sub-model is obtained by training the prediction model with a similar pair problem training sample set determined by an allocation function. The step of obtaining the prediction result output by each prediction model includes: inputting the similar pair problem to be predicted into the multiple prediction sub-models included in each prediction model to obtain the prediction sub-result output by each prediction sub-model, and performing a voting operation on the multiple prediction sub-results to obtain the prediction result.
  • An embodiment of the present invention provides a second possible implementation of the first aspect, wherein the prediction sub-model is trained as follows: obtaining an original similar pair problem training sample set; performing training sample expansion on the original set using the similarity transfer principle to obtain an expanded similar pair problem training sample set; determining a similar pair problem training sample set from the expanded set based on the allocation function; and training the prediction model with the similar pair problem training sample set and a specific similar pair problem training sample set to obtain the prediction sub-model.
  • An embodiment of the present invention provides a third possible implementation of the first aspect, wherein after the expanded similar pair problem training sample set is obtained, the method further includes: sequentially labeling each pair of training samples in the expanded set. Determining the similar pair problem training sample set from the expanded set based on the allocation function includes: using the first function of the allocation function to determine a first label from the expanded set; using the second function of the allocation function to determine a second label from the expanded set based on the first label; and selecting the expanded training samples in the interval between the first label and the second label as the similar pair problem training sample set.
  • An embodiment of the present invention provides a sixth possible implementation of the first aspect, wherein the similarity between each pair of training samples in the specific similar pair problem training sample set and the similar pair problem training sample set is greater than a preset similarity. Training the prediction model with the similar pair problem training sample set and the specific set to obtain the prediction sub-model includes: training the first preset network layer parameters of the prediction model on the similar pair problem training sample set, and obtaining a preliminary prediction model when the loss function of the prediction model converges; then training the second preset network layer parameters of the preliminary model on the specific similar pair problem training sample set, and obtaining the prediction sub-model when the loss function of the preliminary model converges.
  • An embodiment of the present invention provides a first possible implementation of the first aspect, wherein the random disturbance parameter is generated by a formula (not reproduced in this text) in which delta represents the random disturbance parameter and a represents a parameter factor with -5 < a < 5.
  • An embodiment of the present invention further provides an apparatus for predicting similar pair problems, wherein the apparatus includes: an input module, configured to input the similar pair problem to be predicted into multiple different prediction models and obtain the prediction result output by each prediction model, wherein random disturbance parameters are added to the embedding layer of at least one prediction model; and an operation module, configured to perform a voting operation on the multiple prediction results to obtain the final prediction result of the similar pair problem to be predicted.
  • an embodiment of the present invention further provides an electronic device, which includes a processor and a memory, the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement the above method.
  • Embodiments of the present invention provide a method, device, and electronic device for predicting similar pair problems, wherein the similar pair problem to be predicted is input into multiple different prediction models and the prediction result output by each prediction model is obtained; random disturbance parameters are added to the embedding layer of at least one prediction model; and voting operations are performed on the multiple prediction results to obtain the final prediction result of the similar pair problem to be predicted.
  • In the present application, by adding random disturbance parameters to the embedding layer of the prediction model, overfitting caused by over-learning of sample knowledge can be effectively prevented in the prediction model, and using the above-mentioned prediction model to predict similar pair problems can effectively improve prediction accuracy.
  • FIG. 1 is a flowchart of a method for predicting similar pair problems according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a training sample expansion provided by an embodiment of the present invention.
  • FIG. 3 is a flowchart of another method for predicting similar pair problems provided by an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a similar problem prediction device provided by an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • The method, device, and electronic device for predicting similar pair problems provided by the embodiments of the present invention can effectively prevent the prediction model from overfitting caused by over-learning sample knowledge by adding random disturbance parameters to the embedding layer of the prediction model; further, using the above prediction model to predict similar pair problems can effectively improve prediction accuracy.
  • the method specifically includes the following steps:
  • Step S102 inputting the similar pair of questions to be predicted into a plurality of different prediction models, and obtaining the prediction results output by each prediction model; wherein, random disturbance parameters are added to the embedding layer of at least one prediction model;
  • A similar pair problem refers to a pair formed by two relatively similar questions. For example, "what is the matter with hemoptysis after strenuous exercise" and "why does hemoptysis occur after strenuous exercise" constitute one similar pair problem; "what is the matter with hemoptysis after strenuous exercise" and "what should I do about hemoptysis after strenuous exercise" constitute another.
  • different prediction models refer to different types of prediction models.
  • Three text classification models of different types, for example the common roberta wwm large model, the roberta pair large model, and the ernie model, can be used as prediction models to predict the similar pair problem, obtaining the prediction results output by the three prediction models respectively.
  • the determination of the prediction model can be selected according to actual needs, which is not limited here.
  • The prediction model predicts whether the similar pair problem to be predicted is a group of questions with the same meaning or with different meanings: a prediction result of 0 means the same meaning, and a prediction result of 1 means different meanings.
  • the meaning of the prediction result can be set as required, which is not limited here.
  • Random disturbance parameters can be added to the embedding layer of at least one of the above three prediction models, which can prevent the prediction model from overfitting due to over-learning the knowledge of the training samples during training, and thus can effectively improve the prediction ability of the prediction model.
  • The random disturbance parameter is generated by a formula (not reproduced in this text) in which delta represents the random disturbance parameter and a represents a parameter factor with -5 < a < 5.
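  • As a rough illustration only: the patent's exact formula is not reproduced in the text above, so the sketch below simply assumes the disturbance is small uniform noise scaled by the parameter factor a (the noise distribution, the `scale` constant, and the function names are our assumptions, not the patent's formula):

```python
import random

def random_disturbance(a, size, scale=1e-3, rng=None):
    """Generate a random disturbance vector delta for the embedding layer.

    Assumption: delta is small uniform noise scaled by the parameter
    factor `a`, which must satisfy -5 < a < 5 as stated in the text.
    """
    if not -5 < a < 5:
        raise ValueError("parameter factor a must satisfy -5 < a < 5")
    rng = rng or random.Random()
    return [a * scale * rng.uniform(-1.0, 1.0) for _ in range(size)]

def perturb_embedding(embedding, a):
    """Add the random disturbance to one embedding-layer output vector."""
    delta = random_disturbance(a, len(embedding))
    return [e + d for e, d in zip(embedding, delta)]
```

Adding the disturbance during training makes each pass over a sample slightly different, which is the stated mechanism for discouraging the model from memorizing sample knowledge.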
  • step S104 a voting operation is performed on the plurality of prediction results to obtain a final prediction result of the similar pair of questions to be predicted.
  • the voting operation may be an absolute majority voting method (more than half of the votes), a relative majority voting method (the most votes), or a weighted voting method, and the specific voting method can be determined according to actual needs.
  • For example, a voting operation is performed on the prediction results output by the above three prediction models using the relative majority voting method to obtain the final prediction result of the similar pair problem to be predicted. Suppose the prediction result obtained by inputting the similar pair problem into the roberta wwm large model is 0, the result from the roberta pair large model is 0, and the result from the ernie model is 1; the result 0 receives the most votes, so the final prediction result is 0, meaning that the similar pair problem to be predicted is a question pair with the same meaning.
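  • The relative majority vote described above can be sketched in a few lines (the helper name is ours; the vote values follow the example in the text):

```python
from collections import Counter

def relative_majority_vote(results):
    """Return the prediction value that received the most votes."""
    return Counter(results).most_common(1)[0][0]

# Prediction results from the example: roberta wwm large -> 0,
# roberta pair large -> 0, ernie -> 1.
final = relative_majority_vote([0, 0, 1])
# final == 0: the question pair is predicted to have the same meaning
```

An absolute majority or weighted variant could replace this function without changing the surrounding flow.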
  • An embodiment of the present invention provides a method for predicting similar pair problems, wherein the similar pair problems to be predicted are input into multiple different prediction models and the prediction result output by each prediction model is obtained; random disturbance parameters are added to the embedding layer of at least one prediction model; and a voting operation is performed on the multiple prediction results to obtain the final prediction result of the similar pair problem to be predicted.
  • Each prediction model includes multiple prediction sub-models, and each prediction sub-model is obtained by training the prediction model with a similar pair problem training sample set determined by the allocation function; specifically, the training process of a prediction sub-model can be implemented through steps A1 to A4:
  • Step A1 obtaining the original similar pair problem training sample set
  • The original similar pair problem training sample set may be a sample set obtained in advance from the network or other storage devices, after denoising and cleaning. In actual use, the original training samples can first be explored; the main methods include category-distribution exploration and sentence-length-distribution exploration, and data analysis can be carried out according to the characteristics found, to facilitate the subsequent training of the prediction model.
  • Step A2 using the similarity transfer principle to perform training sample expansion processing on the original similarity pair problem training sample set to obtain an expanded similarity pair problem training sample set;
  • FIG. 2 shows a schematic diagram of the expansion of training samples. The table on the far left of Figure 2 has three columns: query1 (question 1), query2 (question 2), and label.
  • The label corresponding to A and B in the first row is 1, indicating that question A and question B are a question pair with different meanings; the label corresponding to A and C in the second row is 1, indicating that question A and question C are a question pair with different meanings; the labels corresponding to A and D in the third row, A and E in the fourth row, and A and F in the fifth row are all 0, indicating that A and D, A and E, and A and F are each a question pair with the same meaning.
  • the content shown in the right box in Figure 2 is the expanded data for the training sample expansion processing of the original similarity pair problem training sample set in the left box using the similarity transfer principle.
  • the first row of training samples and the second row of training samples show that A and B are a set of question pairs with different meanings, and A and C are also a set of question pairs with different meanings, so it can be inferred that B and C are a set of question pairs with different meanings.
  • The expanded data selected in the right box of Figure 2 is chosen so that its 0/1 label distribution ratio is close to the 0/1 label distribution ratio of the original similar pair problem training sample set; since the 0/1 label ratio of the original set is 2:3, expanded samples in the right box of Figure 2 can be selected accordingly.
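  • A minimal sketch of the similarity-transfer expansion, using the label conventions above (0 = same meaning, 1 = different meaning) and the inference rules illustrated in FIG. 2; the function name is ours:

```python
from itertools import combinations

def expand_pairs(samples):
    """Expand a similar pair training set via similarity transfer.

    samples: list of (query1, query2, label). For two pairs sharing a
    common question, a new pair is inferred for the remaining questions:
    0 + 0 -> 0 (same meaning), 0 + 1 -> 1, and 1 + 1 -> 1 (the last rule
    follows the B/C example given for FIG. 2).
    """
    by_question = {}
    for q1, q2, label in samples:
        by_question.setdefault(q1, []).append((q2, label))
        by_question.setdefault(q2, []).append((q1, label))
    seen = {frozenset((q1, q2)) for q1, q2, _ in samples}
    expanded = []
    for shared, neighbours in by_question.items():
        for (x, lx), (y, ly) in combinations(neighbours, 2):
            key = frozenset((x, y))
            if x == y or key in seen:
                continue  # skip self-pairs and pairs already present
            seen.add(key)
            expanded.append((x, y, 0 if lx == 0 and ly == 0 else 1))
    return expanded
```

Applied to the five rows of FIG. 2 (A-B:1, A-C:1, A-D:0, A-E:0, A-F:0), this yields ten new pairs, from which a subset matching the original 0/1 label ratio would then be selected.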
  • Step A3, determining the similar pair problem training sample set from the expanded similar pair problem training sample set based on the allocation function; step A3 can be realized through steps B1 to B3:
  • Step B1, using the first function of the allocation function to determine the first label from the expanded similar pair problem training sample set (the function itself is not reproduced in this text); for example, the length AllNumber is 100 and the offset is set to 10.
  • the offset can be set according to actual needs, which is not limited here.
  • Step B2, using the second function of the allocation function to determine the second label from the expanded similar pair problem training sample set based on the first label (the function itself is not reproduced in this text).
  • Step B3 selecting the extended similar pair question training sample set in the interval between the first label and the second label as the similar pair question training sample set.
  • For example, if the first label determined is 20 and the second label is 40, these are matched against the sequential labels of the expanded similar pair problem training sample set, and the expanded training samples in the interval between labels 20 and 40 are taken as the similar pair problem training sample set.
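  • Since the two allocation functions are not reproduced in this text, the following is only a plausible sketch of the interval selection: it assumes the first function picks a start label at a multiple of the offset and the second function returns start plus a fixed window (e.g. labels 20 to 40); all parameter names are ours:

```python
import random

def select_interval(samples, offset=10, window=20, rng=None):
    """Select a contiguous slice of sequentially labeled training samples.

    Assumption: the first allocation function chooses a start label at a
    multiple of `offset`; the second returns start + window. With
    AllNumber = 100 and offset = 10, a draw of 20 yields labels 20..40.
    """
    rng = rng or random.Random()
    total = len(samples)                                   # AllNumber
    start = rng.randrange(0, total - window + 1, offset)   # first label
    end = start + window                                   # second label
    return samples[start:end]
```

Repeating the draw with different random starts yields the several distinct training sample sets used to train the multiple prediction sub-models.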
  • Step A4 using the similar pair problem training sample set and the specific similar pair problem training sample set to train the prediction model to obtain a prediction sub-model.
  • The above-mentioned specific similar pair problem training sample set consists of training samples collected specifically for the actual prediction task in order to enhance the prediction ability of the prediction sub-model. For example, this case concerns medical question pair prediction, and the above three prediction models are all bert-based, so relying only on their generic pre-training may not be enough; therefore, on the basis of bert, a medical bert is trained with medical corpus samples obtained from the Internet to strengthen the pre-training.
  • The process of determining the specific similar pair problem training sample set is: a) collect question pairs extensively from websites; b) compare their similarity with the question pairs in the expanded similar pair problem training sample set, for which the Manhattan distance, Euclidean distance, Chebyshev distance, or other methods can be used, which is not limited here; the medical corpus samples whose similarity is greater than the preset similarity are kept to form the specific similar pair problem training sample set.
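  • The three distance measures named above can be applied to vector representations of the question pairs; the encoder producing those vectors is assumed to exist elsewhere, and the filtering helper below (with a distance threshold standing in for "similarity greater than the preset similarity") is our sketch:

```python
def manhattan(u, v):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    """Square root of the sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def chebyshev(u, v):
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(u, v))

def filter_similar(candidates, reference_vectors, max_distance, dist=euclidean):
    """Keep candidate samples whose vector lies within max_distance of at
    least one reference vector, i.e. whose similarity exceeds the preset
    threshold. candidates: list of (vector, sample)."""
    kept = []
    for vec, sample in candidates:
        if any(dist(vec, ref) <= max_distance for ref in reference_vectors):
            kept.append(sample)
    return kept
```

Swapping `dist=manhattan` or `dist=chebyshev` selects one of the other two measures mentioned in the text.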
  • The process of training the prediction model with the similar pair problem training sample set and the specific similar pair problem training sample set to obtain the prediction sub-model is as follows: train the first preset network layer parameters of the prediction model on the similar pair problem training sample set, and obtain the preliminary prediction model when the loss function of the prediction model converges; then train the second preset network layer parameters of the preliminary model on the specific similar pair problem training sample set, and obtain the prediction sub-model when the loss function of the preliminary model converges.
  • For example, the first 5 network layers of the prediction model are trained using the similar pair problem training sample set to obtain the preliminary prediction model, and the screened specific similar pair problem training sample set is then used to fine-tune the parameters of the bert representation layer to obtain the prediction sub-model.
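  • The two-stage idea (train one set of layer parameters until the loss converges, then fine-tune another set) can be shown schematically without any deep-learning framework; the parameter names and the toy quadratic loss below are purely illustrative stand-ins for the bert layers and the real loss function:

```python
def train_stage(params, trainable, loss_and_grads, lr=0.1, max_steps=200, tol=1e-6):
    """Update only the parameters named in `trainable`, keeping the rest
    frozen, until the loss change falls below `tol` (standing in for
    'train until the loss function converges')."""
    prev = float("inf")
    for _ in range(max_steps):
        loss, grads = loss_and_grads(params)
        if abs(prev - loss) < tol:
            break
        prev = loss
        for name in trainable:
            params[name] -= lr * grads[name]
    return params

# Toy quadratic loss standing in for the model's real loss function.
def loss_and_grads(p):
    loss = (p["first_layers"] - 1.0) ** 2 + (p["repr_layer"] - 2.0) ** 2
    grads = {"first_layers": 2 * (p["first_layers"] - 1.0),
             "repr_layer": 2 * (p["repr_layer"] - 2.0)}
    return loss, grads

params = {"first_layers": 0.0, "repr_layer": 0.0}
# Stage 1: train the first preset layers on the similar pair sample set.
params = train_stage(params, ["first_layers"], loss_and_grads)
# Stage 2: fine-tune the representation layer on the specific sample set.
params = train_stage(params, ["repr_layer"], loss_and_grads)
```

In a real implementation the same pattern appears as freezing and unfreezing parameter groups of the bert model between the two training passes.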
  • This embodiment provides another method for predicting similar pair problems, implemented on the basis of the foregoing embodiment; it focuses on the specific implementation of obtaining the prediction results output by each prediction model.
  • the method for predicting similar pair problems in this embodiment includes the following steps:
  • Step S302 inputting the similar pair problem to be predicted into a plurality of prediction sub-models included in each prediction model, to obtain a prediction sub-result output by each prediction sub-model;
  • The multiple prediction sub-models included in a prediction model are obtained by training the prediction model (for example, the roberta wwm large model) using the multiple similar pair problem training sample sets determined by the allocation function together with the specific similar pair problem training sample set.
  • Because the similar pair problem training sample sets used for these sub-models may differ, the internal parameters of the trained sub-models may differ, and consequently the prediction sub-results output by the sub-models may also differ.
  • Taking as an example each prediction model using 5 similar pair problem training sample sets determined by the allocation function (together with the specific set) to obtain 5 prediction sub-models, the above three prediction models provide 15 prediction sub-models in total.
  • Step S304 performing a voting operation on the multiple prediction sub-results to obtain a prediction result
  • For example, suppose the prediction sub-results obtained from the 5 prediction sub-models of the roberta wwm large model are 0, 0, 1, 0, and 0 respectively; using the relative majority voting method, the prediction result of the roberta wwm large model is 0. The prediction results of the roberta pair large model and the ernie model are obtained in the same way as for the roberta wwm large model and are not repeated here.
  • the voting calculation method can be selected according to actual needs, which is not limited here.
  • Step S306, a voting operation is performed on the multiple prediction results to obtain the final prediction result of the similar pair problem to be predicted.
  • After the roberta wwm large model, the roberta pair large model, and the ernie model each obtain their prediction result from the prediction sub-results of their multiple prediction sub-models, a further voting operation is performed to obtain the final prediction result of the similar pair problem to be predicted.
  • The above method for predicting similar pair problems first obtains the prediction result of each prediction model through a voting operation over the prediction sub-results output by its multiple prediction sub-models, and then performs a second vote over the prediction results of the multiple prediction models to obtain the final prediction result of the similar pair problem to be predicted; that is, voting within each prediction model is followed by voting between the prediction models to generate the final prediction result.
  • the secondary voting operation can enhance the reliability of the model and improve the prediction accuracy of the model.
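  • The two-level (secondary) voting scheme can be sketched directly; the first row of sub-results matches the roberta wwm large example above, while the other two rows are illustrative values of our own:

```python
from collections import Counter

def majority(votes):
    """Relative majority: the value with the most votes."""
    return Counter(votes).most_common(1)[0][0]

def two_level_vote(sub_results_per_model):
    """First vote inside each prediction model over its sub-model results,
    then vote again across the per-model results."""
    model_results = [majority(sub) for sub in sub_results_per_model]
    return majority(model_results)

final = two_level_vote([
    [0, 0, 1, 0, 0],   # roberta wwm large sub-results (from the example)
    [0, 1, 1, 1, 0],   # roberta pair large sub-results (illustrative)
    [0, 0, 0, 1, 1],   # ernie sub-results (illustrative)
])
# per-model results are 0, 1, 0, so final == 0
```

Aggregating twice means a single badly behaved sub-model can be outvoted twice, which is the stated reliability benefit.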
  • FIG. 4 shows a schematic structural diagram of an apparatus for predicting similar pair problems.
  • the apparatus includes:
  • the input module 402 is used to input the similar pair of questions to be predicted into a plurality of different prediction models, and obtain the prediction result output by each prediction model; wherein, random disturbance parameters are added to the embedding layer of at least one prediction model;
  • the operation module 404 is configured to perform a voting operation on the plurality of prediction results to obtain the final prediction result of the similar pair of questions to be predicted.
  • An embodiment of the present invention provides an apparatus for predicting similar pair problems, wherein the similar pair problems to be predicted are input into multiple different prediction models and the prediction result output by each prediction model is obtained; random disturbance parameters are added to the embedding layer of at least one prediction model; and a voting operation is performed on the multiple prediction results to obtain the final prediction result of the similar pair problem to be predicted.
  • FIG. 5 is a schematic structural diagram of the electronic device, wherein the electronic device includes a processor 121 and a memory 120; the memory 120 stores computer-executable instructions that can be executed by the processor 121, and the processor 121 executes the computer-executable instructions to implement the method for similar pair problem prediction described above.
  • the electronic device further includes a bus 122 and a communication interface 123 , wherein the processor 121 , the communication interface 123 and the memory 120 are connected through the bus 122 .
  • the memory 120 may include a high-speed random access memory (RAM, Random Access Memory), and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory.
  • the communication connection between the network element of the system and at least one other network element is realized through at least one communication interface 123 (which may be wired or wireless), and the Internet, wide area network, local area network, metropolitan area network, etc. may be used.
  • the bus 122 may be an ISA (Industry Standard Architecture, industry standard architecture) bus, a PCI (Peripheral Component Interconnect, peripheral component interconnect standard) bus, or an EISA (Extended Industry Standard Architecture, extended industry standard architecture) bus and the like.
  • the bus 122 can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one bidirectional arrow is shown in FIG. 5, but it does not mean that there is only one bus or one type of bus.
  • the processor 121 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method may be completed by an integrated logic circuit of hardware in the processor 121 or an instruction in the form of software.
  • The above-mentioned processor 121 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • The storage medium is located in the memory; the processor 121 reads the information in the memory and, in combination with its hardware, completes the steps of the similar pair problem prediction method of the foregoing embodiments.
  • Embodiments of the present application further provide a computer-readable storage medium, where computer-executable instructions are stored; when invoked and executed by a processor, the computer-executable instructions cause the processor to implement the above method for predicting similar pair problems.
  • The computer program product of the similar pair problem prediction method, apparatus, and electronic device provided by the embodiments of the present application includes a computer-readable storage medium storing program code, and the instructions included in the program code can be used to execute the methods described in the preceding method embodiments.
  • For the specific implementation, reference may be made to the method embodiments, which will not be repeated here.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.


Abstract

A method, apparatus and electronic device for predicting pairs of similar questions. The method inputs a pair of similar questions to be predicted into a plurality of different prediction models and obtains the prediction result output by each prediction model (S102), where a random perturbation parameter is added to the embedding layer of at least one prediction model; a voting operation is then performed on the plurality of prediction results to obtain the final prediction result for the pair of similar questions to be predicted (S104). By adding a random perturbation parameter to the embedding layer of a prediction model, the method effectively prevents the overfitting caused by the model over-learning sample knowledge, so predicting pairs of similar questions with such models effectively improves prediction accuracy.

Description

Method, Apparatus and Electronic Device for Predicting Pairs of Similar Questions

Technical Field

The present invention relates to the technical field of neural network models, and in particular to a method, an apparatus and an electronic device for predicting pairs of similar questions.

Background Art

Using a neural network classification model to classify patients' common questions and answers by similarity is valuable. For example, identifying similar patient questions helps to understand what patients actually need, helps match accurate answers quickly, and improves the patients' sense of being served; summarizing similar doctor answers helps analyze the standardization of answers and avoid misdiagnosis.

At present, a fixed perturbation parameter is often added to existing neural network classification models to prevent overfitting. However, with this approach the model easily learns the sample knowledge during training, which is not conducive to preventing overfitting.

Summary of the Invention

In view of this, the purpose of the present invention is to provide a method, an apparatus and an electronic device for predicting pairs of similar questions, so as to alleviate the above technical problems.

In a first aspect, an embodiment of the present invention provides a method for predicting pairs of similar questions, the method including: inputting a pair of similar questions to be predicted into a plurality of different prediction models, and obtaining the prediction result output by each prediction model, where a random perturbation parameter is added to the embedding layer of at least one prediction model; and performing a voting operation on the plurality of prediction results to obtain the final prediction result for the pair of similar questions to be predicted.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation of the first aspect, where each prediction model includes a plurality of prediction sub-models, and each prediction sub-model is obtained by training the prediction model with a training sample set of similar question pairs determined by an allocation function. The step of obtaining the prediction result output by each prediction model includes: inputting the pair of similar questions to be predicted into the plurality of prediction sub-models included in each prediction model, to obtain the prediction sub-result output by each prediction sub-model; and performing a voting operation on the plurality of prediction sub-results to obtain the prediction result.

With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a second possible implementation of the first aspect, where the prediction sub-model is trained as follows: obtaining an original training sample set of similar question pairs; performing training sample expansion on the original training sample set by using the similarity transfer principle, to obtain an expanded training sample set of similar question pairs; determining a training sample set of similar question pairs from the expanded training sample set based on an allocation function; and training the prediction model with the training sample set and a specific training sample set of similar question pairs to obtain the prediction sub-model.

With reference to the second possible implementation of the first aspect, an embodiment of the present invention provides a third possible implementation of the first aspect, where after the expanded training sample set is obtained, the method further includes: sequentially numbering each pair of training samples in the expanded training sample set. The step of determining the training sample set from the expanded training sample set based on the allocation function includes: determining a first number from the expanded training sample set by using a first function of the allocation function; determining a second number from the expanded training sample set based on the first number by using a second function of the allocation function; and selecting the part of the expanded training sample set within the interval between the first number and the second number as the training sample set.

With reference to the third possible implementation of the first aspect, an embodiment of the present invention provides a fourth possible implementation of the first aspect, where the first function is: i=AllNumber*radom(0,1)+offset, where i denotes the first number, i<AllNumber, AllNumber denotes the length of the expanded training sample set, offset denotes an offset, offset<AllNumber, and offset is a positive integer.

With reference to the third possible implementation of the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, where the second function is: j=i+A%*AllNumber, where j denotes the second number, i≤j≤AllNumber, A is a positive integer, 0≤A≤100, i denotes the first number, and AllNumber denotes the length of the expanded training sample set.

With reference to the second possible implementation of the first aspect, an embodiment of the present invention provides a sixth possible implementation of the first aspect, where the similarity between each pair of specific training samples in the specific training sample set and the training sample set is greater than a preset similarity. The step of training the prediction model with the training sample set and the specific training sample set to obtain the prediction sub-model includes: training a first preset number of network layer parameters of the prediction model based on the training sample set until the loss function of the prediction model converges, to obtain a preliminary prediction model; and training a second preset number of network layer parameters of the preliminary prediction model based on the specific training sample set until the loss function of the preliminary prediction model converges, to obtain the prediction sub-model.
With reference to the first aspect, an embodiment of the present invention provides a seventh possible implementation of the first aspect, where the random perturbation parameter is generated by the following formula:
Figure PCTCN2021083022-appb-000001
where delta denotes the random perturbation parameter, a denotes a parameter factor, and -5≤a≤5.
In a second aspect, an embodiment of the present invention further provides an apparatus for predicting pairs of similar questions, the apparatus including: an input module, configured to input a pair of similar questions to be predicted into a plurality of different prediction models and obtain the prediction result output by each prediction model, where a random perturbation parameter is added to the embedding layer of at least one prediction model; and an operation module, configured to perform one voting operation on the plurality of prediction results to obtain the final prediction result for the pair of similar questions to be predicted.

In a third aspect, an embodiment of the present invention further provides an electronic device, including a processor and a memory, where the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement the above method.

The embodiments of the present invention bring the following beneficial effects:

The embodiments of the present invention provide a method, an apparatus and an electronic device for predicting pairs of similar questions, in which a pair of similar questions to be predicted is input into a plurality of different prediction models and the prediction result output by each prediction model is obtained, where a random perturbation parameter is added to the embedding layer of at least one prediction model; a voting operation is performed on the plurality of prediction results to obtain the final prediction result for the pair of similar questions to be predicted. By adding a random perturbation parameter to the embedding layer of a prediction model, the present application effectively prevents overfitting caused by the model over-learning sample knowledge, and predicting pairs of similar questions with such models effectively improves prediction accuracy.

Other features and advantages of the present invention will be set forth in the following description, and will in part become apparent from the description or be understood by implementing the present invention. The objectives and other advantages of the invention are realized and obtained by the structures particularly pointed out in the description and the drawings.

To make the above objectives, features and advantages of the present invention more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings

To describe the technical solutions of the specific embodiments of the present invention or of the prior art more clearly, the drawings required in the description of the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a method for predicting pairs of similar questions provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of training sample expansion provided by an embodiment of the present invention;

Fig. 3 is a flowchart of another method for predicting pairs of similar questions provided by an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an apparatus for predicting pairs of similar questions provided by an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed Description of the Embodiments

To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

At present, a fixed perturbation parameter is often added to existing neural network classification models to prevent overfitting; however, with this approach the model easily learns the sample knowledge during training, which is not conducive to preventing overfitting. On this basis, the method, apparatus and electronic device for predicting pairs of similar questions provided by the embodiments of the present invention add a random perturbation parameter to the embedding layer of a prediction model, which effectively prevents overfitting caused by the model over-learning sample knowledge; predicting pairs of similar questions with such models then effectively improves prediction accuracy.

To facilitate understanding of this embodiment, the method for predicting pairs of similar questions disclosed in the embodiments of the present invention is first introduced in detail.

Referring to the flowchart of a method for predicting pairs of similar questions shown in Fig. 1, the method specifically includes the following steps:

Step S102: input a pair of similar questions to be predicted into a plurality of different prediction models and obtain the prediction result output by each prediction model, where a random perturbation parameter is added to the embedding layer of at least one prediction model.

A pair of similar questions is a pair consisting of two relatively similar questions. For example, the questions "Coughing up blood after strenuous exercise, what is going on?" and "Why do I cough up blood after strenuous exercise?" form one pair of similar questions; the questions "Coughing up blood after strenuous exercise, what is going on?" and "Coughing up blood after strenuous exercise, what should I do?" form another pair of similar questions.

Usually, "different prediction models" means prediction models of different types. For example, three text classification models of different types, the roberta wwm large model, the roberta pair large model and the ernie model, can be selected as prediction models to predict the pair of similar questions to be predicted, so as to obtain the prediction results output by these three prediction models respectively. The choice of prediction models can be made according to actual needs and is not limited here.

The prediction result determines whether the prediction model judges the pair of similar questions to be predicted to be a pair of questions with the same meaning or a pair with different meanings: a prediction result of 0 indicates the same meaning, and a prediction result of 1 indicates different meanings. The meaning of the prediction result can be set as needed and is not limited here.

In this embodiment, a random perturbation parameter can be added to the embedding layer of at least one of the above three prediction models, which prevents the overfitting phenomenon caused by a prediction model over-learning the training sample knowledge during training, and thus effectively improves the prediction ability of the prediction model.
Specifically, the random perturbation parameter is generated by the following formula:
Figure PCTCN2021083022-appb-000002
where delta denotes the random perturbation parameter, a denotes a parameter factor, and -5≤a≤5.
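The exact perturbation formula survives in the publication only as an image placeholder, so it cannot be reproduced here. The sketch below therefore only illustrates the general idea: a perturbation resampled on every forward pass, scaled by a parameter factor `a` with -5≤a≤5 as stated in the text. The function name and the uniform noise form are illustrative assumptions, not the patent's formula.

```python
import random

def perturb_embedding(embedding, a=1.0):
    """Add a freshly sampled random perturbation to each embedding dimension.

    The patent's formula is only available as an image; here we assume a
    uniform noise term scaled by the parameter factor `a` (-5 <= a <= 5).
    """
    if not -5 <= a <= 5:
        raise ValueError("parameter factor a must satisfy -5 <= a <= 5")
    # delta is resampled on every call, so the model cannot memorize a
    # fixed perturbation the way it could with a constant parameter
    return [x + a * random.uniform(-1.0, 1.0) for x in embedding]
```

Because the perturbation is redrawn each time, the model never sees the same perturbed embedding twice, which is what distinguishes this scheme from the fixed-perturbation approach criticized in the background section.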
Step S104: perform a voting operation on the plurality of prediction results to obtain the final prediction result for the pair of similar questions to be predicted.

In this embodiment, the voting operation may be absolute majority voting (more than half of the votes), relative majority voting (the most votes) or weighted voting; the specific voting method can be determined according to actual needs and is not limited here.

In this embodiment, relative majority voting is used for the voting operation on the output prediction results of the above three prediction models, to obtain the final prediction result for the pair of similar questions to be predicted. For example, if the prediction result obtained by inputting the pair into the roberta wwm large model is 0, the prediction result obtained from the roberta pair large model is 0, and the prediction result obtained from the ernie model is 1, the final prediction result obtained by relative majority voting is 0, indicating that the pair of similar questions to be predicted is a pair of questions with the same meaning.

An embodiment of the present invention provides a method for predicting pairs of similar questions, in which a pair of similar questions to be predicted is input into a plurality of different prediction models and the prediction result output by each prediction model is obtained, where a random perturbation parameter is added to the embedding layer of at least one prediction model; a voting operation is performed on the plurality of prediction results to obtain the final prediction result. By adding a random perturbation parameter to the embedding layer of a prediction model, the present application effectively prevents overfitting caused by the model over-learning sample knowledge, and predicting pairs of similar questions with such models effectively improves prediction accuracy.
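The relative-majority (plurality) vote described above can be sketched in a few lines; the function name is illustrative:

```python
from collections import Counter

def plurality_vote(predictions):
    """Return the label that received the most votes (relative majority)."""
    return Counter(predictions).most_common(1)[0][0]

# the worked example above: roberta wwm large -> 0, roberta pair large -> 0,
# ernie -> 1; the plurality winner is 0 (same meaning)
final = plurality_vote([0, 0, 1])
```

Note that `Counter.most_common` breaks ties by first-seen order, so with an even number of voters a deterministic tie-breaking rule would need to be chosen explicitly.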
Usually, each prediction model includes a plurality of prediction sub-models, and each prediction sub-model is obtained by training the prediction model with a training sample set of similar question pairs determined by an allocation function. Specifically, the training process of a prediction sub-model can be implemented by steps A1 to A4:

Step A1: obtain an original training sample set of similar question pairs.

The original training sample set may be a set obtained in advance from the network or other storage devices, after denoising and cleaning. In actual use, characteristic exploration and feature distribution exploration can be performed on the original training sample set, mainly including exploration of category distribution, sentence length distribution and the like; data analysis can then be carried out based on the explored characteristics, to facilitate research on subsequent training of the prediction model.

Step A2: perform training sample expansion on the original training sample set by using the similarity transfer principle, to obtain an expanded training sample set of similar question pairs.

The above original training sample set consists of labeled training samples used for training the prediction model. For ease of understanding, Fig. 2 shows a schematic diagram of training sample expansion. The leftmost box in Fig. 2 is the collected original training sample set, where query1 (question 1), query2 (question 2) and label form one training sample. For example, in the first row, A and B correspond to a label of 1, indicating that question A and question B are a pair of questions with different meanings; in the second row, A and C correspond to a label of 1, indicating that question A and question C are a pair of questions with different meanings; and in the third, fourth and fifth rows, the labels for A and D, A and E, and A and F are all 0, indicating that A and D, A and E, and A and F are each a pair of questions with the same meaning.
The content in the right box of Fig. 2 is the expanded data obtained by performing training sample expansion on the original training sample set in the left box by using the similarity transfer principle. Specifically, from the first and second rows of the original training sample set it is known that A and B are a pair with different meanings, and A and C are likewise a pair with different meanings, so it can be inferred that B and C are a pair with different meanings. From the first and third rows, since A and B are a pair with different meanings while A and D are a pair with the same meaning, B and D are a pair with different meanings. Similarly, the remaining expanded data in the right box of Fig. 2, derived by transfer from the original training sample set according to the similarity transfer principle, can be obtained; the derivations of the remaining expanded data are not repeated here one by one.
To ensure that the 0/1 label distribution ratio of the expanded training sample set is close to that of the original training sample set, the expanded data selected from the right box of Fig. 2 should keep the combined 0/1 label distribution ratio close to that of the original training sample set. Since the 0/1 label distribution ratio of the original training sample set is 2:3, one group of expanded data with label 1 and one group with label 0 can be selected from the right box of Fig. 2 and added to the original training sample set to form the expanded training sample set, so that the 0/1 label distribution ratio of the expanded training sample set (3:4) remains close to that of the original training sample set. Specifically, the expanded data in the first row of the right box of Fig. 2, together with any one row of the remaining 6 rows of expanded data, can be added to the original training sample set to form the expanded training sample set used for training the prediction sub-models.
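The transfer rules illustrated in Fig. 2 can be sketched as follows, assuming (per the text) that label 0 means "same meaning" and label 1 means "different meaning", and that two pairs sharing a common question propagate a label to the remaining two questions. The derivation for two "different" pairs (different ∧ different → different), which the figure shows, is a heuristic rather than a logical entailment, and the identifiers below are illustrative:

```python
def expand_pairs(samples):
    """Derive new labeled pairs by similarity transfer.

    samples: list of (q1, q2, label), label 0 = same meaning, 1 = different.
    For two pairs sharing exactly one question A:
      same(A,X)      and same(A,Y)      -> same(X,Y)
      same(A,X)      and different(A,Y) -> different(X,Y)
      different(A,X) and different(A,Y) -> different(X,Y)  (per the figure)
    """
    expanded = []
    seen = {frozenset((a, b)) for a, b, _ in samples}
    for idx, (a1, b1, l1) in enumerate(samples):
        for a2, b2, l2 in samples[idx + 1:]:
            shared = {a1, b1} & {a2, b2}
            if len(shared) != 1:
                continue
            x = b1 if a1 in shared else a1  # the non-shared question of pair 1
            y = b2 if a2 in shared else a2  # the non-shared question of pair 2
            key = frozenset((x, y))
            if key in seen:
                continue  # do not duplicate an already-known pair
            seen.add(key)
            # 0|0 -> 0 (same), 0|1 -> 1, 1|1 -> 1, matching the three rules
            expanded.append((x, y, l1 | l2))
    return expanded
```

Applied to the Fig. 2 example (A-B: 1, A-C: 1, A-D: 0, A-E: 0, A-F: 0), this yields the ten derived pairs from which the two rows with matching label ratio are then selected.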
Step A3: determine a training sample set of similar question pairs from the expanded training sample set based on an allocation function.

Usually, before determining the training sample set, each pair of training samples in the expanded training sample set needs to be sequentially numbered. For example, if the expanded training sample set contains 100 question pairs in total, the 100 question pairs are numbered from 0 to 100 in order.

The process of step A3 can be implemented by steps B1 to B3:

Step B1: determine a first number from the expanded training sample set by using the first function of the allocation function.

Specifically, the first function is: i=AllNumber*radom(0,1)+offset, where i denotes the first number, i<AllNumber, AllNumber denotes the length of the expanded training sample set, offset denotes an offset, offset<AllNumber, and offset is a positive integer.

Continuing with the example in which the expanded training sample set contains 100 question pairs, AllNumber is 100 and offset is set to 10. When the first number is determined for the first time, if the random number radom(0,1) is 0.1, the first number calculated by the first function is i=20. The offset can be set according to actual needs and is not limited here.

Step B2: determine a second number from the expanded training sample set based on the first number by using the second function of the allocation function.

The second function is: j=i+A%*AllNumber, where j denotes the second number, i≤j≤AllNumber, A is a positive integer, and 0≤A≤100.

If A is set to 20, then from i=20 it follows that j=40. A can be set according to actual needs and is not limited here.

Step B3: select the part of the expanded training sample set within the interval between the first number and the second number as the training sample set.

After the first number and the second number are obtained through the allocation function, they are matched against the sequential numbers of the expanded training sample set, and the training samples numbered from 20 to 40 in the expanded training sample set are taken as one training sample set.

Because radom(0,1) appears in the allocation function, the training sample set determined each time is also random.
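Steps B1 to B3 can be sketched as below. `radom(0,1)` is treated as a uniform random draw (keeping the publication's spelling as the parameter name), results are rounded down, and i is clamped so that i < AllNumber and j ≤ AllNumber as the constraints require; the function name and the clamping details are illustrative assumptions:

```python
import random

def allocate_subset(samples, offset=10, a_percent=20, radom=None):
    """Pick a random contiguous slice of the sequentially numbered expanded
    training sample set, using the allocation functions
        i = AllNumber * radom(0,1) + offset
        j = i + A% * AllNumber
    """
    radom = random.random() if radom is None else radom
    all_number = len(samples)
    i = min(int(all_number * radom) + offset, all_number - 1)  # i < AllNumber
    j = min(int(i + a_percent / 100 * all_number), all_number)  # j <= AllNumber
    return samples[i:j]
```

With 100 samples, offset=10, A=20 and a draw of 0.1, this reproduces the worked example above: i=20, j=40, so samples numbered 20 through 39 form one training sample set, and a fresh draw yields a different slice each time.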
Step A4: train the prediction model with the training sample set and a specific training sample set of similar question pairs to obtain the prediction sub-model.

The specific training sample set consists of training samples specifically collected according to the actual question pairs to be predicted, in order to enhance the prediction ability of the prediction sub-model. For example, since the present task is medical question-pair prediction, relying solely on the pre-training of the above three prediction models (all three are bert models) may not be sufficient; therefore, on the basis of bert, medical corpus samples obtained from the Internet are used to pre-train a medical bert as an enhancement.

The specific training sample set is determined as follows: a) question pairs are widely collected from websites; b) their similarity to the question pairs in the expanded training sample set is compared, where methods such as the Manhattan distance, the Euclidean distance or the Chebyshev distance can be used for the similarity comparison, which is not limited here; the medical corpus samples whose similarity is greater than a preset similarity are retained to form the specific training sample set.

The specific process of training the prediction model with the training sample set and the specific training sample set to obtain the prediction sub-model is as follows: train a first preset number of network layer parameters of the prediction model based on the training sample set until the loss function of the prediction model converges, to obtain a preliminary prediction model; then train a second preset number of network layer parameters of the preliminary prediction model based on the specific training sample set until the loss function of the preliminary prediction model converges, to obtain the prediction sub-model.

For example, the parameters of the first 5 network layers of the prediction model are trained with the training sample set to obtain the preliminary prediction model; the representation layer parameters of bert are then fine-tuned with the filtered specific training sample set to obtain the prediction sub-model.
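Step b) of the filtering above can be sketched as follows. The patent leaves the distance metric open (Manhattan, Euclidean or Chebyshev), so this sketch uses Euclidean distance over toy feature vectors; how questions are vectorized and the threshold value are illustrative assumptions:

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def filter_specific_samples(candidates, reference, max_distance):
    """Keep candidate vectors whose distance to at least one reference vector
    is within max_distance, i.e. whose similarity exceeds the preset level."""
    kept = []
    for c in candidates:
        if any(euclidean(c, r) <= max_distance for r in reference):
            kept.append(c)
    return kept
```

Candidates that are close to no question pair in the expanded training sample set are discarded, and the retained samples form the specific training sample set used in step A4.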
Based on the above description of prediction sub-model training, this embodiment provides another method for predicting pairs of similar questions, implemented on the basis of the above embodiment; this embodiment focuses on the specific implementation of obtaining the prediction result output by each prediction model. As shown in the flowchart of another method for predicting pairs of similar questions in Fig. 3, the method in this embodiment includes the following steps:

Step S302: input the pair of similar questions to be predicted into the plurality of prediction sub-models included in each prediction model, to obtain the prediction sub-result output by each prediction sub-model.

The plurality of prediction sub-models included in a prediction model are obtained by training the prediction model (for example, the roberta wwm large model) separately with multiple training sample sets determined by the allocation function, together with the specific training sample set. Since the training sample sets may differ, the internal parameters of the trained prediction sub-models may differ, and therefore the prediction sub-results output by the prediction sub-models may also differ.

In this embodiment, the description takes as an example the case where each prediction model is trained with 5 training sample sets determined by the allocation function, together with the specific training sample set, to obtain 5 prediction sub-models; the above three prediction models then yield 15 prediction sub-models.

Step S304: perform a voting operation on the plurality of prediction sub-results to obtain the prediction result.

One voting operation is performed over the 5 prediction sub-models included in each prediction model, to obtain the prediction result corresponding to that prediction model. Taking the 5 prediction sub-models of the roberta wwm large model as an example, if the prediction sub-results are 0, 0, 1, 0 and 0, the prediction result of the roberta wwm large model obtained by relative majority voting is 0; the prediction results of the roberta pair large model and the ernie model are obtained in the same way as that of the roberta wwm large model, which is not repeated here one by one. The voting method can be selected according to actual needs and is not limited here.

Step S306: perform a voting operation on the plurality of prediction results to obtain the final prediction result for the pair of similar questions to be predicted.

After the roberta wwm large model, the roberta pair large model and the ernie model each obtain their prediction result from the prediction sub-results of their prediction sub-models, one more voting operation is needed to obtain the final prediction result for the pair of similar questions to be predicted.

In the above method for predicting pairs of similar questions provided by the embodiments of the present invention, the prediction result of each prediction model is first obtained by a first voting operation over the prediction sub-results output by the prediction sub-models included in that model, and the final prediction result for the pair of similar questions to be predicted is then obtained by a second vote over the prediction results of the plurality of prediction models. In the present application, after voting inside each prediction model finishes, voting between prediction models is performed to generate the final prediction result; the two-level voting operation can enhance model credibility and improve the prediction accuracy of the models.
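The two-level vote (a plurality vote over each model's sub-models, then a plurality vote across the models) can be sketched as:

```python
from collections import Counter

def plurality(votes):
    """Relative-majority winner of a list of labels."""
    return Counter(votes).most_common(1)[0][0]

def two_level_vote(sub_results_per_model):
    """sub_results_per_model: one inner list of sub-model predictions per
    prediction model (e.g. 3 models x 5 sub-models = 15 sub-results).

    First vote inside each model, then vote across the model-level results."""
    model_results = [plurality(subs) for subs in sub_results_per_model]
    return plurality(model_results)
```

Using the text's numbers, sub-results of [0, 0, 1, 0, 0] for the first model vote to 0; with illustrative sub-results for the other two models the cross-model vote then yields the final label.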
Corresponding to the above method embodiments, an embodiment of the present invention provides an apparatus for predicting pairs of similar questions. Fig. 4 shows a schematic structural diagram of such an apparatus. As shown in Fig. 4, the apparatus includes:

an input module 402, configured to input a pair of similar questions to be predicted into a plurality of different prediction models and obtain the prediction result output by each prediction model, where a random perturbation parameter is added to the embedding layer of at least one prediction model; and

an operation module 404, configured to perform one voting operation on the plurality of prediction results to obtain the final prediction result for the pair of similar questions to be predicted.

An embodiment of the present invention provides an apparatus for predicting pairs of similar questions, in which a pair of similar questions to be predicted is input into a plurality of different prediction models and the prediction result output by each prediction model is obtained, where a random perturbation parameter is added to the embedding layer of at least one prediction model; a voting operation is performed on the plurality of prediction results to obtain the final prediction result. By adding a random perturbation parameter to the embedding layer of a prediction model, the present application effectively prevents overfitting caused by the model over-learning sample knowledge, and predicting pairs of similar questions with such models effectively improves prediction accuracy.
An embodiment of the present application further provides an electronic device. As shown in Fig. 5, a schematic structural diagram of the electronic device, the electronic device includes a processor 121 and a memory 120; the memory 120 stores computer-executable instructions that can be executed by the processor 121, and the processor 121 executes the computer-executable instructions to implement the above method for predicting pairs of similar questions.

In the implementation shown in Fig. 5, the electronic device further includes a bus 122 and a communication interface 123, where the processor 121, the communication interface 123 and the memory 120 are connected through the bus 122.

The memory 120 may include a high-speed random access memory (RAM) and may also include a non-volatile memory, for example at least one disk memory. The communication connection between this system network element and at least one other network element is implemented through at least one communication interface 123 (which may be wired or wireless), and the Internet, a wide area network, a local network, a metropolitan area network and the like can be used. The bus 122 may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus or the like. The bus 122 can be divided into an address bus, a data bus, a control bus and so on. For ease of representation, only one double-headed arrow is used in Fig. 5, but this does not mean that there is only one bus or one type of bus.

The processor 121 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above method can be completed by an integrated logic circuit of hardware in the processor 121 or by instructions in the form of software. The above processor 121 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of the hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory or a register. The storage medium is located in the memory, and the processor 121 reads the information in the memory and completes the steps of the method for predicting pairs of similar questions of the foregoing embodiments in combination with its hardware.
An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions which, when invoked and executed by a processor, cause the processor to implement the above method for predicting pairs of similar questions; for specific implementation, reference may be made to the foregoing method embodiments, which will not be repeated here.

The computer program products of the method, apparatus and electronic device for predicting pairs of similar questions provided by the embodiments of the present application include a computer-readable storage medium storing program code; the instructions included in the program code can be used to execute the methods described in the foregoing method embodiments. For specific implementation, reference may be made to the method embodiments, which will not be repeated here.

Unless otherwise specifically stated, the relative steps, numerical expressions and numerical values of the components and steps set forth in these embodiments do not limit the scope of the present application.

If the functions are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present application, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

In the description of the present application, it should be noted that orientations or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings, are only for the convenience of describing the present application and simplifying the description, and do not indicate or imply that the referred devices or elements must have a specific orientation or be constructed and operated in a specific orientation; therefore they cannot be understood as limiting the present application. In addition, the terms "first", "second" and "third" are only used for description purposes and cannot be understood as indicating or implying relative importance.

Finally, it should be noted that the above embodiments are only specific implementations of the present application, used to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field can still, within the technical scope disclosed in the present application, modify the technical solutions recorded in the foregoing embodiments, easily conceive of changes, or make equivalent replacements of some of the technical features; such modifications, changes or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (7)

  1. A method for predicting pairs of similar questions, characterized in that the method comprises:
    inputting a pair of similar questions to be predicted into a plurality of different prediction models, and obtaining the prediction result output by each prediction model, wherein a random perturbation parameter is added to the embedding layer of at least one prediction model;
    performing a voting operation on the plurality of prediction results to obtain the final prediction result for the pair of similar questions to be predicted;
    each prediction model comprises a plurality of prediction sub-models, and each prediction sub-model is obtained by training the prediction model with a specific training sample set of similar question pairs and a training sample set of similar question pairs determined by an allocation function;
    the step of obtaining the prediction result output by each prediction model comprises:
    inputting the pair of similar questions to be predicted into the plurality of prediction sub-models comprised in each prediction model, to obtain the prediction sub-result output by each prediction sub-model;
    performing a voting operation on the plurality of prediction sub-results to obtain the prediction result;
    the prediction sub-model is trained as follows:
    obtaining an original training sample set of similar question pairs;
    performing training sample expansion on the original training sample set by using the similarity transfer principle, to obtain an expanded training sample set of similar question pairs;
    determining a training sample set of similar question pairs from the expanded training sample set based on an allocation function;
    training the prediction model with the training sample set and the specific training sample set to obtain the prediction sub-model;
    wherein, after the expanded training sample set is obtained, the method further comprises:
    sequentially numbering each pair of training samples in the expanded training sample set;
    the step of determining the training sample set from the expanded training sample set based on the allocation function comprises:
    determining a first number from the expanded training sample set by using a first function of the allocation function;
    determining a second number from the expanded training sample set based on the first number by using a second function of the allocation function;
    selecting the part of the expanded training sample set within the interval between the first number and the second number as the training sample set.
  2. The method according to claim 1, characterized in that the first function is:
    i=AllNumber*radom(0,1)+offset;
    wherein i denotes the first number, i<AllNumber, AllNumber denotes the length of the expanded training sample set, offset denotes an offset, offset<AllNumber, and offset is a positive integer.
  3. The method according to claim 2, characterized in that the second function is:
    j=i+A%*AllNumber;
    wherein j denotes the second number, i≤j≤AllNumber, A is a positive integer, and 0≤A≤100.
  4. The method according to claim 1, characterized in that the similarity between each pair of specific training samples in the specific training sample set and the training sample set is greater than a preset similarity;
    the step of training the prediction model with the training sample set and the specific training sample set to obtain the prediction sub-model comprises:
    training a first preset number of network layer parameters of the prediction model based on the training sample set until the loss function of the prediction model converges, to obtain a preliminary prediction model;
    training a second preset number of network layer parameters of the preliminary prediction model based on the specific training sample set until the loss function of the preliminary prediction model converges, to obtain the prediction sub-model.
  5. The method according to claim 1, characterized in that the random perturbation parameter is generated by the following formula:
    Figure PCTCN2021083022-appb-100001
    wherein delta denotes the random perturbation parameter, a denotes a parameter factor, and -5≤a≤5.
  6. An apparatus for predicting pairs of similar questions, characterized in that the apparatus comprises:
    an input module, configured to input a pair of similar questions to be predicted into a plurality of different prediction models and obtain the prediction result output by each prediction model, wherein a random perturbation parameter is added to the embedding layer of at least one prediction model;
    an operation module, configured to perform a voting operation on the plurality of prediction results to obtain the final prediction result for the pair of similar questions to be predicted;
    each prediction model comprises a plurality of prediction sub-models, and each prediction sub-model is obtained by training the prediction model with a specific training sample set of similar question pairs and a training sample set of similar question pairs determined by an allocation function;
    the input module is further configured to input the pair of similar questions to be predicted into the plurality of prediction sub-models comprised in each prediction model, to obtain the prediction sub-result output by each prediction sub-model;
    and to perform a voting operation on the plurality of prediction sub-results to obtain the prediction result;
    the prediction sub-model is trained as follows:
    obtaining an original training sample set of similar question pairs;
    performing training sample expansion on the original training sample set by using the similarity transfer principle, to obtain an expanded training sample set of similar question pairs;
    determining a training sample set of similar question pairs from the expanded training sample set based on an allocation function;
    training the prediction model with the training sample set and the specific training sample set to obtain the prediction sub-model;
    wherein, after the expanded training sample set is obtained, each pair of training samples in the expanded training sample set is sequentially numbered;
    the step of determining the training sample set from the expanded training sample set based on the allocation function comprises:
    determining a first number from the expanded training sample set by using a first function of the allocation function;
    determining a second number from the expanded training sample set based on the first number by using a second function of the allocation function;
    selecting the part of the expanded training sample set within the interval between the first number and the second number as the training sample set.
  7. An electronic device, characterized by comprising a processor and a memory, wherein the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement the method according to any one of claims 1 to 5.
PCT/CN2021/083022 2020-11-02 2021-03-25 Method and apparatus for predicting pairs of similar questions, and electronic device WO2022088602A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/238,169 US20210241147A1 (en) 2020-11-02 2021-04-22 Method and device for predicting pair of similar questions and electronic equipment

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011200385.8A CN112017777B (zh) 2020-11-02 2020-11-02 Method and apparatus for predicting pairs of similar questions, and electronic device
CN202011200385.8 2020-11-02

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/238,169 Continuation US20210241147A1 (en) 2020-11-02 2021-04-22 Method and device for predicting pair of similar questions and electronic equipment

Publications (1)

Publication Number Publication Date
WO2022088602A1 true WO2022088602A1 (zh) 2022-05-05

Family

ID=73527986

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/083022 WO2022088602A1 (zh) 2020-11-02 2021-03-25 Method and apparatus for predicting pairs of similar questions, and electronic device

Country Status (2)

Country Link
CN (1) CN112017777B (zh)
WO (1) WO2022088602A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116956197A (zh) * 2023-09-14 2023-10-27 山东理工昊明新能源有限公司 Deep-learning-based energy facility fault prediction method and apparatus, and electronic device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112017777B (zh) * 2020-11-02 2021-02-26 北京妙医佳健康科技集团有限公司 相似对问题预测的方法、装置及电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845731A (zh) * 2017-02-20 2017-06-13 重庆邮电大学 Method for discovering potential phone-switching users based on multi-model fusion
CN108932647A (zh) * 2017-07-24 2018-12-04 上海宏原信息科技有限公司 Method and apparatus for predicting similar items and training a model therefor
US20190147350A1 (en) * 2016-04-27 2019-05-16 The Fourth Paradigm (Beijing) Tech Co Ltd Method and device for presenting prediction model, and method and device for adjusting prediction model
CN111611781A (zh) * 2020-05-27 2020-09-01 北京妙医佳健康科技集团有限公司 Data labeling method, question answering method, apparatus and electronic device
CN112017777A (zh) * 2020-11-02 2020-12-01 北京妙医佳健康科技集团有限公司 Method and apparatus for predicting pairs of similar questions, and electronic device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107133943B (zh) * 2017-04-26 2018-07-06 贵州电网有限责任公司输电运行检修分公司 Visual detection method for vibration damper defect detection

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190147350A1 (en) * 2016-04-27 2019-05-16 The Fourth Paradigm (Beijing) Tech Co Ltd Method and device for presenting prediction model, and method and device for adjusting prediction model
CN106845731A (zh) * 2017-02-20 2017-06-13 重庆邮电大学 Method for discovering potential phone-switching users based on multi-model fusion
CN108932647A (zh) * 2017-07-24 2018-12-04 上海宏原信息科技有限公司 Method and apparatus for predicting similar items and training a model therefor
CN111611781A (zh) * 2020-05-27 2020-09-01 北京妙医佳健康科技集团有限公司 Data labeling method, question answering method, apparatus and electronic device
CN112017777A (zh) * 2020-11-02 2020-12-01 北京妙医佳健康科技集团有限公司 Method and apparatus for predicting pairs of similar questions, and electronic device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116956197A (zh) * 2023-09-14 2023-10-27 山东理工昊明新能源有限公司 Deep-learning-based energy facility fault prediction method and apparatus, and electronic device
CN116956197B (zh) * 2023-09-14 2024-01-19 山东理工昊明新能源有限公司 Deep-learning-based energy facility fault prediction method and apparatus, and electronic device

Also Published As

Publication number Publication date
CN112017777A (zh) 2020-12-01
CN112017777B (zh) 2021-02-26

Similar Documents

Publication Publication Date Title
CN110147456B (zh) Image classification method and apparatus, readable storage medium, and terminal device
WO2022161202A1 (zh) Multimedia resource classification model training method and multimedia resource recommendation method
CN112270196B (zh) Entity relationship recognition method and apparatus, and electronic device
US11409964B2 (en) Method, apparatus, device and storage medium for evaluating quality of answer
US20230385553A1 (en) Techniques to add smart device information to machine learning for increased context
CN112287089B (zh) Classification model training and automatic question answering method and apparatus for an automatic question answering system
CN110704640A (zh) Representation learning method and apparatus for a knowledge graph
RU2664481C1 (ru) Method and system of selecting potentially erroneously ranked documents with use of machine learning algorithm
WO2022088602A1 (zh) Method and apparatus for predicting pairs of similar questions, and electronic device
US20240037403A1 (en) Information-aware graph contrastive learning
US11934790B2 (en) Neural network training method and apparatus, semantic classification method and apparatus and medium
CN113707299A (zh) Auxiliary diagnosis method and apparatus based on medical consultation sessions, and computer device
CN113449204B (zh) Social event classification method and apparatus based on a locally aggregated graph attention network
CN110377733A (zh) Text-based emotion recognition method, terminal device, and medium
WO2023159756A1 (zh) Price data processing method and apparatus, electronic device, and storage medium
WO2024001806A1 (zh) Federated-learning-based data value evaluation method and related device
CN114818682B (zh) Document-level entity relation extraction method based on adaptive entity path awareness
CN113988044B (zh) Method for determining the cause category of incorrectly answered questions
WO2024114659A1 (zh) Summary generation method and related device
US20210241147A1 (en) Method and device for predicting pair of similar questions and electronic equipment
CN113988085B (zh) Text semantic similarity matching method and apparatus, electronic device, and storage medium
CN114372518B (zh) Test question similarity calculation method based on solution ideas and knowledge points
CN113010687B (zh) Exercise label prediction method and apparatus, storage medium, and computer device
CN117009621A (zh) Information search method and apparatus, electronic device, storage medium, and program product
CN115438658A (zh) Entity recognition method, recognition model training method, and related apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21884346

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11/08/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21884346

Country of ref document: EP

Kind code of ref document: A1