CN110399466A - Screening method, apparatus, device and storage medium for question-and-answer data - Google Patents

Screening method, apparatus, device and storage medium for question-and-answer data

Info

Publication number
CN110399466A
Authority
CN
China
Prior art keywords
answer
information
question
source
described problem
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910706456.2A
Other languages
Chinese (zh)
Inventor
时鸿剑
冯欣伟
戴松泰
周环宇
余淼
袁鹏程
宋勋超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910706456.2A
Publication of CN110399466A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G06F16/3329: Natural language query formulation or dialogue systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3344: Query execution using natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This embodiment provides a screening method, apparatus, device and storage medium for question-and-answer data. The method comprises: determining, from a question-answer pair in a question answering system, the question, the answer and the source information of the answer that the pair contains; determining first information according to the question, the answer and a knowledge graph, the first information indicating whether the type of the answer matches the expected type; determining second information according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low; and finally screening the question-answer pair according to the first information and the second information. Accurately screening question-answer pairs reduces the waste of data resources on the one hand, and improves the accuracy of the question-and-answer data and the timeliness of interaction on the other.

Description

Screening method, apparatus, device and storage medium for question-and-answer data
Technical Field
Embodiments of the present invention relate to the technical field of intelligent interaction, and in particular to a screening method, apparatus, device and storage medium for question-and-answer data.
Background
With the continuous development of the intelligent interaction field, intelligent interaction products generally store a huge volume of question-and-answer data in order to improve the accuracy of their answers. Filtering this data can reduce the waste of data resources and can also improve interaction efficiency.
Methods for screening question-and-answer data in the prior art lack a judgment of the data quality of the question and the answer themselves. As a result, some low-quality data tends to be left unscreened, or some high-quality data is screened out together with the low-quality data. This in turn wastes data resources and impairs the timeliness of interaction on the one hand, and reduces the accuracy of the question-and-answer data and harms user experience on the other.
Summary of the Invention
Embodiments of the present invention provide a screening method, apparatus, device and storage medium for question-and-answer data, to solve the problems of high data resource consumption and low accuracy of question-and-answer data in the above scheme.
In a first aspect, the present invention provides a screening method for question-and-answer data, comprising:
Determining, according to a question-answer pair in a question answering system, the question, the answer and the source information of the answer included in the question-answer pair;
Determining first information according to the question, the answer and a knowledge graph, the first information indicating whether the type of the answer matches an expected type;
Determining second information according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low;
Screening the question-answer pair according to the first information and the second information.
Further, screening the question-answer pair according to the first information and the second information comprises:
If the first information indicates that the type of the answer does not match the expected type, or the second information indicates that the data quality of the question-answer pair is low, screening out the question-answer pair.
In one possible implementation, before determining, according to the question-answer pair in the question answering system, the question, the answer and the source information of the answer included in the question-answer pair, the method further comprises:
Acquiring the question-and-answer data of the question answering system, the question-and-answer data including a plurality of question-answer pairs.
Specifically, the method further comprises:
Acquiring, respectively, the entities included in the question, the answer and the source information.
In a specific implementation, determining the first information according to the question, the answer and the knowledge graph comprises:
Determining the expected type of the answer according to the entities of the question and the knowledge graph;
Determining whether the type of the answer matches the expected type, to obtain the first information.
In a specific implementation, determining the second information according to the question and the source information of the answer comprises:
Calculating the similarity between the question and the source information by means of a similarity operator, according to the question and the source information;
Acquiring the overlap ratio between the entities of the question and the entities of the source information of the answer;
Determining the second information according to the similarity and/or the overlap ratio.
In a second aspect, the present invention provides a screening apparatus for question-and-answer data, comprising:
A processing module, configured to determine, according to a question-answer pair in a question answering system, the question, the answer and the source information of the answer included in the question-answer pair;
The processing module is further configured to determine first information according to the question, the answer and a knowledge graph, the first information indicating whether the type of the answer matches an expected type;
The processing module is further configured to determine second information according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low;
A screening module, configured to screen the question-answer pair according to the first information and the second information.
Further, the screening module is specifically configured to:
If the first information indicates that the type of the answer does not match the expected type, or the second information indicates that the data quality of the question-answer pair is low, screen out the question-answer pair.
In a specific implementation, the apparatus further comprises:
An acquisition module, configured to acquire the question-and-answer data of the question answering system, the question-and-answer data including a plurality of question-answer pairs.
Specifically, the acquisition module is further configured to:
Acquire, respectively, the entities included in the question, the answer and the source information.
In a specific implementation, the processing module is specifically configured to:
Determine the expected type of the answer according to the entities of the question and the knowledge graph;
Determine whether the type of the answer matches the expected type, to obtain the first information.
In a specific implementation, the processing module is specifically configured to:
Calculate the similarity between the question and the source information by means of a similarity operator, according to the question and the source information;
Acquire the overlap ratio between the entities of the question and the entities of the source information of the answer;
Determine the second information according to the similarity and/or the overlap ratio.
In a third aspect, the present invention provides an electronic device, comprising: a memory and a processor;
The memory stores computer-executable instructions;
At least one said processor executes the computer-executable instructions stored in the memory, so that the processor performs the screening method for question-and-answer data according to the first aspect.
In a fourth aspect, the present invention provides a storage medium, comprising: a readable storage medium and a computer program, the computer program being used to implement the screening method for question-and-answer data according to the first aspect.
With the screening method, apparatus, device and storage medium for question-and-answer data provided in this embodiment, the question, the answer and the source information of the answer included in a question-answer pair are determined from the question-answer pair in the question answering system; first information is determined according to the question, the answer and a knowledge graph, the first information indicating whether the type of the answer matches the expected type; second information is determined according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low; and finally the question-answer pair is screened according to the first information and the second information. Accurately screening question-answer pairs reduces the waste of data resources on the one hand, and improves the accuracy of the question-and-answer data and the timeliness of interaction on the other.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of Embodiment 1 of the screening method for question-and-answer data provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of Embodiment 2 of the screening method for question-and-answer data provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of Embodiment 3 of the screening method for question-and-answer data provided by an embodiment of the present invention;
Fig. 4 is a schematic flowchart of Embodiment 4 of the screening method for question-and-answer data provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of Embodiment 1 of the screening apparatus for question-and-answer data provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of Embodiment 2 of the screening apparatus for question-and-answer data provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
The executing subject of this scheme is an electronic device. The electronic device may be a terminal device such as a smart speaker, a mobile phone or a personal computer (PC); or it may be a server; or it may be an industrial control device or the like; or the electronic device may be integrated into any of the above devices.
This scheme provides a screening method for question-and-answer data, applied to the above electronic device, for screening the question-and-answer data of any question answering system. Specifically, the method screens the multiple question-answer pairs in the question-and-answer data: it considers the question, the answer and the source information of the answer in each pair together, judges the data quality of each question-answer pair, and screens out pairs whose data quality is low, whose question is highly subjective, whose question has no definite answer, or whose answer is wrong.
For example, when processing the question "postal code of Xiazhuzhuang Street, Wuqing District", the question answering system obtains from existing data the correct answer "301700", which is the postal code of the corresponding area, and the screening method provided by this scheme can accurately determine that the data quality of this question-answer pair is high. When processing the question "what is 'Insect Notes' known as", the question answering system obtains from existing data the wrong answer "the founder of animal psychology", and the screening method provided by this scheme can determine that the data quality of this question-answer pair is low and screen it out.
The scheme is described in detail below through several specific embodiments.
Fig. 1 is a schematic flowchart of Embodiment 1 of the screening method for question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 1, the screening method comprises the following steps:
S101: determine, according to a question-answer pair in the question answering system, the question, the answer and the source information of the answer included in the question-answer pair.
In this step, each question-answer pair in the question answering system is disassembled to obtain the question, the answer and the source information of the answer that it contains. The source information of the answer may be information carried in the question-answer pair, or it may be determined from the answer. The source information may be a semantic sequence (a piece of text), or it may be a Uniform Resource Locator (URL), in which case the source information of the answer is obtained by following the URL link.
Taking the question-answer pair "postal code of Xiazhuzhuang Street, Wuqing District - 301700" as an example, disassembling the pair yields the question "postal code of Xiazhuzhuang Street, Wuqing District", the answer "301700", and the source information of the answer "Tianjin Wuqing Xiazhuzhuang Huayang Nianhua postal code".
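The disassembly in step S101 can be pictured with a minimal Python sketch. The record layout (a dictionary with the keys `question`, `answer` and `source`) and the names `QAPair` and `disassemble` are assumptions made for illustration; the patent does not specify how a question-answer pair is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """One question-answer pair after disassembly (hypothetical structure)."""
    question: str
    answer: str
    source_info: str  # a text snippet, or text already fetched from a URL


def disassemble(record: dict) -> QAPair:
    """Split a stored record into question, answer and answer source information."""
    return QAPair(
        question=record["question"].strip(),
        answer=record["answer"].strip(),
        # Assumption: a record without a source yields an empty string.
        source_info=record.get("source", "").strip(),
    )


pair = disassemble({
    "question": "postal code of Xiazhuzhuang Street, Wuqing District",
    "answer": "301700",
    "source": "Tianjin Wuqing Xiazhuzhuang Huayang Nianhua postal code",
})
print(pair.question, "->", pair.answer)
```

If the source information is a URL rather than text, a fetching step would sit before `disassemble`; it is left out here because the text does not describe it.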
S102: determine the first information according to the question, the answer and the knowledge graph.
The first information indicates whether the type of the answer matches the expected type.
The question obtained in step S101 is mapped into the knowledge graph to obtain the expected type of the answer. The type of the answer obtained in step S101 is then determined, and it is judged whether the type of the answer is consistent with the expected type, thereby obtaining the first information.
A knowledge graph is a semantic network consisting of nodes and the edges that link them. Nodes represent entities or concepts, and edges represent the various semantic relations between entities/concepts. In this scheme, the entities in the question are mapped onto the "head entity-relation" or "relation-tail entity" part of a "head entity-relation-tail entity" triple in the knowledge graph, so that the expected type of the remaining element (the tail entity or the head entity, respectively) can be inferred.
Again taking the question-answer pair "postal code of Xiazhuzhuang Street, Wuqing District - 301700" as an example, the entities in the question include "Wuqing District" and "postal code", which can be mapped onto a "head entity-relation" in the knowledge graph. It can then be inferred that the tail entity is a specific postal code. Combined with the entity "301700" in the answer, it follows that the type of the answer matches the inferred expected type.
S103: determine the second information according to the question and the source information of the answer.
The second information indicates whether the data quality of the question-answer pair is high or low.
In this step, the second information is determined according to the question and the source information of the answer. The question and the source information are both semantic sequences, and the data quality of the question-answer pair is determined by judging how consistent the two semantic sequences are. Further, the degree of consistency of the two semantic sequences can be expressed by a similarity and/or an overlap ratio.
S104: screen the question-answer pair according to the first information and the second information.
According to the first information obtained in step S102 and the second information obtained in step S103, it is determined whether the question-answer pair is screened out or retained.
In one possible implementation, this step includes: if the first information indicates that the type of the answer does not match the expected type, the question-answer pair is screened out; if the first information indicates that the type of the answer matches the expected type but the second information indicates that the data quality of the question-answer pair is low, the question-answer pair is also screened out; if the first information indicates that the type of the answer matches the expected type and the second information indicates that the data quality of the question-answer pair is high, the question-answer pair is retained.
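The decision rule of step S104 reduces to a logical AND of the two indications. A minimal sketch follows, assuming each pair has already been judged and that the first and second information are represented as booleans; the names `should_keep` and `screen` are hypothetical.

```python
from typing import Iterable, List, Tuple


def should_keep(type_matches: bool, quality_is_high: bool) -> bool:
    """Retain a pair only if the answer type matches and the data quality is high."""
    return type_matches and quality_is_high


def screen(judged_pairs: Iterable[Tuple[str, bool, bool]]) -> List[str]:
    """Return the pairs that survive screening.

    Each item is (pair identifier, first information, second information).
    """
    return [
        pair
        for pair, type_matches, quality_is_high in judged_pairs
        if should_keep(type_matches, quality_is_high)
    ]


judged = [
    ("pair A", True, True),    # type matches, quality high: retained
    ("pair B", False, True),   # type mismatch: screened out
    ("pair C", True, False),   # low quality: screened out
]
print(screen(judged))  # prints ['pair A']
```

Because a type mismatch alone is enough to screen a pair out, an implementation could skip computing the second information for such pairs.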
In the screening method for question-and-answer data provided by this embodiment of the present invention, the question, the answer and the source information of the answer included in a question-answer pair are determined from the question-answer pair in the question answering system; first information is determined according to the question, the answer and the knowledge graph, the first information indicating whether the type of the answer matches the expected type; second information is determined according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low; and finally the question-answer pair is screened according to the first information and the second information. Accurately screening question-answer pairs reduces the waste of data resources on the one hand, and improves the accuracy of the question-and-answer data and the timeliness of interaction on the other.
On the basis of the embodiment shown in Fig. 1, Fig. 2 is a schematic flowchart of Embodiment 2 of the screening method for question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 2, the screening method further includes the following step before step S101:
S105: acquire the question-and-answer data of the question answering system, the question-and-answer data including a plurality of question-answer pairs.
The question-and-answer data of the question answering system is acquired. The data may be stored locally or on a server, and it includes a plurality of question-answer pairs. Screening the question-and-answer data in this scheme specifically means making an accurate judgment, for each question-answer pair in the data, of whether to screen it out or retain it.
In this embodiment, acquiring the question-and-answer data of the question answering system ensures that the data to be screened is obtained accurately, so that the question-and-answer data of the question answering system can subsequently be screened comprehensively.
On the basis of the above embodiments, the screening method of this scheme further includes: acquiring, respectively, the entities included in the question, the answer and the source information. It should be understood that the entities acquired in this scheme are the important entities in the question, the answer and the source information. For example, the entities of the question "postal code of Xiazhuzhuang Street, Wuqing District" include "Wuqing District", "Xiazhuzhuang Street" and "postal code", of which "Wuqing District" and "postal code" are important entities and "Xiazhuzhuang Street" is an unimportant entity. Further, the important entities can be obtained by scoring the entities, for example by means of a model, and are taken as the entities used in this scheme.
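Selecting important entities by score can be sketched as a simple threshold filter. The scores and the threshold below are invented for illustration only; the text says the scores may come from a model but does not say which one, so `important_entities` is a hypothetical helper.

```python
from typing import Dict, List


def important_entities(scored: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep the entities whose importance score reaches the threshold."""
    return [entity for entity, score in scored.items() if score >= threshold]


# Hypothetical scores; in practice they would be produced by a scoring model.
scores = {
    "Wuqing District": 0.9,
    "Xiazhuzhuang Street": 0.3,
    "postal code": 0.8,
}
print(important_entities(scores))  # prints ['Wuqing District', 'postal code']
```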
On the basis of the above embodiments, Fig. 3 is a schematic flowchart of Embodiment 3 of the screening method for question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 3, step S102, determining the first information according to the question, the answer and the knowledge graph, specifically includes the following steps:
S1021: determine the expected type of the answer according to the entities of the question and the knowledge graph.
In this step, the entities of the question are mapped into the knowledge graph to obtain the expected type of the answer.
Again taking the question-answer pair "postal code of Xiazhuzhuang Street, Wuqing District - 301700" as an example, the entities in the question include "Wuqing District" and "postal code", which can be mapped onto a "head entity-relation" in the knowledge graph. It can then be inferred that the tail entity is a specific postal code.
S1022: determine whether the type of the answer matches the expected type, to obtain the first information.
The type of the answer is the type of the entity of the answer. Based on the type of the answer, it is judged whether the type of the answer is consistent with the expected type, thereby obtaining the first information.
Again taking the question-answer pair "postal code of Xiazhuzhuang Street, Wuqing District - 301700" as an example, combined with the entity "301700" in the answer, it follows that the type of the answer matches the inferred expected type.
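Steps S1021 and S1022 can be illustrated with a toy lookup. This is only a sketch under strong assumptions: the knowledge graph is reduced to a dictionary from ("head entity", "relation") to the type of the tail entity, and the answer type is guessed with a regular expression that treats six digits as a postal code. `TOY_KG`, `expected_answer_type`, `answer_type` and `first_information` are all hypothetical names, and a real knowledge graph and entity typer would replace them.

```python
import re
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical toy knowledge graph: (head entity, relation) -> type of the tail entity.
TOY_KG: Dict[Tuple[str, str], str] = {
    ("Wuqing District", "postal code"): "postal_code",
}


def expected_answer_type(question_entities: Sequence[str],
                         kg: Dict[Tuple[str, str], str] = TOY_KG) -> Optional[str]:
    """S1021: map question entities onto a (head, relation) edge and return the tail type."""
    for head in question_entities:
        for relation in question_entities:
            tail_type = kg.get((head, relation))
            if tail_type is not None:
                return tail_type
    return None  # the question could not be mapped into the graph


def answer_type(answer: str) -> str:
    """Rough type guess for an answer (assumption: six digits means a postal code)."""
    return "postal_code" if re.fullmatch(r"\d{6}", answer) else "text"


def first_information(question_entities: Sequence[str], answer: str) -> bool:
    """S1022: True if the answer type matches the expected type."""
    expected = expected_answer_type(question_entities)
    return expected is not None and answer_type(answer) == expected


print(first_information(["Wuqing District", "postal code"], "301700"))
```

In this sketch a question that cannot be mapped into the graph is treated as a mismatch; the text does not say how that case is handled, so this is a design choice of the example.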
In this embodiment, the expected type of the answer is determined according to the entities of the question and the knowledge graph, and whether the type of the answer matches the expected type is determined to obtain the first information. Determining whether the answer corresponds to what the question asks ensures that the answer matches the question.
On the basis of the above embodiments, Fig. 4 is a schematic flowchart of Embodiment 4 of the screening method for question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 4, step S103, determining the second information according to the question and the source information of the answer, specifically includes the following steps:
S1031: calculate the similarity between the question and the source information by means of a similarity operator, according to the question and the source information.
In general, the question and the source information are two semantic sequences, so the similarity operator used to calculate their similarity is one that operates between semantic sequences (that is, between sentences). The similarity between the question and the source information is obtained by calculation with this operator. Alternatively, a model may be built on the similarity operator, and the question and the source information are input into the model to obtain their similarity.
In a specific example, calculating the similarity between the two semantic sequences "postal code of Xiazhuzhuang Street, Wuqing District" and "Tianjin Wuqing Xiazhuzhuang Huayang Nianhua postal code" yields a similarity of 0.889 between the two sentences.
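The text does not name the similarity operator. As a stand-in, the sketch below uses character-level matching from Python's standard `difflib` module, which is an assumption and not the operator of the scheme; it therefore will not reproduce the value 0.889 quoted above. A sentence-embedding model with cosine similarity would be another plausible choice.

```python
from difflib import SequenceMatcher


def sentence_similarity(question: str, source_info: str) -> float:
    """Return a similarity score in [0, 1] between two text sequences.

    Assumption: difflib's ratio stands in for the unspecified similarity operator.
    """
    return SequenceMatcher(None, question.lower(), source_info.lower()).ratio()


question = "Wuqing District Xiazhuzhuang Street postal code"
source_info = "Tianjin Wuqing Xiazhuzhuang Huayang Nianhua postal code"
score = sentence_similarity(question, source_info)
print(f"similarity = {score:.3f}")
```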
S1032: acquire the overlap ratio between the entities of the question and the entities of the source information of the answer.
In this step, the overlap ratio between the entities of the question and the entities of the source information of the answer reflects the correlation between the question and the source information of the answer, and thus the data quality of the question-answer pair.
For example, the entities of the question are "Wuqing District" and "postal code", and the entities of the source information of the answer are "Tianjin", "Wuqing" and "postal code". The question and the source information contain similar or identical entities, and the overlap ratio of the two sets of entities can be obtained by calculation.
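The text does not give a formula for the overlap ratio. One plausible reading, used in the sketch below as an assumption, is the share of question entities that have an identical or similar counterpart among the source entities, where "similar" is approximated by case-insensitive substring containment (so "Wuqing" matches "Wuqing District"). The names `entities_match` and `overlap_ratio` are hypothetical.

```python
from typing import Sequence


def entities_match(a: str, b: str) -> bool:
    """Treat two entities as overlapping if one contains the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a in b or b in a


def overlap_ratio(question_entities: Sequence[str], source_entities: Sequence[str]) -> float:
    """Share of question entities that also appear, exactly or similarly, in the source."""
    if not question_entities:
        return 0.0
    matched = sum(
        1
        for q in question_entities
        if any(entities_match(q, s) for s in source_entities)
    )
    return matched / len(question_entities)


ratio = overlap_ratio(
    ["Wuqing District", "postal code"],
    ["Tianjin", "Wuqing", "postal code"],
)
print(ratio)  # prints 1.0
```

Other definitions, such as a Jaccard ratio over both entity sets, would give a lower value for the same example because "Tianjin" has no counterpart in the question.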
S1033: determine the second information according to the similarity and/or the overlap ratio.
This step has three possible implementations:
Mode one: determine the second information according to the similarity obtained in step S1031. For example, the obtained similarity is compared with a preset value. If the similarity is greater than the preset value, the resulting second information indicates that the data quality of the question-answer pair is high; otherwise, the second information indicates that the data quality of the question-answer pair is low.
For example, the similarity calculated between the two semantic sequences of the question "postal code of Xiazhuzhuang Street, Wuqing District" and the source information "Tianjin Wuqing Xiazhuzhuang Huayang Nianhua postal code" is 0.889. Assuming the preset value of the similarity is 0.7, the data quality of the question-answer pair is determined to be high.
Mode two: determine the second information according to the overlap ratio obtained in step S1032. For example, the obtained overlap ratio is compared with a preset value. If the overlap ratio is greater than the preset value, the resulting second information indicates that the data quality of the question-answer pair is high; otherwise, the second information indicates that the data quality of the question-answer pair is low.
For example, the entities of the question are "Wuqing District" and "postal code", and the entities of the source information of the answer are "Tianjin", "Wuqing" and "postal code". The question and the source information contain similar or identical entities, and calculation shows that the overlap ratio of the two sets of entities is greater than the preset value of 80%, so the data quality of the question-answer pair is determined to be high.
Mode three: determine the second information according to both the similarity and the overlap ratio. When the similarity and the overlap ratio are both high, specifically when each exceeds its own preset value, the second information indicates that the data quality of the question-answer pair is high; otherwise, the second information indicates that the data quality of the question-answer pair is low.
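The three modes can be folded into one function that uses whichever measurements are supplied. This is a minimal sketch: the default thresholds 0.7 and 0.8 are the example preset values from the text, while the function name, the optional-argument design and the boolean return value are assumptions.

```python
from typing import Optional


def second_information(similarity: Optional[float] = None,
                       overlap: Optional[float] = None,
                       similarity_threshold: float = 0.7,
                       overlap_threshold: float = 0.8) -> bool:
    """Return True if the data quality of the pair is judged high.

    Mode one: only similarity given. Mode two: only overlap given.
    Mode three: both given, and both must exceed their preset values.
    """
    if similarity is None and overlap is None:
        raise ValueError("at least one of similarity or overlap is required")
    checks = []
    if similarity is not None:
        checks.append(similarity > similarity_threshold)
    if overlap is not None:
        checks.append(overlap > overlap_threshold)
    return all(checks)


print(second_information(similarity=0.889))               # mode one: prints True
print(second_information(overlap=1.0))                    # mode two: prints True
print(second_information(similarity=0.889, overlap=0.5))  # mode three: prints False
```

The comparison is strict ("greater than the preset value"), following the wording of modes one and two.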
In this embodiment, the similarity between the question and the source information is calculated by means of a similarity operator, the overlap ratio between the entities of the question and the entities of the source information of the answer is acquired, and the second information is then determined according to the similarity and/or the overlap ratio. The data quality of the question-answer pair can thus be judged in several ways, and determining whether it is high or low makes it easy to decide whether the question-answer pair needs to be screened out.
Fig. 5 is the structural schematic diagram of the screening plant embodiment one of question and answer data provided in an embodiment of the present invention, such as Fig. 5 institute Show, the screening plant 10 of the question and answer data, comprising:
Processing module 11, for determining that described problem answer is asked include according to the answer pair of the problems in question answering system It inscribes, the source-information of answer and the answer;
The processing module 11 is also used to determine the first information according to described problem, the answer and knowledge mapping, described The first information is for indicating whether the type of the answer meets expected type;
The processing module 11 is also used to the source-information according to described problem and the answer, determines the second information, institute State the height of the quality of data of second information for indicating described problem answer pair;
Screening module 12, for according to the first information and second information, to described problem answer to sieving Choosing.
The screening plant 10 of question and answer data provided in this embodiment, comprising: processing module 11 and screening module 12.According to asking Answer intersystem problem answer pair, determine problem answers to the problem of including, the source-information of answer and the answer, and root According to problem, answer and knowledge mapping, the first information is determined, the first information is for indicating whether the type of the answer meets It is expected that type determines the second information, second information is for indicating described problem further according to the source-information of problem and answer The height of the quality of data of answer pair, finally according to the first information with the second information, to problem answers to screening, by right The accurate screening of problem answers pair, on the one hand reduces the waste of data resource, on the other hand improves the accurate of question and answer data Property and interaction timeliness.
In a specific implementation, the screening module is specifically configured to:
If the first information indicates that the type of the answer does not match the expected type, or the second information indicates that the data quality of the question-answer pair is low, screen out the question-answer pair.
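The screening rule above can be sketched as follows. This is a minimal illustration only: the names `ScreeningSignals` and `should_screen_out` are hypothetical, and representing the first and second information as booleans is an assumption, since the text does not fix their encoding.

```python
from dataclasses import dataclass


@dataclass
class ScreeningSignals:
    # Hypothetical container; the boolean encoding is an assumption.
    type_matches_expected: bool  # the "first information"
    quality_is_high: bool        # the "second information"


def should_screen_out(signals: ScreeningSignals) -> bool:
    """Screen out a pair when the answer type does not match the expected
    type, or when the data quality of the pair is low."""
    return (not signals.type_matches_expected) or (not signals.quality_is_high)
```

A pair is kept only when both signals are favourable; either failing signal alone is enough to screen the pair out.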
On the basis of the embodiment shown in Fig. 5, Fig. 6 is a schematic structural diagram of Embodiment 2 of the screening device for question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 6, the screening device 10 for question-and-answer data further includes:
An acquisition module 13, configured to obtain the question-and-answer data of the question answering system, the question-and-answer data including multiple question-answer pairs.
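One possible in-memory shape for the question-and-answer data handled by the acquisition module is sketched below. The class `QAPair`, the function `obtain_qa_pairs`, and the record keys `question`, `answer`, and `source_info` are all hypothetical names chosen for illustration; the text does not specify a storage format.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class QAPair:
    # Each pair carries the three items the processing module works on.
    question: str
    answer: str
    source_info: str  # source information of the answer


def obtain_qa_pairs(records: Iterable[Dict[str, str]]) -> List[QAPair]:
    """Convert raw records (assumed to be dicts with the three keys below)
    into a list of question-answer pairs."""
    return [
        QAPair(r["question"], r["answer"], r["source_info"])
        for r in records
    ]
```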
In a specific implementation, the acquisition module is further configured to:
Obtain the entities included in the question, in the answer, and in the source information, respectively.
In a specific implementation, the processing module is specifically configured to:
Determine the expected type of the answer according to the entity of the question and the knowledge graph;
Determine whether the type of the answer matches the expected type, so as to obtain the first information.
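The type check above can be illustrated with a toy knowledge graph. Everything in this sketch is an assumption made for illustration: the dictionaries `ENTITY_TYPES` and `ANSWER_TYPE_SCHEMA`, their contents, the `relation` argument (the text only mentions the entity of the question and the knowledge graph), and the choice to treat an unknown expected type as a mismatch.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy knowledge graph: entity -> type.
ENTITY_TYPES: Dict[str, str] = {
    "Marie Curie": "Person",
    "Pierre Curie": "Person",
    "Warsaw": "Place",
}

# Hypothetical schema: (type of question entity, relation) -> expected answer type.
ANSWER_TYPE_SCHEMA: Dict[Tuple[str, str], str] = {
    ("Person", "spouse"): "Person",
    ("Person", "birthplace"): "Place",
}


def expected_answer_type(question_entity: str, relation: str) -> Optional[str]:
    """Look up the expected answer type from the question entity and the graph."""
    entity_type = ENTITY_TYPES.get(question_entity)
    if entity_type is None:
        return None
    return ANSWER_TYPE_SCHEMA.get((entity_type, relation))


def first_information(question_entity: str, relation: str, answer_entity: str) -> bool:
    """Return True when the answer's type matches the expected type.
    Assumption: an unknown expected type counts as a mismatch."""
    expected = expected_answer_type(question_entity, relation)
    actual = ENTITY_TYPES.get(answer_entity)
    return expected is not None and actual == expected
```

With this toy data, a question about the spouse of "Marie Curie" expects a `Person`, so the answer "Pierre Curie" matches while the answer "Warsaw" does not.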
In a specific implementation, the processing module is specifically configured to:
Calculate, by means of a similarity operator, the similarity between the question and the source information according to the question and the source information;
Obtain the overlap ratio between the entities of the question and the entities of the source information of the answer;
Determine the second information according to the similarity and/or the overlap ratio.
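The quality signal described above can be sketched as follows. The text does not name the similarity operator, so token-level Jaccard similarity is substituted here as a stand-in. The thresholds, the choice of the question's entities as the denominator of the overlap ratio, and the "or" combination of the two scores are likewise assumptions; all function and parameter names are hypothetical.

```python
from typing import Set


def jaccard_similarity(a: str, b: str) -> float:
    """Token-level Jaccard similarity (a stand-in for the unspecified
    similarity operator)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def entity_overlap_ratio(question_entities: Set[str], source_entities: Set[str]) -> float:
    """Fraction of the question's entities that also occur in the source
    information (the denominator is an assumption)."""
    if not question_entities:
        return 0.0
    return len(question_entities & source_entities) / len(question_entities)


def second_information(
    question: str,
    source_info: str,
    question_entities: Set[str],
    source_entities: Set[str],
    sim_threshold: float = 0.3,      # hypothetical threshold
    overlap_threshold: float = 0.5,  # hypothetical threshold
) -> bool:
    """Return True when the pair is judged to be of high data quality,
    i.e. when either score reaches its threshold."""
    sim = jaccard_similarity(question, source_info)
    overlap = entity_overlap_ratio(question_entities, source_entities)
    return sim >= sim_threshold or overlap >= overlap_threshold
```

Combining the two scores with "or" is only one reading of "and/or"; a stricter variant would require both scores to reach their thresholds.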
The device provided by this embodiment can be used to execute the technical solutions of the foregoing method embodiments. Its implementation principle and technical effects are similar and are not repeated here.
Fig. 7 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present invention. As shown in Fig. 7, the electronic device 20 of this embodiment includes a processor 201 and a memory 202, wherein:
The memory 202 is configured to store computer-executable instructions;
The processor 201 is configured to execute the computer-executable instructions stored in the memory, so as to implement the screening method for question-and-answer data described in any of the foregoing embodiments. For details, refer to the related description in the foregoing method embodiments.
Optionally, the memory 202 may be either independent of the processor 201 or integrated with it.
When the memory 202 is provided independently, the electronic device further includes a bus 203 for connecting the memory 202 and the processor 201.
An embodiment of the present invention further provides a computer-readable storage medium. The computer-readable storage medium stores computer-executable instructions, and when a processor executes the computer-executable instructions, the screening method for question-and-answer data described above is implemented.
In the several embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into modules is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple modules may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, each module may exist physically on its own, or two or more modules may be integrated into one unit. The unit formed from the above modules may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The above integrated module, when implemented in the form of a software functional module, may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods described in the embodiments of the present application.
It should be understood that the above processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the present invention may be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory may include high-speed RAM and may also include non-volatile memory (NVM), for example at least one magnetic disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, an optical disc, or the like.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, and so on. For ease of representation, the bus in the drawings of the present application is not limited to only one bus or one type of bus.
The above storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. The storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be an integral part of the processor. The processor and the storage medium may be located in an application-specific integrated circuit (ASIC). The processor and the storage medium may also exist as discrete components in the electronic device or the master control device.
Those of ordinary skill in the art will understand that all or some of the steps of the foregoing method embodiments may be completed by hardware under the direction of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical discs.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (14)

1. A screening method for question-and-answer data, characterized by comprising:
Determining, according to a question-answer pair in a question answering system, the question, the answer, and the source information of the answer that are included in the question-answer pair;
Determining first information according to the question, the answer, and a knowledge graph, the first information indicating whether the type of the answer matches an expected type;
Determining second information according to the question and the source information of the answer, the second information indicating how high the data quality of the question-answer pair is;
Screening the question-answer pair according to the first information and the second information.
2. The method according to claim 1, characterized in that screening the question-answer pair according to the first information and the second information comprises:
If the first information indicates that the type of the answer does not match the expected type, or the second information indicates that the data quality of the question-answer pair is low, screening out the question-answer pair.
3. The method according to claim 1 or 2, characterized in that, before determining, according to the question-answer pair in the question answering system, the question, the answer, and the source information of the answer that are included in the question-answer pair, the method further comprises:
Obtaining the question-and-answer data of the question answering system, the question-and-answer data including multiple question-answer pairs.
4. The method according to claim 1 or 2, characterized in that the method further comprises:
Obtaining the entities included in the question, in the answer, and in the source information, respectively.
5. The method according to claim 4, characterized in that determining the first information according to the question, the answer, and the knowledge graph comprises:
Determining the expected type of the answer according to the entity of the question and the knowledge graph;
Determining whether the type of the answer matches the expected type, so as to obtain the first information.
6. The method according to claim 4, characterized in that determining the second information according to the question and the source information of the answer comprises:
Calculating, by means of a similarity operator, the similarity between the question and the source information according to the question and the source information;
Obtaining the overlap ratio between the entities of the question and the entities of the source information of the answer;
Determining the second information according to the similarity and/or the overlap ratio.
7. A screening device for question-and-answer data, characterized by comprising:
A processing module, configured to determine, according to a question-answer pair in a question answering system, the question, the answer, and the source information of the answer that are included in the question-answer pair;
The processing module is further configured to determine first information according to the question, the answer, and a knowledge graph, the first information indicating whether the type of the answer matches an expected type;
The processing module is further configured to determine second information according to the question and the source information of the answer, the second information indicating how high the data quality of the question-answer pair is;
A screening module, configured to screen the question-answer pair according to the first information and the second information.
8. The device according to claim 7, characterized in that the screening module is specifically configured to:
If the first information indicates that the type of the answer does not match the expected type, or the second information indicates that the data quality of the question-answer pair is low, screen out the question-answer pair.
9. The device according to claim 7 or 8, characterized in that the device further comprises:
An acquisition module, configured to obtain the question-and-answer data of the question answering system, the question-and-answer data including multiple question-answer pairs.
10. The device according to claim 7 or 8, characterized in that the acquisition module is further configured to:
Obtain the entities included in the question, in the answer, and in the source information, respectively.
11. The device according to claim 10, characterized in that the processing module is specifically configured to:
Determine the expected type of the answer according to the entity of the question and the knowledge graph;
Determine whether the type of the answer matches the expected type, so as to obtain the first information.
12. The device according to claim 10, characterized in that the processing module is specifically configured to:
Calculate, by means of a similarity operator, the similarity between the question and the source information according to the question and the source information;
Obtain the overlap ratio between the entities of the question and the entities of the source information of the answer;
Determine the second information according to the similarity and/or the overlap ratio.
13. An electronic device, characterized by comprising: a memory and a processor;
The memory stores computer-executable instructions;
The processor executes the computer-executable instructions stored in the memory, causing the processor to perform the screening method for question-and-answer data according to any one of claims 1 to 6.
14. A storage medium, characterized by comprising: a readable storage medium and a computer program, the computer program being used to implement the screening method for question-and-answer data according to any one of claims 1 to 6.
CN201910706456.2A 2019-08-01 2019-08-01 Screening technique, device, equipment and the storage medium of question and answer data Pending CN110399466A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910706456.2A CN110399466A (en) 2019-08-01 2019-08-01 Screening technique, device, equipment and the storage medium of question and answer data

Publications (1)

Publication Number Publication Date
CN110399466A true CN110399466A (en) 2019-11-01

Family

ID=68327257

Country Status (1)

Country Link
CN (1) CN110399466A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191101