CN110399466A - Screening technique, device, equipment and the storage medium of question and answer data - Google Patents
Screening technique, device, equipment and the storage medium of question and answer data
- Publication number
- CN110399466A (application number CN201910706456.2)
- Authority
- CN
- China
- Prior art keywords
- answer
- information
- question
- source
- described problem
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This embodiment provides a method, apparatus, device and storage medium for screening question-and-answer data. The method comprises: for a question-answer pair in a question-answering system, determining the question, the answer and the source information of the answer contained in the pair; determining first information according to the question, the answer and a knowledge graph, the first information indicating whether the type of the answer matches the expected type; determining second information according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low; and finally screening the question-answer pair according to the first information and the second information. Accurate screening of question-answer pairs reduces the waste of data resources on the one hand, and improves the accuracy of the question-and-answer data and the timeliness of interaction on the other.
Description
Technical field
Embodiments of the present invention relate to the technical field of intelligent interaction, and in particular to a method, apparatus, device and storage medium for screening question-and-answer data.
Background technique
With the continuous development of the intelligent interaction field, intelligent interaction products generally store a huge volume of question-and-answer data in order to improve the accuracy of their answers. Filtering this question-and-answer data reduces the waste of data resources and also improves the efficiency of interaction.
Prior-art methods for screening question-and-answer data lack any judgment of the data quality of the question and the answer themselves. As a result, some low-quality data is easily left unscreened, or some high-quality data is screened out together with the low-quality data. On the one hand, this wastes data resources and impairs the timeliness of interaction; on the other hand, it reduces the accuracy of the question-and-answer data and degrades the user experience.
Summary of the invention
Embodiments of the present invention provide a method, apparatus, device and storage medium for screening question-and-answer data, to address the high consumption of data resources and the low accuracy of question-and-answer data in the above schemes.
In a first aspect, the present invention provides a method for screening question-and-answer data, comprising:
Determining, for a question-answer pair in a question-answering system, the question, the answer and the source information of the answer contained in the question-answer pair;
Determining first information according to the question, the answer and a knowledge graph, the first information indicating whether the type of the answer matches an expected type;
Determining second information according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low;
Screening the question-answer pair according to the first information and the second information.
Further, screening the question-answer pair according to the first information and the second information comprises:
If the first information indicates that the type of the answer does not match the expected type, or the second information indicates that the data quality of the question-answer pair is low, screening out the question-answer pair.
In one possible implementation, before determining the question, the answer and the source information of the answer contained in the question-answer pair, the method further comprises:
Obtaining the question-and-answer data of the question-answering system, the question-and-answer data comprising multiple question-answer pairs.
Specifically, the method further comprises:
Obtaining the entities contained in the question, in the answer and in the source information, respectively.
In a specific implementation, determining the first information according to the question, the answer and the knowledge graph comprises:
Determining the expected type of the answer according to the entities of the question and the knowledge graph;
Determining whether the type of the answer matches the expected type, to obtain the first information.
In a specific implementation, determining the second information according to the question and the source information of the answer comprises:
Calculating the similarity between the question and the source information by means of a similarity operator;
Obtaining the overlap ratio between the entities of the question and the entities of the source information of the answer;
Determining the second information according to the similarity and/or the overlap ratio.
In a second aspect, the present invention provides an apparatus for screening question-and-answer data, comprising:
A processing module, configured to determine, for a question-answer pair in a question-answering system, the question, the answer and the source information of the answer contained in the question-answer pair;
The processing module is further configured to determine first information according to the question, the answer and a knowledge graph, the first information indicating whether the type of the answer matches an expected type;
The processing module is further configured to determine second information according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low;
A screening module, configured to screen the question-answer pair according to the first information and the second information.
Further, the screening module is specifically configured to:
Screen out the question-answer pair if the first information indicates that the type of the answer does not match the expected type, or the second information indicates that the data quality of the question-answer pair is low.
In a specific implementation, the apparatus further comprises:
An acquisition module, configured to obtain the question-and-answer data of the question-answering system, the question-and-answer data comprising multiple question-answer pairs.
Specifically, the acquisition module is further configured to:
Obtain the entities contained in the question, in the answer and in the source information, respectively.
In a specific implementation, the processing module is specifically configured to:
Determine the expected type of the answer according to the entities of the question and the knowledge graph;
Determine whether the type of the answer matches the expected type, to obtain the first information.
In a specific implementation, the processing module is specifically configured to:
Calculate the similarity between the question and the source information by means of a similarity operator;
Obtain the overlap ratio between the entities of the question and the entities of the source information of the answer;
Determine the second information according to the similarity and/or the overlap ratio.
In a third aspect, the present invention provides an electronic device, comprising a memory and at least one processor;
The memory stores computer-executable instructions;
The at least one processor executes the computer-executable instructions stored in the memory, so that the processor performs the method for screening question-and-answer data according to the first aspect.
In a fourth aspect, the present invention provides a storage medium, comprising a readable storage medium and a computer program, the computer program being used to implement the method for screening question-and-answer data according to the first aspect.
In the method, apparatus, device and storage medium for screening question-and-answer data provided by this embodiment, the question, the answer and the source information of the answer contained in a question-answer pair of a question-answering system are determined; first information is determined according to the question, the answer and a knowledge graph, the first information indicating whether the type of the answer matches the expected type; second information is determined according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low; and finally the question-answer pair is screened according to the first information and the second information. Accurate screening of question-answer pairs reduces the waste of data resources on the one hand, and improves the accuracy of the question-and-answer data and the timeliness of interaction on the other.
Brief description of the drawings
In order to explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without any creative effort.
Fig. 1 is a schematic flowchart of Embodiment 1 of the method for screening question-and-answer data provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of Embodiment 2 of the method for screening question-and-answer data provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of Embodiment 3 of the method for screening question-and-answer data provided by an embodiment of the present invention;
Fig. 4 is a schematic flowchart of Embodiment 4 of the method for screening question-and-answer data provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of Embodiment 1 of the apparatus for screening question-and-answer data provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of Embodiment 2 of the apparatus for screening question-and-answer data provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present invention.
Detailed description of the embodiments
In order to make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The execution subject of this solution is an electronic device. The electronic device may be a terminal device such as a smart speaker, a mobile phone or a personal computer (PC); or it may be a server; or it may be an industrial control device or the like; or the electronic device may be integrated into any of the above devices.
This solution provides a method for screening question-and-answer data. The method is applied to the above electronic device and is used to screen the question-and-answer data of any question-answering system. Specifically, the method screens the multiple question-answer pairs in the question-and-answer data: it combines the question, the answer and the source information of the answer in each question-answer pair to judge the data quality of that pair, and screens out pairs whose data quality is low, whose question is highly subjective, whose question has no clear answer, or whose answer is wrong.
For example, when processing the question "Wuqing District Xiazhuzhuang Street postal code", the question-answering system can obtain from existing data the correct answer "301700", the postal code of the corresponding area, and the screening method provided by this solution can accurately determine that the data quality of this question-answer pair is high. When processing the question "what is Insect Notes known as", the question-answering system obtains from existing data the wrong answer "the founder of animal psychology", and the screening method provided by this solution can determine that the data quality of this question-answer pair is low and screen it out.
The solution is described in detail below through several specific embodiments.
Fig. 1 is a schematic flowchart of Embodiment 1 of the method for screening question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 1, the method for screening question-and-answer data includes the following steps:
S101: For a question-answer pair in the question-answering system, determine the question, the answer and the source information of the answer contained in the pair.
In this step, each question-answer pair in the question-answering system is disassembled to obtain the question, the answer and the source information of the answer that it contains. The source information of the answer may be information carried in the question-answer pair, or it may be determined from the answer. The source information may be a semantic sequence, or it may be a Uniform Resource Locator (URL), in which case the source information of the answer is obtained by following the URL link.
Taking the question-answer pair "Wuqing District Xiazhuzhuang Street postal code - 301700" as an example, disassembling the pair yields the question "Wuqing District Xiazhuzhuang Street postal code", the answer "301700", and the source information of the answer "Tianjin Wuqing Xiazhuzhuang Colorful Time postal code".
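Step S101 can be illustrated with a minimal sketch. The `QAPair` container, the `disassemble` function and the `fetch` callback are hypothetical names introduced here for illustration; the source does not prescribe a storage format or a way of resolving a URL.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class QAPair:
    """One question-answer pair as stored in the question-answering system."""
    question: str
    answer: str
    # Source information: a text passage, or a URL that points to one.
    source: Optional[str] = None


def disassemble(
    pair: QAPair, fetch: Optional[Callable[[str], str]] = None
) -> Tuple[str, str, str]:
    """Return (question, answer, source text) for one pair.

    `fetch` is an assumed callback that resolves a URL to text. It is only
    called when the stored source looks like a URL.
    """
    source = pair.source or ""
    if source.startswith(("http://", "https://")) and fetch is not None:
        source = fetch(source)
    return pair.question, pair.answer, source


pair = QAPair(
    question="Wuqing District Xiazhuzhuang Street postal code",
    answer="301700",
    source="Tianjin Wuqing Xiazhuzhuang Colorful Time postal code",
)
question, answer, source = disassemble(pair)
```

Keeping the URL resolution behind a callback leaves the disassembly step free of network code, which is one possible design choice rather than something the source specifies.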
S102: Determine the first information according to the question, the answer and the knowledge graph.
The first information indicates whether the type of the answer matches the expected type.
The question obtained in step S101 is mapped into the knowledge graph to obtain the expected type of the answer. The type of the answer obtained in step S101 is then determined and compared with the expected type, which yields the first information.
A knowledge graph is a semantic network consisting of nodes and the edges that link them. A node represents an entity or a concept, and an edge represents a semantic relation between entities or concepts. In this solution, the entities in the question are mapped to the "head entity - relation" part or the "relation - tail entity" part of a "head entity - relation - tail entity" triple in the knowledge graph, so that the expected type of the remaining element (the tail entity or the head entity, respectively) can be inferred.
Again taking the question-answer pair "Wuqing District Xiazhuzhuang Street postal code - 301700" as an example, the entities in the question include "Wuqing District" and "postal code", which can be mapped to a "head entity - relation" in the knowledge graph. It can then be inferred that the tail entity is a specific postal code. Combined with the entity "301700" in the answer, it follows that the type of the answer matches the inferred expected type.
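The "head entity - relation" lookup described above can be sketched with a toy in-memory schema. The tables `ENTITY_TYPES` and `SCHEMA`, the type labels and the function `expected_answer_type` are all assumptions made for illustration; a real knowledge graph would be queried through its own interface.

```python
from typing import Dict, Optional, Tuple

# Toy entity typing table (assumed data, not taken from the source).
ENTITY_TYPES: Dict[str, str] = {"Wuqing District": "district"}

# Toy schema: (head entity type, relation) -> expected tail entity type.
SCHEMA: Dict[Tuple[str, str], str] = {
    ("district", "postal code"): "postal_code",
    ("person", "date of birth"): "date",
}


def expected_answer_type(head: str, relation: str) -> Optional[str]:
    """Infer the expected type of the tail entity from head entity and relation.

    Returns None when the head entity or the (type, relation) pair is unknown.
    """
    head_type = ENTITY_TYPES.get(head)
    if head_type is None:
        return None
    return SCHEMA.get((head_type, relation))


expected = expected_answer_type("Wuqing District", "postal code")
```

Returning `None` for unknown entities leaves the caller to decide whether an unmapped question counts as a type mismatch, a point the source does not settle.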
S103: Determine the second information according to the question and the source information of the answer.
The second information indicates whether the data quality of the question-answer pair is high or low.
In this step, the second information is determined according to the question and the source information of the answer. The question and the source information are both semantic sequences, and the data quality of the question-answer pair is determined by judging how consistent the two semantic sequences are. Further, the consistency of the two semantic sequences can be expressed by a similarity and/or an overlap ratio.
S104: Screen the question-answer pair according to the first information and the second information.
According to the first information obtained in step S102 and the second information obtained in step S103, it is determined whether the question-answer pair is screened out or retained.
In one possible implementation, this step includes: if the first information indicates that the type of the answer does not match the expected type, the question-answer pair is screened out; if the first information indicates that the type of the answer matches the expected type but the second information indicates that the data quality of the question-answer pair is low, the question-answer pair is also screened out; if the first information indicates that the type of the answer matches the expected type and the second information indicates that the data quality of the question-answer pair is high, the question-answer pair is retained.
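The decision rule of step S104 can be written as a short filter. This is a sketch under the assumption that the first and second information are available as boolean predicates; the names `screen`, `first_info` and `second_info` are hypothetical.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def screen(
    pairs: Iterable[T],
    first_info: Callable[[T], bool],
    second_info: Callable[[T], bool],
) -> List[T]:
    """Keep only pairs whose answer type matches AND whose data quality is high.

    first_info(pair)  -> True if the answer type matches the expected type.
    second_info(pair) -> True if the data quality of the pair is high.
    """
    kept = []
    for pair in pairs:
        if not first_info(pair):
            continue  # type mismatch: screen out
        if not second_info(pair):
            continue  # type matches but quality is low: screen out
        kept.append(pair)
    return kept


# Toy records with precomputed flags, for illustration only.
pairs = [
    {"id": 1, "type_ok": True, "quality_high": True},
    {"id": 2, "type_ok": False, "quality_high": True},
    {"id": 3, "type_ok": True, "quality_high": False},
]
kept = screen(pairs, lambda p: p["type_ok"], lambda p: p["quality_high"])
```

Checking the first information before the second means the similarity and overlap computations can be skipped for pairs that already fail the type check.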
In the method for screening question-and-answer data provided by this embodiment of the present invention, the question, the answer and the source information of the answer contained in a question-answer pair of a question-answering system are determined; first information is determined according to the question, the answer and a knowledge graph, the first information indicating whether the type of the answer matches the expected type; second information is determined according to the question and the source information of the answer, the second information indicating whether the data quality of the question-answer pair is high or low; and finally the question-answer pair is screened according to the first information and the second information. Accurate screening of question-answer pairs reduces the waste of data resources on the one hand, and improves the accuracy of the question-and-answer data and the timeliness of interaction on the other.
On the basis of the embodiment shown in Fig. 1, Fig. 2 is a schematic flowchart of Embodiment 2 of the method for screening question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 2, before step S101 the method further includes the following step:
S105: Obtain the question-and-answer data of the question-answering system, the question-and-answer data including multiple question-answer pairs.
The question-and-answer data of the question-answering system is obtained. The data may be stored locally or on a server, and it includes multiple question-answer pairs. Screening the question-and-answer data in this solution specifically means accurately judging, for each question-answer pair in the data, whether it is screened out or retained.
In this embodiment, obtaining the question-and-answer data of the question-answering system gives accurate access to the data to be screened, so that the question-and-answer data of the question-answering system can subsequently be screened comprehensively.
On the basis of the above embodiments, the method for screening question-and-answer data of this solution further includes: obtaining the entities contained in the question, in the answer and in the source information, respectively. It should be understood that the entities obtained in this solution are the important entities in the question, the answer and the source information. For example, the entities of the question "Wuqing District Xiazhuzhuang Street postal code" include "Wuqing District", "Xiazhuzhuang Street" and "postal code", of which "Wuqing District" and "postal code" are important entities and "Xiazhuzhuang Street" is an unimportant entity. Further, the important entities can be obtained by scoring the entities, by means of a model or in similar ways, and are then taken as the entities used in this solution.
On the basis of the above embodiments, Fig. 3 is a schematic flowchart of Embodiment 3 of the method for screening question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 3, step S102, determining the first information according to the question, the answer and the knowledge graph, specifically includes the following steps:
S1021: Determine the expected type of the answer according to the entities of the question and the knowledge graph.
In this step, the entities of the question are mapped into the knowledge graph to obtain the expected type of the answer.
Again taking the question-answer pair "Wuqing District Xiazhuzhuang Street postal code - 301700" as an example, the entities in the question include "Wuqing District" and "postal code", which can be mapped to a "head entity - relation" in the knowledge graph. It can then be inferred that the tail entity is a specific postal code.
S1022: Determine whether the type of the answer matches the expected type, to obtain the first information.
The type of the answer is the type of the entity in the answer. Based on the type of the answer, it is judged whether the type of the answer is consistent with the expected type, which yields the first information.
Again taking the question-answer pair "Wuqing District Xiazhuzhuang Street postal code - 301700" as an example, combined with the entity "301700" in the answer, it follows that the type of the answer matches the inferred expected type.
In this embodiment, the expected type of the answer is determined according to the entities of the question and the knowledge graph, and whether the type of the answer matches the expected type is determined to obtain the first information. Determining whether the answer fits what the question asks ensures that the answer matches the question.
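Step S1022 can be sketched as a type check on the answer. The pattern rules in `answer_type` and the type labels are toy assumptions chosen for illustration (six digits for a Chinese postal code, an ISO-style date); the source does not say how the type of the answer entity is determined.

```python
import re


def answer_type(answer: str) -> str:
    """Assign a coarse type label to an answer using toy pattern rules."""
    text = answer.strip()
    if re.fullmatch(r"\d{6}", text):
        return "postal_code"  # assumed rule: exactly six digits
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return "date"  # assumed rule: YYYY-MM-DD
    return "text"


def first_information(answer: str, expected_type: str) -> bool:
    """Return True if the type of the answer matches the expected type."""
    return answer_type(answer) == expected_type


matches = first_information("301700", "postal_code")
mismatch = first_information("the founder of animal psychology", "postal_code")
```

In practice the type of the answer entity would more plausibly come from entity typing against the knowledge graph than from regular expressions; the rules above only make the comparison concrete.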
On the basis of the above embodiments, Fig. 4 is a schematic flowchart of Embodiment 4 of the method for screening question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 4, step S103, determining the second information according to the question and the source information of the answer, specifically includes the following steps:
S1031: Calculate the similarity between the question and the source information by means of a similarity operator.
In general, the question and the source information are two semantic sequences, so the similarity operator used to calculate their similarity is one that operates between two semantic sequences (that is, between two sentences). The similarity between the question and the source information can be calculated directly with the operator; alternatively, a model can be built on the similarity operator, and the question and the source information are input into the model to obtain their similarity.
In a specific example, the similarity between the two semantic sequences "Wuqing District Xiazhuzhuang Street postal code" and "Tianjin Wuqing Xiazhuzhuang Colorful Time postal code" is calculated, and the similarity obtained between the two sentences is 0.889.
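The source does not name the similarity operator, so the sketch below swaps in a simple stand-in: cosine similarity over character-bigram counts. The functions `bigrams` and `similarity` are hypothetical, and the score they produce is not expected to reproduce the 0.889 quoted in the example.

```python
import math
from collections import Counter


def bigrams(text: str) -> Counter:
    """Count character bigrams after lowercasing and removing whitespace."""
    s = "".join(text.lower().split())
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def similarity(a: str, b: str) -> float:
    """Cosine similarity between the bigram count vectors of two sentences.

    Returns a value in [0, 1]; 0.0 if either sentence has no bigrams.
    """
    va, vb = bigrams(a), bigrams(b)
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


score = similarity(
    "Wuqing District Xiazhuzhuang Street postal code",
    "Tianjin Wuqing Xiazhuzhuang Colorful Time postal code",
)
```

A sentence-embedding model could replace `similarity` without changing the rest of the pipeline, which matches the source's remark that a model may be built on the operator.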
S1032: Obtain the overlap ratio between the entities of the question and the entities of the source information of the answer.
In this step, the overlap ratio between the entities of the question and the entities of the source information of the answer reflects how closely the question and the source information are related, and therefore reflects the data quality of the question-answer pair.
For example, the entities of the question are "Wuqing District" and "postal code", and the entities of the source information of the answer are "Tianjin", "Wuqing" and "postal code". The question and the source information thus contain similar or identical entities, and the overlap ratio of the two sets of entities can be obtained by calculation.
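The source does not define the overlap ratio precisely. One possible reading, used in the sketch below, is the fraction of question entities that have a similar or identical counterpart among the source entities, where "similar" is approximated by substring containment. Both the definition and the names `entities_match` and `overlap_ratio` are assumptions.

```python
from typing import Iterable, List


def entities_match(a: str, b: str) -> bool:
    """Treat two entities as matching if one contains the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a in b or b in a


def overlap_ratio(question_entities: List[str], source_entities: Iterable[str]) -> float:
    """Fraction of question entities that match at least one source entity."""
    if not question_entities:
        return 0.0
    source_list = list(source_entities)
    matched = sum(
        1 for q in question_entities
        if any(entities_match(q, s) for s in source_list)
    )
    return matched / len(question_entities)


ratio = overlap_ratio(
    ["Wuqing District", "postal code"],
    ["Tianjin", "Wuqing", "postal code"],
)
```

Under this reading both question entities find a counterpart in the source ("Wuqing District" contains "Wuqing", and "postal code" is identical), so `ratio` is 1.0. A Jaccard-style ratio over both entity sets would be another reasonable definition.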
S1033: Determine the second information according to the similarity and/or the overlap ratio.
This step has three possible implementations:
Mode one: The second information is determined according to the similarity obtained in step S1031. For example, the obtained similarity is compared with a preset value. If the similarity is greater than the preset value, the resulting second information indicates that the data quality of the question-answer pair is high; otherwise, the second information indicates that the data quality of the question-answer pair is low.
For example, the similarity calculated between the question "Wuqing District Xiazhuzhuang Street postal code" and the source information "Tianjin Wuqing Xiazhuzhuang Colorful Time postal code" is 0.889. Assuming the preset value of the similarity is 0.7, the data quality of the question-answer pair is determined to be high.
Mode two: The second information is determined according to the overlap ratio obtained in step S1032. For example, the obtained overlap ratio is compared with a preset value. If the overlap ratio is greater than the preset value, the resulting second information indicates that the data quality of the question-answer pair is high; otherwise, the second information indicates that the data quality of the question-answer pair is low.
For example, the entities of the question are "Wuqing District" and "postal code", and the entities of the source information of the answer are "Tianjin", "Wuqing" and "postal code". The question and the source information thus contain similar or identical entities. If the calculated overlap ratio of the two sets of entities is greater than the preset value of 80%, the data quality of the question-answer pair is determined to be high.
Mode three: The second information is determined according to both the similarity and the overlap ratio. When the similarity and the overlap ratio are both high, specifically when each is above its respective preset value, the second information indicates that the data quality of the question-answer pair is high; otherwise, the second information indicates that the data quality of the question-answer pair is low.
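The three modes can be folded into one function that checks whichever scores are supplied. This is a sketch: the function name `second_information` and its signature are hypothetical, and the default thresholds 0.7 and 0.8 are the preset values used in the examples above, not fixed by the method.

```python
from typing import Optional


def second_information(
    similarity: Optional[float] = None,
    overlap: Optional[float] = None,
    similarity_threshold: float = 0.7,
    overlap_threshold: float = 0.8,
) -> bool:
    """Return True if the data quality of the pair is judged high.

    Mode one:   only `similarity` is given.
    Mode two:   only `overlap` is given.
    Mode three: both are given, and both must exceed their preset values.
    """
    if similarity is None and overlap is None:
        raise ValueError("at least one of similarity or overlap is required")
    checks = []
    if similarity is not None:
        checks.append(similarity > similarity_threshold)
    if overlap is not None:
        checks.append(overlap > overlap_threshold)
    return all(checks)


mode_one = second_information(similarity=0.889)
mode_two = second_information(overlap=0.5)
mode_three = second_information(similarity=0.889, overlap=1.0)
```

With the example similarity of 0.889 and a preset value of 0.7, mode one reports high quality; an overlap ratio of 0.5 is below the 80% preset value, so mode two reports low quality.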
In this embodiment, the similarity between the question and the source information is calculated by means of a similarity operator, the overlap ratio between the entities of the question and the entities of the source information of the answer is obtained, and the second information is determined according to the similarity and/or the overlap ratio. The data quality of the question-answer pair can thus be judged in several ways, and knowing whether the data quality is high or low makes it easy to decide whether the question-answer pair needs to be screened out.
Fig. 5 is the structural schematic diagram of the screening plant embodiment one of question and answer data provided in an embodiment of the present invention, such as Fig. 5 institute
Show, the screening plant 10 of the question and answer data, comprising:
Processing module 11, for determining that described problem answer is asked include according to the answer pair of the problems in question answering system
It inscribes, the source-information of answer and the answer;
The processing module 11 is also used to determine the first information according to described problem, the answer and knowledge mapping, described
The first information is for indicating whether the type of the answer meets expected type;
The processing module 11 is also used to the source-information according to described problem and the answer, determines the second information, institute
State the height of the quality of data of second information for indicating described problem answer pair;
Screening module 12, for according to the first information and second information, to described problem answer to sieving
Choosing.
The screening plant 10 of question and answer data provided in this embodiment, comprising: processing module 11 and screening module 12.According to asking
Answer intersystem problem answer pair, determine problem answers to the problem of including, the source-information of answer and the answer, and root
According to problem, answer and knowledge mapping, the first information is determined, the first information is for indicating whether the type of the answer meets
It is expected that type determines the second information, second information is for indicating described problem further according to the source-information of problem and answer
The height of the quality of data of answer pair, finally according to the first information with the second information, to problem answers to screening, by right
The accurate screening of problem answers pair, on the one hand reduces the waste of data resource, on the other hand improves the accurate of question and answer data
Property and interaction timeliness.
In a specific implementation, the screening module is specifically configured to:
screen out the question-answer pair if the first information indicates that the type of the answer does not match the expected type, or if the second information indicates that the data quality of the question-answer pair is low.
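The screening rule above is a simple disjunction, which can be sketched as follows. The function names and the representation of the first and second information as booleans are assumptions made for illustration; the source does not prescribe how the two pieces of information are encoded.

```python
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def should_screen_out(type_matches: bool, quality_is_low: bool) -> bool:
    """Return True when the pair should be removed.

    type_matches   -- first information: answer type matches the expected type
    quality_is_low -- second information: data quality of the pair is low
    """
    return (not type_matches) or quality_is_low


def screen(items: Iterable[Tuple[T, bool, bool]]) -> List[T]:
    """Keep only the pairs that survive the rule.

    Each item is (pair, type_matches, quality_is_low).
    """
    return [
        pair
        for pair, type_matches, quality_is_low in items
        if not should_screen_out(type_matches, quality_is_low)
    ]
```

Under this rule a pair is kept only when its answer type matches and its quality is not low; failing either check is enough to remove it.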
On the basis of the embodiment shown in Fig. 5, Fig. 6 is a schematic structural diagram of Embodiment 2 of the screening device for question-and-answer data provided by an embodiment of the present invention. As shown in Fig. 6, the screening device 10 for question-and-answer data further includes:
An acquisition module 13, configured to obtain the question-and-answer data of the question answering system, where the question-and-answer data includes a plurality of question-answer pairs.
In a specific implementation, the acquisition module is further configured to:
obtain the entities included in the question, the answer, and the source information, respectively.
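The source does not say how entities are obtained. As a minimal sketch, assuming a fixed list of known entity names and plain case-insensitive substring matching (both are assumptions of this example, as is the function name `extract_entities`), entity extraction could look like this:

```python
from typing import Iterable, Set


def extract_entities(text: str, known_entities: Iterable[str]) -> Set[str]:
    """Return the known entity names that occur in `text`.

    Matching is a case-insensitive substring test, chosen only for brevity.
    """
    lowered = text.lower()
    return {name for name in known_entities if name.lower() in lowered}


# Hypothetical entity list; the same function is applied to the question,
# the answer, and the source information in turn.
KNOWN = ["Hamlet", "William Shakespeare", "Paris"]
question_entities = extract_entities("Who wrote Hamlet?", KNOWN)
source_entities = extract_entities(
    "Hamlet is a tragedy written by William Shakespeare.", KNOWN
)
```

Substring matching is crude: "Paris" would also match inside "comparison". A practical system would more likely use a named-entity recognizer or entity linking against the knowledge graph.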
In a specific implementation, the processing module is specifically configured to:
determine the expected type of the answer according to the entity of the question and the knowledge graph;
determine whether the type of the answer matches the expected type, to obtain the first information.
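The two steps above can be sketched with a toy knowledge graph. Everything concrete here is an assumption made for illustration: the dictionaries `KG_RELATIONS` and `KG_TYPES`, the keyword-based way of picking a relation from the question, and the choice to treat an unknown expected type as a mismatch. The source only states that the expected type is determined from the question entity and the knowledge graph.

```python
from typing import Dict, Optional

# Toy knowledge graph (hypothetical): entity -> relation keyword -> expected answer type.
KG_RELATIONS: Dict[str, Dict[str, str]] = {
    "Hamlet": {"wrote": "Person", "published": "Date"},
}
# Toy type table (hypothetical): answer entity -> entity type.
KG_TYPES: Dict[str, str] = {
    "William Shakespeare": "Person",
    "London": "Place",
}


def expected_answer_type(question: str, question_entity: str) -> Optional[str]:
    """Look up the expected answer type from the question entity and a relation keyword."""
    lowered = question.lower()
    for keyword, answer_type in KG_RELATIONS.get(question_entity, {}).items():
        if keyword in lowered:
            return answer_type
    return None  # the graph gives no expectation for this question


def first_information(question: str, question_entity: str, answer: str) -> bool:
    """Return True if the type of the answer matches the expected type."""
    expected = expected_answer_type(question, question_entity)
    actual = KG_TYPES.get(answer)
    return expected is not None and actual == expected
```

For the question "Who wrote Hamlet?" with question entity "Hamlet", the answer "William Shakespeare" matches the expected type "Person", while "London" does not.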
In a specific implementation, the processing module is specifically configured to:
calculate, according to the question and the source information, the similarity between the question and the source information by means of a similarity operator;
obtain the overlap ratio between the entities of the question and the entities of the source information of the answer;
determine the second information according to the similarity and/or the overlap ratio.
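The source does not name the similarity operator or say how the two scores are combined. The sketch below therefore substitutes token-level Jaccard similarity as a stand-in operator, defines the overlap ratio as the share of question entities that also appear in the source, and uses hypothetical thresholds; all function names, the thresholds, and the `require_both` switch (covering the "and/or" wording) are assumptions.

```python
import re
from typing import Set


def tokens(text: str) -> Set[str]:
    """Lowercased word tokens of a text."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard_similarity(question: str, source: str) -> float:
    """Stand-in similarity operator: |Q ∩ S| / |Q ∪ S| over word tokens."""
    q, s = tokens(question), tokens(source)
    if not q or not s:
        return 0.0
    return len(q & s) / len(q | s)


def entity_overlap(question_entities: Set[str], source_entities: Set[str]) -> float:
    """Share of question entities that also occur in the answer's source."""
    if not question_entities:
        return 0.0
    return len(question_entities & source_entities) / len(question_entities)


def quality_is_high(similarity: float, overlap: float,
                    sim_threshold: float = 0.2, overlap_threshold: float = 0.5,
                    require_both: bool = False) -> bool:
    """Second information: combine the two scores with 'and' or 'or'."""
    sim_ok = similarity >= sim_threshold
    overlap_ok = overlap >= overlap_threshold
    return (sim_ok and overlap_ok) if require_both else (sim_ok or overlap_ok)
```

In practice the similarity operator could equally be an embedding-based cosine similarity or a learned matching model; the thresholds would need to be tuned on labeled question-answer pairs.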
The device provided in this embodiment can be used to execute the technical solutions of the foregoing method embodiments. Its implementation principle and technical effect are similar and are not described again here.
Fig. 7 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present invention. As shown in Fig. 7, the electronic device 20 of this embodiment includes a processor 201 and a memory 202, where:
the memory 202 is configured to store computer-executable instructions;
the processor 201 is configured to execute the computer-executable instructions stored in the memory, so as to implement the screening method for question-and-answer data described in any of the foregoing embodiments. For details, refer to the related description in the foregoing method embodiments.
Optionally, the memory 202 may be either independent of the processor 201 or integrated with it.
When the memory 202 is provided independently, the electronic device further includes a bus 203 for connecting the memory 202 and the processor 201.
An embodiment of the present invention further provides a computer-readable storage medium in which computer-executable instructions are stored. When a processor executes the computer-executable instructions, the screening method for question-and-answer data described above is implemented.
In the several embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into modules is only a division by logical function, and other divisions are possible in an actual implementation: multiple modules may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or in another form.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, each module may exist physically on its own, or two or more modules may be integrated into one unit. The unit formed by the above modules may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The above integrated module, when implemented in the form of a software functional module, may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods described in the embodiments of the present application.
It should be understood that the above processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the invention may be executed and completed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory may include high-speed RAM and may also include non-volatile memory (NVM), for example at least one magnetic disk storage; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, an optical disc, or the like.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses can be divided into address buses, data buses, control buses, and so on. For ease of illustration, the bus in the drawings of the present application is not limited to only one bus or one type of bus.
The above storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. The storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be an integral part of the processor. The processor and the storage medium may be located in an application-specific integrated circuit (ASIC). Of course, the processor and the storage medium may also exist as discrete components in an electronic device or a master control device.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (14)
1. A method for screening question-and-answer data, characterized by comprising:
determining, from a question-answer pair in a question answering system, the question, the answer, and the source information of the answer included in the question-answer pair;
determining first information according to the question, the answer, and a knowledge graph, the first information indicating whether the type of the answer matches an expected type;
determining second information according to the question and the source information of the answer, the second information indicating the level of data quality of the question-answer pair;
screening the question-answer pair according to the first information and the second information.
2. The method according to claim 1, characterized in that screening the question-answer pair according to the first information and the second information comprises:
screening out the question-answer pair if the first information indicates that the type of the answer does not match the expected type, or if the second information indicates that the data quality of the question-answer pair is low.
3. The method according to claim 1 or 2, characterized in that, before determining, from the question-answer pair in the question answering system, the question, the answer, and the source information of the answer included in the question-answer pair, the method further comprises:
obtaining the question-and-answer data of the question answering system, the question-and-answer data including a plurality of question-answer pairs.
4. The method according to claim 1 or 2, characterized in that the method further comprises:
obtaining the entities included in the question, the answer, and the source information, respectively.
5. The method according to claim 4, characterized in that determining the first information according to the question, the answer, and the knowledge graph comprises:
determining the expected type of the answer according to the entity of the question and the knowledge graph;
determining whether the type of the answer matches the expected type, to obtain the first information.
6. The method according to claim 4, characterized in that determining the second information according to the question and the source information of the answer comprises:
calculating, according to the question and the source information, the similarity between the question and the source information by means of a similarity operator;
obtaining the overlap ratio between the entities of the question and the entities of the source information of the answer;
determining the second information according to the similarity and/or the overlap ratio.
7. A device for screening question-and-answer data, characterized by comprising:
a processing module, configured to determine, from a question-answer pair in a question answering system, the question, the answer, and the source information of the answer included in the question-answer pair;
the processing module being further configured to determine first information according to the question, the answer, and a knowledge graph, the first information indicating whether the type of the answer matches an expected type;
the processing module being further configured to determine second information according to the question and the source information of the answer, the second information indicating the level of data quality of the question-answer pair;
a screening module, configured to screen the question-answer pair according to the first information and the second information.
8. The device according to claim 7, characterized in that the screening module is specifically configured to:
screen out the question-answer pair if the first information indicates that the type of the answer does not match the expected type, or if the second information indicates that the data quality of the question-answer pair is low.
9. The device according to claim 7 or 8, characterized in that the device further comprises:
an acquisition module, configured to obtain the question-and-answer data of the question answering system, the question-and-answer data including a plurality of question-answer pairs.
10. The device according to claim 7 or 8, characterized in that the acquisition module is further configured to:
obtain the entities included in the question, the answer, and the source information, respectively.
11. The device according to claim 10, characterized in that the processing module is specifically configured to:
determine the expected type of the answer according to the entity of the question and the knowledge graph;
determine whether the type of the answer matches the expected type, to obtain the first information.
12. The device according to claim 10, characterized in that the processing module is specifically configured to:
calculate, according to the question and the source information, the similarity between the question and the source information by means of a similarity operator;
obtain the overlap ratio between the entities of the question and the entities of the source information of the answer;
determine the second information according to the similarity and/or the overlap ratio.
13. An electronic device, characterized by comprising: a memory and a processor;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, so that the processor performs the method for screening question-and-answer data according to any one of claims 1 to 6.
14. A storage medium, characterized by comprising: a readable storage medium and a computer program, the computer program being used to implement the method for screening question-and-answer data according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910706456.2A CN110399466A (en) | 2019-08-01 | 2019-08-01 | Screening technique, device, equipment and the storage medium of question and answer data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910706456.2A CN110399466A (en) | 2019-08-01 | 2019-08-01 | Screening technique, device, equipment and the storage medium of question and answer data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110399466A (en) | 2019-11-01 |
Family
ID=68327257
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910706456.2A Pending CN110399466A (en) | 2019-08-01 | 2019-08-01 | Screening technique, device, equipment and the storage medium of question and answer data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110399466A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103577556A (en) * | 2013-10-21 | 2014-02-12 | 北京奇虎科技有限公司 | Device and method for obtaining association degree of question and answer pair |
CN103914543A (en) * | 2014-04-03 | 2014-07-09 | 北京百度网讯科技有限公司 | Search result displaying method and device |
US10031970B1 (en) * | 2013-09-12 | 2018-07-24 | Intuit Inc. | Search engine optimization in social question and answer systems |
CN109033050A (en) * | 2018-06-29 | 2018-12-18 | 北京百度网讯科技有限公司 | Article generation method, equipment and storage medium |
2019-08-01: CN application CN201910706456.2A (patent/CN110399466A/en), status: active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108121795B (en) | User behavior prediction method and device | |
CN110377716A (en) | Exchange method, device and the computer readable storage medium of dialogue | |
CN110210021A (en) | Read understanding method and device | |
CN112860841A (en) | Text emotion analysis method, device and equipment and storage medium | |
CN107180386A (en) | A kind of quantization strategy live broadcast system | |
CN109859747A (en) | Voice interactive method, equipment and storage medium | |
CN114298039B (en) | Sensitive word recognition method and device, electronic equipment and storage medium | |
CN112017777B (en) | Method and device for predicting similar pair problem and electronic equipment | |
CN108846138A (en) | A kind of the problem of fusion answer information disaggregated model construction method, device and medium | |
CN110189751A (en) | Method of speech processing and equipment | |
CN109949830A (en) | User's intension recognizing method and equipment | |
CN114996486A (en) | Data recommendation method and device, server and storage medium | |
US20190171745A1 (en) | Open ended question identification for investigations | |
CN110162774B (en) | Automatic news emotion calibration method and device based on financial market quotation | |
CN112307754A (en) | Statement acquisition method and device | |
CN113658586B (en) | Training method of voice recognition model, voice interaction method and device | |
CN110362361A (en) | The method and device of documenting | |
CN110399466A (en) | Screening technique, device, equipment and the storage medium of question and answer data | |
CN112307751A (en) | Data desensitization method and system based on natural language processing | |
CN111859933A (en) | Training method, recognition method, device and equipment of Malay recognition model | |
CN109902309A (en) | Interpretation method, device, equipment and storage medium | |
CN115294947A (en) | Audio data processing method and device, electronic equipment and medium | |
CN109300031A (en) | Data digging method and device based on stock comment data | |
CN115564557A (en) | Repayment capability evaluation model training method and device, electronic equipment and medium | |
CN114462376A (en) | RPA and AI-based court trial record generation method, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20191101 |