CN109062973A - Method, apparatus, server and storage medium for mining question-answer resources
- Publication number
- CN109062973A, CN201810696978.4A
- Authority
- CN
- China
- Prior art keywords
- answer
- initial
- question
- resource
- initial question
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
Embodiments of the invention disclose a method, apparatus, server and storage medium for mining question-answer resources. The method comprises: extracting the initial answer corresponding to each initial question from the question-answer pairs in a community question-answer resource; determining the target answer corresponding to each initial question according to that question's initial answer and the UGC content items in a vertical resource; and mining a target question-answer resource according to each initial question and its corresponding target answer. This not only reduces mining cost but also improves mining efficiency and mining accuracy.
Description
Technical field
Embodiments of the present invention relate to the field of Internet technology, and in particular to a method, apparatus, server and storage medium for mining question-answer resources.
Background
With the rapid development of the Internet, search engines have become increasingly powerful and users expect more and more of them; search is shifting from basic retrieval of relevant web pages toward intelligent question answering. When a user enters a question into a search engine, the user no longer wants a list of relevant web pages but wants to obtain the answer to the question directly.
Deep question answering refers to understanding human language, intelligently recognizing the meaning of a question, and extracting the answer to the question from massive Internet data. One of the key tasks of a deep question answering system is to build a good question-answer resource. On the Internet, community question-answer resources can provide question-answer pairs to users, but the quality of those pairs is hard to guarantee. UGC (User Generated Content) can serve as a source of answers for an offline question answering system, but UGC is often of poor quality and is sometimes simply wrong. Manual review and manual correction can mine a batch of high-quality question-answer resources from massive, heterogeneous UGC content, but the labor cost of this approach is too high and its efficiency too low for it to be applied in real products.
Summary of the invention
In view of this, embodiments of the present invention provide a method, apparatus, server and storage medium for mining question-answer resources, which can not only reduce mining cost but also improve mining efficiency and mining accuracy.
In a first aspect, an embodiment of the present invention provides a method for mining question-answer resources, the method comprising:
extracting the initial answer corresponding to each initial question from the question-answer pairs in a community question-answer resource;
determining the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in a vertical resource;
mining a target question-answer resource according to each initial question and the target answer corresponding to each initial question.
In the above embodiment, determining the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource comprises:
determining the candidate answers corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource;
determining the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the candidate answers corresponding to each initial question.
In the above embodiment, determining the candidate answers corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource comprises:
calculating the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item;
determining the candidate answers corresponding to each initial question according to the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item.
In the above embodiment, calculating the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item comprises:
dividing each initial answer and each UGC content item, in units of basic sentences, into a first sentence dictionary and a second sentence dictionary respectively;
calculating the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item according to the first sentence dictionary and the second sentence dictionary.
In the above embodiment, determining the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the candidate answers corresponding to each initial question comprises:
calculating the word vector corresponding to each initial answer and the word vector corresponding to each candidate answer;
determining the target answer corresponding to each initial question according to the word vector corresponding to each initial answer and the word vector corresponding to each candidate answer.
In a second aspect, an embodiment of the present invention provides an apparatus for mining question-answer resources, the apparatus comprising an extraction module, a determining module and a mining module, wherein:
the extraction module is configured to extract the initial answer corresponding to each initial question from the question-answer pairs in a community question-answer resource;
the determining module is configured to determine the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in a vertical resource;
the mining module is configured to mine a target question-answer resource according to each initial question and the target answer corresponding to each initial question.
In the above embodiment, the determining module is specifically configured to determine the candidate answers corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource, and to determine the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the candidate answers corresponding to each initial question.
In the above embodiment, the determining module comprises a computation submodule and a determining submodule, wherein:
the computation submodule is configured to calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item;
the determining submodule is configured to determine the candidate answers corresponding to each initial question according to the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item.
In the above embodiment, the computation submodule is specifically configured to divide each initial answer and each UGC content item, in units of basic sentences, into a first sentence dictionary and a second sentence dictionary respectively, and to calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item according to the first sentence dictionary and the second sentence dictionary.
In the above embodiment, the computation submodule is further configured to calculate the word vector corresponding to each initial answer and the word vector corresponding to each candidate answer;
the determining submodule is further configured to determine the target answer corresponding to each initial question according to the word vector corresponding to each initial answer and the word vector corresponding to each candidate answer.
In a third aspect, an embodiment of the present invention provides a server, comprising:
one or more processors;
a memory for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method for mining question-answer resources described in any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method for mining question-answer resources described in any embodiment of the present invention.
Embodiments of the present invention propose a method, apparatus, server and storage medium for mining question-answer resources. First, the initial answer corresponding to each initial question is extracted from the question-answer pairs in a community question-answer resource; then the target answer corresponding to each initial question is determined according to that question's initial answer and the UGC content items in a vertical resource; finally, a target question-answer resource is mined according to each initial question and its corresponding target answer. That is, in the technical solution of the present invention, the target answer corresponding to each initial question can be determined according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource, and the target question-answer resource is then mined according to each initial question and its corresponding target answer. Existing methods for mining question-answer resources rely on manual review and manual correction to mine a batch of high-quality question-answer resources from massive, heterogeneous UGC content; their labor cost is too high and their efficiency too low for them to be applied in real products. Compared with the prior art, therefore, the method, apparatus, server and storage medium proposed by the embodiments of the present invention can not only reduce mining cost but also improve mining efficiency and mining accuracy. In addition, the technical solution of the embodiments of the present invention is simple and convenient to implement, easy to popularize, and widely applicable.
Brief description of the drawings
Fig. 1 is a flowchart of the method for mining question-answer resources provided by Embodiment One of the present invention;
Fig. 2 is a flowchart of the method for mining question-answer resources provided by Embodiment Two of the present invention;
Fig. 3 is a flowchart of the method for mining question-answer resources provided by Embodiment Three of the present invention;
Fig. 4 is a first structural schematic diagram of the apparatus for mining question-answer resources provided by Embodiment Four of the present invention;
Fig. 5 is a second structural schematic diagram of the apparatus for mining question-answer resources provided by Embodiment Four of the present invention;
Fig. 6 is a structural schematic diagram of the server provided by Embodiment Five of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only some, not all, of the content related to the invention.
Embodiment one
Fig. 1 is a flowchart of the method for mining question-answer resources provided by Embodiment One of the present invention. As shown in Fig. 1, the method may comprise the following steps:
S101: Extract the initial answer corresponding to each initial question from the question-answer pairs in a community question-answer resource.
In the prior art, a vertical resource contains high-quality, authoritative UGC content but lacks matching questions, while a community question-answer resource consists of question-answer pairs whose quality cannot be guaranteed. If the vertical resource and the community question-answer resource overlap substantially, the two can be cross-checked against each other and a batch of good question-answer resources can be mined from them automatically. In a specific embodiment of the present invention, the server may extract the initial answer corresponding to each initial question from the question-answer pairs in the community question-answer resource. Specifically, the community question-answer resource may include multiple <question, answer> pairs, and the server may extract the initial answer corresponding to each initial question from each <question, answer> pair: initial answer 1 corresponding to initial question 1 from question-answer pair 1 <initial question 1, initial answer 1>; initial answer 2 corresponding to initial question 2 from question-answer pair 2 <initial question 2, initial answer 2>; ...; and initial answer N corresponding to initial question N from question-answer pair N <initial question N, initial answer N>, where N is a natural number greater than or equal to 1.
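The extraction in S101 can be illustrated with a minimal Python sketch. The `QAPair` type, the function name and the sample data are hypothetical and are not part of the patent; the policy of keeping the first answer when a question repeats is likewise only an assumption made so that the example is well defined.

```python
from typing import Dict, List, NamedTuple


class QAPair(NamedTuple):
    """One <question, answer> pair from a community question-answer resource."""
    question: str
    answer: str


def extract_initial_answers(pairs: List[QAPair]) -> Dict[str, str]:
    """Return a mapping from each initial question to its initial answer."""
    initial_answers: Dict[str, str] = {}
    for pair in pairs:
        # Assumption: if a question appears more than once, keep the first answer.
        initial_answers.setdefault(pair.question, pair.answer)
    return initial_answers


# Hypothetical sample data standing in for a community question-answer resource.
community_pairs = [
    QAPair("How hot is boiling water?", "Water boils at 100 degrees Celsius."),
    QAPair("Do cats sleep a lot?", "Yes, cats sleep most of the day."),
]
initial_answers = extract_initial_answers(community_pairs)
```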
S102: Determine the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in a vertical resource.
In a specific embodiment of the present invention, the server may determine the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource. Specifically, the server may extract <initial question 1, initial answer 1>, <initial question 2, initial answer 2>, ..., <initial question N, initial answer N> from the question-answer pairs in the community question-answer resource, where N is a natural number greater than or equal to 1. Then, according to <initial question 1, initial answer 1>, <initial question 2, initial answer 2>, ..., <initial question N, initial answer N> and UGC content 1, UGC content 2, ..., UGC content M in the vertical resource, the server may determine target answer 1, target answer 2, ..., target answer N corresponding to initial question 1, initial question 2, ..., initial question N, where M is a natural number greater than or equal to 1.
S103: Mine a target question-answer resource according to each initial question and the target answer corresponding to each initial question.
In a specific embodiment of the present invention, after the server has determined the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource, the server may mine the target question-answer resource according to each initial question and its corresponding target answer. Specifically, according to initial question 1 and its target answer 1, initial question 2 and its target answer 2, ..., and initial question N and its target answer N, the server may mine the target question-answer pairs <initial question 1, target answer 1>, <initial question 2, target answer 2>, ..., <initial question N, target answer N>; these target question-answer pairs make up the target question-answer resource.
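As a rough illustration of S102 and S103 together, the sketch below pairs each initial question with a target answer chosen from the UGC content. All names are hypothetical. The word-overlap chooser is a deliberately simple stand-in and is not the vector-based selection described in Embodiment Three, and dropping questions that have no matching UGC content is an assumption the patent text does not state.

```python
from typing import Callable, Dict, List, Optional, Tuple


def most_overlapping(initial_answer: str, ugc_contents: List[str]) -> Optional[str]:
    """Toy chooser: the UGC item sharing the most words with the initial answer."""
    words = set(initial_answer.lower().split())
    best, best_overlap = None, 0
    for content in ugc_contents:
        overlap = len(words & set(content.lower().split()))
        if overlap > best_overlap:
            best, best_overlap = content, overlap
    return best  # None when no UGC item shares any word


def mine_target_resource(
    initial_answers: Dict[str, str],
    ugc_contents: List[str],
    determine_target: Callable[[str, List[str]], Optional[str]],
) -> List[Tuple[str, str]]:
    """Build <initial question, target answer> pairs (the target resource)."""
    resource = []
    for question, initial_answer in initial_answers.items():
        target = determine_target(initial_answer, ugc_contents)
        if target is not None:  # assumption: skip questions with no target
            resource.append((question, target))
    return resource


# Hypothetical inputs.
answers = {"q1": "water boils at 100 degrees", "q2": "purple elephants"}
ugc = ["pure water boils at 100 degrees celsius", "cats sleep"]
resource = mine_target_resource(answers, ugc, most_overlapping)
```

Passing the chooser as a parameter keeps the pairing step independent of how the target answer is actually selected.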
In the method for mining question-answer resources proposed by this embodiment of the present invention, the initial answer corresponding to each initial question is first extracted from the question-answer pairs in a community question-answer resource; then the target answer corresponding to each initial question is determined according to that question's initial answer and the UGC content items in a vertical resource; finally, a target question-answer resource is mined according to each initial question and its corresponding target answer. That is, in the technical solution of the present invention, the target answer corresponding to each initial question can be determined according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource, and the target question-answer resource is then mined according to each initial question and its corresponding target answer. Existing methods for mining question-answer resources rely on manual review and manual correction to mine a batch of high-quality question-answer resources from massive, heterogeneous UGC content; their labor cost is too high and their efficiency too low for them to be applied in real products. Compared with the prior art, therefore, the method proposed by this embodiment can not only reduce mining cost but also improve mining efficiency and mining accuracy. In addition, the technical solution of this embodiment is simple and convenient to implement, easy to popularize, and widely applicable.
Embodiment two
Fig. 2 is a flowchart of the method for mining question-answer resources provided by Embodiment Two of the present invention. As shown in Fig. 2, the method may comprise the following steps:
S201: Extract the initial answer corresponding to each initial question from the question-answer pairs in a community question-answer resource.
In the prior art, a vertical resource contains high-quality, authoritative UGC content but lacks matching questions, while a community question-answer resource consists of question-answer pairs whose quality cannot be guaranteed. If the vertical resource and the community question-answer resource overlap substantially, the two can be cross-checked against each other and a batch of good question-answer resources can be mined from them automatically. In a specific embodiment of the present invention, the server may extract the initial answer corresponding to each initial question from the question-answer pairs in the community question-answer resource. Specifically, the community question-answer resource may include multiple <question, answer> pairs, and the server may extract the initial answer corresponding to each initial question from each <question, answer> pair: initial answer 1 corresponding to initial question 1 from question-answer pair 1 <initial question 1, initial answer 1>; initial answer 2 corresponding to initial question 2 from question-answer pair 2 <initial question 2, initial answer 2>; ...; and initial answer N corresponding to initial question N from question-answer pair N <initial question N, initial answer N>, where N is a natural number greater than or equal to 1.
S202: Determine the candidate answers corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in a vertical resource.
In a specific embodiment of the present invention, the server may determine the candidate answers corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource. Specifically, according to initial answer 1 corresponding to initial question 1, initial answer 2 corresponding to initial question 2, ..., initial answer N corresponding to initial question N, and UGC content 1, UGC content 2, ..., UGC content M in the vertical resource, the server may determine the candidate answers corresponding to initial question 1, the candidate answers corresponding to initial question 2, ..., and the candidate answers corresponding to initial question N, where M is a natural number greater than or equal to 1. That is, in a specific embodiment of the present invention, the server may first screen the UGC content items for the candidate answers corresponding to each initial question, and then determine the target answer corresponding to each initial question from among those candidate answers. Because the screening step excludes a large amount of UGC content unrelated to each initial question, the mining efficiency of the server is effectively improved.
S203: Determine the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the candidate answers corresponding to each initial question.
In a specific embodiment of the present invention, the server may determine the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the candidate answers corresponding to each initial question. Specifically, according to initial answer 1 corresponding to initial question 1, initial answer 2 corresponding to initial question 2, ..., initial answer N corresponding to initial question N, and the candidate answers corresponding to initial question 1, the candidate answers corresponding to initial question 2, ..., the candidate answers corresponding to initial question N, the server may determine target answer 1 corresponding to initial question 1, target answer 2 corresponding to initial question 2, ..., and target answer N corresponding to initial question N. That is, in a specific embodiment of the present invention, the server may first screen the UGC content items for the candidate answers corresponding to each initial question, and then determine the target answer corresponding to each initial question from among those candidate answers. Because the screening step excludes a large amount of UGC content unrelated to each initial question, the mining efficiency of the server is effectively improved.
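The two-stage screening of S202 and S203 can be sketched as a coarse filter followed by a finer choice among the survivors. The function names, the threshold value and the two toy scoring functions (Jaccard overlap and shared-word count) are assumptions for illustration only; this embodiment does not specify how either stage scores a UGC item.

```python
from typing import Callable, List, Optional


def jaccard(a: str, b: str) -> float:
    """Toy coarse score: Jaccard overlap of the two word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def shared_words(a: str, b: str) -> float:
    """Toy fine score: number of distinct words the two texts share."""
    return float(len(set(a.lower().split()) & set(b.lower().split())))


def select_target(
    initial_answer: str,
    ugc_contents: List[str],
    coarse_score: Callable[[str, str], float],
    fine_score: Callable[[str, str], float],
    threshold: float,
) -> Optional[str]:
    # Stage 1 (S202): keep only UGC items that pass the coarse screen.
    candidates = [u for u in ugc_contents if coarse_score(initial_answer, u) > threshold]
    if not candidates:
        return None
    # Stage 2 (S203): pick the best candidate with the finer score.
    return max(candidates, key=lambda u: fine_score(initial_answer, u))


# Hypothetical inputs.
ugc_items = [
    "water boils at 100 degrees celsius at sea level",
    "cats sleep most of the day",
    "water is wet",
]
target = select_target("water boils at 100 degrees", ugc_items, jaccard, shared_words, 0.1)
```

The finer score is only evaluated on items that survive the first stage, which is where the efficiency gain described above would come from when the finer score is expensive.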
S204: Mine a target question-answer resource according to each initial question and the target answer corresponding to each initial question.
In a specific embodiment of the present invention, after the server has determined the target answer corresponding to each initial question according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource, the server may mine the target question-answer resource according to each initial question and its corresponding target answer. Specifically, according to initial question 1 and its target answer 1, initial question 2 and its target answer 2, ..., and initial question N and its target answer N, the server may mine the target question-answer pairs <initial question 1, target answer 1>, <initial question 2, target answer 2>, ..., <initial question N, target answer N>; these target question-answer pairs make up the target question-answer resource.
In the method for mining question-answer resources proposed by this embodiment of the present invention, the initial answer corresponding to each initial question is first extracted from the question-answer pairs in a community question-answer resource; then the target answer corresponding to each initial question is determined according to that question's initial answer and the UGC content items in a vertical resource; finally, a target question-answer resource is mined according to each initial question and its corresponding target answer. That is, in the technical solution of the present invention, the target answer corresponding to each initial question can be determined according to the initial answer corresponding to each initial question and the UGC content items in the vertical resource, and the target question-answer resource is then mined according to each initial question and its corresponding target answer. Existing methods for mining question-answer resources rely on manual review and manual correction to mine a batch of high-quality question-answer resources from massive, heterogeneous UGC content; their labor cost is too high and their efficiency too low for them to be applied in real products. Compared with the prior art, therefore, the method proposed by this embodiment can not only reduce mining cost but also improve mining efficiency and mining accuracy. In addition, the technical solution of this embodiment is simple and convenient to implement, easy to popularize, and widely applicable.
Embodiment three
Fig. 3 is a flowchart of the method for mining question-answer resources in Embodiment Three of the present invention. As shown in Fig. 3, the method may comprise the following steps:
S301: Extract the initial answer corresponding to each initial question from the question-answer pairs in a community question-answer resource.
In the prior art, a vertical resource contains high-quality, authoritative UGC content but lacks matching questions, while a community question-answer resource consists of question-answer pairs whose quality cannot be guaranteed. If the vertical resource and the community question-answer resource overlap substantially, the two can be cross-checked against each other and a batch of good question-answer resources can be mined from them automatically. In a specific embodiment of the present invention, the server may extract the initial answer corresponding to each initial question from the question-answer pairs in the community question-answer resource. Specifically, the community question-answer resource may include multiple <question, answer> pairs, and the server may extract the initial answer corresponding to each initial question from each <question, answer> pair: initial answer 1 corresponding to initial question 1 from question-answer pair 1 <initial question 1, initial answer 1>; initial answer 2 corresponding to initial question 2 from question-answer pair 2 <initial question 2, initial answer 2>; ...; and initial answer N corresponding to initial question N from question-answer pair N <initial question N, initial answer N>, where N is a natural number greater than or equal to 1.
S302: Calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item.
In a specific embodiment of the present invention, the server may calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item. Specifically, the server may divide each initial answer and each UGC content item, in units of basic sentences, into a first sentence dictionary and a second sentence dictionary respectively, and then calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item according to the first sentence dictionary and the second sentence dictionary. For example, suppose initial answer 1 includes initial sentence 1, initial sentence 2, ..., initial sentence X, and UGC content 1 includes UGC sentence 1, UGC sentence 2, ..., UGC sentence Y, where X and Y are natural numbers greater than or equal to 1. In this step, the server may first extract the basic sentences in initial answer 1 (initial sentence 1, initial sentence 2, ..., initial sentence X) and then the basic sentences in UGC content 1 (UGC sentence 1, UGC sentence 2, ..., UGC sentence Y). It may then combine the basic sentences of each initial answer with the basic sentences of each UGC content item into one basic sentence dictionary, and calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content item according to that basic sentence dictionary. Specifically, the basic sentence dictionary may include dictionary entry 1, dictionary entry 2, ..., dictionary entry P, where P is a natural number less than or equal to the sum of X and Y. Dictionary entry 1 includes sentence identifier 1 and basic sentence 1; dictionary entry 2 includes sentence identifier 2 and basic sentence 2; ...; dictionary entry P includes sentence identifier P and basic sentence P. Note that if an initial sentence in initial answer 1 is identical to a UGC sentence in UGC content 1, the two may be merged into the same dictionary entry of the basic sentence dictionary.
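One possible reading of S302 is sketched below in Python. The text does not say how a sentence vector is derived from the dictionary, so the count-vector construction here is an assumption, as are the function names and the punctuation-based sentence splitter. Identical sentences share one dictionary entry, which is why the number of entries P never exceeds X + Y.

```python
import re
from typing import Dict, List


def split_sentences(text: str) -> List[str]:
    """Split text into basic sentences on end-of-sentence punctuation (assumed rule)."""
    parts = re.split(r"[.!?。！？]+", text)
    return [p.strip() for p in parts if p.strip()]


def build_sentence_dictionary(texts: List[str]) -> Dict[str, int]:
    """Map each distinct basic sentence to a sentence identifier (0-based here)."""
    dictionary: Dict[str, int] = {}
    for text in texts:
        for sentence in split_sentences(text):
            # Identical sentences are merged into the same dictionary entry.
            if sentence not in dictionary:
                dictionary[sentence] = len(dictionary)
    return dictionary


def sentence_vector(text: str, dictionary: Dict[str, int]) -> List[int]:
    """Assumed representation: count of each dictionary sentence in the text."""
    vector = [0] * len(dictionary)
    for sentence in split_sentences(text):
        if sentence in dictionary:
            vector[dictionary[sentence]] += 1
    return vector


# Hypothetical inputs: X = 2 initial sentences, Y = 2 UGC sentences, one shared.
initial_answer = "Water boils at 100 C. It freezes at 0 C."
ugc_content = "It freezes at 0 C. Ice floats."
dictionary = build_sentence_dictionary([initial_answer, ugc_content])
initial_vec = sentence_vector(initial_answer, dictionary)
ugc_vec = sentence_vector(ugc_content, dictionary)
```

In this example the dictionary has P = 3 entries rather than 4, because the sentence shared by the two texts is stored once.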
S303, determine the alternative answer corresponding to each initial problem according to the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content.
In a specific embodiment of the present invention, the server may determine the alternative answer corresponding to each initial problem according to the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content. Specifically, the server may calculate the similarity between the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content; when that similarity is greater than a preset threshold, the server may determine the UGC content whose similarity is greater than the preset threshold as an alternative answer corresponding to the initial problem.
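A minimal sketch of this threshold-based selection is given below. The embodiment does not name the similarity measure; cosine similarity is assumed here, and the function names, the UGC labels and the threshold value 0.5 are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_alternative_answers(answer_vec, ugc_vecs, threshold):
    """Keep every UGC content whose similarity to the initial answer exceeds the threshold."""
    return [
        name
        for name, vec in ugc_vecs.items()
        if cosine_similarity(answer_vec, vec) > threshold
    ]


answer_vec = [1, 1, 1, 0]
ugc_vecs = {
    "ugc_1": [0, 1, 1, 1],  # shares two sentences with the initial answer
    "ugc_2": [0, 0, 0, 1],  # shares none
    "ugc_3": [1, 1, 1, 0],  # identical sentence profile
}
alternatives = select_alternative_answers(answer_vec, ugc_vecs, threshold=0.5)
```

Several UGC contents may pass the threshold for one initial problem, which is why a second, finer comparison is performed in the following steps.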
S304, calculate the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer.
In a specific embodiment of the present invention, the server may calculate the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer. Specifically, the server may divide each initial answer and each alternative answer, in units of basic words, into a first word dictionary and a second word dictionary respectively; it then calculates the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer according to the first word dictionary and the second word dictionary.
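The word-vector calculation might be sketched as follows, purely as an illustration. Whitespace tokenization, the merging of the first and second word dictionaries into one shared vocabulary, and the occurrence-count components are all assumptions (Chinese text would need a word segmenter instead), and the function names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Split text into basic words on whitespace (an assumed segmentation rule)."""
    return text.lower().split()


def build_word_dictionaries(initial_answer, alternative_answers):
    """First dictionary: words of the initial answer; second: words of the alternatives."""
    first = sorted(set(tokenize(initial_answer)))
    second = sorted({w for alt in alternative_answers for w in tokenize(alt)})
    return first, second


def word_vector(text, vocabulary):
    """One component per vocabulary word: how often that word occurs in text."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocabulary]


initial_answer = "add water and cook the rice"
alternatives = ["cook the rice with water", "wash the car"]
first_dict, second_dict = build_word_dictionaries(initial_answer, alternatives)
# Assumption: both dictionaries are merged so that all vectors share one set of dimensions.
vocabulary = sorted(set(first_dict) | set(second_dict))
answer_vec = word_vector(initial_answer, vocabulary)
alt_vecs = [word_vector(a, vocabulary) for a in alternatives]
```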
S305, determine the target answer corresponding to each initial problem according to the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer.
In a specific embodiment of the present invention, the server may determine the target answer corresponding to each initial problem according to the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer. Specifically, the server may calculate the similarity between the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer, and may determine the alternative answer with the greatest similarity as the target answer corresponding to the initial problem. For example, the word vector corresponding to the initial answer is A=(0,1,1 ..., 2,1), and the word vectors corresponding to the alternative answers are C1=(3,0,0 ..., 2,1), C2=(0,3,1 ..., 1,0), C3=(1,2,0 ..., 4,0) and C4=(2,1,1 ..., 0,0). In this step, the server may calculate the similarity of C1 and A, the similarity of C2 and A, the similarity of C3 and A and the similarity of C4 and A respectively, and then determine the alternative answer with the highest similarity as the target answer.
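The selection of the most similar alternative answer can be illustrated with the short sketch below. Cosine similarity is an assumption, since the embodiment does not name the measure. The vectors are hypothetical five-dimensional stand-ins that keep only the components visible in the example above; the elided components are unknown, so the outcome says nothing about the result for the full vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_target_answer(answer_vec, alternative_vecs):
    """Return the name of the alternative answer most similar to the initial answer."""
    return max(
        alternative_vecs,
        key=lambda name: cosine_similarity(answer_vec, alternative_vecs[name]),
    )


# Hypothetical shortened vectors (visible components only, elided ones dropped).
A = [0, 1, 1, 2, 1]
candidates = {
    "C1": [3, 0, 0, 2, 1],
    "C2": [0, 3, 1, 1, 0],
    "C3": [1, 2, 0, 4, 0],
    "C4": [2, 1, 1, 0, 0],
}
target = select_target_answer(A, candidates)
```

With these shortened vectors the similarities are roughly 0.51, 0.68, 0.82 and 0.31 for C1 to C4, so C3 is selected.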
S306, mine the target question and answer resource according to each initial problem and the target answer corresponding to each initial problem.
In a specific embodiment of the present invention, the server may mine the target question and answer resource according to each initial problem and the target answer corresponding to each initial problem. Specifically, according to initial problem 1 and its corresponding target answer 1, initial problem 2 and its corresponding target answer 2, ..., initial problem N and its corresponding target answer N, the server may mine the target question and answer resource pairs <initial problem 1, target answer 1>, <initial problem 2, target answer 2>, ..., <initial problem N, target answer N>; these target question and answer resource pairs constitute the target question and answer resource.
The mining method for question and answer resources proposed by the embodiment of the present invention first extracts the initial answer corresponding to each initial problem from each question and answer pair in the community question and answer resource; it then determines the target answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and each UGC content in the vertical class resource; finally, it mines the target question and answer resource according to each initial problem and the target answer corresponding to each initial problem. That is, in the technical solution of the present invention, the target answer corresponding to each initial problem can be determined from the initial answer corresponding to that initial problem and the UGC contents in the vertical class resource, and the target question and answer resource can then be mined from each initial problem and its target answer. In the existing mining method for question and answer resources, a batch of high-quality question and answer resources is mined from massive and heterogeneous UGC content by means of manual review and manual correction. With the existing method the labor cost is too high and the efficiency too low, which makes it difficult to apply in actual products. Therefore, compared with the prior art, the mining method for question and answer resources proposed by the embodiment of the present invention can not only save mining cost but also improve mining efficiency and mining accuracy; moreover, the technical solution of the embodiment of the present invention is simple and convenient to implement, easy to popularize, and widely applicable.
Embodiment four
Fig. 4 is a first structural schematic diagram of the mining device for question and answer resources provided by embodiment four of the present invention. As shown in Fig. 4, the mining device for question and answer resources includes an extraction module 401, a determining module 402 and a mining module 403, wherein:
The extraction module 401 is configured to extract the initial answer corresponding to each initial problem from each question and answer pair in the community question and answer resource;
The determining module 402 is configured to determine the target answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and each UGC content in the vertical class resource;
The mining module 403 is configured to mine the target question and answer resource according to each initial problem and the target answer corresponding to each initial problem.
Further, the determining module 402 is specifically configured to determine the alternative answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and each UGC content in the vertical class resource, and to determine the target answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and the alternative answer corresponding to each initial problem.
Fig. 5 is a second structural schematic diagram of the mining device for question and answer resources provided by embodiment four of the present invention. As shown in Fig. 5, the determining module 402 includes a computational submodule 4021 and a determining submodule 4022, wherein:
The computational submodule 4021 is configured to calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content;
The determining submodule 4022 is configured to determine the alternative answer corresponding to each initial problem according to the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content.
Further, the computational submodule 4021 is specifically configured to divide each initial answer and each UGC content, in units of basic sentences, into a first sentence dictionary and a second sentence dictionary respectively, and to calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content according to the first sentence dictionary and the second sentence dictionary.
Further, the computational submodule 4021 is also configured to calculate the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer;
The determining submodule 4022 is also configured to determine the target answer corresponding to each initial problem according to the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer.
The above mining device for question and answer resources can execute the method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method. For technical details not described in detail in this embodiment, reference may be made to the mining method for question and answer resources provided by any embodiment of the present invention.
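As a rough illustration of how the three modules might cooperate, the following sketch wires an extraction module, a determining module and a mining module together. The class and method names are hypothetical, the word-overlap function `word_overlap` is only a stand-in for the sentence-vector and word-vector similarities, and skipping an initial problem for which no UGC content exceeds the threshold is an assumption, since the embodiment does not describe that case.

```python
from typing import Callable, Dict, List, Tuple

Similarity = Callable[[str, str], float]


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of word sets; a hypothetical stand-in similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


class ExtractionModule:
    """Extracts the initial answer of each initial problem from Q&A pairs."""

    def extract(self, qa_pairs: List[Tuple[str, str]]) -> Dict[str, str]:
        return dict(qa_pairs)


class DeterminingModule:
    """Filters UGC contents into alternative answers, then picks the target answer."""

    def __init__(self, sentence_similarity: Similarity,
                 word_similarity: Similarity, threshold: float) -> None:
        self.sentence_similarity = sentence_similarity
        self.word_similarity = word_similarity
        self.threshold = threshold

    def determine(self, initial_answers: Dict[str, str],
                  ugc_contents: List[str]) -> Dict[str, str]:
        targets = {}
        for problem, answer in initial_answers.items():
            alternatives = [u for u in ugc_contents
                            if self.sentence_similarity(answer, u) > self.threshold]
            if alternatives:  # assumption: problems without alternatives are skipped
                targets[problem] = max(
                    alternatives, key=lambda u: self.word_similarity(answer, u))
        return targets


class MiningModule:
    """Assembles <initial problem, target answer> pairs into the target resource."""

    def mine(self, targets: Dict[str, str]) -> List[Tuple[str, str]]:
        return list(targets.items())


qa_pairs = [("how to cook rice", "add water and cook the rice")]
ugc_contents = ["cook the rice with water", "wash the car"]
initial_answers = ExtractionModule().extract(qa_pairs)
targets = DeterminingModule(word_overlap, word_overlap, 0.3).determine(
    initial_answers, ugc_contents)
resource = MiningModule().mine(targets)
```

Passing the two similarity functions in as parameters keeps the module structure independent of how the sentence vectors and word vectors are actually computed.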
Embodiment five
Fig. 6 is a structural schematic diagram of the server provided by embodiment five of the present invention. Fig. 6 shows a block diagram of an exemplary server suitable for implementing embodiments of the present invention. The server 12 shown in Fig. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in Fig. 6, server 12 takes the form of a general-purpose computing device. The components of server 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus and the Peripheral Component Interconnect (PCI) bus.
Server 12 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by server 12, including volatile and non-volatile media, and removable and non-removable media.
System memory 28 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. Server 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 6, commonly referred to as a "hard disk drive"). Although not shown in Fig. 6, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM or other optical medium) may be provided. In these cases, each drive may be connected to bus 18 through one or more data media interfaces. Memory 28 may include at least one program product having a set of (for example, at least one) program modules that are configured to perform the functions of the embodiments of the present invention.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules and program data; each of these examples, or some combination of them, may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.
Server 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with server 12, and/or with any device (such as a network card, a modem, etc.) that enables server 12 to communicate with one or more other computing devices. Such communication may take place through input/output (I/O) interface 22. Server 12 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through network adapter 20. As shown, network adapter 20 communicates with the other modules of server 12 through bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with server 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage systems.
Processing unit 16 executes various functional applications and data processing by running the programs stored in system memory 28, for example implementing the mining method for question and answer resources provided by the embodiments of the present invention.
Embodiment six
Embodiment six of the present invention provides a computer storage medium.
The computer-readable storage medium of the embodiment of the present invention may adopt any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in connection with an instruction execution system, apparatus or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device.
The program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to those embodiments; it may also include other equivalent embodiments without departing from the inventive concept, and the scope of the present invention is determined by the scope of the appended claims.
Claims (12)
1. A mining method for question and answer resources, characterized in that the method comprises:
extracting the initial answer corresponding to each initial problem from each question and answer pair in a community question and answer resource;
determining the target answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and each UGC content in a vertical class resource;
mining a target question and answer resource according to each initial problem and the target answer corresponding to each initial problem.
2. The method according to claim 1, characterized in that determining the target answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and each UGC content in the vertical class resource comprises:
determining the alternative answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and each UGC content in the vertical class resource;
determining the target answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and the alternative answer corresponding to each initial problem.
3. The method according to claim 2, characterized in that determining the alternative answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and each UGC content in the vertical class resource comprises:
calculating the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content;
determining the alternative answer corresponding to each initial problem according to the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content.
4. The method according to claim 3, characterized in that calculating the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content comprises:
dividing each initial answer and each UGC content, in units of basic sentences, into a first sentence dictionary and a second sentence dictionary respectively;
calculating the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content according to the first sentence dictionary and the second sentence dictionary.
5. The method according to claim 2, characterized in that determining the target answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and the alternative answer corresponding to each initial problem comprises:
calculating the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer;
determining the target answer corresponding to each initial problem according to the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer.
6. A mining device for question and answer resources, characterized in that the device comprises an extraction module, a determining module and a mining module, wherein:
the extraction module is configured to extract the initial answer corresponding to each initial problem from each question and answer pair in a community question and answer resource;
the determining module is configured to determine the target answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and each UGC content in a vertical class resource;
the mining module is configured to mine a target question and answer resource according to each initial problem and the target answer corresponding to each initial problem.
7. The device according to claim 6, characterized in that:
the determining module is specifically configured to determine the alternative answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and each UGC content in the vertical class resource, and to determine the target answer corresponding to each initial problem according to the initial answer corresponding to each initial problem and the alternative answer corresponding to each initial problem.
8. The device according to claim 7, characterized in that the determining module comprises a computational submodule and a determining submodule, wherein:
the computational submodule is configured to calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content;
the determining submodule is configured to determine the alternative answer corresponding to each initial problem according to the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content.
9. The device according to claim 8, characterized in that:
the computational submodule is specifically configured to divide each initial answer and each UGC content, in units of basic sentences, into a first sentence dictionary and a second sentence dictionary respectively, and to calculate the sentence vector corresponding to each initial answer and the sentence vector corresponding to each UGC content according to the first sentence dictionary and the second sentence dictionary.
10. The device according to claim 7, characterized in that the computational submodule is also configured to calculate the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer;
the determining submodule is also configured to determine the target answer corresponding to each initial problem according to the word vector corresponding to each initial answer and the word vector corresponding to each alternative answer.
11. A server, characterized by comprising:
one or more processors;
a memory for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the mining method for question and answer resources according to any one of claims 1 to 5.
12. A storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the mining method for question and answer resources according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810696978.4A CN109062973A (en) | 2018-06-29 | 2018-06-29 | A kind of method for digging, device, server and the storage medium of question and answer resource |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109062973A true CN109062973A (en) | 2018-12-21 |
Family ID: 64818461
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810696978.4A Pending CN109062973A (en) | 2018-06-29 | 2018-06-29 | A kind of method for digging, device, server and the storage medium of question and answer resource |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109062973A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181221 |