CN105095444A - Information acquisition method and device - Google Patents


Info

Publication number
CN105095444A
CN105095444A (application CN201510441024.5A)
Authority
CN
China
Prior art keywords
answer
word
information
context
question
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510441024.5A
Other languages
Chinese (zh)
Inventor
霍华荣
马艳军
吴华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201510441024.5A
Publication of CN105095444A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G06F16/3329: Natural language query formulation or dialogue systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The application discloses an information acquisition method and device. An embodiment of the method includes: acquiring a plurality of question-answer pairs in a data set, and extracting at least one question word and at least one answer word of each question-answer pair; determining contexts of the question words and the answer words; training a preset model with the question words, answer words, and contexts as training samples to obtain a word-vector set; receiving question information to be responded to; and, based on the word-vector set, acquiring answer information matching the question information from the data set. The method and device train word vectors by evaluating the semantic correlation of question-answer pairs, which improves the speed and accuracy of word-vector training. Because matching information is then retrieved with the word vectors, no complex supervised neural-network training is needed, and the speed and accuracy of information acquisition can be improved.

Description

Information acquisition method and device
Technical field
The present application relates to the field of computer technology, specifically to the field of terminal technology, and particularly to an information acquisition method and device.
Background technology
At present, people obtain information on the Internet mainly through search engines; a user has to browse a large number of web pages to find an answer, which is inefficient. Deep question-answering technology makes search more intelligent, provides users with more accurate answers, and reduces the cost of obtaining information. With the development of online question-answering websites such as Baidu Knows, a large amount of user-generated question-and-answer data has accumulated, providing data support for deep question answering.
However, these question-answer pairs are of uneven quality, with two main problems: being off-topic and being verbose. "Off-topic" means that the main content of the answer is unrelated to the question, i.e. the answer is irrelevant. "Verbose" means that the answer is overly long and, besides the answer sentence proper, contains many irrelevant or supplementary sentences that do not directly answer the question.
Summary of the invention
In view of the above defects or deficiencies in the prior art, it is desirable to provide a scheme that improves the speed and accuracy of information acquisition. To achieve one or more of the above objects, the present application provides an information acquisition method and device.
In a first aspect, the present application provides an information acquisition method, the method comprising: acquiring a plurality of question-answer pairs in a data set, and extracting at least one question word and at least one answer word of each question-answer pair; determining contexts of the question words and the answer words; training a preset model with the question words, answer words, and contexts as training samples to obtain a word-vector set; receiving question information to be responded to; and, based on the word-vector set, acquiring answer information matching the question information from the data set.
In a second aspect, the present application provides an information acquisition device, the device comprising: an extraction module for acquiring a plurality of question-answer pairs in a data set and extracting at least one question word and at least one answer word of each question-answer pair; a determination module for determining contexts of the question words and the answer words; a training module for training a preset model with the question words, answer words, and contexts as training samples to obtain a word-vector set; a receiving module for receiving question information to be responded to; and an acquisition module for acquiring, based on the word-vector set, answer information matching the question information from the data set.
With the information acquisition method and device provided by the present application, a plurality of question-answer pairs in a data set may first be acquired and at least one question word and at least one answer word of each pair extracted; the contexts of the question words and answer words are then determined; a preset model is next trained with the question words, answer words, and contexts as training samples to obtain a word-vector set; finally, question information to be responded to is received and, based on the word-vector set, matching answer information is acquired from the data set. The present application trains word vectors by evaluating the semantic correlation of question-answer pairs, which improves the speed and accuracy of word-vector training. Because matching information is then retrieved with the word vectors, no complex supervised neural-network training is needed, improving the speed and accuracy of information acquisition.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent by reading the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the present application may be applied;
Fig. 2 shows a flowchart of one embodiment of the information acquisition method provided according to the present application;
Fig. 3 shows a flowchart of another embodiment of the information acquisition method provided according to the present application;
Fig. 4 shows a flowchart of a further embodiment of the information acquisition method provided according to the present application;
Fig. 5 shows a flowchart of one embodiment of a method, provided according to the present application, for acquiring answer information matching the question information from the data set;
Fig. 6 shows a schematic diagram of the functional module architecture of one embodiment of an information acquisition device 600 provided according to the present application; and
Fig. 7 shows a schematic structural diagram of a computer system 700 suitable for implementing a terminal device or server of the embodiments of the present application.
Detailed description of the embodiments
The present application is described in further detail below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention and are not limitations on it. It should also be noted that, for convenience of description, only the parts relevant to the invention are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments of the present application and the features within them may be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Please refer to Fig. 1, which illustrates an exemplary system architecture 100 to which embodiments of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may comprise terminal devices 101 and 102, a network 103, and a server 104. The network 103 provides the medium of the communication links between the terminal devices 101, 102 and the server 104, and may include various connection types, such as wired or wireless communication links or fiber-optic cables.
A user 110 may use the terminal devices 101, 102 to interact with the server 104 through the network 103, so as to receive or send messages. For example, the user may obtain from the server 104, through the network 103, answer information matching question information to be responded to. The terminal devices 101, 102 may be installed with various communication client applications, such as instant messaging tools, mail clients, and social platform software.
The terminal devices 101, 102 may be various electronic devices, including but not limited to personal computers, smartphones, tablet computers, and personal digital assistants.
The server 104 may be a server providing various services. For example, the server may store and analyse received data, and feed the processing results back to the terminal devices.
It should be noted that the information acquisition method provided by the embodiments of the present application may be executed by the terminal devices 101, 102 or by the server 104; likewise, the information acquisition device may be arranged in the terminal devices 101, 102 or in the server 104. In certain embodiments, the preset model may be trained in the server 104, and the resulting word-vector set stored in the terminal devices 101, 102 for acquiring answer information matching the question information. For example, if the network 103 is available, the server 104 may acquire the answer information matching the question information to be responded to from the data set and return it; if there is no network or the network 103 is unavailable, the terminal devices 101, 102 may acquire the matching answer information from the data set directly.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative; there may be any number of each, according to implementation needs.
With further reference to Fig. 2, a flow 200 of one embodiment of the information acquisition method provided according to the present application is illustrated.
As shown in Fig. 2, in step 201, a plurality of question-answer pairs in a data set are acquired, and at least one question word and at least one answer word of each question-answer pair are extracted.
To acquire information on the basis of word vectors, samples for training the word vectors must first be collected. In this embodiment, a plurality of question-answer pairs in the data set may first be acquired as the samples for training the word vectors. The data set may be, for example, a pre-built database containing a large number of question-answer pairs; for instance, question-answer pairs may be obtained from the network and saved in the data set. The data set may be kept on a server or on a terminal.
After the plurality of question-answer pairs in the data set are acquired, at least one question word and at least one answer word of each pair may further be extracted. Each question-answer pair consists of one question sentence and one or more answer sentences. In this embodiment, the question sentence and the answer sentences of a pair may each be split into one or more words, so that every word in a question sentence or an answer sentence can be extracted as a question word or an answer word respectively.
In an optional implementation of this embodiment, the question-answer pairs may comprise speech question-answer pairs as well as text question-answer pairs. When a speech question-answer pair is acquired, it may be converted into a text question-answer pair by speech recognition, and at least one question word and at least one answer word then extracted from the converted text pair.
In an optional implementation of this embodiment, in order to distinguish question words from answer words, a first prefix may be added to each question word and a second prefix to each answer word. For example, each question word may be given the prefix "Q" and each answer word the prefix "A".
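The extraction and prefixing steps above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the "Q_"/"A_" prefixes follow the example in the text, and the whitespace split stands in for a real word segmenter (the original Chinese sentences would need a segmenter such as jieba).

```python
def extract_qa_words(question, answer):
    # Split each sentence into words (whitespace split as a stand-in
    # for a real segmenter) and tag them so question words and answer
    # words remain distinguishable during training.
    q_words = ["Q_" + w for w in question.split()]
    a_words = ["A_" + w for w in answer.split()]
    return q_words, a_words

q, a = extract_qa_words("is the Lenovo notebook easy to use",
                        "personally I feel the Lenovo computer is pretty good")
```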
In step 202, the contexts of the question words and the answer words are determined.
After at least one question word and at least one answer word of each question-answer pair have been extracted in step 201, the contexts of the question words and answer words may further be determined. For example, by a predefined rule the question words and answer words of the same question-answer pair may be given an identical context, or each question word and each answer word may be given its context in the original question sentence or answer sentence. In one implementation, a context length may be set for the question words and answer words; the contexts may be set to the same length (e.g. 7) or to different lengths.
In step 203, a preset model is trained with the question words, answer words, and contexts as training samples, and a word-vector set is obtained.
After the question words and answer words in the data set, together with their contexts, have been obtained, the preset training model may be trained with these data as sample data. Since the final purpose of training is to determine the word-vector set, the word-vector set may be regarded as an unknown parameter of the preset model, and the model then trained. When this parameter allows the preset model to meet a specific training objective, the parameter at that point may be taken to be the word-vector set to be determined.
In an optional implementation of this embodiment, each word vector may be a low-dimensional real-valued vector whose dimension is no more than 1000. For example, the finally determined word vector may take a form such as [0.355, -0.687, -0.168, 0.103, -0.231, ...]: a low-dimensional real-valued vector whose dimension is generally an integer not exceeding 1000. If the dimension is too small, the differences between words cannot be represented adequately; if it is too large, the amount of computation grows. Optionally, the dimension of the word vectors may lie between 50 and 1000, balancing accuracy and computational efficiency.
In step 204, question information to be responded to is received.
After the word vectors are obtained in step 203, information may be acquired on their basis. First, question information to be responded to may be received. Specifically, a user may enter the question to be queried in the search box of a browser; the question may be a keyword or a complete sentence, and the content entered by the user serves as the question information to be responded to.
In step 205, based on the word-vector set, answer information matching the question information is acquired from the data set.
In this embodiment, after the search system receives the question information entered by the user, it may first search the data set for the question information to obtain a plurality of answer information items corresponding to it, and then retrieve among them one or more answer information items matching the question information. For example, the matching degree between the question information and each answer information item may be computed on the basis of the word-vector set, and answer information whose matching degree satisfies a preset condition (e.g. is greater than 80%) taken as the matching answer information.
In an optional implementation of this embodiment, the one or more matching answer information items obtained may be presented on the terminal for the user to view.
With the information acquisition method provided by this embodiment, a plurality of question-answer pairs in a data set may first be acquired and at least one question word and at least one answer word of each pair extracted; the contexts of the question words and answer words are then determined; a preset model is next trained with the question words, answer words, and contexts as training samples to obtain a word-vector set; finally, question information to be responded to is received, and answer information matching it is acquired from the data set on the basis of the word-vector set. Training the word vectors by evaluating the semantic correlation of the question-answer pairs improves the speed and accuracy of word-vector training, and because matching information is retrieved with the word vectors, no complex supervised neural-network training is needed, improving the speed and accuracy of information acquisition.
With further reference to Fig. 3, a flow 300 of another embodiment of the information acquisition method provided according to the present application is illustrated.
As shown in Fig. 3, in step 301, a plurality of question-answer pairs in a data set are acquired, and at least one question word and at least one answer word of each question-answer pair are extracted.
In this embodiment, step 301 of flow 300 is substantially identical to step 201 of flow 200 and is not described again here.
In step 302, the context of each question word is determined.
In this embodiment, the context of each question word may first be determined. For example, the original context of the question sentence in which a question word appears may be taken as that word's context. For the following question-answer pair (question: "Is the Lenovo notebook easy to use?"; answer: "Personally I feel the Lenovo computer is pretty good"), the context of the question word "notebook" may be determined to be "Is the Lenovo easy to use".
In step 303, the context of any question word is determined as the context of all answer words of the question-answer pair to which that question word belongs.
In this embodiment, the context of any question word may be determined as the context of all the answer words of the question-answer pair in which that question word appears. Within one question-answer pair, a question sentence may correspond to one or more answer sentences, i.e. each question word may correspond to one or more answer words. In this embodiment, any question word of a pair and all of its corresponding answer words may be given the same context: specifically, the context of the question word is determined as the context of all answer words of the pair. For example, for the question-answer pair above, when the context of the question word "notebook" is "Is the Lenovo easy to use", the context of each answer word "Lenovo", "computer", "personally", "feel", "good" may all be set to that same context, "Is the Lenovo easy to use".
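Steps 302-303 can be sketched as a small hypothetical helper (the function name and the default window of 7 are illustrative, not from the patent): each question word takes the rest of its question sentence as context, and every answer word of the pair is then assigned that same context.

```python
def build_samples(q_words, a_words, window=7):
    # Step 302: a question word's context is the remaining words of
    # its question sentence, clipped to `window` words.
    # Step 303: every answer word of the pair shares that context.
    samples = []
    for i, qw in enumerate(q_words):
        ctx = (q_words[:i] + q_words[i + 1:])[:window]
        samples.append((qw, ctx))
        for aw in a_words:
            samples.append((aw, ctx))
    return samples

samples = build_samples(["Lenovo", "notebook", "handy"], ["feel", "good"])
```

Each question word thus contributes one sample for itself and one per answer word, all with the identical context vector source.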
In step 304, a preset model is trained with the question words, answer words, and contexts as training samples, and a word-vector set is obtained.
In this embodiment, the preset model may be the following function:
p = \sum_{\langle q, a \rangle \in D} \sum_{i=1}^{|q|} \left( \log p(q_i \mid C_{q_i}) + \sum_{j=1}^{|a|} \log p(a_j \mid C_{q_i}) \right)
where ⟨q, a⟩ is a question-answer pair in the data set D; |q| is the number of question words in the pair; q_i is the vector of the i-th question word of the pair, and C_{q_i} is the context vector of the i-th question word; |a| is the number of answer words in the pair; and a_j is the vector of the j-th answer word of the pair. p(q_i | C_{q_i}) and p(a_j | C_{q_i}) are determined by the following formula:
p(w \mid C_w) = \frac{\exp(w \cdot C_w)}{\sum_{u=1}^{V} \exp(w_u \cdot C_{w_u})}
where w is the vector of any word and C_w is the context vector of that word; w_u is the vector of the u-th word in the data set D and C_{w_u} is its context vector; and V is the number of words contained in the data set D.
As can be seen from the above formulas, in this embodiment, when the vector q_i of a question word of a pair is determined, the context vector of every answer word of the pair is identical, namely the context vector C_{q_i} of that question word.
In an optional implementation of this embodiment, the word-vector set may be determined with the maximization of the above function as the training objective.
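The inner probability p(w | C_w) is a softmax over the vocabulary. A minimal numeric sketch, assuming toy two-dimensional vectors and a `vocab` list of (word vector, context vector) pairs standing in for the data set D:

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def word_prob(w_vec, c_vec, vocab):
    # p(w | C_w) = exp(w . C_w) / sum_u exp(w_u . C_{w_u}), where the
    # denominator runs over every (word vector, context vector) pair
    # in the data set.
    denom = sum(math.exp(dot(wu, cu)) for wu, cu in vocab)
    return math.exp(dot(w_vec, c_vec)) / denom

vocab = [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0])]
p = word_prob([1.0, 0.0], [1.0, 0.0], vocab)  # both terms equal, so 0.5
```

The training objective then maximizes the sum of log-probabilities of this form over all question and answer words.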
In step 305, question information to be responded to is received.
In step 306, based on the word-vector set, answer information matching the question information is acquired from the data set.
In this embodiment, steps 305-306 of flow 300 are substantially identical to steps 204-205 of flow 200 respectively and are not described again here.
In this embodiment, after the contexts of the question words are determined, the context of any question word may be determined as the context of all the answer words of its question-answer pair, and the preset model then trained with the question words, answer words, and contexts as training samples to obtain the word-vector set, which can improve the accuracy of the word vectors.
With further reference to Fig. 4, a flow 400 of a further embodiment of the information acquisition method provided according to the present application is illustrated.
As shown in Fig. 4, in step 401, a plurality of question-answer pairs in a data set are acquired, and at least one question word and at least one answer word of each question-answer pair are extracted.
In this embodiment, step 401 of flow 400 is substantially identical to step 201 of flow 200 and is not described again here.
In step 402, the context of each answer word is determined.
In this embodiment, the context of each answer word may first be determined. For example, the original context of the answer sentence in which an answer word appears may be taken as that word's context. For the following question-answer pair (question: "Is the Lenovo notebook easy to use?"; answer: "Personally I feel the Lenovo computer is pretty good"), the context of the answer word "computer" may be determined to be "Personally I feel the Lenovo is pretty good". Alternatively or preferably, when a context length (e.g. 7) is set for the answer words, the context of the answer word "computer" may be truncated, for example to "Personally I feel the Lenovo".
In step 403, the context of any answer word is determined as the context of all question words of the question-answer pair to which that answer word belongs.
In this embodiment, the context of any answer word may be determined as the context of all the question words of the question-answer pair in which that answer word appears. Within one question-answer pair, an answer sentence may correspond to one or more question words. In this embodiment, any answer word of a pair and all of its corresponding question words may be given the same context: specifically, the context of the answer word is determined as the context of all question words of the pair. For example, for the question-answer pair above, when the context of the answer word "computer" is "Personally I feel the Lenovo is pretty good", the context of each question word "Lenovo", "notebook", "easy to use" may all be set to that same context. Alternatively or preferably, when a context length (e.g. 7) is set and the context of the answer word "computer" is truncated to "Personally I feel the Lenovo", the context of each question word "Lenovo", "notebook", "easy to use" is likewise set to that truncated context.
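Steps 402-403, including the optional context-length cap, can be sketched as a hypothetical helper (the function name, the dictionary return shape, and the default length of 7 are illustrative):

```python
def share_answer_context(answer_ctx, q_words, max_len=7):
    # Step 402 (optional): clip the answer word's context to the
    # preset length.
    ctx = answer_ctx[:max_len]
    # Step 403: hand that single context to every question word of
    # the question-answer pair.
    return {qw: ctx for qw in q_words}

ctx = ["personally", "I", "feel", "the", "Lenovo", "is", "pretty", "good"]
shared = share_answer_context(ctx, ["Lenovo", "notebook", "handy"])
```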
In step 404, a preset model is trained with the question words, answer words, and contexts as training samples, and a word-vector set is obtained.
In this embodiment, the preset model may be the following function:
p = \sum_{\langle q, a \rangle \in D} \sum_{j=1}^{|a|} \left( \log p(a_j \mid C_{a_j}) + \sum_{i=1}^{|q|} \log p(q_i \mid C_{a_j}) \right)
where ⟨q, a⟩ is a question-answer pair in the data set D; |q| is the number of question words in the pair; q_i is the vector of the i-th question word of the pair; C_{a_j} is the context vector of the j-th answer word of the pair; |a| is the number of answer words in the pair; and a_j is the vector of the j-th answer word of the pair. p(a_j | C_{a_j}) and p(q_i | C_{a_j}) are determined by the following formula:
p(w \mid C_w) = \frac{\exp(w \cdot C_w)}{\sum_{u=1}^{V} \exp(w_u \cdot C_{w_u})}
where w is the vector of any word and C_w is the context vector of that word; w_u is the vector of the u-th word in the data set D and C_{w_u} is its context vector; and V is the number of words contained in the data set D.
As can be seen from the above formulas, in this embodiment, when the vector a_j of an answer word of a pair is determined, the context vector of every question word of the pair is identical, namely the context vector C_{a_j} of that answer word.
In an optional implementation of this embodiment, the word-vector set may be determined with the maximization of the above function as the training objective.
In step 405, question information to be responded to is received.
In step 406, based on the word-vector set, answer information matching the question information is acquired from the data set.
In this embodiment, steps 405-406 of flow 400 are substantially identical to steps 204-205 of flow 200 respectively and are not described again here.
In this embodiment, after the contexts of the answer words are determined, the context of any answer word may be determined as the context of all the question words of its question-answer pair, and the preset model then trained with the question words, answer words, and contexts as training samples to obtain the word-vector set, which can improve the accuracy of the word vectors.
With further reference to Fig. 5, a flow 500 of one embodiment of the method, provided according to the present application, for acquiring answer information matching the question information from the data set is illustrated.
As shown in Fig. 5, in step 501, the question sentence vector of the question information and the answer sentence vector of each answer information item in the data set are constructed according to the word-vector set.
In this embodiment, answer information matching the question information to be responded to may be acquired from the data set on the basis of the trained word vectors. Specifically, the question sentence vector of the question information and the answer sentence vector of each answer information item in the data set may first be constructed according to the word-vector set.
In one implementation, the question sentence vector of the question information and the answer sentence vector of each answer information item in the data set may be constructed according to the following formula:
s = \sum_{i=1}^{m} \frac{1}{\log c_i} \, w_i
where s is the vector of any sentence, m is the length of the sentence, w_i is the word vector of the i-th word in the sentence, and c_i is the number of times the i-th word occurs in the data set.
First, the question information and each answer information item in the data set may be split into words. When splitting, if the question information or answer information is a sentence composed of several words, it may be split into words according to ordinary syntactic rules; if it is a single word, that word is taken as the result of the split. In this way, each question information or answer information item is split into at least one word. Each word is then represented by its trained vector, and the above formula is used to construct, for the question information to be responded to and for each answer information item in the data set, a question sentence vector and answer sentence vectors each composed of one or more word vectors.
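The construction formula can be sketched directly. A minimal Python version, assuming plain list vectors and that every count c_i is greater than 1 so that log c_i is positive (c_i = 1 would divide by zero):

```python
import math

def sentence_vector(word_vecs, counts):
    # s = sum_i (1 / log c_i) * w_i: each word vector is weighted by
    # the reciprocal log of its data-set frequency, so rarer words
    # contribute more to the sentence vector.
    dim = len(word_vecs[0])
    s = [0.0] * dim
    for w, c in zip(word_vecs, counts):
        weight = 1.0 / math.log(c)
        for d in range(dim):
            s[d] += weight * w[d]
    return s

# Two toy word vectors with counts e^2 and e give weights 0.5 and 1.0.
s = sentence_vector([[1.0, 0.0], [0.0, 1.0]], [math.e ** 2, math.e])
```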
In step 502, the correlation between the question sentence vector and each answer sentence vector is computed.
After the question sentence vector and the answer sentence vector of each answer information item in the data set have been obtained in step 501, the correlation between the question sentence vector and each answer sentence vector may be computed. The correlation represents the degree of association between two sentence vectors: the larger the value, the closer the two vectors, with a range of [-1, 1]. A correlation of 1 may be taken to mean the two sentence vectors are identical, and a correlation of -1 to mean they are completely different.
In one implementation, the correlation between the question sentence vector and an answer sentence vector may be computed according to the following formula:
$$ \mathrm{Score}(s_a, s_b) = \frac{s_a \cdot s_b}{\lVert s_a \rVert_2 \, \lVert s_b \rVert_2} + \lambda\, C(s_a, s_b) $$
where s_a and s_b are the vectors of question sentence a and answer sentence b respectively, Score(s_a, s_b) is the correlation between the question sentence vector s_a and the answer sentence vector s_b, λ is a constant between 0.18 and 0.24, and C(s_a, s_b) is the word co-occurrence count between sentence a and sentence b.
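As a sketch, this correlation is cosine similarity plus a co-occurrence bonus (hypothetical helper name; λ = 0.2 is just one value inside the stated 0.18-0.24 range):

```python
import math

def score(s_a, s_b, cooccur, lam=0.2):
    # Correlation = cosine similarity of the two sentence vectors
    # plus lambda times the word co-occurrence count C(s_a, s_b).
    dot = sum(x * y for x, y in zip(s_a, s_b))
    norm_a = math.sqrt(sum(x * x for x in s_a))
    norm_b = math.sqrt(sum(y * y for y in s_b))
    cosine = dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
    return cosine + lam * cooccur
```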
In step 503, the answer information matching the question information is determined based on the correlation.
After the correlation between the question sentence vector and each answer sentence vector has been computed in step 502, one or more pieces of answer information matching the question information can be determined according to the concrete correlation values. In one possible implementation, the answer information corresponding to the answer sentence vector with the highest correlation to the question sentence vector is determined to be the answer information matching the question information. In another implementation, a correlation threshold may be preset, and every piece of answer information whose answer sentence vector has a correlation with the question sentence vector greater than or equal to that threshold is determined to be answer information matching the question information.
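The two selection variants can be sketched as follows (hypothetical function; the correlation values are assumed to have been computed already):

```python
def select_answers(correlations, answers, threshold=None):
    # Pair each candidate answer with its correlation to the question vector.
    scored = list(zip(correlations, answers))
    if threshold is None:
        # Variant 1: the single answer with the highest correlation.
        return [max(scored, key=lambda t: t[0])[1]]
    # Variant 2: every answer at or above the preset threshold.
    return [a for s, a in scored if s >= threshold]
```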
With the information acquisition method of this embodiment, the question sentence vector of the question information and the answer sentence vector of each piece of answer information in the dataset are first constructed; the correlation between the question sentence vector and each answer sentence vector is then computed; finally, the answer information matching the question information is determined based on the correlation. Because the present application obtains matching information based on word vectors, no complex supervised neural-network training is required, which improves both the speed and the accuracy of information acquisition.
It should be noted that although the operations of the method of the present invention are described in a particular order in the accompanying drawings, this does not require or imply that these operations must be performed in that particular order, or that all of the illustrated operations must be performed to achieve the desired result. On the contrary, the steps depicted in the flowcharts may be executed in a different order. Additionally or alternatively, some steps may be omitted, multiple steps may be combined and executed as one step, and/or one step may be decomposed into multiple steps.
With further reference to Fig. 6, a schematic diagram of the functional module architecture of an embodiment of the information acquisition device 600 provided according to the present application is shown.
As shown in Fig. 6, the information acquisition device 600 provided by this embodiment comprises: an extraction module 610, a determination module 620, a training module 630, a receiving module 640 and an acquisition module 650. The extraction module 610 is configured to acquire a plurality of question-answer pairs in a dataset and to extract at least one question word and at least one answer word of each question-answer pair; the determination module 620 is configured to determine the contexts of the question words and the answer words; the training module 630 is configured to train a preset model using the question words, answer words and contexts as training samples, to obtain a word vector set; the receiving module 640 is configured to receive question information to be answered; and the acquisition module 650 is configured to acquire, based on the word vector set, answer information matching the question information from the dataset.
In an optional implementation of this embodiment, the determination module 620 is configured to determine the contexts of the question words and the answer words according to the following steps: determine the context of each question word; take the context of any question word as the context of all answer words of the question-answer pair in which that question word appears.
In another optional implementation of this embodiment, the preset model is the following function:
$$ p = \sum_{\langle q,a \rangle \in D} \sum_{i=1}^{|q|} \left( \log p(q_i \mid C_{q_i}) + \sum_{j=1}^{|a|} \log p(a_j \mid C_{q_i}) \right) $$
where <q, a> is a question-answer pair in dataset D, |q| is the number of question words in the pair, q_i is the vector of the i-th question word of the pair, C_{q_i} is the context vector of the i-th question word of the pair, |a| is the number of answer words in the pair, and a_j is the vector of the j-th answer word of the pair; p(q_i | C_{q_i}) and p(a_j | C_{q_i}) are determined by the following formula:
$$ p(w \mid C_w) = \frac{\exp(w \cdot C_w)}{\sum_{u=1}^{V} \exp(w_u \cdot C_{w_u})} $$
where w is the vector of any word, C_w is the context vector of that word, w_u is the vector of the u-th word in dataset D, C_{w_u} is the context vector of the u-th word, and V is the number of words contained in dataset D.
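Under these definitions, p(w | C_w) is a softmax-style normalization over all V words of the dataset. A toy sketch (hypothetical names, tiny vocabulary):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def context_prob(w_vec, c_vec, vocab_pairs):
    # p(w | C_w) = exp(w . C_w) / sum_u exp(w_u . C_{w_u})
    # vocab_pairs: list of (word vector, context vector) over all V words of D.
    numerator = math.exp(dot(w_vec, c_vec))
    denominator = sum(math.exp(dot(wu, cu)) for wu, cu in vocab_pairs)
    return numerator / denominator
```

In practice the full sum over V is expensive, which is why trained systems commonly approximate it; this sketch only illustrates the formula as written.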
In another optional implementation of this embodiment, the determination module 620 is configured to determine the contexts of the question words and the answer words according to the following steps: determine the context of each answer word; take the context of any answer word as the context of all question words of the question-answer pair in which that answer word appears.
In another optional implementation of this embodiment, the preset model is the following function:
$$ p = \sum_{\langle q,a \rangle \in D} \sum_{j=1}^{|a|} \left( \log p(a_j \mid C_{a_j}) + \sum_{i=1}^{|q|} \log p(q_i \mid C_{a_j}) \right) $$
where <q, a> is a question-answer pair in dataset D, |q| is the number of question words in the pair, q_i is the vector of the i-th question word of the pair, C_{a_j} is the context vector of the j-th answer word of the pair, |a| is the number of answer words in the pair, and a_j is the vector of the j-th answer word of the pair; p(a_j | C_{a_j}) and p(q_i | C_{a_j}) are determined by the following formula:
$$ p(w \mid C_w) = \frac{\exp(w \cdot C_w)}{\sum_{u=1}^{V} \exp(w_u \cdot C_{w_u})} $$
where w is the vector of any word, C_w is the context vector of that word, w_u is the vector of the u-th word in dataset D, C_{w_u} is the context vector of the u-th word, and V is the number of words contained in dataset D.
In another optional implementation of this embodiment, the training module 630 is configured to train the preset model and obtain the word vector set according to the following step: determine the word vector set with maximization of the preset model as the training objective.
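Taking maximization of the model function as the training objective can be illustrated with a generic gradient-ascent routine (a toy sketch only; the objective and starting point below are hypothetical, and practical word-vector training uses stochastic gradients with sampling approximations rather than finite differences):

```python
def gradient_ascent(objective, params, lr=0.1, steps=200, eps=1e-6):
    # Numerically maximize objective(params) by finite-difference gradient ascent.
    params = list(params)
    for _ in range(steps):
        base = objective(params)
        grad = []
        for i in range(len(params)):
            shifted = params[:]
            shifted[i] += eps
            grad.append((objective(shifted) - base) / eps)
        # Move parameters uphill along the estimated gradient.
        params = [p + lr * g for p, g in zip(params, grad)]
    return params

# Toy objective with a unique maximum at (1, -2).
best = gradient_ascent(lambda p: -(p[0] - 1) ** 2 - (p[1] + 2) ** 2, [0.0, 0.0])
```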
In another optional implementation of this embodiment, the acquisition module 650 comprises: a construction submodule, configured to construct, according to the word vector set, the question sentence vector of the question information and the answer sentence vector of each piece of answer information in the dataset; a calculation submodule, configured to compute the correlation between the question sentence vector and each answer sentence vector; and a determination submodule, configured to determine, based on the correlation, the answer information matching the question information.
In another optional implementation of this embodiment, the construction submodule is configured to construct the question sentence vector of the question information and the answer sentence vector of each piece of answer information in the dataset according to the following formula:
$$ s = \sum_{i=1}^{m} \frac{1}{\log c_i}\, w_i $$
where s is the vector of the sentence, m is the length of the sentence (in words), w_i is the word vector of the i-th word in the sentence, and c_i is the number of times the i-th word of the sentence occurs in the dataset.
In another optional implementation of this embodiment, the calculation submodule is configured to compute the correlation between the question sentence vector and an answer sentence vector according to the following formula:
$$ \mathrm{Score}(s_a, s_b) = \frac{s_a \cdot s_b}{\lVert s_a \rVert_2 \, \lVert s_b \rVert_2} + \lambda\, C(s_a, s_b) $$
where s_a and s_b are the vectors of question sentence a and answer sentence b respectively, Score(s_a, s_b) is the correlation between the question sentence vector s_a and the answer sentence vector s_b, λ is a constant between 0.18 and 0.24, and C(s_a, s_b) is the word co-occurrence count between sentence a and sentence b.
In another optional implementation of this embodiment, the question-answer pairs comprise voice question-answer pairs and text question-answer pairs; the device further comprises: a conversion module, configured to convert the voice question-answer pairs into text question-answer pairs.
It should be appreciated that all of the units or modules recorded in the information acquisition device shown in Fig. 6 correspond to the steps of the methods described with reference to Figs. 2-5. Thus, the operations and features described above for the methods apply equally to the device shown in Fig. 6 and the modules it comprises, and are not repeated here.
With the information acquisition device provided by this embodiment, the extraction module first acquires a plurality of question-answer pairs in the dataset and extracts at least one question word and at least one answer word of each question-answer pair; the determination module then determines the contexts of the question words and the answer words; the training module next trains the preset model using the question words, answer words and contexts as training samples, to obtain a word vector set; finally, the receiving module receives question information to be answered, and the acquisition module acquires, based on the word vector set, the answer information matching the question information from the dataset. Training word vectors by evaluating the relevance of question-answer pairs at the semantic level improves both the speed and the accuracy of word vector training. And because matching information is obtained based on word vectors, no complex supervised neural-network training is required, improving the speed and accuracy of information acquisition.
Referring now to Fig. 7, a schematic structural diagram of a computer system 700 suitable for implementing the terminal device or server of the embodiments of the present application is shown.
As shown in Fig. 7, the computer system 700 comprises a central processing unit (CPU) 701, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage section 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the system 700. The CPU 701, the ROM 702 and the RAM 703 are connected to one another via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the I/O interface 705: an input section 706 comprising a keyboard, a mouse and the like; an output section 707 comprising a cathode ray tube (CRT), a liquid crystal display (LCD) and the like, and a loudspeaker; a storage section 708 comprising a hard disk and the like; and a communication section 709 comprising a network interface card such as a LAN card or a modem. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 710 as needed, so that a computer program read from it can be installed into the storage section 708 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure comprises a computer program product which comprises a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 709, and/or installed from the removable medium 711.
The flowcharts and block diagrams in the accompanying drawings illustrate the architectures, functions and operations of possible implementations of the systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code, which comprises one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the present application may be implemented in software or in hardware. The described units or modules may also be arranged in a processor; for example, they may be described as: a processor comprising an extraction module, a determination module, a training module, a receiving module and an acquisition module. The names of these units or modules do not, under certain circumstances, constitute a limitation of the units or modules themselves; for example, the extraction module may also be described as "a module for acquiring a plurality of question-answer pairs in a dataset and extracting at least one question word and at least one answer word of each question-answer pair".
As another aspect, the present application also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the device of the above embodiments, or a computer-readable storage medium that exists separately and is not assembled into a terminal. The computer-readable storage medium stores one or more programs, which are used by one or more processors to perform the information acquisition methods described in the present application.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the particular combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the described inventive concept; for example, technical solutions formed by mutually replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (17)

1. An information acquisition method, characterized in that the method comprises:
acquiring a plurality of question-answer pairs in a dataset, and extracting at least one question word and at least one answer word of each question-answer pair;
determining the contexts of the question words and the answer words;
training a preset model using the question words, answer words and contexts as training samples, to obtain a word vector set;
receiving question information to be answered;
acquiring, based on the word vector set, answer information matching the question information from the dataset.
2. The method according to claim 1, characterized in that determining the contexts of the question words and the answer words comprises:
determining the context of each question word;
taking the context of any question word as the context of all answer words of the question-answer pair in which that question word appears.
3. The method according to claim 1, characterized in that determining the contexts of the question words and the answer words comprises:
determining the context of each answer word;
taking the context of any answer word as the context of all question words of the question-answer pair in which that answer word appears.
4. The method according to claim 1, characterized in that training the preset model to obtain the word vector set comprises:
determining the word vector set with maximization of the preset model as the training objective.
5. The method according to claim 1, characterized in that acquiring, based on the word vector set, answer information matching the question information from the dataset comprises:
constructing, according to the word vector set, a question sentence vector of the question information and an answer sentence vector of each piece of answer information in the dataset;
computing the correlation between the question sentence vector and each answer sentence vector;
determining, based on the correlation, the answer information matching the question information.
6. The method according to claim 5, characterized in that constructing the question sentence vector of the question information and the answer sentence vector of each piece of answer information in the dataset comprises:
constructing a sentence vector according to the word vector of each word in the sentence and the number of times each word occurs in the dataset.
7. The method according to claim 5, characterized in that computing the correlation between the question sentence vector and the answer sentence vector comprises:
determining the correlation according to the question sentence vector, the answer sentence vector, and the word co-occurrence count between the question sentence and the answer sentence.
8. The method according to any one of claims 1-7, characterized in that the question-answer pairs comprise voice question-answer pairs and text question-answer pairs;
the method further comprising:
converting the voice question-answer pairs into text question-answer pairs.
9. The method according to claim 8, characterized in further comprising:
adding a first prefix to each question word, and adding a second prefix to each answer word.
10. The method according to claim 9, characterized in further comprising:
presenting one or more pieces of answer information matching the question information.
11. The method according to claim 10, characterized in that each word vector is a low-dimensional real-valued vector whose dimension is not greater than 1000.
12. An information acquisition device, characterized in that the device comprises:
an extraction module, configured to acquire a plurality of question-answer pairs in a dataset, and to extract at least one question word and at least one answer word of each question-answer pair;
a determination module, configured to determine the contexts of the question words and the answer words;
a training module, configured to train a preset model using the question words, answer words and contexts as training samples, to obtain a word vector set;
a receiving module, configured to receive question information to be answered;
an acquisition module, configured to acquire, based on the word vector set, answer information matching the question information from the dataset.
13. The device according to claim 12, characterized in that the determination module is configured to determine the contexts of the question words and the answer words according to the following steps:
determining the context of each question word;
taking the context of any question word as the context of all answer words of the question-answer pair in which that question word appears.
14. The device according to claim 12, characterized in that the determination module is configured to determine the contexts of the question words and the answer words according to the following steps:
determining the context of each answer word;
taking the context of any answer word as the context of all question words of the question-answer pair in which that answer word appears.
15. The device according to claim 12, characterized in that the training module is configured to train the preset model and obtain the word vector set according to the following step:
determining the word vector set with maximization of the preset model as the training objective.
16. The device according to claim 12, characterized in that the acquisition module comprises:
a construction submodule, configured to construct, according to the word vector set, a question sentence vector of the question information and an answer sentence vector of each piece of answer information in the dataset;
a calculation submodule, configured to compute the correlation between the question sentence vector and each answer sentence vector;
a determination submodule, configured to determine, based on the correlation, the answer information matching the question information.
17. The device according to any one of claims 12-16, characterized in that the question-answer pairs comprise voice question-answer pairs and text question-answer pairs;
the device further comprising:
a conversion module, configured to convert the voice question-answer pairs into text question-answer pairs.
CN201510441024.5A 2015-07-24 2015-07-24 Information acquisition method and device Pending CN105095444A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510441024.5A CN105095444A (en) 2015-07-24 2015-07-24 Information acquisition method and device


Publications (1)

Publication Number Publication Date
CN105095444A true CN105095444A (en) 2015-11-25

Family

ID=54575881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510441024.5A Pending CN105095444A (en) 2015-07-24 2015-07-24 Information acquisition method and device

Country Status (1)

Country Link
CN (1) CN105095444A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101086843A (en) * 2006-06-07 2007-12-12 中国科学院自动化研究所 A sentence similarity recognition method for voice answer system
CN101377777A (en) * 2007-09-03 2009-03-04 北京百问百答网络技术有限公司 Automatic inquiring and answering method and system
CN104090890A (en) * 2013-12-12 2014-10-08 深圳市腾讯计算机系统有限公司 Method, device and server for obtaining similarity of key words
CN104699763A (en) * 2015-02-11 2015-06-10 中国科学院新疆理化技术研究所 Text similarity measuring system based on multi-feature fusion
CN104778158A (en) * 2015-03-04 2015-07-15 新浪网技术(中国)有限公司 Method and device for representing text


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
祖永亮: "基于多特征融合的中文自动问答系统研究与设计", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106844368B (en) * 2015-12-03 2020-06-16 华为技术有限公司 Method for man-machine conversation, neural network system and user equipment
US11640515B2 (en) 2015-12-03 2023-05-02 Huawei Technologies Co., Ltd. Method and neural network system for human-computer interaction, and user equipment
WO2017092380A1 (en) * 2015-12-03 2017-06-08 华为技术有限公司 Method for human-computer dialogue, neural network system and user equipment
CN106844368A (en) * 2015-12-03 2017-06-13 华为技术有限公司 For interactive method, nerve network system and user equipment
CN108292305A (en) * 2015-12-04 2018-07-17 三菱电机株式会社 Method for handling sentence
CN107632987A (en) * 2016-07-19 2018-01-26 腾讯科技(深圳)有限公司 One kind dialogue generation method and device
US10740564B2 (en) 2016-07-19 2020-08-11 Tencent Technology (Shenzhen) Company Limited Dialog generation method, apparatus, and device, and storage medium
CN107632987B (en) * 2016-07-19 2018-12-07 腾讯科技(深圳)有限公司 A kind of dialogue generation method and device
CN106469212A (en) * 2016-09-05 2017-03-01 北京百度网讯科技有限公司 Man-machine interaction method based on artificial intelligence and device
US11645547B2 (en) 2016-09-05 2023-05-09 Beijing Baidu Netcom Science And Technology Co., Ltd. Human-machine interactive method and device based on artificial intelligence
CN106484664A (en) * 2016-10-21 2017-03-08 竹间智能科技(上海)有限公司 Similarity calculating method between a kind of short text
CN106484664B (en) * 2016-10-21 2019-03-01 竹间智能科技(上海)有限公司 Similarity calculating method between a kind of short text
CN106572001B (en) * 2016-10-31 2019-10-11 厦门快商通科技股份有限公司 A kind of dialogue method and system of intelligent customer service
CN106572001A (en) * 2016-10-31 2017-04-19 厦门快商通科技股份有限公司 Conversation method and system for intelligent customer service
CN107220296A (en) * 2017-04-28 2017-09-29 北京拓尔思信息技术股份有限公司 The generation method of question and answer knowledge base, the training method of neutral net and equipment
CN107220296B (en) * 2017-04-28 2020-01-17 北京拓尔思信息技术股份有限公司 Method for generating question-answer knowledge base, method and equipment for training neural network
CN107657575A (en) * 2017-09-30 2018-02-02 四川智美高科科技有限公司 A kind of government affairs Intelligent Service terminal device and application method based on artificial intelligence
CN107957989B9 (en) * 2017-10-23 2021-01-12 创新先进技术有限公司 Cluster-based word vector processing method, device and equipment
US10769383B2 (en) 2017-10-23 2020-09-08 Alibaba Group Holding Limited Cluster-based word vector processing method, device, and apparatus
CN107957989A (en) * 2017-10-23 2018-04-24 阿里巴巴集团控股有限公司 Term vector processing method, device and equipment based on cluster
CN107957989B (en) * 2017-10-23 2020-11-17 创新先进技术有限公司 Cluster-based word vector processing method, device and equipment
US10846483B2 (en) 2017-11-14 2020-11-24 Advanced New Technologies Co., Ltd. Method, device, and apparatus for word vector processing based on clusters
CN108170663A (en) * 2017-11-14 2018-06-15 阿里巴巴集团控股有限公司 Term vector processing method, device and equipment based on cluster
CN108376144B (en) * 2018-01-12 2021-10-12 上海大学 Man-machine multi-round conversation method for automatic scene switching based on deep neural network
CN108376144A (en) * 2018-01-12 2018-08-07 上海大学 Man-machine more wheel dialogue methods that scene based on deep neural network automatically switches
CN108536807B (en) * 2018-04-04 2022-03-25 联想(北京)有限公司 Information processing method and device
CN108536807A (en) * 2018-04-04 2018-09-14 联想(北京)有限公司 A kind of information processing method and device
CN108897771A (en) * 2018-05-30 2018-11-27 东软集团股份有限公司 Automatic question-answering method, device, computer readable storage medium and electronic equipment
CN108804611A (en) * 2018-05-30 2018-11-13 浙江大学 A kind of dialogue reply generation method and system based on self comment Sequence Learning
US10984793B2 (en) 2018-06-27 2021-04-20 Baidu Online Network Technology (Beijing) Co., Ltd. Voice interaction method and device
CN108920604A (en) * 2018-06-27 2018-11-30 百度在线网络技术(北京)有限公司 Voice interactive method and equipment
CN110059152A (en) * 2018-12-25 2019-07-26 阿里巴巴集团控股有限公司 Training method, device and equipment for a text information prediction model
CN109858528B (en) * 2019-01-10 2024-05-14 平安科技(深圳)有限公司 Recommendation system training method and device, computer equipment and storage medium
CN109858528A (en) * 2019-01-10 2019-06-07 平安科技(深圳)有限公司 Recommender system training method, device, computer equipment and storage medium
CN111444701A (en) * 2019-01-16 2020-07-24 阿里巴巴集团控股有限公司 Method and device for prompting inquiry
CN109815341B (en) * 2019-01-22 2023-10-10 安徽省泰岳祥升软件有限公司 Text extraction model training method, text extraction method and device
CN109815341A (en) * 2019-01-22 2019-05-28 安徽省泰岳祥升软件有限公司 Text extraction model training method, text extraction method and text extraction device
CN110008322A (en) * 2019-03-25 2019-07-12 阿里巴巴集团控股有限公司 Method and device for recommending dialogues in multi-turn conversation scenarios
CN110008322B (en) * 2019-03-25 2023-04-07 创新先进技术有限公司 Method and device for recommending dialogues in multi-turn conversation scene
CN109977428B (en) * 2019-03-29 2024-04-02 北京金山数字娱乐科技有限公司 Answer obtaining method and device
CN109977428A (en) * 2019-03-29 2019-07-05 北京金山数字娱乐科技有限公司 Answer obtaining method and device
CN110175333A (en) * 2019-06-04 2019-08-27 科大讯飞股份有限公司 Evidence guiding method, device, equipment and storage medium
CN110175333B (en) * 2019-06-04 2023-09-26 科大讯飞股份有限公司 Evidence guiding method, device, equipment and storage medium
CN112199476A (en) * 2019-06-23 2021-01-08 国际商业机器公司 Automated decision making to select a leg after partially correct answers in a conversational intelligent tutoring system
CN111061851B (en) * 2019-12-12 2023-08-08 中国科学院自动化研究所 Question generation method and system based on given facts
CN111061851A (en) * 2019-12-12 2020-04-24 中国科学院自动化研究所 Given fact-based question generation method and system
US11954441B2 (en) 2021-11-16 2024-04-09 Acer Incorporated Device and method for generating article markup information

Similar Documents

Publication Publication Date Title
CN105095444A (en) Information acquisition method and device
CN107491547A (en) Searching method and device based on artificial intelligence
CN108363790A (en) Method, apparatus, device and storage medium for evaluation
CN107220386A (en) Information-pushing method and device
CN110674271B (en) Question and answer processing method and device
CN104598611B (en) Method and system for ranking search entries
CN112667794A (en) Intelligent question-answer matching method and system based on twin network BERT model
CN112000791A (en) Motor fault knowledge extraction system and method
US10824816B2 (en) Semantic parsing method and apparatus
US20110258054A1 (en) Automatic Generation of Bid Phrases for Online Advertising
CN104615767A (en) Searching-ranking model training method and device and search processing method
CN104657496A (en) Method and equipment for calculating information hot value
CN104471568A (en) Learning-based processing of natural language questions
CN113535963B (en) Long text event extraction method and device, computer equipment and storage medium
CN104715063A (en) Search ranking method and search ranking device
CN110765254A (en) Multi-document question-answering system model integrating multi-view answer reordering
CN111695338A (en) Interview content refining method, device, equipment and medium based on artificial intelligence
CN113051365A (en) Industrial chain map construction method and related equipment
CN111639247A (en) Method, apparatus, device and computer-readable storage medium for evaluating quality of review
CN111984792A (en) Website classification method and device, computer equipment and storage medium
CN113707299A (en) Auxiliary diagnosis method and device based on inquiry session and computer equipment
CN111831810A (en) Intelligent question and answer method, device, equipment and storage medium
CN114266443A (en) Data evaluation method and device, electronic equipment and storage medium
CN110310012B (en) Data analysis method, device, equipment and computer readable storage medium
CN112632377A (en) Recommendation method based on user comment emotion analysis and matrix decomposition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20151125)