CN106095758B - A literary works guessing method based on a word vector model - Google Patents
- Publication number
- CN106095758B (application CN201610439566.3A)
- Authority
- CN
- China
- Prior art keywords
- literary works
- guess
- word
- corpus
- works
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The present invention relates to a literary works guessing method based on a word vector model and belongs to the technical field of information processing. The method comprises two stages: construction of a literary works knowledge base, and literary works knowledge guessing. In the construction stage, a small-scale corpus of a specific literary work is collected and the feature words of the work are mined from it; a word vector neural network is trained on the small-scale corpus to obtain a word vector association model; and, based on this model, the related words with a high association degree to each feature word are calculated, thereby building the literary works guessing knowledge base. In the guessing stage, the system randomly selects a feature word as the guessing target, extracts the related words of that feature word from the knowledge base and displays them to the contestant one after another, and the contestant infers the answer. The invention uses word vector model analysis to discover the associations among the feature words of a specific literary work, and tests the reader's familiarity with the work in the form of a guessing game, which strengthens the interaction between the work and the reader and makes reading more engaging.
Description
Technical field
The present invention relates to a literary works guessing method based on a word vector model, and in particular to a literary works guessing method that uses a word vector model to automatically mine the deep, complex information relationships in the text of a literary work. It belongs to the field of information processing technology.
Background art
A specific literary work here means a literary work or series of works with a particular story background and plot. Such works are often long, and the relationships among their characters and things are complex. On the one hand, reading such a work takes a great deal of time and energy; with today's fast pace of life, people find it hard to set aside enough time to read a whole work, so a quick, engaging, and interactive way of absorbing literary knowledge is needed. On the other hand, after reading a specific literary work a reader has some understanding of it, but that understanding may be deep or shallow. Readers can only assess their familiarity with the work qualitatively, not quantitatively, so a method is needed that can quantitatively test a reader's familiarity with the knowledge related to a specific literary work.
Knowledge guessing is a way of gauging how familiar a contestant is with open-domain knowledge or with the knowledge of a restricted domain. The contestant infers the answer from a passage of text or a few words given as hints. The fewer the hints, or the lower their association with the answer, the harder the question and the more knowledge the contestant needs.
Applying knowledge guessing to literary works makes it possible not only to test the reader's familiarity with a work through a variety of answering modes, but also to let the reader quickly grasp the topology of relationships among entities such as the main characters and things in the work, raising the reader's interest in reading.
At present, guessing knowledge bases are built mainly by hand, which requires the cooperation of a large amount of domain expert knowledge. For literary works, the works on a particular subject can be regarded as a domain. To build a guessing knowledge base for that domain, an expert must understand the work deeply and be very clear about the relationships among its main characters and things before a high-quality knowledge base can be built. Manual construction has the following disadvantages. First, the process is slow: every question and answer must be constructed by a domain expert, and since a guessing knowledge base generally contains many questions, manual construction is laborious. Second, it depends heavily on domain expert knowledge: if the expert is not familiar enough with the literature of the domain, a high-quality guessing knowledge base cannot be built. Third, it ports poorly across works on different themes: a construction approach that suits works on one theme performs poorly on works on another.
To address these problems of manual construction, the present invention uses natural language processing tools and methods to build a guessing knowledge base for a specific literary work automatically, scientifically, quickly, and efficiently, and the method is highly portable. Once the knowledge base is built, contestants can use it to answer questions in a variety of guessing modes, and thereby quickly absorb knowledge related to the work or quantitatively assess their own familiarity with it.
Summary of the invention
The purpose of the present invention is to solve two problems: how to build a guessing knowledge base for a specific literary work automatically, scientifically, quickly, and efficiently, so that a reader's familiarity with the work can be evaluated quantitatively; and how to let a reader who has not read the original text quickly learn the knowledge related to the work. To this end, a literary works guessing method based on a word vector model is proposed. For a given literary work, the method mines the deep, complex information relationships in its text, builds a knowledge base from them, and extracts related words from the knowledge base to display to the contestant for guessing.
The idea of the invention is as follows: for a specific literary work, the deep, complex information relationships among the words of the text are mined automatically from a small-scale corpus related to the work; a knowledge base is built according to certain association rules; and the results are presented visually to the contestant for answering. In this way the contestant's familiarity with the work can be tested quickly and scientifically, the work's points of interest can be brought out, and interactivity is increased.
The purpose of the present invention is achieved through the following technical solution:
A literary works guessing method based on a word vector model is divided into two stages: literary works knowledge base construction and literary works knowledge guessing. The knowledge base construction stage includes the following steps:
Step 1: Collect text corpora related to the specific literary work, including but not limited to the original work, encyclopedia entries related to the work, and related research literature, and build a small-scale corpus of the work;
Step 2: Preprocess the natural language text of the small-scale corpus thus built to remove irrelevant text noise;
Step 3: Perform named entity recognition on the denoised small-scale corpus using a natural language processing tool or method, and add the named entities obtained to a feature word list as the feature words peculiar to the work;
Step 4: Add all the feature words in the feature word list to the segmentation dictionary of a word segmentation tool, segment the small-scale corpus using that dictionary to obtain the segmented corpus, and add every word of the segmented corpus to a vocabulary without duplicates;
Step 5: Using a word vector analysis tool with the segmented corpus as input, obtain the word vector model of the small-scale corpus, calculate the group of related words most associated with each feature word, and build the literary works guessing knowledge base;
The literary works knowledge guessing stage includes the following steps:
Step 6: On entering the knowledge guessing stage, the system randomly selects a feature word as the guessing target and extracts from the guessing knowledge base the top N related words with the highest association degree to that feature word;
Step 7: The N related words retained in step 6 are divided into M groups of no fewer than 2 related words each, and a difficulty level is set for each group according to its association degree;
Step 8: The system randomly draws one related word from each of the M groups and displays them to the contestant one after another in ascending order of association degree;
Step 9: The contestant infers the answer from the displayed related words. If the system judges the answer correct, it records the display time and moves on to the next question; if the answer is still wrong, or no answer has been given, when the related words disappear, the question is recorded as failed and the system moves on to the next question;
Step 10: After the contestant has answered a certain number of questions, the guessing ends. The system gives an overall evaluation and a score based on the time taken and the accuracy during answering, which reflects the contestant's familiarity with the literary work.
In step 3, when named entity recognition is performed, named entities that are synonymous are aligned.
In step 5, a word vector analysis tool takes the segmented text as the input corpus and produces the word vector model of the small-scale corpus of the literary work; when the group of related words most associated with each feature word is calculated, the cosine similarity between two word vectors is used as the association degree of the two words.
In step 9, the contestant infers the answer from the displayed related words, and the guessing may take the form of either single-player answering or multi-player answering in which contestants race to answer first.
Beneficial effects
Compared with the prior art, the invention has the following features:
1) The literary works guessing method provided by the invention automatically mines the deep, complex information relationships in the text from a small-scale corpus of a specific literary work, so readers can quickly grasp the topology of relationships among entities such as the main characters and things in the work. Readers can gain a fairly deep understanding of the work without reading it in full.
2) The method can build the guessing knowledge base of a specific literary work automatically, quickly, and scientifically, effectively avoiding the shortcomings of manual construction: inefficiency, heavy dependence on domain expert knowledge, and poor portability.
3) After the guessing knowledge base is built with the method, contestants can use it to answer questions in a variety of guessing modes. The system gives an overall evaluation and a score based on the time taken and the accuracy during answering, which quantitatively reflects the contestant's familiarity with the work.
Description of the drawings
Fig. 1 is a flow diagram of a literary works guessing method based on a word vector model according to an embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in more detail below with reference to a specific embodiment and the accompanying drawing.
The literary works guessing method based on a word vector model of the invention is divided into two stages: literary works knowledge base construction and literary works knowledge guessing. Its principle is as follows:
In the knowledge base construction stage, a small-scale corpus of a specific literary work is first collected. Next, the feature words peculiar to the work, such as person names, place names, times, and events, are extracted from it through text preprocessing. A word vector neural network is then trained on the small-scale corpus to obtain a word vector association model. Finally, based on this model, the related words with a high association degree to each feature word are calculated, thereby building the literary works guessing knowledge base.
In the knowledge guessing stage, the system first randomly selects a feature word as the guessing target, then extracts the related words of that feature word from the guessing knowledge base and displays them to the contestant one after another. The contestant reasons from the related words received until the feature word is answered correctly.
Fig. 1 is the flow diagram of the literary works guessing method based on a word vector model provided by the invention. To better illustrate the process of the method, it is described in detail below taking the literary work Water Margin as an example. As shown in Fig. 1, the method includes the following steps:
Step 101: Collect text corpora related to the specific literary work, including but not limited to the original work, encyclopedia entries related to the work, and related research literature, and build a small-scale corpus of the work.
A specific literary work means a literary work or series of works with a particular story background and plot, such as Water Margin, The Romance of the Three Kingdoms, the Star Wars series, or the Harry Potter series. When collecting text related to a specific literary work, high-quality text should be chosen. A high-quality corpus is one whose content is highly relevant to the original work and introduces only a small amount of noise. The higher the quality of the collected text, the better the word vector model built in step 105.
In this embodiment, building the small-scale Water Margin corpus begins with collecting text related to Water Margin. The original work has 120 chapters in total. On this basis, 427 entries, such as the main characters and events of Water Margin, are used as query terms; a web crawler automatically fetches the Baidu Baike pages of the corresponding entries from the Baidu Baike website, and the text of each entry is extracted. This yields a small-scale Water Margin corpus in plain-text form covering the Water Margin characters, historical background, and literary influence, with a total size of 6.87M.
Step 102: Preprocess the natural language text of the small-scale corpus built for the specific literary work to remove irrelevant text noise, such as meaningless symbols and English characters, serial numbers in entries, and advertising in web pages. This step further improves the quality of the collected corpus. After denoising, the small-scale Water Margin corpus shrinks to 6.59M.
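The denoising in step 102 could be sketched roughly as below. This is only an illustration under stated assumptions: the patent names categories of noise but gives no concrete rules, so the regular expressions, the `AD_KEYWORDS` markers, and the function names are all hypothetical.

```python
import re

# Assumed noise patterns; the text only names the categories
# (meaningless symbols, English characters, entry serial numbers, ads).
SERIAL_NUMBER = re.compile(r"\[\d+\]")    # bracketed serial numbers such as [12]
LATIN_RUN = re.compile(r"[A-Za-z]+")      # stray English characters
SYMBOL_RUN = re.compile(r"[#*_=~^|<>]+")  # decorative symbols with no meaning
AD_KEYWORDS = ("广告", "点击下载")          # hypothetical advertising markers


def clean_line(line: str) -> str:
    """Strip the assumed noise patterns from one line and tidy whitespace."""
    line = SERIAL_NUMBER.sub("", line)
    line = LATIN_RUN.sub("", line)
    line = SYMBOL_RUN.sub("", line)
    return re.sub(r"\s+", " ", line).strip()


def denoise(lines):
    """Drop advertising lines, clean the rest, and discard empty results."""
    cleaned = []
    for line in lines:
        if any(keyword in line for keyword in AD_KEYWORDS):
            continue
        text = clean_line(line)
        if text:
            cleaned.append(text)
    return cleaned
```

Real corpora would need rules tuned to the pages actually crawled; the point here is only that denoising is a line-level filter applied before entity recognition.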
Step 103: With the small-scale Water Margin corpus as input, use a natural language processing tool or method to perform named entity recognition on the corpus, identifying named entities in the text including but not limited to person names and places, and add the named entities obtained to the feature word list as the feature words peculiar to the work.
In this embodiment, named entity recognition is performed with the HanLP tool. In HanLP, named entity recognition runs as a follow-up step to word segmentation: a sentence is first segmented, and each resulting word is then checked to see whether it is a named entity. With the small-scale Water Margin corpus as input and HanLP's new word discovery function turned on, the output is the segmented text with part-of-speech tags, where each word is separated from its tag by "/", for example "Song Jiang/nr" and "apartment for the newly-weds/ns". Here "nr" marks the word "Song Jiang" as a person name, and "ns" marks the word "apartment for the newly-weds" as a place name. HanLP can automatically mine named entities such as "Song Jiang", "Song Gongming", "Lu Zhishen", and "apartment for the newly-weds". In general, the automatically mined named entities contain a small number of errors, so an expert needs to filter the recognition results. In addition, because different named entities may refer to the same thing, synonymous named entities are aligned during recognition. For example, in this embodiment "Song Jiang" and "Song Gongming" are synonymous named entities and must be aligned, that is, "Song Gongming" is replaced with "Song Jiang". The feature words formed by these named entities are characters or objects peculiar to Water Margin, with clear reference and representativeness in the work. They are added to the feature word list and serve as the guessing targets in the knowledge guessing stage.
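As a rough illustration of step 103, the sketch below post-processes tagged output of the "word/tag" form described above and aligns synonymous entities. It does not call HanLP itself. It assumes the tagged tokens are separated by whitespace, and the `ALIASES` table and function names are hypothetical stand-ins for the expert-supplied alignment.

```python
# Part-of-speech tags named in the text: "nr" = person name, "ns" = place name.
ENTITY_TAGS = {"nr", "ns"}

# Hypothetical alias table mapping a synonymous name to its canonical form,
# following the example "Song Gongming" (宋公明) -> "Song Jiang" (宋江).
ALIASES = {"宋公明": "宋江"}


def parse_tagged(tagged: str):
    """Split 'word/tag word/tag ...' into (word, tag) pairs."""
    pairs = []
    for token in tagged.split():
        word, sep, tag = token.rpartition("/")
        if sep:  # skip tokens that carry no tag
            pairs.append((word, tag))
    return pairs


def extract_feature_words(tagged: str, aliases=ALIASES):
    """Collect named entities, aligned to canonical names, without duplicates."""
    features = []
    seen = set()
    for word, tag in parse_tagged(tagged):
        if tag not in ENTITY_TAGS:
            continue
        canonical = aliases.get(word, word)
        if canonical not in seen:
            seen.add(canonical)
            features.append(canonical)
    return features
```

The expert filtering mentioned in the text would sit between recognition and this alignment step, for example as a manually reviewed exclusion list.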
Step 104: Using a word segmentation toolkit together with the feature word list generated in step 103, add all the feature words in the list to the segmentation dictionary of the segmentation tool, segment the small-scale corpus of the specific literary work using that dictionary to obtain the segmented corpus, and add every word of the segmented corpus to a vocabulary without duplicates.
In this embodiment, segmentation is performed with the HanLP segmentation toolkit. All the feature words in the feature word list generated in step 103 are added to HanLP's segmentation dictionary, and HanLP's new word discovery function is turned off. The input is the original small-scale corpus and the output is the segmented text corpus. Every word of the segmented corpus is then added to a data table without duplicates to build the Water Margin vocabulary.
The segmentation in step 103 serves to obtain the named entities in Water Margin, that is, the feature words, which need expert filtering. The segmentation dictionary in step 104 has been extended and updated, so its results differ from those of step 103: the feature words mined by named entity recognition make the segmentation of the corpus in step 104 more accurate.
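The "add every word without duplicates" part of step 104 might look like the following minimal sketch. It assumes the segmented corpus is available as lines of whitespace-separated words; the function name is illustrative and the segmentation itself (done by HanLP in the embodiment) is not shown.

```python
def build_vocabulary(segmented_lines):
    """Return every distinct word of the segmented corpus in first-seen order."""
    index_of = {}
    for line in segmented_lines:
        for word in line.split():
            # setdefault keeps the first index and ignores repeats
            index_of.setdefault(word, len(index_of))
    return list(index_of)
```

Keeping first-seen order is a design choice made here for reproducibility; the text only requires that the vocabulary contain no duplicates.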
Step 105: Using a word vector analysis tool with suitably chosen parameters and the segmented corpus as input, obtain the word vector model of the small-scale corpus of the literary work.
A word vector model represents words as vectors. The cosine similarity between word vectors reflects the association degree between words: the larger the association degree, the more closely the two words are related. By further computing the cosine similarity between any word in the vocabulary and every other word, the group of words most associated with that word can be mined. Of course, those skilled in the art will appreciate that, besides cosine similarity, any other method capable of reflecting the association degree between words may be used, such as Euclidean distance or Manhattan distance.
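The three measures named above can be written down directly. The sketch below uses plain Python lists as vectors; the function names are illustrative, and returning 0.0 for a zero vector is an assumption made here to avoid division by zero. Note that cosine similarity grows with association, whereas the two distances shrink as words become more closely related.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; larger means more closely related."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance; smaller means more closely related."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    """Sum of absolute coordinate differences; smaller means more closely related."""
    return sum(abs(a - b) for a, b in zip(u, v))
```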
In this embodiment, Word2vec is chosen as the word vector analysis tool, and a Word2vec neural network is trained on the segmented small-scale Water Margin corpus to obtain a 200-dimensional word vector model. From this model, the word vectors of all words in the Water Margin vocabulary can be obtained. For each feature word of Water Margin, its association degree with every other word in the vocabulary is then calculated and ranked, giving the group of related words most associated with that feature word. For example, for the feature word "Lu Zhishen", the group of words with the highest association degree includes:
Experience shows that the above related words are closely connected with the development of Lu Zhishen's storyline in Water Margin, and fit people's habits and patterns of thought when reading literary works.
Once the group of related words of each feature word has been calculated in turn, construction of the literary works guessing knowledge base is complete.
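Given word vectors, the ranking that builds the knowledge base could be sketched as follows. This assumes the vectors have already been exported from a trained model into a plain dictionary mapping each word to a list of floats; training itself is not shown, and `build_knowledge_base` and its parameters are hypothetical names.

```python
import math


def cosine(u, v):
    """Cosine similarity, used here as the association degree of two words."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_knowledge_base(vectors, feature_words, top_n):
    """Map each feature word to its top_n (related word, association) pairs."""
    knowledge_base = {}
    for feature in feature_words:
        target = vectors[feature]
        scored = [
            (word, cosine(target, vec))
            for word, vec in vectors.items()
            if word != feature  # a word is never its own hint
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        knowledge_base[feature] = scored[:top_n]
    return knowledge_base
```

This brute-force ranking compares each feature word with the whole vocabulary, which is acceptable for a small-scale corpus; word vector libraries typically offer an equivalent nearest-neighbour query.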
Step 106: On entering the literary works knowledge guessing stage, the system randomly selects a feature word as the guessing target. The feature word may be a main character, a main event, a main place, and so on in the work. The top N related words of that feature word, together with their association degrees, are then extracted from the guessing knowledge base.
In this embodiment, if the "One Hundred and Eight Heroes" of Water Margin are chosen as the pool of guessing targets and the feature word randomly drawn by the system is "Lu Zhishen", then the 8 related words of "Lu Zhishen" and their association degrees are extracted from the guessing knowledge base. The value of N should not exceed the number of related words of any feature word in the guessing knowledge base.
In practice, if the guessing target is a character, related words that are also characters do not give a good reference or hint for the target, so a method of filtering related words can be provided to filter out named entities of the same type. For example, when the guessing target is a character, related words that are also characters are filtered out; when the guessing target is a place, related words that are also places are filtered out. In this embodiment, the 8 related words of "Lu Zhishen" extracted from the guessing knowledge base include two person names, "Wu Song" and "Shi Jin". Under this filtering rule, the two related words that are person names are filtered out, and the 6 related words remaining after filtering include:
Step 107: The N related words retained in step 106 are divided into M groups of N/M related words each, and a difficulty level is set for each group according to its association degree.
There are many possible rules for grouping related words; the main basis for grouping is the association degree between the feature word and its related words. In this embodiment, the previous step yields 6 related words for the feature word "Lu Zhishen". Sorted by association degree, they are divided evenly into 3 groups of 2 words each: the two words with the highest association degree form the level-one difficulty group, the two middle words form the level-two difficulty group, and the two words with the lowest association degree form the level-three difficulty group.
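The same-type filtering of step 106 and the even grouping of step 107 can be sketched together as below. The representation is an assumption: related words are (word, association) pairs and entity types come from a hypothetical `entity_type` dictionary. Requiring N to be divisible by M mirrors the N/M group size stated above.

```python
def filter_same_type(related, entity_type, target_type):
    """Drop related words whose entity type equals the guessing target's type."""
    return [
        (word, score)
        for word, score in related
        if entity_type.get(word) != target_type
    ]


def split_into_groups(related, m):
    """Sort by association (highest first) and cut into m equal groups.

    groups[0] is the level-one difficulty group (highest association),
    groups[-1] the hardest group (lowest association).
    """
    if m <= 0 or len(related) % m != 0:
        raise ValueError("number of related words must be divisible by m")
    ranked = sorted(related, key=lambda pair: pair[1], reverse=True)
    size = len(ranked) // m
    return [ranked[i * size:(i + 1) * size] for i in range(m)]
```

With 8 extracted words of which 2 are person names, filtering leaves 6 words, and `split_into_groups(kept, 3)` yields three groups of two, as in the embodiment.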
Step 108: The system randomly draws one related word from each of the M groups and displays them to the contestant one after another in ascending order of association degree.
In this embodiment, the system randomly draws the related words "Flowery Monk", "wineshop", and "Baozhu Temple" from the three groups. In ascending order of association degree, it first displays the level-three difficulty group word "Baozhu Temple" to the contestant, then displays the level-two difficulty group word "wineshop" after 5 seconds and the level-one difficulty group word "Flowery Monk" after 10 seconds; after 20 seconds all the related words disappear.
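Drawing one hint per group and ordering the hints from hardest to easiest might be sketched as follows. The function names are hypothetical, and the reveal offsets are supplied by the caller because the text leaves open whether the stated delays are measured from the start of the question or from the previous hint.

```python
import random


def pick_hints(groups, rng=None):
    """Draw one (word, association) pair per group, lowest association first."""
    rng = rng or random.Random()
    picks = [rng.choice(group) for group in groups]
    return sorted(picks, key=lambda pair: pair[1])


def reveal_plan(hints, offsets):
    """Pair each hint with the caller-supplied time (in seconds) at which it appears."""
    if len(hints) != len(offsets):
        raise ValueError("one offset is needed per hint")
    return list(zip(offsets, hints))
```

Passing a seeded `random.Random` makes a quiz round reproducible, which is convenient when testing the display logic.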
Step 109: The contestant infers the answer from the displayed related words. If the system judges the answer correct, it records the display time and moves on to the next question; if the answer is still wrong, or no answer has been given, when the related words disappear, the question is recorded as failed and the system moves on to the next question.
In this embodiment, the related word "Baozhu Temple" is displayed to the contestant. If the contestant correctly answers the feature word "Lu Zhishen" at the 3rd second, the system records an answering time of 3 seconds and moves on to the next question. If after 20 seconds the contestant has answered wrongly or not at all, the system records the question as failed and moves on to the next question.
Further, the guessing may take the form of either single-player answering or multi-player answering in which contestants race to answer first. In the multi-player mode, the answering time of the first contestant to answer correctly is taken as the time of the successful answer, and the other contestants are deemed to have failed the question.
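The judging rule for both single-player and multi-player modes could be sketched as below. The `Attempt` and `Result` records and the exact-match comparison are assumptions made for illustration; the text does not specify how answers are represented or compared.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attempt:
    player: str
    answer: str
    seconds: float  # time elapsed since the first hint was displayed


@dataclass
class Result:
    success: bool
    winner: Optional[str]
    seconds: Optional[float]


def judge(target, attempts, time_limit):
    """Return the earliest correct attempt within the time limit, else a failure.

    Single-player mode is the special case where every attempt comes from
    the same player.
    """
    valid = [
        a for a in attempts
        if a.answer == target and a.seconds <= time_limit
    ]
    if not valid:
        return Result(False, None, None)
    first = min(valid, key=lambda a: a.seconds)
    return Result(True, first.player, first.seconds)
```

In the multi-player case, every player other than `winner` is treated as having failed the question, matching the rule described above.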
Step 110: After the contestant has answered a certain number of questions, the guessing ends. The system gives an overall evaluation and a score based on the time taken and the accuracy during answering, which reflects the contestant's familiarity with the literary work.
In this embodiment, suppose the contestant answers 10 questions in total, each with three groups of related words and a maximum display time of 35 seconds, so that the maximum time for 10 questions is 350 seconds. If the contestant answers 9 questions correctly and takes 140 seconds in total, the score is 100 × (9/10 + (350 − 140)/350) = 150 (out of a total of 200). The higher the score, the more familiar the contestant is with Water Margin. While answering, the contestant is also learning, coming to understand the associations among feature words such as the main characters and things in the work, and absorbing knowledge related to Water Margin more quickly and interactively. Of course, those skilled in the art will understand that other scoring schemes may be used, provided they satisfy the condition that the less time spent answering and the higher the accuracy, the higher the score. Only then does the evaluation of the contestant's familiarity with the work match what one would naturally expect.
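The scoring rule of this embodiment can be written as a small function. The function and parameter names are illustrative; the formula itself is the one given above, with an accuracy term and a time-saved term each worth up to 100 points.

```python
def quiz_score(correct, total, seconds_used, seconds_allowed):
    """Score = 100 * (accuracy + fraction of allowed time left unused)."""
    if total <= 0 or seconds_allowed <= 0:
        raise ValueError("total and seconds_allowed must be positive")
    accuracy = correct / total
    time_saved = (seconds_allowed - seconds_used) / seconds_allowed
    return 100 * (accuracy + time_saved)
```

For the example in the text, `quiz_score(9, 10, 140, 350)` evaluates 100 × (0.9 + 0.6), giving 150 out of a possible 200. The score rises with accuracy and falls with time used, which is the condition the text places on any alternative scheme.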
The specific description above further explains the purpose, technical solution, and beneficial effects of the invention. It should be understood that the above is only a specific embodiment of the invention and is not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (4)
- 1. A literary works guessing method based on a word vector model, characterized in that the method comprises two stages, literary works knowledge base construction and literary works knowledge guessing, and specifically comprises the following steps:
  Step 101: collecting text corpora related to a specific literary work, including but not limited to the original work, encyclopedia entries related to the work, and related research literature, and building a small-scale corpus of the specific literary work;
  Step 102: preprocessing the natural language text of the small-scale corpus thus built to remove irrelevant text noise;
  Step 103: performing named entity recognition on the denoised small-scale corpus using a natural language processing tool or method, and adding the named entities obtained to a feature word list as the feature words peculiar to the work;
  Step 104: adding all the feature words in the feature word list to the segmentation dictionary of a word segmentation tool, segmenting the small-scale corpus of the specific literary work using the segmentation dictionary to obtain the segmented corpus, and adding every word of the segmented corpus to a vocabulary without duplicates;
  Step 105: using a word vector analysis tool with the segmented corpus as input, obtaining the word vector model of the small-scale corpus of the work, calculating the group of related words most associated with each feature word, and building the literary works guessing knowledge base;
  Step 106: on entering the literary works knowledge guessing stage, the system randomly selecting a feature word as the guessing target and extracting from the guessing knowledge base the top N related words with the highest association degree to that feature word;
  Step 107: dividing the N related words retained in step 106 into M groups of no fewer than 2 related words each, and setting a difficulty level for each group according to its association degree;
  Step 108: the system randomly drawing one related word from each of the M groups and displaying them to the contestant one after another in ascending order of association degree;
  Step 109: the contestant inferring the answer from the displayed related words; if the system judges the answer correct, recording the display time and moving on to the next question; if the answer is still wrong, or no answer has been given, when the related words disappear, recording the question as failed and moving on to the next question;
  Step 110: after the contestant has answered a certain number of questions, ending the guessing, the system giving an overall evaluation and a score based on the time taken and the accuracy during answering, thereby reflecting the contestant's familiarity with the literary work.
- 2. The literary work guessing method based on a word vector model according to claim 1, characterized in that: in Step 103, when named entity recognition is performed, named entities that are synonymous with one another are aligned.
- 3. The literary work guessing method based on a word vector model according to claim 1, characterized in that: in Step 105, a word vector analysis tool takes the segmented corpus as input to obtain the word vector model of the small-scale corpus, and when the group of related words most closely associated with each feature word is computed, the cosine similarity between two word vectors is used as the degree of association between the two words.
- 4. The literary work guessing method based on a word vector model according to any one of claims 1 to 3, characterized in that: in Step 109, the guesser reasons from the displayed related words and answers, and the guessing mode may be either single-player answering or a multi-player race to answer first.
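
The association and hint-selection logic of Steps 105 to 108, together with the cosine measure of claim 3, can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the even contiguous split into M groups, the handling of zero vectors, and the toy two-dimensional vectors are all assumptions made for illustration. In practice the vectors would come from a word vector model trained on the segmented corpus.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors, used as the degree of association."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is treated as unrelated
    return dot / (norm_u * norm_v)


def top_related(feature_word, vectors, n):
    """Return the n words most associated with feature_word as (word, score), highest first."""
    target = vectors[feature_word]
    scored = [
        (word, cosine_similarity(target, vec))
        for word, vec in vectors.items()
        if word != feature_word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def split_into_groups(related, m):
    """Split a highest-first list into m contiguous groups of at least 2 words each.

    Assumption: groups are as equal in size as possible; group 0 holds the most
    associated words, so it would correspond to the easiest hints.
    """
    if m < 1 or len(related) < 2 * m:
        raise ValueError("need at least 2 related words per group")
    base, extra = divmod(len(related), m)
    groups, start = [], 0
    for i in range(m):
        size = base + (1 if i < extra else 0)
        groups.append(related[start:start + size])
        start += size
    return groups


def pick_hints(groups, rng=random):
    """Pick one word per group, ordered from lowest to highest degree of association."""
    picks = [rng.choice(group) for group in groups]
    picks.sort(key=lambda pair: pair[1])
    return [word for word, _ in picks]


# Hypothetical toy vectors; real ones would come from a trained word vector model.
vectors = {
    "hero": [1.0, 0.0],
    "sword": [0.9, 0.1],
    "horse": [0.8, 0.3],
    "castle": [0.6, 0.6],
    "river": [0.3, 0.8],
    "tea": [0.0, 1.0],
}

related = top_related("hero", vectors, n=4)   # Steps 105 and 106
groups = split_into_groups(related, m=2)      # Step 107
hints = pick_hints(groups)                    # Step 108
print(hints)
```

Because the groups are contiguous slices of a list sorted by similarity, sorting the picked words by score always shows the hint from the hardest group first and the hint from the easiest group last, which matches the low-to-high display order of Step 108.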
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610439566.3A CN106095758B (en) | 2016-06-17 | 2016-06-17 | A kind of literary works guess method of word-based vector model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610439566.3A CN106095758B (en) | 2016-06-17 | 2016-06-17 | A kind of literary works guess method of word-based vector model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106095758A (en) | 2016-11-09 |
CN106095758B (en) | 2018-12-04 |
Family
ID=57236694
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610439566.3A Expired - Fee Related CN106095758B (en) | 2016-06-17 | 2016-06-17 | A kind of literary works guess method of word-based vector model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106095758B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106776562B (en) * | 2016-12-20 | 2020-07-28 | 上海智臻智能网络科技股份有限公司 | Keyword extraction method and extraction system |
CN108694443B (en) * | 2017-04-05 | 2021-09-17 | 富士通株式会社 | Neural network-based language model training method and device |
CN109285098A (en) * | 2018-12-12 | 2019-01-29 | 广东小天才科技有限公司 | A kind of study householder method and study auxiliary client, e-learning equipment |
CN112953816B (en) * | 2021-03-19 | 2022-12-30 | 上海掌门科技有限公司 | Method, device, medium and program product for issuing guesses in friend space |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103605702A (en) * | 2013-11-08 | 2014-02-26 | 北京邮电大学 | Word similarity based network text classification method |
US8812297B2 (en) * | 2010-04-09 | 2014-08-19 | International Business Machines Corporation | Method and system for interactively finding synonyms using positive and negative feedback |
CN104881401A (en) * | 2015-05-27 | 2015-09-02 | 大连理工大学 | Patent literature clustering method |
Non-Patent Citations (2)
Title |
---|
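《An approach to sentiment analysis of short Chinese texts based on SVMs》; Lu Xing et al.; 《Control Conference (CCC), 2015 34th Chinese》; 20150914; 9115-9120 * |
《基于微博的知识词条推荐算法研究》 (Research on a Microblog-Based Knowledge Entry Recommendation Algorithm); 汤斌 (Tang Bin); 《中国优秀硕士学位论文全文数据库信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology); 20160315; I138-7611 * |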
Also Published As
Publication number | Publication date |
---|---|
CN106095758A (en) | 2016-11-09 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106328147B (en) | Speech recognition method and device | |
CN106095758B (en) | A kind of literary works guess method of word-based vector model | |
CN109815491B (en) | Answer scoring method, device, computer equipment and storage medium | |
CN107122413A (en) | A kind of keyword extracting method and device based on graph model | |
CN107729468A (en) | Answer extracting method and system based on deep learning | |
US20100306248A1 (en) | Document processing method and system | |
CN104408093A (en) | News event element extracting method and device | |
CN109543110A (en) | A kind of microblog emotional analysis method and system | |
CN103854063B (en) | A kind of prediction of event occurrence risk method for early warning based on internet opening imformation | |
CN111209384A (en) | Question and answer data processing method and device based on artificial intelligence and electronic equipment | |
CN108153732B (en) | Examination method and device for interrogation notes | |
CN103425635A (en) | Method and device for recommending answers | |
CN108121702A (en) | Mathematics subjective item reads and appraises method and system | |
CN105760439A (en) | Figure cooccurrence relation graph establishing method based on specific behavior cooccurrence network | |
CN110717324A (en) | Judgment document answer information extraction method, device, extractor, medium and equipment | |
CN110472203B (en) | Article duplicate checking and detecting method, device, equipment and storage medium | |
CN106547733A (en) | A kind of name entity recognition method towards particular text | |
CN105183808A (en) | Problem classification method and apparatus | |
CN105260385A (en) | Picture retrieval method | |
Zhou et al. | Neural storyline extraction model for storyline generation from news articles | |
CN107679075A (en) | Method for monitoring network and equipment | |
CN115221864A (en) | Multi-mode false news detection method and system | |
CN113886524A (en) | Network security threat event extraction method based on short text | |
Lee et al. | SQuARe: A large-scale dataset of sensitive questions and acceptable responses created through human-machine collaboration | |
CN106355455A (en) | Method for extracting product feature information from online shopping user comments |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20181204; Termination date: 20190617 |