CN108846094A - A retrieval interaction method based on word segmentation - Google Patents
- Publication number: CN108846094A
- Application number: CN201810617412.8A
- Authority: CN (China)
- Prior art keywords: text, character, word segmentation, keyword, interaction
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a retrieval interaction method based on word segmentation, comprising the following steps: A. text is selected, copied, and pasted into the system, which automatically performs intelligent word segmentation and displays the resulting phrases as blocks; B. the blocks are selectable: clicking a block brings that segment into the text box at the top, and clicking a selected segment again deselects it; C. retrieval interaction is performed. The invention can intelligently segment a passage of text: once text data is copied and pasted, it is segmented automatically and the resulting phrases are displayed as blocks. The user can freely drag phrases to combine them, and either a single phrase or a combination of phrases can be used as the search keyword. The keyword only needs to be dragged onto a business system for retrieval to be attempted automatically, which is convenient and fast.
Description
Technical field
The present invention relates to the field of retrieval technology, and in particular to a retrieval interaction method based on word segmentation.
Background
Retrieval means starting from a user's specific information need and using certain methods and technical means to find relevant information in a specific collection of information according to certain clues and rules. In the Internet era we perform retrieval all the time. There are two main ways to retrieve information on the Internet: directory browsing and search engines. Directory browsing is the approach used by Yahoo: users click through the directory according to their needs, descending into the next level of subdirectories until they find the information they want. This approach is convenient for finding a whole category of information, but it is weak at pinpointing a specific item. Search engines are currently the most commonly used web search tools. The user only needs to submit a request, and the search engine returns a large number of results, ranked by their relevance to the query.
At present, retrieval interaction mostly relies on manually entered text. With search engines such as Google and Baidu, for example, retrieval is done by typing on a keyboard or a similar input device, which involves many manual input steps. If retrieval across systems is needed, the same input has to be repeated in each system, which is rather tedious.
Summary of the invention
The purpose of the present invention is to provide a retrieval interaction method based on word segmentation, so as to solve the problems raised in the background above.
To achieve this purpose, the present invention provides the following technical solution: a retrieval interaction method based on word segmentation, comprising the following steps:
A. Text is selected, copied, and pasted into the system; the system automatically performs intelligent word segmentation and displays the resulting phrases as blocks;
B. The blocks are selectable: clicking a block brings that segment into the text box at the top, and clicking a selected segment again deselects it;
C. Retrieval interaction is performed: after a drag-to-retrieve operation, the retrieval results are displayed directly.
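The click behaviour of step B can be sketched as a small selection model. This is an illustrative sketch only: the class name `SegmentBoard` and its methods are hypothetical and do not come from the patent, and keeping the selected segments in click order is an assumption.

```python
class SegmentBoard:
    """Holds the segmented phrases (blocks) and the user's current selection."""

    def __init__(self, segments):
        self.segments = list(segments)
        self.selected = []  # indices of selected blocks, in click order (assumed)

    def click(self, index):
        """First click selects a block; clicking a selected block deselects it."""
        if index in self.selected:
            self.selected.remove(index)
        else:
            self.selected.append(index)

    def text_box(self, sep=" "):
        """Contents of the top text box: the selected segments, joined by sep."""
        return sep.join(self.segments[i] for i in self.selected)


board = SegmentBoard(["search", "engine", "results"])
board.click(0)
board.click(2)
first = board.text_box()
board.click(0)  # second click on the same block cancels its selection
second = board.text_box()
```

Here `first` holds the two clicked segments and `second` holds only the one that is still selected.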
Preferably, the intelligent word segmentation in step A proceeds as follows:
a. Obtain characteristic information of the text to be segmented, the characteristic information including at least one of paragraph breaks, punctuation marks, or space characters;
b. Determine all natural segments in the text to be segmented according to the characteristic information;
c. Divide the natural segments into ambiguous segments and unambiguous segments;
d. Determine the candidate words in each ambiguous segment, and match the candidate words against the text of the unambiguous segments;
e. Determine the segmentation rule for the candidate words according to the matching result, and segment the text of the ambiguous segment according to that rule.
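The first three sub-steps can be illustrated with a minimal sketch. The patent does not say how a segment is judged ambiguous, so the criterion used here is a swapped-in, commonly used heuristic: a segment is treated as ambiguous when forward and backward maximum matching against a lexicon disagree. The delimiter set, the lexicon, and all function names are assumptions made for illustration.

```python
import re

# Assumed "characteristic information": whitespace (including paragraph
# breaks) and a small set of Western and Chinese punctuation marks.
DELIMS = re.compile(r"[\s,.;:!?，。；：！？、]+")


def natural_segments(text):
    """Steps a-b: cut the text at delimiters and keep the non-empty pieces."""
    return [s for s in DELIMS.split(text) if s]


def forward_max_match(seg, lexicon, max_len=4):
    """Greedy longest-match segmentation scanning left to right."""
    out, i = [], 0
    while i < len(seg):
        for n in range(min(max_len, len(seg) - i), 0, -1):
            if n == 1 or seg[i:i + n] in lexicon:
                out.append(seg[i:i + n])
                i += n
                break
    return out


def backward_max_match(seg, lexicon, max_len=4):
    """Greedy longest-match segmentation scanning right to left."""
    out, j = [], len(seg)
    while j > 0:
        for n in range(min(max_len, j), 0, -1):
            if n == 1 or seg[j - n:j] in lexicon:
                out.insert(0, seg[j - n:j])
                j -= n
                break
    return out


def split_by_ambiguity(segments, lexicon):
    """Step c (assumed criterion): ambiguous if the two scans disagree."""
    ambiguous, unambiguous = [], []
    for seg in segments:
        same = forward_max_match(seg, lexicon) == backward_max_match(seg, lexicon)
        (unambiguous if same else ambiguous).append(seg)
    return ambiguous, unambiguous


lexicon = {"研究", "研究生", "生命", "起源", "搜索", "引擎"}  # hypothetical
segs = natural_segments("研究生命起源，搜索引擎。")
ambiguous, unambiguous = split_by_ambiguity(segs, lexicon)
```

In this example the first segment is flagged as ambiguous because the left-to-right scan picks 研究生 while the right-to-left scan picks 生命. Steps d and e, which resolve the ambiguity using the unambiguous text, are not covered by this sketch.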
Preferably, the retrieval interaction in step C includes: drag-to-retrieve with a single segment; retrieval with multiple segments combined in the text box; and retrieval with a multi-selected combination.
Preferably, the text matching in step d proceeds as follows:
1) Split the text under test into individual characters to obtain the split character string;
2) Match each character of the split string against the keyword characters in an inverted index library; the inverted index library is built by decomposing each input keyword character by character and recording the position of each keyword character within its keyword;
3) According to the configured fuzziness determination rule, determine the fuzziness value used when the keyword characters of each successfully matched keyword were matched, and obtain the matching fuzziness of each keyword;
4) Determine the average fuzziness of the input keywords from the matching fuzziness of each keyword, and use the average fuzziness to decide whether the text under test satisfies the filter condition.
Preferably, the segmentation processing in step e proceeds as follows:
A) Obtain the first feature vector corresponding to each single character in the sentence to be segmented and the second feature vector corresponding to each two-character sequence;
B) Determine the current third feature vector of each single character from the first and second feature vectors;
C) Segment the sentence according to a preset character-label transition matrix and the current third feature vector of each single character.
Compared with the prior art, the beneficial effects of the present invention are as follows. The invention can intelligently segment a passage of text: once text data is copied and pasted, it is segmented automatically and the resulting phrases are displayed as blocks. The user can freely drag phrases to combine them, and either a single phrase or a combination of phrases can be used as the search keyword; the keyword only needs to be dragged onto a business system for retrieval to be attempted automatically, which is convenient and fast. In addition, the intelligent word segmentation method used by the invention effectively strengthens the association between the segmentation result and the context of the text being segmented, which improves segmentation accuracy.
Brief description of the drawings
Fig. 1 is a schematic diagram of the result of intelligent word segmentation according to the present invention;
Fig. 2 is a schematic diagram of arranging and combining phrases by dragging according to the present invention;
Fig. 3 is a schematic diagram of drag-to-retrieve with a single segment according to the present invention;
Fig. 4 is a schematic diagram of retrieval with multiple segments combined according to the present invention;
Fig. 5 is a schematic diagram of retrieval with a multi-selected combination according to the present invention;
Fig. 6 is a schematic diagram of the display after retrieval according to the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Referring to Figs. 1-6, the present invention provides the following technical solution: a retrieval interaction method based on word segmentation, comprising the following steps:
A. Text is selected, copied, and pasted into the system; the system automatically performs intelligent word segmentation and displays the resulting phrases as blocks;
B. The blocks are selectable: clicking a block brings that segment into the text box at the top, and clicking a selected segment again deselects it;
C. Retrieval interaction is performed: after a drag-to-retrieve operation, the retrieval results are displayed directly.
In the present invention, the intelligent word segmentation in step A proceeds as follows:
a. Obtain characteristic information of the text to be segmented, the characteristic information including at least one of paragraph breaks, punctuation marks, or space characters;
b. Determine all natural segments in the text to be segmented according to the characteristic information;
c. Divide the natural segments into ambiguous segments and unambiguous segments;
d. Determine the candidate words in each ambiguous segment, and match the candidate words against the text of the unambiguous segments;
e. Determine the segmentation rule for the candidate words according to the matching result, and segment the text of the ambiguous segment according to that rule.
The intelligent word segmentation method used by the present invention effectively strengthens the association between the segmentation result and the context of the text being segmented, which improves segmentation accuracy.
In addition, in the present invention, the retrieval interaction in step C includes: drag-to-retrieve with a single segment; retrieval with multiple segments combined in the text box; and retrieval with a multi-selected combination.
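As a rough illustration of how one keyword could serve several systems without being retyped, the sketch below joins the chosen segments into a keyword and fills it into one search URL per target system. Everything specific here is assumed: the function names, the space separator, and the `example.com` URL templates are hypothetical, and the patent does not describe how the business systems receive the keyword.

```python
from urllib.parse import quote


def build_keyword(segments, sep=" "):
    """Combine one or more chosen segments into a single search keyword."""
    return sep.join(s for s in segments if s)


def search_urls(keyword, templates):
    """Fill the URL-encoded keyword into each system's (hypothetical) template."""
    return {name: tpl.format(q=quote(keyword)) for name, tpl in templates.items()}


# Hypothetical target systems; real endpoints would differ.
templates = {
    "system_a": "https://a.example.com/search?q={q}",
    "system_b": "https://b.example.com/find?keyword={q}",
}

single = build_keyword(["segmentation"])            # single-segment retrieval
combined = build_keyword(["word", "segmentation"])  # combined retrieval
urls = search_urls(combined, templates)
```

The same keyword is reused for every target system, which is the repetition that the background section describes as tedious when done by hand.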
In the present invention, the text matching in step d proceeds as follows:
1) Split the text under test into individual characters to obtain the split character string;
2) Match each character of the split string against the keyword characters in an inverted index library; the inverted index library is built by decomposing each input keyword character by character and recording the position of each keyword character within its keyword;
3) According to the configured fuzziness determination rule, determine the fuzziness value used when the keyword characters of each successfully matched keyword were matched, and obtain the matching fuzziness of each keyword;
4) Determine the average fuzziness of the input keywords from the matching fuzziness of each keyword, and use the average fuzziness to decide whether the text under test satisfies the filter condition.
By building a keyword library and deriving an inverted index library from it, an inverted index of keywords is established. Each text under test is then filtered and matched one by one: fuzzy matching is performed according to the configured fuzziness strategy, and filtering is applied once the matching result is obtained.
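The four matching steps can be sketched as follows. The patent does not define the fuzziness rule, so this sketch assumes one: a keyword's fuzziness is the number of extra characters inside the shortest span of text that contains the keyword's characters in order. Averaging over the matched keywords and the threshold parameter are likewise assumptions, and all function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(keywords):
    """Map each keyword character to the (keyword, position) pairs it occurs at."""
    index = defaultdict(list)
    for kw in keywords:
        for pos, ch in enumerate(kw):
            index[ch].append((kw, pos))
    return index


def match_fuzziness(text, keywords, index):
    """Return {keyword: fuzziness} for keywords found in order in the text."""
    # hits[kw][pos] lists the text offsets where kw's character at pos occurs.
    hits = {kw: defaultdict(list) for kw in keywords}
    for offset, ch in enumerate(text):        # step 1: one character at a time
        for kw, pos in index.get(ch, ()):     # step 2: inverted index lookup
            hits[kw][pos].append(offset)
    result = {}
    for kw in keywords:                       # step 3: fuzziness per keyword
        best = None
        for start in hits[kw][0]:
            cur, complete = start, True
            for pos in range(1, len(kw)):
                nxt = next((o for o in hits[kw][pos] if o > cur), None)
                if nxt is None:
                    complete = False
                    break
                cur = nxt
            if complete:
                extra = (cur - start + 1) - len(kw)  # extra chars in the span
                best = extra if best is None else min(best, extra)
        if best is not None:
            result[kw] = best
    return result


def meets_filter_condition(text, keywords, max_avg_fuzziness=1.0):
    """Step 4 (assumed rule): average fuzziness of matched keywords <= threshold."""
    fuzz = match_fuzziness(text, keywords, build_inverted_index(keywords))
    if not fuzz:
        return False
    return sum(fuzz.values()) / len(fuzz) <= max_avg_fuzziness


keywords = ["search", "engine", "xyz"]
fuzz = match_fuzziness("se-arch engine", keywords, build_inverted_index(keywords))
```

In the sample text the hyphen inserted into "search" gives that keyword a fuzziness of 1, while "engine" matches exactly and "xyz" does not match at all.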
In addition, in the present invention, the segmentation processing in step e proceeds as follows:
A) Obtain the first feature vector corresponding to each single character in the sentence to be segmented and the second feature vector corresponding to each two-character sequence;
B) Determine the current third feature vector of each single character from the first and second feature vectors;
C) Segment the sentence according to a preset character-label transition matrix and the current third feature vector of each single character.
This segmentation processing method segments the sentence with a simple process that is easy to implement; it simplifies the network structure, reduces the size and memory requirements placed on the mobile terminal, and improves the user experience.
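Step C can be illustrated with a standard Viterbi decode over character labels. This is a sketch under stated assumptions rather than the patent's implementation: the label set B/M/E/S (begin, middle, end, single), the 0 or minus-infinity transition scores, and the hand-written per-character score vectors standing in for the "third feature vectors" are all assumed, and steps A and B (computing those vectors) are not shown.

```python
LABELS = "BMES"  # Begin / Middle / End of a word, Single-character word
NEG = float("-inf")

# Assumed transition matrix: 0.0 for label pairs that can follow each other,
# -inf for impossible pairs (for example B cannot be followed by B or S).
ALLOWED = {("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"),
           ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S")}
TRANS = [[0.0 if (a, b) in ALLOWED else NEG for b in LABELS] for a in LABELS]


def viterbi(scores, trans=TRANS):
    """Best label sequence given per-character label scores and transitions."""
    if not scores:
        return []
    # A sentence can only start with B or S.
    best = [scores[0][k] if LABELS[k] in "BS" else NEG for k in range(4)]
    back = []
    for t in range(1, len(scores)):
        new, ptr = [], []
        for k in range(4):
            j = max(range(4), key=lambda p: best[p] + trans[p][k])
            new.append(best[j] + trans[j][k] + scores[t][k])
            ptr.append(j)
        best = new
        back.append(ptr)
    # A sentence can only end with E or S.
    last = max((k for k in range(4) if LABELS[k] in "ES"), key=lambda k: best[k])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return [LABELS[k] for k in reversed(path)]


def cut(sentence, labels):
    """Close a word after every character labelled E or S."""
    words, cur = [], ""
    for ch, lab in zip(sentence, labels):
        cur += ch
        if lab in "ES":
            words.append(cur)
            cur = ""
    if cur:
        words.append(cur)
    return words


sentence = "搜索引擎好"
# Hypothetical scores in B, M, E, S order, one vector per character.
scores = [[2, 0, 0, 1], [0, 0, 2, 0], [2, 0, 0, 1], [0, 0, 2, 0], [0, 0, 0, 2]]
labels = viterbi(scores)
words = cut(sentence, labels)
```

With these scores the decoder labels the sentence B, E, B, E, S, so it is cut into two two-character words followed by a single-character word.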
In summary, the present invention can intelligently segment a passage of text: once text data is copied and pasted, it is segmented automatically and the resulting phrases are displayed as blocks. The user can freely drag phrases to combine them, and either a single phrase or a combination of phrases can be used as the search keyword; the keyword only needs to be dragged onto a business system for retrieval to be attempted automatically, which is convenient and fast. In addition, the intelligent word segmentation method used by the invention effectively strengthens the association between the segmentation result and the context of the text being segmented, which improves segmentation accuracy.
Although embodiments of the present invention have been shown and described, a person of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the present invention is defined by the appended claims.
Claims (5)
1. A retrieval interaction method based on word segmentation, characterized by comprising the following steps:
A. Text is selected, copied, and pasted into the system; the system automatically performs intelligent word segmentation and displays the resulting phrases as blocks;
B. The blocks are selectable: clicking a block brings that segment into the text box at the top, and clicking a selected segment again deselects it;
C. Retrieval interaction is performed: after a drag-to-retrieve operation, the retrieval results are displayed directly.
2. The retrieval interaction method based on word segmentation according to claim 1, characterized in that the intelligent word segmentation in step A proceeds as follows:
a. Obtain characteristic information of the text to be segmented, the characteristic information including at least one of paragraph breaks, punctuation marks, or space characters;
b. Determine all natural segments in the text to be segmented according to the characteristic information;
c. Divide the natural segments into ambiguous segments and unambiguous segments;
d. Determine the candidate words in each ambiguous segment, and match the candidate words against the text of the unambiguous segments;
e. Determine the segmentation rule for the candidate words according to the matching result, and segment the text of the ambiguous segment according to that rule.
3. The retrieval interaction method based on word segmentation according to claim 1, characterized in that the retrieval interaction in step C includes: drag-to-retrieve with a single segment; retrieval with multiple segments combined in the text box; and retrieval with a multi-selected combination.
4. The retrieval interaction method based on word segmentation according to claim 2, characterized in that the text matching in step d proceeds as follows:
1) Split the text under test into individual characters to obtain the split character string;
2) Match each character of the split string against the keyword characters in an inverted index library; the inverted index library is built by decomposing each input keyword character by character and recording the position of each keyword character within its keyword;
3) According to the configured fuzziness determination rule, determine the fuzziness value used when the keyword characters of each successfully matched keyword were matched, and obtain the matching fuzziness of each keyword;
4) Determine the average fuzziness of the input keywords from the matching fuzziness of each keyword, and use the average fuzziness to decide whether the text under test satisfies the filter condition.
5. The retrieval interaction method based on word segmentation according to claim 2, characterized in that the segmentation processing in step e proceeds as follows:
A) Obtain the first feature vector corresponding to each single character in the sentence to be segmented and the second feature vector corresponding to each two-character sequence;
B) Determine the current third feature vector of each single character from the first and second feature vectors;
C) Segment the sentence according to a preset character-label transition matrix and the current third feature vector of each single character.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810617412.8A CN108846094A (en) | 2018-06-15 | 2018-06-15 | A method of based on index in classification interaction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810617412.8A CN108846094A (en) | 2018-06-15 | 2018-06-15 | A method of based on index in classification interaction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108846094A true CN108846094A (en) | 2018-11-20 |
Family
ID=64202987
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810617412.8A Pending CN108846094A (en) | 2018-06-15 | 2018-06-15 | A method of based on index in classification interaction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108846094A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102541960A (en) * | 2010-12-31 | 2012-07-04 | 北大方正集团有限公司 | Method and device of fuzzy retrieval |
CN104750673A (en) * | 2013-12-31 | 2015-07-01 | 中国移动通信集团公司 | Text matching and filtering method and text matching and filtering device |
CN105989030A (en) * | 2015-02-02 | 2016-10-05 | 阿里巴巴集团控股有限公司 | Text retrieval method and device |
CN105447187A (en) * | 2015-12-15 | 2016-03-30 | 广州神马移动信息科技有限公司 | Webpage search method and system |
CN107918604A (en) * | 2017-11-13 | 2018-04-17 | 彩讯科技股份有限公司 | A kind of Chinese segmenting method and device |
CN107832302A (en) * | 2017-11-22 | 2018-03-23 | 北京百度网讯科技有限公司 | Participle processing method, device, mobile terminal and computer-readable recording medium |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109800352A (en) * | 2018-12-30 | 2019-05-24 | 上海触乐信息科技有限公司 | Method, system and the terminal device of information push are carried out based on clipbook |
CN109800352B (en) * | 2018-12-30 | 2022-08-12 | 上海触乐信息科技有限公司 | Method, system and terminal device for pushing information based on clipboard |
CN111310481A (en) * | 2020-01-19 | 2020-06-19 | 百度在线网络技术(北京)有限公司 | Speech translation method, device, computer equipment and storage medium |
CN111310481B (en) * | 2020-01-19 | 2021-05-18 | 百度在线网络技术(北京)有限公司 | Speech translation method, device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107609121B (en) | News text classification method based on LDA and word2vec algorithm | |
WO2021121198A1 (en) | Semantic similarity-based entity relation extraction method and apparatus, device and medium | |
CN106649260B (en) | Product characteristic structure tree construction method based on comment text mining | |
CN113268995B (en) | Chinese academy keyword extraction method, device and storage medium | |
CN1728142B (en) | Phrase identification method and device in an information retrieval system | |
CN111324771B (en) | Video tag determination method and device, electronic equipment and storage medium | |
CN114065758B (en) | Document keyword extraction method based on hypergraph random walk | |
CN112395395B (en) | Text keyword extraction method, device, equipment and storage medium | |
JP2005526317A (en) | Method and system for automatically searching a concept hierarchy from a document corpus | |
US8583669B2 (en) | Query suggestion for efficient legal E-discovery | |
CN109086355B (en) | Hot-spot association relation analysis method and system based on news subject term | |
CN115796181A (en) | Text relation extraction method for chemical field | |
CN107844493B (en) | File association method and system | |
CN113434636A (en) | Semantic-based approximate text search method and device, computer equipment and medium | |
CN111160007B (en) | Search method and device based on BERT language model, computer equipment and storage medium | |
CN111625621A (en) | Document retrieval method and device, electronic equipment and storage medium | |
CN110888970A (en) | Text generation method, device, terminal and storage medium | |
CN109614493B (en) | Text abbreviation recognition method and system based on supervision word vector | |
CN111325018A (en) | Domain dictionary construction method based on web retrieval and new word discovery | |
CN111008530A (en) | Complex semantic recognition method based on document word segmentation | |
CN111429184A (en) | User portrait extraction method based on text information | |
CN111090994A (en) | Chinese-internet-forum-text-oriented event place attribution province identification method | |
CN112434134A (en) | Search model training method and device, terminal equipment and storage medium | |
CN106570196B (en) | Video program searching method and device | |
CN108595413B (en) | Answer extraction method based on semantic dependency tree |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181120 |