CN108846094A

CN108846094A - A retrieval interaction method based on word segmentation

Info

Publication number: CN108846094A
Application number: CN201810617412.8A
Authority: CN
Inventors: 何中; 汤海泉; 严伟; 戴建峰; 顾永新; 王斌; 何登; 巢振军
Original assignee: JIANGSU ZHONGWEI TECHNOLOGY SOFTWARE SYSTEM Co Ltd
Current assignee: JIANGSU ZHONGWEI TECHNOLOGY SOFTWARE SYSTEM Co Ltd
Priority date: 2018-06-15
Filing date: 2018-06-15
Publication date: 2018-11-20

Abstract

The invention discloses a retrieval interaction method based on word segmentation, comprising the following steps: A. text is selected, copied, and pasted into the system, which automatically performs intelligent word segmentation and displays the resulting phrases as blocks; B. the segment blocks can be selected: clicking a segment brings it into the text box at the top, and clicking a selected segment again cancels the selection; C. retrieval interaction is performed. The invention can intelligently segment a passage of text: once the text is copied and pasted in, it is segmented automatically and the resulting phrases are displayed as blocks. The user can freely drag phrases to combine them, and either a single phrase or a combined phrase can serve as a search keyword. The keyword only needs to be dragged onto a business system for the search to be attempted automatically, which is convenient and fast.

Description

A retrieval interaction method based on word segmentation

Technical field

The present invention relates to the field of retrieval technology, and specifically to a retrieval interaction method based on word segmentation.

Background art

Retrieval is a term that refers to starting from a user's specific information need and, using certain methods and technical means, finding the relevant information within a specific collection of information according to certain clues and rules. In the Internet age, we carry out retrieval all the time. There are two main ways to retrieve information on the Internet: directory browsing and search engines. Directory browsing is the approach adopted by Yahoo: the user clicks through the directory as needed, descending into the next level of subdirectories until the required information is found. This approach is convenient for finding a whole category of information, but its ability to pinpoint a specific item is weak. The search engine is currently the most common web search tool. The user only needs to submit a query, and the search engine returns a large number of results, ranked by their relevance to the query.

At present, most retrieval interaction relies on manually entered text. With search engines such as Google and Baidu, for example, we type the query on a keyboard or use a similar input method, which involves many manual input steps. If a search has to span several systems, the same query must be entered repeatedly in each of them, which is cumbersome.

Summary of the invention

The purpose of the present invention is to provide a retrieval interaction method based on word segmentation, so as to solve the problems raised in the background art above.

To achieve this purpose, the present invention provides the following technical solution: a retrieval interaction method based on word segmentation, comprising the following steps:

A. Text is selected, copied, and pasted into the system; the system automatically performs intelligent word segmentation and displays the resulting phrases as blocks;

B. The segment blocks can be selected: clicking a segment brings it into the text box at the top, and clicking a selected segment again cancels the selection;

C. Retrieval interaction is performed; after a drag-triggered search, the search results are displayed directly.
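The interaction in steps A to C can be modeled as a small state object. The following Python code is a minimal sketch under stated assumptions, not the patented implementation: the class name `SegmentBoard`, the `segmenter` callable, and the `backend` callable are all hypothetical, and joining selected phrases with a space is an illustrative choice.

```python
from typing import Callable, List


class SegmentBoard:
    """Hypothetical model of the block UI: paste text, toggle blocks, search."""

    def __init__(self, segmenter: Callable[[str], List[str]]):
        self.segmenter = segmenter       # stand-in for the intelligent segmenter
        self.blocks: List[str] = []      # phrases displayed as blocks (step A)
        self.selected: List[int] = []    # indices of chosen blocks, in click order

    def paste(self, text: str) -> List[str]:
        # Step A: pasted text is segmented automatically and shown as blocks.
        self.blocks = self.segmenter(text)
        self.selected = []
        return self.blocks

    def click(self, index: int) -> None:
        # Step B: the first click selects a block, a second click deselects it.
        if not 0 <= index < len(self.blocks):
            raise IndexError(f"no block at index {index}")
        if index in self.selected:
            self.selected.remove(index)
        else:
            self.selected.append(index)

    def text_box(self) -> str:
        # Content of the top text box: the selected phrases in click order.
        # Joining with a space is an assumption made for this sketch.
        return " ".join(self.blocks[i] for i in self.selected)

    def search(self, backend: Callable[[str], List[str]]) -> List[str]:
        # Step C: hand the combined keyword to a search backend, return results.
        return backend(self.text_box())
```

In this sketch a whitespace splitter such as `str.split` can stand in for the segmenter, and any function that takes a query string and returns a list of results can stand in for the business system being searched.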

Preferably, the intelligent word segmentation method in step A is as follows:

a. Obtain the feature information of the text to be segmented, where the feature information includes at least one of paragraph breaks, punctuation marks, or space characters;

b. According to the feature information, determine all natural segments in the text to be segmented;

c. Divide the natural segments into ambiguous segments and non-ambiguous segments;

d. Determine the candidate words in the ambiguous segments, and match the candidate words against the text of the non-ambiguous segments;

e. Determine the segmentation rule for the candidate words according to the matching result, and segment the text of the ambiguous segments according to that rule.
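Steps a to e can be illustrated with a small dictionary-based sketch in Python. The text does not say how ambiguity is detected or how the segmentation rule is derived, so everything below is an assumption made for illustration: a segment is treated as ambiguous when forward and backward maximum matching disagree, the candidate words are the words proposed by either scan, and the rule keeps the scan whose words occur more often in the non-ambiguous segments. The function names, the delimiter set, and the `vocab` parameter are hypothetical.

```python
import re
from collections import Counter
from typing import List, Set

# Assumed feature information: whitespace (paragraph breaks, spaces) and a
# small set of ASCII and Chinese punctuation marks.
_DELIMITERS = re.compile(r"[\s,.;:!?，。；：！？、]+")


def natural_segments(text: str) -> List[str]:
    """Steps a-b: split the text into natural segments at the delimiters."""
    return [part for part in _DELIMITERS.split(text) if part]


def max_match(segment: str, vocab: Set[str], backward: bool = False,
              max_len: int = 4) -> List[str]:
    """Greedy longest-match scan from the left (or from the right)."""
    words: List[str] = []
    rest = segment
    while rest:
        for n in range(min(max_len, len(rest)), 0, -1):
            piece = rest[-n:] if backward else rest[:n]
            if n == 1 or piece in vocab:   # a single character always matches
                words.append(piece)
                rest = rest[:-n] if backward else rest[n:]
                break
    return words[::-1] if backward else words


def segment_text(text: str, vocab: Set[str]) -> List[str]:
    scans = [(max_match(s, vocab), max_match(s, vocab, backward=True))
             for s in natural_segments(text)]
    # Step c (assumption): a segment is ambiguous when the two scans disagree.
    # Words from the non-ambiguous segments act as supporting context.
    support = Counter(w for fwd, bwd in scans if fwd == bwd for w in fwd)
    result: List[str] = []
    for fwd, bwd in scans:
        if fwd == bwd:
            result.extend(fwd)
            continue
        # Step d: match the candidate words against the non-ambiguous text.
        score_fwd = sum(support[w] for w in fwd)
        score_bwd = sum(support[w] for w in bwd)
        # Step e (assumption): keep the better-supported scan, forward on ties.
        result.extend(bwd if score_bwd > score_fwd else fwd)
    return result
```

With a vocabulary containing 研究, 研究生, 生命, 起源, and 宝贵, the input "生命很宝贵，研究生命起源" yields two natural segments. The second is ambiguous (研究生/命/起源 versus 研究/生命/起源), and the occurrence of 生命 in the first, non-ambiguous segment tips the choice toward 研究/生命/起源.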

Preferably, the retrieval interaction in step C includes: searching by dragging a single segment; searching with multiple segments combined in the text box; and searching with a multi-select combination.

Preferably, the text matching method in step d is as follows:

1) Split the text under test into individual characters, obtaining the split character string;

2) Match each character of the split character string against the keyword characters in the inverted index library; the inverted index library is formed by decomposing the input keywords character by character and recording the position of each keyword character within its keyword;

3) According to the configured fuzziness determination rule, determine the fuzziness value used when the keyword characters of each successfully matched keyword were matched, obtaining the matching fuzziness of each keyword;

4) From the matching fuzziness of each keyword, determine the average fuzziness of the input keywords, and determine from the average fuzziness whether the text under test satisfies the filter condition.
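The four matching steps can be sketched in Python as follows. The text does not define how fuzziness is measured, so this sketch makes an explicit assumption: the fuzziness of a keyword match is the largest number of unrelated characters found between two consecutive keyword characters in the text under test, and a keyword counts as matched only if that gap does not exceed `max_gap`. The function names, the `max_gap` and `max_avg` parameters, and the choice to average over matched keywords only are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def build_inverted_index(keywords: Sequence[str]) -> Dict[str, List[Tuple[str, int]]]:
    """Map each keyword character to (keyword, position in keyword) entries."""
    index: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for kw in keywords:
        for pos, ch in enumerate(kw):
            index[ch].append((kw, pos))
    return index


def keyword_fuzziness(text: str, keywords: Sequence[str],
                      max_gap: int = 1) -> Dict[str, int]:
    """Return the matching fuzziness of every keyword found in the text."""
    index = build_inverted_index(keywords)
    # hits[keyword][position in keyword] -> offsets in the text under test
    hits: Dict[str, Dict[int, List[int]]] = defaultdict(lambda: defaultdict(list))
    for offset, ch in enumerate(text):        # step 1: one character at a time
        for kw, pos in index.get(ch, ()):     # step 2: inverted index lookup
            hits[kw][pos].append(offset)
    result: Dict[str, int] = {}
    for kw in keywords:                       # step 3: fuzziness per keyword
        # cost[offset] = smallest worst-case gap of a partial match ending there
        cost = {offset: 0 for offset in hits[kw][0]}
        for pos in range(1, len(kw)):
            nxt: Dict[int, int] = {}
            for offset in hits[kw][pos]:
                gaps = [max(c, offset - prev - 1)
                        for prev, c in cost.items() if prev < offset]
                if gaps and min(gaps) <= max_gap:
                    nxt[offset] = min(gaps)
            cost = nxt
        if cost:
            result[kw] = min(cost.values())
    return result


def passes_filter(text: str, keywords: Sequence[str],
                  max_gap: int = 1, max_avg: float = 0.5) -> bool:
    """Step 4: compare the average fuzziness with an assumed threshold."""
    fuzz = keyword_fuzziness(text, keywords, max_gap)
    if not fuzz:
        return False
    return sum(fuzz.values()) / len(fuzz) <= max_avg
```

Under these assumptions, the text "分-词检索" matches the keyword 检索 exactly (fuzziness 0) and the keyword 分词 with one intervening character (fuzziness 1), giving an average fuzziness of 0.5.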

Preferably, the word segmentation processing method in step e is as follows:

a) Obtain the first feature vector corresponding to each single character in the sentence to be segmented and the second feature vector corresponding to each two-character unit;

b) From the first feature vector and the second feature vector, determine the current third feature vector of each single character;

c) According to a preset character-label transition matrix and the current third feature vector of each single character, segment the sentence to be segmented.
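One common way to combine per-character scores with a label transition matrix is Viterbi decoding over B/M/E/S labels (begin, middle, end, single). The Python sketch below follows that pattern as an illustration; the text does not state that this is the decoding used. Here each "feature vector" is simplified to four label scores, the third vector is the element-wise sum of the first and second, and the tables `TOY_UNI` and `TOY_BI` are invented toy values. All names are hypothetical.

```python
from typing import Dict, List

LABELS = "BMES"          # Begin, Middle, End of a word, Single-character word
NEG = -1e9               # score used to forbid a transition

# Assumed preset label transition matrix: TRANS[i][j] scores label i -> label j.
TRANS = [
    # B    M    E    S
    [NEG, 0.0, 0.0, NEG],  # after B comes M or E
    [NEG, 0.0, 0.0, NEG],  # after M comes M or E
    [0.0, NEG, NEG, 0.0],  # after E comes B or S
    [0.0, NEG, NEG, 0.0],  # after S comes B or S
]

# Toy score tables (invented for illustration only).
TOY_UNI: Dict[str, List[float]] = {"命": [0.0, 0.0, 1.0, 0.0]}
TOY_BI: Dict[str, List[float]] = {
    "研究": [2.0, 0.0, 0.0, 0.0],
    "究生": [0.0, 0.0, 2.0, 0.0],
    "生命": [2.0, 0.0, 0.0, 0.0],
}


def third_vectors(sentence: str, uni: Dict[str, List[float]],
                  bi: Dict[str, List[float]]) -> List[List[float]]:
    """Steps a-b: combine the unigram and bigram vectors of each character."""
    zero = [0.0] * len(LABELS)
    vectors = []
    for i, ch in enumerate(sentence):
        first = uni.get(ch, zero)                    # single character
        second = bi.get(sentence[i:i + 2], zero)     # character plus successor
        vectors.append([a + b for a, b in zip(first, second)])
    return vectors


def viterbi(emit: List[List[float]]) -> List[str]:
    """Step c: find the best label path under the transition matrix."""
    k = len(LABELS)
    # A sentence must start with B or S.
    score = [emit[0][j] + (0.0 if LABELS[j] in "BS" else NEG) for j in range(k)]
    back: List[List[int]] = []
    for t in range(1, len(emit)):
        prev, score, ptr = score, [], []
        for j in range(k):
            best = max(range(k), key=lambda i: prev[i] + TRANS[i][j])
            score.append(prev[best] + TRANS[best][j] + emit[t][j])
            ptr.append(best)
        back.append(ptr)
    # A sentence must end with E or S.
    last = max((j for j in range(k) if LABELS[j] in "ES"), key=lambda j: score[j])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return [LABELS[j] for j in reversed(path)]


def segment(sentence: str, uni: Dict[str, List[float]],
            bi: Dict[str, List[float]]) -> List[str]:
    if not sentence:
        return []
    labels = viterbi(third_vectors(sentence, uni, bi))
    words, current = [], ""
    for ch, label in zip(sentence, labels):
        current += ch
        if label in "ES":          # a word closes on E or S
            words.append(current)
            current = ""
    return words
```

With the toy tables above, `segment("研究生命", TOY_UNI, TOY_BI)` decodes the label path B, E, B, E and therefore splits the sentence into 研究 and 生命.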

Compared with the prior art, the beneficial effects of the present invention are as follows. The invention can intelligently segment a passage of text: once text is copied and pasted in, it is segmented automatically and the resulting phrases are displayed as blocks. The user can freely drag phrases to combine them, and either a single phrase or a combined phrase can serve as a search keyword. The keyword only needs to be dragged onto a business system for the search to be attempted automatically, which is convenient and fast. In addition, the intelligent word segmentation method used by the invention effectively improves the relevance between the segmentation result and the context of the text to be segmented, so that segmentation accuracy improves.

Brief description of the drawings

Fig. 1 is a schematic diagram of the result of intelligent word segmentation according to the present invention;

Fig. 2 is a schematic diagram of arranging and combining phrases by dragging according to the present invention;

Fig. 3 is a schematic diagram of searching by dragging a single segment according to the present invention;

Fig. 4 is a schematic diagram of searching with multiple segments combined according to the present invention;

Fig. 5 is a schematic diagram of searching with a multi-select combination according to the present invention;

Fig. 6 is a schematic diagram of the display after a search according to the present invention.

Specific embodiment

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the scope of protection of the invention.

Referring to Figs. 1-6, the present invention provides a technical solution: a retrieval interaction method based on word segmentation, comprising the following steps:

In the present invention, the intelligent word segmentation method in step A is as follows:

The intelligent word segmentation method used by the invention effectively improves the relevance between the segmentation result and the context of the text to be segmented, so that segmentation accuracy improves.

In addition, in the present invention, the retrieval interaction in step C includes: searching by dragging a single segment; searching with multiple segments combined in the text box; and searching with a multi-select combination.

In the present invention, the text matching method in step d is as follows:

A keyword library is built and an inverted index library is formed from it, establishing an inverted index of the keywords. The texts under test are then filter-matched one by one: fuzzy matching is performed based on the configured fuzziness policy, and filtering is applied once the matching result is obtained.

In addition, in the present invention, the word segmentation processing method in step e is as follows:

This word segmentation processing method segments the sentence to be segmented. The process is simple and easy to implement; it simplifies the network structure, reduces the storage-size and memory requirements placed on a mobile terminal, and improves the user experience.

In summary, the invention can intelligently segment a passage of text: once text is copied and pasted in, it is segmented automatically and the resulting phrases are displayed as blocks. The user can freely drag phrases to combine them, and either a single phrase or a combined phrase can serve as a search keyword. The keyword only needs to be dragged onto a business system for the search to be attempted automatically, which is convenient and fast. In addition, the intelligent word segmentation method used by the invention effectively improves the relevance between the segmentation result and the context of the text to be segmented, so that segmentation accuracy improves.

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principles and spirit of the invention; the scope of the invention is defined by the appended claims.

Claims

1. A retrieval interaction method based on word segmentation, characterized in that it comprises the following steps:

A. Text is selected, copied, and pasted into the system; the system automatically performs intelligent word segmentation and displays the resulting phrases as blocks;

B. The segment blocks can be selected: clicking a segment brings it into the text box at the top, and clicking a selected segment again cancels the selection;

2. The retrieval interaction method based on word segmentation according to claim 1, characterized in that the intelligent word segmentation method in step A is as follows:

a. Obtain the feature information of the text to be segmented, where the feature information includes at least one of paragraph breaks, punctuation marks, or space characters;

e. Determine the segmentation rule for the candidate words according to the matching result, and segment the text of the ambiguous segments according to that rule.

3. The retrieval interaction method based on word segmentation according to claim 1, characterized in that the retrieval interaction in step C includes: searching by dragging a single segment; searching with multiple segments combined in the text box; and searching with a multi-select combination.

4. The retrieval interaction method based on word segmentation according to claim 2, characterized in that the text matching method in step d is as follows:

2) Match each character of the split character string against the keyword characters in the inverted index library; the inverted index library is formed by decomposing the input keywords character by character and recording the position of each keyword character within its keyword;

3) According to the configured fuzziness determination rule, determine the fuzziness value used when the keyword characters of each successfully matched keyword were matched, obtaining the matching fuzziness of each keyword;

4) From the matching fuzziness of each keyword, determine the average fuzziness of the input keywords, and determine from the average fuzziness whether the text under test satisfies the filter condition.

5. The retrieval interaction method based on word segmentation according to claim 2, characterized in that the word segmentation processing method in step e is as follows:

c) According to a preset character-label transition matrix and the current third feature vector of each single character, segment the sentence to be segmented.