CN106933998A - Method for solving inaccurate Apache Solr phrase search
- Publication number
- CN106933998A (application number CN201710117467.8A)
- Authority
- CN
- China
- Prior art keywords
- phrase
- search
- word segmentation
- solution
- apachesolr
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3325—Reformulation based on results of preceding query
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for solving inaccurate Apache Solr phrase search. The method comprises the following steps: data reception, in which a QParserPlugin receives the search statement parameter sent by the client over HTTP; phrase lookup, in which a regular expression is used in the QParserPlugin to match the phrases in the search statement parameter, yielding a phrase set; tokenization and replacement, in which each phrase in the phrase set obtained in the previous step is tokenized in index mode and the tokenized phrase replaces the corresponding phrase in the original search statement; data conversion, in which the Apache Solr syntax parser converts the replaced search statement into a Query; and data processing and output, in which the Apache Solr search procedure runs and the data is output on completion. The invention extends the Apache Solr syntax parser as a plug-in and overrides its parsing rules, which solves the problem of inaccurate phrase search. It provides a pluggable syntax-parser extension and tokenizes each phrase in index mode before the search is run.
Description
Technical field
The present invention relates to the technical field of web search, and specifically to a method for solving inaccurate Apache Solr phrase search.
Background
Apache Solr has a search syntax called "phrase search", i.e. PhraseQuery. A phrase search is written by wrapping the keywords in quotation marks. It matches documents in which the tokens produced from the quoted keywords lie within the distance given by the slop parameter. However, tokenizing a document at index time can produce more tokens than tokenizing the Query at search time, so the index mode and the search mode do not match, which makes "phrase search" inaccurate.
The invention provides a method in which, before the Apache Solr search operation begins, the keywords in the phrase search syntax are first tokenized in index mode, the result then replaces the original phrase search statement, and only then does the search operation run.
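To make the mismatch concrete, here is a toy sketch that is not taken from the patent and does not use Lucene. It assumes hand-written, romanized token lists (all hypothetical) for one document and models an exact phrase match (slop 0) as "the phrase tokens occupy consecutive positions in the indexed tokens", which is a simplification of how PhraseQuery works.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of exact phrase matching (slop 0) over token positions. */
final class PhraseMismatchDemo {

    // Hypothetical index-mode (fine-grained) tokens stored for one document.
    static final List<String> INDEXED = Arrays.asList(
            "nanjing", "nanjingshi", "shizhang",
            "changjiang", "changjiangdaqiao", "daqiao");

    // Hypothetical search-mode (coarse) tokens for the same text used as a quoted phrase.
    static final List<String> SEARCH_MODE_PHRASE = Arrays.asList(
            "nanjingshi", "changjiangdaqiao");

    /** True if the phrase tokens occupy consecutive positions in the indexed tokens. */
    static boolean matchesExactPhrase(List<String> indexed, List<String> phrase) {
        if (phrase.isEmpty()) {
            return false;
        }
        for (int start = 0; start + phrase.size() <= indexed.size(); start++) {
            if (indexed.subList(start, start + phrase.size()).equals(phrase)) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions the search-mode phrase fails to match the document it was taken from, because extra index-mode tokens sit between its two tokens, while a phrase tokenized the same way as the index does match.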
Summary of the invention
The technical problem to be solved by the invention is that phrase search in Apache Solr is inaccurate because the tokenization results of the index mode and the search mode are inconsistent.
To solve the above technical problem, the invention adopts the following technical means:
A method for solving inaccurate Apache Solr phrase search, characterized in that the method comprises the following steps:
Step 1: Data reception. The QParserPlugin receives the search statement parameter sent by the client over HTTP.
Step 2: Phrase lookup. A regular expression is used in the QParserPlugin to match the phrases in the search statement parameter, yielding a phrase set.
Step 3: Tokenization and replacement. Each phrase in the phrase set obtained in step 2 is tokenized in index mode, and the tokenized phrase replaces the corresponding phrase in the original search statement.
Step 4: Data conversion. The Apache Solr syntax parser converts the replaced search statement into a Query.
Step 5: Data processing and output. The Apache Solr search procedure runs, and the data is output on completion.
Preferably, further technical solutions of the invention are as follows:
In the phrase lookup, the getString method is first called inside the parse method to obtain the search statement, and a regular expression that matches quoted strings is then used to match the phrase search clauses in the search statement.
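The phrase lookup can be sketched in plain Java as follows. The patent does not give the regular expression, so the pattern here (a double quote, one or more non-quote characters, a closing double quote) is an assumption, as are the class and method names; escaped quotes inside a phrase are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that collects the quoted phrases of a search statement. */
final class PhraseFinder {

    // Assumed pattern: opening quote, one or more non-quote characters, closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    /** Returns every quoted phrase in order of appearance, without the quotes. */
    static List<String> findPhrases(String searchStatement) {
        List<String> phrases = new ArrayList<>();
        Matcher m = QUOTED.matcher(searchStatement);
        while (m.find()) {
            phrases.add(m.group(1));
        }
        return phrases;
    }
}
```

For a statement such as `title:"solr phrase search" AND body:"index mode"`, this yields the two phrases `solr phrase search` and `index mode`; a statement with no quotes yields an empty list.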
In the tokenization and replacement, the tokenizer is called in index mode to tokenize each matched phrase, and the tokenized statement finally replaces the original search statement.
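A minimal sketch of the tokenization and replacement step, again in plain Java. The real method would call the field's index-mode analyzer; here the tokenizer is passed in as a function, and `toyIndexModeTokenizer` is a hypothetical stand-in that only splits on hyphens and whitespace. Writing the tokens back space-separated inside the quotes is also an assumption, since the patent does not specify the replacement format.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that re-tokenizes each quoted phrase and writes it back. */
final class PhraseRewriter {

    // Assumed pattern for a quoted phrase (no support for escaped quotes).
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    /** Replaces every quoted phrase with its index-mode tokens, joined by spaces. */
    static String rewrite(String searchStatement,
                          Function<String, List<String>> indexModeTokenizer) {
        Matcher m = QUOTED.matcher(searchStatement);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            List<String> tokens = indexModeTokenizer.apply(m.group(1));
            String replacement = "\"" + String.join(" ", tokens) + "\"";
            // quoteReplacement keeps '$' and '\' in tokens from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy the text after the last phrase unchanged
        return out.toString();
    }

    /** Toy stand-in for an index-mode tokenizer: splits on hyphens and whitespace. */
    static List<String> toyIndexModeTokenizer(String phrase) {
        return Arrays.asList(phrase.trim().split("[-\\s]+"));
    }
}
```

Calling `PhraseRewriter.rewrite("title:\"full-text search\" AND solr", PhraseRewriter::toyIndexModeTokenizer)` rewrites only the quoted part and leaves the rest of the statement untouched.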
For the syntax parser used in the data conversion, an AntfactQParserPlugin class is written that extends Apache Solr's QParserPlugin and overrides the createParser method, whose return value is of type AntfactQParser.
Finally, for the syntax parser used in the data conversion, <queryParser> is configured in the solrconfig.xml configuration file with class set to AntfactQParserPlugin, so that the custom queryParser can be configured dynamically and flexibly.
The invention extends the Apache Solr syntax parser as a plug-in and overrides its parsing rules, which solves the problem of inaccurate phrase search. It provides a pluggable syntax-parser extension, and each phrase is tokenized in index mode before the search is run.
1. Write the AntfactQParser class, which extends Apache Solr's LuceneQParser class and overrides the parse method. Inside parse, the getString method is first called to obtain the search statement; a regular expression that matches quoted strings is then used to match the phrase search clauses in the search statement; the tokenizer is then called in index mode to tokenize each matched phrase; finally, the tokenized statement replaces the original search statement. This leaves the functionality of Apache Solr's original LuceneQParser unaffected while allowing the parsing rules to be customized as required.
2. Write the AntfactQParserPlugin class, which extends Apache Solr's QParserPlugin and overrides the createParser method, whose return value is of type AntfactQParser. Finally, configure <queryParser> in the solrconfig.xml configuration file with class set to AntfactQParserPlugin, so that the custom queryParser can be configured dynamically and flexibly.
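The design in the first item above, rewriting the search statement and then handing it to the unchanged base parser, can be sketched without Solr on the classpath. `RewritingParser` and its two function parameters are hypothetical stand-ins, not Solr classes; in Solr the same shape would presumably be a subclass that overrides parse, which this sketch does not attempt to reproduce.

```java
import java.util.function.Function;
import java.util.function.UnaryOperator;

/**
 * Hypothetical stand-in for the "rewrite, then delegate" structure.
 * Q is whatever query object the base parser produces.
 */
final class RewritingParser<Q> {

    private final UnaryOperator<String> rewriter;   // e.g. index-mode phrase rewriting
    private final Function<String, Q> baseParser;   // the original, unmodified parser

    RewritingParser(UnaryOperator<String> rewriter, Function<String, Q> baseParser) {
        this.rewriter = rewriter;
        this.baseParser = baseParser;
    }

    /** Rewrites the statement first, then lets the base parser build the query. */
    Q parse(String searchStatement) {
        return baseParser.apply(rewriter.apply(searchStatement));
    }
}
```

Because the base parser only ever sees the rewritten string, its own behaviour stays as it was, which is the property the first item claims for the original LuceneQParser.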
Brief description of the drawings
Fig. 1 is a flow chart of the method of the invention for solving inaccurate Apache Solr phrase search.
Fig. 2 is a structural block diagram of the method of the invention for solving inaccurate Apache Solr phrase search.
Detailed description of embodiments
The invention is further described below with reference to the embodiments.
Embodiment 1:
As shown in Fig. 1 and Fig. 2, the method of the invention for solving inaccurate Apache Solr phrase search is characterized in that it comprises the following steps:
Step 1: Data reception. The QParserPlugin receives the search statement parameter sent by the client over HTTP.
Step 2: Phrase lookup. A regular expression is used in the QParserPlugin to match the phrases in the search statement parameter, yielding a phrase set. Inside the parse method, the getString method is first called to obtain the search statement, and a regular expression that matches quoted strings is then used to match the phrase search clauses in the search statement.
Step 3: Tokenization and replacement. Each phrase in the phrase set obtained in step 2 is tokenized in index mode, and the tokenized phrase replaces the corresponding phrase in the original search statement. The tokenizer is called in index mode to tokenize each matched phrase, and the tokenized statement finally replaces the original search statement.
Step 4: Data conversion. The Apache Solr syntax parser converts the replaced search statement into a Query. For this syntax parser, an AntfactQParserPlugin class is written that extends Apache Solr's QParserPlugin and overrides the createParser method, whose return value is of type AntfactQParser. Finally, <queryParser> is configured in the solrconfig.xml configuration file with class set to AntfactQParserPlugin, so that the custom queryParser can be configured dynamically and flexibly.
Step 5: Data processing and output. The Apache Solr search procedure runs, and the data is output on completion.
Embodiment 2:
1. First step: the QParserPlugin receives the search statement parameter sent by the client over HTTP.
2. Second step: a regular expression is used in the QParserPlugin to match the phrases in the search statement parameter, yielding a phrase set.
3. Third step: the phrases in the phrase set are traversed and tokenized in index mode.
4. Fourth step: the tokenized phrases replace the phrases in the original search statement.
5. Fifth step: the Apache Solr syntax parser converts the replaced search statement into a Query.
6. Sixth step: the Apache Solr search procedure runs.
The foregoing is only a specific embodiment of the invention, and the protection of the invention is not limited to it. Any equivalent change or replacement of the technical features of this technical solution that a person skilled in the art can conceive falls within the protection scope of the invention.
Claims (6)
1. A method for solving inaccurate Apache Solr phrase search, characterized in that the method comprises the following steps:
Step 1: Data reception. The QParserPlugin receives the search statement parameter sent by the client over HTTP.
Step 2: Phrase lookup. A regular expression is used in the QParserPlugin to match the phrases in the search statement parameter, yielding a phrase set.
Step 3: Tokenization and replacement. Each phrase in the phrase set obtained in step 2 is tokenized in index mode, and the tokenized phrase replaces the corresponding phrase in the original search statement.
Step 4: Data conversion. The Apache Solr syntax parser converts the replaced search statement into a Query.
Step 5: Data processing and output. The Apache Solr search procedure runs, and the data is output on completion.
2. The method for solving inaccurate Apache Solr phrase search according to claim 1, characterized in that, in the phrase lookup, the getString method is first called inside the parse method to obtain the search statement, and a regular expression that matches quoted strings is then used to match the phrase search clauses in the search statement.
3. The method for solving inaccurate Apache Solr phrase search according to claim 1, characterized in that, in the tokenization and replacement, the tokenizer is called in index mode to tokenize each matched phrase, and the tokenized statement finally replaces the original search statement.
4. The method for solving inaccurate Apache Solr phrase search according to claim 1, characterized in that, for the syntax parser used in the data conversion, an AntfactQParserPlugin class is written that extends Apache Solr's QParserPlugin and overrides the createParser method, whose return value is of type AntfactQParser.
5. The method for solving inaccurate Apache Solr phrase search according to claim 1 or 3, characterized in that, finally, for the syntax parser used in the data conversion, <queryParser> is configured in the solrconfig.xml configuration file with class set to AntfactQParserPlugin, so that the custom queryParser can be configured dynamically and flexibly.
6. The method for solving inaccurate Apache Solr phrase search according to claim 1, characterized in that the plug-in for solving inaccurate Apache Solr phrase search is enabled by configuring <queryParser> in the solrconfig.xml configuration file of Apache Solr.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710117467.8A CN106933998B (en) | 2017-03-01 | 2017-03-01 | Method for solving inaccurate Apache Solr phrase search |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710117467.8A CN106933998B (en) | 2017-03-01 | 2017-03-01 | Method for solving inaccurate Apache Solr phrase search |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106933998A (en) | 2017-07-07 |
CN106933998B (en) | 2021-03-02 |
Family
ID=59423888
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710117467.8A Active CN106933998B (en) | 2017-03-01 | 2017-03-01 | Method for solving inaccurate Apache Solr phrase search |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106933998B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102682036A (en) * | 2011-03-18 | 2012-09-19 | 新奥特(北京)视频技术有限公司 | Non-editing based method and system for searching media assets |
CN103488702A (en) * | 2013-09-06 | 2014-01-01 | 云南电力试验研究院(集团)有限公司电力研究院 | SorlCloud based unstructured data retrieval method and system |
Non-Patent Citations (1)
Title |
---|
Author unknown: "Solr实现Low Level查询解析" (Implementing low-level query parsing in Solr), 《程序园 HTTPS://WWW.CNBLOGS.COM/LVFEILONG/P/FSDF32ERWRF.HTML》 * |
Also Published As
Publication number | Publication date |
---|---|
CN106933998B (en) | 2021-03-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||