CN110069610A - Search method, device, equipment and storage medium based on Solr - Google Patents

Search method, device, equipment and storage medium based on Solr

Info

Publication number
CN110069610A
Authority
CN
China
Prior art keywords
retrieval
information
chinese
default
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910205809.0A
Other languages
Chinese (zh)
Other versions
CN110069610B (en)
Inventor
杨昭
曾文韬
马兰
孙文宇
何维
王海君
刘菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910205809.0A
Publication of CN110069610A
Application granted
Publication of CN110069610B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G06F16/3322: Query formulation using system suggestions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374: Thesaurus

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a Solr-based retrieval method comprising the following steps: receiving an information retrieval request and obtaining the retrieval information corresponding to the request; when the retrieval information is Chinese retrieval information, judging whether its character count exceeds a preset standard amount; when the character count does not exceed the preset standard amount, obtaining the index field corresponding to the Chinese retrieval information from a preset retrieval dictionary; and querying a preset retrieval database to obtain the target articles corresponding to the index field and outputting them as the search result. The invention also discloses a Solr-based retrieval device, equipment and storage medium. The preset retrieval dictionary is built by analyzing a large amount of text data, and during retrieval the retrieval information is converted into the corresponding index field through this dictionary, which makes information retrieval more comprehensive and accurate.

Description

Search method, device, equipment and storage medium based on Solr
Technical field
The present invention relates to the field of computer technology, and in particular to a Solr-based retrieval method, device, equipment and storage medium.
Background
With the widespread use of big data, more and more data flows into daily life. The growth in data volume brings convenience, but it also brings corresponding problems.
For example, finding the information one needs in a huge database has become a major problem. Current information retrieval is usually based on exact matching or fuzzy matching. Because these techniques cannot parse the meaning of the retrieval information, they cannot retrieve the information the user needs accurately and comprehensively. How to retrieve information more accurately and comprehensively has therefore become a technical problem that urgently needs to be solved.
Summary of the invention
The main purpose of the present invention is to provide a Solr-based retrieval method, device, equipment and storage medium, aiming to solve the technical problem that current information retrieval is inaccurate and incomplete.
To achieve the above purpose, the present invention provides a Solr-based retrieval method comprising the following steps:
receiving an information retrieval request, and obtaining the retrieval information corresponding to the retrieval request;
when the retrieval information is Chinese retrieval information, judging whether the character count of the Chinese retrieval information exceeds a preset standard amount;
when the character count of the Chinese retrieval information does not exceed the preset standard amount, obtaining the index field corresponding to the Chinese retrieval information from a preset retrieval dictionary;
querying a preset retrieval database, obtaining the target articles corresponding to the index field, and outputting the target articles as the search result.
Optionally, before the step of obtaining, when the character count of the Chinese retrieval information does not exceed the preset standard amount, the index field corresponding to the Chinese retrieval information from the preset retrieval dictionary, the method comprises:
crawling text data from the network, performing word segmentation on the text data with a preset Chinese word segmentation algorithm to obtain the corresponding words, and collecting the words into a sample set;
counting the frequency of occurrence of each distinct word in the sample set, and sorting the words by frequency from high to low to form a word list;
selecting a preset quantity of top-ranked words from the word list as index terms, forming a basic dictionary from the index terms, and converting the index terms in the basic dictionary into corresponding word vectors with a preset word vector model;
determining the approximate words of each index term according to the word vectors of the index terms, and saving each index term in association with its approximate words to generate the preset retrieval dictionary.
Optionally, the step of determining the approximate words of each index term according to the word vectors of the index terms, and saving each index term in association with its approximate words to generate the preset retrieval dictionary, comprises:
taking each index term in the basic dictionary in turn as a first index term, and calculating the cosine value between the word vector of the first index term and the word vector of each second index term, that is, each index term in the basic dictionary other than the first index term;
when there is a target cosine value greater than a preset cosine value, obtaining the approximate index term corresponding to the target cosine value, and taking the approximate index term as an approximate word of the first index term;
saving the first index term in association with its approximate words to generate the preset retrieval dictionary.
Optionally, the step of obtaining, when the character count of the Chinese retrieval information does not exceed the preset standard amount, the index field corresponding to the Chinese retrieval information from the preset retrieval dictionary comprises:
when the character count of the Chinese retrieval information does not exceed the preset standard amount, performing word segmentation on the Chinese retrieval information with the preset Chinese word segmentation algorithm to obtain the keyword set corresponding to the Chinese retrieval information;
comparing the keywords in the keyword set with the index terms in the preset retrieval dictionary, and obtaining the target index term closest to each keyword together with the approximate words associated with that target index term;
taking the target index terms and their approximate words as the index field corresponding to the Chinese retrieval information.
Optionally, the step of querying the preset retrieval database, obtaining the target articles corresponding to the index field, and outputting the target articles as the search result comprises:
combining the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, and querying the preset retrieval database to obtain the target articles corresponding to the retrieval formula;
setting the weight of each index field in the retrieval formula according to a preset weight mapping table, and sorting the target articles by the weights of the index fields to form a sorted article list;
outputting the sorted article list as the search result.
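The weighting and sorting step above can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the weight table, the default weight, and the scoring rule (the sum of the weights of the index fields that occur in an article) are all hypothetical.

```python
# Hypothetical weight mapping table: index field -> weight (values are assumed).
WEIGHTS = {"house property": 3.0, "real estate": 2.0, "house price": 1.0}
DEFAULT_WEIGHT = 0.5  # assumed weight for index fields missing from the table


def rank_articles(articles, index_fields, weights=WEIGHTS):
    """Return matching articles sorted by the summed weight of the index fields they contain."""
    def score(text):
        return sum(weights.get(f, DEFAULT_WEIGHT) for f in index_fields if f in text)

    scored = [(score(a), a) for a in articles]
    # Drop articles that match no index field; highest score first.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [article for s, article in ranked if s > 0]
```

In a Solr deployment the same effect would more likely be obtained with per-term boosts in the query itself; the pure-Python version is used here only to make the ordering rule explicit.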
Optionally, after the step of querying the preset retrieval database, obtaining the target articles corresponding to the index field, and outputting the target articles as the search result, the method comprises:
receiving user behavior data based on the sorted article list, and using the number of views and the browsing time in the user behavior data to determine and mark the retrieved articles in the sorted article list that the user cares about;
when a browsing-record query instruction is received, outputting the marked retrieved articles for the user to consult.
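A minimal sketch of the marking step, assuming that an article counts as "of interest" once either its view count or its browsing time reaches a threshold. The record layout, both thresholds, and the either-or rule are assumptions made for illustration; the text does not specify them.

```python
from dataclasses import dataclass

MIN_VIEWS = 2        # assumed threshold on the number of views
MIN_SECONDS = 30.0   # assumed threshold on browsing time, in seconds


@dataclass
class BrowseRecord:
    article_id: str
    views: int
    seconds: float


def mark_followed(records, min_views=MIN_VIEWS, min_seconds=MIN_SECONDS):
    """Return the IDs of articles whose views or browsing time reach the thresholds."""
    return [r.article_id for r in records
            if r.views >= min_views or r.seconds >= min_seconds]
```

The returned IDs could then be stored and served when a browsing-record query arrives.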
Optionally, after the step of judging, when the retrieval information is Chinese retrieval information, whether the character count of the Chinese retrieval information exceeds the preset standard amount, the method comprises:
when the character count of the Chinese retrieval information exceeds the preset standard amount, splitting the Chinese retrieval information into sentences to obtain the simple sentences corresponding to the Chinese retrieval information;
performing word segmentation on the simple sentences obtained by sentence splitting to obtain the corresponding keywords, and obtaining from the preset retrieval dictionary the target index terms closest to the keywords together with the approximate words associated with the target index terms;
taking the target index terms and their approximate words as the index field corresponding to the Chinese retrieval information;
querying the preset retrieval database, obtaining the target articles corresponding to the index field, and outputting the target articles as the search result.
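The sentence-splitting step for long retrieval information can be illustrated with a short sketch. The set of delimiter characters is an assumption; the text does not say which punctuation marks end a simple sentence.

```python
import re

# Assumed sentence delimiters: common Chinese and ASCII end punctuation plus newlines.
_SENTENCE_END = re.compile(r"[。！？；!?;\n]+")


def split_sentences(text):
    """Split long Chinese retrieval information into non-empty simple sentences."""
    return [part.strip() for part in _SENTENCE_END.split(text) if part.strip()]
```

Each resulting sentence would then go through the same segmentation and dictionary lookup as a short query.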
In addition, to achieve the above purpose, the present invention also provides a Solr-based retrieval device, which comprises:
a request receiving module, configured to receive an information retrieval request and obtain the retrieval information corresponding to the retrieval request;
a character count judging module, configured to judge, when the retrieval information is Chinese retrieval information, whether the character count of the Chinese retrieval information exceeds the preset standard amount;
an index determining module, configured to obtain, when the character count of the Chinese retrieval information does not exceed the preset standard amount, the index field corresponding to the Chinese retrieval information from the preset retrieval dictionary;
a result output module, configured to query the preset retrieval database, obtain the target articles corresponding to the index field, and output the target articles as the search result.
In addition, to achieve the above purpose, the present invention also provides Solr-based retrieval equipment;
the Solr-based retrieval equipment comprises a memory, a processor, and a computer program that is stored in the memory and can run on the processor, wherein:
the computer program, when executed by the processor, implements the steps of the Solr-based retrieval method described above.
In addition, to achieve the above purpose, the present invention also provides a computer storage medium;
a computer program is stored in the computer storage medium, and the computer program, when executed by a processor, implements the steps of the Solr-based retrieval method described above.
In the Solr-based retrieval method, device, equipment and storage medium proposed by the embodiments of the present invention, an information retrieval request is received and the retrieval information corresponding to the request is obtained; when the retrieval information is Chinese retrieval information, it is judged whether the character count of the Chinese retrieval information exceeds a preset standard amount; when the character count does not exceed the preset standard amount, the index field corresponding to the Chinese retrieval information is obtained from a preset retrieval dictionary; the preset retrieval database is queried, the target articles corresponding to the index field are obtained, and the target articles are output as the search result. In the present invention, the server crawls massive text data from the network and generates the preset retrieval dictionary by analyzing that data. During retrieval, the server obtains the retrieval information and identifies its information type and character count. After the server determines that the retrieval information is Chinese retrieval information that does not exceed the preset standard amount, it uses the preset retrieval dictionary to convert the Chinese retrieval information into the corresponding index field and then retrieves with that index field. This avoids the missed results that occur when the Chinese retrieval information is used directly, and it allows the semantics of the Chinese retrieval information to be recognized effectively, so that information retrieval is more comprehensive and accurate.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the device in the hardware operating environment involved in the embodiments of the present invention;
Fig. 2 is a schematic flowchart of the first embodiment of the Solr-based retrieval method of the present invention;
Fig. 3 is a functional block diagram of an embodiment of the Solr-based retrieval device of the present invention.
The realization of the purpose, the functional characteristics and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein only explain the present invention and are not intended to limit it.
As shown in Fig. 1, Fig. 1 is a schematic structural diagram of the server in the hardware operating environment involved in the embodiments of the present invention (the server is also called the Solr-based retrieval equipment, which may consist of the Solr-based retrieval device alone or of the Solr-based retrieval device combined with other devices).
The server in the embodiments of the present invention is a computer that manages resources and provides services to users; servers are generally divided into file servers, database servers and application servers. A computer or computer system that runs such software is also called a server. Compared with an ordinary PC (personal computer), a server has higher requirements for stability, security, performance and so on. As shown in Fig. 1, the server may include a processor 1001, such as a central processing unit (CPU), a network interface 1004, a user interface 1003, a memory 1005, a communication bus 1002, and hardware such as a chipset, a disk system and network hardware. The communication bus 1002 implements the connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include standard wired and wireless interfaces. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed random access memory (RAM) or a non-volatile memory such as a disk memory. Optionally, the memory 1005 may also be a storage device independent of the processor 1001.
Optionally, the server may also include a camera, an RF (radio frequency) circuit, sensors, an audio circuit and a WiFi module; the input unit may be, for example, a display screen or a touch screen; besides WiFi, the wireless interface of the network interface may be Bluetooth, a probe and so on. Those skilled in the art will understand that the server structure shown in Fig. 1 does not limit the server, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
As shown in Fig. 1, the computer software product is stored in a storage medium (also called a computer storage medium, computer medium, readable medium, readable storage medium, computer-readable storage medium, or simply medium; the storage medium may be a non-volatile readable storage medium, such as RAM, a magnetic disk or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the methods described in the embodiments of the present invention. The memory 1005, as a kind of computer storage medium, may contain an operating system, a network communication module, a user interface module and a computer program.
In the server shown in Fig. 1, the network interface 1004 is mainly used to connect to a back-end database and exchange data with it. The user interface 1003 is mainly used to connect to a client and exchange data with it. The client, also called the user terminal or terminal, may in the embodiments of the present invention be a fixed terminal or a mobile terminal, for example a network-enabled smart air conditioner, smart lamp, smart power supply, smart speaker, self-driving car, PC, smartphone, tablet computer, e-book reader or portable computer; the terminal contains sensors such as light sensors, motion sensors and other sensors, which are not described further here. The processor 1001 may be used to call the computer program stored in the memory 1005 and execute the steps of the Solr-based retrieval method provided by the following embodiments of the present invention.
This embodiment proposes a Solr-based retrieval method applied to the server shown in Fig. 1. The Solr used in the embodiments of the present invention is developed in Java and is mainly built on the Hypertext Transfer Protocol (HTTP) and Apache Lucene (an open-source full-text search engine toolkit). That is, Solr is a standalone search application server that exposes an API similar to a web service.
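Because Solr is queried over HTTP, a client only needs to build a URL for the select handler. The sketch below does that without sending any request; the host, the core name `articles`, and the field name `title` in the usage note are hypothetical and would depend on the deployment.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10):
    """Build a URL for Solr's /select handler, asking for a JSON response."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"
```

For instance, `build_select_url("http://localhost:8983", "articles", "title:solr")` yields a URL that could be fetched with any HTTP client.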
Referring to Fig. 2, in the first embodiment of the Solr-based retrieval method of the present invention, the method includes:
Step S10: receiving an information retrieval request, and obtaining the retrieval information corresponding to the retrieval request.
The server receives an information retrieval request and then obtains the retrieval information corresponding to it. The way in which the information retrieval request is triggered is not specifically limited. For example, the user enters the sentence "how to improve the accuracy of information retrieval" on the terminal and triggers an information retrieval request based on it; the terminal sends the request to the server, the server receives it, and the server takes "how to improve the accuracy of information retrieval" as the retrieval information corresponding to the request. As another example, the user inputs an article that needs to be checked for duplication and triggers an information retrieval request with the voice input "find similar articles"; the terminal sends the request to the server, the server receives it, and the server takes the article to be checked as the retrieval information corresponding to the request.
It should be noted that the retrieval information may be of different types, for example Chinese or English. After the server obtains the retrieval information, it first determines from the character types of the retrieval information whether the retrieval information is Chinese. If the retrieval information is in a foreign language, the server may translate it into Chinese retrieval information. Once the retrieval information is Chinese retrieval information, the server executes the following steps.
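The character-type check can be sketched as below. Treating a query as Chinese when at least half of its non-space characters fall in the CJK Unified Ideographs block (U+4E00 to U+9FFF) is an assumption made for illustration; the text does not state the rule, and the translation of foreign-language queries is not shown.

```python
def is_chinese_query(text, min_ratio=0.5):
    """Return True if at least min_ratio of the non-space characters are CJK ideographs."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return False
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars) >= min_ratio
```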
Step S20: when the retrieval information is Chinese retrieval information, judging whether the character count of the Chinese retrieval information exceeds the preset standard amount.
After the server determines from the character types that the retrieval information is Chinese retrieval information, it obtains the character count (also called the information amount) of the retrieval information and compares it with the preset standard amount. The preset standard amount is a preset critical value for the character count; for example, it may be set to 100 bytes. If the character count of the retrieval information does not exceed the preset standard amount, that is, the retrieval information obtained by the server is short, the server retrieves information using the word conversion retrieval mode described in step S30.
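The length check amounts to a single comparison. The sketch below counts characters with `len()`, while the example in the text gives the limit in bytes; a byte-based variant would compare `len(query.encode("utf-8"))` instead. The two mode names are hypothetical labels.

```python
PRESET_STANDARD_AMOUNT = 100  # example value taken from the text


def retrieval_mode(query, limit=PRESET_STANDARD_AMOUNT):
    """Choose the processing path from the query length in characters."""
    # Short queries are converted word by word; long ones are split into sentences first.
    return "word_conversion" if len(query) <= limit else "sentence_split"
```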
Step S30: when the character count of the Chinese retrieval information does not exceed the preset standard amount, obtaining the index field corresponding to the Chinese retrieval information from the preset retrieval dictionary.
When the server determines that the character count of the Chinese retrieval information does not exceed the preset standard amount, it needs to process the Chinese retrieval information in order to search accurately and quickly, that is:
Step S31: performing word segmentation on the Chinese retrieval information with the preset Chinese word segmentation algorithm to obtain the keyword set corresponding to the Chinese retrieval information.
The server segments the Chinese retrieval information with the preset Chinese word segmentation algorithm and obtains the keyword set corresponding to the retrieval information. The preset Chinese word segmentation algorithm is a preset algorithm that cuts a sequence of Chinese characters into individual words; for example, it may be a segmentation algorithm based on string matching or a segmentation algorithm based on statistics.
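As one concrete instance of string-matching segmentation, the sketch below uses forward maximum matching: at each position it takes the longest vocabulary word that matches and falls back to a single character otherwise. The text does not name a specific algorithm, so this choice, the tiny vocabulary in the usage note, and the `max_len` limit are assumptions.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Segment text greedily, preferring the longest vocabulary word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no vocabulary word matches.
            if size == 1 or piece in vocabulary:
                words.append(piece)
                i += size
                break
    return words
```

With the vocabulary `{"提高", "信息", "检索", "信息检索", "准确性"}`, the illustrative query `"提高信息检索的准确性"` is cut into `提高`, `信息检索`, `的` and `准确性`.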
Step S32: comparing the keywords in the keyword set with the index terms in the preset retrieval dictionary, and obtaining the target index term closest to each keyword together with the approximate words associated with that target index term; taking the target index terms and their approximate words as the index field corresponding to the Chinese retrieval information.
The server compares the keywords in the keyword set with each index term in the preset retrieval dictionary. The preset retrieval dictionary is a dictionary of index terms built in advance; for example, the server crawls massive data from the network, selects 50000 index terms to build the retrieval dictionary, and stores for each index term in the dictionary its similar approximate words. The server calculates the similarity between each keyword in the keyword set and each index term in the preset retrieval dictionary, obtains the index term with the highest similarity to the keyword, and takes that index term as the target index term matching the keyword. After obtaining the target index term, the server obtains the approximate words associated with it in the preset retrieval dictionary, and takes the target index term and its associated approximate words as the index field corresponding to the Chinese retrieval information.
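The lookup in step S32 can be sketched as follows. The text does not specify how keyword-to-index-term similarity is computed, so the sketch substitutes the character-level ratio from Python's `difflib.SequenceMatcher`; that measure, the `min_similarity` cut-off, and the dictionary contents (the first entry mirrors the "house property" example used later in this embodiment, the second is invented) are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical retrieval dictionary: index term -> associated approximate words.
RETRIEVAL_DICT = {
    "house property": ["real estate", "developer", "house price"],
    "retrieval": ["search", "query"],
}


def to_index_fields(keywords, dictionary=RETRIEVAL_DICT, min_similarity=0.5):
    """Map each keyword to its closest index term plus that term's approximate words."""
    fields = []
    for kw in keywords:
        # Stand-in similarity measure; the real one is not given in the text.
        best = max(dictionary, key=lambda term: SequenceMatcher(None, kw, term).ratio())
        if SequenceMatcher(None, kw, best).ratio() < min_similarity:
            continue  # no index term is close enough to this keyword
        for term in [best, *dictionary[best]]:
            if term not in fields:
                fields.append(term)
    return fields
```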
In this embodiment the retrieval dictionary is built in advance. By converting the Chinese retrieval information of a retrieval request into the corresponding index field through the retrieval dictionary, the server makes retrieval more comprehensive and prevents results from being missed.
Step S40: querying the preset retrieval database, obtaining the target articles corresponding to the index field, and outputting the target articles as the search result.
After obtaining the index field corresponding to the Chinese retrieval information, the server queries the preset retrieval database. The preset retrieval database is the database against which the user's retrieval information is searched, for example Baidu Wenku (Baidu Library). The server obtains the target articles corresponding to the index field and outputs them as the search result of the retrieval request.
The server queries the preset retrieval database and obtains the target articles that contain the index field. For example, suppose the server's preset retrieval dictionary contains "house property" with the associated approximate words "real estate", "developer" and "house price". When the server receives the Chinese retrieval information "what is the house price now", it can convert the Chinese retrieval information into an index field such as "the real estate industry's current xxx". In other words, even if an article in the preset retrieval database contains none of the same words, the search engine can still find a semantic association in it. Searching with the preset retrieval dictionary can therefore improve the accuracy of retrieval.
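One plausible way to turn the expanded index field into a Solr query is to OR the target index term and its approximate words inside a single field clause, so that an article matching any of them is returned. The field name `content` in the usage note is hypothetical, and combining the terms with OR is an assumption about how the expansion is applied.

```python
def build_expanded_query(field, terms):
    """Join terms into one clause such as content:("a" OR "b") for Solr's standard query parser."""
    if not terms:
        raise ValueError("at least one term is required")
    # Quote each term as a phrase, escaping backslashes and double quotes.
    quoted = ['"' + t.replace("\\", "\\\\").replace('"', '\\"') + '"' for t in terms]
    return f"{field}:({' OR '.join(quoted)})"
```

For example, `build_expanded_query("content", ["house property", "real estate"])` gives `content:("house property" OR "real estate")`, which can be passed as the `q` parameter of a select request.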
In this embodiment, the server crawls massive text data from the network and generates the preset retrieval dictionary by analyzing that data. During retrieval, the server obtains the retrieval information and identifies its information type and character count. After the server determines that the retrieval information is Chinese retrieval information that does not exceed the preset standard amount, it converts the Chinese retrieval information into the corresponding index field through the preset retrieval dictionary and then retrieves with that index field, so that information retrieval is more comprehensive and accurate.
Further, on the basis of the first embodiment of the present invention, a second embodiment of the Solr-based retrieval method of the present invention is proposed.
This embodiment describes steps performed before step S30 of the first embodiment. In step S30 the server obtains the index field corresponding to the Chinese retrieval information from the preset retrieval dictionary; before that, the server needs to build the retrieval dictionary in advance. This embodiment details the steps for building the retrieval dictionary, which include:
Step S01: crawling text data from the network, performing word segmentation on the text data with the preset Chinese word segmentation algorithm to obtain the corresponding words, and collecting the words into a sample set.
The server crawls massive text data from the network and processes it to extract the words it contains. Specifically: 1. The server preprocesses the text data. Preprocessing includes converting between simplified and traditional Chinese, removing XML symbols, and processing the entry content into single-line training data. The preset word vector model (word2vec) learns the semantic relations between words from word co-occurrence, so the content of different entries needs to be kept separate for training. 2. The server performs word segmentation on the text data with the preset Chinese word segmentation algorithm to obtain the corresponding words; the preset Chinese word segmentation algorithm is the same as in the first embodiment and is not described again here. After segmentation, the server collects the resulting words into a sample set.
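Part of the preprocessing can be sketched with the standard library alone: stripping XML-like tags and emitting one sentence per item so that each becomes a single-line training record. Simplified/traditional conversion is omitted because it needs a character mapping table, and both the tag pattern and the sentence delimiters are assumptions.

```python
import re


def preprocess(raw):
    """Strip XML/HTML-like tags and return one cleaned sentence per list item."""
    no_tags = re.sub(r"<[^>]+>", "", raw)
    sentences = re.split(r"[。！？!?\n]+", no_tags)
    return [s.strip() for s in sentences if s.strip()]
```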
Step S02: counting the frequency of occurrence of each distinct word in the sample set, and sorting the words by frequency from high to low to form a word list.
The server counts how often each distinct word occurs in the sample set and sorts the words by frequency from high to low to form a word list. Chinese usually has a large vocabulary that includes rarely used words, so the server selects the words with the highest frequency of occurrence for word vector training, specifically:
Step S03: selecting a preset quantity of top-ranked words from the word list as index terms, forming a basic dictionary from the index terms, and converting the index terms in the basic dictionary into corresponding word vectors with the preset word vector model.
The server selects the preset quantity of top-ranked words from the word list as index terms. The preset quantity is the preset number of index terms in the retrieval dictionary; for example, if the preset quantity is set to 5000, the server selects the 5000 most frequent common words from the word list, takes them as index terms, and collects the selected index terms to form the basic dictionary.
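Steps S02 and S03 up to the basic dictionary reduce to a frequency count followed by a top-N selection. A minimal sketch, in which the function name is hypothetical:

```python
from collections import Counter


def build_basic_dictionary(sample_set, preset_quantity):
    """Return the preset_quantity most frequent words, most frequent first."""
    counts = Counter(sample_set)
    return [word for word, _ in counts.most_common(preset_quantity)]
```

With `preset_quantity=5000`, as in the example above, the result would be the 5000 index terms of the basic dictionary.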
After obtaining the basic dictionary, the server performs feature processing on the index terms in the basic dictionary, also called term vector encoding. Common encoding methods include one-hot encoding (the discrete bag-of-words, BOW, representation) and the low-dimensional dense vectors obtained by training a deep learning model such as a term vector model; the representation produced by the term vector model word2vec is commonly called a word embedding or distributed representation. The server then performs term vector training by machine learning: once the terms have been encoded as vectors, the text data is converted into numeric data, which can be input into a preset machine learning model for training.
That is, the server uses the index terms to form the input layer of the default term vector model. Each word is represented as a one-hot vector: if the vocabulary size is V, each word is represented as a V-dimensional vector in which the element corresponding to that word is set to 1 and all other elements are 0. Multiplying a one-hot vector by the weight matrix W1 is equivalent to simply selecting one row of W1. If C term vectors are input, the activation function of the hidden layer in effect sums the selected ("hot") rows of the matrix and divides by C to take the average. In other words, the activation function of the hidden-layer units is a simple linear operation (the weighted sum is passed directly as the input to the next layer). From the hidden layer to the output layer, a weight matrix W2 is used to compute a score for each word in the vocabulary, and the distributed term vector corresponding to each high-frequency word is obtained.
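The forward pass described above resembles the CBOW variant of word2vec; that reading is an assumption, since the text does not name the variant. The sketch below uses plain Python lists and small hand-picked matrices (hypothetical values, no training) to show that a one-hot vector times W1 selects a row, that the hidden layer is the average of the C selected rows, and that W2 turns the hidden vector into one score per vocabulary word.

```python
def one_hot(index, vocab_size):
    """V-dimensional vector with a 1 at `index` and 0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(vocab_size)]

def vec_mat(vec, mat):
    """Row vector times matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def cbow_forward(context_indices, W1, W2):
    """Average the W1 rows of the C context words, then score every word with W2."""
    vocab_size, hidden_size = len(W1), len(W1[0])
    hidden = [0.0] * hidden_size
    for idx in context_indices:
        row = vec_mat(one_hot(idx, vocab_size), W1)  # same values as W1[idx]
        hidden = [h + r for h, r in zip(hidden, row)]
    hidden = [h / len(context_indices) for h in hidden]  # divide by C
    scores = vec_mat(hidden, W2)  # linear hidden layer, no non-linearity
    return hidden, scores

W1 = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # V = 3 words, hidden size 2
W2 = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]    # hidden size 2, V = 3 scores
hidden, scores = cbow_forward([0, 2], W1, W2)
print(hidden, scores)
```

In a trained model of this kind, the rows of W1 are typically what is kept as the term vectors; a softmax over `scores` and the weight updates are omitted here.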
Step S04: determine the approximate words of each index term according to the term vector of each index term, save each index term in association with its corresponding approximate words, and generate the default retrieval dictionary.
After term vector training is complete, the server generates the default retrieval dictionary according to the term vectors of the index terms. Specifically, this includes:
Step a: take each index term in the basic dictionary as the first index term, and calculate the cosine value between the term vector of the first index term and the term vector of each second index term, that is, each index term in the basic dictionary other than the first index term;
Step b: when there is a target cosine value greater than the default cosine value, obtain the approximate index term corresponding to the target cosine value, and use the approximate index term as an approximate word of the first index term;
Step c: save the first index term in association with its corresponding approximate words, and generate the default retrieval dictionary.
That is, the server takes each index term as the first index term and calculates the cosine value between the term vector of the first index term and the term vectors of the other (second) index terms in the basic dictionary; the server uses the cosine value to characterize the similarity between the first index term and each second index term. The server compares each calculated cosine value with the default cosine value, where the default cosine value is a threshold set in advance, for example 0.9. When the server determines that there is a target cosine value greater than the default cosine value, it obtains the approximate index term corresponding to the target cosine value and uses it as an approximate word of the first index term. The first index term is saved in association with its corresponding approximate words, and the default retrieval dictionary is generated.
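Steps a to c can be sketched as follows. The two-dimensional term vectors are made-up values chosen for illustration, and `build_retrieval_dictionary` is a hypothetical name; only the 0.9 threshold comes from the example in the text.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_retrieval_dictionary(term_vectors, default_cosine=0.9):
    """Map each index term to the other terms whose cosine exceeds the threshold."""
    dictionary = {}
    for first, first_vec in term_vectors.items():          # step a
        dictionary[first] = [
            second
            for second, second_vec in term_vectors.items()
            if second != first
            and cosine(first_vec, second_vec) > default_cosine  # step b
        ]
    return dictionary                                        # step c

# Hypothetical vectors: 房产 (house property), 地产 (real estate), 天气 (weather)
term_vectors = {
    "房产": [1.0, 0.1],
    "地产": [0.9, 0.2],
    "天气": [0.0, 1.0],
}
retrieval_dictionary = build_retrieval_dictionary(term_vectors)
print(retrieval_dictionary)
```

The pairwise comparison is quadratic in the number of index terms, which is acceptable for a dictionary of a few thousand terms built offline.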
In this embodiment, the default retrieval dictionary allows the server to convert the retrieval information into the corresponding index information, so that the server can parse the meaning of the retrieval information and retrieval becomes more accurate.
Further, on the basis of the above embodiments, a third embodiment of the Solr-based search method of the present invention is proposed.
This embodiment is a refinement of step S40 in the first embodiment and specifically describes the steps of determining the retrieval information. The Solr-based search method includes:
Step S41: combine the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, query the default searching database, and obtain the target articles corresponding to the retrieval formula.
The server combines the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, then queries the default searching database to obtain the target articles corresponding to the retrieval formula. That is, the server merges the index fields. For example, if the index field of an article is "house property" and the server finds that the associated words of "house property" are "real estate" and "developer", the server merges "house property", "real estate", and "developer" and generates the corresponding index XML. The server then needs to query only once to find the target articles containing "house property", "real estate", and "developer".
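One way to picture the merged retrieval formula is as a single query string in the Lucene/Solr standard query syntax, where several terms are grouped under one field with OR. The sketch below builds such a string instead of the index XML mentioned in the text; the field name `content` and the helper names are hypothetical, and no request is sent to a Solr server.

```python
def quote_term(term):
    """Wrap a term in double quotes, escaping backslashes and quotes inside it."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def build_retrieval_formula(index_fields, field="content"):
    """Combine all index fields into one OR query so a single lookup covers them."""
    if not index_fields:
        raise ValueError("at least one index field is required")
    joined = " OR ".join(quote_term(term) for term in index_fields)
    return f"{field}:({joined})"

formula = build_retrieval_formula(["house property", "real estate", "developer"])
print(formula)
```

Because all three terms are in one formula, the server issues one query rather than one query per term, which is the benefit the embodiment describes.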
Step S42: set the weight of each index field in the retrieval formula according to the default weight mapping table, sort the target articles by the weights of the index fields to form an article sorted list, and output the article sorted list as the search result.
After obtaining the target articles, the server obtains the weight of each index field according to the default weight mapping table (a preset table that maps word types to weights; for example, the table may set the weight for titles to 50%, the weight for adjectives to 30%, and the weight for pronouns to 20%), sorts the target articles by the weights of the index fields to form an article sorted list, and outputs the article sorted list as the search result.
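The text does not say exactly how the field weights turn into an article order, so the sketch below makes an explicit assumption: each article's score is the sum of the weights of the index fields it matched, and articles are sorted by that score from high to low. The weights are the ones from the example; the data structures and names are hypothetical.

```python
# Weights from the example mapping table: word type -> weight
DEFAULT_WEIGHTS = {"title": 0.5, "adjective": 0.3, "pronoun": 0.2}

def rank_articles(matches, field_types, weights=DEFAULT_WEIGHTS):
    """Sort article ids by the summed weight of their matched index fields.

    matches:     {article_id: set of index fields found in the article}
    field_types: {index_field: word type used to look up the weight}
    """
    def score(article_id):
        return sum(weights.get(field_types[f], 0.0) for f in matches[article_id])
    return sorted(matches, key=score, reverse=True)

# Hypothetical index fields and the word type assumed for each
field_types = {"house property": "title", "beautiful": "adjective", "it": "pronoun"}
matches = {
    "a1": {"it"},
    "a2": {"house property", "beautiful"},
    "a3": {"beautiful"},
}
article_sorted_list = rank_articles(matches, field_types)
print(article_sorted_list)
```

Other scoring rules (for example, taking the maximum weight instead of the sum) would fit the description equally well.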
In this embodiment, the index fields may differ in importance when the server performs retrieval. To make the queried target articles more accurate, different weight rules can be set in advance, so that the user can quickly find the information needed.
Further, on the basis of the third embodiment, a fourth embodiment of the Solr-based search method of the present invention is proposed.
This embodiment describes steps performed after step S40 in the first embodiment. In this embodiment, the server can mark and save retrieval articles according to user behavior data, specifically including:
Step S50: receive user behavior data based on the article sorted list, and, according to the number of views and the browsing time in the user behavior data, determine and mark the retrieval articles in the article sorted list that the user pays attention to.
The server receives user behavior data based on the article sorted list. That is, the article sorted list contains multiple articles, each of which the user can view; the server collects the user behavior data and, according to the number of views and the browsing time in the user behavior data, determines and marks the retrieval articles in the article sorted list that the user pays attention to.
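A minimal sketch of step S50 follows. The text names the two signals (number of views and browsing time) but not how they are combined, so the thresholds and the rule that both must be met are assumptions, as are the `Behavior` record and the function name.

```python
from dataclasses import dataclass

@dataclass
class Behavior:
    article_id: str
    view_count: int          # number of times the user opened the article
    browsing_seconds: float  # total time the user spent on the article

def mark_followed_articles(behaviors, min_views=2, min_seconds=60.0):
    """Return ids of articles the user appears to pay attention to.

    Assumed rule: an article is marked when it meets both thresholds.
    """
    return [
        b.article_id
        for b in behaviors
        if b.view_count >= min_views and b.browsing_seconds >= min_seconds
    ]

behaviors = [
    Behavior("a1", view_count=3, browsing_seconds=120.0),
    Behavior("a2", view_count=1, browsing_seconds=300.0),
    Behavior("a3", view_count=5, browsing_seconds=10.0),
]
marked = mark_followed_articles(behaviors)
print(marked)
```

The marked ids are what the server would later return when it receives a browsing record query instruction.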
Step S60: when a browsing record query instruction is received, output the marked retrieval articles for the user to query.
The user can trigger a browsing record query instruction on the terminal, and the terminal sends the instruction to the server. When the server receives the browsing record query instruction, it outputs the marked retrieval articles for the user to query. In this embodiment, the server saves the user's earlier browsing records according to the user behavior data, which makes it convenient for the user to revisit them.
Further, on the basis of the above embodiments, a fifth embodiment of the Solr-based search method of the present invention is proposed.
This embodiment describes steps performed after step S20 in the first embodiment; specifically, it describes the search method of the server when the character quantity of the Chinese retrieval information exceeds the preset standard amount, including:
Step S70: when the character quantity of the Chinese retrieval information exceeds the preset standard amount, perform sentence-splitting processing on the Chinese retrieval information to obtain the single sentences corresponding to the Chinese retrieval information.
When the server determines that the character quantity of the Chinese retrieval information exceeds the preset standard amount, the server performs sentence-splitting processing on the Chinese retrieval information to obtain the corresponding single sentences. The sentence splitting falls into two cases. In the first case, the Chinese retrieval information is one very long complex sentence; to improve retrieval accuracy, the server splits the complex sentence into multiple parallel single sentences. In the second case, the Chinese retrieval information is an article or a paragraph, and the server divides it into multiple single sentences according to its punctuation.
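The second case (splitting an article or paragraph on its punctuation) can be sketched with a regular expression. The preset standard amount of 20 characters and the set of sentence-ending marks are assumptions for illustration; splitting one long complex sentence into parallel clauses, the first case, would need more than punctuation and is not covered here.

```python
import re

# Sentence-ending punctuation (full-width and ASCII) plus line breaks
SENTENCE_END = re.compile(r"[。！？；!?;\n]+")

def split_into_sentences(text, preset_standard_amount=20):
    """Return the text unchanged if it is short enough, otherwise its sentences."""
    if len(text) <= preset_standard_amount:
        return [text]
    parts = SENTENCE_END.split(text)
    return [part.strip() for part in parts if part.strip()]

# "I want to buy a house. What discounts do developers offer?
#  Please recommend articles about real estate!"
query = "我想买一套房产。开发商有哪些优惠？请推荐地产相关的文章！"
sentences = split_into_sentences(query)
print(sentences)
```

Each resulting single sentence is then segmented into keywords in the same way as a short query.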
Step S80: perform word segmentation on the single sentences obtained by sentence splitting to obtain the corresponding keywords, and obtain from the default retrieval dictionary the target index terms that approximate the keywords and the approximate words associated with the target index terms; use the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
The server performs word segmentation on the single sentences obtained by sentence splitting to obtain the corresponding keywords; the word segmentation of a single sentence can refer to the first embodiment and is not repeated in this embodiment. The server obtains the target index terms in the default retrieval dictionary that approximate the keywords; that is, the server compares the keywords with the index terms in the default retrieval dictionary and takes the index terms similar to the keywords as target index terms. The server then obtains the approximate words associated with the target index terms in the default retrieval dictionary and uses the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
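The lookup in step S80 can be sketched as below. The text does not define how a keyword is judged "similar" to an index term, so this sketch substitutes character-overlap matching from Python's `difflib` as a stand-in; a real system might compare term vectors instead. The dictionary contents and the function name are hypothetical.

```python
import difflib

def index_fields_for(keywords, retrieval_dictionary, cutoff=0.6):
    """Collect the target index terms and their approximate words, without duplicates."""
    fields = []
    index_terms = list(retrieval_dictionary)
    for keyword in keywords:
        # Closest index term by character overlap (stand-in for "similar")
        targets = difflib.get_close_matches(keyword, index_terms, n=1, cutoff=cutoff)
        for target in targets:
            for term in [target, *retrieval_dictionary[target]]:
                if term not in fields:
                    fields.append(term)
    return fields

# 房产 (house property), 地产 (real estate), 天气 (weather)
retrieval_dictionary = {"房产": ["地产"], "地产": ["房产"], "天气": []}
# Keywords from segmentation: 房产税 (property tax), 优惠 (discount)
index_fields = index_fields_for(["房产税", "优惠"], retrieval_dictionary)
print(index_fields)
```

Keywords with no sufficiently close index term, such as 优惠 here, contribute no index field.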
Step S90: query the default searching database, obtain the target articles corresponding to the index fields, and output the target articles as the search result.
After obtaining the target index terms corresponding to the Chinese retrieval information, the server queries the default searching database, where the default searching database is the database that the user's retrieval information is searched against, for example Baidu Library. The server obtains the target articles corresponding to the index fields and outputs the target articles as the search result corresponding to the retrieval request.
In this embodiment, the server performs sentence-splitting processing on Chinese retrieval information whose character quantity exceeds the preset standard amount and then performs information retrieval according to the steps in the first embodiment, making information retrieval more intelligent.
In addition, referring to Fig. 3, an embodiment of the present invention further proposes a Solr-based retrieval device, which includes:
A request receiving module 10, configured to receive an information retrieval request and obtain the retrieval information corresponding to the retrieval request;
A character quantity judgment module 20, configured to judge, when the retrieval information is Chinese retrieval information, whether the character quantity of the Chinese retrieval information exceeds the preset standard amount;
An index determination module 30, configured to obtain, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the index fields corresponding to the Chinese retrieval information in the default retrieval dictionary;
A result output module 40, configured to query the default searching database, obtain the target articles corresponding to the index fields, and output the target articles as the search result.
Optionally, the Solr-based retrieval device includes:
A sample processing module, configured to crawl text data from the network, perform word segmentation on the text data using the default Chinese word segmentation algorithm to obtain the corresponding words, and gather the words to form a sample set;
A frequency statistics module, configured to count the frequency of occurrence of each identical word in the sample set and sort the words by frequency of occurrence from high to low to form a word list;
A word training module, configured to select a preset quantity of top-ranked words in the word list as index terms, form a basic dictionary from the index terms, and convert the index terms in the basic dictionary into corresponding term vectors by means of the default term vector model;
A dictionary generation module, configured to determine the approximate words of each index term according to the term vector of each index term, save each index term in association with its corresponding approximate words, and generate the default retrieval dictionary.
Optionally, the dictionary generation module includes:
A cosine calculation unit, configured to take each index term in the basic dictionary as the first index term and calculate the cosine value between the term vector of the first index term and the term vector of each second index term in the basic dictionary other than the first index term;
A similar word query unit, configured to obtain, when there is a target cosine value greater than the default cosine value, the approximate index term corresponding to the target cosine value, and use the approximate index term as an approximate word of the first index term;
A saving and generation unit, configured to save the first index term in association with its corresponding approximate words and generate the default retrieval dictionary.
Optionally, the index determination module 30 includes:
A word segmentation unit, configured to perform, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, word segmentation on the Chinese retrieval information using the default Chinese word segmentation algorithm to obtain the keyword set corresponding to the Chinese retrieval information;
A word comparison unit, configured to compare the keywords in the keyword set with the index terms in the default retrieval dictionary, and obtain the target index terms similar to each keyword and the approximate words associated with the target index terms;
An index field determination unit, configured to use the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
Optionally, the result output module 40 includes:
An information combination unit, configured to combine the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, query the default searching database, and obtain the target articles corresponding to the retrieval formula;
An article sorting unit, configured to set the weight of each index field in the retrieval formula according to the default weight mapping table and sort the target articles by the weights of the index fields to form an article sorted list;
An information output unit, configured to output the article sorted list as the search result.
Optionally, the Solr-based retrieval device includes:
An article marking module, configured to receive user behavior data based on the article sorted list and, according to the number of views and the browsing time in the user behavior data, determine and mark the retrieval articles in the article sorted list that the user pays attention to;
A marked article output module, configured to output the marked retrieval articles for the user to query when a browsing record query instruction is received.
Optionally, the Solr-based retrieval device includes:
A sentence-splitting processing module, configured to perform, when the character quantity of the Chinese retrieval information exceeds the preset standard amount, sentence-splitting processing on the Chinese retrieval information to obtain the single sentences corresponding to the Chinese retrieval information;
A word comparison module, configured to perform word segmentation on the single sentences obtained by sentence splitting to obtain the corresponding keywords, obtain from the default retrieval dictionary the target index terms that approximate the keywords and the approximate words associated with the target index terms, and use the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information;
A query output module, configured to query the default searching database, obtain the target articles corresponding to the index fields, and output the target articles as the search result.
For the steps implemented by each functional module of the Solr-based retrieval device, refer to the embodiments of the Solr-based search method of the present invention; they are not repeated here.
In addition, an embodiment of the present invention further proposes a computer storage medium.
A computer program is stored in the computer storage medium, and when executed by a processor, the computer program implements the operations of the Solr-based search method provided by the above embodiments.
It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity/operation/object from another, and do not necessarily require or imply any actual relationship or order between these entities/operations/objects. The terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or system that includes the element.
Because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the description of the method embodiments. The device embodiments described above are merely illustrative; the units described as separate parts may or may not be physically separate. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present invention. Those of ordinary skill in the art can understand and implement it without creative effort.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the advantages or disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, magnetic disk, or optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, etc.) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A Solr-based search method, characterized in that the Solr-based search method comprises the following steps:
receiving an information retrieval request, and obtaining the retrieval information corresponding to the retrieval request;
when the retrieval information is Chinese retrieval information, judging whether the character quantity of the Chinese retrieval information exceeds a preset standard amount;
when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, obtaining the index field corresponding to the Chinese retrieval information in a default retrieval dictionary;
querying a default searching database, obtaining the target article corresponding to the index field, and outputting the target article as a search result.
2. The Solr-based search method according to claim 1, characterized in that, before the step of obtaining, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the index field corresponding to the Chinese retrieval information in the default retrieval dictionary, the method comprises:
crawling text data from the network, performing word segmentation on the text data using a default Chinese word segmentation algorithm to obtain the corresponding words, and gathering the words to form a sample set;
counting the frequency of occurrence of each identical word in the sample set, and sorting the words by frequency of occurrence from high to low to form a word list;
selecting a preset quantity of top-ranked words in the word list as index terms, forming a basic dictionary from the index terms, and converting the index terms in the basic dictionary into corresponding term vectors by means of a default term vector model;
determining the approximate words of each index term according to the term vector of each index term, saving each index term in association with its corresponding approximate words, and generating the default retrieval dictionary.
3. The Solr-based search method according to claim 2, characterized in that the step of determining the approximate words of each index term according to the term vector of each index term, saving each index term in association with its corresponding approximate words, and generating the default retrieval dictionary comprises:
taking each index term in the basic dictionary as a first index term, and calculating the cosine value between the term vector of the first index term and the term vector of each second index term in the basic dictionary other than the first index term;
when there is a target cosine value greater than a default cosine value, obtaining the approximate index term corresponding to the target cosine value, and using the approximate index term as an approximate word of the first index term;
saving the first index term in association with its corresponding approximate words, and generating the default retrieval dictionary.
4. The Solr-based search method according to claim 1, characterized in that the step of obtaining, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the index field corresponding to the Chinese retrieval information in the default retrieval dictionary comprises:
when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, performing word segmentation on the Chinese retrieval information using a default Chinese word segmentation algorithm to obtain the keyword set corresponding to the Chinese retrieval information;
comparing the keywords in the keyword set with the index terms in the default retrieval dictionary, and obtaining the target index terms similar to each keyword and the approximate words associated with the target index terms;
using the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
5. The Solr-based search method according to claim 1, characterized in that the step of querying the default searching database, obtaining the target article corresponding to the index field, and outputting the target article as a search result comprises:
combining the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, querying the default searching database, and obtaining the target articles corresponding to the retrieval formula;
setting the weight of each index field in the retrieval formula according to a default weight mapping table, and sorting the target articles by the weights of the index fields to form an article sorted list;
outputting the article sorted list as the search result.
6. The Solr-based search method according to claim 5, characterized in that, after the step of querying the default searching database, obtaining the target article corresponding to the index field, and outputting the target article as a search result, the method comprises:
receiving user behavior data based on the article sorted list, and, according to the number of views and the browsing time in the user behavior data, determining and marking the retrieval articles in the article sorted list that the user pays attention to;
when a browsing record query instruction is received, outputting the marked retrieval articles for the user to query.
7. The Solr-based search method according to claim 1, characterized in that, after the step of judging, when the retrieval information is Chinese retrieval information, whether the character quantity of the Chinese retrieval information exceeds the preset standard amount, the method comprises:
when the character quantity of the Chinese retrieval information exceeds the preset standard amount, performing sentence-splitting processing on the Chinese retrieval information to obtain the single sentences corresponding to the Chinese retrieval information;
performing word segmentation on the single sentences obtained by sentence splitting to obtain the corresponding keywords, and obtaining from the default retrieval dictionary the target index terms that approximate the keywords and the approximate words associated with the target index terms;
using the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information;
querying the default searching database, obtaining the target articles corresponding to the index fields, and outputting the target articles as the search result.
8. A Solr-based retrieval device, characterized in that the Solr-based retrieval device comprises:
a request receiving module, configured to receive an information retrieval request and obtain the retrieval information corresponding to the retrieval request;
a character quantity judgment module, configured to judge, when the retrieval information is Chinese retrieval information, whether the character quantity of the Chinese retrieval information exceeds a preset standard amount;
an index determination module, configured to obtain, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the index field corresponding to the Chinese retrieval information in a default retrieval dictionary;
a result output module, configured to query a default searching database, obtain the target article corresponding to the index field, and output the target article as a search result.
9. A Solr-based retrieval apparatus, characterized in that the Solr-based retrieval apparatus comprises: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein:
when executed by the processor, the computer program implements the steps of the Solr-based search method according to any one of claims 1 to 7.
10. A computer storage medium, characterized in that a computer program is stored in the computer storage medium, and when executed by a processor, the computer program implements the steps of the Solr-based search method according to any one of claims 1 to 7.
CN201910205809.0A 2019-03-16 2019-03-16 Solr-based retrieval method, solr-based retrieval device, solr-based retrieval equipment and storage medium Active CN110069610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910205809.0A CN110069610B (en) 2019-03-16 2019-03-16 Solr-based retrieval method, solr-based retrieval device, solr-based retrieval equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910205809.0A CN110069610B (en) 2019-03-16 2019-03-16 Solr-based retrieval method, solr-based retrieval device, solr-based retrieval equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110069610A true CN110069610A (en) 2019-07-30
CN110069610B CN110069610B (en) 2024-03-19

Family

ID=67365343

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910205809.0A Active CN110069610B (en) 2019-03-16 2019-03-16 Solr-based retrieval method, solr-based retrieval device, solr-based retrieval equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110069610B (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619067A (en) * 2019-08-27 2019-12-27 深圳证券交易所 Industry classification-based retrieval method and retrieval device and readable storage medium
CN110705302A (en) * 2019-10-11 2020-01-17 掌阅科技股份有限公司 Named entity recognition method, electronic device and computer storage medium
CN110941702A (en) * 2019-11-26 2020-03-31 北京明略软件系统有限公司 Retrieval method and device for laws and regulations and laws and readable storage medium
CN111078960A (en) * 2019-12-20 2020-04-28 金现代信息产业股份有限公司 Method and system for realizing real-time retrieval of power dispatching system equipment
CN111209378A (en) * 2019-12-26 2020-05-29 航天信息股份有限公司企业服务分公司 Ordered hierarchical ordering method based on business dictionary weight
CN111223533A (en) * 2019-12-24 2020-06-02 深圳市联影医疗数据服务有限公司 Medical data retrieval method and system
CN111708942A (en) * 2020-06-12 2020-09-25 北京达佳互联信息技术有限公司 Multimedia resource pushing method, device, server and storage medium
CN111767378A (en) * 2020-06-24 2020-10-13 北京墨丘科技有限公司 Method and device for intelligently recommending scientific and technical literature
CN111859091A (en) * 2020-07-21 2020-10-30 山东省科院易达科技咨询有限公司 Search result aggregation method and device based on artificial intelligence
CN112052309A (en) * 2020-09-07 2020-12-08 深圳壹账通智能科技有限公司 Text data retrieval method, related equipment and readable storage medium
CN112380445A (en) * 2020-11-30 2021-02-19 深圳前海微众银行股份有限公司 Data query method, device, equipment and storage medium
CN112749162A (en) * 2020-12-31 2021-05-04 浙江省方大标准信息有限公司 ES-based rapid retrieval and sorting method for inspection and detection mechanism
CN114186059A (en) * 2021-11-01 2022-03-15 东风汽车集团股份有限公司 Article classification method and device
CN115455147A (en) * 2022-09-09 2022-12-09 浪潮卓数大数据产业发展有限公司 Full-text retrieval method and system
CN115495483A (en) * 2022-09-21 2022-12-20 企查查科技有限公司 Data batch processing method, device, equipment and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010267247A (en) * 2010-02-08 2010-11-25 Ntt Data Corp Device and method for retrieving information, terminal equipment, and program
WO2014087424A2 (en) * 2012-12-03 2014-06-12 Parthys Reverse Informatics Analytic Solutions (P) Ltd. Information retrieval, extraction and visualisation
CN108038096A (en) * 2017-11-10 2018-05-15 平安科技(深圳)有限公司 Method for quickly retrieving knowledge base documents, application server, and computer-readable storage medium
CN108509474A (en) * 2017-09-15 2018-09-07 腾讯科技(深圳)有限公司 Synonym expansion method and device for search information

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619067A (en) * 2019-08-27 2019-12-27 深圳证券交易所 Industry classification-based retrieval method and retrieval device and readable storage medium
CN110705302A (en) * 2019-10-11 2020-01-17 掌阅科技股份有限公司 Named entity recognition method, electronic device and computer storage medium
CN110705302B (en) * 2019-10-11 2023-12-12 掌阅科技股份有限公司 Named entity identification method, electronic equipment and computer storage medium
CN110941702A (en) * 2019-11-26 2020-03-31 北京明略软件系统有限公司 Retrieval method and device for laws, regulations and legal provisions, and readable storage medium
CN111078960B (en) * 2019-12-20 2023-09-05 金现代信息产业股份有限公司 Method and system for realizing real-time retrieval of power dispatching system equipment
CN111078960A (en) * 2019-12-20 2020-04-28 金现代信息产业股份有限公司 Method and system for realizing real-time retrieval of power dispatching system equipment
CN111223533A (en) * 2019-12-24 2020-06-02 深圳市联影医疗数据服务有限公司 Medical data retrieval method and system
CN111223533B (en) * 2019-12-24 2024-02-13 深圳市联影医疗数据服务有限公司 Medical data retrieval method and system
CN111209378A (en) * 2019-12-26 2020-05-29 航天信息股份有限公司企业服务分公司 Ordered hierarchical ordering method based on business dictionary weight
CN111209378B (en) * 2019-12-26 2024-03-12 航天信息股份有限公司企业服务分公司 Ordered hierarchical ordering method based on business dictionary weights
CN111708942A (en) * 2020-06-12 2020-09-25 北京达佳互联信息技术有限公司 Multimedia resource pushing method, device, server and storage medium
CN111708942B (en) * 2020-06-12 2023-08-08 北京达佳互联信息技术有限公司 Multimedia resource pushing method, device, server and storage medium
CN111767378A (en) * 2020-06-24 2020-10-13 北京墨丘科技有限公司 Method and device for intelligently recommending scientific and technical literature
CN111859091A (en) * 2020-07-21 2020-10-30 山东省科院易达科技咨询有限公司 Search result aggregation method and device based on artificial intelligence
CN111859091B (en) * 2020-07-21 2021-06-04 山东省科院易达科技咨询有限公司 Search result aggregation method and device based on artificial intelligence
CN112052309A (en) * 2020-09-07 2020-12-08 深圳壹账通智能科技有限公司 Text data retrieval method, related equipment and readable storage medium
CN112380445A (en) * 2020-11-30 2021-02-19 深圳前海微众银行股份有限公司 Data query method, device, equipment and storage medium
CN112749162B (en) * 2020-12-31 2021-08-17 浙江省方大标准信息有限公司 ES-based rapid retrieval and sorting method for inspection and detection mechanism
CN112749162A (en) * 2020-12-31 2021-05-04 浙江省方大标准信息有限公司 ES-based rapid retrieval and sorting method for inspection and detection mechanism
CN114186059A (en) * 2021-11-01 2022-03-15 东风汽车集团股份有限公司 Article classification method and device
CN115455147A (en) * 2022-09-09 2022-12-09 浪潮卓数大数据产业发展有限公司 Full-text retrieval method and system
CN115495483A (en) * 2022-09-21 2022-12-20 企查查科技有限公司 Data batch processing method, device, equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN110069610B (en) 2024-03-19

Similar Documents

Publication Publication Date Title
CN110069610A (en) Search method, device, equipment and storage medium based on Solr
US8468156B2 (en) Determining a geographic location relevant to a web page
US6594654B1 (en) Systems and methods for continuously accumulating research information via a computer network
US8478749B2 (en) Method and apparatus for determining relevant search results using a matrix framework
US9881037B2 (en) Method for systematic mass normalization of titles
US10552467B2 (en) System and method for language sensitive contextual searching
US20120265787A1 (en) Identifying query formulation suggestions for low-match queries
US20120131033A1 (en) Automated scheme for identifying user intent in real-time
US20060161543A1 (en) Systems and methods for providing search results based on linguistic analysis
WO2016100835A1 (en) Question answering from structured and unstructured data sources
Im et al. Linked tag: image annotation using semantic relationships between image tags
WO2009039392A1 (en) A system for entity search and a method for entity scoring in a linked document database
CN107085583B (en) Electronic document management method and device based on content
CN103136228A (en) Image search method and image search device
US9971782B2 (en) Document tagging and retrieval using entity specifiers
CN102200974A (en) Unified information retrieval intelligent agent system and method for search engine
CN110245357B (en) Main entity identification method and device
US20090265383A1 (en) System and method for providing image labeling game using cbir
US9305103B2 (en) Method or system for semantic categorization
CN111949755B (en) Information query method and device for hazardous chemicals, electronic equipment and medium
CN114064606A (en) Database migration method, device, equipment, storage medium and system
CN111222918A (en) Keyword mining method and device, electronic equipment and storage medium
CN112613320A (en) Method and device for acquiring similar sentences, storage medium and electronic equipment
WO2021210210A1 (en) Document retrieval device, document retrieval system, and document retrieval method
de La Cruz-Caicedo et al. Semantic annotation of SOAP web services based on word sense disambiguation techniques

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant