CN110069610A - Search method, device, equipment and storage medium based on Solr - Google Patents
- Publication number
- CN110069610A (application CN201910205809.0A)
- Authority
- CN
- China
- Prior art keywords
- retrieval
- information
- chinese
- default
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3322—Query formulation using system suggestions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/374—Thesaurus
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a Solr-based retrieval method comprising the following steps: receiving an information retrieval request and obtaining the retrieval information corresponding to the retrieval request; when the retrieval information is Chinese retrieval information, judging whether the character quantity of the Chinese retrieval information exceeds a preset standard amount; when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, obtaining the index field corresponding to the Chinese retrieval information in a preset retrieval dictionary; and querying a preset retrieval database, obtaining the target article corresponding to the index field, and outputting the target article as the retrieval result. The invention also discloses a Solr-based retrieval apparatus, device and storage medium. The preset retrieval dictionary is built by analysing a large volume of text data, and during retrieval the retrieval information is converted into the corresponding index fields through this dictionary, which makes information retrieval more comprehensive and accurate.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a Solr-based retrieval method, apparatus, device and storage medium.
Background
With the widespread use of big data, more and more data flows into our daily lives. The growth in data volume brings convenience, but it also brings corresponding problems.
For example, obtaining the information one needs from a huge database has become a major problem. Current information retrieval is usually based on exact matching or fuzzy matching. Because the meaning of the retrieval information cannot be parsed, the information the user needs cannot be retrieved accurately and comprehensively. How to perform information retrieval more accurately and comprehensively has therefore become a technical problem that urgently needs to be solved.
Summary of the invention
The main purpose of the present invention is to provide a Solr-based retrieval method, apparatus, device and storage medium, aiming to solve the technical problem that current information retrieval is inaccurate and incomplete.
To achieve the above object, the present invention provides a Solr-based retrieval method, which comprises the following steps:
Receiving an information retrieval request and obtaining the retrieval information corresponding to the retrieval request;
When the retrieval information is Chinese retrieval information, judging whether the character quantity of the Chinese retrieval information exceeds a preset standard amount;
When the character quantity of the Chinese retrieval information does not exceed the preset standard amount, obtaining the index field corresponding to the Chinese retrieval information in a preset retrieval dictionary;
Querying a preset retrieval database, obtaining the target article corresponding to the index field, and outputting the target article as the retrieval result.
Optionally, before the step of obtaining the index field corresponding to the Chinese retrieval information in the preset retrieval dictionary when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the method comprises:
Crawling text data from the network, performing word segmentation on the text data with a preset Chinese word segmentation algorithm to obtain the corresponding words, and collecting the words into a sample set;
Counting the frequency of occurrence of identical words in the sample set, and sorting the words by frequency from high to low to form a word list;
Selecting a preset quantity of the top-ranked words in the word list as index words, forming a basic dictionary from the index words, and converting the index words in the basic dictionary into corresponding word vectors with a preset word vector model;
Determining the approximate words of each index word according to the word vectors of the index words, and saving each index word in association with its approximate words to generate the preset retrieval dictionary.
Optionally, the step of determining the approximate words of each index word according to the word vectors of the index words, and saving each index word in association with its approximate words to generate the preset retrieval dictionary, comprises:
Taking each index word in the basic dictionary in turn as a first index word, and calculating the cosine value between the word vector of the first index word and the word vector of each second index word, that is, each index word in the basic dictionary other than the first index word;
When there is a target cosine value greater than a preset cosine value, obtaining the approximate index word corresponding to the target cosine value, and taking that approximate index word as an approximate word of the first index word;
Saving the first index word in association with its approximate words to generate the preset retrieval dictionary.
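The cosine comparison described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the toy two-dimensional vectors and the threshold value 0.9 are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)

def build_retrieval_dictionary(vectors, min_cosine):
    """Map each index word to the other index words whose cosine exceeds min_cosine."""
    dictionary = {}
    for first, first_vec in vectors.items():
        dictionary[first] = [
            second
            for second, second_vec in vectors.items()
            if second != first and cosine(first_vec, second_vec) > min_cosine
        ]
    return dictionary

# Toy word vectors (hypothetical values, for illustration only).
toy_vectors = {
    "house property": [1.0, 0.1],
    "real estate": [0.9, 0.2],
    "weather": [0.0, 1.0],
}
retrieval_dictionary = build_retrieval_dictionary(toy_vectors, min_cosine=0.9)
```

With these toy values, "house property" and "real estate" become approximate words of each other, while "weather" has none. The pairwise loop is quadratic in the number of index words, so a real dictionary of thousands of words would likely need a vectorised or approximate nearest-neighbour search.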
Optionally, the step of obtaining the index field corresponding to the Chinese retrieval information in the preset retrieval dictionary when the character quantity of the Chinese retrieval information does not exceed the preset standard amount comprises:
When the character quantity of the Chinese retrieval information does not exceed the preset standard amount, performing word segmentation on the Chinese retrieval information with the preset Chinese word segmentation algorithm to obtain the keyword set corresponding to the Chinese retrieval information;
Comparing the keywords in the keyword set with the index words in the preset retrieval dictionary, and obtaining the target index word similar to each keyword and the approximate words associated with that target index word;
Taking the target index words and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
Optionally, the step of querying the preset retrieval database, obtaining the target article corresponding to the index field, and outputting the target article as the retrieval result comprises:
Combining the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, and querying the preset retrieval database to obtain the target articles corresponding to the retrieval formula;
Setting the weight of each index field in the retrieval formula according to a preset weight mapping table, and sorting the target articles by the weights of the index fields to form an article ranking list;
Outputting the article ranking list as the retrieval result.
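One way to picture the weight-based ranking is the sketch below. The text does not say how the weights are combined into an ordering, so the scoring rule here (sum the weights of the index fields an article contains) is an assumption, as are the function name, the default weight and the sample data.

```python
def rank_articles(articles, index_fields, weight_table, default_weight=1.0):
    """Return (article_id, score) pairs sorted by descending score.

    articles maps an article id to its text. An article matches when it
    contains at least one index field; its score is the sum of the weights
    of the index fields it contains (an assumed scoring rule).
    """
    scored = []
    for article_id, text in articles.items():
        score = sum(
            weight_table.get(field, default_weight)
            for field in index_fields
            if field in text
        )
        if score > 0:
            scored.append((article_id, score))
    # Highest score first; ties are broken by article id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored

# Hypothetical weight mapping table and articles.
weights = {"house property": 3.0, "real estate": 2.0, "house price": 1.5}
docs = {
    "a1": "real estate market and house price trends this year",
    "a2": "taxes on house property transactions",
    "a3": "weather forecast for today",
}
ranking = rank_articles(docs, list(weights), weights)
```

Here "a1" scores 3.5 and "a2" scores 3.0, while "a3" matches no index field and is left out of the ranking list.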
Optionally, after the step of querying the preset retrieval database, obtaining the target article corresponding to the index field, and outputting the target article as the retrieval result, the method comprises:
Receiving user behavior data based on the article ranking list, and determining and marking the retrieved articles in the article ranking list that the user is interested in according to the number of views and the browsing time in the user behavior data;
When a browsing record query instruction is received, outputting the marked retrieved articles for the user to consult.
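The marking step can be illustrated with a short sketch. It assumes that the two behavior signals are the number of times an article was opened and the total time spent on it, and that an article counts as "of interest" when either signal reaches a threshold. The thresholds, the event format and the function name are hypothetical.

```python
from collections import defaultdict

def mark_articles_of_interest(events, min_views=2, min_seconds=60.0):
    """Return the ids of articles the user appears to care about.

    events is an iterable of (article_id, seconds_viewed) pairs, one per view.
    An article is marked when it was viewed at least min_views times or for
    at least min_seconds in total (assumed rule).
    """
    views = defaultdict(int)
    seconds = defaultdict(float)
    for article_id, seconds_viewed in events:
        views[article_id] += 1
        seconds[article_id] += seconds_viewed
    return {
        article_id
        for article_id in views
        if views[article_id] >= min_views or seconds[article_id] >= min_seconds
    }

# Hypothetical browsing log: a1 is opened twice, a2 is read for a long time.
log = [("a1", 5.0), ("a1", 8.0), ("a2", 90.0), ("a3", 3.0)]
marked = mark_articles_of_interest(log)
```

A browsing record query would then simply return the stored `marked` set for that user.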
Optionally, after the step of judging whether the character quantity of the Chinese retrieval information exceeds the preset standard amount when the retrieval information is Chinese retrieval information, the method comprises:
When the character quantity of the Chinese retrieval information exceeds the preset standard amount, splitting the Chinese retrieval information into sentences to obtain the single sentences corresponding to the Chinese retrieval information;
Performing word segmentation on the single sentences obtained by sentence splitting to obtain the corresponding keywords, and obtaining from the preset retrieval dictionary the target index words similar to the keywords and the approximate words associated with those target index words;
Taking the target index words and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information;
Querying the preset retrieval database, obtaining the target article corresponding to the index field, and outputting the target article as the retrieval result.
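For the long-query branch, sentence splitting might look like the sketch below. The text does not specify the splitting rule, so treating full-width and half-width sentence-ending punctuation and line breaks as boundaries is an assumption.

```python
import re

# Sentence-ending punctuation (full-width and half-width) and line breaks.
_SENTENCE_END = re.compile(r"[。！？；!?;\n]+")

def split_sentences(text):
    """Split text into single sentences, dropping empty fragments."""
    return [part.strip() for part in _SENTENCE_END.split(text) if part.strip()]

sentences = split_sentences("如何提高检索准确度？信息检索很重要。")
```

Each resulting sentence would then go through the same word segmentation and dictionary lookup as a short query.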
In addition, to achieve the above object, the present invention also provides a Solr-based retrieval apparatus, which comprises:
A request receiving module, configured to receive an information retrieval request and obtain the retrieval information corresponding to the retrieval request;
A character quantity judgment module, configured to judge, when the retrieval information is Chinese retrieval information, whether the character quantity of the Chinese retrieval information exceeds the preset standard amount;
An index determination module, configured to obtain, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the index field corresponding to the Chinese retrieval information in the preset retrieval dictionary;
A result output module, configured to query the preset retrieval database, obtain the target article corresponding to the index field, and output the target article as the retrieval result.
In addition, to achieve the above object, the present invention also provides a Solr-based retrieval device;
The Solr-based retrieval device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein:
When executed by the processor, the computer program implements the steps of the Solr-based retrieval method described above.
In addition, to achieve the above object, the present invention also provides a computer storage medium;
A computer program is stored on the computer storage medium, and when executed by a processor, the computer program implements the steps of the Solr-based retrieval method described above.
The Solr-based retrieval method, apparatus, device and storage medium proposed by the embodiments of the present invention receive an information retrieval request and obtain the corresponding retrieval information; when the retrieval information is Chinese retrieval information, judge whether its character quantity exceeds the preset standard amount; when the character quantity does not exceed the preset standard amount, obtain the index field corresponding to the Chinese retrieval information in the preset retrieval dictionary; and query the preset retrieval database, obtain the target article corresponding to the index field, and output the target article as the retrieval result. In the present invention, the server crawls massive text data from the network and generates the preset retrieval dictionary by analysing that data. During retrieval, the server obtains the retrieval information and then identifies its information type and character quantity. After the server determines that the retrieval information is Chinese retrieval information not exceeding the preset standard amount, it converts the Chinese retrieval information into the corresponding index fields using the preset retrieval dictionary and then retrieves with the determined index fields. This avoids the missed results that occur when the Chinese retrieval information is used directly, and it allows the semantics of the Chinese retrieval information to be recognised effectively, so that information retrieval is more comprehensive and accurate.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the device in the hardware operating environment involved in the embodiments of the present invention;
Fig. 2 is a flow diagram of the first embodiment of the Solr-based retrieval method of the present invention;
Fig. 3 is a functional block diagram of one embodiment of the Solr-based retrieval apparatus of the present invention.
The realisation of the object, the functional features and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
As shown in Fig. 1, Fig. 1 is a schematic structural diagram of the server in the hardware operating environment involved in the embodiments of the present invention (also called the Solr-based retrieval device; the Solr-based retrieval device may consist of the Solr-based retrieval apparatus alone, or may be formed by combining other apparatuses with the Solr-based retrieval apparatus).
In the embodiments of the present invention, a server is a computer that manages resources and provides services to users; servers are generally divided into file servers, database servers and application servers. A computer or computer system running such software is also called a server. Compared with an ordinary personal computer (PC), a server has higher requirements in terms of stability, security, performance and so on. As shown in Fig. 1, the server may include a processor 1001, such as a central processing unit (CPU), a network interface 1004, a user interface 1003, a memory 1005, a communication bus 1002, and hardware such as a chipset, a disk system and network hardware. The communication bus 1002 is used to realise connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wireless Fidelity, WiFi, interface). The memory 1005 may be a high-speed random access memory (RAM) or a non-volatile memory, such as a disk memory. The memory 1005 may optionally also be a storage device independent of the aforementioned processor 1001.
Optionally, the server may also include a camera, an RF (radio frequency) circuit, sensors, an audio circuit and a WiFi module; input units such as a display screen and a touch screen; and, besides WiFi, the wireless options of the network interface may include Bluetooth, probes and the like. Those skilled in the art will understand that the server structure shown in Fig. 1 does not limit the server, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
As shown in Fig. 1, the computer software product is stored in a storage medium (also called a computer storage medium, computer medium, readable medium, readable storage medium, computer-readable storage medium, or simply a medium; the storage medium may be a non-volatile readable storage medium, such as RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the methods described in the embodiments of the present invention. The memory 1005, as a kind of computer storage medium, may include an operating system, a network communication module, a user interface module and a computer program.
In the server shown in Fig. 1, the network interface 1004 is mainly used to connect to a background database and exchange data with it. The user interface 1003 is mainly used to connect to a client and exchange data with it. (The client is also called a user terminal or terminal. In the embodiments of the present invention the terminal may be a fixed terminal or a mobile terminal, for example a networked smart air conditioner, smart lamp, smart power supply, smart speaker, self-driving car, PC, smartphone, tablet computer, e-book reader or portable computer. The terminal contains sensors such as light sensors, motion sensors and other sensors, which are not described further here.) The processor 1001 may be used to call the computer program stored in the memory 1005 and execute the steps of the Solr-based retrieval method provided by the following embodiments of the present invention.
This embodiment proposes a Solr-based retrieval method applied to the server shown in Fig. 1. The Solr used in the embodiments of the present invention is developed in Java and is mainly built on the Hypertext Transfer Protocol (HTTP) and Apache Lucene (an open-source full-text search engine toolkit). In other words, Solr is a standalone search application server that exposes an API similar to a web service.
Referring to Fig. 2, in the first embodiment of the Solr-based retrieval method of the present invention, the method includes:
Step S10: receive an information retrieval request and obtain the retrieval information corresponding to the retrieval request.
The server receives an information retrieval request and then obtains the retrieval information corresponding to it. The way in which the request is triggered is not specifically limited. For example, a user enters the sentence "how to improve the accuracy of information retrieval" on a terminal and triggers an information retrieval request based on it; the terminal sends the request to the server, the server receives it, and the server takes "how to improve the accuracy of information retrieval" as the retrieval information corresponding to the request. As another example, a user supplies an article that needs a duplicate check and says "search for similar articles" by voice input, which triggers an information retrieval request; the terminal sends the request to the server, the server receives it, and the server takes the article to be checked as the retrieval information corresponding to the request.
It should be noted that the retrieval information may be of different types, for example Chinese or English. After obtaining the retrieval information, the server first determines from its character types whether it is Chinese. If the retrieval information is in a foreign language, the server can translate it into Chinese retrieval information. When the retrieval information is Chinese retrieval information, the following steps are executed:
Step S20: when the retrieval information is Chinese retrieval information, judge whether the character quantity of the Chinese retrieval information exceeds the preset standard amount.
After the server determines from the character types that the retrieval information is Chinese retrieval information, it obtains the character quantity (also called the information content) of the retrieval information and compares it with the preset standard amount. The preset standard amount is a pre-set critical value for the character quantity; for example, it may be set to 100 bytes. If the character quantity does not exceed the preset standard amount, that is, the retrieval information obtained by the server is short, the server performs retrieval in the word-conversion mode, specifically:
Step S30: when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, obtain the index field corresponding to the Chinese retrieval information in the preset retrieval dictionary.
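The language check and the length check that select the retrieval mode can be sketched as follows. This is only an illustration under several assumptions: "Chinese" is detected by the share of characters in the CJK Unified Ideographs block (U+4E00 to U+9FFF), the threshold is applied to the character count (the text's example gives it in bytes), and the function names and mode labels are hypothetical.

```python
def is_chinese(text, min_ratio=0.5):
    """True if at least min_ratio of the non-space characters are CJK ideographs."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return False
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars) >= min_ratio

def choose_mode(text, preset_standard_amount=100):
    """Pick the processing branch for a piece of retrieval information."""
    if not is_chinese(text):
        return "translate"  # foreign-language input is translated to Chinese first
    if len(text) <= preset_standard_amount:
        return "short"      # word-conversion retrieval (step S30)
    return "long"           # sentence splitting before word segmentation

mode = choose_mode("如何提高信息检索的准确度")
```

If the threshold really is meant in bytes, `len(text.encode("utf-8"))` could replace `len(text)`; the two differ by roughly a factor of three for Chinese text.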
When the server determines that the character quantity of the Chinese retrieval information does not exceed the preset standard amount, it needs to process the Chinese retrieval information in order to search accurately and quickly, that is:
Step S31: perform word segmentation on the Chinese retrieval information with the preset Chinese word segmentation algorithm to obtain the keyword set corresponding to the Chinese retrieval information.
The server segments the Chinese retrieval information with the preset Chinese word segmentation algorithm and obtains the keyword set corresponding to the retrieval information. The preset Chinese word segmentation algorithm is a pre-set algorithm that cuts a sequence of Chinese characters into individual words; for example, it may be a segmentation algorithm based on string matching or a segmentation algorithm based on statistics.
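As one concrete instance of a string-matching segmentation algorithm, forward maximum matching can be sketched as below. The text does not say which algorithm is actually used, so this is an illustrative choice; the vocabulary and the maximum word length are assumptions.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Segment text greedily, always taking the longest vocabulary word first.

    Characters that start no vocabulary word are emitted as single-character
    tokens, so the loop always advances.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens

# Hypothetical segmentation vocabulary.
vocab = {"如何", "提高", "信息", "检索", "信息检索", "准确度"}
keywords = forward_max_match("如何提高信息检索的准确度", vocab)
```

Because the longest match wins, "信息检索" is kept as one keyword instead of being split into "信息" and "检索". In practice a stop-word filter would usually drop tokens such as "的" before the dictionary lookup.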
Step S32: compare the keywords in the keyword set with the index words in the preset retrieval dictionary, and obtain the target index word similar to each keyword and the approximate words associated with that target index word; take the target index words and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
The server compares the keywords in the keyword set with each index word in the preset retrieval dictionary. (The preset retrieval dictionary is a pre-built dictionary of index words; for example, the server crawls massive data from the network, selects 50,000 index words to build the retrieval dictionary, and stores for each index word in the dictionary its approximate words.) The server calculates the similarity between each keyword in the keyword set and each index word in the preset retrieval dictionary, obtains the index word with the highest similarity to the keyword, and takes that index word as the target index word matching the keyword. After obtaining the target index word, the server obtains the approximate words associated with it in the preset retrieval dictionary, and takes the target index word and its associated approximate words as the index fields corresponding to the Chinese retrieval information.
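The lookup in step S32 can be sketched as follows. The similarity measure is not specified in the text, so this example substitutes the string similarity ratio from Python's standard `difflib` module; the minimum similarity of 0.5, the function name and the sample dictionary are likewise assumptions.

```python
from difflib import SequenceMatcher

def to_index_fields(keywords, retrieval_dictionary, min_similarity=0.5):
    """Expand keywords into index fields using the retrieval dictionary.

    For each keyword, the most similar index word becomes the target index
    word; the target and its approximate words are added to the result.
    Keywords with no sufficiently similar index word are skipped.
    """
    fields = []
    for keyword in keywords:
        target, best = None, 0.0
        for index_word in retrieval_dictionary:
            score = SequenceMatcher(None, keyword, index_word).ratio()
            if score > best:
                target, best = index_word, score
        if target is None or best < min_similarity:
            continue
        for field in [target, *retrieval_dictionary[target]]:
            if field not in fields:  # keep first-seen order, no duplicates
                fields.append(field)
    return fields

# Hypothetical retrieval dictionary: index word -> approximate words.
dictionary = {"房产": ["房地产", "开发商", "房价"], "天气": ["气温"]}
index_fields = to_index_fields(["房产", "的"], dictionary)
```

A similarity based on the word vectors from the dictionary-building stage would be closer in spirit to the rest of the method; string similarity is used here only to keep the example self-contained.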
In this embodiment a retrieval dictionary is preset, and the server uses it to convert the Chinese retrieval information corresponding to the retrieval request into the corresponding index fields. This makes retrieval more comprehensive and prevents omissions.
Step S40: query the preset retrieval database, obtain the target article corresponding to the index field, and output the target article as the retrieval result.
After obtaining the index fields corresponding to the Chinese retrieval information, the server queries the preset retrieval database. The preset retrieval database is the database against which the user's retrieval information is searched, for example the Baidu library. The server obtains the target articles corresponding to the index fields and outputs them as the retrieval result corresponding to the retrieval request.
The server queries the preset retrieval database and obtains the target articles that contain the index fields. For example, suppose the approximate words associated with "house property" in the server's preset retrieval dictionary include "real estate", "developer" and "house price". When the server receives the Chinese retrieval information "how much do flats cost now", it can convert it into the index field "current xxx in the real estate sector". In other words, even if an article in the preset retrieval database shares no identical words with the query, the search engine can still find a certain semantic association. Searching with the preset retrieval dictionary therefore improves retrieval accuracy.
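To show how the index fields might reach Solr, the sketch below combines them into a single OR query and builds the URL for Solr's `select` request handler. The core name `articles`, the field name `content` and the helper names are hypothetical, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode

def build_solr_query(index_fields, field="content"):
    """Combine index fields into one OR query against a single Solr field."""
    quoted = [
        '"{}"'.format(term.replace("\\", "\\\\").replace('"', '\\"'))
        for term in index_fields
    ]
    return "{}:({})".format(field, " OR ".join(quoted))

def build_select_url(base_url, index_fields, rows=10):
    """Build the URL of a request to the core's select handler."""
    params = {"q": build_solr_query(index_fields), "rows": rows, "wt": "json"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)

query = build_solr_query(["real estate", "house price"])
url = build_select_url("http://localhost:8983/solr/articles", ["real estate"])
```

Quoting each term makes Solr treat it as a phrase, and joining the phrases with OR means an article matches if it contains the target index word or any of its approximate words, which is what lets a query find articles that do not share its exact wording.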
In this embodiment, the server crawls massive text data from the network and generates the preset retrieval dictionary by analysing that data. During retrieval, the server obtains the retrieval information and identifies its information type and character quantity. After determining that the retrieval information is Chinese retrieval information not exceeding the preset standard amount, the server converts it into the corresponding index fields through the preset retrieval dictionary and then retrieves with the determined index fields, so that information retrieval is more comprehensive and accurate.
Further, on the basis of the first embodiment of the present invention, a second embodiment of the Solr-based retrieval method of the present invention is proposed.
This embodiment describes steps that precede step S30 of the first embodiment. In step S30 the server obtains the index field corresponding to the Chinese retrieval information in the preset retrieval dictionary; before that, the server needs to build the retrieval dictionary. This embodiment describes the steps for building the retrieval dictionary, which include:
Step S01: crawl text data from the network, perform word segmentation on the text data with the preset Chinese word segmentation algorithm to obtain the corresponding words, and collect the words into a sample set.
The server crawls massive text data from the network and extracts the words it contains by processing the text data. Specifically: 1. The server preprocesses the text data. Preprocessing includes converting between simplified and traditional characters, removing XML symbols, and arranging the content so that each entry occupies a single line of the training data. The preset word vector model (word2vec) works by training the semantic relationships between words from word co-occurrence, so the contents of different entries need to be separated for training. 2. The server performs word segmentation on the text data with the preset Chinese word segmentation algorithm to obtain the corresponding words. The preset Chinese word segmentation algorithm is the same as in the first embodiment and is not described again here. After segmentation, the server collects the resulting words into a sample set.
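Part of the preprocessing described above can be sketched with the standard library alone. The example strips XML tags and writes each entry as one line, which keeps the entries separate for training. Simplified/traditional conversion is left out because it needs a character mapping table or a third-party library. The function name and sample entries are assumptions.

```python
import re

_TAG = re.compile(r"<[^>]+>")   # any XML/HTML tag
_SPACE = re.compile(r"\s+")     # runs of whitespace, including line breaks

def preprocess_entries(entries):
    """Strip markup and return one cleaned line of text per non-empty entry."""
    lines = []
    for entry in entries:
        text = _TAG.sub(" ", entry)
        text = _SPACE.sub(" ", text).strip()
        if text:
            lines.append(text)
    return lines

# Hypothetical crawled entries.
raw = ["<doc><title>房产</title>\n房价 上涨</doc>", "<doc></doc>"]
training_lines = preprocess_entries(raw)
```

Writing `training_lines` to a file with one entry per line gives the single-line format that word2vec-style trainers commonly expect, with each line treated as its own context.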
Step S02: count the frequency of occurrence of identical words in the sample set, and sort the words by frequency from high to low to form a word list.
The server counts the frequency of occurrence of identical words in the sample set and sorts the words by frequency from high to low to form a word list. Chinese has a large number of words, including rare words that are seldom used, so the server selects the words with higher frequency for word vector training, specifically:
Step S03: select a preset quantity of the top-ranked words in the word list as index terms, form a basic dictionary from the index terms, and convert the index terms in the basic dictionary into corresponding word vectors with the default word vector model.
The server selects the preset quantity of top-ranked words in the word list as index terms, where the preset quantity is the number of index terms set in advance for the retrieval dictionary. For example, if the preset quantity is set to 5000, the server selects the 5000 most frequently occurring common words from the word list, uses these 5000 common words as index terms, and aggregates the selected index terms to form the basic dictionary.
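Steps S02 and S03 up to the basic dictionary amount to a frequency count followed by a top-N cut. The sketch below assumes the sample set is a flat list of words; `build_word_list` and `build_basic_dictionary` are illustrative names, and the tiny preset quantity is chosen only to keep the example readable.

```python
from collections import Counter
from typing import List, Tuple


def build_word_list(sample_set: List[str]) -> List[Tuple[str, int]]:
    """Count identical words and sort them from most to least frequent."""
    return Counter(sample_set).most_common()


def build_basic_dictionary(word_list: List[Tuple[str, int]],
                           preset_quantity: int) -> List[str]:
    """Keep the top-ranked `preset_quantity` words as index terms."""
    return [word for word, _ in word_list[:preset_quantity]]


sample_set = ["estate", "developer", "estate", "market", "estate", "developer"]
word_list = build_word_list(sample_set)
basic_dictionary = build_basic_dictionary(word_list, preset_quantity=2)
print(word_list)
print(basic_dictionary)
```

With the value from the text, the call would simply use `preset_quantity=5000`.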
After obtaining the basic dictionary, the server performs feature processing on the index terms in it, also called word vector encoding. Common encodings are one-hot encoding (a discrete representation in the bag-of-words, BOW, style) and the low-dimensional dense vectors obtained by training a deep learning model such as a word vector model; the representation produced by the word vector model word2vec is commonly called a word embedding or a distributed representation. The server then performs word vector training by machine learning: once the words have been encoded as vectors, the text data is converted into numeric data that can be input to a preset machine learning model for training.
That is, the server uses the index terms to form the input layer of the default word vector model, with each word represented as a one-hot vector: if the vocabulary size is V, each word is represented as a V-dimensional vector in which the element corresponding to that word is set to 1 and all other elements are 0. Multiplying a one-hot vector by the weight matrix W1 simply selects one row of W1. If C word vectors are input, the hidden layer sums the selected rows of the matrix and divides by C to take their average. In other words, the activation function of the hidden-layer units is a simple linear operation (the weighted sum is passed directly to the next layer as its input). From the hidden layer to the output layer, a weight matrix W2 is used to compute a score for each word in the vocabulary, which yields the distributed word vector corresponding to each high-frequency word.
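The forward pass described above can be checked on a toy example. The sketch below uses plain Python lists instead of a numerical library; the matrices `W1` and `W2` hold made-up values, and `one_hot`, `vec_mat`, and `cbow_scores` are illustrative names. Training (softmax and weight updates) is left out.

```python
from typing import List

Matrix = List[List[float]]


def one_hot(index: int, vocab_size: int) -> List[float]:
    """V-dimensional vector with a 1 at `index` and 0 elsewhere."""
    vec = [0.0] * vocab_size
    vec[index] = 1.0
    return vec


def vec_mat(vec: List[float], mat: Matrix) -> List[float]:
    """Row vector times matrix."""
    cols = len(mat[0])
    return [sum(vec[i] * mat[i][j] for i in range(len(vec))) for j in range(cols)]


def cbow_scores(context_ids: List[int], w1: Matrix, w2: Matrix) -> List[float]:
    """Average the W1 rows picked by the C context words, then score with W2."""
    vocab_size = len(w1)
    rows = [vec_mat(one_hot(i, vocab_size), w1) for i in context_ids]
    hidden = [sum(col) / len(rows) for col in zip(*rows)]  # linear hidden layer
    return vec_mat(hidden, w2)


W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # V = 3 words, 2 hidden units
W2 = [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]]    # 2 hidden units, V = 3 scores

print(vec_mat(one_hot(2, 3), W1))  # [1.0, 1.0], i.e. row 2 of W1
print(cbow_scores([0, 1], W1, W2))  # [0.5, 0.5, 2.0]
```

The first print confirms that multiplying a one-hot vector by W1 only selects a row; the second shows the averaged hidden layer being turned into one score per vocabulary word.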
Step S04: determine the approximate words of each index term according to the word vectors of the index terms, save each index term in association with its approximate words, and generate the default retrieval dictionary.
After word vector training is complete, the server generates the default retrieval dictionary from the word vectors of the index terms. Specifically, this includes:
Step a: taking each index term in the basic dictionary as the first index term, calculate the cosine value between the word vector of the first index term and the word vector of each second index term, that is, each index term in the basic dictionary other than the first index term;
Step b: when there is a target cosine value greater than the default cosine value, obtain the approximate index term corresponding to the target cosine value and take the approximate index term as an approximate word of the first index term;
Step c: save the first index term in association with its approximate words to generate the default retrieval dictionary.
That is, the server takes each index term as the first index term and calculates the cosine value between its word vector and the word vector of every other (second) index term in the basic dictionary; the cosine value characterizes the similarity between the first index term and the second index term. The server compares each calculated cosine value with the default cosine value, where the default cosine value is a preset cosine threshold, for example 0.9. When the server determines that there is a target cosine value greater than the default cosine value, it obtains the approximate index term corresponding to that target cosine value and takes it as an approximate word of the first index term. The server then saves the first index term in association with its approximate words to generate the default retrieval dictionary.
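Steps a to c can be sketched as a pairwise cosine comparison. The two-dimensional vectors below are invented for illustration (real word2vec vectors have many more dimensions), and `cosine` and `build_retrieval_dictionary` are hypothetical names; only the 0.9 threshold comes from the text.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_retrieval_dictionary(vectors: Dict[str, List[float]],
                               default_cosine: float = 0.9) -> Dict[str, List[str]]:
    """Associate every index term with the terms whose cosine exceeds the threshold."""
    retrieval_dict: Dict[str, List[str]] = {}
    for first, first_vec in vectors.items():              # step a: each first index term
        retrieval_dict[first] = [
            second for second, second_vec in vectors.items()
            if second != first
            and cosine(first_vec, second_vec) > default_cosine  # step b: above threshold
        ]                                                  # step c: save the association
    return retrieval_dict


vectors = {
    "house property": [1.0, 0.1],
    "real estate": [0.9, 0.2],
    "weather": [0.0, 1.0],
}
retrieval_dict = build_retrieval_dictionary(vectors)
print(retrieval_dict)
```

Comparing every pair is quadratic in the dictionary size, which is manageable for a few thousand index terms because the dictionary is built offline.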
In this embodiment, the default retrieval dictionary lets the server convert the retrieval information into the corresponding index information, so that the server can parse the meaning of the retrieval information and retrieve more accurately.
Further, on the basis of the above embodiments, a third embodiment of the Solr-based search method of the present invention is proposed.
This embodiment refines step S40 of the first embodiment and describes in detail the steps by which the retrieval information is determined. The Solr-based search method includes:
Step S41: combine the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, query the default retrieval database, and obtain the target articles corresponding to the retrieval formula.
The server combines the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, then queries the default retrieval database to obtain the target articles corresponding to the retrieval formula. That is, the server merges the index fields. For example, if the index field of an article is "house property" and the server finds that the associated words of "house property" are "real estate" and "developer", the server merges "house property", "real estate" and "developer" and generates the corresponding index XML. The server then needs to query only once to find the target articles containing "house property", "real estate" and "developer".
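One way to merge the index fields into a single query is to OR them inside one field clause of the Solr standard query syntax. The patent does not give the exact form of the retrieval formula, so the sketch below is an assumption: `build_retrieval_formula` is an illustrative name and `content` is a hypothetical Solr field.

```python
from typing import List


def quote_term(term: str) -> str:
    """Wrap a term in double quotes, escaping backslashes and quotes inside it."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_retrieval_formula(index_fields: List[str],
                            solr_field: str = "content") -> str:
    """OR the index fields together so that a single query covers all of them."""
    return f"{solr_field}:(" + " OR ".join(quote_term(t) for t in index_fields) + ")"


formula = build_retrieval_formula(["house property", "real estate", "developer"])
print(formula)  # content:("house property" OR "real estate" OR "developer")
```

The resulting string would be passed as the `q` parameter of a Solr select request, so the three terms are searched in one round trip.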
Step S42: set the weight of each index field in the retrieval formula according to the default weight mapping table, sort the target articles by the weights of the index fields to form an article sorted list, and output the article sorted list as the search result.
After obtaining the target articles, the server obtains the weight of each index field from the default weight mapping table. The default weight mapping table is a preset table that maps word types to weights; for example, it may assign a weight of 50% to titles, 30% to adjectives, and 20% to pronouns. The server sorts the target articles by the weights of the index fields to form an article sorted list and outputs the article sorted list as the search result.
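The ranking step can be sketched as a weighted score per article. The three weights are the ones quoted in the text; everything else is assumed for illustration: the function name `rank_articles`, the rule that an article's score is the sum of the weights of the index fields it matched, and the sample data.

```python
from typing import Dict, List, Tuple

# Word type -> weight, using the example values from the text.
DEFAULT_WEIGHTS: Dict[str, float] = {"title": 0.5, "adjective": 0.3, "pronoun": 0.2}


def rank_articles(articles: Dict[str, List[str]],
                  field_types: Dict[str, str],
                  weights: Dict[str, float] = DEFAULT_WEIGHTS) -> List[Tuple[str, float]]:
    """Score each article by the summed weights of its matched index fields."""
    scored = []
    for article_id, matched_fields in articles.items():
        score = sum(weights.get(field_types.get(f, ""), 0.0) for f in matched_fields)
        scored.append((article_id, score))
    # Highest score first; ties keep their original order.
    return sorted(scored, key=lambda item: item[1], reverse=True)


field_types = {"house property": "title", "new": "adjective", "it": "pronoun"}
articles = {
    "A": ["new"],
    "B": ["house property", "it"],
    "C": ["it"],
}
ranking = rank_articles(articles, field_types)
print([article_id for article_id, _ in ranking])  # ['B', 'A', 'C']
```

In a Solr deployment the same effect could be obtained by boosting query terms instead of re-sorting results afterwards; which approach the patent intends is not stated.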
In this embodiment, the index fields may differ in importance when the server performs retrieval. To make the target articles found more accurate, different weight rules can be set in advance so that the user can quickly see the information needed.
Further, on the basis of the third embodiment, a fourth embodiment of the Solr-based search method of the present invention is proposed.
This embodiment describes steps performed after step S40 of the first embodiment. In this embodiment, the server can mark and save retrieved articles according to user behavior data. Specifically, this includes:
Step S50: receive user behavior data based on the article sorted list, and determine and mark the retrieved articles in the article sorted list that the user pays attention to according to the number of views and the browsing time in the user behavior data.
The server receives user behavior data based on the article sorted list. The article sorted list contains multiple articles, and the user can view each of them; the server collects the user behavior data and, according to the number of views and the browsing time in that data, determines and marks the retrieved articles in the article sorted list that the user pays attention to.
Step S60: when a browsing record query instruction is received, output the marked retrieved articles for the user to query.
The user can trigger a browsing record query instruction on a terminal, and the terminal sends the instruction to the server. On receiving the browsing record query instruction, the server outputs the marked retrieved articles for the user to query. In this embodiment, the server saves the user's earlier browsing records according to the user behavior data, which makes it convenient for the user to revisit them.
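Steps S50 and S60 can be sketched as an aggregation over browse events. The text does not say how the number of views and the browsing time are combined, so the rule below (both must reach a threshold) and the threshold values are assumptions, as are the names `Browse` and `mark_articles`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Browse:
    article_id: str
    seconds: float  # time spent on the article during one view


def mark_articles(events: List[Browse],
                  min_views: int = 2,
                  min_seconds: float = 60.0) -> List[str]:
    """Mark articles viewed at least `min_views` times for `min_seconds` in total."""
    views: Dict[str, int] = {}
    total: Dict[str, float] = {}
    for event in events:
        views[event.article_id] = views.get(event.article_id, 0) + 1
        total[event.article_id] = total.get(event.article_id, 0.0) + event.seconds
    return [a for a in views if views[a] >= min_views and total[a] >= min_seconds]


events = [Browse("A", 40.0), Browse("B", 5.0), Browse("A", 30.0), Browse("C", 120.0)]
marked = mark_articles(events)
print(marked)  # ['A']
```

Answering a browsing record query then reduces to returning the stored `marked` list for the requesting user.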
Further, on the basis of the above embodiments, a fifth embodiment of the Solr-based search method of the present invention is proposed.
This embodiment describes steps performed after step S20 of the first embodiment, namely the server's search method when the character quantity of the Chinese retrieval information exceeds the preset standard amount. Specifically, this includes:
Step S70: when the character quantity of the Chinese retrieval information exceeds the preset standard amount, perform sentence splitting on the Chinese retrieval information to obtain the simple sentences corresponding to the Chinese retrieval information.
When the server determines that the character quantity of the Chinese retrieval information exceeds the preset standard amount, it performs sentence splitting on the Chinese retrieval information to obtain the corresponding simple sentences. Sentence splitting falls into two cases. In the first case, the Chinese retrieval information is one very long complex sentence; to improve retrieval accuracy, the server splits the complex sentence into multiple parallel simple sentences. In the second case, the Chinese retrieval information is an article or a paragraph, and the server divides it into multiple simple sentences according to its punctuation.
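The punctuation-based case can be sketched with a regular expression. The set of sentence-ending marks and the threshold value are assumptions, and `split_into_simple_sentences` is an illustrative name. Splitting one long complex sentence into parallel simple sentences (the first case) would need syntactic analysis and is not covered here.

```python
import re
from typing import List

# Full-width and ASCII sentence-ending marks (an assumed set).
SENTENCE_END = re.compile(r"[。！？；!?;]+")


def split_into_simple_sentences(info: str, preset_standard: int = 20) -> List[str]:
    """Split retrieval information on punctuation once it exceeds the threshold."""
    if len(info) <= preset_standard:
        return [info]  # short queries take the first embodiment's path unchanged
    return [part.strip() for part in SENTENCE_END.split(info) if part.strip()]


query = "我想买房产。开发商有哪些？"  # 13 characters including punctuation
print(split_into_simple_sentences(query, preset_standard=10))
```

`len()` counts Unicode code points in Python 3, so each Chinese character and each full-width punctuation mark counts as one character.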
Step S80: perform word segmentation on the simple sentences obtained by sentence splitting to obtain the corresponding keywords; obtain the target index terms in the default retrieval dictionary that are similar to the keywords, together with the approximate words associated with the target index terms; and take the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
The server performs word segmentation on the simple sentences obtained by sentence splitting to obtain the corresponding keywords; the segmentation of simple sentences follows the first embodiment and is not described again here. The server obtains the target index terms in the default retrieval dictionary that are similar to the keywords: it compares each keyword with the index terms in the default retrieval dictionary and takes the index terms similar to the keyword as target index terms. The server then obtains the approximate words associated with the target index terms in the default retrieval dictionary and takes the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
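The lookup in step S80 can be sketched as a dictionary expansion. The text only says that the target index term is "similar" to the keyword; the sketch below reduces that to an exact match, which is an assumption, and `index_fields_for` and the sample dictionary are illustrative.

```python
from typing import Dict, List


def index_fields_for(keywords: List[str],
                     retrieval_dict: Dict[str, List[str]]) -> List[str]:
    """Collect each matching index term plus its associated approximate words."""
    fields: List[str] = []
    for keyword in keywords:
        if keyword not in retrieval_dict:  # assumption: "similar" means exact match
            continue
        for term in [keyword, *retrieval_dict[keyword]]:
            if term not in fields:         # keep first occurrence, drop duplicates
                fields.append(term)
    return fields


retrieval_dict = {
    "house property": ["real estate", "developer"],
    "real estate": ["house property"],
}
fields = index_fields_for(["house property", "weather"], retrieval_dict)
print(fields)  # ['house property', 'real estate', 'developer']
```

Keywords that have no entry in the retrieval dictionary, such as "weather" here, contribute no index field in this sketch.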
Step S90: query the default retrieval database, obtain the target articles corresponding to the index fields, and output the target articles as the search result.
After obtaining the target index terms corresponding to the Chinese retrieval information, the server queries the default retrieval database, where the default retrieval database is the database against which the user's retrieval information is searched, for example the Baidu library. The server obtains the target articles corresponding to the index fields and outputs them as the search result for the retrieval request.
In this embodiment, the server performs sentence splitting on Chinese retrieval information whose character quantity exceeds the preset standard amount and then carries out information retrieval following the steps of the first embodiment, which makes information retrieval more intelligent.
In addition, referring to Fig. 3, an embodiment of the present invention further proposes a Solr-based retrieval device, which includes:
a request receiving module 10, configured to receive an information retrieval request and obtain the retrieval information corresponding to the retrieval request;
a character quantity judgment module 20, configured to judge, when the retrieval information is Chinese retrieval information, whether the character quantity of the Chinese retrieval information exceeds the preset standard amount;
an index determination module 30, configured to obtain, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the index fields in the default retrieval dictionary that correspond to the Chinese retrieval information;
a result output module 40, configured to query the default retrieval database, obtain the target articles corresponding to the index fields, and output the target articles as the search result.
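The decision made between modules 10, 20, and 30 can be sketched as a small routing function. How the device recognizes Chinese retrieval information is not stated in this section; the check below (any character in the CJK Unified Ideographs range U+4E00 to U+9FFF) is an assumption, as are the names `is_chinese` and `route`, the returned labels, and the threshold value.

```python
import re

CJK = re.compile(r"[\u4e00-\u9fff]")  # CJK Unified Ideographs block


def is_chinese(info: str) -> bool:
    """Treat the query as Chinese if it contains at least one CJK ideograph."""
    return bool(CJK.search(info))


def route(info: str, preset_standard: int = 20) -> str:
    """Pick the processing path for one piece of retrieval information."""
    if not is_chinese(info):
        return "non-chinese"        # handling of this case is outside this sketch
    if len(info) > preset_standard:
        return "split-sentences"    # fifth embodiment: sentence splitting first
    return "dictionary-lookup"      # module 30: direct index field lookup


print(route("房产"))          # dictionary-lookup
print(route("房产" * 11))     # split-sentences (22 characters)
print(route("house price"))  # non-chinese
```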
Optionally, the Solr-based retrieval device includes:
a sample processing module, configured to crawl text data from the network, perform word segmentation on the text data with the default Chinese word segmentation algorithm to obtain the corresponding words, and aggregate the words into a sample set;
a frequency statistics module, configured to count the frequency of occurrence of identical words in the sample set and sort the words from highest to lowest frequency of occurrence to form a word list;
a word training module, configured to select a preset quantity of the top-ranked words in the word list as index terms, form a basic dictionary from the index terms, and convert the index terms in the basic dictionary into corresponding word vectors with the default word vector model;
a dictionary generation module, configured to determine the approximate words of each index term according to the word vectors of the index terms, save each index term in association with its approximate words, and generate the default retrieval dictionary.
Optionally, the dictionary generation module includes:
a cosine calculation unit, configured to take each index term in the basic dictionary as the first index term and calculate the cosine value between the word vector of the first index term and the word vector of each second index term in the basic dictionary other than the first index term;
a similar word query unit, configured to obtain, when there is a target cosine value greater than the default cosine value, the approximate index term corresponding to the target cosine value and take the approximate index term as an approximate word of the first index term;
a save and generation unit, configured to save the first index term in association with its approximate words and generate the default retrieval dictionary.
Optionally, the index determination module 30 includes:
a word segmentation unit, configured to perform, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, word segmentation on the Chinese retrieval information with the default Chinese word segmentation algorithm to obtain the keyword set corresponding to the Chinese retrieval information;
a word comparison unit, configured to compare the keywords in the keyword set with the index terms in the default retrieval dictionary and obtain the target index terms similar to each keyword together with the approximate words associated with the target index terms;
an index field determination unit, configured to take the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
Optionally, the result output module 40 includes:
an information combination unit, configured to combine the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, query the default retrieval database, and obtain the target articles corresponding to the retrieval formula;
an article sorting unit, configured to set the weight of each index field in the retrieval formula according to the default weight mapping table and sort the target articles by the weights of the index fields to form an article sorted list;
an information output unit, configured to output the article sorted list as the search result.
Optionally, the Solr-based retrieval device includes:
an article marking module, configured to receive user behavior data based on the article sorted list, and to determine and mark the retrieved articles in the article sorted list that the user pays attention to according to the number of views and the browsing time in the user behavior data;
a marked article output module, configured to output the marked retrieved articles for the user to query when a browsing record query instruction is received.
Optionally, the Solr-based retrieval device includes:
a sentence splitting module, configured to perform, when the character quantity of the Chinese retrieval information exceeds the preset standard amount, sentence splitting on the Chinese retrieval information to obtain the simple sentences corresponding to the Chinese retrieval information;
a word comparison module, configured to perform word segmentation on the simple sentences obtained by sentence splitting to obtain the corresponding keywords, obtain the target index terms in the default retrieval dictionary that are similar to the keywords together with the approximate words associated with the target index terms, and take the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information;
a retrieval output module, configured to query the default retrieval database, obtain the target articles corresponding to the index fields, and output the target articles as the search result.
For the steps implemented by each functional module of the Solr-based retrieval device, refer to the embodiments of the Solr-based search method of the present invention; they are not described again here.
In addition, an embodiment of the present invention further proposes a computer storage medium.
The computer storage medium stores a computer program that, when executed by a processor, implements the operations of the Solr-based search method provided by the above embodiments.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity/operation/object from another and do not necessarily require or imply any actual relationship or order between these entities/operations/objects. The terms "include", "comprise", and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article, or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or system that includes the element.
Because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the description of the method embodiments. The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present invention. Those of ordinary skill in the art can understand and implement it without creative effort.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (10)
1. A Solr-based search method, characterized in that the Solr-based search method comprises the following steps:
receiving an information retrieval request and obtaining the retrieval information corresponding to the retrieval request;
when the retrieval information is Chinese retrieval information, judging whether the character quantity of the Chinese retrieval information exceeds a preset standard amount;
when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, obtaining the index fields in a default retrieval dictionary that correspond to the Chinese retrieval information;
querying a default retrieval database, obtaining the target articles corresponding to the index fields, and outputting the target articles as the search result.
2. The Solr-based search method according to claim 1, characterized in that, before the step of obtaining, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the index fields in the default retrieval dictionary that correspond to the Chinese retrieval information, the method comprises:
crawling text data from the network, performing word segmentation on the text data with a default Chinese word segmentation algorithm to obtain the corresponding words, and aggregating the words into a sample set;
counting the frequency of occurrence of identical words in the sample set, and sorting the words from highest to lowest frequency of occurrence to form a word list;
selecting a preset quantity of the top-ranked words in the word list as index terms, forming a basic dictionary from the index terms, and converting the index terms in the basic dictionary into corresponding word vectors with a default word vector model;
determining the approximate words of each index term according to the word vectors of the index terms, saving each index term in association with its approximate words, and generating the default retrieval dictionary.
3. The Solr-based search method according to claim 2, characterized in that the step of determining the approximate words of each index term according to the word vectors of the index terms, saving each index term in association with its approximate words, and generating the default retrieval dictionary comprises:
taking each index term in the basic dictionary as a first index term, and calculating the cosine value between the word vector of the first index term and the word vector of each second index term in the basic dictionary other than the first index term;
when there is a target cosine value greater than a default cosine value, obtaining the approximate index term corresponding to the target cosine value, and taking the approximate index term as an approximate word of the first index term;
saving the first index term in association with its approximate words to generate the default retrieval dictionary.
4. The Solr-based search method according to claim 1, characterized in that the step of obtaining, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the index fields in the default retrieval dictionary that correspond to the Chinese retrieval information comprises:
when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, performing word segmentation on the Chinese retrieval information with a default Chinese word segmentation algorithm to obtain the keyword set corresponding to the Chinese retrieval information;
comparing the keywords in the keyword set with the index terms in the default retrieval dictionary, and obtaining the target index terms similar to each keyword together with the approximate words associated with the target index terms;
taking the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information.
5. The Solr-based search method according to claim 1, characterized in that the step of querying the default retrieval database, obtaining the target articles corresponding to the index fields, and outputting the target articles as the search result comprises:
combining the index fields to obtain the retrieval formula corresponding to the Chinese retrieval information, querying the default retrieval database, and obtaining the target articles corresponding to the retrieval formula;
setting the weight of each index field in the retrieval formula according to a default weight mapping table, and sorting the target articles by the weights of the index fields to form an article sorted list;
outputting the article sorted list as the search result.
6. The Solr-based search method according to claim 5, characterized in that, after the step of querying the default retrieval database, obtaining the target articles corresponding to the index fields, and outputting the target articles as the search result, the method comprises:
receiving user behavior data based on the article sorted list, and determining and marking the retrieved articles in the article sorted list that the user pays attention to according to the number of views and the browsing time in the user behavior data;
when a browsing record query instruction is received, outputting the marked retrieved articles for the user to query.
7. The Solr-based search method according to claim 1, characterized in that, after the step of judging, when the retrieval information is Chinese retrieval information, whether the character quantity of the Chinese retrieval information exceeds the preset standard amount, the method comprises:
when the character quantity of the Chinese retrieval information exceeds the preset standard amount, performing sentence splitting on the Chinese retrieval information to obtain the simple sentences corresponding to the Chinese retrieval information;
performing word segmentation on the simple sentences obtained by sentence splitting to obtain the corresponding keywords, and obtaining the target index terms in the default retrieval dictionary that are similar to the keywords together with the approximate words associated with the target index terms;
taking the target index terms and the corresponding approximate words as the index fields corresponding to the Chinese retrieval information;
querying the default retrieval database, obtaining the target articles corresponding to the index fields, and outputting the target articles as the search result.
8. A Solr-based retrieval device, characterized in that the Solr-based retrieval device comprises:
a request receiving module, configured to receive an information retrieval request and obtain the retrieval information corresponding to the retrieval request;
a character quantity judgment module, configured to judge, when the retrieval information is Chinese retrieval information, whether the character quantity of the Chinese retrieval information exceeds a preset standard amount;
an index determination module, configured to obtain, when the character quantity of the Chinese retrieval information does not exceed the preset standard amount, the index fields in a default retrieval dictionary that correspond to the Chinese retrieval information;
a result output module, configured to query a default retrieval database, obtain the target articles corresponding to the index fields, and output the target articles as the search result.
9. Solr-based retrieval equipment, characterized in that the Solr-based retrieval equipment comprises a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein:
the computer program, when executed by the processor, implements the steps of the Solr-based search method according to any one of claims 1 to 7.
10. A computer storage medium, characterized in that a computer program is stored in the computer storage medium, and the computer program, when executed by a processor, implements the steps of the Solr-based search method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910205809.0A CN110069610B (en) | 2019-03-16 | 2019-03-16 | Solr-based retrieval method, solr-based retrieval device, solr-based retrieval equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110069610A true CN110069610A (en) | 2019-07-30 |
CN110069610B CN110069610B (en) | 2024-03-19 |
Family
ID=67365343
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910205809.0A Active CN110069610B (en) | 2019-03-16 | 2019-03-16 | Solr-based retrieval method, solr-based retrieval device, solr-based retrieval equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110069610B (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110619067A (en) * | 2019-08-27 | 2019-12-27 | 深圳证券交易所 | Industry classification-based retrieval method and retrieval device and readable storage medium |
CN110705302A (en) * | 2019-10-11 | 2020-01-17 | 掌阅科技股份有限公司 | Named entity recognition method, electronic device and computer storage medium |
CN110941702A (en) * | 2019-11-26 | 2020-03-31 | 北京明略软件系统有限公司 | Retrieval method and device for laws and regulations and laws and readable storage medium |
CN111078960A (en) * | 2019-12-20 | 2020-04-28 | 金现代信息产业股份有限公司 | Method and system for realizing real-time retrieval of power dispatching system equipment |
CN111209378A (en) * | 2019-12-26 | 2020-05-29 | 航天信息股份有限公司企业服务分公司 | Ordered hierarchical ordering method based on business dictionary weight |
CN111223533A (en) * | 2019-12-24 | 2020-06-02 | 深圳市联影医疗数据服务有限公司 | Medical data retrieval method and system |
CN111708942A (en) * | 2020-06-12 | 2020-09-25 | 北京达佳互联信息技术有限公司 | Multimedia resource pushing method, device, server and storage medium |
CN111767378A (en) * | 2020-06-24 | 2020-10-13 | 北京墨丘科技有限公司 | Method and device for intelligently recommending scientific and technical literature |
CN111859091A (en) * | 2020-07-21 | 2020-10-30 | 山东省科院易达科技咨询有限公司 | Search result aggregation method and device based on artificial intelligence |
CN112052309A (en) * | 2020-09-07 | 2020-12-08 | 深圳壹账通智能科技有限公司 | Text data retrieval method, related equipment and readable storage medium |
CN112380445A (en) * | 2020-11-30 | 2021-02-19 | 深圳前海微众银行股份有限公司 | Data query method, device, equipment and storage medium |
CN112749162A (en) * | 2020-12-31 | 2021-05-04 | 浙江省方大标准信息有限公司 | ES-based rapid retrieval and sorting method for inspection and detection mechanism |
CN114186059A (en) * | 2021-11-01 | 2022-03-15 | 东风汽车集团股份有限公司 | Article classification method and device |
CN115455147A (en) * | 2022-09-09 | 2022-12-09 | 浪潮卓数大数据产业发展有限公司 | Full-text retrieval method and system |
CN115495483A (en) * | 2022-09-21 | 2022-12-20 | 企查查科技有限公司 | Data batch processing method, device, equipment and computer readable storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2010267247A (en) * | 2010-02-08 | 2010-11-25 | Ntt Data Corp | Device and method for retrieving information, terminal equipment, and program |
WO2014087424A2 (en) * | 2012-12-03 | 2014-06-12 | Parthys Reverse Informatics Analytic Solutions (P) Ltd. | Information retrieval, extraction and visualisation |
CN108038096A (en) * | 2017-11-10 | 2018-05-15 | 平安科技(深圳)有限公司 | Knowledge database documents method for quickly retrieving, application server computer readable storage medium storing program for executing |
CN108509474A (en) * | 2017-09-15 | 2018-09-07 | 腾讯科技(深圳)有限公司 | Search for the synonym extended method and device of information |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110619067A (en) * | 2019-08-27 | 2019-12-27 | 深圳证券交易所 | Industry classification-based retrieval method and retrieval device and readable storage medium |
CN110705302A (en) * | 2019-10-11 | 2020-01-17 | 掌阅科技股份有限公司 | Named entity recognition method, electronic device and computer storage medium |
CN110705302B (en) * | 2019-10-11 | 2023-12-12 | 掌阅科技股份有限公司 | Named entity identification method, electronic equipment and computer storage medium |
CN110941702A (en) * | 2019-11-26 | 2020-03-31 | 北京明略软件系统有限公司 | Retrieval method and device for laws and regulations and laws and readable storage medium |
CN111078960B (en) * | 2019-12-20 | 2023-09-05 | 金现代信息产业股份有限公司 | Method and system for realizing real-time retrieval of power dispatching system equipment |
CN111078960A (en) * | 2019-12-20 | 2020-04-28 | 金现代信息产业股份有限公司 | Method and system for realizing real-time retrieval of power dispatching system equipment |
CN111223533A (en) * | 2019-12-24 | 2020-06-02 | 深圳市联影医疗数据服务有限公司 | Medical data retrieval method and system |
CN111223533B (en) * | 2019-12-24 | 2024-02-13 | 深圳市联影医疗数据服务有限公司 | Medical data retrieval method and system |
CN111209378A (en) * | 2019-12-26 | 2020-05-29 | 航天信息股份有限公司企业服务分公司 | Ordered hierarchical ordering method based on business dictionary weight |
CN111209378B (en) * | 2019-12-26 | 2024-03-12 | 航天信息股份有限公司企业服务分公司 | Ordered hierarchical ordering method based on business dictionary weights |
CN111708942A (en) * | 2020-06-12 | 2020-09-25 | 北京达佳互联信息技术有限公司 | Multimedia resource pushing method, device, server and storage medium |
CN111708942B (en) * | 2020-06-12 | 2023-08-08 | 北京达佳互联信息技术有限公司 | Multimedia resource pushing method, device, server and storage medium |
CN111767378A (en) * | 2020-06-24 | 2020-10-13 | 北京墨丘科技有限公司 | Method and device for intelligently recommending scientific and technical literature |
CN111859091A (en) * | 2020-07-21 | 2020-10-30 | 山东省科院易达科技咨询有限公司 | Search result aggregation method and device based on artificial intelligence |
CN111859091B (en) * | 2020-07-21 | 2021-06-04 | 山东省科院易达科技咨询有限公司 | Search result aggregation method and device based on artificial intelligence |
CN112052309A (en) * | 2020-09-07 | 2020-12-08 | 深圳壹账通智能科技有限公司 | Text data retrieval method, related equipment and readable storage medium |
CN112380445A (en) * | 2020-11-30 | 2021-02-19 | 深圳前海微众银行股份有限公司 | Data query method, device, equipment and storage medium |
CN112749162B (en) * | 2020-12-31 | 2021-08-17 | 浙江省方大标准信息有限公司 | ES-based rapid retrieval and sorting method for inspection and detection mechanism |
CN112749162A (en) * | 2020-12-31 | 2021-05-04 | 浙江省方大标准信息有限公司 | ES-based rapid retrieval and sorting method for inspection and detection mechanism |
CN114186059A (en) * | 2021-11-01 | 2022-03-15 | 东风汽车集团股份有限公司 | Article classification method and device |
CN115455147A (en) * | 2022-09-09 | 2022-12-09 | 浪潮卓数大数据产业发展有限公司 | Full-text retrieval method and system |
CN115495483A (en) * | 2022-09-21 | 2022-12-20 | 企查查科技有限公司 | Data batch processing method, device, equipment and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110069610B (en) | 2024-03-19 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110069610A (en) | Search method, device, equipment and storage medium based on Solr | |
US8468156B2 (en) | Determining a geographic location relevant to a web page | |
US6594654B1 (en) | Systems and methods for continuously accumulating research information via a computer network | |
US8965872B2 (en) | Identifying query formulation suggestions for low-match queries | |
US8478749B2 (en) | Method and apparatus for determining relevant search results using a matrix framework | |
US9881037B2 (en) | Method for systematic mass normalization of titles | |
US20120131033A1 (en) | Automated scheme for identifying user intent in real-time | |
US20060161543A1 (en) | Systems and methods for providing search results based on linguistic analysis | |
EP3234872A1 (en) | Question answering from structured and unstructured data sources | |
Im et al. | Linked tag: image annotation using semantic relationships between image tags | |
WO2009039392A1 (en) | A system for entity search and a method for entity scoring in a linked document database | |
CN107085583B (en) | Electronic document management method and device based on content | |
CN103136228A (en) | Image search method and image search device | |
US20090112845A1 (en) | System and method for language sensitive contextual searching | |
US9971782B2 (en) | Document tagging and retrieval using entity specifiers | |
CN102200974A (en) | Unified information retrieval intelligent agent system and method for search engine | |
CN110245357B (en) | Main entity identification method and device | |
US20090265383A1 (en) | System and method for providing image labeling game using cbir | |
CN116226494A (en) | Crawler system and method for information search | |
US9305103B2 (en) | Method or system for semantic categorization | |
CN111949755B (en) | Information query method and device for hazardous chemicals, electronic equipment and medium | |
CN114064606A (en) | Database migration method, device, equipment, storage medium and system | |
CN111222918A (en) | Keyword mining method and device, electronic equipment and storage medium | |
CN112613320A (en) | Method and device for acquiring similar sentences, storage medium and electronic equipment | |
Hartmann et al. | Using similarity measures for context-aware user interfaces |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |