CN106294695A - An implementation method for a real-time big data search engine - Google Patents
- Publication number
- CN106294695A (application CN201610640922.8A)
- Authority
- CN
- China
- Prior art keywords
- index
- search engine
- data
- document
- rose
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/93—Document management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an implementation method for a real-time big data search engine and relates to the field of search engine technology. A ROSE search engine system is built on HTTP and Apache Lucene, and an index is created for it. Once the index has been created, a user can retrieve file information by entering a query condition: the query text is first analyzed, the index database is then searched, and the results are returned to the user. The method supports full-text search over real-time streaming data and completes computing tasks together with a distributed system, making full use of the cluster's high-speed computation and storage to improve the response speed of data analysis and processing.
Description
Technical field
The present invention relates to the field of search engine technology, and in particular to an implementation method for a real-time big data search engine.
Background
Many web applications involve the analysis and processing of massive data. Formatted data is generally stored in a database, while unformatted data is stored as files, or in a mix of database and file storage. When a database or file system holds terabytes of data or more, analysis and processing become very slow, and the response speed cannot meet users' needs.
Traditional network application architectures are mainly the C/S model and the B/S model, where S stands for Server, B for Browser, and C for Client. The two differ only in whether the main business logic is placed on the client or on the server. As shown in Fig. 1, taking the C/S model as an example, the client interacts with the user through a UI, and the data produced is generally submitted over the network to the server for business processing. The processed business data is stored in a database or file system for later use, such as data query, statistics, and data mining. With big data (usually meaning data volumes at the TB level), the analysis and processing bottlenecks of this architecture lie mainly in the I/O of the database and file system, memory, and CPU processing capacity, which can make the system respond too slowly or not at all. The architecture is also not scalable: adding storage and computing resources does not improve its performance.
Apache Hadoop is a distributed computing software framework implemented in Java. It runs distributed computations over massive data on a cluster made up of a large number of computers, and it lets applications support thousands of nodes and petabytes of data. It mainly addresses the problem of data volume and is strong at storing large amounts of data and at simple computations over them. It suits batch processing of massive data files, but not scenarios with strict real-time requirements or scenarios where users frequently operate on and modify data.
Summary of the invention
The technical problem to be solved by the present invention is to provide an implementation method for a real-time big data search engine. The method supports full-text search over real-time streaming data and completes computing tasks together with a distributed system, making full use of the cluster's high-speed computation and storage to improve the response speed of data analysis and processing. It achieves scalable analysis and processing of real-time big data: data produced by the system does not need to be stored first, but can be processed directly in real time and reflected in the response results.
To solve the above technical problem, the present invention adopts the following technical solution: an implementation method for a real-time big data search engine, comprising the following steps:
1) Build the ROSE search engine system based on HTTP and Apache Lucene;
2) Create the index of the ROSE search engine system: extract information from documents of various formats and from database data, select a text analyzer according to the file type to analyze the text, create the index, and generate the index database;
3) Once the index has been created, the user can retrieve file information by entering a query condition. When the user enters a query condition, the text is first analyzed, the index database is then searched, and the results obtained are finally returned to the user.
In a further optimized technical solution, creating the index in step 2) comprises the following steps:
A. Specify the directory in which the index is to be created;
B. Create a Directory object;
C. Create the index-writing object IndexWriter;
D. Obtain the File array of source files to determine the content to index;
E. Write each file into the index in a loop: first create a Document object and Field objects, which correspond respectively to a row in a database table and the column attributes of that row; then add the Fields to the Document; finally have the IndexWriter call addDocument to write the document's index into the index database;
F. Close the index-writing object IndexWriter.
In a further optimized technical solution, the retrieval in step 2) comprises the following steps:
A. Create the index-reading object IndexReader;
B. Create the search object IndexSearcher;
C. Create the lexical analysis object Analyzer;
D. Create the syntax analysis object QueryParser;
E. The QueryParser calls parse to perform syntax analysis, generates the query syntax tree, and places it in a Query;
F. The IndexSearcher calls its search method to search with the query syntax tree Query and obtains the result set TopDocs;
G. Obtain the corresponding ScoreDoc from the TopDocs;
H. Obtain the corresponding Document from the ScoreDoc;
I. Obtain the corresponding Field attributes from the Document.
In a further optimized technical solution, the ROSE search engine system provides a standard HTTP interface for adding, deleting, modifying, and querying the index of the data.
In a further optimized technical solution, the ROSE search engine system can quickly set up a cluster through Zookeeper. It performs a hash operation on the ID value of the current index record, then uses the cluster state maintained on the server to find which Range the hash value falls in, and so locates the corresponding shard. The Leader builds the index within that shard; once the Leader node has finished updating, the version number and the document are forwarded to the replica nodes belonging to the same Shard.
The beneficial effects of the above technical solution are as follows. The present invention has the following advantages:
(1) Support for full-text search over real-time streaming data
ROSE is implemented mainly on HTTP and Apache Lucene and supports full-text search over real-time streaming data. Fields that have just changed in the database can be queried: for example, when one or more rows are inserted into a database table, ROSE indexes the newly inserted data in real time. It also allows the latest version of any document to be looked up by its unique key without reopening the searcher.
(2) Support for analysis and processing based on real-time streaming data
ROSE supports not only full-text search over real-time streaming data but also analysis and processing of the data found. While searching for a keyword, ROSE can group and count results by a Facet field. This does not modify the query results; it only adds count information per category on top of them, which the user can then use to refine the query.
(3) An extensible plug-in system for full-text search
ROSE can implement specific functions by integrating plug-ins. These include IKAnalyzer and mmseg4j, which provide Chinese word segmentation for full-text search, and solr_pager, which provides pagination of search results. The extensible plug-in system makes ROSE faster and more convenient.
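The facet counting described under advantage (2) can be illustrated with a minimal Python sketch. It is not ROSE's implementation: the function name `search_with_facets`, the sample documents, and the substring-based keyword match are all assumptions made for illustration. It only shows the stated behavior, namely that the hits are returned unchanged and a per-category count is added alongside them.

```python
from collections import Counter


def search_with_facets(docs, keyword, facet_field):
    """Return the matching documents plus a count per facet value.

    Hypothetical helper: a plain substring match stands in for a real
    full-text query.
    """
    hits = [d for d in docs if keyword.lower() in d["text"].lower()]
    # The hits themselves are left untouched; counts are computed on the side.
    counts = Counter(d[facet_field] for d in hits if facet_field in d)
    return hits, dict(counts)


# Hypothetical sample data.
docs = [
    {"id": 1, "text": "Lucene index tutorial", "category": "search"},
    {"id": 2, "text": "Hadoop batch index job", "category": "batch"},
    {"id": 3, "text": "Building a Lucene index cluster", "category": "search"},
]

hits, facets = search_with_facets(docs, "index", "category")
```

A user could then narrow the query to one category based on the counts in `facets`.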
Brief description of the drawings
Fig. 1 is an architecture diagram of a traditional network application system;
Fig. 2 is a structural diagram of the ROSE search engine system of the present invention;
Fig. 3 is an architecture diagram of the ROSE search engine system of the present invention;
Fig. 4 is a flowchart of index creation and search in the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
As shown in Fig. 2, the invention discloses an implementation method for a real-time big data search engine, comprising the following steps:
1) Build the ROSE (Real-time OceanData Search Engine) search engine system based on HTTP and Apache Lucene;
2) Create the index of the ROSE search engine system: extract information from documents of various formats and from database data, select a text analyzer according to the file type to analyze the text, create the index, and generate the index database;
3) Once the index has been created, the user can retrieve file information by entering a query condition. When the user enters a query condition, the text is first analyzed, the index database is then searched, and the results obtained are finally returned to the user.
Creating the index comprises the following steps (see Fig. 3 and Fig. 4):
A. Specify the directory in which the index is to be created;
B. Create a Directory object;
C. Create the index-writing object IndexWriter;
D. Obtain the File array of source files to determine the content to index;
E. Write each file into the index in a loop: first create a Document object and Field objects, which correspond respectively to a row in a database table and the column attributes of that row; then add the Fields to the Document; finally have the IndexWriter call addDocument to write the document's index into the index database;
F. Close the index-writing object IndexWriter.
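The index-creation steps A to F can be sketched in plain Python. This is a toy stand-in, not Lucene's API: the `Document` and `IndexWriter` classes below are hypothetical Python classes that only borrow the names used in the steps, the on-disk format (a single `index.json` file) is an assumption, and the analyzer is a simple lowercase word split.

```python
import json
import os
import re
import tempfile


class Document:
    """Stand-in for a document: a set of named fields (like a table row)."""

    def __init__(self):
        self.fields = {}

    def add(self, name, value):
        self.fields[name] = value


class IndexWriter:
    """Toy writer: builds an inverted index in memory and saves it on close."""

    def __init__(self, index_dir):
        self.path = os.path.join(index_dir, "index.json")  # assumed file name
        self.docs = []
        self.postings = {}  # term -> list of document ids

    def add_document(self, doc):
        doc_id = len(self.docs)
        self.docs.append(doc.fields)
        terms = set(re.findall(r"\w+", doc.fields["content"].lower()))
        for term in terms:
            self.postings.setdefault(term, []).append(doc_id)

    def close(self):
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump({"docs": self.docs, "postings": self.postings}, fh)


def build_index(source_dir, index_dir):
    writer = IndexWriter(index_dir)                 # steps A to C
    for name in sorted(os.listdir(source_dir)):     # step D: the source files
        with open(os.path.join(source_dir, name), encoding="utf-8") as fh:
            doc = Document()                        # step E: one Document per file
            doc.add("path", name)
            doc.add("content", fh.read())
        writer.add_document(doc)
    writer.close()                                  # step F
    return writer.path


# Usage with two hypothetical source files in temporary directories.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as idx:
    for name, text in [("a.txt", "real-time search"), ("b.txt", "batch search job")]:
        with open(os.path.join(src, name), "w", encoding="utf-8") as fh:
            fh.write(text)
    with open(build_index(src, idx), encoding="utf-8") as fh:
        index = json.load(fh)
```

After the run, `index["postings"]` maps each term to the ids of the documents that contain it, which is the structure the search steps consult.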
Once the index has been created, the user can search the index files by entering a query condition. Retrieval comprises the following steps:
A. Create the index-reading object IndexReader;
B. Create the search object IndexSearcher;
C. Create the lexical analysis object Analyzer;
D. Create the syntax analysis object QueryParser;
E. The QueryParser calls parse to perform syntax analysis, generates the query syntax tree, and places it in a Query;
F. The IndexSearcher calls its search method to search with the query syntax tree Query and obtains the result set TopDocs;
G. Obtain the corresponding ScoreDoc from the TopDocs;
H. Obtain the corresponding Document from the ScoreDoc;
I. Obtain the corresponding Field attributes from the Document.
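The retrieval flow above (analyze the query, look it up in the index, rank, then walk from the result set back to documents and fields) can be sketched in Python. This is an illustration under stated assumptions, not Lucene code: `analyze`, `build_postings`, and `search` are hypothetical helpers, the query is treated as a bag of terms rather than a syntax tree, and the score is a plain sum of term frequencies rather than Lucene's scoring.

```python
import re
from collections import Counter

# Hypothetical documents; "title" and "content" play the role of fields.
DOCS = [
    {"title": "intro", "content": "real-time search over streaming data"},
    {"title": "batch", "content": "batch processing of stored data"},
    {"title": "index", "content": "search index for real-time data search"},
]


def analyze(text):
    """Lexical analysis: lowercase and split into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_postings(docs):
    """term -> {doc_id: term frequency}."""
    postings = {}
    for doc_id, doc in enumerate(docs):
        for term, freq in Counter(analyze(doc["content"])).items():
            postings.setdefault(term, {})[doc_id] = freq
    return postings


def search(postings, query_text, n=10):
    """Return up to n (doc_id, score) pairs, best first."""
    scores = Counter()
    for term in analyze(query_text):          # steps C to E, simplified
        for doc_id, freq in postings.get(term, {}).items():
            scores[doc_id] += freq            # step F: accumulate a score
    return scores.most_common(n)              # plays the role of the result set


postings = build_postings(DOCS)
top_docs = search(postings, "real-time search")
# Steps G to I: from each scored hit to its document, then to a field.
titles = [DOCS[doc_id]["title"] for doc_id, _score in top_docs]
```

Documents that contain none of the query terms never enter `scores`, so they are absent from the result.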
The Lucene system consists of seven package modules: analysis, document, index, QueryParser, search, store, and util. The packages work together, and each has its own specific function:
the analysis module is mainly responsible for language processing and lexical analysis. It includes the analyzers that ship with Lucene by default, such as StopAnalyzer, which filters out stop words, the commonly used StandardAnalyzer, and WhitespaceAnalyzer, which splits text on whitespace characters;
the document module mainly manages document structure, which is comparable to the table structure of a relational database. A document, which is similar to a row in a relational table, can contain multiple information fields (Field);
the index module is mainly responsible for index management, including creating, deleting, reading and writing, merging, and optimizing indexes;
the store module is mainly responsible for reading, writing, and storing the index;
QueryParser is mainly responsible for syntax analysis and is used to parse and execute query statements;
the search module is mainly responsible for searching: it finds the result set in the index files according to the query conditions;
the util module is the utility package, a collection of common utility classes and methods.
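The difference between a whitespace analyzer and a stop-word analyzer can be shown with a small Python sketch. The two functions are hypothetical and only loosely modeled on the behavior described for WhitespaceAnalyzer and StopAnalyzer; the stop-word list is an assumption, not Lucene's list.

```python
import re

# Assumed stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def whitespace_analyze(text):
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()


def stop_analyze(text):
    """Lowercase, keep runs of letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


sample = "The Index of the Cluster"
whitespace_tokens = whitespace_analyze(sample)
stop_tokens = stop_analyze(sample)
```

The same analyzer has to be applied at index time and at query time, otherwise query terms may not match the indexed terms.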
In a further optimized embodiment, the ROSE search engine system provides a standard HTTP interface for adding, deleting, modifying, and querying the index of the data. In ROSE, the user starts indexing and searching by sending HTTP requests to the ROSE web application deployed in a servlet container. ROSE accepts the request, determines the appropriate ROSE RequestHandler to use, and then processes the request. The response is returned over HTTP in the same way. The default configuration returns ROSE's standard XML response, and alternative response formats can also be configured.
Four different index requests can be sent to the ROSE index servlet:
add/update adds documents to ROSE or updates existing ones. The additions and updates cannot be searched until they are committed.
commit tells ROSE to make all changes since the last commit searchable.
optimize restructures the Lucene files to improve search performance. It is usually best to optimize after indexing has finished. If updates are frequent, optimization should be scheduled for periods of low usage. An index also runs normally without being optimized. Optimization is a time-consuming process.
delete can be specified by id or by query. Delete by id removes the document with the specified id; delete by query removes all documents returned by the query.
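The visibility rule for these requests (changes become searchable only after a commit) can be sketched in Python. `ToyIndex` is a hypothetical in-memory class, not the ROSE servlet; optimize is left out because it affects only the on-disk layout, and a "query" here is a single word matched against whitespace-separated terms.

```python
class ToyIndex:
    """Buffers add/delete requests and applies them only on commit."""

    def __init__(self):
        self.visible = {}   # doc_id -> text, what searches can see
        self.pending = []   # buffered operations since the last commit

    def add(self, doc_id, text):
        """add/update: a later add with the same id replaces the document."""
        self.pending.append(("add", doc_id, text))

    def delete_by_id(self, doc_id):
        self.pending.append(("delete_id", doc_id, None))

    def delete_by_query(self, word):
        self.pending.append(("delete_query", word, None))

    def commit(self):
        """Apply buffered operations in order, making them searchable."""
        for op, key, text in self.pending:
            if op == "add":
                self.visible[key] = text
            elif op == "delete_id":
                self.visible.pop(key, None)
            else:  # delete_query: remove every document the query returns
                matched = [d for d, t in self.visible.items() if key in t.split()]
                for doc_id in matched:
                    del self.visible[doc_id]
        self.pending.clear()

    def search(self, word):
        return sorted(d for d, t in self.visible.items() if word in t.split())


idx = ToyIndex()
idx.add("1", "stream search")
idx.add("2", "batch job")
before_commit = idx.search("search")   # nothing is visible yet
idx.commit()
after_commit = idx.search("search")
idx.delete_by_query("batch")
idx.commit()
remaining = sorted(idx.visible)
```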
To add a document to the index, it is only necessary to call the search interface and submit an XML message by HTTP POST.
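A minimal Python sketch of preparing such a request follows. The patent does not give the message format, so the `<add><doc><field name="...">` layout is an assumption borrowed from Solr-style update messages, and the URL `http://localhost:8080/rose/update` is hypothetical. The request object is only constructed here, not sent.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_message(fields):
    """Serialize one document as an XML 'add' message (assumed layout)."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="utf-8")  # bytes, ready to be posted


def build_post_request(url, body):
    """Prepare (but do not send) an HTTP POST carrying the XML body."""
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


body = build_add_message({"id": "doc-1", "title": "ROSE overview"})
# Hypothetical endpoint; pass the request to urllib.request.urlopen to send it.
request = build_post_request("http://localhost:8080/rose/update", body)
```

Under the semantics described above, a separate commit request would still be needed before the added document becomes searchable.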
In a further optimized embodiment, the ROSE search engine system can quickly set up a cluster through Zookeeper. It performs a hash operation on the ID value of the current index record, then uses the cluster state maintained on the server to find which Range the hash value falls in, and so locates the corresponding shard. The Leader builds the index within that shard; once the Leader node has finished updating, the version number and the document are forwarded to the replica nodes belonging to the same Shard.
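The routing and replication just described can be sketched in Python. The sketch makes several assumptions that the patent does not state: CRC32 is used as the hash function, the 32-bit hash space is split into equal ranges, and cluster state is a plain in-memory list rather than something maintained through Zookeeper. `Shard`, `make_shards`, and `route` are hypothetical names.

```python
import zlib


class Shard:
    """One hash range with a leader and its replicas (all in memory)."""

    def __init__(self, name, lo, hi, replica_count=2):
        self.name, self.lo, self.hi = name, lo, hi
        self.leader = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.version = 0

    def index(self, doc_id, doc):
        self.version += 1
        self.leader[doc_id] = (self.version, doc)   # the leader updates first
        for replica in self.replicas:               # then version and document
            replica[doc_id] = (self.version, doc)   # are forwarded to replicas
        return self.version


def hash_id(doc_id):
    """Hash the record ID into the range 0 .. 2**32 - 1 (CRC32 is an assumption)."""
    return zlib.crc32(doc_id.encode("utf-8"))


def make_shards(count):
    """Split the 32-bit hash space into `count` contiguous ranges."""
    step = 2**32 // count
    shards = []
    for i in range(count):
        hi = (i + 1) * step - 1 if i < count - 1 else 2**32 - 1
        shards.append(Shard(f"shard{i + 1}", i * step, hi))
    return shards


def route(shards, doc_id):
    """Find the shard whose range contains the hash of the ID."""
    h = hash_id(doc_id)
    for shard in shards:
        if shard.lo <= h <= shard.hi:
            return shard
    raise ValueError("hash value outside every range")


shards = make_shards(4)
target = route(shards, "doc-42")
version = target.index("doc-42", {"title": "hello"})
```

Because routing depends only on the ID, the same record always lands on the same shard, so later updates and deletes reach the copy that holds it.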
The present invention is equally applicable to systems already in production: only minor changes to the application's source code are required, and for deployment one index server or an index server cluster is added according to the volume of historical data.
The main flow of an application built on the ROSE search engine system includes:
(1) The user sends an add request through the client and submits the corresponding document;
(2) The server-side application receives the document submitted by the client, stores the file in the file system, and updates the related records in the database;
(3) The index server calls the analysis and processing application to analyze and process the data submitted by the user, and indexes the processed data;
(4) The user sends a Query, Update, or Delete request through the client;
(5) After receiving the client's request, the server-side application queries the index server directly, and the index server directly returns the requested data or performs the update or delete operation.
The advantages of the present invention are:
1) Full-text search of real-time streaming data and distributed computing
ROSE is implemented mainly on HTTP and Apache Lucene and supports full-text search over real-time streaming data. ROSE is a standalone enterprise-level search application server. In principle, documents are added to a search collection as XML over HTTP, and the collection is queried over HTTP as well, with an XML/JSON response returned. Its key features include: efficient and flexible caching; vertical search; highlighted search results; improved availability through index replication; a powerful Data Schema for defining fields and types and configuring text analysis; and a web-based administration interface.
The core idea of the distributed computing function is that ROSE completes computing tasks jointly through a distributed system, making full use of the cluster's high-speed computation and storage. It is highly fault-tolerant and designed to be deployed on inexpensive (low-cost) hardware. It also provides high throughput for accessing application data, which suits applications with very large data sets.
2) An extensible distributed computing architecture
ROSE can quickly set up a cluster through Zookeeper and provides a simple sharding algorithm: it performs a hash operation on the ID value of the current index record, then uses the cluster state maintained on the server to find which Range the hash value falls in, and so locates the corresponding shard. The Leader builds the index within that shard; once the Leader node has finished updating, the version number and the document are forwarded to the replica nodes belonging to the same Shard. The architecture can therefore be deployed dynamically: hardware can be added so that it works concurrently, and multiple servers can be configured to manage the data at the same time.
3) An extensible plug-in system
To process real-time big data as a stream, fast data access and fast return of result sets are problems that must be considered. ROSE's full-text search, implemented on HTTP and Apache Lucene, can be extended with other plug-ins to provide specific functions. For example, tokenizers such as IKAnalyzer, mmseg4j, and paoding provide Chinese word segmentation, and solr_pager can be integrated to paginate search results. This lets data be processed and analyzed more quickly.
Claims (5)
1. An implementation method for a real-time big data search engine, characterized by comprising the following steps:
1) Build the ROSE search engine system based on HTTP and Apache Lucene;
2) Create the index of the ROSE search engine system: extract information from documents of various formats and from database data, select a text analyzer according to the file type to analyze the text, create the index, and generate the index database;
3) Once the index has been created, the user can retrieve file information by entering a query condition. When the user enters a query condition, the text is first analyzed, the index database is then searched, and the results obtained are finally returned to the user.
2. The implementation method for a real-time big data search engine according to claim 1, characterized in that creating the index in step 2) comprises the following steps:
A. Specify the directory in which the index is to be created;
B. Create a Directory object;
C. Create the index-writing object IndexWriter;
D. Obtain the File array of source files to determine the content to index;
E. Write each file into the index in a loop: first create a Document object and Field objects, which correspond respectively to a row in a database table and the column attributes of that row; then add the Fields to the Document; finally have the IndexWriter call addDocument to write the document's index into the index database;
F. Close the index-writing object IndexWriter.
3. The implementation method for a real-time big data search engine according to claim 1, characterized in that the retrieval in step 2) comprises the following steps:
A. Create the index-reading object IndexReader;
B. Create the search object IndexSearcher;
C. Create the lexical analysis object Analyzer;
D. Create the syntax analysis object QueryParser;
E. The QueryParser calls parse to perform syntax analysis, generates the query syntax tree, and places it in a Query;
F. The IndexSearcher calls its search method to search with the query syntax tree Query and obtains the result set TopDocs;
G. Obtain the corresponding ScoreDoc from the TopDocs;
H. Obtain the corresponding Document from the ScoreDoc;
I. Obtain the corresponding Field attributes from the Document.
4. The implementation method for a real-time big data search engine according to claim 1, characterized in that the ROSE search engine system provides a standard HTTP interface for adding, deleting, modifying, and querying the index of the data.
5. The implementation method for a real-time big data search engine according to claim 1, characterized in that the ROSE search engine system can quickly set up a cluster through Zookeeper; it performs a hash operation on the ID value of the current index record, then uses the cluster state maintained on the server to find which Range the hash value falls in, and so locates the corresponding shard; the Leader builds the index within that shard, and once the Leader node has finished updating, the version number and the document are forwarded to the replica nodes belonging to the same Shard.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610640922.8A CN106294695A (en) | 2016-08-08 | 2016-08-08 | An implementation method for a real-time big data search engine |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610640922.8A CN106294695A (en) | 2016-08-08 | 2016-08-08 | An implementation method for a real-time big data search engine |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106294695A true CN106294695A (en) | 2017-01-04 |
Family
ID=57666019
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610640922.8A Pending CN106294695A (en) | 2016-08-08 | 2016-08-08 | An implementation method for a real-time big data search engine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106294695A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106933999A (en) * | 2017-03-01 | 2017-07-07 | 湖南蚁坊软件股份有限公司 | A kind of ApacheLucene highlighted methods of scoring of independent search |
CN107463692A (en) * | 2017-08-11 | 2017-12-12 | 山东合天智汇信息技术有限公司 | Super large text data is synchronized to the method and system of search engine |
CN108228743A (en) * | 2017-12-18 | 2018-06-29 | 深圳供电局有限公司 | A kind of real-time big data search engine system |
CN109635275A (en) * | 2018-11-06 | 2019-04-16 | 交控科技股份有限公司 | Literature content retrieval and recognition methods and device |
CN109800412A (en) * | 2018-12-10 | 2019-05-24 | 鲁东大学 | A kind of Chinese word segmentation and big data information retrieval method and device |
CN111190929A (en) * | 2019-12-27 | 2020-05-22 | 四川师范大学 | Data storage query method and device, electronic equipment and storage medium |
CN111209462A (en) * | 2020-01-02 | 2020-05-29 | 北京字节跳动网络技术有限公司 | Data processing method, device and equipment |
CN111291003A (en) * | 2020-01-21 | 2020-06-16 | 浙江工商大学 | Data reading method and device and electronic equipment |
CN111611222A (en) * | 2020-04-27 | 2020-09-01 | 上海鼎茂信息技术有限公司 | Data dynamic processing method based on distributed storage |
CN111723261A (en) * | 2019-03-22 | 2020-09-29 | 昆明逆火科技股份有限公司 | Search engine-based DNA comparison algorithm |
CN112948533A (en) * | 2021-04-13 | 2021-06-11 | 天津禄智技术有限公司 | Text retrieval method for multiple retrieval and sequencing |
CN113886505A (en) * | 2021-09-28 | 2022-01-04 | 西安阳易信息技术有限公司 | Management system for realizing dynamic modeling based on search engine and relational database |
CN114579596A (en) * | 2022-05-06 | 2022-06-03 | 达而观数据(成都)有限公司 | Method and system for updating index data of search engine in real time |
CN116226470A (en) * | 2023-05-09 | 2023-06-06 | 南昌大学 | Management method, system, equipment and medium for ocean space-time data |
CN116719839A (en) * | 2023-08-10 | 2023-09-08 | 北京合思信息技术有限公司 | Data query method and device of accounting file and electronic equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103927342A (en) * | 2014-03-28 | 2014-07-16 | 苏州中炎工贸有限公司 | Vertical search engine system on basis of big data |
CN104199977A (en) * | 2014-09-24 | 2014-12-10 | 浪潮软件股份有限公司 | Method for searching based on data creation information in database |
CN105183884A (en) * | 2015-09-24 | 2015-12-23 | 西安未来国际信息股份有限公司 | Search engine system and method based on big data technique |
CN105701234A (en) * | 2016-02-19 | 2016-06-22 | 浪潮通用软件有限公司 | C # full-text retrieval-based implementation method |
CN105740472A (en) * | 2016-03-14 | 2016-07-06 | 中国科学院计算技术研究所 | Distributed real-time full-text search method and system |
-
2016
- 2016-08-08 CN CN201610640922.8A patent/CN106294695A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103927342A (en) * | 2014-03-28 | 2014-07-16 | 苏州中炎工贸有限公司 | Vertical search engine system on basis of big data |
CN104199977A (en) * | 2014-09-24 | 2014-12-10 | 浪潮软件股份有限公司 | Method for searching based on data creation information in database |
CN105183884A (en) * | 2015-09-24 | 2015-12-23 | 西安未来国际信息股份有限公司 | Search engine system and method based on big data technique |
CN105701234A (en) * | 2016-02-19 | 2016-06-22 | 浪潮通用软件有限公司 | C # full-text retrieval-based implementation method |
CN105740472A (en) * | 2016-03-14 | 2016-07-06 | 中国科学院计算技术研究所 | Distributed real-time full-text search method and system |
Non-Patent Citations (4)
Title |
---|
于天恩: "《Lucene搜索引擎开发权威经典》", 31 October 2008 * |
柴洁: ""基于IKAnalyzer 和Lucene 的地理编码中文搜索引擎的研究与实现"", 《城市勘测》 * |
梁丽雯: ""全文检索实现"", 《软件服务》 * |
蔡学锋: ""基于Solr的搜索引擎核心技术研究与应用"", 《中国优秀硕士学位论文全文数据库》 * |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106933999A (en) * | 2017-03-01 | 2017-07-07 | 湖南蚁坊软件股份有限公司 | A kind of ApacheLucene highlighted methods of scoring of independent search |
CN106933999B (en) * | 2017-03-01 | 2020-05-08 | 湖南蚁坊软件股份有限公司 | Apache lucene score highlighting method for independent search |
CN107463692A (en) * | 2017-08-11 | 2017-12-12 | 山东合天智汇信息技术有限公司 | Super large text data is synchronized to the method and system of search engine |
CN107463692B (en) * | 2017-08-11 | 2019-10-18 | 山东合天智汇信息技术有限公司 | Super large text data is synchronized to the method and system of search engine |
CN108228743A (en) * | 2017-12-18 | 2018-06-29 | 深圳供电局有限公司 | A kind of real-time big data search engine system |
CN109635275A (en) * | 2018-11-06 | 2019-04-16 | 交控科技股份有限公司 | Literature content retrieval and recognition methods and device |
CN109800412A (en) * | 2018-12-10 | 2019-05-24 | 鲁东大学 | A kind of Chinese word segmentation and big data information retrieval method and device |
CN111723261B (en) * | 2019-03-22 | 2021-08-13 | 昆明逆火科技股份有限公司 | Search engine-based DNA comparison algorithm |
CN111723261A (en) * | 2019-03-22 | 2020-09-29 | 昆明逆火科技股份有限公司 | Search engine-based DNA comparison algorithm |
CN111190929A (en) * | 2019-12-27 | 2020-05-22 | 四川师范大学 | Data storage query method and device, electronic equipment and storage medium |
CN111190929B (en) * | 2019-12-27 | 2023-07-14 | 四川师范大学 | Data storage query method and device, electronic equipment and storage medium |
CN111209462A (en) * | 2020-01-02 | 2020-05-29 | 北京字节跳动网络技术有限公司 | Data processing method, device and equipment |
CN111291003B (en) * | 2020-01-21 | 2021-01-05 | 浙江工商大学 | Data reading method and device and electronic equipment |
CN111291003A (en) * | 2020-01-21 | 2020-06-16 | 浙江工商大学 | Data reading method and device and electronic equipment |
CN111611222A (en) * | 2020-04-27 | 2020-09-01 | 上海鼎茂信息技术有限公司 | Data dynamic processing method based on distributed storage |
CN112948533A (en) * | 2021-04-13 | 2021-06-11 | 天津禄智技术有限公司 | Text retrieval method for multiple retrieval and sequencing |
CN113886505A (en) * | 2021-09-28 | 2022-01-04 | 西安阳易信息技术有限公司 | Management system for realizing dynamic modeling based on search engine and relational database |
CN113886505B (en) * | 2021-09-28 | 2024-04-30 | 西安阳易信息技术有限公司 | Management system for realizing dynamic modeling based on search engine and relational database |
CN114579596A (en) * | 2022-05-06 | 2022-06-03 | 达而观数据(成都)有限公司 | Method and system for updating index data of search engine in real time |
CN114579596B (en) * | 2022-05-06 | 2022-09-06 | 达而观数据(成都)有限公司 | Method and system for updating index data of search engine in real time |
CN116226470A (en) * | 2023-05-09 | 2023-06-06 | 南昌大学 | Management method, system, equipment and medium for ocean space-time data |
CN116226470B (en) * | 2023-05-09 | 2023-07-28 | 南昌大学 | Management method, system, equipment and medium for ocean space-time data |
CN116719839A (en) * | 2023-08-10 | 2023-09-08 | 北京合思信息技术有限公司 | Data query method and device of accounting file and electronic equipment |
CN116719839B (en) * | 2023-08-10 | 2024-01-26 | 北京合思信息技术有限公司 | Data query method and device of accounting file and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106294695A (en) | An implementation method for a real-time big data search engine | |
CN110291517B (en) | Query language interoperability in graph databases | |
CN111259006B (en) | Universal distributed heterogeneous data integrated physical aggregation, organization, release and service method and system | |
CN107402995B (en) | Distributed newSQL database system and method | |
US11468103B2 (en) | Relational modeler and renderer for non-relational data | |
CN109101652B (en) | Label creating and managing system | |
CN110032604B (en) | Data storage device, translation device and database access method | |
US7337163B1 (en) | Multidimensional database query splitting | |
JP3842573B2 (en) | Structured document search method, structured document management apparatus and program | |
EP2874077B1 (en) | Stateless database cache | |
US8924373B2 (en) | Query plans with parameter markers in place of object identifiers | |
US20220083618A1 (en) | Method And System For Scalable Search Using MicroService And Cloud Based Search With Records Indexes | |
CN108228743A (en) | A kind of real-time big data search engine system | |
US9734176B2 (en) | Index merge ordering | |
CN114461603A (en) | Multi-source heterogeneous data fusion method and device | |
US20110184956A1 (en) | Accessing digitally published content using re-indexing of search results | |
US10949409B2 (en) | On-demand, dynamic and optimized indexing in natural language processing | |
US10776368B1 (en) | Deriving cardinality values from approximate quantile summaries | |
KR101955376B1 (en) | Processing method for a relational query in distributed stream processing engine based on shared-nothing architecture, recording medium and device for performing the method | |
US8200673B2 (en) | System and method for on-demand indexing | |
CN114443599A (en) | Data synchronization method and device, electronic equipment and storage medium | |
US20050060307A1 (en) | System, method, and service for datatype caching, resolving, and escalating an SQL template with references | |
CN107291875B (en) | Metadata organization management method and system based on metadata graph | |
US11556525B2 (en) | Hybrid online analytical processing (OLAP) and relational query processing | |
CN115185973A (en) | Data resource sharing method, platform, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170104 |