CN106294695A - Implementation method for a real-time big data search engine - Google Patents

Implementation method for a real-time big data search engine

Info

Publication number
CN106294695A
Authority
CN
China
Prior art keywords
index
search engine
data
document
rose
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610640922.8A
Other languages
Chinese (zh)
Inventor
张剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Net Peace Computer Security Detection Technique Co Ltd Of Shenzhen
Original Assignee
Net Peace Computer Security Detection Technique Co Ltd Of Shenzhen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Net Peace Computer Security Detection Technique Co Ltd Of Shenzhen
Priority to CN201610640922.8A
Publication of CN106294695A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/93Document management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an implementation method for a real-time big data search engine and relates to the field of search engine technology. A ROSE search engine system is built on HTTP and Apache Lucene. An index is created for the ROSE search engine system; once the index has been created, a user can retrieve file information by entering a query condition. When the user enters a query condition, text analysis is performed first, the index database is then searched, and the results are finally returned to the user. The method provides full-text search over real-time streaming data and completes computing tasks jointly with a distributed system, making full use of the high-speed computation and storage of a cluster and improving the response speed of data analysis and processing.

Description

Implementation method for a real-time big data search engine
Technical field
The present invention relates to the field of search engine technology, and in particular to an implementation method for a real-time big data search engine.
Background art
Many web applications involve the analysis and processing of massive data. Formatted data is generally stored in a database, unformatted data is stored as documents, or the two forms of storage are mixed. When a database or file system faces data volumes at the TB level or larger, analysis and processing become very slow, and the response speed cannot meet user demand.
Traditional network application architectures mainly follow the C/S model (or B/S), where S refers to the server, B to the browser, and C to the client. The two differ only in whether the main business logic is placed on the client or on the server. As shown in Fig. 1, taking the C/S model as an example, the client interacts with the user through a UI, and the resulting data is generally submitted over the network to the server for business processing. The processed business data is stored in a database or file system for later use, such as data query, statistics, and data mining. With big data (usually meaning TB-level data volumes), the analysis and processing bottlenecks of this architecture lie mainly in the I/O of the database and file system, in memory, and in CPU processing capacity, which can make the system respond too slowly or not at all. Such a system is also not scalable: adding storage and computing resources does not improve its performance.
Apache Hadoop is a distributed computing system: a software framework implemented in Java that runs distributed computation over massive data on clusters made up of large numbers of computers. It lets applications scale to thousands of nodes and PB-level data. It mainly addresses the data volume problem and has advantages in storing large volumes of data and in simple computation over them. It is suited to batch processing of massive data files; it is not suited to scenarios with high real-time requirements, or to scenarios in which users frequently operate on and modify data.
Summary of the invention
The technical problem to be solved by the present invention is to provide an implementation method for a real-time big data search engine. The method provides full-text search over real-time streaming data and completes computing tasks jointly with a distributed system, making full use of the high-speed computation and storage of a cluster and improving the response speed of data analysis and processing. It achieves scalable analysis and processing of real-time big data: data produced by the system does not need to be stored first, and can be processed in real time and reflected directly in the response.
To solve the above technical problem, the technical solution adopted by the present invention is an implementation method for a real-time big data search engine, comprising the following steps:
1) building a ROSE search engine system based on HTTP and Apache Lucene;
2) creating the index of the ROSE search engine system: extracting information from documents of various formats and from database data, selecting a text analyzer according to the file type to perform text analysis, creating the index, and generating an index database;
3) after the index has been created, the user can retrieve file information by entering a query condition; when the user enters a query condition, text analysis is performed first, the index database is then searched, and the results are finally returned to the user.
In a further optimized technical solution, the step of creating the index in step 2) comprises the following steps:
A. specify the directory in which the index is created;
B. create a Directory object;
C. create the index-writing object IndexWriter;
D. obtain the array of source File objects to determine the content to be indexed;
E. write each file to the index in a loop: first create a Document object and Field objects, which correspond respectively to a row in a database table and the column attributes of that row; then add the Fields to the Document; finally have the IndexWriter call addDocument to write the document into the index database;
F. close the index-writing object IndexWriter.
In a further optimized technical solution, the retrieval step in step 2) comprises the following steps:
A. create the index-reading object IndexReader;
B. create the search object IndexSearcher;
C. create the lexical analysis object Analyzer;
D. create the syntax analysis object QueryParser;
E. the QueryParser calls its parse method to perform syntax analysis and generates a query syntax tree, which is placed in a Query;
F. the IndexSearcher calls its search method to search with the query syntax tree Query and obtains the result set TopDocs;
G. obtain the corresponding ScoreDoc entries from TopDocs;
H. obtain the corresponding Document from each ScoreDoc;
I. obtain the corresponding Field attributes from the Document.
In a further optimized technical solution, the ROSE search engine system provides a standard HTTP interface for adding, deleting, modifying, and querying the index of data.
In a further optimized technical solution, the ROSE search engine system can quickly set up a cluster through Zookeeper. The ID value of the current index record is hashed, and the cluster state maintained on the server is then used to look up which Range the hash value falls in, which identifies the corresponding shard. The Leader of that shard builds the index; once the Leader node has finished the update, the version number and the document are forwarded to the replica nodes belonging to the same shard.
The beneficial effects of the above technical solution are as follows. The present invention has the following advantages:
(1) Support for full-text search of real-time streaming data
ROSE is built mainly on HTTP and Apache Lucene and provides full-text search over real-time streaming data. Fields that have just changed in a database can be queried: for example, when one or more rows are inserted into a database table, ROSE indexes the newly inserted data in real time. It also allows the latest version of any document to be looked up by its unique key without reopening the searcher.
(2) Support for analysis and processing based on real-time streaming data
ROSE supports not only full-text search of real-time streaming data but also analysis of the data found. While searching for keywords, ROSE can group and count the results by Facet fields. This does not modify the query results; it only adds per-category counts to them, which the user can then use to refine the query.
(3) Extensible plug-in system for full-text search
ROSE can provide specific functions by integrating plug-ins, including Chinese word segmentation for full-text search with IKAnalyzer or mmseg4j, and paging of search results by integrating solr_pager. The extensible plug-in system makes ROSE faster and more convenient to use.
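The facet counting described under advantage (2) can be illustrated with a minimal Python sketch. It is not ROSE or Lucene code: the sample documents, the field names, and the function search_with_facets are assumptions made only for illustration. The point it shows is that the hits are returned unchanged and the per-category counts are added alongside them.

```python
from collections import Counter

# Hypothetical sample documents; "category" plays the role of a facet field.
DOCS = [
    {"id": 1, "title": "lucene index tuning", "category": "search"},
    {"id": 2, "title": "hadoop batch jobs", "category": "batch"},
    {"id": 3, "title": "lucene query syntax", "category": "search"},
    {"id": 4, "title": "lucene on hadoop", "category": "batch"},
]

def search_with_facets(docs, keyword, facet_field):
    """Return the matching documents unchanged, plus counts per facet value."""
    hits = [d for d in docs if keyword in d["title"].split()]
    counts = Counter(d[facet_field] for d in hits)
    return hits, dict(counts)

hits, facets = search_with_facets(DOCS, "lucene", "category")
```

A user could then narrow the search by filtering the hits on one of the facet values returned in facets.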
Brief description of the drawings
Fig. 1 is a diagram of a traditional network application system architecture;
Fig. 2 is a structural diagram of the ROSE search engine system of the present invention;
Fig. 3 is an architecture diagram of the ROSE search engine system of the present invention;
Fig. 4 is a diagram of the index creation and search process of the present invention.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention, without creative effort, fall within the scope of protection of the present invention.
As shown in Fig. 2, the present invention discloses an implementation method for a real-time big data search engine, comprising the following steps:
1) building the ROSE (Real-time OceanData Search Engine) search engine system based on HTTP and Apache Lucene;
2) creating the index of the ROSE search engine system: extracting information from documents of various formats and from database data, selecting a text analyzer according to the file type to perform text analysis, creating the index, and generating an index database;
3) after the index has been created, the user can retrieve file information by entering a query condition; when the user enters a query condition, text analysis is performed first, the index database is then searched, and the results are finally returned to the user.
The step of creating the index comprises the following steps (see Fig. 3 and Fig. 4):
A. specify the directory in which the index is created;
B. create a Directory object;
C. create the index-writing object IndexWriter;
D. obtain the array of source File objects to determine the content to be indexed;
E. write each file to the index in a loop: first create a Document object and Field objects, which correspond respectively to a row in a database table and the column attributes of that row; then add the Fields to the Document; finally have the IndexWriter call addDocument to write the document into the index database;
F. close the index-writing object IndexWriter.
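The index-creation steps above can be sketched in plain Python. This is a toy in-memory inverted index, not the Lucene API: the classes Document and IndexWriter below only borrow the Lucene names, and the method add_document, the function build_index, and the field names "filename" and "contents" are assumptions for illustration. Steps A and B (choosing an index directory and creating a Directory object) are omitted because the toy index lives in memory.

```python
import os
import tempfile
from collections import defaultdict

class Document:
    """A bag of named fields, analogous to one row of a database table."""
    def __init__(self):
        self.fields = {}

    def add(self, name, value):
        self.fields[name] = value

class IndexWriter:
    """Toy writer: keeps an in-memory inverted index (term -> set of doc ids)."""
    def __init__(self):
        self.postings = defaultdict(set)
        self.stored = []
        self.closed = False

    def add_document(self, doc):
        if self.closed:
            raise RuntimeError("writer is closed")
        doc_id = len(self.stored)
        self.stored.append(doc)
        for value in doc.fields.values():
            for term in value.lower().split():
                self.postings[term].add(doc_id)
        return doc_id

    def close(self):
        self.closed = True

def build_index(source_dir):
    writer = IndexWriter()                          # step C
    for name in sorted(os.listdir(source_dir)):     # step D: source files
        with open(os.path.join(source_dir, name), encoding="utf-8") as fh:
            doc = Document()                        # step E: one Document per file
            doc.add("filename", name)
            doc.add("contents", fh.read())
        writer.add_document(doc)
    writer.close()                                  # step F
    return writer

# Usage: index two small files written to a temporary directory.
with tempfile.TemporaryDirectory() as src:
    for fname, text in [("a.txt", "real time search"), ("b.txt", "batch search")]:
        with open(os.path.join(src, fname), "w", encoding="utf-8") as fh:
            fh.write(text)
    index = build_index(src)
```

A real Lucene index additionally persists segments to the directory chosen in steps A and B and applies the analyzer selected for each file type.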
After the index has been created, the user can search the index files by entering a query condition. Retrieval comprises the following steps:
A. create the index-reading object IndexReader;
B. create the search object IndexSearcher;
C. create the lexical analysis object Analyzer;
D. create the syntax analysis object QueryParser;
E. the QueryParser calls its parse method to perform syntax analysis and generates a query syntax tree, which is placed in a Query;
F. the IndexSearcher calls its search method to search with the query syntax tree Query and obtains the result set TopDocs;
G. obtain the corresponding ScoreDoc entries from TopDocs;
H. obtain the corresponding Document from each ScoreDoc;
I. obtain the corresponding Field attributes from the Document.
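The retrieval steps above follow an analyze, parse, search, fetch pipeline. The Python sketch below mirrors that pipeline on a toy index; it is not the Lucene API. The sample documents, the function names, and the scoring rule (summed term frequency over terms that must all match) are assumptions chosen for brevity, whereas Lucene uses its own query syntax and similarity model.

```python
from collections import Counter, defaultdict

# Hypothetical stored documents: doc id -> fields.
STORED = {
    0: {"title": "real-time search", "body": "search streaming data in real time"},
    1: {"title": "batch jobs", "body": "hadoop batch processing of data"},
    2: {"title": "search basics", "body": "index then search"},
}

def analyze(text):
    """Lexical analysis (step C): lowercase and split on whitespace."""
    return text.lower().split()

def build_postings(stored):
    """term -> Counter(doc id -> term frequency); stands in for the index reader."""
    postings = defaultdict(Counter)
    for doc_id, fields in stored.items():
        for term in analyze(fields["body"]):
            postings[term][doc_id] += 1
    return postings

def parse_query(query_string):
    """Syntax analysis (steps D-E): here a query is just a list of required terms."""
    return analyze(query_string)

def search(postings, query_terms, n):
    """Step F: require every term, score by summed term frequency, keep the top n."""
    if not query_terms:
        return []
    candidates = set(postings.get(query_terms[0], {}))
    for term in query_terms[1:]:
        candidates &= set(postings.get(term, {}))
    scored = [(sum(postings[t][d] for t in query_terms), d) for d in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:n]  # (score, doc id) pairs, playing the role of ScoreDoc entries

postings = build_postings(STORED)
top_docs = search(postings, parse_query("search data"), n=10)
titles = [STORED[doc_id]["title"] for _, doc_id in top_docs]  # steps G-I
```

The last line corresponds to steps G to I: each scored entry yields a document id, the id yields the stored document, and the document yields the field of interest.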
The Lucene system is made up of seven package modules: analysis, document, index, QueryParser, search, store, and util. The packages work together, and each has its own specific function. The analysis module is mainly responsible for language processing and lexical analysis; it includes the tokenizers that ship with Lucene by default, such as StopAnalyzer, which filters out stop words, the commonly used StandardAnalyzer, and WhitespaceAnalyzer, which tokenizes on whitespace. The document module manages document structure, comparable to the table structure of a relational database; a document can contain multiple information fields (Field), much as a row in a relational table contains columns. The index module is mainly responsible for index management, including creating, deleting, reading and writing, merging, and optimizing indexes. The store module is mainly responsible for reading, writing, and storing the index. QueryParser is mainly responsible for syntax analysis, parsing and executing query statements. The search module is mainly responsible for searching: it finds the result set in the index files according to the query conditions. The util module is a toolkit, a collection of common utility classes and methods.
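To make the role of the analysis module concrete, the following Python sketch shows two toy analyzers and a lookup that picks one by file type, as step 2) of the method describes. The functions are loosely modelled on the behaviour of WhitespaceAnalyzer and StopAnalyzer but are not Lucene code; the stop-word list, the extension mapping, and the function names are all assumptions for illustration.

```python
import re

# Illustrative stop-word list only; real analyzers ship their own lists.
STOP_WORDS = {"a", "an", "the", "of", "and", "to", "in"}

def whitespace_analyzer(text):
    """Split on whitespace and keep tokens exactly as written."""
    return text.split()

def stop_analyzer(text):
    """Lowercase, split on non-letters, and drop stop words."""
    tokens = re.split(r"[^a-z]+", text.lower())
    return [t for t in tokens if t and t not in STOP_WORDS]

# Hypothetical mapping from file extension to analyzer.
ANALYZERS_BY_EXTENSION = {".log": whitespace_analyzer, ".txt": stop_analyzer}

def analyze_file(filename, text):
    """Choose an analyzer from the file type, falling back to whitespace splitting."""
    ext = filename[filename.rfind("."):] if "." in filename else ""
    analyzer = ANALYZERS_BY_EXTENSION.get(ext, whitespace_analyzer)
    return analyzer(text)

prose_tokens = analyze_file("notes.txt", "The index of the Engine")
log_tokens = analyze_file("app.log", "GET /search 200")
```

The same analyzer must be used at index time and at query time, otherwise query terms will not match the indexed terms.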
In a further optimized embodiment, the ROSE search engine system provides a standard HTTP interface for adding, deleting, modifying, and querying the index of data. In ROSE, the user starts indexing and searching by sending HTTP requests to the ROSE web application deployed in a servlet container. ROSE accepts the request, determines the appropriate ROSE RequestHandler, and then processes the request. The response is returned over HTTP in the same way. The default configuration returns ROSE's standard XML response, and an alternative response format can also be configured.
Four different index requests can be sent to the ROSE index servlet:
add/update adds documents to ROSE or updates existing ones; the additions and updates can be searched only after a commit.
commit tells ROSE to make all changes since the last commit searchable.
optimize restructures the Lucene files to improve search performance. It is usually best to run it after indexing has finished. If updates are frequent, optimization should be scheduled for periods of low usage. An index also works normally without being optimized; optimization is a time-consuming process.
delete can be specified by id or by query: delete by id removes the document with the specified id; delete by query removes all documents returned by the query.
To add a document to the index, it is only necessary to call the search interface and submit an XML message via HTTP POST.
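The add/update, commit, and delete semantics described above can be sketched with a small in-memory model. This is not the ROSE HTTP interface: the class ToyIndex and its method names are assumptions for illustration, and the optimize request is left out because it only affects on-disk layout. The sketch shows the one behaviour the text stresses, namely that changes become searchable only after a commit.

```python
class ToyIndex:
    """Buffers changes and makes them searchable only on commit."""

    def __init__(self):
        self._visible = {}   # doc id -> document, what searches can see
        self._pending = []   # operations buffered since the last commit

    def add(self, doc_id, doc):
        """Add a document, or update it if the id already exists."""
        self._pending.append(("add", doc_id, doc))

    def delete_by_id(self, doc_id):
        self._pending.append(("delete", doc_id, None))

    def delete_by_query(self, predicate):
        """predicate is a function document -> bool standing in for a query."""
        self._pending.append(("delete_query", None, predicate))

    def commit(self):
        """Apply buffered operations in order and make the result searchable."""
        for op, doc_id, payload in self._pending:
            if op == "add":
                self._visible[doc_id] = payload
            elif op == "delete":
                self._visible.pop(doc_id, None)
            else:
                for key in [k for k, d in self._visible.items() if payload(d)]:
                    del self._visible[key]
        self._pending.clear()

    def search(self, word):
        return sorted(k for k, d in self._visible.items() if word in d["text"].split())

idx = ToyIndex()
idx.add("1", {"text": "real time search"})
idx.add("2", {"text": "batch search"})
before_commit = idx.search("search")   # nothing is visible yet
idx.commit()
after_commit = idx.search("search")
idx.delete_by_query(lambda d: "batch" in d["text"])
idx.commit()
after_delete = idx.search("search")
```

In an HTTP deployment each of these calls would correspond to one request to the index servlet, with the document or delete instruction carried in the XML body.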
In a further optimized embodiment, the ROSE search engine system can quickly set up a cluster through Zookeeper. The ID value of the current index record is hashed, and the cluster state maintained on the server is then used to look up which Range the hash value falls in, which identifies the corresponding shard. The Leader of that shard builds the index; once the Leader node has finished the update, the version number and the document are forwarded to the replica nodes belonging to the same shard.
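The routing rule in the paragraph above (hash the record ID, find the Range containing the hash, index on that shard's Leader, then forward to its replicas) can be sketched as follows. Everything concrete here is an assumption for illustration: the patent does not name the hash function, so MD5 truncated to 32 bits is used as a stand-in, and the four equal ranges, the shard names, and the Shard class are hypothetical. Zookeeper's role of holding the cluster state is replaced by module-level constants.

```python
import hashlib
from bisect import bisect_right

# Hypothetical cluster state: each shard owns the hash range that starts at
# its entry in SHARD_STARTS and ends just before the next entry.
SHARD_STARTS = [0x00000000, 0x40000000, 0x80000000, 0xC0000000]
SHARD_NAMES = ["shard1", "shard2", "shard3", "shard4"]

def hash_id(doc_id):
    """Map a record ID to a 32-bit unsigned integer (MD5 is only an assumption)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

def shard_for(doc_id):
    """Find the shard whose range contains the hash of the ID."""
    return SHARD_NAMES[bisect_right(SHARD_STARTS, hash_id(doc_id)) - 1]

class Shard:
    def __init__(self, name, replica_count):
        self.name = name
        self.version = 0
        self.leader = {}                                   # doc id -> (version, doc)
        self.replicas = [dict() for _ in range(replica_count)]

    def index(self, doc_id, doc):
        self.version += 1
        self.leader[doc_id] = (self.version, doc)          # the Leader updates first
        for replica in self.replicas:                      # then version and document
            replica[doc_id] = (self.version, doc)          # are forwarded to replicas
        return self.version

CLUSTER = {name: Shard(name, replica_count=2) for name in SHARD_NAMES}

def route_and_index(doc_id, doc):
    shard = CLUSTER[shard_for(doc_id)]
    shard.index(doc_id, doc)
    return shard.name

target = route_and_index("doc-42", {"text": "real time search"})
```

Because the shard is a pure function of the ID, any node that knows the range table can route a later update or delete for the same ID to the same shard.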
The present invention is equally applicable to systems already in production: only minor changes to the application's source code are needed, and the deployment adds one index server or an index server cluster, depending on the volume of historical data.
The main flow of a ROSE search engine system application is:
(1) the user sends an add request through the client and submits the corresponding document;
(2) the server-side application receives the document submitted by the client, stores the file in the file system, and updates the related records in the database;
(3) the index server calls the analysis application to analyze the data submitted by the user and indexes the processed data;
(4) the user sends a Query, Update, or Delete request through the client;
(5) after receiving the client's request, the server-side application queries the index server directly, and the index server returns the queried data or performs the update or delete operation.
The advantages of the present invention are:
1) Full-text search of real-time streaming data and distributed computing
ROSE is built mainly on HTTP and Apache Lucene and provides full-text search over real-time streaming data. ROSE is a standalone enterprise-grade search application server. Documents are added to a search collection as XML over HTTP, and the collection is queried over HTTP as well, with an XML/JSON response. Its main features include efficient and flexible caching, vertical search, highlighted search results, improved availability through index replication, a powerful data schema for defining fields, types, and text analysis, and a web-based administration interface.
The core idea of the distributed computing function is that ROSE completes computing tasks jointly with a distributed system, making full use of the cluster's high-speed computation and storage. The distributed system is highly fault tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and is suited to applications with large data sets.
2) Extensible distributed computing architecture
ROSE can quickly set up a cluster through Zookeeper and provides a simple sharding algorithm: the ID value of the current index record is hashed, and the cluster state maintained on the server is then used to look up which Range the hash value falls in, which identifies the corresponding shard. The Leader of that shard builds the index; once the Leader node has finished the update, the version number and the document are forwarded to the replica nodes belonging to the same shard. This architecture can therefore be deployed dynamically: hardware can be added and put to work at the same time, and multiple servers can be configured to manage the data simultaneously.
3) Extensible plug-in system
To process real-time big data as a stream, fast data access and fast return of result sets must be considered. ROSE's full-text search, built on HTTP and Apache Lucene, can be extended with other plug-ins that provide specific functions. For example, tokenizers such as IKAnalyzer, mmseg4j, and paoding provide Chinese word segmentation, and solr_pager can be integrated for paging of search results. This allows data to be processed and analyzed faster.

Claims (5)

1. An implementation method for a real-time big data search engine, characterized by comprising the following steps:
1) building a ROSE search engine system based on HTTP and Apache Lucene;
2) creating the index of the ROSE search engine system: extracting information from documents of various formats and from database data, selecting a text analyzer according to the file type to perform text analysis, creating the index, and generating an index database;
3) after the index has been created, the user can retrieve file information by entering a query condition; when the user enters a query condition, text analysis is performed first, the index database is then searched, and the results are finally returned to the user.
2. The implementation method for a real-time big data search engine according to claim 1, characterized in that the step of creating the index in step 2) comprises the following steps:
A. specify the directory in which the index is created;
B. create a Directory object;
C. create the index-writing object IndexWriter;
D. obtain the array of source File objects to determine the content to be indexed;
E. write each file to the index in a loop: first create a Document object and Field objects, which correspond respectively to a row in a database table and the column attributes of that row; then add the Fields to the Document; finally have the IndexWriter call addDocument to write the document into the index database;
F. close the index-writing object IndexWriter.
3. The implementation method for a real-time big data search engine according to claim 1, characterized in that the retrieval step in step 2) comprises the following steps:
A. create the index-reading object IndexReader;
B. create the search object IndexSearcher;
C. create the lexical analysis object Analyzer;
D. create the syntax analysis object QueryParser;
E. the QueryParser calls its parse method to perform syntax analysis and generates a query syntax tree, which is placed in a Query;
F. the IndexSearcher calls its search method to search with the query syntax tree Query and obtains the result set TopDocs;
G. obtain the corresponding ScoreDoc entries from TopDocs;
H. obtain the corresponding Document from each ScoreDoc;
I. obtain the corresponding Field attributes from the Document.
4. The implementation method for a real-time big data search engine according to claim 1, characterized in that the ROSE search engine system provides a standard HTTP interface for adding, deleting, modifying, and querying the index of data.
5. The implementation method for a real-time big data search engine according to claim 1, characterized in that the ROSE search engine system can quickly set up a cluster through Zookeeper; the id value of the current index record is hashed, and the cluster state maintained on the server is then used to look up which Range the hash value falls in, which identifies the corresponding shard; the Leader of that shard builds the index, and once the Leader node has finished the update, the version number and the document are forwarded to the replica nodes belonging to the same shard.
CN201610640922.8A 2016-08-08 2016-08-08 Implementation method for a real-time big data search engine Pending CN106294695A (en)

Publications (1)

Publication Number Publication Date
CN106294695A true CN106294695A (en) 2017-01-04

Family

ID=57666019

Country Status (1)

Country Link
CN (1) CN106294695A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103927342A (en) * 2014-03-28 2014-07-16 苏州中炎工贸有限公司 Vertical search engine system on basis of big data
CN104199977A (en) * 2014-09-24 2014-12-10 浪潮软件股份有限公司 Method for searching based on data creation information in database
CN105183884A (en) * 2015-09-24 2015-12-23 西安未来国际信息股份有限公司 Search engine system and method based on big data technique
CN105701234A (en) * 2016-02-19 2016-06-22 浪潮通用软件有限公司 C # full-text retrieval-based implementation method
CN105740472A (en) * 2016-03-14 2016-07-06 中国科学院计算技术研究所 Distributed real-time full-text search method and system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
于天恩: "《Lucene搜索引擎开发权威经典》", 31 October 2008 *
柴洁: ""基于IKAnalyzer 和Lucene 的地理编码中文搜索引擎的研究与实现"", 《城市勘测》 *
梁丽雯: ""全文检索实现"", 《软件服务》 *
蔡学锋: ""基于Solr的搜索引擎核心技术研究与应用"", 《中国优秀硕士学位论文全文数据库》 *

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106933999A (en) * 2017-03-01 2017-07-07 湖南蚁坊软件股份有限公司 A kind of ApacheLucene highlighted methods of scoring of independent search
CN106933999B (en) * 2017-03-01 2020-05-08 湖南蚁坊软件股份有限公司 Apache lucene score highlighting method for independent search
CN107463692A (en) * 2017-08-11 2017-12-12 山东合天智汇信息技术有限公司 Super large text data is synchronized to the method and system of search engine
CN107463692B (en) * 2017-08-11 2019-10-18 山东合天智汇信息技术有限公司 Super large text data is synchronized to the method and system of search engine
CN108228743A (en) * 2017-12-18 2018-06-29 深圳供电局有限公司 A kind of real-time big data search engine system
CN109635275A (en) * 2018-11-06 2019-04-16 交控科技股份有限公司 Literature content retrieval and recognition methods and device
CN109800412A (en) * 2018-12-10 2019-05-24 鲁东大学 A kind of Chinese word segmentation and big data information retrieval method and device
CN111723261B (en) * 2019-03-22 2021-08-13 昆明逆火科技股份有限公司 Search engine-based DNA comparison algorithm
CN111723261A (en) * 2019-03-22 2020-09-29 昆明逆火科技股份有限公司 Search engine-based DNA comparison algorithm
CN111190929A (en) * 2019-12-27 2020-05-22 四川师范大学 Data storage query method and device, electronic equipment and storage medium
CN111190929B (en) * 2019-12-27 2023-07-14 四川师范大学 Data storage query method and device, electronic equipment and storage medium
CN111209462A (en) * 2020-01-02 2020-05-29 北京字节跳动网络技术有限公司 Data processing method, device and equipment
CN111291003B (en) * 2020-01-21 2021-01-05 浙江工商大学 Data reading method and device and electronic equipment
CN111291003A (en) * 2020-01-21 2020-06-16 浙江工商大学 Data reading method and device and electronic equipment
CN111611222A (en) * 2020-04-27 2020-09-01 上海鼎茂信息技术有限公司 Data dynamic processing method based on distributed storage
CN112948533A (en) * 2021-04-13 2021-06-11 天津禄智技术有限公司 Text retrieval method for multiple retrieval and sequencing
CN113886505A (en) * 2021-09-28 2022-01-04 西安阳易信息技术有限公司 Management system for realizing dynamic modeling based on search engine and relational database
CN113886505B (en) * 2021-09-28 2024-04-30 西安阳易信息技术有限公司 Management system for realizing dynamic modeling based on search engine and relational database
CN114579596A (en) * 2022-05-06 2022-06-03 达而观数据(成都)有限公司 Method and system for updating index data of search engine in real time
CN114579596B (en) * 2022-05-06 2022-09-06 达而观数据(成都)有限公司 Method and system for updating index data of search engine in real time
CN116226470A (en) * 2023-05-09 2023-06-06 南昌大学 Management method, system, equipment and medium for ocean space-time data
CN116226470B (en) * 2023-05-09 2023-07-28 南昌大学 Management method, system, equipment and medium for ocean space-time data
CN116719839A (en) * 2023-08-10 2023-09-08 北京合思信息技术有限公司 Data query method and device of accounting file and electronic equipment
CN116719839B (en) * 2023-08-10 2024-01-26 北京合思信息技术有限公司 Data query method and device of accounting file and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170104
