CN107103067A - A kind of method of data synchronization and system based on search engine - Google Patents
Data synchronization method and system based on a search engine
- Publication number
- CN107103067A, CN201710254007.XA
- Authority
- CN
- China
- Prior art keywords
- data
- index
- data source
- warehouse
- synchronization data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/273—Asynchronous replication or reconciliation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2272—Management thereof
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a data synchronization method based on a search engine, comprising: creating an index warehouse according to the business, and creating index fields in the index warehouse; parsing the data source corresponding to the index fields, and configuring the full synchronization data information corresponding to the data source in the form of a first table; and calling the synchronization interface corresponding to the full synchronization data information to import the data source into the index warehouse, completing the synchronization of the full data. The invention addresses the low efficiency of the data synchronization process, achieves real-time automatic data synchronization, and ensures the integrity and accuracy of the data.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a data synchronization method and system based on a search engine.
Background
In the information age, search engines play an extremely important role in all trades and professions. For example, the owner of an online shop needs to provide the search engine with descriptions of its goods so that buyers can search for and read them. In a traditional search engine, however, back-end management transmits data as character strings that represent the data information. Developers cannot read such strings at a glance, so data may be lost or tampered with during synchronization, leaving the data information inaccurate, and administrators are unlikely to notice this within a short time. In addition, users such as merchants provide data sources in many different formats, and the data-source parsing plug-ins used in back-end management cannot handle all of these formats, so parsing of the data source fails. This greatly reduces the efficiency and accuracy with which the search engine transfers data information.
Summary of the invention
The present invention provides a data synchronization method and system based on a search engine, which address the low efficiency of the data synchronization process, achieve real-time automatic data synchronization, and ensure the integrity and accuracy of the data.
The technical solution by which the present invention solves the above technical problem is as follows. A data synchronization method based on a search engine, comprising:
Step 1, creating an index warehouse according to the business, and creating index fields in the index warehouse;
Step 2, parsing the data source corresponding to the index fields, and configuring the full synchronization data information corresponding to the data source in the form of a first table;
Step 3, calling the synchronization interface corresponding to the full synchronization data information to import the data source into the index warehouse, completing the synchronization of the full data.
The beneficial effects of the invention are as follows. By configuring index fields and configuring the full synchronization data information in the form of tables, the invention avoids the problems of the traditional approach of using character strings and the like as the information carrier, and makes the data information visible. In addition, after the full data has been imported into the index warehouse through the synchronization interface, administrators can open the back-end management page and check whether the full data is present in the index warehouse. The method greatly improves the success rate and accuracy of importing full data into the index warehouse, achieves real-time automatic data synchronization, and ensures the integrity and accuracy of the data.
On the basis of the above technical solution, the present invention can also be improved as follows.
Further, the synchronization method also comprises:
Step 4, when the data source has corresponding incremental data, configuring, in the form of a second table, the time interval at which the incremental data is synchronized;
Step 5, calling the synchronization interface to import the incremental data and the time interval into the index warehouse;
Step 6, recording in the second table the moment at which the incremental data was imported into the index warehouse;
Step 7, taking that moment as the starting point, waiting for the time interval, and then synchronizing the incremental data, completing the synchronization of the incremental data.
The further beneficial effect is that, after the full data corresponding to a business has been synchronized, any incremental data corresponding to that full data can also be synchronized, which increases the flexibility of data synchronization.
Further, step 1 comprises:
Step 1.1, creating an index warehouse according to the business;
Step 1.2, importing configuration files into the index warehouse;
Step 1.3, configuring query information in the configuration files, the query information including the data source table and the unique code of the data source;
Step 1.4, creating index fields according to the data source table, each index field including an index field name and the field type corresponding to that field name.
Further, when the first table is configured with the full synchronization data information corresponding to a plurality of data sources, step 3 comprises: in the order of the unique codes of the data sources, calling in turn the synchronization interface corresponding to the full synchronization data information, and importing the plurality of data sources into the index warehouse; or,
when the first table is configured with the full synchronization data information corresponding to a plurality of data sources and an instruction to synchronize only one data source is received, step 3 comprises: according to the unique code of that data source, calling the synchronization interface corresponding to the full synchronization data information, and importing that data source into the index warehouse.
Further, when one or more data sources need to be re-synchronized, step 3 also comprises: according to the unique codes of the data sources that need to be re-synchronized, calling the synchronization interface corresponding to the full synchronization data information, and importing the data sources that need to be re-synchronized into the index warehouse.
The present invention also provides a data synchronization system based on a search engine, comprising:
an index field creation module, configured to create an index warehouse according to the business and to create index fields in the index warehouse;
a synchronization data information configuration module, configured to parse, according to the index fields created by the index field creation module, the data source corresponding to the index fields, and to configure the full synchronization data information corresponding to the data source in the form of a first table;
a synchronization data import module, configured to call, according to the full synchronization data information configured by the synchronization data information configuration module, the synchronization interface corresponding to the full synchronization data information, and to import the data source into the index warehouse.
The beneficial effects of the invention are as follows. The system configures index fields through the index field creation module and configures the full synchronization data information in the form of tables through the synchronization data information configuration module, which avoids the problems of the traditional approach of using character strings and the like as the information carrier and makes the data information visible. In addition, after the full data has been imported into the index warehouse through the synchronization data import module, administrators can open the back-end management page and check whether the full data is present in the index warehouse. The system greatly improves the success rate and accuracy of importing full data into the index warehouse, achieves real-time automatic data synchronization, and ensures the integrity and accuracy of the data.
Further, the synchronization data information configuration module is also configured to: when the data source has corresponding incremental data, configure, in the form of a second table, the time interval at which the incremental data is synchronized;
the synchronization data import module is also configured to: call the synchronization interface, and import the incremental data and the time interval into the index warehouse;
the synchronization data information configuration module is also configured to: record in the second table the moment at which the incremental data was imported into the index warehouse;
the synchronization data import module is also configured to: take that moment as the starting point, wait for the time interval, and then synchronize the incremental data.
Further, the index field creation module is specifically configured to:
create an index warehouse according to the business, import configuration files into the index warehouse, and configure query information in the configuration files, the query information including the data source table and the unique code of the data source; and create index fields according to the data source table, each index field including an index field name and the field type corresponding to that field name.
Further, when the first table is configured with the full synchronization data information corresponding to a plurality of data sources, the synchronization data import module is configured to: in the order of the unique codes of the data sources, call in turn the synchronization interface corresponding to the full synchronization data information, and import the plurality of data sources into the index warehouse; or,
when the first table is configured with the full synchronization data information corresponding to a plurality of data sources and an instruction to synchronize only one data source is received, the synchronization data import module is configured to: according to the unique code of that data source, call the synchronization interface corresponding to the full synchronization data information, and import that data source into the index warehouse.
Further, when one or more data sources need to be re-synchronized, the synchronization data import module is also configured to: according to the unique codes of the data sources that need to be re-synchronized, call the synchronization interface corresponding to the full synchronization data information, and import the data sources that need to be re-synchronized into the index warehouse.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a data synchronization method based on a search engine provided by embodiment one of the present invention;
Fig. 2 is a schematic flowchart of a data synchronization method based on a search engine provided by embodiment two of the present invention;
Fig. 3 is a schematic flowchart of step 110 in Fig. 1 and/or Fig. 2;
Fig. 4 is a schematic structural diagram of a data synchronization system based on a search engine provided by embodiment three of the present invention.
Detailed description
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given serve only to explain the present invention and are not intended to limit its scope.
Embodiment one:
A data synchronization method 100 based on a search engine, as shown in Fig. 1, comprises:
Step 110, creating an index warehouse according to the business, and creating index fields in the index warehouse;
Step 120, parsing the data source corresponding to the index fields, and configuring the full synchronization data information corresponding to the data source in the form of a first table;
Step 130, calling the synchronization interface corresponding to the full synchronization data information to import the data source into the index warehouse, completing the synchronization of the full data.
Embodiment two:
Optionally, as another embodiment of the invention, as shown in Fig. 2, the method 100 comprises:
Step 110, creating an index warehouse according to the business, and creating index fields in the index warehouse;
Step 120, parsing the data source corresponding to the index fields, and configuring the full synchronization data information corresponding to the data source in the form of a first table;
Step 130, calling the synchronization interface corresponding to the full synchronization data information to import the data source into the index warehouse, completing the synchronization of the full data;
Step 140, when the data source has corresponding incremental data, configuring, in the form of a second table, the time interval at which the incremental data is synchronized;
Step 150, calling the synchronization interface to import the incremental data and the time interval into the index warehouse;
Step 160, recording in the second table the moment at which the incremental data was imported into the index warehouse;
Step 170, taking that moment as the starting point, waiting for the time interval, and then synchronizing the incremental data, completing the synchronization of the incremental data.
Specifically, in the above embodiments, as shown in Fig. 3, step 110 in Fig. 1 and/or Fig. 2 comprises:
Step 111, creating an index warehouse core1 according to the business;
Step 112, importing configuration files into the core1/conf folder of the index warehouse;
Step 113, configuring query information in the configuration files, the query information including the query rule, the data source table and the unique code of the data source;
Step 114, creating index fields according to the data source table, each index field including an index field name and the field type corresponding to that field name.
Configuring the query information in step 113 specifically means the following. The uniqueKey parameter is set to the unique code of the data source (i.e. its id), for example: <uniqueKey>docid</uniqueKey>. The defaultSearchField parameter is set to the field of the data source table that is searched by default, for example: <defaultSearchField>doctitle</defaultSearchField>. The defaultOperator attribute of the solrQueryParser parameter is set to the default query rule, for example: <solrQueryParser defaultOperator="OR"/>. Here, each business corresponds to one data source, and each data source corresponds to one unique code.
Creating the index fields in step 114 specifically means setting the field name and field type of each field of the data source table that needs to be synchronized to the search engine, for example: <field name="doctitle" type="text_ik" indexed="true" stored="true" omitNorms="true"/>, where the value of the name attribute is the field name and the value of the type attribute is the field type.
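The field definitions of step 114 can be generated from a mapping of field names to field types. The following is a minimal Python sketch, assuming the attribute set shown in the example above; the helper names build_field and build_fields and the doccontent field are hypothetical and do not come from the patent.

```python
import xml.etree.ElementTree as ET


def build_field(name, field_type, indexed=True, stored=True, omit_norms=True):
    """Build one <field> element with the attributes used in the example above."""
    return ET.Element("field", {
        "name": name,
        "type": field_type,
        "indexed": str(indexed).lower(),
        "stored": str(stored).lower(),
        "omitNorms": str(omit_norms).lower(),
    })


def build_fields(columns):
    """columns maps an index field name to its field type (assumed input shape)."""
    return [build_field(name, field_type) for name, field_type in columns.items()]


# "doctitle"/"text_ik" come from the text; "doccontent" is a hypothetical second field.
fields = build_fields({"doctitle": "text_ik", "doccontent": "text_ik"})
xml_lines = [ET.tostring(field, encoding="unicode") for field in fields]
for line in xml_lines:
    print(line)
```

Each printed line is one field definition that could be placed in the schema configuration file of the index warehouse.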
It should also be noted that, in embodiment one and embodiment two, the tool used to parse the data source corresponding to the index fields is the open-source Apache Tika plug-in. The data-parsing algorithm in this plug-in has been optimized so that the plug-in can parse data sources of any format (for example MySQL, Oracle, txt, Word, PPT, Excel and PDF) with a parsing success rate of 100%. This greatly improves the success rate and accuracy of importing full data into the index warehouse and ensures the integrity and accuracy of the data.
It should also be noted that the full synchronization data information is configuration information. This configuration information uses three tables to record the connection information of the data source, the data tables of the data source, and the index fields corresponding to the data source, and is used to locate, connect to and search the data source. For the synchronization of incremental data there is, in addition to these three tables, a fourth table, which records the time interval at which incremental data is synchronized and the time at which incremental data was imported into the index warehouse. Furthermore, multiple businesses correspond to multiple data sources, and the configuration information of every business is recorded in the above tables. For example, for the full synchronization of data sources, let the three tables be table 1, table 2 and table 3, and let the businesses be business A and business B. The configuration information of business A is then distributed over the first row of table 1, the first row of table 2 and the first row of table 3, and the configuration information of business B is distributed over the second row of table 1, the second row of table 2 and the third row of table 3.
When the full data (i.e. the data source) has corresponding incremental data that needs to be synchronized, steps 140 to 170 are performed. After the synchronization interface has been called and the time interval and the incremental data have been successfully imported into the index warehouse, the import time is recorded. After the time interval has elapsed, the incremental data is synchronized into the index warehouse corresponding to the full data, completing the synchronization of the incremental data. For example, if the time interval is 5 minutes and the incremental data was imported into the index warehouse at 8:15, then after waiting 5 minutes the incremental data is synchronized at 8:20. This completes the synchronization of the incremental data without calling the synchronization interface corresponding to the full synchronization data information again. The time interval can be chosen according to the circumstances.
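The timing rule of steps 160 and 170 amounts to adding the configured interval to the recorded import time. A minimal Python sketch of that arithmetic follows; the function names and the calendar date are assumptions made for illustration, and only the 8:15 import time and the 5-minute interval come from the example above.

```python
from datetime import datetime, timedelta


def next_sync_time(last_import, interval):
    """Return the moment at which the next incremental synchronization should run."""
    return last_import + interval


def is_due(now, last_import, interval):
    """True once the configured interval has elapsed since the recorded import time."""
    return now >= next_sync_time(last_import, interval)


# The date is arbitrary; the patent example only gives the time of day.
last_import = datetime(2017, 4, 18, 8, 15)
interval = timedelta(minutes=5)
print(next_sync_time(last_import, interval).strftime("%H:%M"))
```

A scheduler would call is_due periodically and, once it returns True, run the incremental synchronization and store the new import time.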
After the synchronization interface corresponding to the full synchronization data information has been called and the data source has been imported into the index warehouse, staff can log in to the back-end management page, open the index warehouse and query whether the data source exists. If it exists, the import succeeded. If it does not exist, the data source address provided by the user may be unreachable, or there may be a hardware problem; staff can then intervene manually and restart the data synchronization from step 110.
Specifically, in embodiment two, in steps 120, 140 and 160, the first table needs to be set up when configuring the full synchronization data information, and the second table needs to be set up when configuring the time interval at which the corresponding incremental data is synchronized. The first table comprises the search_db_tb table, the search_db table and the search_db_tb_field table, and the second table comprises the sys_task table, as follows.
search_db_tb table:
id | db_id | tb_name | query | delta_query_id | delta_query |
The items of the search_db_tb table are id, db_id, tb_name, query, delta_query_id and delta_query. Here, id is the primary key of the table; db_id is the primary key of the search_db table; tb_name is the data table name; pk_id is the primary key of the data table; query is the SQL query statement used when synchronizing data; delta_query_id is the SQL query statement executed during incremental synchronization, whose result is the ids of the data that need incremental synchronization; delta_query is the SQL query statement executed during incremental synchronization, which queries the data whose ids are in the result of the delta_query_id statement.
search_db table:
id | service_id | url | driver | username | passwordid |
The items of the search_db table are id, service_id, url, driver, username and passwordid. Here, id is the primary key of the table; service_id is the primary key of the search_index table; url, driver, username and passwordid are the database connection information.
search_db_tb_field table:
The items of the search_db_tb_field table are id, tb_id, field_name, index_name, is_filter_html, is_pinyin, index_pinyin_name and doc_obtain. Here, id is the primary key of the table; tb_id is the primary key of the search_db_tb table; field_name is the name of the data table field to be synchronized to the search engine; index_name is the name of the corresponding field in the search engine; is_filter_html indicates whether HTML tags are filtered out of the field value (1 means filter, 2 means do not filter); is_pinyin indicates whether the field value is converted to pinyin (1 means convert, 2 means do not convert); index_pinyin_name is the field name under which the pinyin-converted field value is stored after synchronization to the search engine; doc_obtain indicates whether the file content is fetched according to the path (1 means yes, 2 means no).
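The per-field configuration above can be applied to a source row before it is sent to the search engine. The following Python sketch is illustrative only: it handles the field_name to index_name mapping and the is_filter_html flag, leaves out pinyin conversion and file fetching, and strips tags with a simple regular expression, which is an assumption rather than the patent's method. The function name and sample data are hypothetical.

```python
import re
from html import unescape

TAG_RE = re.compile(r"<[^>]+>")  # simplistic tag matcher, adequate for a sketch


def to_index_document(row, field_configs):
    """Map a source row to index fields using search_db_tb_field-style settings."""
    doc = {}
    for cfg in field_configs:
        value = row[cfg["field_name"]]
        # 1 means filter HTML tags, 2 means leave the value untouched
        if cfg["is_filter_html"] == 1 and isinstance(value, str):
            value = unescape(TAG_RE.sub("", value))
        doc[cfg["index_name"]] = value
    return doc


configs = [
    {"field_name": "title", "index_name": "doctitle", "is_filter_html": 1},
    {"field_name": "body", "index_name": "docbody", "is_filter_html": 2},
]
row = {"title": "<b>Red</b> shoes &amp; socks", "body": "<p>kept as is</p>"}
document = to_index_document(row, configs)
print(document)
```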
sys_task table:
id | task_name | class_pash | expression | last_task_time |
The items of the sys_task table are id, task_name, class_pash, expression and last_task_time. Here, id is the primary key of the table; task_name is the name of the incremental task; class_pash is the path of the incremental task; expression is the time interval of the incremental task; last_task_time is the time at which the incremental task was last executed.
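The four configuration tables can be sketched as a relational schema. The Python example below creates them in an in-memory SQLite database using the column names listed above; the column types, the foreign-key references and the choice of SQLite are assumptions for illustration, since the patent does not state them. The pk_id column is included because the description of search_db_tb mentions it, although the header row does not.

```python
import sqlite3

# Column names follow the text; types and references are assumed.
SCHEMA = """
CREATE TABLE search_db (
    id INTEGER PRIMARY KEY,
    service_id INTEGER,
    url TEXT,
    driver TEXT,
    username TEXT,
    passwordid TEXT
);
CREATE TABLE search_db_tb (
    id INTEGER PRIMARY KEY,
    db_id INTEGER REFERENCES search_db(id),
    tb_name TEXT,
    pk_id TEXT,
    "query" TEXT,
    delta_query_id TEXT,
    delta_query TEXT
);
CREATE TABLE search_db_tb_field (
    id INTEGER PRIMARY KEY,
    tb_id INTEGER REFERENCES search_db_tb(id),
    field_name TEXT,
    index_name TEXT,
    is_filter_html INTEGER,
    is_pinyin INTEGER,
    index_pinyin_name TEXT,
    doc_obtain INTEGER
);
CREATE TABLE sys_task (
    id INTEGER PRIMARY KEY,
    task_name TEXT,
    class_pash TEXT,
    "expression" TEXT,
    last_task_time TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)


def columns(connection, table):
    """Return the column names of a table in declaration order."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


print(columns(conn, "sys_task"))
```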
It should be noted that, when one set of full synchronization data has multiple sets of incremental data that need to be synchronized, steps 140 to 160 are performed a corresponding number of times.
Specifically, in the above embodiments, when the first table is configured with the full synchronization data information corresponding to a plurality of data sources, step 130 comprises: in the order of the unique codes of the data sources, calling in turn the synchronization interface corresponding to the full synchronization data information, and importing the plurality of data sources into the index warehouse; or,
when the first table is configured with the full synchronization data information corresponding to a plurality of data sources and an instruction to synchronize only one data source is received, step 130 comprises: according to the unique code of that data source, calling the synchronization interface that contains the unique code of that data source, and importing that data source into the index warehouse.
In addition, when one or more data sources need to be re-synchronized, step 130 also comprises: according to the unique codes of the data sources that need to be re-synchronized, calling the synchronization interface corresponding to the full synchronization data information, and importing the data sources that need to be re-synchronized into the index warehouse.
By configuring index fields and configuring the full synchronization data information in the form of tables, the present invention avoids the problems of the traditional approach of using character strings and the like as the information carrier, and makes the data information visible. In addition, after the full data has been imported into the index warehouse through the synchronization interface, administrators can open the back-end management page and check whether the full data is present in the index warehouse. Furthermore, the tool the invention uses to parse data sources is the open-source Apache Tika plug-in, whose data-parsing algorithm has been optimized so that the plug-in can parse data sources of any format (for example MySQL, Oracle, txt, Word, PPT, Excel and PDF) with a parsing success rate of 100%. The method greatly improves the success rate and accuracy of importing full data into the index warehouse, achieves real-time automatic data synchronization, and ensures the integrity and accuracy of the data.
Embodiment three:
The present invention also provides a data synchronization system 200 based on a search engine, as shown in Fig. 4, comprising:
an index field creation module, configured to create an index warehouse according to the business and to create index fields in the index warehouse;
a synchronization data information configuration module, configured to parse, according to the index fields created by the index field creation module, the data source corresponding to the index fields, and to configure the full synchronization data information corresponding to the data source in the form of a first table;
a synchronization data import module, configured to call, according to the full synchronization data information configured by the synchronization data information configuration module, the synchronization interface corresponding to the full synchronization data information, and to import the data source into the index warehouse.
It should also be noted that, after the full data corresponding to a business has been synchronized, any incremental data corresponding to that full data can also be synchronized. Accordingly:
the synchronization data information configuration module is also configured to: when the data source has corresponding incremental data, configure, in the form of a second table, the time interval at which the incremental data is synchronized;
the synchronization data import module is also configured to: call the synchronization interface, and import the incremental data and the time interval into the index warehouse;
the synchronization data information configuration module is also configured to: record in the second table the moment at which the incremental data was imported into the index warehouse;
the synchronization data import module is also configured to: take that moment as the starting point, wait for the time interval, and then synchronize the incremental data, completing the synchronization of the incremental data.
The index field creation module is specifically configured to:
create an index warehouse according to the business, import configuration files into the index warehouse, configure query information in the configuration files, and create index fields according to the query information, where the query information includes the query rule, the data source table and the unique code of the data source, and each index field includes an index field name and the field type corresponding to that field name.
When the first table is configured with the full synchronization data information corresponding to a plurality of data sources, the synchronization data import module is configured to: in the order of the unique codes of the data sources, call in turn the synchronization interface corresponding to the full synchronization data information, and import the plurality of data sources into the index warehouse; or,
when the first table is configured with the full synchronization data information corresponding to a plurality of data sources and an instruction to synchronize only one data source is received, the synchronization data import module is configured to: according to the unique code of that data source, call the synchronization interface corresponding to the full synchronization data information, and import that data source into the index warehouse.
When one or more data sources need to be re-synchronized, the synchronization data import module is also configured to: according to the unique codes of the data sources that need to be re-synchronized, call the synchronization interface corresponding to the full synchronization data information, and import the data sources that need to be re-synchronized into the index warehouse.
It should be noted that the system is developed in Java. It configures index fields through the index field creation module and configures the full synchronization data information in the form of tables through the synchronization data information configuration module, which avoids the problems of the traditional approach of using character strings and the like as the information carrier and makes the data information visible. In addition, after the full data has been imported into the index warehouse through the synchronization data import module, administrators can open the back-end management page and check whether the full data is present in the index warehouse. Furthermore, the tool used in the synchronization data information configuration module to parse data sources is the open-source Apache Tika plug-in, whose data-parsing algorithm has been optimized so that the plug-in can parse data sources of any format (for example MySQL, Oracle, txt, Word, PPT, Excel and PDF) with a parsing success rate of 100%. The system greatly improves the success rate and accuracy of importing full data into the index warehouse, achieves real-time automatic data synchronization, and ensures the integrity and accuracy of the data.
Moreover, because the optimized Apache Tika plug-in used in the synchronization data information configuration module can parse data sources of any format, and testing showed a parsing success rate of 100%, a parsing failure, if one occurs, is probably caused by the other party's database being unreachable or by a hardware problem.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (10)
1. A data synchronization method based on a search engine, characterized by comprising:
Step 1: creating an index warehouse according to the service, and creating index fields in the index warehouse;
Step 2: parsing the data source corresponding to the index fields, and configuring, in the form of a first table, the full-synchronization data information corresponding to the data source;
Step 3: calling the synchronization interface corresponding to the full-synchronization data information, and importing the data source into the index warehouse, thereby completing the synchronization of the full data.
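The three steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `IndexWarehouse` class, the layout of `first_table`, the `full_sync` function and all sample values are hypothetical, and an in-memory list stands in for a real search-engine index.

```python
from dataclasses import dataclass, field


@dataclass
class IndexWarehouse:
    """Hypothetical in-memory stand-in for a search-engine index."""
    fields: dict                                   # index field name -> field type
    documents: list = field(default_factory=list)

    def add(self, row: dict) -> None:
        # Keep only the columns that were declared as index fields.
        self.documents.append({k: row[k] for k in self.fields if k in row})


# Step 1: create the index warehouse and its index fields.
warehouse = IndexWarehouse(fields={"id": "int", "title": "string"})

# Step 2: the "first table" holding full-synchronization information,
# one entry per data source (all values are made up for illustration).
first_table = [
    {
        "source_code": "SRC001",
        "rows": [
            {"id": 1, "title": "a", "extra": "x"},
            {"id": 2, "title": "b", "extra": "y"},
        ],
    },
]


def full_sync(entry: dict, target: IndexWarehouse) -> int:
    """Step 3: assumed synchronization interface; imports every row of one source."""
    for row in entry["rows"]:
        target.add(row)
    return len(entry["rows"])


imported = sum(full_sync(entry, warehouse) for entry in first_table)
```

In this sketch the column `extra` is dropped on import because it was never declared as an index field, which mirrors the idea that the index fields determine what is stored in the index warehouse.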
2. The data synchronization method based on a search engine according to claim 1, characterized in that the synchronization method further comprises:
Step 4: when the data source has corresponding incremental data, configuring, in the form of a second table, the time interval between successive synchronizations of the incremental data;
Step 5: calling the synchronization interface, and importing the incremental data and the time interval into the index warehouse;
Step 6: recording in the second table the moment at which the incremental data was imported into the index warehouse;
Step 7: taking that moment as the starting point, waiting for the time interval to elapse, and then synchronizing the incremental data, thereby completing the synchronization of the incremental data.
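The timing logic of steps 4 to 7 can be sketched as follows. This is an illustration under assumptions: the dictionary `second_table`, the helper names `record_import`, `next_sync_time` and `is_due`, and the ten-minute interval are all hypothetical and are not taken from the patent.

```python
from datetime import datetime, timedelta

# Step 4: the "second table", one row per data source that has incremental data.
second_table = {
    "SRC001": {"interval": timedelta(minutes=10), "last_import": None},
}


def record_import(source_code: str, moment: datetime) -> None:
    """Step 6: record the moment the incremental data was imported."""
    second_table[source_code]["last_import"] = moment


def next_sync_time(source_code: str) -> datetime:
    """Step 7: start from the recorded moment and add the configured interval."""
    row = second_table[source_code]
    if row["last_import"] is None:
        raise ValueError("no import has been recorded for this data source yet")
    return row["last_import"] + row["interval"]


def is_due(source_code: str, now: datetime) -> bool:
    """True once the time interval has elapsed since the last recorded import."""
    return now >= next_sync_time(source_code)


record_import("SRC001", datetime(2017, 4, 18, 12, 0))
due_at = next_sync_time("SRC001")
```

A scheduler would call `is_due` periodically and, when it returns true, invoke the synchronization interface and record the new import moment again.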
3. The data synchronization method based on a search engine according to claim 1 or 2, characterized in that step 1 comprises:
Step 1.1: creating an index warehouse according to the service;
Step 1.2: importing a configuration file into the index warehouse;
Step 1.3: configuring query information in the configuration file, the query information comprising a data source table and a data source unique code;
Step 1.4: creating index fields according to the data source table, each index field comprising an index field name and the field type corresponding to that field name.
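As a rough illustration of steps 1.3 and 1.4, the sketch below derives index fields from a data source table described in a configuration. The structure of `config`, the `TYPE_MAP` type mapping and the function `create_index_fields` are assumptions made for this example; the patent does not specify a file format or a type mapping.

```python
# Hypothetical configuration file content, already loaded as a dict.
config = {
    "query": {
        "source_code": "SRC001",          # data source unique code
        "source_table": {                 # data source table: column -> SQL type
            "id": "INT",
            "title": "VARCHAR(200)",
            "created_at": "DATETIME",
        },
    }
}

# Assumed mapping from source column types to index field types.
TYPE_MAP = {"INT": "int", "VARCHAR": "string", "DATETIME": "date"}


def create_index_fields(source_table: dict) -> list:
    """Step 1.4: derive (field name, field type) pairs from the data source table."""
    fields = []
    for column, sql_type in source_table.items():
        base = sql_type.split("(")[0].upper()   # "VARCHAR(200)" -> "VARCHAR"
        # Unknown column types fall back to "string" in this sketch.
        fields.append({"name": column, "type": TYPE_MAP.get(base, "string")})
    return fields


index_fields = create_index_fields(config["query"]["source_table"])
```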
4. The data synchronization method based on a search engine according to claim 3, characterized in that, when the first table is configured with the full-synchronization data information corresponding to a plurality of data sources, step 3 comprises: calling the synchronization interface corresponding to the full-synchronization data information for each data source in turn, in the order of the data source unique codes, and importing the plurality of data sources into the index warehouse; or,
when the first table is configured with the full-synchronization data information corresponding to a plurality of data sources, and an instruction to synchronize only one data source is received, step 3 comprises: calling the synchronization interface corresponding to the full-synchronization data information according to the data source unique code of that data source, and importing that data source into the index warehouse.
5. The data synchronization method based on a search engine according to claim 3, characterized in that, when one data source or a plurality of data sources need to be re-synchronized, step 3 comprises: calling the synchronization interface corresponding to the full-synchronization data information according to the data source unique codes of the data sources that need to be re-synchronized, and importing those data sources into the index warehouse.
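The selection logic of claims 4 and 5 (all data sources in code order, a single data source, or a chosen subset for re-synchronization) can be sketched as below. The function names, the shape of `first_table` and the choice of ascending lexical order for "the order of the data source unique codes" are assumptions for illustration only.

```python
def select_sources(first_table: dict, requested=None) -> list:
    """Return the data source unique codes to synchronize, in ascending code order.

    `first_table` maps each unique code to its full-synchronization information.
    With no request, every configured source is selected; otherwise only the
    requested codes are selected.
    """
    if requested is None:
        return sorted(first_table)
    unknown = set(requested) - set(first_table)
    if unknown:
        raise KeyError(f"not configured in the first table: {sorted(unknown)}")
    return sorted(set(requested))


def run_full_sync(first_table: dict, sync, requested=None) -> list:
    """Call the synchronization interface `sync` once per selected data source."""
    return [sync(code, first_table[code])
            for code in select_sources(first_table, requested)]


first_table = {
    "SRC002": {"table": "orders"},
    "SRC001": {"table": "users"},
    "SRC003": {"table": "docs"},
}
log = run_full_sync(first_table, lambda code, info: f"{code}:{info['table']}")
only_one = run_full_sync(first_table, lambda code, info: code, requested=["SRC003"])
resync = run_full_sync(first_table, lambda code, info: code,
                       requested=["SRC003", "SRC001"])
```

Keying every operation on the unique code is what lets the same interface serve the first full import, a single-source import and a later re-synchronization.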
6. A data synchronization system based on a search engine, characterized by comprising:
an index field creation module, configured to create an index warehouse according to the service and to create index fields in the index warehouse;
a synchronization data information configuration module, configured to parse, according to the index fields created by the index field creation module, the data source corresponding to the index fields, and to configure, in the form of a first table, the full-synchronization data information corresponding to the data source;
a synchronization data import module, configured to call, according to the full-synchronization data information configured by the synchronization data information configuration module, the synchronization interface corresponding to the full-synchronization data information, and to import the data source into the index warehouse.
7. The data synchronization system based on a search engine according to claim 6, characterized in that the synchronization data information configuration module is further configured to: when the data source has corresponding incremental data, configure, in the form of a second table, the time interval between successive synchronizations of the incremental data;
the synchronization data import module is further configured to: call the synchronization interface, and import the incremental data and the time interval into the index warehouse;
the synchronization data information configuration module is further configured to: record in the second table the moment at which the incremental data was imported into the index warehouse;
the synchronization data import module is further configured to: take that moment as the starting point, wait for the time interval to elapse, and then synchronize the incremental data.
8. The data synchronization system based on a search engine according to claim 7, characterized in that the index field creation module is specifically configured to:
create an index warehouse according to the service; import a configuration file into the index warehouse; configure query information in the configuration file, the query information comprising a data source table and a data source unique code; and create index fields according to the data source table, each index field comprising an index field name and the field type corresponding to that field name.
9. The data synchronization system based on a search engine according to claim 8, characterized in that, when the first table is configured with the full-synchronization data information corresponding to a plurality of data sources, the synchronization data import module is configured to: call the synchronization interface corresponding to the full-synchronization data information for each data source in turn, in the order of the data source unique codes, and import the plurality of data sources into the index warehouse; or,
when the first table is configured with the full-synchronization data information corresponding to a plurality of data sources, and an instruction to synchronize only one data source is received, the synchronization data import module is configured to: call the synchronization interface corresponding to the full-synchronization data information according to the data source unique code of that data source, and import that data source into the index warehouse.
10. The data synchronization system based on a search engine according to claim 9, characterized in that, when one data source or a plurality of data sources need to be re-synchronized, the synchronization data import module is further configured to: call the synchronization interface corresponding to the full-synchronization data information according to the data source unique codes of the data sources that need to be re-synchronized, and import those data sources into the index warehouse.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710254007.XA CN107103067A (en) | 2017-04-18 | 2017-04-18 | A kind of method of data synchronization and system based on search engine |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710254007.XA CN107103067A (en) | 2017-04-18 | 2017-04-18 | A kind of method of data synchronization and system based on search engine |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107103067A true CN107103067A (en) | 2017-08-29 |
Family
ID=59657051
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710254007.XA Pending CN107103067A (en) | 2017-04-18 | 2017-04-18 | A kind of method of data synchronization and system based on search engine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107103067A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108845995A (en) * | 2018-03-23 | 2018-11-20 | 腾讯科技(深圳)有限公司 | Data processing method, device, storage medium and electronic device |
CN109739887A (en) * | 2018-12-21 | 2019-05-10 | 平安科技(深圳)有限公司 | Synchronous task searching method, system, device and readable storage medium storing program for executing |
CN110147362A (en) * | 2019-04-04 | 2019-08-20 | 中电科大数据研究院有限公司 | One kind is based on the acquisition of event driven DOC DATA and processing system and its method |
CN110245134A (en) * | 2019-04-26 | 2019-09-17 | 石化盈科信息技术有限责任公司 | A kind of increment synchronization method applied to search service |
CN110263028A (en) * | 2019-04-26 | 2019-09-20 | 石化盈科信息技术有限责任公司 | A kind of full dose synchronous method applied to search service |
CN111865576A (en) * | 2020-07-03 | 2020-10-30 | 北京天空卫士网络安全技术有限公司 | Method and device for synchronizing URL classification data |
CN112507200A (en) * | 2020-12-28 | 2021-03-16 | 浪潮云信息技术股份公司 | Method and apparatus for synchronizing data into search engine |
CN113378022A (en) * | 2020-03-10 | 2021-09-10 | 北京搜狗科技发展有限公司 | In-station search platform, search method and related device |
CN115098648A (en) * | 2022-08-25 | 2022-09-23 | 歌尔股份有限公司 | Enterprise data searching method and device and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140164325A1 (en) * | 2012-12-07 | 2014-06-12 | Institute For Information Industry | Data synchronization system and method for synchronizing data |
CN105930493A (en) * | 2016-05-04 | 2016-09-07 | 北京思特奇信息技术股份有限公司 | Method and system for data synchronization between different databases |
CN106095911A (en) * | 2016-06-07 | 2016-11-09 | 腾讯科技(深圳)有限公司 | Search system and method for data synchronization |
CN106469158A (en) * | 2015-08-17 | 2017-03-01 | 杭州海康威视系统技术有限公司 | Method of data synchronization and device |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108845995B (en) * | 2018-03-23 | 2020-08-18 | 腾讯科技(深圳)有限公司 | Data processing method, data processing apparatus, storage medium, and electronic apparatus |
CN108845995A (en) * | 2018-03-23 | 2018-11-20 | 腾讯科技(深圳)有限公司 | Data processing method, device, storage medium and electronic device |
CN109739887A (en) * | 2018-12-21 | 2019-05-10 | 平安科技(深圳)有限公司 | Synchronous task searching method, system, device and readable storage medium storing program for executing |
CN110147362A (en) * | 2019-04-04 | 2019-08-20 | 中电科大数据研究院有限公司 | One kind is based on the acquisition of event driven DOC DATA and processing system and its method |
CN110245134B (en) * | 2019-04-26 | 2021-07-06 | 石化盈科信息技术有限责任公司 | Increment synchronization method applied to search service |
CN110263028A (en) * | 2019-04-26 | 2019-09-20 | 石化盈科信息技术有限责任公司 | A kind of full dose synchronous method applied to search service |
CN110263028B (en) * | 2019-04-26 | 2021-06-15 | 石化盈科信息技术有限责任公司 | Full-scale synchronization method applied to search service |
CN110245134A (en) * | 2019-04-26 | 2019-09-17 | 石化盈科信息技术有限责任公司 | A kind of increment synchronization method applied to search service |
CN113378022A (en) * | 2020-03-10 | 2021-09-10 | 北京搜狗科技发展有限公司 | In-station search platform, search method and related device |
CN111865576A (en) * | 2020-07-03 | 2020-10-30 | 北京天空卫士网络安全技术有限公司 | Method and device for synchronizing URL classification data |
CN111865576B (en) * | 2020-07-03 | 2023-02-28 | 北京天空卫士网络安全技术有限公司 | Method and device for synchronizing URL classification data |
CN112507200A (en) * | 2020-12-28 | 2021-03-16 | 浪潮云信息技术股份公司 | Method and apparatus for synchronizing data into search engine |
CN115098648A (en) * | 2022-08-25 | 2022-09-23 | 歌尔股份有限公司 | Enterprise data searching method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170829 |