CN110263028B - Full-scale synchronization method applied to search service - Google Patents

Full-scale synchronization method applied to search service

Info

Publication number
CN110263028B
CN110263028B CN201910343332.2A CN201910343332A CN110263028B CN 110263028 B CN110263028 B CN 110263028B CN 201910343332 A CN201910343332 A CN 201910343332A CN 110263028 B CN110263028 B CN 110263028B
Authority
CN
China
Prior art keywords
tables
level
configuration information
upstream
information corresponding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910343332.2A
Other languages
Chinese (zh)
Other versions
CN110263028A (en)
Inventor
陈海龙
王建新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Petro CyberWorks Information Technology Co Ltd
Original Assignee
Petro CyberWorks Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Petro CyberWorks Information Technology Co Ltd
Priority to CN201910343332.2A
Publication of CN110263028A
Application granted
Publication of CN110263028B
Current legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 - Indexing; Data structures therefor; Storage structures
    • G06F16/2228 - Indexing structures
    • G06F16/2246 - Trees, e.g. B+trees
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 - Indexing; Data structures therefor; Storage structures
    • G06F16/2282 - Tablespace storage structures; Management thereof
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/242 - Query formulation
    • G06F16/2433 - Query languages
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 - Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a full-scale synchronization method applied to a search service. The method can level (flatten) complex multi-table-associated business data held under different data source types and different central libraries of an application system, and can fully synchronize that data to a search engine efficiently, so the method has good application prospects.

Description

Full-scale synchronization method applied to search service
Technical Field
The invention relates to the technical field of computer software, in particular to a full-scale synchronization method applied to search service.
Background
With the rapid development of information technology, the data volume of application systems grows day by day. Large application systems often store big data in a distributed manner, so querying the information a user needs usually requires table-association queries across different databases under different central libraries. Because the retrieval and filtering conditions are complex, this leads to problems such as poor database query performance and response timeouts. How to quickly query, and perform word-segmentation retrieval on, the information a user needs from massive data is therefore a problem that urgently needs solving when building an application system. Search engine technology is a preferred solution because it supports real-time query and word-segmentation retrieval of data. However, existing data transmission services can only extract the full data of a single table from the application system's database and synchronize it to the search engine; they cannot handle the complex business scenario in which multi-table-associated data is synchronized to the search engine for querying.
Existing data transmission services can only extract the full data of a single table from the application system's database and synchronize it to a search engine; they cannot quickly synchronize data with a complex index structure that associates multiple tables. Specifically:
Single-table data transmission takes a single table in a database as the source end and a search engine as the target end: all data in the source table is extracted and transmitted to the search engine, which then serves as the application system's data source for query and retrieval. However, synchronizing single-table data to the search engine often fails to meet business requirements. For example, suppose a user wants to perform word-segmentation retrieval on the order information of an e-commerce system through a search engine. The user creates an order index in the search engine, the existing data transmission service extracts all order-table data from the database and synchronizes it into that order index, and the user can then query and retrieve order data through the search engine. When querying an order, however, the user usually also needs related information such as the commodities in the order and its logistics. Because different indexes in a search engine cannot be queried in association with one another, the order data, the commodity data and the logistics data must be associated before they are synchronized to the search engine. That is, the order table is used as the main table and the commodity table and logistics table as auxiliary tables; the data in the main table and the auxiliary tables is associated through an association key to form complete JSON-format data covering the order, its commodities, its logistics and so on; and this data is then synchronized to the order index of the search engine. In this way, when the user searches the order index, the commodity and logistics information of each order is returned together with the order.
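The document shape described in this example can be illustrated with a small sketch. The table contents, the field names (`order_id`, `goods`, `logistics`) and the helper `build_order_documents` are hypothetical and not taken from the patent; the sketch only shows how main-table rows can absorb auxiliary-table rows through an association key.

```python
import json
from collections import defaultdict


def build_order_documents(orders, goods, logistics, key="order_id"):
    """Nest auxiliary-table rows under their main-table row via the association key."""

    def group(rows):
        grouped = defaultdict(list)
        for row in rows:
            # Drop the association key from the nested row; the parent already carries it.
            grouped[row[key]].append({k: v for k, v in row.items() if k != key})
        return grouped

    goods_by_order = group(goods)
    logistics_by_order = group(logistics)
    documents = []
    for order in orders:
        doc = dict(order)
        doc["goods"] = goods_by_order.get(order[key], [])
        doc["logistics"] = logistics_by_order.get(order[key], [])
        documents.append(doc)
    return documents


# Hypothetical sample rows standing in for the order, commodity and logistics tables.
orders = [{"order_id": 1, "buyer": "alice"}, {"order_id": 2, "buyer": "bob"}]
goods = [
    {"order_id": 1, "sku": "A-100", "qty": 2},
    {"order_id": 1, "sku": "B-200", "qty": 1},
]
logistics = [{"order_id": 1, "carrier": "ExampleExpress", "status": "shipped"}]

documents = build_order_documents(orders, goods, logistics)
print(json.dumps(documents[0], indent=2))
```

Each resulting document is self-contained, so a single query against the order index can return the order together with its commodities and logistics.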
Therefore, single-table data transmission cannot handle complex business scenarios, and even where some existing data transmission services support multi-table association transmission, they can only perform simple associations and transmit small amounts of data. How to level (flatten) the complex multi-table-associated business data held under different data source types and different central libraries of an application system, and fully synchronize it to the search engine efficiently, has thus become a technical problem that urgently needs to be solved in the industry.
In order to solve the above technical problem, the present invention provides a full-scale synchronization method applied to a search service.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in the prior art, massive complex multi-table-associated business data held under different data source types and different central libraries of an application system cannot be leveled and then fully synchronized to a search engine efficiently.
In order to solve the above technical problem, the present invention provides a full-scale synchronization method applied to search services, including:
acquiring a search engine index name which needs to be subjected to full-scale synchronous operation;
querying, according to the search engine index name, the index configuration information and data source configuration information corresponding to the search engine index name, wherein the index configuration information corresponding to the search engine index name comprises: the number of tables configured for the search engine index name, the name of each of the plurality of tables, the association relationship and association key between every two of the plurality of tables, the fields configured for each of the plurality of tables, and the data source configuration name corresponding to each of the plurality of tables;
connecting to the database through JDBC, and querying the number of data records that each of the plurality of tables stores in the database;
creating a table tree structure for the plurality of tables according to the association relation and the association key between every two tables in the plurality of tables, wherein the table tree structure comprises a plurality of levels, one table is arranged on a first level, and at least one table is arranged on the other levels except the first level;
for each level of the table tree structure, the following operations are performed in sequence starting from the lowest level to the second level:
sequentially storing index configuration information and data source configuration information corresponding to the tables on each level and the upstream tables thereof, the names of the upstream tables and the incidence relations among all the tables in the table tree structure into a stack by using a preset rule;
according to the storage sequence, sequentially taking out index configuration information and data source configuration information corresponding to the tables on each level and the upstream tables thereof from the stack, and processing data stored in a database by the tables on each level and the upstream tables thereof to form an association table of the tables on each level and the upstream tables of the tables on each level, wherein the association table comprises: all fields configured by the upstream table of the table on each level, and JSON format data formed by format conversion of all fields configured by the table on each level;
and converting the data in the table on the first level and in the association table of the table on the second level, formed through the above operations, into JSON-format data, and calling an interface of the search engine to write the JSON-format data into the search engine, thereby achieving full-scale synchronization from the database to the search engine.
In a preferred embodiment of the present invention, querying index configuration information and data source configuration information corresponding to the search engine index name according to the search engine index name includes:
inquiring index configuration information corresponding to the search engine index name according to the search engine index name;
and inquiring data source configuration information corresponding to each table in the plurality of tables according to the index configuration information corresponding to the search engine index name.
In a preferred embodiment of the present invention, querying index configuration information corresponding to the search engine index name according to the search engine index name includes:
and according to the search engine index name, inquiring index configuration information corresponding to the search engine index name from an index configuration table of a configuration library by calling an inquiry interface of the configuration library of the search engine.
In a preferred embodiment of the present invention, querying the data source configuration information corresponding to each table of the plurality of tables according to the index configuration information corresponding to the search engine index name includes:
and inquiring data source configuration information corresponding to each table in the plurality of tables from the data source configuration table of the configuration library by calling an inquiry interface of the configuration library of a search engine according to the data source configuration name corresponding to each table in the plurality of tables.
In a preferred embodiment of the present invention, the data source configuration information corresponding to each of the plurality of tables includes: a type, address, port, library name, user name, and password of a database to which the data source corresponding to each of the plurality of tables belongs.
In a preferred embodiment of the present invention, the type of the database to which the data source corresponding to each table of the plurality of tables belongs includes: SQL Server database, MySQL database and Oracle database.
In a preferred embodiment of the present invention, after querying the number of data records that each of the plurality of tables stores in the database, and before creating a table tree structure for the plurality of tables according to the association relationship and the association key between every two tables, the method further includes:
judging whether the number of data records that each of the plurality of tables stores in the database is 0;
if the number of data records a table stores in the database is 0, removing that table from the index configuration information corresponding to the search engine index name.
In a preferred embodiment of the present invention, sequentially storing, in a stack, index configuration information and data source configuration information corresponding to a table and an upstream table thereof on each level, a name of the upstream table, and an association relationship between all tables in the table tree structure by using a preset rule, includes:
for each level of the table tree structure, starting from the lowest level and up to the second level, the following operations are performed:
sequentially putting the index configuration information and data source configuration information corresponding to the table on each level, together with the index configuration information and data source configuration information corresponding to the upstream table of that table, into a stack as one unit, wherein the index configuration information corresponding to the upstream table of the table on each level includes: the association relationship and association key between the table on each level and its upstream table, the fields configured for the upstream table, and the data source configuration name corresponding to the upstream table;
sequentially putting index configuration information corresponding to the tables on each level, index configuration information corresponding to the upstream tables of the tables on each level, and the names of the upstream tables of the tables on each level into a stack as a unit;
and putting the association relation among all the tables in the table tree structure into a stack as a unit.
In a preferred embodiment of the present invention, sequentially fetching index configuration information and data source configuration information corresponding to a table and an upstream table thereof on each level from a stack according to a storage order, and processing data stored in a database by the table and the upstream table thereof on each level to form an association table between the table on each level and the upstream table of the table on each level, includes:
for each level of the table tree structure, the following operations are performed in sequence starting from the lowest level to the second level:
taking out index configuration information and data source configuration information corresponding to the tables on each level and index configuration information and data source configuration information corresponding to the upstream tables of the tables on each level from the stack;
uploading data stored in a database by the tables on each level and the upstream tables of the tables on each level to a data warehouse of a server through a heterogeneous data source synchronization tool;
JOIN association is carried out on the data uploaded to the data warehouse of the server through an association key between the table on each level and an upstream table of the table on each level, and an association table of the table on each level and the upstream table of the table on each level is formed.
In a preferred embodiment of the present invention, performing JOIN association on the data uploaded to the data warehouse of the server through the association key between the table on each level and its upstream table, to form an association table of the table on each level and its upstream table, includes:
JOIN association is carried out on data uploaded to a data warehouse of the server by using a structured query statement through an association key between the table on each level and an upstream table of the table on each level, so as to form an association table of the table on each level and the upstream table of the table on each level.
Compared with the prior art, one or more embodiments in the above scheme can have the following advantages or beneficial effects:
By applying the full-scale synchronization method applied to the search service, complex multi-table-associated business data held under different data source types and different central libraries of an application system can be leveled and fully synchronized to the search engine efficiently, so the method has good application prospects.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a detailed flowchart of a full-scale synchronization method applied to a search service according to an embodiment of the present invention;
FIG. 2 is a detailed flowchart of step S102 in FIG. 1;
FIG. 3 is a detailed flowchart of step S105 in FIG. 1;
FIG. 4 is a detailed flowchart of step S106 in FIG. 1;
FIG. 5 is a diagram of a table tree structure in an exemplary embodiment of the invention.
Detailed Description
The following detailed description of the embodiments of the present invention will be provided with reference to the drawings and examples, so that how to apply the technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented. It should be noted that, as long as there is no conflict, the embodiments and the features of the embodiments of the present invention may be combined with each other, and the technical solutions formed are within the scope of the present invention.
In order to solve the technical problem that, in the prior art, massive complex multi-table-associated business data held under different data source types and different central libraries of an application system cannot be leveled and fully synchronized to a search engine efficiently, the embodiment of the invention provides a full-scale synchronization method applied to a search service.
FIG. 1 is a detailed flowchart of a full-scale synchronization method applied to a search service according to an embodiment of the present invention.
As shown in fig. 1, the full-scale synchronization method applied to the search service according to the embodiment of the present invention mainly includes the following steps S101 to S107.
In step S101, a search engine index name for which a full-scale synchronization operation is required is acquired.
In step S102, index configuration information and data source configuration information corresponding to the search engine index name are queried according to the search engine index name. The specific process is shown in fig. 2.
In step S1021, index configuration information corresponding to the search engine index name is queried according to the search engine index name.
Specifically, according to the search engine index name, the index configuration information corresponding to the search engine index name is queried from the index configuration table of the configuration library (MongoDB) by calling the query interface of the search engine's configuration library.
The index configuration information corresponding to the search engine index name includes: the number of tables configured for the search engine index name, the name of each of the plurality of tables, the association relationship and association key between every two of the plurality of tables, the fields configured for each of the plurality of tables, and the data source configuration name corresponding to each of the plurality of tables.
In step S1022, the data source configuration information corresponding to each of the plurality of tables is queried based on the index configuration information corresponding to the search engine index name.
Specifically, according to the data source configuration name corresponding to each of the plurality of tables, the data source configuration information corresponding to each table is queried from the data source configuration table of the configuration library (MongoDB) by calling the query interface of the search engine's configuration library.
Wherein the data source configuration information corresponding to each of the plurality of tables includes: the type, address, port, library name, username, and password of the database to which the data source corresponding to each of the plurality of tables belongs.
Preferably, the type of the database to which the data source corresponding to each table of the plurality of tables belongs includes: SQL Server database, MySQL database and Oracle database.
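As a rough illustration of step S102, the sketch below models the two kinds of configuration as plain Python structures and derives a JDBC connection URL from the data source fields listed above. The names `DataSourceConfig`, `INDEX_CONFIG`, `DATA_SOURCES`, `resolve_sources` and all sample values are hypothetical; the patent stores this information in a MongoDB configuration library, which is replaced here by in-memory dictionaries. The Oracle URL assumes that the library name is used as a service name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSourceConfig:
    db_type: str   # "sqlserver", "mysql" or "oracle"
    address: str
    port: int
    library: str   # database (library) name; treated as a service name for Oracle
    username: str
    password: str


def jdbc_url(cfg: DataSourceConfig) -> str:
    """Build a JDBC URL in the standard format for each supported database type."""
    if cfg.db_type == "mysql":
        return f"jdbc:mysql://{cfg.address}:{cfg.port}/{cfg.library}"
    if cfg.db_type == "sqlserver":
        return f"jdbc:sqlserver://{cfg.address}:{cfg.port};databaseName={cfg.library}"
    if cfg.db_type == "oracle":
        return f"jdbc:oracle:thin:@//{cfg.address}:{cfg.port}/{cfg.library}"
    raise ValueError(f"unsupported database type: {cfg.db_type}")


# Hypothetical stand-ins for the data source and index configuration tables.
DATA_SOURCES = {
    "shop_mysql": DataSourceConfig("mysql", "db1.example.internal", 3306, "shop", "reader", "secret"),
    "wms_oracle": DataSourceConfig("oracle", "db2.example.internal", 1521, "wms", "reader", "secret"),
}
INDEX_CONFIG = {
    "order_index": {
        "tables": [
            {"name": "orders", "upstream": None, "key": None,
             "fields": ["order_id", "buyer"], "data_source": "shop_mysql"},
            {"name": "logistics", "upstream": "orders", "key": "order_id",
             "fields": ["carrier", "status"], "data_source": "wms_oracle"},
        ]
    }
}


def resolve_sources(index_name: str) -> dict:
    """Cf. S1021 and S1022: look up the index configuration, then each table's data source."""
    tables = INDEX_CONFIG[index_name]["tables"]
    return {t["name"]: DATA_SOURCES[t["data_source"]] for t in tables}


sources = resolve_sources("order_index")
urls = {name: jdbc_url(cfg) for name, cfg in sources.items()}
print(urls)
```

Keeping the data source configuration separate from the index configuration lets several tables, or several indexes, share one connection definition by name.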
In step S103, a connection is made to the database through JDBC, and the number of pieces of data stored in the database for each of the plurality of tables is queried.
In a preferred embodiment of the present invention, this step further comprises determining whether the number of data records that each of the plurality of tables stores in the database is 0. If a table stores 0 data records in the database, that table is removed from the index configuration information corresponding to the search engine index name.
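The record count and the removal of empty tables in step S103 can be sketched as follows. The patent connects through JDBC; this sketch substitutes Python's built-in `sqlite3` module and an in-memory database so that it runs on its own. The table names and the helpers `count_rows` and `prune_empty_tables` are hypothetical.

```python
import sqlite3


def count_rows(conn, table):
    """Return the number of records stored in one table."""
    # Table names cannot be bound as SQL parameters, so the identifier is quoted instead.
    quoted = '"' + table.replace('"', '""') + '"'
    return conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]


def prune_empty_tables(conn, table_configs):
    """Keep only the table configurations whose table holds at least one record."""
    return [cfg for cfg in table_configs if count_rows(conn, cfg["name"]) > 0]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, buyer TEXT);
    CREATE TABLE goods (order_id INTEGER, sku TEXT);
    CREATE TABLE coupons (order_id INTEGER, code TEXT);
    INSERT INTO orders VALUES (1, 'alice');
    INSERT INTO goods VALUES (1, 'A-100');
    """
)

table_configs = [{"name": "orders"}, {"name": "goods"}, {"name": "coupons"}]
kept = prune_empty_tables(conn, table_configs)
print([cfg["name"] for cfg in kept])  # prints ['orders', 'goods']
```

Dropping empty tables before the tree is built avoids uploading and joining tables that cannot contribute any data to the index.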
In step S104, a table tree structure is created for the plurality of tables according to the association relationship and the association key between every two tables of the plurality of tables.
The table tree structure comprises a plurality of levels, wherein a first level is provided with a table, and the other levels except the first level are provided with at least one table.
The table tree structure includes: the name of the table on each level, the index configuration information and the data source configuration information corresponding to the table on each level, the name of the upstream table of the table on each level, and the name of the association key between the table on each level and the upstream table of the table on each level.
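One way to build such a table tree is a breadth-first walk from the single first-level table, as in the sketch below. The input format (a list of `(table, upstream_table, association_key)` tuples), the node layout and the function name `build_table_tree` are assumptions made for illustration; the patent does not prescribe a particular construction algorithm.

```python
from collections import defaultdict, deque


def build_table_tree(associations):
    """Group tables into levels; levels[0] holds the single first-level table.

    associations: list of (table, upstream_table, association_key) tuples,
    where the first-level table has upstream_table None.
    """
    children = defaultdict(list)
    root = None
    for table, upstream, key in associations:
        if upstream is None:
            if root is not None:
                raise ValueError("the first level must contain exactly one table")
            root = table
        else:
            children[upstream].append((table, key))
    if root is None:
        raise ValueError("no first-level table found")

    levels = []
    queue = deque([(root, None, None, 1)])
    while queue:
        table, upstream, key, depth = queue.popleft()
        if len(levels) < depth:
            levels.append([])
        levels[depth - 1].append({"table": table, "upstream": upstream, "key": key})
        for child, child_key in children.get(table, []):
            queue.append((child, table, child_key, depth + 1))
    return levels


# Hypothetical associations: goods and logistics hang off orders, suppliers off goods.
associations = [
    ("orders", None, None),
    ("goods", "orders", "order_id"),
    ("logistics", "orders", "order_id"),
    ("suppliers", "goods", "sku"),
]
levels = build_table_tree(associations)
print([[node["table"] for node in level] for level in levels])
# prints [['orders'], ['goods', 'logistics'], ['suppliers']]
```

Storing the upstream table name and the association key on every node gives the later steps everything they need to join a table with its upstream table.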
For each level of the table tree structure, the following operations are performed in sequence starting from the lowest level to the second level:
in step S105, using a preset rule, the index configuration information and the data source configuration information corresponding to the table and the upstream table thereof on each level, the name of the upstream table, and the association relationship between all tables in the table tree structure are sequentially stored in a stack. The specific process is shown in fig. 3.
Specifically, for each level of the table tree structure, starting from the lowest level to the second level, the following operations are performed:
In step S1051, proceeding from the lowest level to the second level, the index configuration information and data source configuration information corresponding to the table on each level, together with the index configuration information and data source configuration information corresponding to that table's upstream table, are put into the stack as one unit. The index configuration information corresponding to the upstream table of the table on each level includes: the association relationship and association key between the table on each level and its upstream table, the fields configured for the upstream table, and the data source configuration name corresponding to the upstream table.
In step S1052, proceeding from the lowest level to the second level, the index configuration information corresponding to the table on each level, the index configuration information corresponding to its upstream table, and the name of its upstream table are put into the stack as one unit.
In step S1053, the association relationships between all the tables in the table tree structure are put as a unit into a stack.
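The storing order of steps S1051 to S1053 might look like the sketch below. The patent does not spell out its "preset rule", so several points here are assumptions: the unit layout, the interleaving of the two per-table units, and the choice of a `collections.deque` read from its oldest entry so that units come back in the order they were stored. A strict last-in-first-out pop would return them in reverse instead. All names (`store_units`, `index_cfg`, `source_cfg`) are hypothetical.

```python
from collections import deque


def store_units(levels, index_cfg, source_cfg):
    """Store per-table units from the lowest level up to the second level.

    levels[0] is the first level; every node is a dict with the keys
    "table", "upstream" and "key".
    """
    container = deque()
    for level in reversed(levels[1:]):  # lowest level first, second level last
        for node in level:
            table, upstream = node["table"], node["upstream"]
            # Unit 1 (cf. S1051): index and data source configuration of both tables.
            container.append({
                "kind": "configs", "table": table, "upstream": upstream,
                "index": (index_cfg[table], index_cfg[upstream]),
                "source": (source_cfg[table], source_cfg[upstream]),
            })
            # Unit 2 (cf. S1052): index configuration of both tables plus the upstream name.
            container.append({
                "kind": "upstream_name", "table": table, "upstream": upstream,
                "index": (index_cfg[table], index_cfg[upstream]),
            })
    # Final unit (cf. S1053): association relationships between all tables.
    container.append({
        "kind": "associations",
        "pairs": [(n["table"], n["upstream"], n["key"]) for lvl in levels[1:] for n in lvl],
    })
    return container


levels = [
    [{"table": "orders", "upstream": None, "key": None}],
    [{"table": "goods", "upstream": "orders", "key": "order_id"},
     {"table": "logistics", "upstream": "orders", "key": "order_id"}],
    [{"table": "suppliers", "upstream": "goods", "key": "sku"}],
]
index_cfg = {name: {"fields": ["*"]} for name in ("orders", "goods", "logistics", "suppliers")}
source_cfg = {name: {"data_source": "shop_db"} for name in index_cfg}

units = store_units(levels, index_cfg, source_cfg)
order = []
while units:
    unit = units.popleft()  # oldest entry first, i.e. the storage order
    if unit["kind"] == "configs":
        order.append((unit["table"], unit["upstream"]))
print(order)
# prints [('suppliers', 'goods'), ('goods', 'orders'), ('logistics', 'orders')]
```

Processing the lowest level first means that by the time a table is joined with its upstream table, the data of its own downstream tables has already been folded into it.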
In step S106, the index configuration information and the data source configuration information corresponding to the tables at each level and the upstream tables thereof are sequentially retrieved from the stack in the order of storage, and the data stored in the database for the tables at each level and the upstream tables thereof is processed to form an association table between the tables at each level and the upstream tables of the tables at each level. Wherein the association table includes: all fields configured by the upstream table of the table on each level, and data in the JSON format formed by format conversion of all fields configured by the table on each level. The specific process is shown in fig. 4.
Specifically, for each level of the table tree structure, the following operations are performed in order from the lowest level to the second level:
in step S1061, the index configuration information and the data source configuration information corresponding to the tables at the respective levels and the index configuration information and the data source configuration information corresponding to the upstream tables of the tables at the respective levels are taken out from the stack.
In step S1062, the data stored in the database by the table on each level and the table upstream of the table on each level is uploaded to the data warehouse of the server through the heterogeneous data source synchronization tool.
In step S1063, JOIN association is performed on the data uploaded to the data warehouse of the server (i.e., the data stored in the database by the table at each level and by the upstream table of that table) through the association key between the table at each level and its upstream table, and an association table of the table at each level and its upstream table is formed.
Specifically, JOIN association is performed on data uploaded to a data warehouse of the server by using a Structured Query Language (SQL) through an association key between a table on each level and an upstream table of the table on each level, so as to form an association table between the table on each level and the upstream table of the table on each level.
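A minimal sketch of this JOIN step is given below, with an in-memory SQLite database standing in for the server's data warehouse (the patent does not name the warehouse or its SQL dialect). The result keeps every field of the upstream table and adds one column holding the downstream table's configured fields as JSON, matching the association table described for step S106. The function `build_association_table` and the sample tables are hypothetical, and the sketch assumes that upstream rows are unique.

```python
import json
import sqlite3


def build_association_table(conn, upstream, child, key, child_fields):
    """LEFT JOIN child onto upstream, folding the child rows into one JSON column."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    up_cols = [row[1] for row in conn.execute(f'PRAGMA table_info("{upstream}")')]
    select_up = ", ".join(f'u."{c}"' for c in up_cols)
    select_ch = ", ".join(f'c."{f}"' for f in child_fields)
    sql = (
        f'SELECT {select_up}, {select_ch} FROM "{upstream}" AS u '
        f'LEFT JOIN "{child}" AS c ON u."{key}" = c."{key}" '
        f"ORDER BY u.rowid, c.rowid"
    )
    merged = {}
    for row in conn.execute(sql):
        up_values, ch_values = row[:len(up_cols)], row[len(up_cols):]
        items = merged.setdefault(up_values, [])
        # A LEFT JOIN yields all-NULL child columns when no child row matches.
        if any(value is not None for value in ch_values):
            items.append(dict(zip(child_fields, ch_values)))
    return [
        {**dict(zip(up_cols, up_values)), child: json.dumps(items)}
        for up_values, items in merged.items()
    ]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, buyer TEXT);
    CREATE TABLE goods (order_id INTEGER, sku TEXT, qty INTEGER);
    INSERT INTO orders VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO goods VALUES (1, 'A-100', 2), (1, 'B-200', 1);
    """
)

association = build_association_table(conn, "orders", "goods", "order_id", ["sku", "qty"])
for row in association:
    print(row)
```

A LEFT JOIN is used here so that upstream rows without matching downstream rows are kept with an empty JSON list; whether the patent intends an inner or outer join is not stated.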
In step S107, the data in the table on the first hierarchy and the association table of the table on the second hierarchy formed by the above operations are converted into data in the JSON format, and the interface of the search engine is called to write the data in the JSON format into the search engine, thereby achieving full-scale synchronization from the database to the search engine.
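The final write in step S107 depends on the search engine's interface, which the patent does not name. As one possibility, the sketch below serializes documents into the newline-delimited JSON body used by Elasticsearch-style bulk APIs: one action line followed by one document line per record. The engine choice, the helper `to_bulk_payload`, and the index and field names are assumptions; no network request is made.

```python
import json


def to_bulk_payload(index_name, documents, id_field):
    """Serialize documents as newline-delimited JSON for an Elasticsearch-style bulk request."""
    lines = []
    for doc in documents:
        # Action line: index this document under the given index and id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc[id_field])}}))
        # Source line: the flattened document itself.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # Bulk bodies must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical flattened documents produced by the earlier steps.
documents = [
    {"order_id": 1, "buyer": "alice", "goods": [{"sku": "A-100", "qty": 2}]},
    {"order_id": 2, "buyer": "bob", "goods": []},
]
payload = to_bulk_payload("order_index", documents, "order_id")
print(payload)
```

Batching many documents into one request body is a common way to keep a full synchronization efficient, since it avoids one round trip per record.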
By applying the full-scale synchronization method applied to the search service, complex multi-table-associated business data held under different data source types and different central libraries of an application system can be leveled and fully synchronized to the search engine efficiently, so the method has good application prospects.
In order to facilitate a better understanding of the invention, the technical solutions of the invention are described in detail below by way of example.
In step S101, a search engine index name for which a full-scale synchronization operation is required is acquired.
In step S102, index configuration information and data source configuration information corresponding to the search engine index name are queried according to the search engine index name. The specific process is as follows.
In step S1021, index configuration information corresponding to the search engine index name is queried according to the search engine index name.
Specifically, according to the search engine index name, the index configuration information corresponding to the search engine index name is queried from the index configuration table of the configuration library (MongoDB) by calling the query interface of the search engine's configuration library.
The index configuration information corresponding to the search engine index name includes: the number of tables configured for the search engine index name, the name of each of the plurality of tables, the association relationship and association key between every two of the plurality of tables, the fields configured for each of the plurality of tables, and the data source configuration name corresponding to each of the plurality of tables.
In step S1022, the data source configuration information corresponding to each of the plurality of tables is queried based on the index configuration information corresponding to the search engine index name.
Specifically, according to the data source configuration name corresponding to each of the plurality of tables, the data source configuration information corresponding to each table is queried from the data source configuration table of the configuration library (MongoDB) by calling the query interface of the configuration library of the search engine.
Wherein the data source configuration information corresponding to each of the plurality of tables includes: the type, address, port, library name, username, and password of the database to which the data source corresponding to each of the plurality of tables belongs.
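As a non-limiting illustration, the two lookups of steps S1021 and S1022 can be sketched in Python as follows. The two dictionaries merely stand in for the index configuration table and the data source configuration table of the configuration library; the index name `orders_index`, the data source names, the field names, and all addresses and credentials are hypothetical and do not come from the embodiment.

```python
# Hypothetical in-memory stand-ins for the two tables of the configuration library.
INDEX_CONFIG_TABLE = {
    "orders_index": {
        "table_count": 2,
        "tables": {
            "A": {"fields": ["id", "title"], "datasource": "ds_mysql"},
            "B": {"fields": ["id", "a_id"], "datasource": "ds_oracle"},
        },
        # (downstream table, upstream table, association key) -- assumed layout
        "associations": [("B", "A", "a_id")],
    }
}
DATASOURCE_CONFIG_TABLE = {
    "ds_mysql": {"type": "MYSQL", "address": "192.0.2.10", "port": 3306,
                 "library": "sales", "username": "reader", "password": "secret"},
    "ds_oracle": {"type": "ORACLE", "address": "192.0.2.11", "port": 1521,
                  "library": "erp", "username": "reader", "password": "secret"},
}


def query_index_config(index_name):
    """Step S1021: look up the index configuration by search engine index name."""
    return INDEX_CONFIG_TABLE[index_name]


def query_datasource_configs(index_config):
    """Step S1022: resolve each table's data source configuration name."""
    return {
        table: DATASOURCE_CONFIG_TABLE[cfg["datasource"]]
        for table, cfg in index_config["tables"].items()
    }


index_config = query_index_config("orders_index")
datasources = query_datasource_configs(index_config)
print(datasources["B"]["type"], datasources["B"]["port"])
```

Keeping the data source configuration in a separate table keyed by name lets several tables share one connection definition, which is consistent with the index configuration storing only a data source configuration name per table.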
In step S103, a connection is made to the database via JDBC, the number of pieces of data stored in the database for each of the plurality of tables is queried, and it is determined whether the number of pieces of data stored in the database for each of the plurality of tables is 0.
If the number of pieces of data that a table stores in the database is 0, that table is removed from the index configuration information corresponding to the search engine index name.
In step S104, a table tree structure is created for the plurality of tables according to the association relationship and the association key between every two of the plurality of tables. The structure is shown in Fig. 5.
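The check of step S103 can be sketched as follows, purely by way of example. The embodiment connects through JDBC; this sketch substitutes Python's built-in `sqlite3` module and an in-memory database so that it is self-contained. The function names, the layout of `config`, and the choice to also drop association entries that mention a removed table are assumptions made for illustration.

```python
import sqlite3


def count_rows(conn, table):
    """Return the number of records stored in one table."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    return conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]


def drop_empty_tables(conn, index_config):
    """Remove tables holding 0 records from the index configuration."""
    kept = [t for t in index_config["tables"] if count_rows(conn, t["name"]) > 0]
    kept_names = {t["name"] for t in kept}
    # Assumption: associations referring to a removed table are dropped as well.
    associations = [
        a for a in index_config["associations"]
        if a[0] in kept_names and a[1] in kept_names
    ]
    return {**index_config, "tables": kept, "associations": associations}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER)")
conn.execute("CREATE TABLE C (id INTEGER, a_id INTEGER)")
conn.execute("INSERT INTO A VALUES (1)")  # table C is left empty

config = {
    "tables": [{"name": "A"}, {"name": "C"}],
    "associations": [("C", "A", "a_id")],  # (downstream, upstream, key)
}
pruned = drop_empty_tables(conn, config)
print([t["name"] for t in pruned["tables"]])
```

Pruning empty tables before the tree is built avoids uploading and joining tables that cannot contribute any data to the index.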
In this example, the table tree structure includes three levels: the first level has one table (primary table A); the second level has two tables (secondary table B and secondary table C, which are downstream of primary table A and have an association relationship with it); and the third level has two tables (tertiary table D and tertiary table E, which are downstream of secondary table B and have an association relationship with it).
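One possible way to derive the levels of such a table tree from the pairwise association relationships is sketched below. The function name `build_levels` and the representation of each association as a (downstream, upstream) pair are assumptions for illustration; the sketch also assumes the associations form a tree without cycles.

```python
def build_levels(table_names, associations):
    """Group tables into levels, starting from the single first-level table."""
    children = {}
    has_upstream = set()
    for downstream, upstream in associations:
        children.setdefault(upstream, []).append(downstream)
        has_upstream.add(downstream)

    # The first-level table is the only one without an upstream table.
    roots = [t for t in table_names if t not in has_upstream]
    if len(roots) != 1:
        raise ValueError("expected exactly one first-level table")

    levels = [roots]
    while True:
        next_level = [c for parent in levels[-1] for c in children.get(parent, [])]
        if not next_level:
            break
        levels.append(next_level)
    return levels, children


tables = ["A", "B", "C", "D", "E"]
associations = [("B", "A"), ("C", "A"), ("D", "B"), ("E", "B")]
levels, children = build_levels(tables, associations)
print(levels)
```

For the five tables of this example the sketch yields one table on the first level, two on the second, and two on the third, matching the structure described above.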
For each level of the table tree structure, the following operations are performed in sequence starting from the third level to the second level:
In step S105, using a preset rule, the index configuration information and the data source configuration information corresponding to the tables on each level and their upstream tables, the names of the upstream tables, and the association relationships among all tables in the table tree structure are stored in a stack in sequence. The specific process is as follows.
Specifically, in step S1051, the index configuration information and the data source configuration information corresponding to tertiary table D and tertiary table E, together with the index configuration information and the data source configuration information corresponding to secondary table B (i.e., the upstream table of tertiary table D and tertiary table E), are first put into the stack as one unit. Then, the index configuration information and the data source configuration information corresponding to secondary table B and secondary table C, together with the index configuration information and the data source configuration information corresponding to primary table A (i.e., the upstream table of secondary table B and secondary table C), are put into the stack as one unit.
In step S1052, the index configuration information corresponding to tertiary table D and tertiary table E, the index configuration information corresponding to secondary table B (i.e., the upstream table of tertiary table D and tertiary table E), and the name of secondary table B are first put into the stack as one unit. Then, the index configuration information corresponding to secondary table B and secondary table C, the index configuration information corresponding to primary table A (i.e., the upstream table of secondary table B and secondary table C), and the name of primary table A are put into the stack as one unit.
In step S1053, the association relationships among all the tables in the table tree structure (i.e., the association relationships among primary table A, secondary table B, secondary table C, tertiary table D, and tertiary table E) are put into the stack as one unit.
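The storage order of steps S1051 to S1053 can be illustrated with the following sketch. The unit layout, the function name `store_units`, and the grouping of tables by shared upstream table are assumptions. Because the units are later taken out in the order in which they were stored, the sketch uses a `collections.deque` that is appended on the right and read from the left; a strictly last-in, first-out stack would require the units to be pushed in the reverse order.

```python
from collections import deque


def store_units(levels, upstream_of, index_cfg, datasource_cfg):
    """Store one unit per (level group, upstream table), lowest level first."""
    stack = deque()
    lower_levels = list(reversed(levels[1:]))  # lowest level first, second level last

    def groups(level):
        # Group the tables of one level by their shared upstream table.
        for upstream in dict.fromkeys(upstream_of[t] for t in level):
            yield [t for t in level if upstream_of[t] == upstream], upstream

    # S1051: index and data source configuration of each group and its upstream table.
    for level in lower_levels:
        for tables, upstream in groups(level):
            members = tables + [upstream]
            stack.append({
                "step": "S1051", "tables": tables, "upstream": upstream,
                "index": {t: index_cfg[t] for t in members},
                "datasource": {t: datasource_cfg[t] for t in members},
            })
    # S1052: index configuration plus the name of the upstream table.
    for level in lower_levels:
        for tables, upstream in groups(level):
            members = tables + [upstream]
            stack.append({
                "step": "S1052", "tables": tables, "upstream": upstream,
                "index": {t: index_cfg[t] for t in members},
            })
    # S1053: the association relationships among all tables, as a single unit.
    stack.append({"step": "S1053", "associations": sorted(upstream_of.items())})
    return stack


levels = [["A"], ["B", "C"], ["D", "E"]]
upstream_of = {"B": "A", "C": "A", "D": "B", "E": "B"}
index_cfg = {t: {"fields": ["id"]} for t in "ABCDE"}        # placeholder values
datasource_cfg = {t: {"type": "MYSQL"} for t in "ABCDE"}    # placeholder values
stack = store_units(levels, upstream_of, index_cfg, datasource_cfg)
print([(u["step"], u.get("upstream")) for u in stack])
```

Calling `stack.popleft()` repeatedly would return the unit for tertiary tables D and E with secondary table B first, followed by the unit for secondary tables B and C with primary table A.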
In step S106, the index configuration information and the data source configuration information corresponding to the tables at each level and the upstream tables thereof are sequentially retrieved from the stack in the order of storage, and the data stored in the database for the tables at each level and the upstream tables thereof is processed to form an association table between the tables at each level and the upstream tables of the tables at each level. The specific process is as follows.
Specifically, first, the index configuration information and the data source configuration information corresponding to the tertiary table D and the tertiary table E, and the index configuration information and the data source configuration information corresponding to the secondary table B (i.e., the upstream table of the tertiary table D and the tertiary table E) are taken out of the stack.
Second, the data stored in the database for tertiary table D, tertiary table E, and secondary table B is uploaded to the data warehouse of the server through a heterogeneous data source synchronization tool, forming three tables (namely, tertiary table D, tertiary table E, and secondary table B).
Third, a JOIN is performed with Structured Query Language (SQL) on the data uploaded to the data warehouse of the server (namely, tertiary table D, tertiary table E, and secondary table B), through the association keys between tertiary table D and secondary table B and between tertiary table E and secondary table B, to form the association table of tertiary table D, tertiary table E, and secondary table B, namely the B_D_E table. The B_D_E table includes: all fields configured in secondary table B, JSON-format data formed by converting all fields configured in tertiary table D, and JSON-format data formed by converting all fields configured in tertiary table E.
It should be noted that the JSON-format data formed by converting all fields configured in tertiary table D is stored as an independent field in the B_D_E table, and the JSON-format data formed by converting all fields configured in tertiary table E is stored as another independent field in the B_D_E table.
Next, the index configuration information and the data source configuration information corresponding to secondary table B and secondary table C, and the index configuration information and the data source configuration information corresponding to primary table A (i.e., the upstream table of secondary table B and secondary table C), are taken out of the stack.
Then, the data stored in the database for secondary table B, secondary table C, and primary table A is uploaded to the data warehouse of the server through the heterogeneous data source synchronization tool, forming three tables (namely, secondary table B, secondary table C, and primary table A).
Finally, a JOIN is performed with Structured Query Language (SQL) on the data uploaded to the data warehouse of the server (namely, secondary table B, secondary table C, and primary table A), through the association keys between secondary table B and primary table A and between secondary table C and primary table A, to form the association table of secondary table B, secondary table C, and primary table A, namely the A_B_C table. The A_B_C table includes: all fields configured in primary table A, JSON-format data formed by converting all fields in the B_D_E table, and JSON-format data formed by converting all fields configured in secondary table C.
It should be noted that the JSON-format data formed by converting all fields in the B_D_E table is stored as an independent field in the A_B_C table, and the JSON-format data formed by converting all fields configured in secondary table C is stored as another independent field in the A_B_C table.
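The bottom-up formation of the B_D_E table and then the A_B_C table can be sketched as follows. The embodiment performs the association with an SQL JOIN in the data warehouse; this sketch substitutes in-memory matching on the association key so that it runs on its own. The sample rows, the key names `a_id` and `b_id`, the function name `join_level`, and the choice to store all matching downstream rows as a JSON array are assumptions for illustration.

```python
import json


def join_level(upstream_rows, downstream):
    """Attach each downstream table to its upstream rows as one JSON field.

    downstream maps a table name to (rows, key_in_downstream, key_in_upstream).
    """
    joined = []
    for up in upstream_rows:
        row = dict(up)  # keep all fields configured in the upstream table
        for name, (rows, down_key, up_key) in downstream.items():
            matches = [r for r in rows if r[down_key] == up[up_key]]
            # All fields of the matching downstream rows become one JSON field.
            row[name] = json.dumps(matches, sort_keys=True)
        joined.append(row)
    return joined


A = [{"id": 1, "title": "order-1"}]
B = [{"id": 10, "a_id": 1}]
C = [{"id": 20, "a_id": 1}]
D = [{"id": 100, "b_id": 10}]
E = [{"id": 200, "b_id": 10}]

# Third level into second level: the B_D_E table.
b_d_e = join_level(B, {"D": (D, "b_id", "id"), "E": (E, "b_id", "id")})
# Second level into first level: the A_B_C table, built on B_D_E rather than raw B.
a_b_c = join_level(A, {"B": (b_d_e, "a_id", "id"), "C": (C, "a_id", "id")})
print(json.dumps(a_b_c[0], indent=2))
```

Each pass folds one level into its upstream table, so after the last pass every row of the first-level table carries the data of all downstream tables as nested JSON fields.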
In step S107, the data in the A_B_C table is converted into JSON-format data, and the interface of the search engine is called to write the JSON-format data into the search engine in batches, so that full-scale synchronization from the database to the search engine is realized.
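The batch writing of step S107 can be sketched as follows. The embodiment does not name the interface of the search engine, so `write_batch` is a hypothetical callable standing in for it, and the batch size is an arbitrary illustrative value.

```python
import json
from itertools import islice


def batches(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def sync_to_search_engine(rows, write_batch, batch_size=500):
    """Convert rows to JSON and hand them to the search engine batch by batch."""
    total = 0
    for chunk in batches(rows, batch_size):
        # write_batch is a hypothetical stand-in for the search engine interface.
        write_batch([json.dumps(r, ensure_ascii=False) for r in chunk])
        total += len(chunk)
    return total


sent = []  # records each batch instead of calling a real search engine
count = sync_to_search_engine([{"id": i} for i in range(5)], sent.append, batch_size=2)
print(count, [len(b) for b in sent])
```

Writing in batches bounds the size of each request, which is one plausible reason for the batch write described in this step.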
Those skilled in the art will appreciate that the modules or steps of the invention described above can be implemented in a general purpose computing device, centralized on a single computing device or distributed across a network of computing devices, and optionally implemented in program code that is executable by a computing device, such that the modules or steps are stored in a memory device and executed by a computing device, fabricated separately into integrated circuit modules, or fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
Although the embodiments of the present invention have been described above, the above description is only for the convenience of understanding the present invention, and is not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A full-scale synchronization method applied to search services, comprising:
acquiring a search engine index name which needs to be subjected to full-scale synchronous operation;
according to the search engine index name, querying index configuration information and data source configuration information corresponding to the search engine index name, wherein the index configuration information corresponding to the search engine index name comprises: the number of a plurality of tables configured for the search engine index name, a name of each of the plurality of tables, an association relationship and an association key between every two of the plurality of tables, fields configured for each of the plurality of tables, and a data source configuration name corresponding to each of the plurality of tables;
connecting to the database through JDBC, and querying the number of pieces of data stored in the database by each table in the plurality of tables;
creating a table tree structure for the plurality of tables according to the association relation and the association key between every two tables in the plurality of tables, wherein the table tree structure comprises a plurality of levels, one table is arranged on a first level, and at least one table is arranged on the other levels except the first level;
for each level of the table tree structure, the following operations are performed in sequence starting from the lowest level to the second level:
sequentially storing index configuration information and data source configuration information corresponding to the tables on each level and the upstream tables thereof, the names of the upstream tables, and the association relationships among all the tables in the table tree structure into a stack by using a preset rule;
according to the storage sequence, sequentially taking out index configuration information and data source configuration information corresponding to the tables on each level and the upstream tables thereof from the stack, and processing data stored in a database by the tables on each level and the upstream tables thereof to form an association table of the tables on each level and the upstream tables of the tables on each level, wherein the association table comprises: all fields configured by the upstream table of the table on each level, and JSON format data formed by format conversion of all fields configured by the table on each level;
and converting the data in the association table of the table on the first level and the tables on the second level, which is formed through the above operations, into JSON-format data, and calling an interface of a search engine to write the JSON-format data into the search engine, so that the full-scale synchronization from the database to the search engine is realized.
2. The full-scale synchronization method applied to search services according to claim 1, wherein querying index configuration information and data source configuration information corresponding to the search engine index name according to the search engine index name comprises:
querying index configuration information corresponding to the search engine index name according to the search engine index name;
and querying data source configuration information corresponding to each table in the plurality of tables according to the index configuration information corresponding to the search engine index name.
3. The full-scale synchronization method applied to search services according to claim 2, wherein querying index configuration information corresponding to the search engine index name according to the search engine index name comprises:
according to the search engine index name, querying index configuration information corresponding to the search engine index name from an index configuration table of a configuration library by calling a query interface of the configuration library of the search engine.
4. The full-scale synchronization method applied to search services according to claim 2, wherein querying data source configuration information corresponding to each table of the plurality of tables according to index configuration information corresponding to the search engine index name comprises:
querying data source configuration information corresponding to each table in the plurality of tables from the data source configuration table of the configuration library by calling a query interface of the configuration library of the search engine according to the data source configuration name corresponding to each table in the plurality of tables.
5. The full-scale synchronization method applied to search services according to claim 4, wherein the data source configuration information corresponding to each of the plurality of tables comprises: a type, address, port, library name, user name, and password of a database to which the data source corresponding to each of the plurality of tables belongs.
6. The full-scale synchronization method applied to search services according to claim 5, wherein the type of the database to which the data source corresponding to each table of the plurality of tables belongs comprises: SQL Server database, MYSQL database and ORACLE database.
7. The full-scale synchronization method applied to search services according to any one of claims 1 to 6, wherein after querying the number of pieces of data stored in the database by each table of the plurality of tables and before creating a table tree structure for the plurality of tables according to the association relationship and the association key between every two tables of the plurality of tables, the method further comprises:
judging whether the number of pieces of data stored in the database by each table in the plurality of tables is 0;
if the number of pieces of data stored in the database by a table is 0, removing the table from the index configuration information corresponding to the search engine index name.
8. The full-scale synchronization method applied to search services according to claim 7, wherein the storing, in a stack, index configuration information and data source configuration information corresponding to the tables at each level and the upstream table thereof, names of the upstream table, and association relationships among all tables in the table tree structure in sequence by using a preset rule comprises:
for each level of the table tree structure, starting from the lowest level and up to the second level, the following operations are performed:
sequentially putting index configuration information and data source configuration information corresponding to the tables on each level and index configuration information and data source configuration information corresponding to an upstream table of the tables on each level into a stack as a unit, wherein the index configuration information corresponding to the tables on each level comprises: the index configuration information corresponding to the upstream table of the table on each level includes: the association relationships and association keys between the tables on each level and the upstream tables of the tables on each level, the fields configured for the upstream tables of the tables on each level, and the data source configuration names corresponding to the upstream tables of the tables on each level;
sequentially putting index configuration information corresponding to the tables on each level, index configuration information corresponding to the upstream tables of the tables on each level, and the names of the upstream tables of the tables on each level into a stack as a unit;
and putting the association relation among all the tables in the table tree structure into a stack as a unit.
9. The full-scale synchronization method applied to search services according to claim 8, wherein the index configuration information and the data source configuration information corresponding to the tables at each level and the upstream tables thereof are sequentially retrieved from the stack in the storage order, and the data stored in the database by the tables at each level and the upstream tables thereof are processed to form the association table between the table at each level and the upstream table of the table at each level, comprising:
for each level of the table tree structure, the following operations are performed in sequence starting from the lowest level to the second level:
taking out index configuration information and data source configuration information corresponding to the tables on each level and index configuration information and data source configuration information corresponding to the upstream tables of the tables on each level from the stack;
uploading data stored in a database by the tables on each level and the upstream tables of the tables on each level to a data warehouse of a server through a heterogeneous data source synchronization tool;
JOIN association is carried out on the data uploaded to the data warehouse of the server through an association key between the table on each level and an upstream table of the table on each level, and an association table of the table on each level and the upstream table of the table on each level is formed.
10. The full-scale synchronization method applied to search services according to claim 9, wherein JOIN association of data uploaded to a data warehouse of a server is performed through an association key between a table on each level and an upstream table of the table on each level, and an association table of the table on each level and the upstream table of the table on each level is formed, and comprises:
JOIN association is carried out on data uploaded to a data warehouse of the server by using a structured query statement through an association key between the table on each level and an upstream table of the table on each level, so as to form an association table of the table on each level and the upstream table of the table on each level.
CN201910343332.2A 2019-04-26 2019-04-26 Full-scale synchronization method applied to search service Active CN110263028B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910343332.2A CN110263028B (en) 2019-04-26 2019-04-26 Full-scale synchronization method applied to search service

Publications (2)

Publication Number Publication Date
CN110263028A CN110263028A (en) 2019-09-20
CN110263028B CN110263028B (en) 2021-06-15

Family

ID=67913911
