Method for writing picture data and distributed NewSQL database system
Technical Field
The invention relates to the technical field of big data, in particular to a distributed NewSQL database system for writing picture data.
Background
The data stored by HBase has no data type; every value is a byte array. To store picture data, the picture must be serialized and stored together with the data of the other fields. In practice, picture data is written once and read many times and is large, while the other fields are read and written frequently, so storing them together reduces read performance when only the other fields are read. Furthermore, because the large volume of data in an HBase region must be flushed at the same time when HBase flushes to disk, storing the data together also degrades write performance.
Disclosure of Invention
The embodiment of the invention aims to provide a distributed NewSQL database system for writing picture data, which provides LOB storage, meets the picture storage requirement, and solves the problem of reduced data read performance caused by picture data storage.
To achieve the above object, an embodiment of the present invention provides a method for writing picture data, which is applicable to a distributed NewSQL database system and includes:
receiving a user request through a JDBC/ODBC interface, wherein the user request comprises picture data to be written;
analyzing the user request, compiling and generating a corresponding execution plan;
generating an MD5 value from the picture data according to the execution plan, and writing the MD5 value into an original data table; simultaneously, writing the picture data into a picture data table;
and returning to the user the processing result indicating that writing of the picture data is complete.
Further, the method also comprises converting the user request into an SQL request in the form of an SQL statement.
Further, the analyzing the user request, compiling, and generating the corresponding execution plan includes:
judging whether a pre-stored SQL statement corresponding to the SQL request exists in a shared cache pool; if so, outputting the execution plan corresponding to the pre-stored SQL statement; if not,
performing a syntax check on the SQL request, and returning an error message to the user if there is a syntax error; otherwise,
performing a semantic check on the SQL request, and returning an error message to the user if there is a semantic error; otherwise,
carrying out view and expression conversion on the SQL request to obtain a corresponding conversion result;
selecting an optimizer according to the conversion result to obtain a corresponding optimizer selection result;
selecting a corresponding data connection mode and a corresponding connection sequence according to the selection result of the optimizer;
selecting a search path according to the connection mode and the connection sequence;
and generating an execution plan according to the search path, and outputting the execution plan.
The embodiment of the invention also provides a distributed NewSQL database system, which comprises:
a JDBC/ODBC interface unit for interacting with a user, receiving a user request and returning a processing result; wherein the user request comprises picture data to be written, and the processing result is the result of writing the picture data;
a master unit for accepting the user request received by the JDBC/ODBC interface unit, coordinating data communication among a plurality of processors and managing the whole flow, and preferentially sending the user request to the SQL Planer unit;
an SQL Planer unit for parsing the user request, and compiling and customizing an execution plan according to the user request;
a worker unit for executing the plan in parallel, comprising: according to the execution plan, generating an MD5 value from the picture data to be written, and writing the MD5 value into an original data table; simultaneously, writing the picture data into a picture data table; the worker unit is also used for returning the processing result of the HBase unit to the master unit;
an HBase unit for storing the original data table and the picture data table, wherein a LOB type is added to the bottom layer of the HBase unit;
and a distributed transaction manager for coordinating multiple parties to complete distributed transaction management when the execution plan of the worker unit involves a transaction.
Further, the JDBC/ODBC interface unit is also used for converting the user request into an SQL request in the form of an SQL statement.
Further, the SQL Planer unit is used for:
judging whether a pre-stored SQL statement corresponding to the SQL request exists in a shared cache pool; if so, outputting the execution plan corresponding to the pre-stored SQL statement; if not,
performing a syntax check on the SQL request, and returning an error message to the user if there is a syntax error; otherwise,
performing a semantic check on the SQL request, and returning an error message to the user if there is a semantic error; otherwise,
carrying out view and expression conversion on the SQL request to obtain a corresponding conversion result;
selecting an optimizer according to the conversion result to obtain a corresponding optimizer selection result;
selecting a corresponding data connection mode and a corresponding connection sequence according to the selection result of the optimizer;
selecting a search path according to the connection mode and the connection sequence;
and generating an execution plan according to the search path, and outputting the execution plan.
Further, the system also comprises:
a monitor for managing metadata, monitoring the load of the Regions of the HBase unit, and reallocating the Regions through a coprocessor module of the HBase unit; the monitor is connected with the master unit.
Further, monitoring the load of the Regions of the HBase unit and reallocating the Regions through the coprocessor module of the HBase unit includes:
receiving data distribution information of the HBase unit, and receiving load information of the worker unit in the master unit, wherein the load information comprises a load deviation value of the worker unit;
comparing the load deviation value of the worker unit with a preset load deviation threshold, and if the load deviation value exceeds the threshold, triggering the HBase unit to redistribute the Regions on the servers with higher hit rates and the Regions on the servers with lower hit rates;
acquiring the data volume of each Region, comparing it with a preset data volume threshold, and if the data volume of a Region exceeds the threshold, triggering the HBase unit to split that Region into two Regions.
Further, the JDBC/ODBC interface unit includes:
a JDBC application program module for receiving a user request, calling a JDBC object method to issue an SQL statement, and extracting the result to return to the user;
a JDBC driver manager module for loading and calling the JDBC driver module for the JDBC application program module;
a JDBC driver module for executing the call of the JDBC object method, sending the SQL statement corresponding to the user request to the underlying database, and returning the result obtained from the underlying database to the JDBC application program module.
Compared with the prior art, the method for writing picture data and the distributed NewSQL database system provided by the embodiment of the invention receive a user request through a JDBC/ODBC interface, wherein the user request comprises the picture data to be written; analyze the user request, compile and generate a corresponding execution plan; generate an MD5 value from the picture data according to the execution plan, write the MD5 value into an original data table, and write the picture data into a picture data table; and return to the user the processing result indicating that writing of the picture data is complete. This provides LOB storage, meets the picture storage requirement, and solves the prior-art problem of reduced HBase read performance caused by picture data storage.
Drawings
Fig. 1 is a flowchart illustrating a method for writing picture data according to embodiment 1 of the present invention;
fig. 2 is a schematic structural diagram of a distributed NewSQL database system provided in embodiment 2 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic flowchart of a method for writing picture data according to embodiment 1 of the present invention; the embodiment comprises the following steps:
S1, receiving a user request through a JDBC/ODBC interface, wherein the user request comprises picture data to be written;
S2, analyzing the user request, compiling and generating a corresponding execution plan;
S3, according to the execution plan, generating an MD5 value from the picture data, and writing the MD5 value into an original data table; simultaneously, writing the picture data into a picture data table;
and S4, returning to the user the processing result indicating that writing of the picture data is complete.
To generate the index ID of a picture, the method calculates the MD5 of the picture data and uses the MD5 result as the unique index ID of the picture data. The picture data can only be modified by atomic overwrite and is queried relatively independently, which can greatly improve retrieval speed when querying non-picture fields.
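The split write of step S3 can be illustrated with a minimal sketch. It assumes two in-memory dictionaries as stand-ins for the original data table and the picture data table; the names `original_table`, `picture_table`, `write_picture`, `read_fields` and `read_picture` are hypothetical and do not come from the embodiment, and the two writes are performed one after the other rather than simultaneously.

```python
import hashlib

# Hypothetical in-memory stand-ins for the two tables (assumption: the
# embodiment stores both tables in HBase, which is not modeled here).
original_table = {}   # row key -> non-picture fields plus the picture's MD5
picture_table = {}    # MD5 hex digest -> raw picture bytes


def write_picture(row_key, fields, picture_bytes):
    """Store the MD5 with the ordinary fields and the bytes in a separate table."""
    picture_id = hashlib.md5(picture_bytes).hexdigest()
    # The original data table carries only the 32-character digest.
    original_table[row_key] = {**fields, "picture_md5": picture_id}
    # The picture is written whole under its digest (overwrite, never patch).
    picture_table[picture_id] = picture_bytes
    return picture_id


def read_fields(row_key):
    """Read non-picture fields without touching the picture bytes."""
    return original_table[row_key]


def read_picture(row_key):
    """Follow the stored MD5 to fetch the picture bytes."""
    return picture_table[original_table[row_key]["picture_md5"]]
```

In this sketch a query on non-picture fields reads only the small row in `original_table`, which is the effect the embodiment attributes to keeping the picture bytes in their own table.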
Further, step S1 includes converting the user request into an SQL request in the form of an SQL statement.
Further, the parsing, compiling and generating the corresponding execution plan in step S2 includes:
S21, judging whether a pre-stored SQL statement corresponding to the SQL request exists in the shared cache pool; if so, outputting the execution plan corresponding to the pre-stored SQL statement; if not,
S22, performing a syntax check on the SQL request, and returning an error message to the user if there is a syntax error; otherwise,
S23, performing a semantic check on the SQL request, and returning an error message to the user if there is a semantic error; otherwise,
S24, carrying out view and expression conversion on the SQL request to obtain a corresponding conversion result;
S25, selecting an optimizer according to the conversion result to obtain a corresponding optimizer selection result;
S26, selecting a corresponding data connection mode and a connection sequence according to the optimizer selection result;
S27, selecting a search path according to the connection mode and the connection sequence;
and S28, generating an execution plan according to the search path and outputting the execution plan.
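The control flow of steps S21 to S28 can be sketched as follows. This is only an illustration under stated assumptions: the checks are toy stand-ins for a real parser and catalog lookup, the values chosen for the optimizer, connection mode and search path are placeholders, and storing a newly generated plan back into the cache is an assumption, since the text only says the cache is consulted. All names (`plan_cache`, `build_plan`, `PlanError`, and so on) are hypothetical.

```python
class PlanError(Exception):
    """Raised when the syntax or semantic check fails."""


plan_cache = {}  # stand-in for the shared cache pool: SQL text -> plan


def check_syntax(sql):
    # Toy check only: a real parser would validate the full grammar.
    keywords = ("SELECT", "INSERT", "UPDATE", "DELETE")
    if not sql.strip().upper().startswith(keywords):
        raise PlanError("syntax error")


def check_semantics(sql, known_tables):
    # Toy check only: the statement must mention at least one known table.
    if not any(table in sql for table in known_tables):
        raise PlanError("semantic error: unknown table")


def build_plan(sql, known_tables):
    """Return (plan, cache_hit), following the order of steps S21 to S28."""
    if sql in plan_cache:                   # S21: reuse a pre-stored plan
        return plan_cache[sql], True
    check_syntax(sql)                       # S22
    check_semantics(sql, known_tables)      # S23
    plan = {
        "rewritten": sql.strip(),           # S24: conversion (placeholder)
        "optimizer": "cost_based",          # S25: placeholder choice
        "connection": {"mode": "hash", "sequence": []},  # S26: placeholder
        "search_path": "full_scan",         # S27: placeholder
    }
    plan_cache[sql] = plan                  # assumption: cache the new plan
    return plan, False                      # S28: output the plan
```

A caller would catch `PlanError` and return its message to the user, which corresponds to the error branches of S22 and S23.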
In a specific implementation, a user request is first received through a JDBC/ODBC interface, wherein the user request comprises picture data to be written; then the user request is analyzed, and a corresponding execution plan is compiled and generated; then, according to the execution plan, an MD5 value is generated from the picture data and written into an original data table, and the picture data is simultaneously written into a picture data table; finally, the processing result indicating that writing of the picture data is complete is returned to the user.
The embodiment provides LOB storage, meets the picture storage requirement, and solves the prior-art problem of reduced HBase read performance caused by picture data storage.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a distributed NewSQL database system according to embodiment 2 of the present invention, where embodiment 2 includes:
a JDBC/ODBC interface unit 1 for interacting with a user, receiving a user request and returning a processing result; wherein the user request comprises picture data to be written, and the processing result is the result of writing the picture data;
a master unit 2 for accepting the user request received by the JDBC/ODBC interface unit 1, coordinating data communication among a plurality of processors and managing the whole flow, and preferentially sending the user request to the SQL Planer unit 3;
an SQL Planer unit 3 for parsing the user request, and compiling and customizing an execution plan according to the user request;
a worker unit 4 for executing the plan in parallel, comprising: according to the execution plan, generating an MD5 value from the picture data to be written, and writing the MD5 value into an original data table; simultaneously, writing the picture data into a picture data table; the worker unit 4 is also used for returning the processing result of the HBase unit 6 to the master unit 2;
an HBase unit 6 for storing the original data table and the picture data table, wherein a LOB type is added to the bottom layer of the HBase unit 6;
and a distributed transaction manager 5 for coordinating multiple parties to complete distributed transaction management when the execution plan of the worker unit 4 involves a transaction.
Generally, the distributed NewSQL database system of this embodiment allows a user to flexibly establish secondary indexes according to specific business logic. In practical applications, the user often establishes a plurality of secondary indexes; during use, the system dynamically calculates the cost of using each index according to the query conditions and automatically selects the most appropriate index.
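The cost-based choice among secondary indexes mentioned in this embodiment can be illustrated with a small sketch. The embodiment does not describe its cost model, so the one used here (an estimated number of rows scanned per index) and the names `choose_index`, `indexes` and `query_columns` are purely hypothetical.

```python
def choose_index(indexes, query_columns):
    """Pick the cheapest usable index, or None to fall back to a full scan.

    indexes: mapping of index name -> (indexed column, estimated rows scanned)
    query_columns: set of columns constrained by the query conditions
    """
    # Only indexes whose column appears in the query conditions are usable.
    usable = {
        name: rows
        for name, (column, rows) in indexes.items()
        if column in query_columns
    }
    if not usable:
        return None
    # Assumed cost model: fewer estimated rows scanned means lower cost.
    return min(usable, key=usable.get)
```

For example, with indexes on `city` (5000 estimated rows) and `user_id` (20 estimated rows), a query constraining both columns would select the `user_id` index under this assumed cost model.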
To generate the index ID of a picture, the system calculates the MD5 of the picture data and uses the MD5 result as the unique index ID of the picture data. The picture data can only be modified by atomic overwrite and is queried relatively independently, which can greatly improve retrieval speed when querying non-picture fields.
Further, the JDBC/ODBC interface unit 1 is configured to convert the user request into an SQL request in the form of an SQL statement.
Further, the SQL Planer unit 3 is used for:
judging whether a pre-stored SQL statement corresponding to the SQL request exists in a shared cache pool; if so, outputting the execution plan corresponding to the pre-stored SQL statement; if not,
performing a syntax check on the SQL request, and returning an error message to the user if there is a syntax error; otherwise,
performing a semantic check on the SQL request, and returning an error message to the user if there is a semantic error; otherwise,
carrying out view and expression conversion on the SQL request to obtain a corresponding conversion result;
selecting an optimizer according to the conversion result to obtain a corresponding optimizer selection result;
selecting a corresponding data connection mode and a corresponding connection sequence according to the selection result of the optimizer;
selecting a search path according to the connection mode and the connection sequence;
and generating an execution plan according to the search path, and outputting the execution plan.
Further, this embodiment further includes:
a monitor 8 for managing metadata, monitoring the load of the Regions of the HBase unit 6, and reallocating the Regions through the coprocessor module of the HBase unit 6; the monitor 8 is connected to the master unit 2.
Further, monitoring the load of the Regions of the HBase unit 6 and reallocating the Regions through the coprocessor module of the HBase unit 6 includes:
receiving data distribution information of the HBase unit 6, and receiving load information of the worker unit 4 in the master unit 2, wherein the load information comprises a load deviation value of the worker unit 4;
comparing the load deviation value of the worker unit 4 with a preset load deviation threshold, and if the load deviation value exceeds the threshold, triggering the HBase unit 6 to redistribute the Regions on the servers with higher hit rates and the Regions on the servers with lower hit rates;
acquiring the data volume of each Region, comparing it with a preset data volume threshold, and if the data volume of a Region exceeds the threshold, triggering the HBase unit 6 to split that Region into two Regions.
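The two threshold checks performed by the monitor can be sketched as plain functions. This is a simplified illustration only: it does not call any HBase API, the thresholds and units are arbitrary, and the function names and the even split of a Region are assumptions made for the example.

```python
def needs_rebalance(load_deviation, deviation_threshold):
    """True when the worker load deviation exceeds the preset threshold."""
    return load_deviation > deviation_threshold


def pick_move(server_hit_rates):
    """Return (server with the highest hit rate, server with the lowest)."""
    busiest = max(server_hit_rates, key=server_hit_rates.get)
    idlest = min(server_hit_rates, key=server_hit_rates.get)
    return busiest, idlest


def regions_to_split(region_sizes, size_threshold):
    """Names of Regions whose data volume exceeds the preset threshold."""
    return [name for name, size in region_sizes.items() if size > size_threshold]


def split_region(name, size):
    """Divide one Region into two (assumed here to be an even split)."""
    half = size // 2
    return {f"{name}_a": half, f"{name}_b": size - half}
```

In this sketch the monitor would first call `needs_rebalance`, use `pick_move` to decide between which servers Regions should be redistributed, and then apply `split_region` to each name returned by `regions_to_split`.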
Further, the JDBC/ODBC interface unit 1 includes:
a JDBC application module 11 for receiving a user request, calling a JDBC object method to issue an SQL statement, and extracting the result to return to the user;
a JDBC driver manager module 12 for loading and calling the JDBC driver module 13 for the JDBC application module 11;
a JDBC driver module 13 for executing the call of the JDBC object method, sending the SQL statement corresponding to the user request to the underlying database, and returning the result obtained from the underlying database to the JDBC application module 11.
In a specific implementation, a user request is first received through the JDBC/ODBC interface unit 1. The master unit 2 then accepts the user request received by the JDBC/ODBC interface unit 1, coordinates data communication among a plurality of processors, manages the whole flow, and preferentially sends the user request to the SQL Planer unit 3. The SQL Planer unit 3 parses the user request and compiles and customizes an execution plan according to it. The worker unit 4 then executes the plan in parallel, which comprises generating an MD5 value from the picture data to be written according to the execution plan, writing the MD5 value into the original data table, and simultaneously writing the picture data into the picture data table. Finally, the processing result of the HBase unit 6 is returned to the master unit 2, and the master unit 2 returns the result of writing the picture data to the JDBC/ODBC interface unit 1 so that it can be returned to the user.
The embodiment provides LOB storage, meets the picture storage requirement, and solves the problem of reduced data read performance caused by picture data storage.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.