CN110515893A - Data storage method, apparatus, device, and computer-readable storage medium
- Publication number
- CN110515893A (application CN201910684136.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- hive
- file
- protobuf
- hive table
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/113—Details of archiving
- G06F16/116—Details of conversion of file system types or formats
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
Abstract
Embodiments of the invention disclose a data storage method, apparatus, device, and computer-readable storage medium. The method includes: creating a hive table for reading the Protobuf serialized file stored at the bottom layer; generating a corresponding programming language file from the description file of the Protobuf serialized data and sending it to the file package to be loaded; configuring the fields used to parse data in the hive table according to the parsing mode of a data parser constructed in advance; and, based on the data parser, automatically parsing and reading the Protobuf serialized data through the hive table. The data parser is used to match the table schema of the hive table against the structure in a configuration file, and to generate the Object object set of the hive table structure and the hive result object set. The application solves the problem of parsing Protobuf files stored at the bottom layer of a hive data warehouse with minimal development effort and helps improve the data storage security of the hive data warehouse.
Description
Technical field
Embodiments of the present invention relate to the field of storage technology, and in particular to a data storage method, apparatus, device, and computer-readable storage medium.
Background
With the rapid development of big data and cloud computing, massive amounts of data continue to emerge in today's information society. The data volumes of companies in the Internet and telecommunications industries in particular are growing at an astonishing rate, and such volumes place higher demands on storage. Almost the entire industry now compresses big data to reduce the space that stored files occupy, but reducing file size requires more computation. How to balance computation against storage, so that resource use reaches the balance the industry needs, is a problem those skilled in the art need to address.
The hive data warehouse has many advantages: scalable computing capacity, high data fault tolerance, data security, all the advantages of the integrated HDFS, low cost, and ease of use. Hive has therefore been widely adopted as the primary data warehouse in scenarios that require an offline data warehouse.
However, the expressive power of hive's HSQL is limited, the MapReduce jobs it generates are not intelligent enough, and its tuning granularity is coarse. The storage formats the hive data warehouse currently supports are textfile, sequencefile, rcfile, orcfile, and parquet, but all of these are insecure storage types: an intruder or unauthorized user who obtains part of the data in one of these formats can read everything in that part, so the data is easily leaked, which does not help users store data securely.
Summary of the invention
Embodiments of the present disclosure provide a data storage method, apparatus, device, and computer-readable storage medium that solve the problem of parsing Protobuf files stored at the bottom layer of a hive data warehouse with minimal development effort, and that help improve the data storage security of the hive data warehouse.
To solve the above technical problem, embodiments of the present invention provide the following technical solutions:
In one aspect, an embodiment of the present invention provides a data storage method, comprising:
creating a hive table, the hive table being used to read data stored at the bottom layer in the Protobuf structured data storage format;
generating a corresponding programming language file from the description file of the Protobuf serialized data, and sending it to the file package to be loaded;
configuring the fields used to parse data in the hive table according to the parsing mode of a data parser constructed in advance;
automatically parsing and reading the Protobuf serialized data through the hive table, based on the data parser;
wherein the data parser is used to match the table schema of the hive table against the structure in a configuration file, and to generate the Object object set of the hive table structure and the hive result object set.
Optionally, when the method is applied in a Java language environment, constructing the data parser comprises:
setting up a pattern matching file that reads the configuration file and associates it with the hive table structure, so as to match the table schema of the hive table against the structure in the configuration file;
setting up an object conversion file that implements the conversion logic from Java objects to Object objects based on a subclass proxy;
setting up a nested traversal file that traverses the Protobuf serialized data with nested objects to generate Java objects.
Optionally, automatically parsing and reading the Protobuf serialized data through the hive table based on the data parser comprises:
overriding the initialization function of the hive data warehouse to assemble the pattern matching file and the object conversion file and generate the Object object set of the hive table structure;
overriding the data parsing function of the hive data warehouse to assemble the nested traversal file and generate the hive result object set;
generating an executable file package, setting in the configuration file the location of the Java file generated from the Protobuf structure definition file, loading the executable file package into the hive environment variables, and using the hive table to specify the entry point for parsing and reading the Protobuf serialized data;
executing a hiveSQL query statement to read the Protobuf serialized data.
Optionally, automatically parsing and reading the Protobuf serialized data through the hive table based on the data parser further comprises:
determining whether a message indicating that the data was parsed and read successfully is received within a preset time;
if not, raising a data read error alarm and feeding back the log file information for the process of automatically parsing and reading the Protobuf serialized data.
In another aspect, an embodiment of the present invention provides a data storage apparatus, comprising:
a hive table creation module, configured to create a hive table, the hive table being used to read data stored at the bottom layer in the Protobuf structured data storage format;
a data conversion module, configured to generate a corresponding programming language file from the description file of the Protobuf serialized data and send it to the file package to be loaded;
a parsing mode configuration module, configured to configure the fields used to parse data in the hive table according to the parsing mode of a data parser constructed in advance, the data parser being used to match the table schema of the hive table against the structure in a configuration file and to generate the Object object set of the hive table structure and the hive result object set;
an automated data parsing and reading module, configured to automatically parse and read the Protobuf serialized data through the hive table, based on the data parser.
Optionally, the data parser comprises a pattern matcher, an object converter, and a nested traverser;
the pattern matcher is used to match the table schema of the hive table against the structure in the configuration file;
the object converter is used to implement the conversion logic from Java objects to Object objects based on a subclass proxy;
the nested traverser is used to traverse the Protobuf serialized data with nested objects to generate Java objects.
Optionally, the automated data parsing and reading module comprises:
an automation object generation submodule, configured to override the initialization function of the hive data warehouse to assemble the pattern matching file and the object conversion file and generate the Object object set of the hive table structure;
a hive result object set generation submodule, configured to override the data parsing function of the hive data warehouse to assemble the nested traversal file and generate the hive result object set;
a parsing mode specification submodule, configured to generate an executable file package, set in the configuration file the location of the Java file generated from the Protobuf structure definition file, load the executable file package into the hive environment variables, and use the hive table to specify the entry point for parsing and reading the Protobuf serialized data;
a reading submodule, configured to execute a hiveSQL query statement to read the Protobuf serialized data.
Optionally, the apparatus further comprises an alarm module, configured to raise a data read error alarm and feed back the log file information for the process of automatically parsing and reading the Protobuf serialized data if no message indicating that the data was parsed and read successfully is received within a preset time.
An embodiment of the present invention also provides a data storage device comprising a processor, the processor being configured to implement the steps of the data storage method described in any of the preceding items when executing a computer program stored in a memory.
Finally, an embodiment of the present invention also provides a computer-readable storage medium on which a data storage program is stored; when executed by a processor, the data storage program implements the steps of the data storage method described in any of the preceding items.
The advantage of the technical solution provided by the present application is that, with a data parser constructed in advance, configuring the data parsing fields of the hive table is enough for the hive data warehouse to automatically parse and read the Protobuf serialized files stored at the bottom layer. Because data stored in the Protobuf structured data storage format is both compressed and highly secure, the data storage security of the hive data warehouse is improved. In addition, only certain parsing fields need to be modified and the original program code does not, so the problem of parsing Protobuf files stored at the bottom layer of the hive data warehouse is solved with minimal development effort. When the structure of data passed down from upstream changes, a downstream program only needs to change the configuration file to adapt to the new data structure, which greatly reduces the risk an enterprise incurs from changing application-layer programs in response to data structure changes.
In addition, embodiments of the present invention also provide a corresponding apparatus, device, and computer-readable storage medium for the data storage method, which make the method more practical; the apparatus, device, and computer-readable storage medium have corresponding advantages.
It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the present disclosure.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the related art more clearly, the drawings needed to describe the embodiments or the related art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of a data storage method provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of another data storage method provided by an embodiment of the present invention;
Fig. 3 is a structural diagram of one specific embodiment of the data storage apparatus provided by an embodiment of the present invention;
Fig. 4 is a structural diagram of another specific embodiment of the data storage apparatus provided by an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the drawings and specific embodiments. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", and so on in the description, the claims, and the above drawings of this application are used to distinguish different objects, not to describe a particular order. The terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units and may include steps or units that are not listed.
Having described the technical solutions of the embodiments of the present invention, various non-limiting implementations of the present application are described in detail below.
Referring first to Fig. 1, which is a flow diagram of a data storage method provided by an embodiment of the present invention, the embodiment may include the following:
S101: Create a hive table, the hive table being used to read data stored at the bottom layer in the Protobuf structured data storage format.
In this application, the hive table can be created using any existing technique; for the specific creation process, refer to the related art, which is not repeated here. The hive table created in S101 contains the same functional modules as a hive table in the prior art.
It can be understood that the Protobuf structured data storage format combines security with data compression. It is a language-independent, platform-independent, extensible way of serializing structured data, and a lightweight, efficient structured data storage format that can be used to serialize structured data. Storing data in the Protobuf structured data storage format can improve the security of data stored in the hive data warehouse. Note that the purpose of creating the hive table in this step is to specify the parsing mode and the parsing entry point for data in the Protobuf structured data storage format.
S102: Generate a corresponding programming language file from the description file of the Protobuf serialized data, and send it to the file package to be loaded.
In this embodiment, the data stored at the bottom layer of the hive data warehouse is in Protobuf format, and data in the Protobuf structured data storage format exists in serialized form when stored or transmitted. In other words, what this embodiment reads is the Protobuf serialized file stored at the bottom layer of the hive data warehouse. When parsing data during reading, a corresponding programming language file has to be generated from the description file of the Protobuf serialized data; if the method runs in a Java language environment, a corresponding Java class can be generated from the description file.
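The generation step in S102 is typically performed with the Protocol Buffers compiler. The following is a minimal sketch, assuming `protoc` is installed and on the PATH; the class name `ProtoCodegen` and all file and directory names are hypothetical and do not come from the patent.

```java
import java.util.List;

// Hypothetical helper: builds the protoc command line that turns a .proto
// description file into Java source (S102). Paths are illustrative only.
final class ProtoCodegen {
    static List<String> javaCodegenCommand(String protoDir, String outDir, String protoFile) {
        return List.of(
            "protoc",
            "--proto_path=" + protoDir, // directory containing the .proto description file
            "--java_out=" + outDir,     // directory that receives the generated Java sources
            protoFile);
    }

    public static void main(String[] args) {
        List<String> cmd = javaCodegenCommand("proto", "generated", "proto/user_event.proto");
        System.out.println(String.join(" ", cmd));
        // To actually run it (requires protoc to be installed):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

The generated sources would then be compiled and packaged together with the parser so that they are available in the file package to be loaded.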
S103: Configure the fields used to parse data in the hive table according to the parsing mode of the data parser constructed in advance.
In this application, the data parser can be used to match the table schema of the hive table against the structure in the configuration file, and to generate the Object object set of the hive table structure and the hive result object set. The configuration file is the system's configuration file; the Object object set of the hive table structure is used for the subsequent automated operations; and the hive result object set is the Protobuf serialized data converted into a data structure that the executing entity can operate on. With this data parser, configuring certain fields in the hive table as the parsing class is enough to parse Protobuf data, so that the data stored at the bottom layer can be queried directly with hiveSQL. Fields can be added conveniently without modifying program code. When the structure of data passed down from upstream changes, a downstream program only needs to change the configuration file to adapt to the new data structure, which greatly reduces the risk of changing application-layer programs in response to data structure changes.
S104: Based on the data parser, automatically parse and read the Protobuf serialized data through the hive table.
The application can use the initialization method of SerDe, hive's built-in serialization and parsing function, to resolve the data parser, and can read and parse the Protobuf serialized file by executing a hiveSQL query statement against the hive table created in S101.
In the technical solution provided by this embodiment, with a data parser constructed in advance, configuring the data parsing fields of the hive table is enough for the hive data warehouse to automatically parse and read the Protobuf serialized files stored at the bottom layer. Because data stored in the Protobuf structured data storage format is both highly secure and compressed, the data storage security of the hive data warehouse is improved. In addition, the original program code does not need to be modified, so the problem of parsing Protobuf files stored at the bottom layer of the hive data warehouse is solved with minimal development effort. When the structure of data passed down from upstream changes, a downstream program only needs to change the configuration file to adapt to the new data structure, which greatly reduces the risk an enterprise incurs from changing application-layer programs in response to data structure changes.
As a preferred embodiment, when applied in a Java language environment, the data parser may include a pattern matcher that matches the configuration file against the hive table schema, an object converter implemented with Java CGlib that contains the conversion logic from Java objects to Object objects, and a nested traverser that uses nested objects to traverse the Java objects generated by Protobuf. Correspondingly, constructing the data parser may include the following.
It can be understood that a class can first be defined that inherits the abstract class AbstractSerDe and implements the initialize, deserialize, and getObjectInspector methods. AbstractSerDe is an abstract class built into hive that mainly declares the abstract methods a parser has to implement; the parsers that ship with hive also inherit from it. initialize is an abstract method defined in AbstractSerDe and is mainly responsible for the parser's initialization and preparation, such as loading environment variables and table information. deserialize is an abstract method defined in AbstractSerDe and is mainly responsible for implementing the parsing of serialized data files. getObjectInspector is an abstract method defined in AbstractSerDe and is mainly responsible for returning the data of the result object after serialization. The parsers that ship with hive implement these methods as well.
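The division of responsibilities among the three methods can be sketched as follows. This is an illustration only: `SketchSerDe` is a stand-in defined here so that the example compiles without Hive on the classpath, and its signatures are simplified. Hive's real AbstractSerDe uses Hive and Hadoop types and declares further methods. The class name `ProtobufSketchSerDe` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Stand-in for Hive's AbstractSerDe (assumption: simplified signatures so the
// sketch is self-contained; this is not the real Hive API).
abstract class SketchSerDe {
    abstract void initialize(Properties tableProperties);
    abstract Object deserialize(byte[] record);
    abstract List<String> getObjectInspector(); // simplified: only the column names
}

// Hypothetical parser class showing which responsibility lands in which method.
final class ProtobufSketchSerDe extends SketchSerDe {
    private List<String> columns = new ArrayList<>();

    @Override
    void initialize(Properties tableProperties) {
        // Preparation work: load the table information once, before any record is parsed.
        // "columns" is assumed to hold the comma-separated column list of the table.
        String names = tableProperties.getProperty("columns", "");
        columns = names.isEmpty() ? new ArrayList<>() : Arrays.asList(names.split(","));
    }

    @Override
    Object deserialize(byte[] record) {
        // A real implementation would decode the Protobuf bytes here; this
        // placeholder only returns an empty row with one slot per column.
        return new Object[columns.size()];
    }

    @Override
    List<String> getObjectInspector() {
        // Describes the shape of the rows that deserialize returns.
        return columns;
    }
}
```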
Set up a pattern matching file that reads the configuration file and associates it with the hive table structure, so as to match the table schema of the hive table against the structure in the configuration file. In practice, this can be done by defining a class that reads the configuration file and associates it with the hive table structure; that class serves as the pattern matching file.
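One way such a matching class might look is sketched below. The patent does not specify the configuration format, so the `column.<name>` key convention, the class name `SchemaMatcher`, and the sample column names are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical pattern matcher: checks that every Hive column has a configured
// Protobuf field path. The "column.<name>" key convention is an assumption.
final class SchemaMatcher {
    static List<String> match(List<String> hiveColumns, Properties config) {
        List<String> fieldPaths = new ArrayList<>();
        for (String column : hiveColumns) {
            String path = config.getProperty("column." + column);
            if (path == null) {
                // The table schema and the configuration structure do not match.
                throw new IllegalArgumentException("No Protobuf field configured for column: " + column);
            }
            fieldPaths.add(path);
        }
        return fieldPaths;
    }

    public static void main(String[] args) throws IOException {
        Properties config = new Properties();
        config.load(new StringReader("column.user_id=user.id\ncolumn.event_name=event.name\n"));
        System.out.println(match(List.of("user_id", "event_name"), config));
    }
}
```

Under this convention, adding a field to the table only requires a new configuration entry, which is consistent with the stated goal of adapting to structure changes without modifying program code.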
Set up an object conversion file that implements the conversion logic from Java objects to Object objects based on a subclass proxy. In practice, this can be done by defining a class that uses Java CGlib to implement the conversion logic from Java objects to Object objects; that class serves as the object conversion file. Java CGlib is one implementation of Java dynamic proxying, also called a subclass proxy: it extends the functionality of a target object by constructing a subclass object in memory.
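The conversion from a typed Java object to a generic row can be illustrated as follows. Note that this sketch deliberately uses plain `java.lang.reflect` instead of CGlib so that it runs with the JDK alone; it shows the conversion idea, not the subclass-proxy mechanism the text describes. `RowConverter` and `SampleEvent` are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.List;

// Hypothetical converter: turns a Java object into an Object[] row by calling
// its getters through java.lang.reflect (used here instead of CGlib).
final class RowConverter {
    static Object[] toRow(Object source, List<String> fieldNames) {
        Object[] row = new Object[fieldNames.size()];
        for (int i = 0; i < fieldNames.size(); i++) {
            String name = fieldNames.get(i);
            // Assumes JavaBean-style accessors, e.g. field "id" -> getId().
            String getter = "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            try {
                Method method = source.getClass().getMethod(getter);
                row[i] = method.invoke(source);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read field: " + name, e);
            }
        }
        return row;
    }
}

// Stand-in for a generated message class, for illustration only.
final class SampleEvent {
    public long getId() { return 42L; }
    public String getName() { return "login"; }
}
```

A call such as `RowConverter.toRow(new SampleEvent(), List.of("id", "name"))` yields a two-element row holding the values returned by the two getters.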
Set up a nested traversal file that traverses the Protobuf serialized data with nested objects to generate Java objects. In practice, this can be done by defining a class that serves as a nested traversal tool, using nested objects to traverse the Java objects generated by Protobuf; that class serves as the nested traversal file.
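The traversal idea can be sketched with a recursive walk. In this illustration nested `Map` and `List` values stand in for a decoded Protobuf message with sub-messages and repeated fields; the class name `NestedTraverser` and the dotted-path output format are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical nested traverser. Nested Maps and Lists stand in for a decoded
// Protobuf message with sub-messages and repeated fields.
final class NestedTraverser {
    // Flattens nested values into dotted-path -> leaf-value pairs.
    static Map<String, Object> flatten(Map<String, Object> message) {
        Map<String, Object> out = new LinkedHashMap<>();
        walk("", message, out);
        return out;
    }

    private static void walk(String prefix, Object value, Map<String, Object> out) {
        if (value instanceof Map) {
            // Sub-message: descend into each field and extend the path.
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                String key = prefix.isEmpty() ? String.valueOf(e.getKey()) : prefix + "." + e.getKey();
                walk(key, e.getValue(), out);
            }
        } else if (value instanceof List) {
            // Repeated field: descend into each element with an index suffix.
            List<?> items = (List<?>) value;
            for (int i = 0; i < items.size(); i++) {
                walk(prefix + "[" + i + "]", items.get(i), out);
            }
        } else {
            out.put(prefix, value); // leaf: scalar field
        }
    }
}
```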
Based on the data parser constructed above, automatically parsing and reading the Protobuf serialized data through the hive table in S104 may specifically include:
Based on the parsing mode, overriding the initialization function of the hive data warehouse to assemble the pattern matching file and the object conversion file and generate the Object object set of the hive table structure. The initialization function here can be the initialize method of hive's built-in AbstractSerDe.
Based on the parsing mode, overriding the data parsing function of the hive data warehouse to assemble the nested traversal file and generate the hive result object set. The data parsing function here can be the deserialize method of hive's built-in AbstractSerDe.
Generating an executable file package, setting in the configuration file the location of the Java file generated from the Protobuf structure definition file, loading the executable file package into the hive environment variables, and using the hive table to specify the entry point for parsing and reading the Protobuf serialized data.
Executing a hiveSQL query statement to read the Protobuf serialized data.
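The last two steps might translate into HiveQL statements along the following lines. This sketch only assembles the statement text; the jar path, SerDe class name, `proto.class` property key, table, columns, and location are all hypothetical placeholders, and the input format needed to read binary Protobuf files is deployment-specific and omitted.

```java
import java.util.List;

// Hypothetical names throughout: the jar path, SerDe class, property key,
// table, columns, and location are placeholders, not values from the patent.
final class HiveStatements {
    static List<String> statements(String jarPath, String serdeClass, String protoClass) {
        // Load the executable file package into the Hive session.
        String addJar = "ADD JAR " + jarPath;
        // Use the table definition to name the parser class (the parsing entry point).
        String createTable =
            "CREATE EXTERNAL TABLE IF NOT EXISTS user_event (user_id BIGINT, event_name STRING)\n"
          + "ROW FORMAT SERDE '" + serdeClass + "'\n"
          + "WITH SERDEPROPERTIES ('proto.class' = '" + protoClass + "')\n"
          + "LOCATION '/warehouse/user_event'";
        // Read the Protobuf-backed data with an ordinary query.
        String query = "SELECT user_id, event_name FROM user_event LIMIT 10";
        return List.of(addJar, createTable, query);
    }

    public static void main(String[] args) {
        statements("/opt/hive/aux/protobuf-serde.jar",
                   "com.example.hive.ProtobufSerDe",
                   "com.example.proto.UserEventProtos$UserEvent")
            .forEach(s -> System.out.println(s + ";\n"));
    }
}
```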
The data stored at the bottom layer of the hive data warehouse may fail to be parsed and read because of network problems, a parser fault, or other causes. To locate the cause of a fault as early as possible and repair it in time, one optional embodiment, shown in Fig. 2 and building on the above embodiment, may further include:
S105: Determine whether a message indicating that the data was parsed and read successfully is received within a preset time; if not, execute S106.
S106: Raise a data read error alarm, and feed back the log file information for the process of automatically parsing and reading the Protobuf serialized data.
In this embodiment, a self-feedback step is added. If no feedback message indicating that the data was parsed and read successfully is received within a preset period, for example 10 s, counted from the moment the system starts parsing and reading the stored data, a fault has occurred during parsing and reading. The system promptly captures the log file information for that period and feeds it back, which prevents later log files from overwriting the fault-related information. This helps staff find the bug in the log files accurately and promptly, repair the fault efficiently, and improve the performance of the whole system.
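The timeout check in S105 and S106 can be sketched with the standard `java.util.concurrent` API. The class name `ReadWatchdog` is hypothetical, and the alarm here is only a message on standard error; how the alarm is raised and how logs are collected is not specified in the source.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical watchdog for S105/S106: waits for the read to report success and
// returns false (the caller should alarm and collect logs) if it does not.
final class ReadWatchdog {
    static boolean succeededWithin(Callable<Boolean> readTask, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> result = executor.submit(readTask);
            return Boolean.TRUE.equals(result.get(timeout, unit));
        } catch (TimeoutException e) {
            return false; // no success message within the preset period
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false; // the read itself failed
        } finally {
            executor.shutdownNow(); // interrupt a read that is still running
        }
    }

    public static void main(String[] args) {
        // Simulated read that finishes well inside the 10 s window from the text.
        boolean ok = succeededWithin(() -> { Thread.sleep(200); return true; }, 10, TimeUnit.SECONDS);
        if (!ok) {
            System.err.println("ALARM: data read error; attach the log file for this period");
        }
    }
}
```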
Embodiments of the present invention also provide a corresponding apparatus for the data storage method, which makes the method more practical. The data storage apparatus provided by an embodiment of the present invention is introduced below; the data storage apparatus described below and the data storage method described above may be referred to in correspondence with each other.
Referring to Fig. 3, which is a structural diagram of one specific embodiment of the data storage apparatus provided by an embodiment of the present invention, the apparatus may include:
a hive table creation module 301, configured to create a hive table, the hive table being used to read data stored at the bottom layer in the Protobuf structured data storage format;
a data conversion module 302, configured to generate a corresponding programming language file from the description file of the Protobuf serialized data and send it to the file package to be loaded;
a parsing mode configuration module 303, configured to configure the fields used to parse data in the hive table according to the parsing mode of a data parser constructed in advance, the data parser being used to match the table schema of the hive table against the structure in a configuration file and to generate the Object object set of the hive table structure and the hive result object set;
an automated data parsing and reading module 304, configured to automatically parse and read the Protobuf serialized data through the hive table, based on the data parser.
As a preferred implementation of this embodiment, the data parser may include a pattern matcher, an object converter, and a nested traverser;
the pattern matcher is used to match the table schema of the hive table against the structure in the configuration file;
the object converter is used to implement the conversion logic from Java objects to Object objects based on a subclass proxy;
the nested traverser is used to traverse the Protobuf serialized data with nested objects to generate Java objects.
Optionally, in some implementations of this embodiment, referring to Fig. 4, the apparatus may for example further include an alarm module 305, configured to raise a data read error alarm and feed back the log file information for the process of automatically parsing and reading the Protobuf serialized data if no message indicating that the data was parsed and read successfully is received within a preset time.
In other implementations, the automated data parsing and reading module may specifically include:
an automation object generation submodule, configured to override the initialization function of the hive data warehouse to assemble the pattern matching file and the object conversion file and generate the Object object set of the hive table structure;
a hive result object set generation submodule, configured to override the data parsing function of the hive data warehouse to assemble the nested traversal file and generate the hive result object set;
a parsing mode specification submodule, configured to generate an executable file package, set in the configuration file the location of the Java file generated from the Protobuf structure definition file, load the executable file package into the hive environment variables, and use the hive table to specify the entry point for parsing and reading the Protobuf serialized data;
a reading submodule, configured to execute a hiveSQL query statement to read the Protobuf serialized data.
The functions of the functional modules of the data storage apparatus described in this embodiment can be implemented according to the method in the above method embodiment; for the specific implementation process, refer to the description of the method embodiment, which is not repeated here.
It can be seen from the above that this embodiment solves the problem of parsing Protobuf files stored at the bottom layer of the hive data warehouse with minimal development effort and helps improve the data storage security of the hive data warehouse.
An embodiment of the present invention also provides a data storage device, which may specifically include:
a memory, configured to store a computer program;
a processor, configured to execute the computer program to implement the steps of the data storage method described in any of the above embodiments.
The functions of the functional modules of the data storage device described in this embodiment can be implemented according to the method in the above method embodiment; for the specific implementation process, refer to the description of the method embodiment, which is not repeated here.
It can be seen from the above that this embodiment solves the problem of parsing Protobuf files stored at the bottom layer of the hive data warehouse with minimal development effort and helps improve the data storage security of the hive data warehouse.
An embodiment of the present invention also provides a computer-readable storage medium storing a data storage program; when executed by a processor, the data storage program implements the steps of the data storage method described in any of the above embodiments.
The functions of the functional modules of the computer-readable storage medium described in this embodiment can be implemented according to the method in the above method embodiment; for the specific implementation process, refer to the description of the method embodiment, which is not repeated here.
It can be seen from the above that this embodiment solves the problem of parsing Protobuf files stored at the bottom layer of the hive data warehouse with minimal development effort and helps improve the data storage security of the hive data warehouse.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms according to function. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The data storage method, device, equipment and computer readable storage medium provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. It should be noted that those of ordinary skill in the art may make several improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the protection scope of the claims of the present invention.
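As an illustration of the data parser referred to in the method embodiments (matching the hive table schema against a configuration, and nested traversal of the deserialized objects), the following is a minimal Java sketch. It does not use the hive or Protobuf libraries: nested `Map` objects stand in for the Java objects generated from a Protobuf description file. The class name `NestedRowParserSketch`, its method names, and the underscore-joined column naming are assumptions introduced for illustration only and are not part of the disclosed implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical stand-in for the data parser. Plain Maps play the role of
 * Protobuf-generated Java objects, so no hive or Protobuf library is needed.
 */
final class NestedRowParserSketch {

    /** Nested traversal: flattens nested objects into underscore-joined field names. */
    static Map<String, Object> flatten(Map<String, Object> message) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flattenInto("", message, flat);
        return flat;
    }

    private static void flattenInto(String prefix, Map<String, Object> node,
                                    Map<String, Object> out) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            String name = prefix.isEmpty() ? entry.getKey() : prefix + "_" + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                // A nested message: recurse with the extended prefix.
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                flattenInto(name, child, out);
            } else {
                out.put(name, value);
            }
        }
    }

    /** Schema matching: every table column must be declared in the configuration. */
    static void checkSchema(List<String> tableColumns, List<String> configuredFields) {
        for (String column : tableColumns) {
            if (!configuredFields.contains(column)) {
                throw new IllegalArgumentException("column not in configuration: " + column);
            }
        }
    }

    /** Builds one result row in table-column order; absent fields become null. */
    static List<Object> toRow(List<String> tableColumns, Map<String, Object> message) {
        Map<String, Object> flat = flatten(message);
        List<Object> row = new ArrayList<>();
        for (String column : tableColumns) {
            row.add(flat.get(column));
        }
        return row;
    }
}
```

In this sketch, `checkSchema` would run once when the table is initialized, while `toRow` would run once per record. This roughly mirrors the split between the initialization function and the data parsing function described in the method.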
Claims (10)
1. A data storage method, characterized by comprising:
creating a hive table, the hive table being used to read data stored in the underlying storage in the Protobuf structured data storage format;
generating a corresponding programming language file from the description file of the Protobuf serialized data, and sending the programming language file to a file package to be loaded;
configuring, in the hive table, the fields used for parsing data according to the parsing mode corresponding to a pre-built data parser;
automatically parsing and reading the Protobuf serialized data using the hive table based on the data parser;
wherein the data parser is used to match the table schema of the hive table against the configuration structure of a configuration file, and to generate an Object object set and a hive result object set of the hive table structure.
2. The data storage method according to claim 1, characterized in that the method is applied in a Java language environment, and the process of building the data parser comprises:
providing a pattern matching file for reading the configuration file and associating it with the hive table structure, so as to match the table schema of the hive table against the configuration structure of the configuration file;
providing an object conversion file that implements, based on a subclass proxy, the conversion logic from Java objects to Object objects;
providing a nested traversal file for traversing the Protobuf serialized data using nested objects to generate Java objects.
3. The data storage method according to claim 2, characterized in that automatically parsing and reading the Protobuf serialized data using the hive table based on the data parser comprises:
overriding the initialization function of the hive data warehouse so as to assemble the pattern matching file and the object conversion file and generate the Object object set of the hive table structure;
overriding the data parsing function of the hive data warehouse so as to assemble the nested traversal file and generate the hive result object set;
generating an executable file package, setting in the configuration file the location of the java file generated from the Protobuf structure definition file, loading the executable file package into the hive environment variables, and using the hive table to specify the entry point for parsing and reading the Protobuf serialized data;
executing a hiveSQL query statement to read the Protobuf serialized data.
4. The data storage method according to any one of claims 1 to 3, characterized in that automatically parsing and reading the Protobuf serialized data using the hive table based on the data parser comprises:
judging whether information indicating that data parsing and reading succeeded is received within a preset time;
if not, raising a data read error alarm, and feeding back the log file information of the process of automatically parsing and reading the Protobuf serialized data.
5. A data storage device, characterized by comprising:
a hive table creation module, configured to create a hive table, the hive table being used to read data stored in the underlying storage in the Protobuf structured data storage format;
a data conversion module, configured to generate a corresponding programming language file from the description file of the Protobuf serialized data, and to send the programming language file to a file package to be loaded;
a parsing mode configuration module, configured to configure, in the hive table, the fields used for parsing data according to the parsing mode corresponding to a pre-built data parser; the data parser being used to match the table schema of the hive table against the configuration structure of a configuration file, and to generate an Object object set and a hive result object set of the hive table structure;
an automatic data parsing and reading module, configured to automatically parse and read the Protobuf serialized data using the hive table based on the data parser.
6. The data storage device according to claim 5, characterized in that the data parser comprises a pattern matcher, an object converter and a nested traverser;
the pattern matcher is used to match the table schema of the hive table against the configuration structure of the configuration file;
the object converter is used to implement, based on a subclass proxy, the conversion logic from Java objects to Object objects;
the nested traverser is used to traverse the Protobuf serialized data using nested objects to generate Java objects.
7. The data storage device according to claim 6, characterized in that the automatic data parsing and reading module comprises:
an automated object generation submodule, configured to override the initialization function of the hive data warehouse so as to assemble the pattern matching file and the object conversion file and generate the Object object set of the hive table structure;
a hive result object set generation submodule, configured to override the data parsing function of the hive data warehouse so as to assemble the nested traversal file and generate the hive result object set;
a parsing mode specification submodule, configured to generate an executable file package, set in the configuration file the location of the java file generated from the Protobuf structure definition file, load the executable file package into the hive environment variables, and use the hive table to specify the entry point for parsing and reading the Protobuf serialized data;
a reading submodule, configured to execute a hiveSQL query statement to read the Protobuf serialized data.
8. The data storage device according to any one of claims 5 to 7, characterized by further comprising an alarm module, configured to raise a data read error alarm if information indicating that data parsing and reading succeeded is not received within a preset time, and to feed back the log file information of the process of automatically parsing and reading the Protobuf serialized data.
9. A data storage device, characterized by comprising a processor, the processor being configured to implement the steps of the data storage method according to any one of claims 1 to 4 when executing a computer program stored in a memory.
10. A computer readable storage medium, characterized in that a data storage program is stored on the computer readable storage medium, and the data storage program, when executed by a processor, implements the steps of the data storage method according to any one of claims 1 to 4.
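The timeout check recited in claims 4 and 8 (raise an alarm and feed back log information when no success information arrives within a preset time) can be sketched in plain Java as follows. This is only a hedged illustration using the standard `java.util.concurrent` API. The class `ParseTimeoutAlarmSketch`, the `Result` type, and the alarm message format are hypothetical names chosen for this example, and the `Callable` stands in for the actual hive parse-and-read job, which is not shown.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch of the "preset time" check around a parse-and-read job. */
final class ParseTimeoutAlarmSketch {

    /** Outcome fed back to the caller: a success flag plus success info or alarm text. */
    static final class Result {
        final boolean success;
        final String message;

        Result(boolean success, String message) {
            this.success = success;
            this.message = message;
        }
    }

    /**
     * Runs the job and waits at most timeoutMillis for its success information.
     * Any timeout or failure is converted into an alarm message (a stand-in for
     * the log file information mentioned in the claims).
     */
    static Result runWithAlarm(Callable<String> parseAndRead, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(parseAndRead);
        try {
            String successInfo = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return new Result(true, successInfo);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the job that is still running
            return new Result(false,
                    "ALARM: no success information within " + timeoutMillis + " ms");
        } catch (ExecutionException e) {
            return new Result(false, "ALARM: parse failed: " + e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return new Result(false, "ALARM: interrupted while waiting");
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would wrap the real query in the `Callable` and route a non-successful `Result` to whatever alerting channel is in use; both of those pieces are outside the scope of this sketch.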
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910684136.1A CN110515893B (en) | 2019-07-26 | 2019-07-26 | Data storage method, device, equipment and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910684136.1A CN110515893B (en) | 2019-07-26 | 2019-07-26 | Data storage method, device, equipment and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110515893A (en) | 2019-11-29 |
CN110515893B (en) | 2022-12-09 |
Family
ID=68624195
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910684136.1A Active CN110515893B (en) | 2019-07-26 | 2019-07-26 | Data storage method, device, equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110515893B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111935065A (en) * | 2020-05-30 | 2020-11-13 | 中国兵器科学研究院 | Data communication method based on multi-window system and related device |
CN112637288A (en) * | 2020-12-11 | 2021-04-09 | 上海哔哩哔哩科技有限公司 | Streaming data distribution method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104199879A (en) * | 2014-08-21 | 2014-12-10 | 广州华多网络科技有限公司 | Data processing method and device |
CN105760534A (en) * | 2016-03-10 | 2016-07-13 | 上海晶赞科技发展有限公司 | User-defined serializable data structure, hadoop cluster, server and application method thereof |
CN106570018A (en) * | 2015-10-10 | 2017-04-19 | 阿里巴巴集团控股有限公司 | Serialization method and apparatus, deserialization method and apparatus, serialization and deserialization system, and electronic device |
CN107992624A (en) * | 2017-12-22 | 2018-05-04 | 百度在线网络技术(北京)有限公司 | Parse method, apparatus, storage medium and the terminal device of serialized data |
US20190026335A1 (en) * | 2017-07-23 | 2019-01-24 | AtScale, Inc. | Query engine selection |
Non-Patent Citations (1)
Title |
---|
震秦: "hive-protobuf-serde", 《HTTPS://GITEE.COM/ZHZHENQIN/HIVE-PROTOBUF-SERDE》 * |
Also Published As
Publication number | Publication date |
---|---|
CN110515893B (en) | 2022-12-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9491117B2 (en) | Extensible framework to support different deployment architectures | |
US11868226B2 (en) | Load test framework | |
Seinturier et al. | A component‐based middleware platform for reconfigurable service‐oriented architectures | |
US8799299B2 (en) | Schema contracts for data integration | |
JP5197688B2 (en) | Integrated environment generator | |
US8832658B2 (en) | Verification framework for business objects | |
JP2009532758A (en) | A framework for modeling continuations in a workflow | |
US8285676B2 (en) | Containment agnostic, N-ary roots leveraged model synchronization | |
Beaton et al. | Usability challenges for enterprise service-oriented architecture APIs | |
CN108363566A (en) | File configuration method, intelligent terminal and storage medium in a kind of project development process | |
CN101937336A (en) | Software asset bundling and consumption method and system | |
CN113946321B (en) | Processing method of computing logic, electronic device and readable storage medium | |
CN110515893A (en) | Date storage method, device, equipment and computer readable storage medium | |
KR102472345B1 (en) | Method for managing hierarchical documents and apparatus using the same | |
CN106293687A (en) | The control method of a kind of flow process of packing, and device | |
Bhattacharjee et al. | Cloudcamp: A model-driven generative approach for automating cloud application deployment and management | |
US9141383B2 (en) | Subprocess definition and visualization in BPEL | |
Margaria et al. | Leveraging Applications of Formal Methods, Verification and Validation. Specialized Techniques and Applications: 6th International Symposium, ISoLA 2014, Imperial, Corfu, Greece, October 8-11, 2014, Proceedings, Part II | |
US7861233B2 (en) | Transparent context switching for software code | |
Mencl et al. | Managing Evolution of Component Specifications using a Federation of Repositories | |
Gargantini et al. | Metamodelling a formal method: applying mde to abstract state machines | |
CN114595246B (en) | Statement generation method, device, equipment and storage medium | |
Andrews | Design and Development of a Run-time Object Design and Instantiation Framework for BPM Systems | |
CN117369773A (en) | Method, apparatus, device and medium for managing code resources | |
CN117421061A (en) | Code sharing method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||