CN108073642A - Method, apparatus and system for data writing and reading, and data interaction system - Google Patents

Method, apparatus and system for data writing and reading, and data interaction system

Info

Publication number
CN108073642A
Authority
CN
China
Prior art keywords
data
file
data block
real time
specified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611023009.XA
Other languages
Chinese (zh)
Inventor
路璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201611023009.XA
Publication of CN108073642A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/183 Provision of network file services by network file servers, e.g. by using NFS, CIFS
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices

Abstract

Embodiments of this solution provide a method, apparatus and system for data writing and reading, and a data interaction system. On the one hand, in the embodiments of this solution, a real-time computing system generates a data block from at least one piece of data, compresses the data block, and sends the compressed data block to a file system. On the other hand, the file system responds to a read data request sent by an offline computing system: if the file header of the file containing the specified data requested by the read data request and the data block containing the specified data have both completed their write operations, the file system sends the data block containing the specified data to the offline computing system. The technical solution provided by the embodiments of this solution addresses the problem in the prior art that, in data interaction scenarios between real-time computing and offline computing, the data interaction mode cannot satisfy the requirements of real-time performance and compression ratio while also satisfying read/write performance.

Description

Method, apparatus and system for data writing and reading, and data interaction system
【Technical Field】
The present invention relates to the technical field of big data processing, and in particular to a method, apparatus and system for data writing and reading, and a data interaction system.
【Background】
At present, big data processing involves many data interaction scenarios, each with different data interaction requirements. For example, in the data interaction scenario of real-time computing and offline computing, the real-time computing system needs to write data into a file system, and the file system can then provide the written data to the offline computing system; the real-time computing system and the offline computing system thus interact through the file system. When they do so, three performance requirements must be met simultaneously: real-time performance, read/write performance, and compression ratio.
In the prior art, the real-time computing system writes a file into the file system only after a complete file has been generated, and the offline computing system can read the file from the file system only after the real-time computing system has written the whole file. This implementation cannot meet the requirement of real-time performance. Therefore, in the prior art, in the data interaction scenario of real-time computing and offline computing, the data interaction mode cannot meet the requirement of real-time performance.
【Summary of the Invention】
In view of this, the embodiments of this solution provide a method, apparatus and system for data writing and reading, and a data interaction system, to solve the problem in the prior art that, in the data interaction scenario of real-time computing and offline computing, the data interaction mode cannot satisfy the requirements of real-time performance and compression ratio while also satisfying read/write performance.
In one aspect, the embodiments of this solution provide a data writing method, including:
A real-time computing system generates a data block from at least one piece of data, compresses the data block, and sends the compressed data block to a file system;
The file system receives the data block sent by the real-time computing system.
In the aspect above and any possible implementation, an implementation is further provided in which the real-time computing system generating a data block from at least one piece of data includes:
The real-time computing system generates at least one piece of data;
The real-time computing system judges whether a data block generation condition is met;
If the data block generation condition is judged to be met, the real-time computing system generates a data block from the at least one piece of data.
In the aspect above and any possible implementation, an implementation is further provided in which the data block generation condition includes:
The number of pieces of data generated by the real-time computing system reaches a preset quantity threshold; or,
The length of the at least one piece of data generated by the real-time computing system reaches a preset length threshold; or,
The time elapsed since the previous data block was generated reaches a preset time threshold.
In the aspect above and any possible implementation, an implementation is further provided in which the format of the data block includes:
A data block length field; and
An in-block record count field; and
Data-carrying fields, each of which carries one piece of data generated by the real-time computing system.
In the aspect above and any possible implementation, an implementation is further provided in which the method further includes:
The file system stores the data block sent by the real-time computing system.
In one aspect, the embodiments of this solution provide a data writing method, executed in a real-time computing system, including:
Generating a data block from at least one piece of data;
Compressing the data block;
Sending the compressed data block to a file system.
In one aspect, the embodiments of this solution provide a data reading method, including:
An offline computing system sends a read data request to a file system, the read data request being used to request specified data;
The file system responds to the read data request: if the file header of the file containing the specified data and the data block containing the specified data have both completed their write operations, the file system sends the data block containing the specified data to the offline computing system.
In the aspect above and any possible implementation, an implementation is further provided in which the format of the files stored in the file system includes:
A file header including at least a file type field, a file header length field and an attribute field;
Data blocks, each including at least a data block length field, an in-block record count field and data-carrying fields.
In the aspect above and any possible implementation, an implementation is further provided in which the read data request carries the identifier of the data block containing the specified data and the storage path of the file containing that data block; after the file system responds to the read data request, the method further includes:
The file system obtains the file according to the storage path;
The file system judges whether a file header exists in the file;
In response to judging that a file header exists in the file, the file system judges, according to the identifier of the data block containing the specified data, whether that data block exists in the file;
If the data block containing the specified data exists in the file, the file system judges that the data block containing the specified data has completed its write operation.
In one aspect, the embodiments of this solution provide a data reading method, executed in a file system, including:
Receiving a read data request sent by an offline computing system, the read data request being used to request specified data;
In response to the read data request, if the file header of the file containing the specified data and the data block containing the specified data have both completed their write operations, sending the data block containing the specified data to the offline computing system.
In one aspect, the embodiments of this solution provide a data writing system, including:
A real-time computing system, configured to generate a data block from at least one piece of data, compress the data block, and send the compressed data block to a file system;
The file system, configured to receive the data block sent by the real-time computing system.
In the aspect above and any possible implementation, an implementation is further provided in which the real-time computing system is specifically configured to:
Generate at least one piece of data;
Judge whether a data block generation condition is met;
If the data block generation condition is judged to be met, generate a data block from the at least one piece of data.
In the aspect above and any possible implementation, an implementation is further provided in which the data block generation condition includes:
The number of pieces of data generated by the real-time computing system reaches a preset quantity threshold; or,
The length of the at least one piece of data generated by the real-time computing system reaches a preset length threshold; or,
The time elapsed since the previous data block was generated reaches a preset time threshold.
In the aspect above and any possible implementation, an implementation is further provided in which the format of the data block includes:
A data block length field; and
An in-block record count field; and
Data-carrying fields, each of which carries one piece of data generated by the real-time computing system.
In the aspect above and any possible implementation, an implementation is further provided in which the file system is further configured to:
Store the data block sent by the real-time computing system.
In one aspect, the embodiments of this solution provide a data writing apparatus, arranged in a real-time computing system, including:
A generation unit, configured to generate a data block from at least one piece of data;
A compression unit, configured to compress the data block;
A sending unit, configured to send the compressed data block to a file system.
In one aspect, the embodiments of this solution provide a data reading system, including:
An offline computing system, configured to send a read data request to a file system, the read data request being used to request specified data;
The file system, configured to respond to the read data request and, if the file header of the file containing the specified data and the data block containing the specified data have both completed their write operations, send the data block containing the specified data to the offline computing system.
In the aspect above and any possible implementation, an implementation is further provided in which the format of the files stored in the file system includes:
A file header including at least a file type field, a file header length field and an attribute field;
Data blocks, each including at least a data block length field, an in-block record count field and data-carrying fields.
In the aspect above and any possible implementation, an implementation is further provided in which the read data request carries the identifier of the data block containing the specified data and the storage path of the file containing that data block; after responding to the read data request, the file system is further configured to:
Obtain the file according to the storage path;
Judge whether a file header exists in the file;
In response to judging that a file header exists in the file, judge, according to the identifier of the data block containing the specified data, whether that data block exists in the file;
If the data block containing the specified data exists in the file, judge that the data block containing the specified data has completed its write operation.
In one aspect, the embodiments of this solution provide a data reading apparatus, arranged in a file system, including:
A receiving unit, configured to receive a read data request sent by an offline computing system, the read data request being used to request specified data;
A processing unit, configured to respond to the read data request and, if the file header of the file containing the specified data and the data block containing the specified data have both completed their write operations, send the data block containing the specified data to the offline computing system through a sending unit.
In one aspect, the embodiments of this solution provide a data interaction system, including:
A real-time computing system, configured to generate a data block from at least one piece of data, compress the data block, and send the compressed data block to a file system;
The file system, configured to receive the data block sent by the real-time computing system;
An offline computing system, configured to send a read data request to the file system, the read data request being used to request specified data;
The file system is further configured to respond to the read data request and, if the file header of the file containing the specified data and the data block containing the specified data have both completed their write operations, send the data block containing the specified data to the offline computing system.
As can be seen from the above technical solutions, the embodiments of this solution have the following advantages:
In the technical solution provided by the embodiments of this solution, the real-time computing system builds a data block from one or more pieces of data and then compresses it, which preserves the compression ratio as far as possible. The requirement on compression ratio can therefore be met, which solves the prior-art problem that the data interaction mode cannot meet the compression ratio requirement in the data interaction scenario of real-time computing and offline computing. Moreover, once the file header of the file containing the specified data requested by the offline computing system and the data block containing the specified data have both completed their write operations, the file system can send that data block to the offline computing system. The offline computing system can thus obtain the required data from the file system before the file has been completely written. The requirement on real-time performance can therefore be met, which solves the prior-art problem that the data interaction mode cannot meet the real-time requirement in the data interaction scenario of real-time computing and offline computing.
【Brief Description of the Drawings】
To describe the technical solutions of the embodiments of this solution more clearly, the drawings needed in the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a structural diagram of the data interaction system provided by the embodiments of this solution;
Fig. 2 is a first flow diagram of the data writing method provided by the embodiments of this solution;
Fig. 3 is a flow diagram of the method by which the real-time computing system generates a data block in the embodiments of this solution;
Fig. 4 is a second flow diagram of the data writing method provided by the embodiments of this solution;
Fig. 5 is a flow diagram of the data reading method provided by the embodiments of this solution;
Fig. 6 is a first example diagram of the file format used by the file system and the real-time computing system in the embodiments of this solution;
Fig. 7 is a second example diagram of the file format used by the file system and the real-time computing system in the embodiments of this solution;
Fig. 8 is a second flow diagram of the data reading method provided by the embodiments of this solution;
Fig. 9 is a flow diagram of the data interaction method provided by the embodiments of this solution;
Fig. 10 is a structural diagram of the data writing system provided by the embodiments of this solution;
Fig. 11 is a functional block diagram of the data writing apparatus provided by the embodiments of this solution;
Fig. 12 is a simplified block diagram of server 100;
Fig. 13 is a structural diagram of the data reading system provided by the embodiments of this solution;
Fig. 14 is a functional block diagram of the data reading apparatus provided by the embodiments of this solution;
Fig. 15 is a simplified block diagram of server 200.
【Detailed Description】
For a better understanding of the technical solution of the present invention, the embodiments of this solution are described in detail below with reference to the drawings.
It should be understood that the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The terms used in the embodiments of this solution are only for the purpose of describing specific embodiments and are not intended to limit the present invention. The singular forms "a", "said" and "the" used in the embodiments of this solution and the appended claims are also intended to include plural forms, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" used herein only describes an association between associated objects and indicates that three relationships are possible. For example, A and/or B can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
Depending on context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining" or "in response to detecting". Similarly, depending on context, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined" or "in response to determining" or "when (the stated condition or event) is detected" or "in response to detecting (the stated condition or event)".
Embodiment one
The embodiments of this solution provide a data interaction system. Referring to Fig. 1, a structural diagram of the data interaction system provided by the embodiments of this solution, the data interaction system includes a real-time computing system, a file system and an offline computing system. In this data interaction system, the real-time computing system is mainly used to write data into the file system; the file system is mainly used to store the data and, in response to requests from the offline computing system, to provide the stored data to the offline computing system. Data interaction between the real-time computing system and the offline computing system is therefore realized through the file system.
Specifically, in the embodiments of this solution, the real-time computing system is configured to generate a data block from at least one piece of data, compress the data block, and send the compressed data block to the file system. The file system is configured to receive the data block sent by the real-time computing system. The offline computing system is configured to send a read data request to the file system, the read data request being used to request specified data. The file system is further configured to respond to the read data request and, if the file header of the file containing the specified data and the data block containing the specified data have both completed their write operations, send the data block containing the specified data to the offline computing system.
Embodiment two
The embodiments of this solution provide a data writing method. Referring to Fig. 2, a first flow diagram of the data writing method provided by the embodiments of this solution, the data writing method includes the following steps:
S201, the real-time computing system generates a data block from at least one piece of data, compresses the data block, and sends the compressed data block to the file system.
S202, the file system receives the data block sent by the real-time computing system.
In a specific implementation, the real-time computing system can send a write data request to the file system, with the compressed data block carried in the write data request. Correspondingly, the file system receives the write data request sent by the real-time computing system and stores the data block carried in it.
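The write path described above can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the names `WriteRequest`, `InMemoryFileSystem`, `handle_write` and `write_block`, the newline-joined block body, and the use of `zlib` as the compressor are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass


@dataclass
class WriteRequest:
    """Hypothetical write data request carrying one compressed data block."""
    path: str       # assumed: target file in the file system
    payload: bytes  # the compressed data block


class InMemoryFileSystem:
    """Stand-in for the file system; it simply stores what it receives."""

    def __init__(self):
        self.blocks = {}  # path -> list of stored compressed blocks

    def handle_write(self, request: WriteRequest) -> int:
        stored = self.blocks.setdefault(request.path, [])
        stored.append(request.payload)
        return len(stored) - 1  # index of the block just stored


def write_block(fs: InMemoryFileSystem, path: str, records: list) -> int:
    """Build a block from records, compress it, and send it in a write request."""
    block = b"\n".join(records)  # assumed block body; the real format has length fields
    return fs.handle_write(WriteRequest(path, zlib.compress(block)))
```

Each call to `write_block` corresponds to one pass through S201 and S202: the block is compressed on the sender side and stored as received.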
Referring to Fig. 3, a flow diagram of the method by which the real-time computing system generates a data block in the embodiments of this solution, the method may include the following steps:
S301, the real-time computing system generates at least one piece of data.
S302, the real-time computing system judges whether the data block generation condition is met. If it is met, step S303 is performed; otherwise, step S301 is performed and the system continues to generate data until the data block generation condition is judged to be met.
S303, the real-time computing system generates a data block from the at least one piece of data.
In a specific implementation, the real-time computing system generates data in real time. While generating data, it can periodically judge whether the preset data block generation condition is met. When the condition is met, it generates a data block from the data generated so far. Conversely, if the condition is not yet met, the real-time computing system continues to generate data and does not generate a data block for the time being.
In a specific implementation, the data block generation condition can include:
The number of pieces of data generated by the real-time computing system reaches a preset quantity threshold; or,
The length of the at least one piece of data generated by the real-time computing system reaches a preset length threshold; or,
The time elapsed since the previous data block was generated reaches a preset time threshold.
It should be noted that when the real-time computing system judges that any one of the above conditions is met, it can determine that the data block generation condition is met, and a data block can then be generated.
For example, when the real-time computing system judges that the number of pieces of data generated reaches 1024, it determines that the data block generation condition is met. As another example, when the real-time computing system judges that the length of the data generated reaches 4M, it determines that the data block generation condition is met. As a further example, when the real-time computing system judges that the time elapsed since the previous data block was generated reaches 1 minute, it determines that the data block generation condition is met.
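The three generation conditions can be sketched as a small buffer that reports when a block should be cut. This is an illustrative sketch only: the class name `BlockBuffer` and its methods are hypothetical, the defaults mirror the example thresholds above (1024 records, 4M read as 4 MiB, 1 minute), and the clock is injectable so that the time condition can be exercised without waiting.

```python
import time
from dataclasses import dataclass, field


@dataclass
class BlockBuffer:
    """Accumulates records until any one of the three generation conditions holds."""
    max_records: int = 1024              # quantity threshold
    max_bytes: int = 4 * 1024 * 1024     # length threshold (assumed to mean 4 MiB)
    max_age_seconds: float = 60.0        # time threshold since the previous block
    records: list = field(default_factory=list)
    total_bytes: int = 0
    last_block_time: float = field(default_factory=time.monotonic)

    def add(self, record: bytes) -> None:
        self.records.append(record)
        self.total_bytes += len(record)

    def should_flush(self, now=None) -> bool:
        """True if at least one piece of data is buffered and any condition is met."""
        if not self.records:
            return False
        now = time.monotonic() if now is None else now
        return (
            len(self.records) >= self.max_records
            or self.total_bytes >= self.max_bytes
            or now - self.last_block_time >= self.max_age_seconds
        )

    def flush(self, now=None) -> list:
        """Hand over the buffered records for block generation and reset the buffer."""
        now = time.monotonic() if now is None else now
        out, self.records = self.records, []
        self.total_bytes = 0
        self.last_block_time = now
        return out
```

A producer loop would call `add` for each generated piece of data and check `should_flush` periodically, which corresponds to the S301/S302 loop; `flush` corresponds to entering S303.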
In the embodiments of this solution, after generating a data block, the real-time computing system can compress the data block with a data compression algorithm and then send a write data request carrying the compressed data block to the file system. The file system can then store the compressed data block, and the stored data block is available to be provided to the offline computing system.
It should be noted that in the embodiments of this solution, a data block is generated from the one or more pieces of data generated so far only when the data block generation condition is met, and the generated data block is then compressed. This ensures that a certain amount of data has accumulated before a data block is generated and compressed, which preserves the compression ratio as far as possible and saves storage space in the file system. At the same time, it avoids the problem that data generated by real-time computing stays unwritten to the file system for a long time, so that the offline computing system cannot read it in time; the real-time performance of data interaction is thereby ensured as far as possible.
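The compression-ratio argument can be illustrated by comparing the compressed size of one accumulated block with the total size obtained when every record is compressed separately. The sketch below assumes Python's `zlib` (DEFLATE, which combines LZ77 with Huffman coding) as a stand-in for whatever algorithm a deployment uses; the synthetic records and the helper names are made up for the example.

```python
import zlib


def compress_block(block: bytes) -> bytes:
    """Compress one data block (zlib is an assumed stand-in compressor)."""
    return zlib.compress(block)


def decompress_block(payload: bytes) -> bytes:
    """Recover the original block bytes on the reader side."""
    return zlib.decompress(payload)


# Synthetic, repetitive records standing in for real-time output.
records = [f"user={i % 7},event=click,page=/home".encode() for i in range(1024)]
block = b"\n".join(records)

batched_size = len(compress_block(block))                         # one block of 1024 records
per_record_size = sum(len(compress_block(r)) for r in records)    # 1024 tiny payloads

print(f"batched: {batched_size} bytes, per record: {per_record_size} bytes")
```

Compressing per record pays the container overhead on every payload and gives the compressor no cross-record redundancy to exploit, which is why accumulating records before compressing tends to give a much better ratio on data like this.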
In the embodiments of this solution, the format of the data block generated by the real-time computing system includes:
A data block length field; and
An in-block record count field; and
Data-carrying fields, each of which carries one piece of data generated by the real-time computing system.
For example, the data compression algorithm used by the real-time computing system can include but is not limited to: the Lempel-Ziv77 algorithm, the run-length encoding (Run-Length Encoding, RLE) algorithm, or the fixed bit length packing (Fixed Bit Length Packing) algorithm. The embodiments of this solution place no particular limitation on the data compression algorithm used.
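As an illustration of one of the algorithms named above, here is a minimal byte-oriented run-length encoder. The (count, value) pair layout and the cap of 255 per run are choices made for this sketch; the source does not specify an RLE variant.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (run length, byte value) pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand (run length, byte value) pairs back into the original bytes."""
    out = bytearray()
    for j in range(0, len(encoded), 2):
        out += bytes([encoded[j + 1]]) * encoded[j]
    return bytes(out)
```

RLE pays off only when the data contains long runs of identical bytes; on data without runs this variant doubles the size, which is one reason the choice of algorithm is left open.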
Embodiment three
The embodiments of this solution provide a data writing method, which is an embodiment of the data writing method as implemented on the real-time computing system side. Referring to Fig. 4, a second flow diagram of the data writing method provided by the embodiments of this solution, the data writing method includes the following steps:
S401, generating a data block from at least one piece of data.
S402, compressing the data block.
S403, sending the compressed data block to the file system.
Embodiment four
The embodiments of this solution provide a data reading method. Referring to Fig. 5, a flow diagram of the data reading method provided by the embodiments of this solution, the method includes the following steps:
S501, the offline computing system sends a read data request to the file system, the read data request being used to request specified data.
S502, the file system responds to the read data request: if the file header of the file containing the specified data and the data block containing the specified data have both completed their write operations, the file system sends the data block containing the specified data to the offline computing system.
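The read-side check in S502 can be sketched with an in-memory model in which each file tracks whether its header has been written and whether each block's write has completed. The classes `StoredFile` and `FileSystem`, the boolean completion flags, and the use of `None` to signal "not yet readable" are all assumptions of this sketch; the request is modelled as a storage path plus a block identifier.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StoredFile:
    """Hypothetical per-file state kept by the file system."""
    header_written: bool = False
    # block identifier -> (payload, write_complete)
    blocks: dict = field(default_factory=dict)


class FileSystem:
    def __init__(self):
        self.files = {}  # storage path -> StoredFile

    def read_block(self, path: str, block_id: int) -> Optional[bytes]:
        """Return a block only if the file header and the block are fully written."""
        stored = self.files.get(path)
        if stored is None or not stored.header_written:
            return None  # no file yet, or the header write has not completed
        entry = stored.blocks.get(block_id)
        if entry is None:
            return None  # the requested block does not exist in the file yet
        payload, write_complete = entry
        return payload if write_complete else None
```

The point of the check is that a reader never has to wait for the whole file: any block whose own write has finished, in a file whose header is in place, can be served immediately.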
It should be noted that in a file system that stores files, at least two copies of each file generally need to be stored. In the embodiments of this solution, the real-time computing system writes to the file system in a star-type asynchronous manner, so that at least two copies of each file are stored in the file system. Using the star-type asynchronous manner greatly shortens the time needed to write a data block, so that the offline computing system can obtain the data block from the file system in time, which ensures the read/write performance required by the data interaction scenario of real-time computing and offline computing.
In a specific implementation, the real-time computing system can include at least one client. Each client can generate data, generate a data block from it, compress the data block, and send the compressed data block to the file system in a write data request. The file system can therefore generate, in advance, one or more identical file headers for each client. After the file system has generated at least two file headers for a client, it needs to store the data blocks it receives. For example, the file system appends a data block written by the real-time computing system to the specified files, so that the data block is added after each of the several identical file headers. In this way the file system stores several identical file headers and data blocks; the data blocks appended after each file header are exactly the same, which realizes storage and backup of the file. For the file headers corresponding to each client, the total number of file headers needs to be greater than or equal to the total number of data blocks written by the real-time computing system.
In the embodiments of this solution, the format of the files stored in the file system includes:
A file header including at least a file type field, a file header length field and an attribute field; and
Data blocks, each including at least a data block length field, an in-block record count field and data-carrying fields.
The file format used by the file system and the real-time computing system in the embodiments of this solution is illustrated below, taking Fig. 6 and Fig. 7 as examples.
Referring to Fig. 6, a first example diagram of the file format used by the file system and the real-time computing system in the embodiments of this solution, each file stored in the file system can include one file header (File Meta) and 1 to N data blocks (Block1 to BlockN), where N is a positive integer.
As shown in Fig. 6, in a specific implementation, the file header is located at the head of the entire file and includes the following three fields:
The first field is the file type field, which stores the file type; it is the File Type field shown in Fig. 6.
The second field is the file header length field, which stores the length of the file header; it is the Head Length field shown in Fig. 6.
The third field is the attribute field, which stores the file attribute information; it is the Property field shown in Fig. 6.
The file header length can be equal to the sum of the lengths of the File Type field, the Head Length field and the Property field. Since the lengths of the File Type field and the Head Length field are fixed, the length of the Property field can be calculated from the lengths of these two fields and the file header length stored in the Head Length field.
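The length arithmetic above can be made concrete with a small header builder and parser. The field widths here (a 4-byte File Type and a 4-byte little-endian Head Length) are assumptions for illustration; the source states only that these two fields have fixed lengths. The function names and the sample property string are likewise hypothetical.

```python
import struct

FILE_TYPE_LEN = 4    # assumed fixed width of the File Type field
HEAD_LENGTH_LEN = 4  # assumed fixed width of the Head Length field (uint32, little-endian)
FIXED_LEN = FILE_TYPE_LEN + HEAD_LENGTH_LEN


def build_header(file_type: bytes, properties: bytes) -> bytes:
    """Head Length stores the total header length: type + length field + properties."""
    if len(file_type) != FILE_TYPE_LEN:
        raise ValueError("file type must be exactly 4 bytes in this sketch")
    head_length = FIXED_LEN + len(properties)
    return file_type + struct.pack("<I", head_length) + properties


def parse_header(buf: bytes):
    """Return (file_type, properties), or None if the header is not fully written."""
    if len(buf) < FIXED_LEN:
        return None
    file_type = buf[:FILE_TYPE_LEN]
    (head_length,) = struct.unpack_from("<I", buf, FILE_TYPE_LEN)
    if head_length < FIXED_LEN or len(buf) < head_length:
        return None
    # Property length = Head Length - (File Type length + Head Length field length)
    return file_type, buf[FIXED_LEN:head_length]
```

Because the header is self-describing, a reader can also use it to tell whether the header write has completed: if fewer than `head_length` bytes are present, `parse_header` reports that the header is not yet available.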
Wherein, the file attribute information stored in above-mentioned attribute field can include but is not limited to:Docket, in real time meter Data compression algorithm and restricted information etc. used in calculation system.For example, restricted information can be the length of data in data block Degree is no more than 8M.
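The header layout described above can be sketched in Python as follows. This is only an illustrative sketch: the 6-byte width of the File Type field, the 4-byte big-endian Head Length field, and the JSON encoding of the Property field are assumptions made for the example, and the names `encode_header` and `decode_header` are hypothetical. The text only states that the first two fields have fixed lengths.

```python
import json
import struct

FILE_TYPE_LEN = 6     # assumed fixed width of the File Type field
HEAD_LENGTH_LEN = 4   # assumed fixed width of the Head Length field


def encode_header(file_type: bytes, properties: dict) -> bytes:
    """Serialize File Type + Head Length + Property into one file header."""
    if len(file_type) != FILE_TYPE_LEN:
        raise ValueError("file type must be exactly FILE_TYPE_LEN bytes")
    # JSON is an assumed encoding for the Property field.
    prop_bytes = json.dumps(properties, sort_keys=True).encode("utf-8")
    # Head Length is the sum of the lengths of all three fields.
    head_length = FILE_TYPE_LEN + HEAD_LENGTH_LEN + len(prop_bytes)
    return file_type + struct.pack(">I", head_length) + prop_bytes


def decode_header(buf: bytes) -> tuple[bytes, int, dict]:
    """Recover the three fields; the Property length is derived, not stored."""
    file_type = buf[:FILE_TYPE_LEN]
    (head_length,) = struct.unpack_from(">I", buf, FILE_TYPE_LEN)
    # Property length = Head Length minus the two fixed-width fields.
    prop_len = head_length - FILE_TYPE_LEN - HEAD_LENGTH_LEN
    start = FILE_TYPE_LEN + HEAD_LENGTH_LEN
    properties = json.loads(buf[start:start + prop_len].decode("utf-8"))
    return file_type, head_length, properties
```

Because the Property length is derived from Head Length, `decode_header` can be called on a buffer that already has data blocks appended after the header, and it will ignore the trailing bytes.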
Further, as shown in Fig. 6, in a specific implementation, each data block includes the following three kinds of fields:
The first kind is the data block length field, which records the length of the data block; this is the Block Length field shown in Fig. 6.
The second kind is the record count field, which records the total number of data records contained in the data block; this is the Row Count field shown in Fig. 6.
The third kind is the data bearer field; each data bearer field carries one piece of data generated by the real-time computing system; these are the Row fields shown in Fig. 6.
The length of each data block does not exceed 8M. Because the lengths of the Block Length field and the Row Count field are fixed, the total length of the Row fields can be derived from the lengths of these two fields and the data block length recorded in the Block Length field.
It can be understood that each data bearer field, that is, each Row, is a record made up of M columns and can be used to store one row of data. Each Row field may include several columns (Col1~ColK shown in Fig. 6): the first column stores the string length of the data held in the Row field, the second column stores the actual data held in the Row, and the third column stores a checksum (Check Sum shown in Fig. 6) used to verify the correctness of the stored data. In this embodiment, compression is performed in units of data blocks, that is, in units of multiple records, which can improve the compression ratio of the file and meet the compression performance requirements of data interaction scenarios between real-time computing and offline computing.
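A minimal Python sketch of the data block layout described above follows. The 4-byte big-endian widths, the use of CRC32 as the checksum, the use of zlib as the compression algorithm, and the convention that Block Length covers the whole block are all assumptions made for illustration; the function names are hypothetical.

```python
import struct
import zlib


def encode_row(value: str) -> bytes:
    """One Row: length column, data column, checksum column."""
    payload = value.encode("utf-8")
    checksum = zlib.crc32(payload)  # CRC32 is an assumed checksum choice
    return struct.pack(">I", len(payload)) + payload + struct.pack(">I", checksum)


def encode_block(values: list[str]) -> bytes:
    """Block Length + Row Count + Rows (Block Length covers the whole block)."""
    rows = b"".join(encode_row(v) for v in values)
    block_length = 4 + 4 + len(rows)
    return struct.pack(">II", block_length, len(values)) + rows


def decode_block(block: bytes) -> list[str]:
    """Walk the Rows of one uncompressed block and verify every checksum."""
    block_length, row_count = struct.unpack_from(">II", block, 0)
    offset, values = 8, []
    for _ in range(row_count):
        (length,) = struct.unpack_from(">I", block, offset)
        payload = block[offset + 4:offset + 4 + length]
        (checksum,) = struct.unpack_from(">I", block, offset + 4 + length)
        if zlib.crc32(payload) != checksum:
            raise ValueError("row checksum mismatch")
        values.append(payload.decode("utf-8"))
        offset += 4 + length + 4
    if offset != block_length:
        raise ValueError("block length does not match its rows")
    return values


def compress_block(values: list[str]) -> bytes:
    """Compress per block (many records at once) rather than per record."""
    return zlib.compress(encode_block(values))
```

Compressing the whole block lets the compressor exploit redundancy across records, which is the reason the text gives for compressing in units of data blocks.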
Refer to Fig. 7, which is a second example diagram of the file format used by the file system and the real-time computing system in this embodiment. As shown in the figure, this embodiment describes the file header and the data blocks using the string (String) data type as an example.
As shown in Fig. 7, the file header includes three fields:
The first field is the file type field, which stores the file type. This embodiment uses the version number of the file as the file type; as shown in Fig. 7, the version number of the file is SFILE1.
The second field is the file header length field, which stores the length of the file header; as shown in Fig. 7, the file header length in this embodiment is 34.
The third field is the attribute field, which stores the file attribute information. As shown in Fig. 7, the file attribute information stored in the attribute field includes the summary (Schema) attribute and the compression (Compress) attribute of the file. Here the summary attribute is string (String) and the compression attribute is empty (Null), that is, the file system does not compress the stored data.
As shown in Fig. 7, data block 1, appended after the file header, includes three fields:
The first field is the data block length field, which records the length of the data block; as shown in Fig. 7, the length of data block 1 is 47.
The second field is the record count field, which records the total number of data records contained in the data block; as shown in Fig. 7, data block 1 contains 2 records.
The third field is the data bearer field, which here consists of two records. Each record is one Row field and holds one piece of data generated by the real-time computing system. As shown in Fig. 7, the data carried in the two records of data block 1 are abc and defg, respectively. The number 3 before abc and the number 4 before defg are the string lengths, and the 001 after abc and the 002 after defg are the checksums.
As shown in Fig. 7, data block 2, appended after the file header, includes three fields:
The first field is the data block length field, which records the length of the data block; as shown in Fig. 7, the length of data block 2 is 47.
The second field is the record count field, which records the total number of data records contained in the data block; as shown in Fig. 7, data block 2 in this embodiment contains 2 records.
The third field is the data bearer field, which consists of two records. Each record is one Row and holds one piece of data generated by the real-time computing system. As shown in Fig. 7, the data stored in the two records of data block 2 are hij and klmn, respectively. The number 3 before hij and the number 4 before klmn are the string lengths, and the 003 after hij and the 004 after klmn are the checksums.
In this embodiment, the read data request sent by the offline computing system carries the identifier of the data block containing the specified data and the storage path of the file containing that data block. After receiving the read data request, the file system can obtain the file according to the storage path and then determine whether a file header exists in the file. In response to determining that a file header exists in the file, the file system further determines, according to the identifier of the data block containing the specified data, whether that data block exists in the file. If the data block exists in the file, the file system determines that the write operation for the data block has completed. In this embodiment, if the write operations for both the file header of the file containing the specified data and the data block containing the specified data have completed, the file system can send the data block containing the specified data to the offline computing system.
It should be noted that in the prior art, important data of a file in the file system, such as the metadata (including the file type field, file header length field, and attribute field described above), is stored at the tail of the file, so the file can be read by the offline computing system only after the entire file has been written to the file system. In this embodiment, the metadata of the file is stored in the file header. As a result, once the data block containing the data requested by the offline computing system has been written to the file system, it can be read by the offline computing system. This ensures that written data is read by the offline computing system in a timely manner and satisfies the real-time requirement. Therefore, in data interaction scenarios between real-time computing and offline computing, the file storage format and writing mode of this scheme satisfy the real-time requirement while also meeting read/write performance requirements.
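The benefit of placing the metadata at the head of the file can be illustrated with a small Python sketch of a reader that returns every data block that has been completely written so far, even though the file is still growing. The framing used here (a 4-byte big-endian length prefix on the header and on each block, with the length covering the prefix itself) is an assumption for the example, and the function names are hypothetical.

```python
import struct

LEN_FIELD = 4  # assumed width of each length prefix


def frame_header(properties: bytes) -> bytes:
    """Header = length prefix (covering itself) + property bytes."""
    return struct.pack(">I", LEN_FIELD + len(properties)) + properties


def frame_block(payload: bytes) -> bytes:
    """Block = length prefix (covering itself) + payload bytes."""
    return struct.pack(">I", LEN_FIELD + len(payload)) + payload


def readable_blocks(file_bytes: bytes) -> list[bytes]:
    """Return payloads of all fully written blocks; [] if the header is incomplete."""
    if len(file_bytes) < LEN_FIELD:
        return []
    (head_length,) = struct.unpack_from(">I", file_bytes, 0)
    if head_length < LEN_FIELD or len(file_bytes) < head_length:
        return []  # the header itself is still being written
    offset, blocks = head_length, []
    while offset + LEN_FIELD <= len(file_bytes):
        (block_length,) = struct.unpack_from(">I", file_bytes, offset)
        if block_length < LEN_FIELD or offset + block_length > len(file_bytes):
            break  # trailing block is malformed or only partially written
        blocks.append(file_bytes[offset + LEN_FIELD:offset + block_length])
        offset += block_length
    return blocks
```

With a tail-of-file metadata layout, the same reader could not interpret any block until the final bytes of the file had arrived.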
In a specific implementation, the read data request may carry the resource locator of the file containing the data block of the specified data requested by the offline computing system. The resource locator is the storage path of the file in the file system, so the file system can obtain, according to the storage path, the file containing the data block of the specified data requested by the offline computing system.
In addition, the read data request may also carry summary information of the data, and the summary information may include the identifier of the data block containing the specified data. After obtaining the file requested by the offline computing system, the file system can determine, according to the identifier, whether the data block represented by the identifier exists in the file. If the data block exists, the file contains the data block holding the requested specified data, and the file system determines that the write operation for the data block has completed. Conversely, if the data block does not exist, the file does not contain the data block holding the requested specified data, and the file system determines that the data block has not yet been written to the file system. In this embodiment, when it determines that the data block has not yet been written to the file system, the file system returns a response message indicating that the data read failed to the offline computing system.
In this embodiment, in response to the read data request, if the write operations for the file header of the file containing the specified data and for the data block containing the specified data have both completed, the file system extracts the data block corresponding to the identifier from the file according to the identifier of the data block containing the specified data, and then sends the data block to the offline computing system, which completes the data read by the offline computing system.
For example, files in the file system are stored by row, each row is divided into several columns, and each column is one data block, such as col1, col2, ..., coln for n columns. If the summary information in the read data request sent by the offline computing system carries coli, coli identifies the data block containing the specified data, namely the data block of the i-th column in a given file. The file system can first determine, according to the coli carried in the summary information, whether the data block of the i-th column exists in the obtained file; if it does, the file system can obtain the data block of the i-th column from that file.
It should be noted that the offline computing system can request one or more files from the file system, or it can obtain part of the data in each of at least one file. The partial data in each file may include one or more data blocks, that is, data is read in units of data blocks; for example, the data block of the 4th column in each file. This embodiment places no particular limitation on this.
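The read-side checks described above can be sketched in Python with an in-memory stand-in for the file system. The `StoredFile` and `ReadResponse` types, the `handle_read` function, the example paths, and the "file not found" case are all hypothetical and added only for illustration; the text itself describes only the header check and the data block check.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StoredFile:
    header: Optional[bytes] = None  # None until the file header has been written
    blocks: dict[str, bytes] = field(default_factory=dict)  # block id -> block bytes


@dataclass
class ReadResponse:
    ok: bool
    block: Optional[bytes] = None
    reason: str = ""


def handle_read(files: dict[str, StoredFile], path: str, block_id: str) -> ReadResponse:
    """Serve one read data request carrying a storage path and a block identifier."""
    stored = files.get(path)  # obtain the file by its storage path
    if stored is None:
        return ReadResponse(False, reason="file not found")  # assumed extra case
    if stored.header is None:  # is there a file header in the file?
        return ReadResponse(False, reason="file header not written")
    block = stored.blocks.get(block_id)  # is the identified block in the file?
    if block is None:
        return ReadResponse(False, reason="data block not yet written")
    # Header and block are both present, so both write operations have completed.
    return ReadResponse(True, block=block)
```

A request for `col4` against a file that so far holds only `col1` would receive a failure response, and the offline computing system could retry later.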
Embodiment five
This embodiment provides a data reading method, which is executed on the file system. Refer to Fig. 8, which is a second schematic flowchart of the data reading method provided by this embodiment. As shown in Fig. 8, the data reading method may include the following steps:
S801, receive a read data request sent by the offline computing system, where the read data request is used to request specified data.
S802, in response to the read data request, if the write operations for the file header of the file containing the specified data and for the data block containing the specified data have completed, send the data block containing the specified data to the offline computing system.
Embodiment six
This embodiment provides a data interaction method. Refer to Fig. 9, which is a schematic flowchart of the data interaction method provided by this embodiment. As shown in Fig. 9, the data interaction method may include the following steps:
S901, the real-time computing system generates data.
S902, the real-time computing system determines whether the data generated since the last data block was generated satisfies the data block generation condition. If yes, perform step S903; otherwise, perform step S901.
S903, the real-time computing system generates a data block from the data generated since the last data block was generated.
S904, the real-time computing system compresses the generated data block.
S905, the real-time computing system sends the generated data block to the file system.
S906, the file system receives the data block sent by the real-time computing system.
S907, the file system appends the received data block after the pre-generated file header, thereby storing the received data block and completing the write operation for the data block.
S908, the offline computing system sends a read data request to the file system to request data.
S909, the file system receives the read data request sent by the offline computing system.
S910, the file system determines whether the write operations for the file header of the file containing the data requested by the offline computing system and for the data block containing that data have completed. If yes, perform S911; if no, perform S912.
S911, the file system sends the data block containing the requested data to the offline computing system; then S913 and S914 are performed.
S912, the file system returns a response message indicating that the data read failed to the offline computing system; then S915 is performed.
S913, the offline computing system receives the data block returned by the file system.
S914, the offline computing system processes the received data block.
S915, the offline computing system receives the response message indicating that the data read failed.
This embodiment further provides apparatus embodiments that implement the steps and methods of the above method embodiments.
Embodiment seven
This embodiment also provides a data writing system. Refer to Fig. 10, which is a schematic structural diagram of the data writing system provided by this embodiment. As shown in Fig. 10, the system includes:
a real-time computing system 10, configured to generate a data block according to at least one piece of data, compress the data block, and send the compressed data block to a file system;
the file system 20, configured to receive the data block sent by the real-time computing system.
In a specific implementation, the real-time computing system 10 is specifically configured to:
generate at least one piece of data;
determine whether a data block generation condition is reached;
if it is determined that the specified data block generation condition is reached, generate a data block according to the at least one piece of data.
In a specific implementation, the data block generation condition includes:
the number of pieces of data generated by the real-time computing system reaches a preset quantity threshold; or
the length of the at least one piece of data generated by the real-time computing system reaches a preset length threshold; or
the time interval since the previous data block was generated reaches a preset time threshold.
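The three alternative generation conditions above can be sketched as a single predicate in Python. The threshold values, the type names `BlockPolicy` and `PendingData`, and the decision not to generate an empty block when only the timer has expired are assumptions made for this example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class BlockPolicy:
    max_records: int = 1000   # assumed preset quantity threshold
    max_bytes: int = 1 << 20  # assumed preset length threshold
    max_seconds: float = 5.0  # assumed preset time threshold


@dataclass
class PendingData:
    records: list[bytes] = field(default_factory=list)
    # Time at which the previous data block was generated.
    last_block_time: float = field(default_factory=time.monotonic)

    def total_bytes(self) -> int:
        return sum(len(r) for r in self.records)


def should_generate_block(pending: PendingData, policy: BlockPolicy, now: float) -> bool:
    """True when any one of the three conditions is reached."""
    if not pending.records:
        return False  # assumed: never emit an empty block
    return (
        len(pending.records) >= policy.max_records
        or pending.total_bytes() >= policy.max_bytes
        or now - pending.last_block_time >= policy.max_seconds
    )
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the predicate deterministic and easy to test.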
In a specific implementation, the format of the data block includes:
a data block length field; and
a record count field; and
data bearer fields, where each data bearer field carries one piece of data generated by the real-time computing system.
In an optional implementation, the file system 20 is further configured to:
store the data block sent by the real-time computing system.
It should be noted that, in this embodiment, the real-time computing system and the file system may each be implemented by a server and may be deployed as required: they may be deployed on different servers, or on a single server.
Embodiment eight
This embodiment also provides a data writing apparatus. Refer to Fig. 11, which is a functional block diagram of the data writing apparatus provided by this embodiment. As shown in Fig. 11, the data writing apparatus is arranged in the real-time computing system and includes:
a generation unit 110, configured to generate a data block according to at least one piece of data;
a compression unit 111, configured to compress the data block;
a sending unit 112, configured to send the compressed data block to the file system.
Embodiment nine
Refer to Fig. 12, which is a simplified block diagram of a server 100. The server 100 may include a processor 101 connected to one or more data storage means, and the data storage means may include a storage medium 102 and a memory unit 103. The server 100 may further include an input interface 104, an output interface 105, and a business interface 106 for communicating with another apparatus or system. Program code executed by the CPU of the processor 101 may be stored in the storage medium 102 or the memory unit 103.
The processor 101 in the server 100 calls the program code stored in the storage medium 102 or the memory unit 103 to perform the following steps:
generate a data block according to at least one piece of data; compress the data block; and send the compressed data block to the file system through the output interface 105.
Embodiment ten
This embodiment also provides a data reading system. Refer to Fig. 13, which is a schematic structural diagram of the data reading system provided by this embodiment. As shown in Fig. 13, the system includes:
an offline computing system 30, configured to send a read data request to a file system 20, where the read data request is used to request specified data;
the file system 20, configured to, in response to the read data request, send the data block containing the specified data to the offline computing system if the write operations for the file header of the file containing the specified data and for the data block containing the specified data have completed.
In a specific implementation, the file format stored in the file system 20 includes:
a file header including at least a file type field, a file header length field, and an attribute field;
a data block including at least a data block length field, a record count field, and data bearer fields.
In a specific implementation, the read data request carries the identifier of the data block containing the specified data and the storage path of the file containing the data block. After responding to the read data request, the file system 20 is further configured to:
obtain the file according to the storage path;
determine whether a file header exists in the file;
in response to determining that a file header exists in the file, determine, according to the identifier of the data block containing the specified data, whether the data block containing the specified data exists in the file;
if the data block containing the specified data exists in the file, determine that the write operation for the data block containing the specified data has completed.
It should be noted that, in this embodiment, the file system and the offline computing system may each be implemented by a server and may be deployed as required: they may be deployed on different servers, or on a single server.
Embodiment 11
This embodiment also provides a data reading apparatus. Refer to Fig. 14, which is a functional block diagram of the data reading apparatus provided by this embodiment. As shown in Fig. 14, the data reading apparatus is arranged in the file system and includes:
a receiving unit 141, configured to receive a read data request sent by the offline computing system, where the read data request is used to request specified data;
a processing unit 142, configured to, in response to the read data request, send the data block containing the specified data to the offline computing system through a sending unit 143 if the write operations for the file header of the file containing the specified data and for the data block containing the specified data have completed.
Embodiment 12
Refer to Fig. 15, which is a simplified block diagram of a server 200. The server 200 may include a processor 201 connected to one or more data storage means, and the data storage means may include a storage medium 202 and a memory unit 203. The server 200 may further include an input interface 204 and an output interface 205 for communicating with another apparatus or system. Program code executed by the CPU of the processor 201 may be stored in the storage medium 202 or the memory unit 203.
The processor 201 in the server 200 calls the program code stored in the storage medium 202 or the memory unit 203 to perform the following steps:
receive, through the input interface 204, a read data request sent by the offline computing system, where the read data request is used to request specified data; and in response to the read data request, if the write operations for the file header of the file containing the specified data and for the data block containing the specified data have completed, send the data block containing the specified data to the offline computing system through the output interface 205.
In the above embodiments, the storage medium may be a read-only memory (Read-Only Memory, ROM) or a readable and writable medium, such as a hard disk or flash memory. The memory unit may be a random access memory (Random Access Memory, RAM). The memory unit may be physically integrated with the processor, integrated in the memory, or configured as a separate unit.
The processor is the control center of the above device (the device being the above server or the above client) and provides a processing means for executing instructions, performing interrupt operations, and providing timing functions and various other functions. Optionally, the processor includes one or more central processing units (CPUs), such as CPU 0 and CPU 1 shown in Fig. 15. The above device includes one or more processors. A processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. Unless otherwise stated, a component such as a processor or memory that is described as performing a task may be implemented as a general-purpose component temporarily configured to perform the task at a given time, or as a specific component manufactured to perform the task. The term "processor" as used herein refers to one or more devices, circuits, and/or processing cores for processing data, such as computer program instructions.
Program code executed by the CPU of the processor may be stored in the memory unit or the storage medium. Optionally, program code stored in the storage medium may be copied into the memory unit for execution by the CPU of the processor. The processor may execute at least one kernel (for example LINUX™, UNIX™, WINDOWS™, ANDROID™, IOS™); as is well known, the kernel controls the operation of the above device by controlling the execution of other programs or processes, controlling communication with peripheral devices, and controlling the use of computer device resources.
The above elements in the above device may be connected to each other by a bus, for example one of, or any combination of, a data bus, an address bus, a control bus, an expansion bus, and a local bus.
It is apparent to those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
The foregoing describes merely preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (21)

  1. A data writing method, characterized in that the method comprises:
    a real-time computing system generating a data block according to at least one piece of data, compressing the data block, and sending the compressed data block to a file system;
    the file system receiving the data block sent by the real-time computing system.
  2. The method according to claim 1, characterized in that the real-time computing system generating a data block according to at least one piece of data comprises:
    the real-time computing system generating at least one piece of data;
    the real-time computing system determining whether a data block generation condition is reached;
    if it is determined that the specified data block generation condition is reached, the real-time computing system generating a data block according to the at least one piece of data.
  3. The method according to claim 2, characterized in that the data block generation condition comprises:
    the number of pieces of data generated by the real-time computing system reaches a preset quantity threshold; or
    the length of the at least one piece of data generated by the real-time computing system reaches a preset length threshold; or
    the time interval since the previous data block was generated reaches a preset time threshold.
  4. The method according to claim 3, characterized in that the format of the data block comprises:
    a data block length field; and
    a record count field in the data block; and
    data bearer fields, each data bearer field being used to carry one piece of data generated by the real-time computing system.
  5. The method according to claim 1, characterized in that the method further comprises:
    the file system storing the data block sent by the real-time computing system.
  6. A data writing method, characterized in that the method is executed on a real-time computing system and comprises:
    generating a data block according to at least one piece of data;
    compressing the data block;
    sending the compressed data block to a file system.
  7. A data reading method, characterized in that the method comprises:
    an offline computing system sending a read data request to a file system, the read data request being used to request specified data;
    the file system, in response to the read data request, sending the data block containing the specified data to the offline computing system if the write operations for the file header of the file containing the specified data and for the data block containing the specified data have completed.
  8. The method according to claim 7, characterized in that the file format stored in the file system comprises:
    a file header including at least a file type field, a file header length field, and an attribute field;
    a data block including at least a data block length field, a record count field in the data block, and data bearer fields.
  9. The method according to claim 7, characterized in that the read data request carries the identifier of the data block containing the specified data and the storage path of the file containing the data block; after the file system responds to the read data request, the method further comprises:
    the file system obtaining the file according to the storage path;
    the file system determining whether a file header exists in the file;
    in response to determining that a file header exists in the file, the file system determining, according to the identifier of the data block containing the specified data, whether the data block containing the specified data exists in the file;
    if the data block containing the specified data exists in the file, the file system determining that the write operation for the data block containing the specified data has completed.
  10. A data reading method, characterized in that the method is executed on a file system and comprises:
    receiving a read data request sent by an offline computing system, the read data request being used to request specified data;
    in response to the read data request, if the write operations for the file header of the file containing the specified data and for the data block containing the specified data have completed, sending the data block containing the specified data to the offline computing system.
  11. A data writing system, characterized in that the system comprises:
    a real-time computing system, configured to generate a data block according to at least one piece of data, compress the data block, and send the compressed data block to a file system;
    the file system, configured to receive the data block sent by the real-time computing system.
  12. The system according to claim 11, characterized in that the real-time computing system is specifically configured to:
    generate at least one piece of data;
    determine whether a data block generation condition is reached;
    if it is determined that the specified data block generation condition is reached, generate a data block according to the at least one piece of data.
  13. The system according to claim 12, characterized in that the data block generation condition comprises:
    the number of pieces of data generated by the real-time computing system reaches a preset quantity threshold; or
    the length of the at least one piece of data generated by the real-time computing system reaches a preset length threshold; or
    the time interval since the previous data block was generated reaches a preset time threshold.
  14. The system according to claim 13, characterized in that the format of the data block comprises:
    a data block length field; and
    a record count field in the data block; and
    data bearer fields, each data bearer field being used to carry one piece of data generated by the real-time computing system.
  15. The system according to claim 11, characterized in that the file system is further configured to:
    store the data block sent by the real-time computing system.
  16. A data writing apparatus, characterized in that the apparatus is arranged in a real-time computing system and comprises:
    a generation unit, configured to generate a data block according to at least one piece of data;
    a compression unit, configured to compress the data block;
    a sending unit, configured to send the compressed data block to a file system.
  17. A system for reading data, characterized in that the system comprises:
    an offline computing system, configured to send a read data request to a file system, the read data request being used to request specified data; and
    the file system, configured to, in response to the read data request, send the data block containing the specified data to the offline computing system if both the file header of the file containing the specified data and the data block containing the specified data have completed the write operation.
  18. The system according to claim 17, characterized in that the format of a file stored in the file system comprises:
    a file header comprising at least a file type field, a file header length field, and an attribute field; and
    a data block comprising at least a data block length field, a field for the number of records in the data block, and a data bearer field.
  19. The system according to claim 17, characterized in that the read data request carries the identifier of the data block containing the specified data and the storage path of the file containing that data block; and the file system, after responding to the read data request, is further configured to:
    obtain the file according to the storage path;
    determine whether a file header exists in the file;
    in response to determining that a file header exists in the file, determine, according to the identifier of the data block containing the specified data, whether that data block exists in the file; and
    if that data block is determined to exist in the file, determine that the data block containing the specified data has completed the write operation.
  20. A device for reading data, characterized in that the device is arranged in a file system and comprises:
    a receiving unit, configured to receive a read data request sent by an offline computing system, the read data request being used to request specified data; and
    a processing unit, configured to, in response to the read data request, send the data block containing the specified data to the offline computing system through a sending unit if both the file header of the file containing the specified data and the data block containing the specified data have completed the write operation.
  21. A data interaction system, characterized in that the system comprises:
    a real-time computing system, configured to generate a data block from at least one piece of data, compress the data block, and send the compressed data block to a file system;
    the file system, configured to receive the data block sent by the real-time computing system; and
    an offline computing system, configured to send a read data request to the file system, the read data request being used to request specified data;
    the file system being further configured to, in response to the read data request, send the data block containing the specified data to the offline computing system if both the file header of the file containing the specified data and the data block containing the specified data have completed the write operation.
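The write path recited in claims 11 to 14 can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation described in the application: the 4-byte big-endian field widths, the length prefix inside each data bearer field, the choice of zlib for compression, the default thresholds, and the names `encode_block`, `BlockWriter`, and `send` are all introduced here for the example.

```python
import struct
import time
import zlib


def encode_block(records):
    """Pack records as: block length | record count | one bearer field per record."""
    # Assumption: each bearer field is a 4-byte length prefix plus the record bytes.
    body = b"".join(struct.pack(">I", len(r)) + r for r in records)
    # Assumption: the block length counts the record-count field and the body.
    return struct.pack(">II", 4 + len(body), len(records)) + body


class BlockWriter:
    """Buffers records and emits a compressed block when any condition is met."""

    def __init__(self, send, max_records=1000, max_bytes=1 << 20,
                 max_seconds=5.0, clock=time.monotonic):
        self.send = send                # hypothetical callable that ships bytes to the file system
        self.max_records = max_records  # preset quantity threshold
        self.max_bytes = max_bytes      # preset length threshold
        self.max_seconds = max_seconds  # preset time threshold
        self.clock = clock
        self.records = []
        self.buffered_bytes = 0
        self.last_block_time = clock()

    def should_flush(self):
        # Any one of the three conditions is enough to form a block.
        return (
            len(self.records) >= self.max_records
            or self.buffered_bytes >= self.max_bytes
            or self.clock() - self.last_block_time >= self.max_seconds
        )

    def add(self, record):
        self.records.append(record)
        self.buffered_bytes += len(record)
        if self.should_flush():
            self.flush()

    def flush(self):
        # Generate the block, compress it, then send it.
        if self.records:
            self.send(zlib.compress(encode_block(self.records)))
        self.records = []
        self.buffered_bytes = 0
        self.last_block_time = self.clock()
```

Batching records into one block before compressing lets the compressor work on a larger input than a single record, which is presumably why the claims generate the block first and compress second.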
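The read-side check recited in claims 17 to 19 can be sketched in the same spirit. Everything concrete here is an assumption made for illustration: the `MAGIC` file type value, the 12-byte header layout, treating the block identifier as a zero-based index, treating each stored block as an opaque byte string behind a 4-byte length field, and the names `write_header` and `read_block`. Fetching the file by its storage path is left to the caller, so the function takes the file contents as bytes.

```python
import struct

MAGIC = b"RTBF"                  # hypothetical file type value
HEADER = struct.Struct(">4sII")  # file type, file header length, attribute


def write_header(attribute=0):
    """Build a file header with the three fields named in claim 18."""
    return HEADER.pack(MAGIC, HEADER.size, attribute)


def read_block(file_bytes, block_id):
    """Return block `block_id` if the header and that block are fully written, else None."""
    # Step 1: the file header must be present and complete.
    if len(file_bytes) < HEADER.size:
        return None
    file_type, header_len, _attribute = HEADER.unpack_from(file_bytes, 0)
    if file_type != MAGIC or header_len < HEADER.size or len(file_bytes) < header_len:
        return None
    # Step 2: walk the length-prefixed blocks until the requested one is reached.
    offset = header_len
    for index in range(block_id + 1):
        if offset + 4 > len(file_bytes):
            return None  # the block's length field has not been written yet
        (block_len,) = struct.unpack_from(">I", file_bytes, offset)
        end = offset + 4 + block_len
        if end > len(file_bytes):
            return None  # the block is still being written
        if index == block_id:
            return file_bytes[offset:end]
        offset = end
    return None
```

Returning `None` for a missing header or a partially written block means a reader never sees a block that the writer has not finished, which is the guarantee the claims condition the response on.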
CN201611023009.XA 2016-11-18 2016-11-18 The method, apparatus and system, data interaction system that data write and read Pending CN108073642A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611023009.XA CN108073642A (en) 2016-11-18 2016-11-18 The method, apparatus and system, data interaction system that data write and read

Publications (1)

Publication Number Publication Date
CN108073642A true CN108073642A (en) 2018-05-25

Family

ID=62160424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611023009.XA Pending CN108073642A (en) 2016-11-18 2016-11-18 The method, apparatus and system, data interaction system that data write and read

Country Status (1)

Country Link
CN (1) CN108073642A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408141A (en) * 2014-12-01 2015-03-11 国家计算机网络与信息安全管理中心 Redundancy removal file system and data deployment method thereof
CN105357504A (en) * 2015-12-17 2016-02-24 深圳市科漫达智能管理科技有限公司 Recording and playback method and device for video stream data
CN105868305A (en) * 2016-03-25 2016-08-17 西安电子科技大学 A fuzzy matching-supporting cloud storage data dereplication method
CN105959720A (en) * 2016-04-28 2016-09-21 东莞市华睿电子科技有限公司 Video stream data processing method
TW201640352A (en) * 2015-05-14 2016-11-16 Alibaba Group Services Ltd Stream computing system and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XUHEAZX: "流媒体服务器原理和架构解析", 《CSDN》 *
众播传媒的博客: "影响音视频传输质量的三个关键的因素", 《新浪博客》 *

Similar Documents

Publication Publication Date Title
CN110531940B (en) Video file processing method and device
CN112153085B (en) Data processing method, node and block chain system
US7136976B2 (en) System and method for backup which synchronously or asynchronously stores additional information depending on the target backup data
CN104246767A (en) Telemetry system for a cloud synchronization system
CN102866961A (en) Memory transfer with extended data and user privacy protection
CN111352935B (en) Index creating method, device and equipment in block chain type account book
CN102737205B (en) Protection comprises can the file of editing meta-data
CN108427728A (en) Management method, equipment and the computer-readable medium of metadata
CN110287201A (en) Data access method, device, equipment and storage medium
CN108268344A (en) A kind of data processing method and device
CN107153587A (en) Directly it is attached on nonvolatile memory in high-performance using data reduction and carries out effective striding equipment redundancy realization
CN115470156A (en) RDMA-based memory use method, system, electronic device and storage medium
CN108241676A (en) Realize the method and apparatus that data synchronize
CN104951482A (en) Method and device for operating Sparse-format mirror image document
CN102792281A (en) Storage device
CN107463638A (en) File sharing method and equipment between offline virtual machine
CN106453663A (en) Improved cloud service-based storage capacity expansion method and device
CN114924914B (en) Disk partition table information backup and recovery method and system
CN113360095B (en) Hard disk data management method, device, equipment and medium
CN108073642A (en) The method, apparatus and system, data interaction system that data write and read
CN115756955A (en) Data backup and data recovery method and device and computer equipment
CN111625502B (en) Data reading method and device, storage medium and electronic device
CN114513469A (en) Traffic shaping method and device for distributed system and storage medium
CN110019086A (en) More copy read methods, equipment and storage medium based on distributed file system
CN114327942A (en) Shared memory management method and cache service assembly

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180525
