CN104572763A - Method for object transmission in distributed computing system - Google Patents
- Publication number
- CN104572763A (application CN201310512599.2A)
- Authority
- CN
- China
- Prior art keywords
- job
- distributed computing
- computing system
- file
- map
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A method for transmitting objects in a distributed computing system, comprising the steps of: serializing and encoding an object and writing it into a job file; transmitting the job file containing the encoded sequence; and decoding and deserializing the encoded sequence in the job file, then extracting and using the object content. In the serialization and encoding step, the client decomposes the object into a byte stream and then encodes the byte stream into ASCII characters. The job file is a standard XML file, and the object is encoded with Base64. The method allows complex object instances to be transmitted among the nodes of a distributed computing system, which effectively enhances the processing capacity of the whole system and enables more advanced distributed computation.
Description
Technical field
The present invention relates to distributed computing systems, and in particular to a method for transmitting objects in a distributed computing system.
Background
Distributed computing is a field of computer science in which a problem that requires very large computing power is divided into many small parts, the parts are distributed to many computers for processing, and the partial results are finally combined to obtain the final result.
Map/Reduce is a distributed computing platform for large-scale data processing and is currently the most common distributed computing system; it was originally designed and implemented by Google engineers. In their definition, Map/Reduce is a programming model and an associated implementation for processing and generating large data sets. The user defines a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same key. Many real-world tasks can be expressed in this model.
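The programming model described above can be illustrated without any framework. The following is a minimal single-process sketch in plain Java; the class `WordCountSketch` and its `map`, `reduce`, and `run` methods are hypothetical names chosen for illustration and are not part of Hadoop's API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WordCountSketch {
    // map: one input record -> a set of intermediate (key, value) pairs
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // reduce: all intermediate values sharing one key -> one merged value
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Groups the intermediate pairs by key, then reduces each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        grouped.forEach((key, values) -> result.put(key, reduce(key, values)));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b a", "b c"))); // prints {a=2, b=2, c=1}
    }
}
```

In a real distributed system the grouping step is performed across machines; here it is a local map so that only the map/reduce contract itself is shown.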
Hadoop, developed by the Apache Software Foundation, implements the distributed file system HDFS and the Map/Reduce distributed computing platform in Java. The user only needs to inherit from the base class MapReduceBase provided by the system, implement the two classes Map and Reduce, and register a Job; the customized task then runs in a distributed manner automatically.
In Hadoop's Map/Reduce implementation, the configuration information required at run time must be transmitted between nodes for each concrete job (Job). This is done through job.xml: the job originator writes the information needed to run the job, such as the job name, the input/output formats, and the number of Map/Reduce tasks, into a job.xml file, which is then delivered to the various system nodes. job.xml is a standard XML (Extensible Markup Language) file in which each piece of information exists as an XML element. The other running nodes in the system read the relevant information from this job.xml file to configure the part of the job that runs on them, which is how the distributed operation of Map/Reduce is actually achieved.
Fig. 1 is a schematic diagram of the basic structure of an existing distributed computing system. As shown in Fig. 1, the system comprises a client, a job server, and task servers. The client writes the job and its related content and configuration into job.xml, submits it to the job server, and monitors the execution status at all times. The job server, called JobTracker or Master in Hadoop, is responsible for distributing the job file (XML file) to multiple task servers and manages all jobs running under the framework. The task servers are responsible for actually executing the user-defined operations. Each job is split into many tasks, including Map tasks and Reduce tasks; a task is the basic unit of execution and must be assigned to a suitable task server. While executing, each task server reports the state of each task to the job server, which helps the job server track the overall progress of the job and assign new tasks.
Existing distributed computing systems, namely the Hadoop Map/Reduce system, cannot transmit object (class) instances. job.xml can only carry a limited set of simple data types, such as int, long, float, String, and boolean. This is because XML restricts the characters that may be transmitted, that is, the characters allowed inside each element: an arbitrary buffer in memory cannot simply be copied into the XML, because doing so would cause XML encoding and decoding to fail and the transmission would not succeed.
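The character restriction can be observed with the XML parser that ships with the JDK. The sketch below is an illustration only: the class `RawBytesInXml` and its methods are hypothetical, and the four bytes used are the header of a Java serialization stream (0xAC 0xED 0x00 0x05), copied into the element text one character per byte. XML 1.0 forbids the control characters U+0000 and U+0005, so the parse is expected to fail, while ordinary text parses normally.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

final class RawBytesInXml {
    // The first four bytes of any Java serialization stream, mapped
    // one-to-one onto characters (ISO-8859-1), as a naive copy would do.
    static String rawHeader() {
        byte[] header = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05};
        return new String(header, StandardCharsets.ISO_8859_1);
    }

    // Returns true if the given element text survives an XML parse.
    static boolean parses(String elementText) {
        String xml = "<value>" + elementText + "</value>";
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            // The parser reports the illegal character as a fatal error.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("42"));        // simple value: accepted
        System.out.println(parses(rawHeader())); // raw bytes: rejected
    }
}
```

The JDK parser may also print a diagnostic to standard error for the rejected input; that message is not part of the sketch's own output.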
However, customized applications do not involve only the simple data types mentioned above. User applications often need to transmit complex object (class) instances between the nodes of the Map/Reduce system in order to carry out more advanced distributed computation, and Hadoop's Map/Reduce implementation currently does not provide this capability.
Summary of the invention
To overcome the deficiencies of the prior art, the object of the present invention is to provide a method for transmitting objects in a distributed computing system, so that complex object instances can be transmitted between the nodes of the distributed computing system.
To achieve this object, the method for transmitting objects in a distributed computing system provided by the present invention comprises the following steps:
Serializing and encoding an object, and writing it into a job file;
Transmitting the job file containing the encoded sequence;
Decoding and deserializing the encoded sequence in the job file, and extracting and using the object content.
Wherein, in the step of serializing and encoding the object, the client decomposes the object into a byte stream and then encodes the byte stream into ASCII characters.
Wherein, the job file is a standard XML file.
Wherein, the object is encoded with Base64.
Wherein, the step of transmitting the job file containing the encoded sequence further comprises: the client sending the job file containing the encoded sequence to the job server, and the job server sending the job file containing the encoded sequence to the task servers.
Wherein, the step of decoding and deserializing the encoded sequence in the job file, and extracting and using the object content, further comprises: the job server decoding and deserializing the encoded sequence in the job file and extracting and using the object content, and the task server decoding and deserializing the encoded sequence in the job file and extracting and using the object content.
Wherein, the object serialization converts an object that implements the serializable interface into a byte sequence.
The method provided by the present invention solves the problem that object instances cannot be transmitted in the most common distributed computing system at present, namely the Hadoop Map/Reduce system, and effectively extends the processing capability of jobs in the Map/Reduce system. When performing distributed computation, the Map/Reduce system is no longer limited to passing simple data structures such as character strings between nodes; it can transmit complex object (class) instances, which effectively enhances the processing capability of the whole Hadoop Map/Reduce distributed computing system and enables more advanced distributed computation.
Other features and advantages of the present invention will be set forth in the following description, and will in part become apparent from the specification or be understood by practicing the invention.
Brief description of the drawings
The accompanying drawings provide a further understanding of the present invention and form part of the specification. Together with the embodiments, they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is a schematic diagram of the basic structure of an existing distributed computing system;
Fig. 2 is a flow chart of the method for transmitting objects in a distributed computing system according to the present invention.
Embodiment
The preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.
To transmit user object (class) instances in Hadoop's Map/Reduce system, and thereby extend the processing capability of jobs in the existing Map/Reduce system and effectively strengthen the processing capability of the whole system, this application adopts a technical solution that combines Java object serialization with Base64 encoding and decoding.
Java object serialization converts an object that implements the Serializable interface into a byte sequence, from which the original object can later be fully restored. Serialization is thus the process of writing an object to a byte stream and reading an object back from a byte stream. Once the state of an object has been converted to a byte stream, the byte stream classes in the java.io package can save it to a file, pipe it to another thread, or send the object data to another host over a network connection. The process has two parts: serialization and deserialization. Serialization is the first part: it decomposes the data into a byte stream so that it can be stored in a file or transmitted over a network. Deserialization reads the byte stream and reconstructs the object. Object serialization must not only convert basic data types into a byte representation but also be able to restore the data, and restoring the data requires an object instance into which the data is restored.
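The round trip described above can be sketched with the standard `java.io` object streams. The class `SerializationSketch` and the nested `Point` class are hypothetical examples, not taken from the patent; any class that implements `Serializable` could take the place of `Point`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationSketch {
    // Hypothetical user-defined class standing in for a "complex object".
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Serialization: decompose the object into a byte stream.
    static byte[] serialize(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Deserialization: read the byte stream and reconstruct the object.
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point restored = (Point) deserialize(serialize(new Point(3, 4)));
        System.out.println(restored.x + "," + restored.y); // prints 3,4
    }
}
```

The receiving side must have the same class available on its classpath; otherwise `readObject` fails with `ClassNotFoundException`.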
Base64 encodes, and is one of modal coded system for transmitting 8Bit syllabified code on network.In RFC2045, Base64 is defined as: Base64 content transmits coding and is designed to the octet of arbitrary sequence to be described as a kind of not easily by the form of people's Direct Recognition.Base64 Producing reason also has one, and in the transport process of Email, due to historical reasons, Email is only allowed to transmit ascii character, and namely an octet is low 7.Base64 requirement is the byte (3*8=4*6=24) of four 6Bit the byte conversion of every three 8Bit, then 6Bit is added two high positions 0 again, the byte of composition four 8Bit, that is, the character string after conversion in theory will than original length 1/3.
Hadoop's Map/Reduce system uses job.xml to pass information between task servers (computing nodes), and the byte stream produced by object serialization cannot be written directly into an XML element. Instead, the byte stream is Base64-encoded, which converts it into legal ASCII characters that can be written into an XML element and transmitted between the task servers (computing nodes) of the Map/Reduce system. A task server that receives job.xml first Base64-decodes the content of the element and then deserializes the resulting byte stream, which yields the required object instance. In this way objects are transmitted between different task servers (computing nodes) in Hadoop's Map/Reduce system.
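A minimal sketch of carrying a Base64 payload inside an XML element follows, using only the JDK's built-in DOM parser rather than Hadoop itself. The `configuration`/`property`/`name`/`value` layout mirrors the usual shape of Hadoop configuration files, but the class `JobXmlSketch`, its methods, and the property name passed in are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class JobXmlSketch {
    // Writes one property whose value is the Base64 form of the payload.
    // The Base64 alphabet (A-Z, a-z, 0-9, +, /, =) needs no XML escaping;
    // the property name is assumed to contain no XML special characters.
    static String writeJobXml(String name, byte[] payload) {
        String value = Base64.getEncoder().encodeToString(payload);
        return "<?xml version=\"1.0\"?>\n"
             + "<configuration>\n"
             + "  <property><name>" + name + "</name><value>" + value
             + "</value></property>\n"
             + "</configuration>\n";
    }

    // Finds the named property and Base64-decodes its value; null if absent.
    static byte[] readJobXml(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                String n = prop.getElementsByTagName("name").item(0).getTextContent();
                if (n.equals(name)) {
                    String v = prop.getElementsByTagName("value").item(0).getTextContent();
                    return Base64.getDecoder().decode(v.trim());
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05};
        String xml = writeJobXml("example.user.object", payload);
        System.out.println(xml);
        System.out.println(readJobXml(xml, "example.user.object").length); // prints 4
    }
}
```

The payload here contains bytes that are not legal XML characters on their own, yet the document parses because only their Base64 text is stored in the element.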
Fig. 2 is a flow chart of the method for transmitting objects in a distributed computing system according to the present invention. The method is described in detail below with reference to Fig. 2.
First, in step 201, the client (Map/Reduce Client), acting as the job originator, serializes the object by decomposing it into a byte stream, Base64-encodes the resulting byte stream into legal ASCII characters that can be written into an XML element, and writes them into the job file (job.xml). The job file is a standard XML (Extensible Markup Language) file in which each piece of information exists as an XML element.
In step 202, the client submits the job file (job.xml) containing the serialized, Base64-encoded object to the job server (Map/Reduce Master).
In step 203, after receiving the job file containing the serialized, Base64-encoded object, the job server decodes the Base64 sequence in the job file, deserializes it, and extracts and uses the object content.
In step 204, the job server passes the received job file containing the serialized, Base64-encoded object to each task server (Map/Reduce Slave).
In step 205, after receiving the job file containing the serialized, Base64-encoded object, the task server decodes the Base64 sequence in the job file, deserializes it, and extracts and uses the object content.
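Steps 201 to 205 can be walked through in a single process. This is a sketch under stated assumptions, not the patented implementation: `java.util.Properties` in its XML form stands in for job.xml, copying a byte array stands in for network transfer between nodes, and the class `ObjectTransferFlow`, its methods, and the property name `example.user.object` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.Properties;

final class ObjectTransferFlow {
    static final String KEY = "example.user.object"; // hypothetical property name

    // Step 201 (client): serialize, Base64-encode, write into the XML job file.
    static byte[] buildJobFile(Serializable obj) {
        try {
            ByteArrayOutputStream objectBytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(objectBytes)) {
                out.writeObject(obj);
            }
            Properties job = new Properties();
            job.setProperty(KEY,
                Base64.getEncoder().encodeToString(objectBytes.toByteArray()));
            ByteArrayOutputStream file = new ByteArrayOutputStream();
            job.storeToXML(file, "stand-in for job.xml");
            return file.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Steps 203 and 205 (job server, task server): decode, deserialize, extract.
    static Object extractObject(byte[] jobFile) {
        try {
            Properties job = new Properties();
            job.loadFromXML(new ByteArrayInputStream(jobFile));
            byte[] raw = Base64.getDecoder().decode(job.getProperty(KEY));
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(raw))) {
                return in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("alpha", "beta"));
        byte[] jobFile = buildJobFile(original);            // step 201
        byte[] atJobServer = jobFile.clone();               // step 202: submit
        Object seenByJobServer = extractObject(atJobServer); // step 203
        byte[] atTaskServer = atJobServer.clone();          // step 204: distribute
        Object seenByTaskServer = extractObject(atTaskServer); // step 205
        System.out.println(original.equals(seenByJobServer)
            && original.equals(seenByTaskServer));          // prints true
    }
}
```

Because the job server forwards the job file unchanged in step 204, the job server and every task server decode the same Base64 text and each reconstruct their own copy of the object.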
With the method for transmitting objects in a distributed computing system according to the present invention, the processing capability of jobs in the Map/Reduce system is effectively extended. When performing distributed computation, the Map/Reduce system is no longer limited to passing simple data structures such as character strings between nodes; it can transmit complex object (class) instances, which effectively enhances the processing capability of the whole Map/Reduce system and enables more advanced distributed computation.
Those of ordinary skill in the art will appreciate that the foregoing is merely a description of the preferred embodiments of the present invention and is not intended to limit the invention. Although the invention has been described in detail with reference to the foregoing embodiments, a person skilled in the art may still modify the technical solutions described in those embodiments or replace some of their technical features with equivalents. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (2)
1. A method for transmitting objects in a distributed computing system, the method comprising the following steps:
Serializing and encoding an object, and writing it into a job file;
Transmitting the job file containing the encoded sequence;
Decoding and deserializing the encoded sequence in the job file, and extracting and using the object content.
2. The method for transmitting objects in a distributed computing system according to claim 1, characterized in that, in the step of serializing and encoding the object, the client decomposes the object into a byte stream and then Base64-encodes the byte stream into ASCII characters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310512599.2A CN104572763A (en) | 2013-10-25 | 2013-10-25 | Method for object transmission in distributed computing system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310512599.2A CN104572763A (en) | 2013-10-25 | 2013-10-25 | Method for object transmission in distributed computing system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104572763A (en) | 2015-04-29 |
Family
ID=53088843
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310512599.2A Pending CN104572763A (en) | 2013-10-25 | 2013-10-25 | Method for object transmission in distributed computing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104572763A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109213745A (en) * | 2018-08-27 | 2019-01-15 | 郑州云海信息技术有限公司 | A kind of distributed document storage method, device, processor and storage medium |
CN109426651A (en) * | 2017-06-20 | 2019-03-05 | 北京小米移动软件有限公司 | The method and device of file conversion |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109426651A (en) * | 2017-06-20 | 2019-03-05 | 北京小米移动软件有限公司 | The method and device of file conversion |
CN109213745A (en) * | 2018-08-27 | 2019-01-15 | 郑州云海信息技术有限公司 | A kind of distributed document storage method, device, processor and storage medium |
CN109213745B (en) * | 2018-08-27 | 2022-04-22 | 郑州云海信息技术有限公司 | Distributed file storage method, device, processor and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20150429 |