CN104572763A - Method for object transmission in distributed computing system - Google Patents

Info

Publication number
CN104572763A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310512599.2A
Other languages
Chinese (zh)
Inventor
Inventor name not published (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Qunfeng Electronic Information Technology Co ltd
Original Assignee
Xi'an Qunfeng Electronic Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Qunfeng Electronic Information Technology Co ltd
Priority to CN201310512599.2A
Publication of CN104572763A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method for object transmission in a distributed computing system, comprising the following steps: serializing and encoding an object and writing it into a job file; transmitting the job file containing the encoded sequence; and decoding and deserializing the encoded sequence in the job file, and extracting and using the object content. To serialize and encode the object, the client decomposes it into a byte stream and then encodes the byte stream into ASCII characters. The job file is a standard XML file, and the object is encoded with Base64. The method allows complex object instances to be passed between the nodes of a distributed computing system, which strengthens the processing capability of the whole system and enables more advanced distributed computations.

Description

A method for object transmission in a distributed computing system
Technical Field
The present invention relates to distributed computing systems, and in particular to a method for transmitting objects in a distributed computing system.
Background
Distributed computing is a branch of computer science in which a problem that requires very large computing power is divided into many small parts, the parts are distributed to many computers for processing, and the partial results are finally combined into the final result.
Map/Reduce is a distributed computing platform for large-scale data processing. It is the most common distributed computing system at present, and it was originally designed and implemented by Google engineers. In their definition, Map/Reduce is a programming model and an associated implementation for processing and generating large data sets. The user defines a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values that share the same key. Many real-world tasks can be expressed in this model.
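By way of illustration only, the map and reduce functions described above can be sketched in plain Java as a single-process word count. The class name WordCountSketch and its methods are hypothetical and are not part of Hadoop or of the original disclosure; the sketch only shows how intermediate key/value pairs are grouped by key before being reduced.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process illustration of the map/reduce model; not Hadoop code.
class WordCountSketch {

    // map: one input line -> a batch of intermediate (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // reduce: all intermediate values that share one key -> one combined value.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: group the intermediate pairs by key, then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        grouped.forEach((key, values) -> result.put(key, reduce(key, values)));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b a", "b c")));
    }
}
```

In a real distributed system the grouping step is performed by the framework across nodes; here it is a local TreeMap so that the example stays self-contained.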
Hadoop, developed by the Apache Software Foundation, implements the distributed file system HDFS and the Map/Reduce distributed computing platform in Java. A user only needs to extend the base class MapReduceBase provided by the system, implement a Map class and a Reduce class, and register a Job; the customized task then runs in a distributed manner automatically.
In the Hadoop implementation of Map/Reduce, each job requires the configuration information needed at run time to be passed between nodes. This is done through job.xml: the side that starts the job writes the information needed at run time, such as the job name, the input/output formats, and the number of Map/Reduce tasks, into a job.xml file, which is then delivered to the different nodes of the system. job.xml is a standard XML (Extensible Markup Language) file in which every piece of information is stored as an XML element. The other nodes in the system read the relevant information from this job.xml file to configure the part of the job that runs on them, which is how Map/Reduce actually runs in a distributed manner.
Fig. 1 is a schematic diagram of the basic structure of an existing distributed computing system. As shown in Fig. 1, the system comprises a client, a job server, and task servers. The client writes the job, together with its specific content and configuration, into job.xml, submits it to the job server, and monitors execution. The job server, called the JobTracker or Master in Hadoop, distributes the job file (the XML file) to multiple task servers and manages all jobs running under the framework. The task servers execute the user-defined operations. Each job is split into many tasks, including Map tasks and Reduce tasks; a task is the basic unit of execution and must be assigned to a suitable task server. While executing, each task server reports the state of its tasks to the job server, which uses these reports to track the overall progress of the job and to assign new tasks.
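By way of illustration only, the exchange of simple settings through a job file can be sketched as follows. The class JobFileSketch is hypothetical, the configuration/property/name/value element layout follows the style commonly seen in Hadoop configuration files, and the property keys used in main are example names that should be treated as assumptions rather than as a statement of what any particular Hadoop version writes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch of writing and reading simple settings as XML elements.
class JobFileSketch {

    // Escape the characters that XML reserves inside element text.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Originating side: store every setting as one <property> element.
    static String write(Map<String, String> settings) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<configuration>\n");
        for (Map.Entry<String, String> setting : settings.entrySet()) {
            xml.append("  <property><name>").append(escape(setting.getKey()))
               .append("</name><value>").append(escape(setting.getValue()))
               .append("</value></property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }

    // Receiving side: parse the file and collect the settings again.
    static Map<String, String> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            Map<String, String> settings = new LinkedHashMap<>();
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                settings.put(
                        property.getElementsByTagName("name").item(0).getTextContent(),
                        property.getElementsByTagName("value").item(0).getTextContent());
            }
            return settings;
        } catch (Exception e) {
            throw new IllegalStateException("job file could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("mapred.job.name", "word count");   // example key, an assumption
        settings.put("mapred.reduce.tasks", "4");        // example key, an assumption
        System.out.println(read(write(settings)).equals(settings));
    }
}
```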
The existing distributed computing system, that is, Hadoop Map/Reduce, cannot transmit object (class) instances. job.xml can only carry a limited set of simple data types, such as int, long, float, String, and boolean. XML restricts the characters that may appear in transmitted text, that is, the characters inside each element, so an arbitrary buffer in memory cannot simply be copied into the XML file: doing so would cause XML encoding or decoding to fail, and the transmission would not succeed.
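By way of illustration only, the character restriction can be made concrete by counting how many bytes of a serialized Java object fall outside the single-byte part of the XML 1.0 character range (tab, line feed, carriage return, and 0x20 and above). The class RawBytesInXmlCheck is hypothetical. Bytes of 0x80 and above raise separate character-encoding issues that this sketch deliberately ignores.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical check of why a raw serialized buffer cannot be pasted into XML text.
class RawBytesInXmlCheck {

    // Single-byte subset of the XML 1.0 Char production: #x9, #xA, #xD, #x20 and above.
    static boolean isLegalXmlByte(int unsignedByte) {
        return unsignedByte == 0x09 || unsignedByte == 0x0A
                || unsignedByte == 0x0D || unsignedByte >= 0x20;
    }

    // Standard Java serialization into an in-memory buffer.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Count the bytes that an XML 1.0 parser would reject as characters.
    static int countIllegalBytes(byte[] data) {
        int illegal = 0;
        for (byte b : data) {
            if (!isLegalXmlByte(b & 0xFF)) {
                illegal++;
            }
        }
        return illegal;
    }

    public static void main(String[] args) {
        byte[] raw = serialize("hello");
        System.out.println(countIllegalBytes(raw) + " of " + raw.length
                + " bytes are not legal XML 1.0 characters");
    }
}
```

The serialization stream header alone contains the bytes 0x00 and 0x05, so even the smallest serialized object includes characters that XML 1.0 forbids.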
User applications, however, are not limited to the simple data types above. They often need to pass complex object (class) instances between the nodes of the Map/Reduce system in order to perform more advanced distributed computations, and the Hadoop implementation of Map/Reduce does not currently provide this capability.
Summary of the Invention
To overcome the deficiencies of the prior art, the object of the present invention is to provide a method for object transmission in a distributed computing system that allows complex object instances to be passed between the nodes of the system.
To achieve this object, the method for object transmission in a distributed computing system provided by the invention comprises the following steps:
serializing and encoding the object, and writing it into a job file;
transmitting the job file containing the encoded sequence;
decoding and deserializing the encoded sequence in the job file, and extracting and using the object content.
Serializing and encoding the object means that the client decomposes the object into a byte stream and then encodes the byte stream, converting it into ASCII characters.
The job file is a standard XML file.
The object is encoded with Base64.
The step of transmitting the job file containing the encoded sequence further comprises: the client sending the job file to the job server, and the job server sending the job file to the task servers.
The step of decoding and deserializing the encoded sequence in the job file, and extracting and using the object content, further comprises: the job server decoding and deserializing the encoded sequence and extracting and using the object content, and each task server doing the same.
Object serialization converts an object that implements the Serializable interface into a byte sequence.
The method provided by the invention solves the problem that object instances cannot be transmitted in the most common distributed computing system at present, namely Hadoop Map/Reduce. It extends the processing capability of jobs in the Map/Reduce system: when performing distributed computation, the system is no longer limited to passing simple data such as character strings between nodes, but can pass complex object (class) instances. This strengthens the processing capability of the whole Hadoop Map/Reduce distributed computing system and enables more advanced distributed computations.
Other features and advantages of the invention are set forth in the description that follows; in part they will be apparent from the description, or may be learned by practicing the invention.
Brief Description of the Drawings
The accompanying drawings provide a further understanding of the invention and form part of the specification. Together with the embodiments, they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is a schematic diagram of the basic structure of an existing distributed computing system;
Fig. 2 is a flow chart of the method for object transmission in a distributed computing system according to the invention.
Detailed Description
Preferred embodiments of the invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the invention and do not limit it.
To transmit a user's object (class) instances in the Hadoop Map/Reduce system and extend the processing capability of jobs in the existing Map/Reduce system, and thereby strengthen the processing capability of the whole system, the present application adopts a technical solution that combines Java object serialization with Base64 encoding and decoding.
Java object serialization converts an object that implements the Serializable interface into a byte sequence that can later be restored completely to the original object. Serialization is thus the process of writing an object to a byte stream and reading it back from a byte stream. Once the object state has been converted into a byte stream, the byte stream classes in the java.io package can save it to a file, pipe it to another thread, or send the object data to another host over a network connection. The process has two parts. Serialization, the first part, decomposes the data into a byte stream so that it can be stored in a file or transmitted over a network. Deserialization opens the byte stream and reconstructs the object. Object serialization must not only convert primitive data types into their byte representation but also allow the data to be recovered, and recovering the data requires reconstructing the object instance that holds it.
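By way of illustration only, the two halves of the process can be sketched with the standard ObjectOutputStream and ObjectInputStream classes. The classes SerializationRoundTrip and SamplingPlan are hypothetical stand-ins for a user-defined object and do not appear in the original disclosure.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical round trip: object -> byte sequence -> reconstructed object.
class SerializationRoundTrip {

    // Hypothetical user-defined class; implementing Serializable is what makes it eligible.
    static class SamplingPlan implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        final int[] buckets;

        SamplingPlan(String label, int[] buckets) {
            this.label = label;
            this.buckets = buckets;
        }
    }

    // Serialization: decompose the object into a byte stream.
    static byte[] serialize(Serializable object) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return buffer.toByteArray();
    }

    // Deserialization: open the byte stream and reconstruct the object.
    static <T> T deserialize(byte[] bytes, Class<T> type) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return type.cast(in.readObject());
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        SamplingPlan original = new SamplingPlan("demo", new int[] {1, 2, 3});
        SamplingPlan copy = deserialize(serialize(original), SamplingPlan.class);
        System.out.println(copy.label + " " + Arrays.toString(copy.buckets));
    }
}
```

The receiving side needs the same class on its classpath; otherwise readObject fails with ClassNotFoundException, which the sketch rethrows as an unchecked exception.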
Base64 is one of the most common encodings for transmitting 8-bit bytes over a network. RFC 2045 defines it as follows: the Base64 Content-Transfer-Encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable. Base64 also has a historical origin: e-mail was originally only allowed to carry ASCII characters, that is, only the low 7 bits of each octet. Base64 converts every three 8-bit bytes into four 6-bit groups (3 × 8 = 4 × 6 = 24) and then pads each 6-bit group with two high-order zero bits to form four 8-bit bytes. In theory, therefore, the encoded string is one third longer than the original.
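By way of illustration only, the 3-to-4 conversion and the resulting length can be checked with the standard java.util.Base64 class. The class Base64Arithmetic and the helper encodedLength are hypothetical; the formula assumes the padded form of Base64, in which every group of up to three input bytes yields four output characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demonstration of the Base64 size arithmetic.
class Base64Arithmetic {

    // Padded Base64 length: each group of up to 3 input bytes becomes 4 characters.
    static int encodedLength(int inputBytes) {
        return 4 * ((inputBytes + 2) / 3);
    }

    // Encode with the standard alphabet (A-Z, a-z, 0-9, '+', '/', '=' padding).
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // Three 8-bit bytes (24 bits) map to exactly four 6-bit symbols.
        System.out.println(encode("Man".getBytes(StandardCharsets.US_ASCII)));

        // 300 input bytes grow by one third, to 400 characters.
        byte[] block = new byte[300];
        System.out.println(encode(block).length() + " == " + encodedLength(block.length));
    }
}
```

When the input length is not a multiple of three, padding makes the overhead slightly larger than one third; a single input byte, for example, still produces four characters.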
The Hadoop Map/Reduce system uses job.xml to pass information between task servers (computing nodes), and the byte stream produced by object serialization cannot be written directly into an XML element. The byte stream can, however, be Base64-encoded into legal ASCII characters that can be written into an XML element and transmitted between the task servers of the Map/Reduce system. A task server that receives job.xml first Base64-decodes the content of the element and then deserializes the resulting byte stream to obtain the required object instance. In this way objects are passed between the task servers (computing nodes) of the Hadoop Map/Reduce system.
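By way of illustration only, the combination of serialization, Base64 encoding, and an XML element can be sketched as follows. The class ObjectTextCodec and the bare value element are hypothetical simplifications; an actual job file would hold the encoded text inside whatever element structure the system uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical codec: object <-> Base64 text that is safe inside an XML element.
class ObjectTextCodec {

    // Serialize the object, then Base64-encode the resulting byte stream.
    static String encode(Serializable object) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // Base64-decode the text, then deserialize the byte stream.
    static Object decode(String text) {
        byte[] bytes = Base64.getDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    // The Base64 alphabet contains no '<', '>' or '&', so no XML escaping is needed.
    static String toValueElement(Serializable object) {
        return "<value>" + encode(object) + "</value>";
    }

    // Parse the element with a real XML parser and recover the object from its text.
    static Object fromValueElement(String xml) {
        try {
            String text = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement().getTextContent();
            return decode(text);
        } catch (Exception e) {
            throw new IllegalStateException("element could not be read", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("x", "y"));
        System.out.println(fromValueElement(toValueElement(original)).equals(original));
    }
}
```

Deserializing data from an untrusted source is a known security risk in Java, so a sketch like this is only appropriate where every node that writes the job file is trusted.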
Fig. 2 is a flow chart of the method for object transmission in a distributed computing system according to the invention. The method is described in detail below with reference to Fig. 2.
First, in step 201, the client (Map/Reduce Client), as the originator of the job, serializes the object into a byte stream, Base64-encodes the byte stream into legal ASCII characters that can be written into an XML element, and writes them into the job file (job.xml). The job file is a standard XML (Extensible Markup Language) file in which every piece of information is stored as an XML element.
In step 202, the client submits the job file (job.xml) containing the serialized, Base64-encoded object to the job server (Map/Reduce Master).
In step 203, after receiving the job file containing the serialized, Base64-encoded object, the job server decodes the Base64 sequence in the job file, deserializes the result, and extracts and uses the object content.
In step 204, the job server passes the received job file containing the serialized, Base64-encoded object to each task server (Map/Reduce Slave).
In step 205, after receiving the job file containing the serialized, Base64-encoded object, each task server decodes the Base64 sequence in the job file, deserializes the result, and extracts and uses the object content.
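By way of illustration only, steps 201 to 205 can be simulated in a single process. The class TransferFlowSketch, the key example.user.object, and the use of the java.util.Properties XML format as a stand-in for job.xml are all assumptions made for this sketch; network transfer between the client, the job server, and the task server is modeled as copying a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.Properties;

// Hypothetical single-process walk through steps 201 to 205.
class TransferFlowSketch {

    static final String OBJECT_KEY = "example.user.object"; // hypothetical key name

    // Step 201: serialize, Base64-encode, and write the text into an XML job file.
    static byte[] clientWriteJobFile(Serializable userObject) {
        try {
            ByteArrayOutputStream objectBytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(objectBytes)) {
                out.writeObject(userObject);
            }
            Properties job = new Properties();
            job.setProperty(OBJECT_KEY,
                    Base64.getEncoder().encodeToString(objectBytes.toByteArray()));
            ByteArrayOutputStream jobFile = new ByteArrayOutputStream();
            job.storeToXML(jobFile, "stand-in for job.xml");
            return jobFile.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("job file could not be written", e);
        }
    }

    // Steps 202 and 204: transmission, modeled here as a plain copy of the file.
    static byte[] send(byte[] jobFile) {
        return jobFile.clone();
    }

    // Steps 203 and 205: a receiving node decodes, deserializes, and extracts the object.
    static Object nodeReadObject(byte[] jobFile) {
        try {
            Properties job = new Properties();
            job.loadFromXML(new ByteArrayInputStream(jobFile));
            byte[] objectBytes = Base64.getDecoder().decode(job.getProperty(OBJECT_KEY));
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(objectBytes))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("job file could not be read", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> userObject = new ArrayList<>(Arrays.asList("alpha", "beta"));
        byte[] atJobServer = send(clientWriteJobFile(userObject)); // steps 201 and 202
        Object seenByJobServer = nodeReadObject(atJobServer);      // step 203
        byte[] atTaskServer = send(atJobServer);                   // step 204
        Object seenByTaskServer = nodeReadObject(atTaskServer);    // step 205
        System.out.println(seenByJobServer.equals(userObject)
                && seenByTaskServer.equals(userObject));
    }
}
```

The job server forwards the job file unchanged, so the task server decodes exactly the same Base64 text that the client wrote.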
With the method for object transmission in a distributed computing system according to the invention, the processing capability of jobs in the Map/Reduce system is effectively extended. When performing distributed computation, the system is no longer limited to passing simple data such as character strings between nodes, but can pass complex object (class) instances, which strengthens the processing capability of the whole Map/Reduce system and enables more advanced distributed computations.
A person of ordinary skill in the art will understand that the foregoing describes only preferred embodiments of the invention and does not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, a person skilled in the art may still modify the technical solutions described in those embodiments or replace some of their technical features with equivalents. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (2)

1. A method for object transmission in a distributed computing system, comprising the following steps:
serializing and encoding the object, and writing it into a job file;
transmitting the job file containing the encoded sequence;
decoding and deserializing the encoded sequence in the job file, and extracting and using the object content.
2. The method for object transmission in a distributed computing system according to claim 1, characterized in that serializing and encoding the object means that the client decomposes the object into a byte stream and then Base64-encodes the byte stream, converting it into ASCII characters.
CN201310512599.2A 2013-10-25 2013-10-25 Method for object transmission in distributed computing system Pending CN104572763A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310512599.2A CN104572763A (en) 2013-10-25 2013-10-25 Method for object transmission in distributed computing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310512599.2A CN104572763A (en) 2013-10-25 2013-10-25 Method for object transmission in distributed computing system

Publications (1)

Publication Number Publication Date
CN104572763A true CN104572763A (en) 2015-04-29

Family

ID=53088843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310512599.2A Pending CN104572763A (en) 2013-10-25 2013-10-25 Method for object transmission in distributed computing system

Country Status (1)

Country Link
CN (1) CN104572763A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109426651A (en) * 2017-06-20 2019-03-05 北京小米移动软件有限公司 The method and device of file conversion
CN109213745A (en) * 2018-08-27 2019-01-15 郑州云海信息技术有限公司 A kind of distributed document storage method, device, processor and storage medium
CN109213745B (en) * 2018-08-27 2022-04-22 郑州云海信息技术有限公司 Distributed file storage method, device, processor and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150429