CN103412739A - Data transmission method and system based on seismic data processing - Google Patents

Data transmission method and system based on seismic data processing

Info

Publication number
CN103412739A
Authority
CN
China
Prior art keywords
data
module
socket
port
write
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310382046XA
Other languages
Chinese (zh)
Inventor
赵太银
王建
胡光岷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201310382046XA priority Critical patent/CN103412739A/en
Publication of CN103412739A publication Critical patent/CN103412739A/en
Pending legal-status Critical Current

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a data transmission method and system based on seismic data processing, belonging to the field of data transmission. The transmission mode is chosen by determining whether two processing function modules run on the same compute node: if they do, a multithreaded shared-buffer transmission mode is used; if they run on different compute nodes, a multi-process socket transmission mode is used. The method and system address, to some extent, the shortcomings of existing execution control systems, which do not truly achieve pipeline-parallel job execution and do not account for differences in data processing speed between modules.

Description

A data transmission method and system based on seismic data processing
Technical field
The invention belongs to the field of data transmission, and specifically relates to a data transmission method and system based on seismic data processing.
Background
As computer technology develops, the performance of a single computer keeps rising. When processing massive data sets, however, continually strengthening a single computer cannot keep up with the growing demands of applications. A typical large-scale seismic data set can reach hundreds of GB, the computations performed on it are very complex, and a single seismic data processing run can take up to tens of days. Parallel computing and distributed computing offer a solution.
Parallel computing is the simultaneous use of multiple computing resources to solve a computational problem. It can be divided into parallelism in time and parallelism in space. Parallelism in time refers to pipelining; parallelism in space refers to multiple processors computing concurrently. Distributed computing decomposes a large computing task according to some rule, has many computers compute the parts separately, and then combines the partial results into a final result. Its basic idea is divide and conquer: a task that a single computer cannot complete is completed jointly by a group of computers, and as the problem grows, the required computing time is met simply by adding compute nodes, without modifying the original algorithm or program.
For processing large-scale data such as seismic data, the mainstream parallel programming models at present are Google's MapReduce and Microsoft's Dryad. Only Dryad, which is relevant to this scheme, is briefly introduced here.
Dryad describes a computing task as a directed acyclic graph (DAG). Each vertex of the graph represents a data processing step, each edge represents a data channel, and the direction of an edge represents the direction of data flow during processing. Dryad gives the user a fairly clear programming framework: the programmer expresses the task as a DAG and writes a serial program for each vertex. Compared with the key-value model of MapReduce, Dryad's DAG abstraction is more general and more flexible.
Similarly, in seismic data processing, the various operations on seismic data can be described, following the Dryad model, as a seismic job, and a job execution control system for processing seismic data can be designed around it. A seismic job usually combines several function modules, each completing a specific data processing step, according to a given workflow; each function module performs only a single data processing task. By analyzing the Dryad programming model and abstracting the seismic job, a seismic job can be described as a DAG: vertices represent the different processing function modules, edges represent data dependencies between modules, and edge directions represent the direction of data flow between modules.
Prior art related to the present invention:
Temporary file technique:
This is the simplest implementation of a data transmission system. After the preceding module finishes processing its input data, it writes the result to a temporary file; the following module then reads the data from the temporary file as its input and carries out the subsequent processing.
For the sake of generality, because some modules depend on the complete data set, the next module can read from the temporary file and start processing only after the previous module has finished and written its result to the temporary file. After processing is complete, the corresponding temporary file is deleted to reduce disk usage.
Inter-process communication techniques:
Inter-process communication (IPC) refers to techniques by which processes exchange large amounts of information through dedicated communication mechanisms.
Taking Linux as an example, the main IPC mechanisms are briefly introduced below:
Pipe and named pipe: a pipe can be used for communication between related processes. A named pipe overcomes the restriction that a pipe has no name, so in addition to the functions of a pipe it also allows communication between unrelated processes;
Signal: a signal is a relatively complex communication mechanism used to notify the receiving process that some event has occurred. Besides inter-process communication, a process can also send a signal to itself;
Message queue: a message queue is a linked list of messages and comes in POSIX and System V variants. A process with sufficient permission can add messages to the queue, and a process granted read permission can read messages from it. Message queues overcome the shortcomings that signals carry little information and that pipes carry only unformatted byte streams and have limited buffer size.
Shared memory: allows multiple processes to access the same memory region and is the fastest available form of IPC. It was designed to address the lower efficiency of the other communication mechanisms. It is often combined with other mechanisms, such as semaphores, to achieve synchronization and mutual exclusion between processes.
Semaphore: mainly used as a synchronization mechanism between processes and between threads of the same process.
Socket: a more general inter-process communication mechanism that can be used between processes on different machines.
Most current execution control systems use the pipe stream technique: each module runs as a separate process, and processes are connected by pipes provided by the system, that is, the standard output of one process is connected to the standard input of the next, with the job run controlled from the command line.
Problems with the prior art:
With the temporary file technique, as shown in Fig. 3, the whole execution control system executes the job as a blocking pipeline and cannot effectively use multi-core parallelism. In addition, since seismic data sets reach hundreds of GB and the data to be processed keeps growing, a complex job still produces many very large intermediate files during execution, even if each temporary file is deleted once the operation on it has completed. In short, using temporary files for inter-module data transfer fails to exploit the system's parallelism, has low transfer efficiency, and wastes a great deal of disk space.
With the pipe stream technique, as shown in Fig. 4, the job run is controlled from the command line. This mode is rigid, and a module with multiple outputs requires multiple pipes, which makes the pipe connections complicated. For a module with multiple inputs, the pipe stream technique cannot meet practical requirements at all. In short, the pipe stream technique can handle some simple jobs but cannot meet the needs of complex ones.
Furthermore, a centralized execution control system generally runs each function module as a thread; using inter-process communication for data transfer between function modules is then a poor fit and incurs relatively high overhead. For a distributed execution control system, enterprises generally lack powerful underlying support such as a distributed file system, so data transfer between different compute nodes can only be implemented with socket network programming.
Summary of the invention
The object of the present invention is to propose a data transmission method and system based on seismic data processing that overcomes the shortcomings of existing execution control systems, which do not truly achieve pipeline-parallel job execution and do not account for differences in data processing speed between modules.
To achieve the above object, the technical scheme of the present invention is as follows: a data transmission method for seismic data processing, in which the transmission mode is chosen by determining whether two processing function modules run on the same compute node. If the two modules run on the same compute node, the multithreaded shared-buffer transmission mode is used; if they run on different compute nodes, the multi-process socket transmission mode is used.
Further, in the multithreaded shared-buffer transmission mode, the buffer read process is as follows:
(0) Initialize;
(1) Check whether the buffer has been initialized; if so, proceed to the next step, otherwise return to (0);
(2) If the data spans the wrap-around point (where the tail of the buffer meets its head), read the data in two parts; otherwise read it directly;
(3) Read the data;
(4) Return the number of bytes read.
Further, in the multithreaded shared-buffer transmission mode, the buffer write process is as follows:
(0) Initialize;
(1) Check whether the buffer has been initialized; if so, proceed to the next step, otherwise return to (0);
(2) Obtain the maximum number of bytes that can currently be written to the buffer;
(3) Compare the length of the data to be written with the writable length; if the data is longer than the writable length, go to step (4), otherwise go to step (6);
(4) Determine whether the data may be partially written; if so, proceed to the next step, otherwise return to (0);
(5) When the data is longer than the writable length and partial writes are allowed, write part of the data according to the available buffer space; the length written equals the writable length;
(6) If the data spans the wrap-around point, write it in two parts; otherwise write it directly;
(7) Write the data;
(8) Return the number of bytes written.
Further, in the multi-process socket transmission mode, the data receiving process is as follows:
(1) Initialize: parse the job file and obtain information about the connected topology output ports;
(2) Write the module's own PORT_ADDR information to the database;
(3) Listen on the corresponding socket port and wait to accept TCP connections;
(4) After accepting a TCP connection, read data continuously;
(5) Perform the data processing operation;
(6) Check whether the completion flag has been received; if so, go to (7), otherwise go to (4);
(7) Delete the corresponding PORT_ADDR information from the database, close the TCP connection, and exit the processing function module.
Further, in the multi-process socket transmission mode, the data sending process is as follows:
(1) Initialize: parse the job file and obtain information about the connected topology input ports;
(2) Query the database for the PORT_ADDR information of the topology output port; the PORT_ADDR fields comprise the job id, the processing function module id, the topology input port id of the processing function module, the IP address of the node on which the module runs, and the socket port number of the input port;
(3) Using the IP address and port number obtained in the previous step, initiate a TCP connection; once the connection is established, store the socket in a container;
(4) Send data continuously over the connections stored in the container;
(5) Check whether all data has been sent; if so, go to (6), otherwise go to (4);
(6) Send the data-complete flag, close the TCP connection, and exit the processing function module.
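The send and receive loops above pair up through the completion flag. The following is a minimal sketch of steps (4) to (6) on each side, assuming an in-memory queue as a stand-in for a TCP connection; `Frame`, `Connection`, `sendAll`, and `receiveAll` are hypothetical names that do not appear in the patent.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical frame: either a data chunk or the completion flag.
struct Frame {
    bool done;            // true marks the data-complete flag
    std::string payload;  // data chunk, empty for the completion frame
};

// In-memory stand-in for one established TCP connection (assumption).
using Connection = std::deque<Frame>;

// Sender steps (4)-(6): send every chunk to every stored connection,
// then send the completion flag on each connection.
inline void sendAll(std::vector<Connection>& conns,
                    const std::vector<std::string>& chunks) {
    for (const auto& chunk : chunks)
        for (auto& conn : conns) conn.push_back({false, chunk});
    for (auto& conn : conns) conn.push_back({true, ""});
}

// Receiver steps (4)-(6): read and process frames until the completion
// flag arrives. Returns the number of data frames processed.
inline std::size_t receiveAll(Connection& conn,
                              const std::function<void(const std::string&)>& process) {
    std::size_t count = 0;
    while (!conn.empty()) {
        Frame f = std::move(conn.front());
        conn.pop_front();
        if (f.done) break;   // completion flag received
        process(f.payload);  // step (5): data processing operation
        ++count;
    }
    return count;
}
```

In a real deployment the queue would be replaced by socket reads and writes, and the database registration in steps (1) to (3) would run before these loops.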
To achieve the above object, another technical scheme of the present invention is as follows: a data transmission system based on seismic data processing, characterized in that it comprises two modules: a shared-buffer transport module for communication within a compute node, and a socket transport module for communication between compute nodes.
Further, the shared-buffer transport module comprises an output port, which manages buffer allocation and release and writes data, and an input port, which reads data. One output port can be connected to any number of input ports, and the output port tracks the read status of its associated input ports.
Further, the socket transport module comprises a user interface, a PORT_ADDR information query module, a message serialization module, a socket manager, and a TCP socket implementation module;
The user interface reads data of a specified size from the topology output port connected to a specified topology input port, and sends data of a specified size to the topology input ports connected to a specified topology output port;
The PORT_ADDR information query module handles the exchange of information between the output ports and input ports of processing function modules, so that a processing function module can look up the destination to which its processed data is to be sent;
The message serialization module ensures correct and reliable data transmission over complex or heterogeneous networks;
The socket manager manages the socket objects currently in use and provides network operations such as looking up destination nodes and establishing TCP connections;
The TCP socket implementation module wraps the operating system's socket objects and provides an interface for operating on sockets.
Beneficial effects of the present invention: the invention supports both centralized and distributed execution control systems and gives module developers a single unified programming interface, so its operation is fully transparent to the upper layers. When execution control is centralized, that is, all modules run on the same compute node and the function modules run as threads, the invention places a shared buffer between two modules as the buffer pool for their data exchange. Compared with the traditional approach of running function modules as multiple processes and exchanging data through inter-process communication, this has lower overhead and higher efficiency. When execution control is distributed, that is, modules run on different compute nodes as separate processes, inter-process communication across hosts is the only option, so the invention uses the socket interface to build a data transmission module for inter-module data exchange.
The transmission method and system for seismic data processing proposed by the invention solve, to some extent, the problems that existing execution control systems do not truly achieve pipeline-parallel job execution and do not account for differences in data processing speed between modules.
Details are as follows:
1. The system is decoupled from the other components and gives module developers a single unified programming interface. Its operation is therefore fully transparent to the upper layers, and it can be replaced by another transmission system at any time without module developers making any change.
2. Centralized execution control systems are supported. Multithreading with a shared buffer enables efficient data transfer and improves job processing efficiency through pipelined parallel processing. This suits jobs with few processing modules and a simple flow, and avoids the overhead of inter-process data communication.
3. Distributed execution control systems are supported. When a job has too many processing modules, the performance of a single node is ultimately limited, and creating too many threads and shared buffers within one process does not greatly improve overall efficiency. In that case the multi-process mode can make full use of the resources of every compute node; although it inevitably adds data communication overhead, it helps balance the load across compute nodes.
Brief description of the drawings
Fig. 1 is a schematic diagram of the job workflow of the present invention;
Fig. 2 is a schematic diagram of inter-module data transfer of the present invention;
Fig. 3 is a schematic diagram of the temporary file mode of the present invention;
Fig. 4 is a schematic diagram of the pipe stream mode of the present invention;
Fig. 5 is an overall schematic diagram of the cluster system of the present invention;
Fig. 6 is the overall flowchart for writing data of the present invention;
Fig. 7 is the overall flowchart for reading data of the present invention;
Fig. 8 is the flowchart for reading buffered data of the present invention;
Fig. 9 is the flowchart for writing buffered data of the present invention;
Fig. 10 is a schematic diagram of a buffer write that does not span the wrap-around point of the ring buffer;
Fig. 11 is a schematic diagram of a buffer write that spans the wrap-around point of the ring buffer;
Fig. 12 is a schematic diagram of a buffer read that does not span the wrap-around point of the ring buffer;
Fig. 13 is a schematic diagram of a buffer read that spans the wrap-around point of the ring buffer;
Fig. 14 is a schematic diagram of the socket module framework of the present invention;
Fig. 15 is the flowchart of the topology input port of the present invention;
Fig. 16 is the flowchart of the topology output port of the present invention.
Detailed description of the embodiments
To make the purpose, technical scheme, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments.
Principle:
As shown in Fig. 1, the present invention is based on a job execution control system. When a job is executed, all processing function modules are started as threads or processes according to the directed graph and carry out their data processing work. Each box represents a specific processing function module, which is concerned only with its own processing of the input seismic data; each arrow represents the flow of seismic data through the system. Designing an efficient data transfer mechanism between two processing function modules is therefore key to the efficiency of the whole system.
Taking Fig. 2 as an example, once the job has started, module k and module k+1 are both running, each focused on its own data processing function. After module k finishes processing its input data, it needs to pass the result to the next module, and module k+1 needs to receive module k's result as its own input for subsequent processing. How to design the seismic data transfer between the two processing function modules, and give module developers a unified, easy-to-use interface, is the problem the present invention solves.
As shown in Fig. 5, the cluster system comprises a server node and a number of compute nodes, which are interconnected by a network to form a distributed cluster.
The message server provides a unified communication service for the whole cluster, with the message as the basic unit; it also provides functions such as communication registration, deregistration, and lookup.
The job server is responsible for managing and scheduling all jobs in the cluster.
The execution control system is responsible for running the concrete processing flow of a job. The data processing function modules of a job may run on the same compute node, like module 1 and module 2 in Fig. 5, or on different compute nodes, like module 3 in Fig. 5. When two adjacent processing function modules are on the same compute node, they exchange data through a shared buffer; when they are on different compute nodes, they exchange data through the socket transport module. In short, the data transmission system comprises two main modules: a shared-buffer transport module for communication within a compute node and a socket transport module for communication between compute nodes. The overall flow of the data transmission system based on seismic data processing is shown in Fig. 6 and Fig. 7.
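The selection rule described above reduces to a single comparison per edge of the job graph. A minimal sketch, assuming compute nodes are identified by a string such as a host name; `TransportMode` and `chooseTransport` are hypothetical names used only for illustration.

```cpp
#include <string>

// Hypothetical names; the patent does not define these identifiers.
enum class TransportMode { SharedBuffer, Socket };

// Picks the transport for one edge of the job graph from the nodes on
// which the upstream and downstream processing function modules run.
inline TransportMode chooseTransport(const std::string& upstreamNode,
                                     const std::string& downstreamNode) {
    return upstreamNode == downstreamNode ? TransportMode::SharedBuffer
                                          : TransportMode::Socket;
}
```

Because the decision depends only on module placement, it can be made once when the job is set up, which is one way to keep it hidden behind the unified interface that module developers see.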
The shared-buffer transport module and the socket transport module are described in detail below:
Shared-buffer transport module:
Features of the shared-buffer transport module:
1. The shared-buffer transport module has one thread for output, and the output data can be read by zero or more input ports. The number of input ports is determined by the processing function module topology and the job requirements.
2. A signaling mechanism is provided: it notifies the processing function module that it may continue to output data, or notifies an input port that new data is available to read.
3. A ring buffer is used. The buffer size can be set by the user, with a maximum of 1 GB, a minimum of 4 KB, and a default of 4 MB. Data can be read and written in whole or in part, which improves read/write efficiency to some extent.
4. The shared-buffer transport module performs no transactional processing on the data (it ignores data type and size and is responsible only for exchanging data), which makes the module more independent and maintainable to some extent.
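The size limits in item 3 can be enforced when the buffer is created. A minimal sketch, assuming the limits are in bytes, that a request of 0 means "use the default", and that out-of-range requests are clamped rather than rejected; none of these choices is stated in the text, and all names are hypothetical. `std::clamp` requires C++17.

```cpp
#include <algorithm>
#include <cstddef>

// Limits taken from the text; interpreting them as bytes is an assumption.
constexpr std::size_t kMinBufferBytes     = 4u * 1024;            // 4 KB
constexpr std::size_t kDefaultBufferBytes = 4u * 1024 * 1024;     // 4 MB
constexpr std::size_t kMaxBufferBytes     = 1024u * 1024 * 1024;  // 1 GB

// Resolves the user-requested size to a size inside the allowed range.
// Assumption: 0 selects the default; other values are clamped.
inline std::size_t resolveBufferSize(std::size_t requested) {
    if (requested == 0) return kDefaultBufferBytes;
    return std::clamp(requested, kMinBufferBytes, kMaxBufferBytes);
}
```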
The flow by which the shared-buffer transport module reads buffered data is shown in Fig. 8:
(0) Initialize; the initialization here is a routine procedure in the art and is not described in detail;
(1) Check whether the buffer has been initialized; if so, proceed to the next step, otherwise return to (0);
(2) If the data spans the wrap-around point (where the tail of the buffer meets its head), read the data in two parts; otherwise read it directly;
(3) Read the data (copy the data);
(4) Return the number of bytes read.
The buffer read flow does not check the end position; determining whether data is readable and whether a partial read is allowed is done before the buffer read function is called.
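The read flow above can be sketched as follows. This is an illustration under assumptions, not the patented implementation: `Ring` and `ringRead` are hypothetical names, the buffer is single-threaded here, and the read length is defensively clamped to the unread byte count even though the text leaves that check to the caller.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical ring-buffer state.
struct Ring {
    std::vector<char> storage;  // fixed-size backing store; empty means uninitialized
    std::size_t head = 0;       // index of the oldest unread byte
    std::size_t used = 0;       // number of unread bytes
};

// Copies up to `want` unread bytes into `out` and returns the count.
inline std::size_t ringRead(Ring& r, char* out, std::size_t want) {
    if (r.storage.empty()) return 0;                        // step (1): not initialized
    const std::size_t cap = r.storage.size();
    const std::size_t n = std::min(want, r.used);           // defensive clamp (assumption)
    if (n == 0) return 0;
    const std::size_t first = std::min(n, cap - r.head);    // step (2): bytes before the wrap point
    std::memcpy(out, r.storage.data() + r.head, first);     // step (3): first copy
    std::memcpy(out + first, r.storage.data(), n - first);  // second copy, zero bytes if no wrap
    r.head = (r.head + n) % cap;
    r.used -= n;
    return n;                                               // step (4)
}
```

When the unread region does not cross the end of the storage, `first` equals `n` and the second copy moves zero bytes, so both cases share one code path.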
The flow by which the shared buffer writes buffered data is shown in Fig. 9:
(0) Initialize; as above, the initialization here is a routine procedure in the art and is not described in detail;
(1) Check whether the buffer has been initialized; if so, proceed to the next step, otherwise return to (0);
(2) Obtain the maximum number of bytes that can currently be written to the buffer;
(3) Compare the length of the data to be written with the writable length; if the data is longer than the writable length, go to step (4), otherwise go to step (6);
(4) Determine whether the data may be partially written; if so, proceed to the next step, otherwise return to (0);
(5) When the data is longer than the writable length and partial writes are allowed, write part of the data according to the available buffer space; the length written equals the writable length;
(6) If the data spans the wrap-around point, write it in two parts; otherwise write it directly;
(7) Write the data (copy the data);
(8) Return the number of bytes written.
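The write flow above can be sketched in the same style. `Ring` and `ringWrite` are hypothetical names, the sketch is single-threaded, and where the flow says "return to (0)" the sketch simply returns 0 so that the caller can retry, which is an assumption about the intended behavior.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical ring-buffer state.
struct Ring {
    std::vector<char> storage;  // fixed-size backing store; empty means uninitialized
    std::size_t head = 0;       // index of the oldest unread byte
    std::size_t used = 0;       // number of unread bytes
};

// Writes up to `len` bytes from `data` and returns the number written.
inline std::size_t ringWrite(Ring& r, const char* data, std::size_t len,
                             bool allowPartial) {
    if (r.storage.empty()) return 0;                          // step (1): not initialized
    const std::size_t cap = r.storage.size();
    const std::size_t writable = cap - r.used;                // step (2)
    std::size_t n = len;
    if (len > writable) {                                     // step (3)
        if (!allowPartial) return 0;                          // step (4): caller retries later
        n = writable;                                         // step (5): partial write
    }
    if (n == 0) return 0;
    const std::size_t tail = (r.head + r.used) % cap;         // next free position
    const std::size_t first = std::min(n, cap - tail);        // step (6): bytes before the wrap point
    std::memcpy(r.storage.data() + tail, data, first);        // step (7): first copy
    std::memcpy(r.storage.data(), data + first, n - first);   // second copy, zero bytes if no wrap
    r.used += n;
    return n;                                                 // step (8)
}
```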
When the shared buffer is read or written, two cases arise. In the first, the data does not span the wrap-around point of the ring buffer, and the data can be copied directly. In the second, the data spans the wrap-around point, and the data must be copied in two parts.
The two cases when writing to the buffer are shown in Fig. 10 and Fig. 11.
The two cases when reading from the buffer are shown in Fig. 12 and Fig. 13.
The shared-buffer transport module comprises an output port, which manages buffer allocation and release and writes data, and an input port, which reads data. One output port can be connected to any number of input ports, and the output port tracks the read status of its associated input ports. When new output data is available, the output port notifies the blocked input port threads to continue reading. After an input port has read data from the output port, it notifies the blocked output port thread so that the processing function module can continue processing and write the processed data to the output port's buffer.
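The notification scheme between one output port and several input ports can be sketched with a mutex and two condition variables. This is an assumption-laden illustration: the class `OutputPort` and its members are hypothetical, only byte counts are tracked (the data copy itself is omitted), and space is assumed to be reclaimable only once the slowest input port has read it.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical class: tracks byte counts only, not the bytes themselves.
class OutputPort {
public:
    OutputPort(std::size_t capacity, std::size_t inputPorts)
        : capacity_(capacity), readPos_(inputPorts, 0) {}

    // Output side: waits until n more bytes fit, then publishes them.
    void publish(std::size_t n) {
        assert(n <= capacity_);  // a larger write could never fit
        std::unique_lock<std::mutex> lock(mutex_);
        spaceFreed_.wait(lock, [&] { return written_ - slowestReader() + n <= capacity_; });
        written_ += n;
        dataReady_.notify_all();  // wake input ports blocked in consume()
    }

    // Input side: waits for unread bytes, consumes them, returns the count.
    std::size_t consume(std::size_t inputPort) {
        std::unique_lock<std::mutex> lock(mutex_);
        dataReady_.wait(lock, [&] { return readPos_[inputPort] < written_; });
        const std::size_t n = written_ - readPos_[inputPort];
        readPos_[inputPort] = written_;
        spaceFreed_.notify_all();  // wake an output port blocked in publish()
        return n;
    }

private:
    // Space is reclaimed only up to the slowest input port's position.
    std::size_t slowestReader() const {
        return readPos_.empty() ? written_
                                : *std::min_element(readPos_.begin(), readPos_.end());
    }

    std::size_t capacity_;
    std::size_t written_ = 0;           // total bytes published so far
    std::vector<std::size_t> readPos_;  // total bytes consumed per input port
    std::mutex mutex_;
    std::condition_variable spaceFreed_;
    std::condition_variable dataReady_;
};
```

Keeping one read position per input port is one way for the output port to track read status; it also lets a fast module run ahead of a slow one by up to the buffer capacity, which is how the buffer absorbs differences in processing speed.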
Socket transport module:
Principle: first, the network protocol used by this module must be chosen.
TCP (Transmission Control Protocol) provides a connection-oriented, reliable byte-stream service. Before a client and a server exchange data, a TCP connection must first be established between them. TCP provides timeout retransmission, discarding of duplicate data, data checksums, and flow control, ensuring that data is delivered from one end to the other.
UDP (User Datagram Protocol) is a simple datagram-oriented transport layer protocol. UDP provides no reliability: it hands the datagrams passed by the application to the IP layer but does not guarantee that they reach their destination. Because UDP does not establish a connection between client and server before sending datagrams and has no mechanisms such as timeout retransmission, transmission is fast.
Although UDP needs no connection between client and server and transmits quickly, it lacks mechanisms such as timeout retransmission and flow control to guarantee reliable delivery, so TCP is used for the socket connections. In addition, because seismic data sets are generally large, each TCP connection is kept open as a long-lived connection throughout job execution, which suits this scenario well.
Because socket communication requires the IP address and port number of the destination, a processing function module must obtain the IP address and port number of the destination port by some mechanism when it initializes. A seismic job has many processing function modules, so their IP addresses and port numbers cannot be fixed and must be obtained dynamically. Here this is done by storing them in a database and querying it as needed.
A table PORT_ADDR is therefore created in the database, with fields for the job id, the processing function module id, the topology input port id of the processing function module, the IP address of the node on which the module runs, and the socket port number of the input port.
In this embodiment, the PORT_ADDR information can be implemented as a structure, as follows:
[Figures BDA0000373329330000131 and BDA0000373329330000141: structure definition, images not reproduced]
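The structure definition itself survives only as images in this copy. As a stand-in, the following minimal sketch lists the fields named for the PORT_ADDR table; the struct name, member names, and types are assumptions and may differ from the original listing.

```cpp
#include <cstdint>
#include <string>

// Hypothetical layout reconstructed from the field list in the text;
// names and types are assumptions, not the original definition.
struct PortAddr {
    int jobId;                 // job id
    int moduleId;              // processing function module id
    int inputPortId;           // topology input port id of the module
    std::string nodeIp;        // IP address of the node running the module
    std::uint16_t socketPort;  // socket port number of the input port
};
```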
At the initialization stage, each processing function module writes the PORT_ADDR information of its topology input ports to the database; when the module has finished running, it deletes the corresponding PORT_ADDR information from the database.
The overall architecture of the module is shown in Fig. 14:
(1) User interface:
A unified interface is provided for module developers:
bool?getPortData(int?nPortInIndex,void*buffer,int?wantSize,int&realGetSize,status&retValue);
From the topological output port of specifying topological input port to connect, read and specify big or small data.
bool?putPortData(int?nPortOutIndex,void*buffer,int&putSize,status&retValue);
To specifying the topological input port transmission that topological output port connects, specify big or small data.
The interface of this socket and shared buffer is identical, the developer do not know the inside story the layer how to realize, realized the transparency of module.
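One way to picture this transparency is an abstract class that carries the two signatures, with each transport supplying its own implementation. The sketch below is only an illustration: the `status` enum, the class names `PortTransport` and `LoopbackTransport`, and the in-process byte queue are assumptions, since the text does not define them.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

// 'status' is not defined in the text; this enum is an assumed stand-in.
enum class status { Ok, NoData };

// Transport-agnostic port interface carrying the two signatures from the text.
class PortTransport {
public:
    virtual ~PortTransport() = default;
    virtual bool getPortData(int nPortInIndex, void* buffer, int wantSize,
                             int& realGetSize, status& retValue) = 0;
    virtual bool putPortData(int nPortOutIndex, void* buffer, int& putSize,
                             status& retValue) = 0;
};

// Hypothetical in-process backend: one byte queue that ignores port indices.
// A socket backend would implement the same two methods.
class LoopbackTransport : public PortTransport {
    std::deque<char> queue_;
public:
    bool getPortData(int, void* buffer, int wantSize, int& realGetSize,
                     status& retValue) override {
        // Deliver at most wantSize bytes, and never a negative count.
        realGetSize = std::max(0, std::min(wantSize,
                                           static_cast<int>(queue_.size())));
        std::copy_n(queue_.begin(), realGetSize, static_cast<char*>(buffer));
        queue_.erase(queue_.begin(), queue_.begin() + realGetSize);
        retValue = realGetSize > 0 ? status::Ok : status::NoData;
        return realGetSize > 0;
    }
    bool putPortData(int, void* buffer, int& putSize,
                     status& retValue) override {
        const char* p = static_cast<const char*>(buffer);
        queue_.insert(queue_.end(), p, p + putSize);
        retValue = status::Ok;
        return true;
    }
};
```

Module code written against `PortTransport` would not change if the backend were swapped, which is the property the unified interface is meant to provide.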
(2) PORT_ADDR information query module:
When the topological input port of a processing function module is initialized, it must register its own IP address and socket listening port number in the database and wait for the corresponding sender to initiate a connection; after the processing function module has finished running, it deregisters and removes this information.
When the topological output port of a processing function module is initialized, it queries the database to obtain the IP address and listening port number of the topological input port to which it is connected, and initiates a connection to it. Subsequent data transfers use the already-established TCP connection directly.
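The register, query, and deregister cycle described here can be sketched with an in-memory map standing in for the database table. The names `Endpoint`, `PortAddrRegistry`, `registerPort`, `lookup`, and `unregisterPort` are hypothetical, and a real deployment would issue database statements instead of touching a local map.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

struct Endpoint {
    std::string   ip;    // node IP address
    std::uint16_t port;  // socket listening port
};

// Stand-in for the PORT_ADDR table, keyed by (job id, module id, input port id).
class PortAddrRegistry {
    using Key = std::tuple<int, int, int>;
    std::map<Key, Endpoint> rows_;
public:
    // Called by an input port at initialization.
    void registerPort(int job, int module, int port, const Endpoint& ep) {
        rows_[Key(job, module, port)] = ep;
    }
    // Called by an output port at initialization; false if not yet registered.
    bool lookup(int job, int module, int port, Endpoint& out) const {
        std::map<Key, Endpoint>::const_iterator it =
            rows_.find(Key(job, module, port));
        if (it == rows_.end()) return false;
        out = it->second;
        return true;
    }
    // Called when the module has finished running.
    void unregisterPort(int job, int module, int port) {
        rows_.erase(Key(job, module, port));
    }
};
```

Because `lookup` can fail when the receiver has not registered yet, a sender would presumably retry the query for a while before giving up; the text does not say how this is handled.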
(3) Message serialization module:
All data transmitted over a network must take the form of a data stream. The message serialization module serializes user messages to suit network transmission. Likewise, when a message is received, the received data must be deserialized to recover the message object.
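The text does not specify a wire format. As one possible illustration, the sketch below frames each message as a 4-byte big-endian length followed by the payload, so that a receiver can find message boundaries in a TCP byte stream. The format and the function names `serializeMessage` and `deserializeMessage` are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Assumed wire format: 4-byte big-endian payload length, then the payload.
std::vector<std::uint8_t> serializeMessage(const std::string& payload) {
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::vector<std::uint8_t> out;
    out.reserve(4 + payload.size());
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>((n >> shift) & 0xFF));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Returns true and fills 'payload' if 'bytes' holds one complete message.
bool deserializeMessage(const std::vector<std::uint8_t>& bytes,
                        std::string& payload) {
    if (bytes.size() < 4) return false;           // header incomplete
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i) n = (n << 8) | bytes[i];
    if (bytes.size() - 4 < n) return false;       // payload incomplete
    payload.assign(bytes.begin() + 4, bytes.begin() + 4 + n);
    return true;
}
```

Writing the length in a fixed byte order keeps the framing independent of the byte order of the machines at either end.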
(4) Socket manager:
The socket manager manages the socket objects currently in use (several socket connections may exist at once, for example when one topological output port corresponds to several topological input ports in multi-channel operation), and provides basic network operations such as looking up the destination node and establishing TCP connections.
(5) TCP socket implementation module:
Encapsulates the operating system's socket objects and provides an interface for operating on sockets, which simplifies unified management by the socket manager.
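The division of labor between the socket manager and the TCP socket implementation module might be sketched as follows. The manager caches one connection handle per destination, and a pluggable `Connector` callback stands in for the module that wraps the operating system's sockets. The class name, the callback type, and the use of integer handles are assumptions; no real network call is made.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical manager: caches one connection handle per "ip:port" destination.
// 'Connector' stands in for the TCP socket implementation module; it returns
// a handle (for example a file descriptor) or -1 on failure.
class SocketManager {
public:
    typedef std::function<int(const std::string& ip, int port)> Connector;
    explicit SocketManager(Connector c) : connect_(std::move(c)) {}

    // Returns the cached handle for the destination, connecting on first use.
    int getOrConnect(const std::string& ip, int port) {
        const std::string key = ip + ":" + std::to_string(port);
        std::map<std::string, int>::const_iterator it = sockets_.find(key);
        if (it != sockets_.end()) return it->second;
        const int handle = connect_(ip, port);
        if (handle >= 0) sockets_[key] = handle;   // failed connects are not cached
        return handle;
    }
    std::size_t openCount() const { return sockets_.size(); }

private:
    Connector connect_;
    std::map<std::string, int> sockets_;
};
```

Keeping the connect step behind a callback means the manager can be exercised without a network, and the real socket wrapper can be substituted later.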
The data receiving flow is shown in Figure 15:
(1) Initialize: parse the job file and obtain information about the connected topological output port;
(2) Write the module's own PORT_ADDR information to the database;
(3) Listen on the corresponding socket port and wait to accept a TCP connection;
(4) After accepting the TCP connection, read data continuously;
(5) Perform the data processing operation;
(6) Check whether the completion flag has been received; if so, go to (7), otherwise go to (4);
(7) Delete the corresponding PORT_ADDR information from the database, close the TCP connection, and exit the processing function module.
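The read, process, and check loop in steps (4) to (6) can be sketched with the I/O abstracted into callbacks. In this sketch an empty frame plays the role of the completion flag, and the check is made before processing because that frame carries no data; both choices are assumptions, as are the function and parameter names. Registration, listening, and cleanup (steps (1) to (3) and (7)) are left out.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Steps (4) to (6) of the receive flow, with I/O abstracted away.
// 'readFrame' stands in for reading one message from the accepted TCP
// connection; an empty frame is used as the completion flag (assumption).
// Returns the number of data frames processed.
int receiveLoop(const std::function<std::string()>& readFrame,
                const std::function<void(const std::string&)>& process) {
    int processed = 0;
    for (;;) {
        const std::string frame = readFrame();   // (4) read data
        if (frame.empty()) break;                // (6) completion flag received
        process(frame);                          // (5) process the data
        ++processed;
    }
    return processed;
}
```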
The data sending flow is shown in Figure 16:
(1) Initialize: parse the job file and obtain information about the connected topological input port;
(2) Query the database for the PORT_ADDR information of the topological output port;
(3) Using the IP address and port number obtained in the previous step, initiate a TCP connection; once the connection is established, store the socket in a container;
(4) Send data continuously over the connections held in the container;
(5) Check whether all data has been sent; if so, go to (6), otherwise go to (4);
(6) Send the data-transmission-complete flag, close the TCP connection, and exit the processing function module.
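Steps (4) to (6) of the send flow can be sketched in the same style. The text does not say whether each connection in the container receives all of the data or only a share of it; this sketch assumes every chunk goes to every connection. The empty frame used as the completion flag, the assumption that data chunks are non-empty, and the names `FakeConnection` and `sendAll` are also illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical connection: records what was "sent" instead of using a socket.
struct FakeConnection {
    std::vector<std::string> sent;
    void send(const std::string& frame) { sent.push_back(frame); }
};

// Steps (4) to (6) of the send flow: every chunk goes to every connection in
// the container, followed by an empty frame as the completion flag (assumed).
// Returns the number of data frames sent, not counting completion flags.
template <typename Conn>
std::size_t sendAll(std::vector<Conn>& connections,
                    const std::vector<std::string>& chunks) {
    std::size_t framesSent = 0;
    for (const std::string& chunk : chunks) {        // (4)/(5) until all sent
        for (Conn& c : connections) {
            c.send(chunk);
            ++framesSent;
        }
    }
    for (Conn& c : connections) c.send(std::string());  // (6) completion flag
    return framesSent;
}
```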
The invention selects the transmission mode according to whether two processing function modules run on the same compute node: if they run on the same compute node, the multithreaded shared-buffer transmission mode is used; if they run on different compute nodes, the multi-process socket transmission mode is used. This allows the seismic data processing system to achieve comparatively good transmission efficiency in both situations. By providing a unified API and a transparent communication service to the execution control system, the invention supports both centralized and distributed execution control systems.
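The selection rule itself is small enough to state as code. The sketch below assumes that "same compute node" can be decided by comparing the node IP addresses of the two modules; the text does not say how the comparison is made, and the names `TransferMode` and `selectMode` are hypothetical.

```cpp
#include <cassert>
#include <string>

enum class TransferMode { SharedBuffer, Socket };

// Chooses the transport from the node addresses of the two modules.
// Comparing IP strings is an assumed way of deciding "same compute node".
TransferMode selectMode(const std::string& producerNodeIp,
                        const std::string& consumerNodeIp) {
    return producerNodeIp == consumerNodeIp ? TransferMode::SharedBuffer
                                            : TransferMode::Socket;
}
```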
Those of ordinary skill in the art will appreciate that the embodiments described here are intended to help the reader understand how the invention is implemented, and that the scope of protection of the invention is not limited to these particular statements and embodiments. Based on the technical teachings disclosed by the invention, those of ordinary skill in the art can make various other specific variations and combinations that do not depart from the essence of the invention, and such variations and combinations remain within the scope of protection of the invention.

Claims (8)

1. A data transmission method based on seismic data processing, characterized in that: the transmission mode is selected by determining whether two processing function modules run on the same compute node; if the two processing function modules run on the same compute node, a multithreaded shared-buffer transmission mode is adopted; if the two processing function modules run on different compute nodes, a multi-process socket transmission mode is adopted.
2. The data transmission method based on seismic data processing according to claim 1, characterized in that: in the multithreaded shared-buffer transmission mode, the buffer read procedure is as follows:
(0) initialize;
(1) determine whether the buffer has been initialized; if it has, proceed to the next step, otherwise return to (0);
(2) if the data spans the wrap-around point between the tail and the head of the buffer, read the data in two passes; otherwise read it directly;
(3) read the data;
(4) return the number of bytes read.
3. The data transmission method based on seismic data processing according to claim 1 or 2, characterized in that: in the multithreaded shared-buffer transmission mode, the buffer write procedure is as follows:
(0) initialize;
(1) determine whether the buffer has been initialized; if it has, proceed to the next step, otherwise return to (0);
(2) obtain the maximum number of bytes that can currently be written to the buffer;
(3) compare the length of the data to be written with the writable length; if the length of the data to be written is greater than the writable length, go to step (4); otherwise go to step (6);
(4) determine whether the data may be written partially; if so, proceed to the next step; otherwise return to (0);
(5) given that the length of the data to be written is greater than the writable length and partial writing is permitted, write part of the data according to the available buffer space, the length written being the writable length;
(6) if the data spans the wrap-around point between the tail and the head of the buffer, write the data in two passes; otherwise write it directly;
(7) write the data;
(8) return the number of bytes written.
4. The data transmission method based on seismic data processing according to claim 1, characterized in that: in the multi-process socket transmission mode, the data receiving procedure is as follows:
(1) initialize: parse the job file and obtain information about the connected topological output port;
(2) write the module's own PORT_ADDR information to the database, the fields of said PORT_ADDR information comprising the job id, the module id, the id of the module's topological input port, the IP address of the node on which the module runs, and the socket port number of the input port;
(3) listen on the corresponding socket port and wait to accept a TCP connection;
(4) after accepting the TCP connection, read data continuously;
(5) perform the data processing operation;
(6) check whether the completion flag has been received; if so, go to (7), otherwise go to (4);
(7) delete the corresponding PORT_ADDR information from the database, close the TCP connection, and exit the processing function module.
5. The data transmission method based on seismic data processing according to claim 1 or 4, characterized in that: in the multi-process socket transmission mode, the data sending procedure is as follows:
(1) initialize: parse the job file and obtain information about the connected topological input port;
(2) query the database for the PORT_ADDR information of the topological output port;
(3) using the IP address and port number obtained in the previous step, initiate a TCP connection; once the connection is established, store the socket in a container;
(4) send data continuously over the connections held in the container;
(5) check whether all data has been sent; if so, go to (6), otherwise go to (4);
(6) send the data-transmission-complete flag, close the TCP connection, and exit the processing function module.
6. A data transmission system based on seismic data processing, characterized in that it comprises two modules: a shared-buffer transmission module for communication within a compute node, and a socket transmission module for communication between compute nodes.
7. The data transmission system based on seismic data processing according to claim 6, characterized in that: the shared-buffer transmission module comprises an output port for managing buffer allocation and release and for writing data, and further comprises an input port for reading data; wherein one output port can be connected to any number of input ports, and the output port maintains the read status of its associated input ports.
8. The data transmission system based on seismic data processing according to claim 6 or 7, characterized in that: the socket transmission module comprises a user interface, a PORT_ADDR information query module, a message serialization module, a socket manager, and a TCP socket implementation module;
wherein the user interface is used to read data of a specified size from the topological output port connected to a specified topological input port, and also to send data of a specified size to the topological input port connected to a specified topological output port;
the PORT_ADDR information query module is used for information exchange between the output ports and input ports of processing function modules, so that a processing function module can find, by query, the destination to which its processed data is to be sent;
the message serialization module is used to ensure correct and reliable data transmission over complex or heterogeneous networks;
the socket manager is used to manage the socket objects currently in use, and provides network operations such as looking up the destination node and establishing TCP connections;
the TCP socket implementation module is used to encapsulate the operating system's socket objects and provides an interface for operating on sockets.
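
As an illustration only, and not part of the claims, the buffer read and write procedures of claims 2 and 3 can be sketched as a circular buffer in which a transfer that crosses the end of the storage is split into two copies. This is a single-threaded sketch: the locking that a multithreaded shared buffer needs is omitted, and the class name `RingBuffer`, its members, and the choice to return 0 when a non-partial write does not fit are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Single-threaded sketch of the circular buffer in claims 2 and 3.
class RingBuffer {
    std::vector<char> buf_;
    std::size_t head_ = 0;   // index of the next byte to read
    std::size_t size_ = 0;   // number of bytes currently stored
public:
    explicit RingBuffer(std::size_t capacity) : buf_(capacity) {}

    // Maximum number of bytes that can currently be written.
    std::size_t writable() const { return buf_.size() - size_; }

    // Returns the bytes written; writes nothing if the data does not fit
    // and partial writes are not allowed.
    std::size_t write(const char* data, std::size_t len, bool allowPartial) {
        if (len > writable()) {
            if (!allowPartial) return 0;
            len = writable();                    // partial write
        }
        if (len == 0) return 0;
        const std::size_t tail = (head_ + size_) % buf_.size();
        const std::size_t first = std::min(len, buf_.size() - tail);
        std::memcpy(&buf_[tail], data, first);             // up to the end
        std::memcpy(&buf_[0], data + first, len - first);  // wrapped part
        size_ += len;
        return len;
    }

    // Returns the bytes read, at most 'want'.
    std::size_t read(char* out, std::size_t want) {
        const std::size_t len = std::min(want, size_);
        if (len == 0) return 0;
        const std::size_t first = std::min(len, buf_.size() - head_);
        std::memcpy(out, &buf_[head_], first);             // up to the end
        std::memcpy(out + first, &buf_[0], len - first);   // wrapped part
        head_ = (head_ + len) % buf_.size();
        size_ -= len;
        return len;
    }
};
```

When the transfer does not cross the end of the storage, the second copy has length zero, so the direct case and the two-pass case share one code path.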
CN201310382046XA 2013-08-28 2013-08-28 Data transmission method and system based on seismic data processing Pending CN103412739A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310382046XA CN103412739A (en) 2013-08-28 2013-08-28 Data transmission method and system based on seismic data processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310382046XA CN103412739A (en) 2013-08-28 2013-08-28 Data transmission method and system based on seismic data processing

Publications (1)

Publication Number Publication Date
CN103412739A true CN103412739A (en) 2013-11-27

Family

ID=49605753

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310382046XA Pending CN103412739A (en) 2013-08-28 2013-08-28 Data transmission method and system based on seismic data processing

Country Status (1)

Country Link
CN (1) CN103412739A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794095A (en) * 2014-01-16 2015-07-22 华为技术有限公司 Distributed computation processing method and device
CN107949004A (en) * 2017-10-25 2018-04-20 北京空间技术研制试验中心 Data handling system and method for manned spacecraft
CN109196837A (en) * 2016-05-27 2019-01-11 华为技术有限公司 The method and system of interprocess communication is carried out in user's space between OS grades of containers
CN110895347A (en) * 2019-12-09 2020-03-20 中国科学院地质与地球物理研究所 Seismic wave forward modeling method and system
CN111368501A (en) * 2018-12-26 2020-07-03 中国石油天然气集团有限公司 Seismic auxiliary data flow processing system
CN113609518A (en) * 2021-06-18 2021-11-05 天津津航计算技术研究所 Message protocol overtime retransmission method and system based on associated container map

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102135917A (en) * 2010-11-30 2011-07-27 广东星海数字家庭产业技术研究院有限公司 Inter-Linux operating system progress communication information acquisition method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
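汪竹,赵太银,胡光岷 (Wang Zhu, Zhao Taiyin, Hu Guangmin): "私有云环境下适应大规模数据处理的作业执行控制系统研究" [Research on a job execution control system suited to large-scale data processing in a private cloud environment], 《2012年互联网技术与应用国际学术会议论文集》 [Proceedings of the 2012 International Conference on Internet Technology and Applications] *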

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794095A (en) * 2014-01-16 2015-07-22 华为技术有限公司 Distributed computation processing method and device
CN104794095B (en) * 2014-01-16 2018-09-07 华为技术有限公司 Distributed Calculation processing method and processing device
CN109196837A (en) * 2016-05-27 2019-01-11 华为技术有限公司 The method and system of interprocess communication is carried out in user's space between OS grades of containers
CN107949004A (en) * 2017-10-25 2018-04-20 北京空间技术研制试验中心 Data handling system and method for manned spacecraft
CN111368501A (en) * 2018-12-26 2020-07-03 中国石油天然气集团有限公司 Seismic auxiliary data flow processing system
CN111368501B (en) * 2018-12-26 2023-09-26 中国石油天然气集团有限公司 Seismic auxiliary data flow processing system
CN110895347A (en) * 2019-12-09 2020-03-20 中国科学院地质与地球物理研究所 Seismic wave forward modeling method and system
CN113609518A (en) * 2021-06-18 2021-11-05 天津津航计算技术研究所 Message protocol overtime retransmission method and system based on associated container map
CN113609518B (en) * 2021-06-18 2023-12-12 天津津航计算技术研究所 Message protocol timeout retransmission method and system based on association container map

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20131127