CN109145023A - Method and apparatus for handling data - Google Patents
Method and apparatus for handling data
- Publication number
- CN109145023A CN201811003311.8A
- Authority
- CN
- China
- Prior art keywords
- data
- mark
- processing node
- pending
- data processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The embodiments of the present application disclose a method and apparatus for handling data. One specific embodiment of the method includes: obtaining pending data and an identifier of the pending data from an upstream data processing node of a target data processing node to which a data stream flows in a streaming computing system, where the streaming computing system includes a set of data processing nodes for processing the data stream, the data stream flows in through the data processing node of the set that serves as the entry and flows out of the system after passing through at least one data processing node of the set, and the target data processing node includes an execution unit that executes a program segment, submitted by a user of the streaming computing system, that expresses data processing logic; sending the pending data and the identifier of the pending data to the execution unit; and obtaining processing result data, generated by running the execution unit and corresponding to the pending data, together with an identifier of the processing result data. This embodiment improves data processing efficiency.
Description
Technical field
The present application relates to the field of computer technology, and more particularly to a method and apparatus for handling data.
Background
Streaming computing is widely used in large-scale distributed computing scenarios such as information feeds, search index building, and retrieval billing. It is a pipeline-like data processing model whose underlying idea is that data is processed as soon as an event occurs, rather than being cached and processed in batches.
In existing streaming computing systems such as Storm (a distributed, fault-tolerant real-time computation system), jobs submitted by users exchange data with the Storm platform in JSON (JavaScript Object Notation). The data exchange protocol is complex, and users must understand both Storm and JSON before they can use the streaming computing system to process data.
Summary of the invention
The embodiments of the present application propose a method and apparatus for handling data.
In a first aspect, an embodiment of the present application provides a method for handling data, the method comprising: obtaining pending data and an identifier of the pending data from an upstream data processing node of a target data processing node to which a data stream flows in a streaming computing system, where the streaming computing system includes a set of data processing nodes for processing the data stream, the data stream flows in through the data processing node of the set that serves as the entry and flows out of the system after passing through at least one data processing node of the set, and the target data processing node includes an execution unit that executes a program segment, submitted by a user of the streaming computing system, that expresses data processing logic; sending the pending data and the identifier of the pending data to the execution unit; and obtaining processing result data, generated by running the execution unit and corresponding to the pending data, together with an identifier of the processing result data.
In some embodiments, after obtaining the processing result data generated by running the execution unit and corresponding to the pending data, the method further comprises: persisting the processing result data and the identifier of the processing result data; sending the processing result data and its identifier to a downstream data processing node of the target data processing node; and, in response to obtaining indication information indicating that the processing result data has been processed, clearing the persisted processing result data and identifier.
In some embodiments, after sending the processing result data and its identifier to the downstream data processing node of the target data processing node, the method further comprises: in response to obtaining indication information indicating that processing of the processing result data has failed, replaying the persisted processing result data and identifier to the downstream data processing node of the target data processing node.
In some embodiments, sending the pending data and the identifier of the pending data to the execution unit comprises: sending the pending data and the identifier of the pending data to the execution unit through an inter-process pipe.
In some embodiments, sending the pending data and the identifier of the pending data to the execution unit comprises: sending them to the execution unit according to a preset protocol, where the preset protocol specifies the fields included in a row of data sent to the target data processing node and the separator used between fields, the fields including the identifier of the data, the key of the data, the value of the data, and a flag used to determine whether the data has been processed.
In some embodiments, the method further comprises: in response to obtaining an execution request that includes an identifier of a target program segment, creating an execution unit to execute the target program segment.
In some embodiments, the topological relationship of the data processing nodes in the streaming computing system is described by a preset configuration file, and the configuration file includes an Extensible Markup Language (XML) configuration file; and the method further comprises: generating and presenting a topology diagram of the streaming computing system based on the configuration file.
In a second aspect, an embodiment of the present application provides an apparatus for handling data, the apparatus comprising: a first acquisition unit configured to obtain pending data and an identifier of the pending data from an upstream data processing node of a target data processing node to which a data stream flows in a streaming computing system, where the streaming computing system includes a set of data processing nodes for processing the data stream, the data stream flows in through the data processing node of the set that serves as the entry and flows out of the system after passing through at least one data processing node of the set, and the target data processing node includes an execution unit that executes a program segment, submitted by a user of the streaming computing system, that expresses data processing logic; a first sending unit configured to send the pending data and the identifier of the pending data to the execution unit; and a second acquisition unit configured to obtain processing result data, generated by running the execution unit and corresponding to the pending data, together with an identifier of the processing result data.
In some embodiments, the apparatus further comprises: a persistence unit configured to persist the processing result data and the identifier of the processing result data; a second sending unit configured to send the processing result data and its identifier to a downstream data processing node of the target data processing node; and a clearing unit configured to, in response to obtaining indication information indicating that the processing result data has been processed, clear the persisted processing result data and identifier.
In some embodiments, the apparatus further comprises: a replay unit configured to, in response to obtaining indication information indicating that processing of the processing result data has failed, replay the persisted processing result data and identifier to the downstream data processing node of the target data processing node.
In some embodiments, the first sending unit is further configured to send the pending data and the identifier of the pending data to the execution unit through an inter-process pipe.
In some embodiments, the first sending unit is further configured to send the pending data and the identifier of the pending data to the execution unit according to a preset protocol, where the preset protocol specifies the fields included in a row of data sent to the target data processing node and the separator used between fields, the fields including the identifier of the data, the key of the data, the value of the data, and a flag used to determine whether the data has been processed.
In some embodiments, the apparatus further comprises: a creating unit configured to, in response to obtaining an execution request that includes an identifier of a target program segment, create an execution unit to execute the target program segment.
In some embodiments, the topological relationship of the data processing nodes in the streaming computing system is described by a preset configuration file, and the configuration file includes an Extensible Markup Language (XML) configuration file; and the apparatus further comprises: a presentation unit configured to generate and present a topology diagram of the streaming computing system based on the configuration file.
In a third aspect, an embodiment of the present application provides a device comprising: one or more processors; and a storage apparatus storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium storing a computer program which, when executed by a processor, implements the method of the first aspect.
The method and apparatus for handling data provided by the embodiments of the present application obtain pending data and an identifier of the pending data from an upstream data processing node of a target data processing node to which a data stream flows in a streaming computing system, where the target data processing node includes an execution unit that executes a program segment, submitted by a user of the streaming computing system, that expresses data processing logic; then send the pending data and its identifier to the execution unit; and finally obtain the processing result data generated by running the execution unit and corresponding to the pending data, together with an identifier of the processing result data. A user of the streaming computing system therefore only needs to submit a program segment written in a programming language the user already knows in order to process data, which improves data processing efficiency.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 is a diagram of an exemplary system architecture to which the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the method for handling data according to the present application;
Fig. 3 is a schematic diagram of an application scenario of the method for handling data according to the present application;
Fig. 4 is a flowchart of another embodiment of the method for handling data according to the present application;
Fig. 5 is a schematic structural diagram of one embodiment of the apparatus for handling data according to the present application;
Fig. 6 is a schematic structural diagram of a computer system suitable for implementing a server of the embodiments of the present application.
Detailed description
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features of those embodiments may be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for handling data or the apparatus for handling data of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104, for example to receive or send messages. Various client applications may be installed on the terminal devices 101, 102, 103, such as streaming computing applications, social applications, and search applications.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be any of various electronic devices with a display screen, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers. When they are software, they may be installed on the electronic devices listed above, and may be implemented either as multiple pieces of software or software modules (for example, to provide a data processing service) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides various services, for example a background server that supports the applications installed on the terminal devices 101, 102, 103. The server 105 may obtain pending data and an identifier of the pending data from an upstream data processing node of a target data processing node to which a data stream flows in a streaming computing system, where the streaming computing system includes a set of data processing nodes for processing the data stream, the data stream flows in through the data processing node of the set that serves as the entry and flows out of the system after passing through at least one data processing node of the set, and the target data processing node includes an execution unit that executes a program segment expressing data processing logic, submitted by a user of the streaming computing system through the terminal devices 101, 102, 103; send the pending data and its identifier to the execution unit; and obtain processing result data, generated by running the execution unit and corresponding to the pending data, together with an identifier of the processing result data.
It should be noted that the method for handling data provided by the embodiments of the present application may be executed by the server 105, and accordingly the apparatus for handling data may be provided in the server 105.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed cluster of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks, and servers may be provided according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for handling data according to the present application is shown. The method for handling data comprises the following steps:
Step 201: obtain pending data and an identifier of the pending data from an upstream data processing node of the target data processing node to which a data stream flows in a streaming computing system.
In this embodiment, the executing subject of the method for handling data (for example, the server shown in Fig. 1) may first obtain pending data and an identifier of the pending data from an upstream data processing node of the target data processing node to which the data stream flows in the streaming computing system. The streaming computing system includes a set of data processing nodes for processing the data stream; the data stream flows in through the data processing node of the set that serves as the entry and flows out of the system after passing through at least one data processing node of the set; and the target data processing node includes an execution unit that executes a program segment, submitted by a user of the streaming computing system, that expresses data processing logic. The programming language of that program segment may differ from the programming language used by the streaming computing system, and the program segment may be written in any of several programming languages, for example in a language the user already knows.
Here, the streaming computing system may include a control node and multiple working nodes. A working node, that is, a data processing node, may also be called an operator. The control node may send control instructions to its subordinate working nodes so that a working node calls an execution unit, according to the control instruction, to process the data stream generated by the business. Each working node may include one or more execution units; when a working node is called to process a data stream, the processing is actually performed by the execution units it contains. An execution unit may specifically be a thread or a process. As an example, the target data processing node may also include an execution unit that executes a program segment implementing the operating logic of the streaming computing system itself, and the executing subject may specifically be that execution unit. The streaming computing system may include several streaming computing jobs, each composed of independent pieces of computing logic connected according to upstream and downstream subscription relationships.
In this embodiment, an upstream data processing node of the target data processing node may be a data processing node that provides pending data to the target data processing node. The pending data may be generated by running the upstream data processing node, and the identifier of the pending data may be generated according to a preset rule, for example from information such as the order in which the data was generated, its generation time, its storage location, or its source.
Step 202: send the pending data and the identifier of the pending data to the execution unit.
In this embodiment, the executing subject may send the pending data and the identifier obtained in step 201 to the execution unit. It may do so by means such as signals, pipes, message queues, or shared memory.
In some optional implementations of this embodiment, sending the pending data and the identifier of the pending data to the execution unit comprises: sending them to the execution unit through an inter-process pipe. A pipe can be created by calling the pipe function. A pipe is half-duplex, meaning data flows in one direction only, so two pipes must be set up when both sides need to communicate. To the processes at its two ends a pipe is a file, but not an ordinary one: it does not belong to any regular file system, forms a file system of its own, and exists only in memory. What one process writes into the pipe is read by the process at the other end. Each write appends to the tail of the pipe buffer, and each read takes data from the head of the buffer. Transferring data through a pipe is simple and convenient, which further improves the convenience of data processing.
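As a minimal sketch of the pipe-based exchange described above (not the patented implementation), the following Python example starts a child process as the execution unit and uses one pipe for each direction. The program segment `USER_SEGMENT`, the tab-separated line layout, and the function name `run_through_pipes` are assumptions made for illustration only.

```python
import subprocess
import sys

# Hypothetical user-submitted program segment: reads "id<TAB>value" lines
# from stdin, upper-cases the value, and writes "id<TAB>result" to stdout.
USER_SEGMENT = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    ident, value = line.rstrip('\\n').split('\\t', 1)\n"
    "    print(ident + '\\t' + value.upper(), flush=True)\n"
)


def run_through_pipes(records):
    """Send (identifier, data) pairs to a child process over one pipe and
    read (identifier, result) pairs back over a second pipe."""
    proc = subprocess.Popen(
        [sys.executable, "-c", USER_SEGMENT],
        stdin=subprocess.PIPE,   # pipe 1: node -> execution unit
        stdout=subprocess.PIPE,  # pipe 2: execution unit -> node
        text=True,
    )
    payload = "".join(f"{ident}\t{value}\n" for ident, value in records)
    out, _ = proc.communicate(payload)  # write, close stdin, read all output
    return [tuple(line.split("\t", 1)) for line in out.splitlines()]
```

Because each pipe is one-directional, the sketch needs two of them, which matches the half-duplex behavior noted in the text. The identifier travels with the data in both directions so that a result can be matched to its input.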
In some optional implementations of this embodiment, sending the pending data and the identifier of the pending data to the execution unit comprises: sending them to the execution unit according to a preset protocol. The preset protocol specifies the fields included in a row of data sent to the target data processing node and the separator used between fields; the fields include the identifier of the data, the key of the data, the value of the data, and a flag used to determine whether the data has been processed.
As an example, a row of data may have the following structure: identifier of the data // flag used to determine whether the data has been processed // key of the data // value of the data. The separator "//" may be replaced by another separator, such as "/t/t", according to actual needs; the order of the fields may likewise be adjusted; and a row may also include a start marker or an end marker. Because rows are sent to the target data processing node according to the preset protocol, a single row is enough to process one piece of data, which further improves data processing efficiency.
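The row layout described above can be sketched in Python as follows. This is an illustrative encoding only: the field order follows the example in the text, while the `Row` class, the `"1"`/`"0"` encoding of the processed flag, and the function names are assumptions rather than part of the described protocol.

```python
from dataclasses import dataclass

SEPARATOR = "//"  # example separator from the text; assumed absent from id, flag, and key


@dataclass
class Row:
    ident: str       # identifier of the data
    processed: bool  # flag: has the data already been processed?
    key: str         # key of the data
    value: str       # value of the data


def encode_row(row, sep=SEPARATOR):
    """Serialize one piece of data as a single row."""
    flag = "1" if row.processed else "0"  # "1"/"0" encoding is an assumption
    return sep.join([row.ident, flag, row.key, row.value])


def decode_row(line, sep=SEPARATOR):
    """Parse a single row back into its four fields."""
    ident, flag, key, value = line.rstrip("\n").split(sep, 3)
    return Row(ident, flag == "1", key, value)
```

For example, `encode_row(Row("42", False, "user", "alice"))` yields `42//0//user//alice`. Limiting the split to three separators keeps the value intact even if it happens to contain the separator.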
Step 203: obtain the processing result data generated by running the execution unit and corresponding to the pending data, together with an identifier of the processing result data.
In this embodiment, the executing subject may likewise use signals, pipes, message queues, shared memory, or similar means to obtain the processing result data that the execution unit generates by running and that corresponds to the pending data sent in step 202, together with the identifier of the processing result data. The identifier of the processing result data may also be generated according to a preset rule, for example from information such as the order in which the data was generated, its generation time, its storage location, or its source.
In some optional implementations of this embodiment, the method further comprises: in response to obtaining an execution request that includes an identifier of a target program segment, creating an execution unit to execute the target program segment. Compared with Storm, which requires that the process executing the target program segment already exist before the user's job runs on the platform, this implementation creates the execution unit on request, thereby managing the life cycle of the execution unit and further improving the flexibility of data processing.
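A rough sketch of creating an execution unit on request, assuming the execution unit is a child process and that program segments are looked up by identifier in a registry. The `SEGMENTS` registry, the request dictionary shape, and the function name are hypothetical and only illustrate the idea.

```python
import subprocess
import sys

# Hypothetical registry mapping a program-segment identifier to its source.
SEGMENTS = {
    "double": "import sys\nfor l in sys.stdin:\n    print(int(l) * 2)\n",
}


def handle_execution_request(request):
    """Create an execution unit (here, a child process) that runs the
    program segment named in the request. The process does not exist
    until the request arrives."""
    source = SEGMENTS[request["segment_id"]]
    return subprocess.Popen(
        [sys.executable, "-c", source],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
```

The caller owns the returned process handle, so it can feed data to the unit, wait for it to exit, or terminate it, which is one simple way to manage the unit's life cycle.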
In some optional implementations of this embodiment, the topological relationship of the data processing nodes in the streaming computing system is described by a preset configuration file, and the configuration file includes an Extensible Markup Language (XML) configuration file; and the method further comprises: generating and presenting a topology diagram of the streaming computing system based on the configuration file. This implementation presents the topology of the streaming computing system clearly. In addition, the user can control the data processing nodes of the streaming computing system from the presented page, which further improves the flexibility of the streaming computing system.
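To illustrate how a topology could be derived from an XML configuration file, the sketch below parses a small configuration and lists the upstream-to-downstream edges. The source does not give the schema, so the `topology` and `node` elements and the `id` and `upstream` attributes are assumptions, as is the plain-text rendering used in place of a graphical diagram.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; element and attribute names are assumptions.
CONFIG = """
<topology>
  <node id="source"/>
  <node id="parse" upstream="source"/>
  <node id="count" upstream="parse"/>
  <node id="sink" upstream="count"/>
</topology>
"""


def load_edges(xml_text):
    """Return (upstream, downstream) pairs described by the configuration."""
    root = ET.fromstring(xml_text)
    edges = []
    for node in root.findall("node"):
        upstream = node.get("upstream")
        if upstream:
            # A node may subscribe to several upstream nodes, comma-separated.
            for up in upstream.split(","):
                edges.append((up.strip(), node.get("id")))
    return edges


def render(edges):
    """Render the topology as one 'upstream -> downstream' line per edge."""
    return "\n".join(f"{up} -> {down}" for up, down in edges)
```

The edge list is the information a diagramming front end would need; an actual presentation layer could pass it to any graph-drawing tool.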
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for handling data according to this embodiment. In the application scenario of Fig. 3, pending data and an identifier 302 of the pending data are first obtained from an upstream data processing node 301 of the target data processing node to which a data stream flows in a streaming computing system, where the streaming computing system includes a set of data processing nodes for processing the data stream, the data stream flows in through the data processing node of the set that serves as the entry and flows out of the system after passing through at least one data processing node of the set, and the target data processing node includes an execution unit 303 that executes a program segment 304 expressing data processing logic, submitted by a user of the streaming computing system through a device 305. The pending data and its identifier 302 are then sent to the execution unit 303. Finally, the processing result data generated by running the execution unit 303 and corresponding to the pending data is obtained together with an identifier 306 of the processing result data.
The method provided by the above embodiment of the present application obtains pending data and an identifier of the pending data from an upstream data processing node of the target data processing node to which a data stream flows in a streaming computing system, where the streaming computing system includes a set of data processing nodes for processing the data stream, the data stream flows in through the data processing node of the set that serves as the entry and flows out of the system after passing through at least one data processing node of the set, and the target data processing node includes an execution unit that executes a program segment, submitted by a user of the streaming computing system, that expresses data processing logic; sends the pending data and its identifier to the execution unit; and obtains the processing result data generated by running the execution unit and corresponding to the pending data, together with an identifier of the processing result data. This improves data processing efficiency.
With further reference to Fig. 4, a flow 400 of another embodiment of the method for handling data is shown. The flow 400 of the method for handling data comprises the following steps:
Step 401: obtain pending data and an identifier of the pending data from an upstream data processing node of the target data processing node to which a data stream flows in a streaming computing system.
In this embodiment, the executing subject of the method for handling data (for example, the server shown in Fig. 1) may first obtain pending data and an identifier of the pending data from an upstream data processing node of the target data processing node to which the data stream flows in the streaming computing system.
Step 402: send the pending data and the identifier of the pending data to the execution unit.
In this embodiment, the executing subject may send the pending data and the identifier obtained in step 401 to the execution unit.
Step 403: obtain the processing result data generated by running the execution unit and corresponding to the pending data, together with an identifier of the processing result data.
In this embodiment, the executing subject may obtain the processing result data that the execution unit generates by running and that corresponds to the pending data sent in step 402, together with the identifier of the processing result data.
Step 404: persist the processing result data and the identifier of the processing result data.
In this embodiment, the executing subject may persist the processing result data and the identifier obtained in step 403. Persistence is the mechanism by which program data is converted between a persistent state and a transient state; that is, transient data (such as data in memory, which cannot be kept permanently) is saved as persistent data (for example, written to a database, where it can be kept long term). Persistence may be full or incremental. Incremental persistence avoids duplicating data and further improves data processing efficiency.
Step 405: send the processing result data and the identifier of the processing result data to a downstream data processing node of the target data processing node.
In this embodiment, the executing subject may send the processing result data and the identifier obtained in step 403 to a downstream data processing node of the target data processing node, according to the upstream and downstream subscription relationships indicated in the preset configuration file.
Step 406: in response to obtaining indication information indicating that the processing result data has been processed, clear the persisted processing result data and the identifier of the processing result data.
In this embodiment, the executing subject may, in response to obtaining indication information indicating that the processing result data has been processed, clear the persisted processing result data and identifier. The indication information may include the identifier of the data that the downstream data processing node has successfully received and processed. As an example, it may be implemented with an acknowledgement character (ACK); in data communication, an acknowledgement character is a transmission control character sent by the receiver to the sender to indicate that the data sent has been received without error. Clearing persisted data on receipt of the indication information further saves storage space.
In some optional implementations of this embodiment, after sending the processing result data and its identifier to the downstream data processing node of the target data processing node, the method further comprises: in response to obtaining indication information indicating that processing of the processing result data has failed, replaying the persisted processing result data and identifier to the downstream data processing node of the target data processing node. The processing failure may be caused by, for example, a restart of the target data processing node. Replaying data prevents data from being missed in the streaming computing system and further improves data processing efficiency.
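The persist, send, acknowledge, and replay sequence described above can be sketched as follows. This is a simplified illustration under stated assumptions: an in-memory dictionary stands in for a durable store, and the `ResultBuffer` class, its method names, and the `send` callback are hypothetical.

```python
class ResultBuffer:
    """Keeps each processing result until the downstream node confirms it."""

    def __init__(self, send):
        self._send = send    # callable that delivers (identifier, result) downstream
        self._pending = {}   # stand-in for a durable store, keyed by identifier

    def emit(self, ident, result):
        """Persist the result first, then send it downstream."""
        self._pending[ident] = result
        self._send(ident, result)

    def on_ack(self, ident):
        """Downstream processed the result: clear the persisted copy."""
        self._pending.pop(ident, None)

    def on_fail(self, ident):
        """Downstream reported failure: replay the persisted copy."""
        if ident in self._pending:
            self._send(ident, self._pending[ident])

    def pending(self):
        """Return a copy of the results still awaiting acknowledgement."""
        return dict(self._pending)
```

Persisting before sending means a result can always be replayed after a failure, while clearing on acknowledgement keeps the store from growing without bound. The identifier is what ties the acknowledgement or failure notice back to the stored result.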
In this embodiment, the operations of steps 401, 402 and 403 are substantially the same as those of steps
201, 202 and 203, and are not described again here.
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the process 400 of the method
for processing data in this embodiment persists the processing result data and the identifier of the
processing result data, which avoids data loss in the stream computing system and further improves data
processing efficiency.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, this
application provides an embodiment of an apparatus for processing data. The apparatus embodiment corresponds
to the method embodiment shown in Fig. 2, and the apparatus may be applied in various electronic devices.
As shown in Fig. 5, the apparatus 500 for processing data of this embodiment includes a first acquisition
unit 501, a first sending unit 502 and a second acquisition unit 503. The first acquisition unit is
configured to obtain to-be-processed data and an identifier of the to-be-processed data from an upstream
data processing node of the target data processing node to which a data stream flows in a stream computing
system. The stream computing system includes a set of data processing nodes for processing the data stream;
the data stream flows in through a data processing node of the set that serves as the entry, and flows out
of the stream computing system after passing through at least one data processing node of the set. The
target data processing node includes an execution unit that executes a program segment, submitted by a user
of the stream computing system, that characterizes data processing logic. The first sending unit is
configured to send the to-be-processed data and the identifier of the to-be-processed data to the execution
unit. The second acquisition unit is configured to obtain the processing result data that the execution unit
generates for the to-be-processed data, together with the identifier of the processing result data.
In this embodiment, for the specific processing of the first acquisition unit 501, the first sending unit
502 and the second acquisition unit 503 of the apparatus 500 for processing data, reference may be made to
steps 201, 202 and 203 in the embodiment corresponding to Fig. 2.
In some optional implementations of this embodiment, the apparatus further includes: a persistence unit
configured to persist the processing result data and the identifier of the processing result data; a second
sending unit configured to send the processing result data and the identifier of the processing result data
to the downstream data processing node of the target data processing node; and a clearing unit configured
to, in response to obtaining indication information indicating that the processing result data has been
processed, clear the persisted processing result data and the persisted identifier of the processing result
data.
In some optional implementations of this embodiment, the apparatus further includes a replay unit configured
to, in response to obtaining indication information indicating that processing of the processing result data
failed, replay the persisted processing result data and the persisted identifier of the processing result
data to the downstream data processing node of the target data processing node.
In some optional implementations of this embodiment, the first sending unit is further configured to send the
to-be-processed data and the identifier of the to-be-processed data to the execution unit through a process
pipe.
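As a rough sketch of how data and identifiers might be passed to an execution unit over a process pipe, the
example below starts a child process that runs a program segment and exchanges rows with it through its
standard input and output. Everything specific here is an assumption made for illustration: the
`USER_SEGMENT` program, the `run_through_pipe` helper, and the tab-separated "identifier, value" row layout
are hypothetical and are not taken from the application.

```python
import subprocess
import sys

# Hypothetical user-submitted program segment: it reads "identifier<TAB>value"
# rows from stdin and writes the upper-cased value back with the same identifier.
USER_SEGMENT = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    rec_id, value = line.rstrip('\\n').split('\\t', 1)\n"
    "    print(rec_id + '\\t' + value.upper(), flush=True)\n"
)


def run_through_pipe(records):
    """Send (identifier, value) pairs to a child process over pipes.

    Returns the (identifier, result) pairs written back by the child.
    """
    # The child process plays the role of the execution unit.
    proc = subprocess.Popen(
        [sys.executable, "-c", USER_SEGMENT],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    payload = "".join(f"{rec_id}\t{value}\n" for rec_id, value in records)
    # communicate() writes the payload, closes stdin and collects stdout.
    out, _ = proc.communicate(payload)
    return [tuple(line.split("\t", 1)) for line in out.splitlines()]
```

Because the program segment only sees text on its standard streams, it can be written in any language that
can read and write lines, which is one plausible reason for choosing a pipe over an in-process call.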
In some optional implementations of this embodiment, the first sending unit is further configured to send the
to-be-processed data and the identifier of the to-be-processed data to the execution unit according to a
preset protocol. The preset protocol specifies the fields contained in a row of data sent to the target data
processing node and the delimiter used between fields. The fields include the identifier of the data, the
key of the data, the value of the data, and a flag used to determine whether the data has been processed.
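One possible encoding of such a row is sketched below. The text only states which fields a row contains and
that a delimiter separates them; the tab delimiter, the field order, the `1`/`0` encoding of the flag, and
the names `Row`, `encode_row` and `decode_row` are assumptions made for this example.

```python
from typing import NamedTuple

DELIMITER = "\t"  # assumed delimiter; the protocol only requires that one is defined


class Row(NamedTuple):
    record_id: str  # identifier of the data
    key: str        # key of the data
    value: str      # value of the data
    done: bool      # flag: has the data been processed?


def encode_row(row: Row) -> str:
    """Serialize a row as delimiter-separated text (without a trailing newline)."""
    fields = (row.record_id, row.key, row.value, "1" if row.done else "0")
    # Reject fields that would make the row ambiguous to split.
    if any(DELIMITER in f or "\n" in f for f in fields):
        raise ValueError("field contains the delimiter or a newline")
    return DELIMITER.join(fields)


def decode_row(line: str) -> Row:
    """Parse one delimiter-separated line back into a Row."""
    record_id, key, value, done = line.rstrip("\n").split(DELIMITER)
    return Row(record_id, key, value, done == "1")
```

For example, `encode_row(Row("42", "user", "alice", False))` yields the four fields joined by tabs, and
`decode_row` applied to that string returns the original row.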
In some optional implementations of this embodiment, the apparatus further includes a creating unit
configured to, in response to obtaining an execution request that includes an identifier of a target program
segment, create an execution unit to execute the target program segment.
In some optional implementations of this embodiment, the topological relationship of the data processing
nodes in the stream computing system is described by a preset configuration file, and the configuration file
includes an Extensible Markup Language (XML) configuration file. The apparatus further includes a
presentation unit configured to generate and present a topology diagram of the stream computing system based
on the configuration file.
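To illustrate how a topology diagram could be derived from an XML configuration file, the sketch below
parses a small configuration and emits a Graphviz DOT description of the node graph. The XML schema
(`topology`, `node`, `upstream`, and the `id` and `ref` attributes) is invented for this example, since the
application does not give the file layout; `load_topology` and `to_dot` are likewise hypothetical helpers.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: each node lists the upstream nodes it subscribes to.
CONFIG = """<topology>
  <node id="source"/>
  <node id="parse"><upstream ref="source"/></node>
  <node id="sink"><upstream ref="parse"/></node>
</topology>"""


def load_topology(xml_text):
    """Return a mapping from each node id to the ids of its downstream nodes."""
    root = ET.fromstring(xml_text)
    downstream = {node.get("id"): [] for node in root.iter("node")}
    for node in root.iter("node"):
        for up in node.findall("upstream"):
            # An edge runs from the upstream node to the subscribing node.
            downstream[up.get("ref")].append(node.get("id"))
    return downstream


def to_dot(downstream):
    """Render the topology as Graphviz DOT text for display."""
    lines = ["digraph topology {"]
    for src, targets in downstream.items():
        for dst in targets:
            lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)
```

The same downstream mapping could also serve to decide where a node sends its results, so one configuration
file can describe both the routing and the diagram.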
The apparatus provided by the above embodiment of this application obtains to-be-processed data and an
identifier of the to-be-processed data from an upstream data processing node of the target data processing
node to which a data stream flows in a stream computing system, sends the to-be-processed data and the
identifier of the to-be-processed data to the execution unit, and obtains the processing result data that
the execution unit generates for the to-be-processed data together with the identifier of the processing
result data, thereby improving data processing efficiency. Here, the stream computing system includes a set
of data processing nodes for processing the data stream; the data stream flows in through a data processing
node of the set that serves as the entry, and flows out of the stream computing system after passing through
at least one data processing node of the set; and the target data processing node includes an execution unit
that executes a program segment, submitted by a user of the stream computing system, that characterizes data
processing logic.
Referring now to Fig. 6, a schematic structural diagram of a computer system 600 of a server suitable for
implementing the embodiments of this application is shown. The server shown in Fig. 6 is only an example and
should not impose any limitation on the functions or scope of use of the embodiments of this application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform
various appropriate actions and processing according to a program stored in read-only memory (ROM) 602 or a
program loaded from a storage section 608 into random access memory (RAM) 603. The RAM 603 also stores
various programs and data required for the operation of the system 600. The CPU 601, the ROM 602 and the RAM
603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the
bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard,
a mouse and the like; an output section 607 including a cathode-ray tube (CRT), a liquid crystal display
(LCD), a loudspeaker and the like; a storage section 608 including a hard disk and the like; and a
communication section 609 including a network interface card such as a LAN card or a modem. The
communication section 609 performs communication processing via a network such as the Internet. A drive 610
is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an
optical disc, a magneto-optical disk or a semiconductor memory, is mounted on the drive 610 as needed, so
that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with
reference to the flowchart may be implemented as a computer software program. For example, embodiments of
the present disclosure include a computer program product that includes a computer program carried on a
computer-readable medium, the computer program containing program code for performing the method shown in
the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network
via the communication section 609, and/or installed from the removable medium 611. When the computer program
is executed by the central processing unit (CPU) 601, the above functions defined in the method of this
application are performed. It should be noted that the computer-readable medium described in this
application may be a computer-readable signal medium, a computer-readable medium, or any combination of the
two. A computer-readable medium may be, for example but not limited to, an electrical, magnetic, optical,
electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above.
More specific examples of computer-readable media may include, but are not limited to: an electrical
connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM),
read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber,
portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or
any suitable combination of the above. In this application, a computer-readable medium may be any tangible
medium that contains or stores a program that can be used by or in connection with an instruction execution
system, apparatus or device. Also in this application, a computer-readable signal medium may include a data
signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is
carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic
signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may
also be any computer-readable medium other than the computer-readable medium described above that can send,
propagate or transmit a program for use by or in connection with an instruction execution system, apparatus
or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium,
including but not limited to wireless, wire, optical cable, RF and the like, or any suitable combination of
the above.
Computer program code for performing the operations of this application may be written in one or more
programming languages or a combination thereof, including object-oriented programming languages such as
Java, Smalltalk and C++, as well as conventional procedural programming languages such as the C language or
similar programming languages. The program code may execute entirely on the user's computer, partly on the
user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote
computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote
computer may be connected to the user's computer through any kind of network, including a local area network
(LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the
Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architecture, functions and operations of
possible implementations of systems, methods and computer program products according to various embodiments
of this application. In this regard, each box in a flowchart or block diagram may represent a module, a
program segment or a portion of code, and the module, program segment or portion of code contains one or
more executable instructions for implementing the specified logical function. It should also be noted that,
in some alternative implementations, the functions noted in the boxes may occur in an order different from
that noted in the drawings. For example, two boxes shown in succession may in fact be executed substantially
in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It
should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the
block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the
specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of this application may be implemented in software or in hardware.
The described units may also be provided in a processor; for example, this may be described as: a processor
includes a first acquisition unit, a first sending unit and a second acquisition unit. The names of these
units do not, in certain cases, limit the units themselves; for example, the first sending unit may also be
described as "a unit configured to send the to-be-processed data and the identifier of the to-be-processed
data to the execution unit".
In another aspect, this application also provides a computer-readable medium. The computer-readable medium
may be included in the apparatus described in the above embodiments, or it may exist separately without
being assembled into the apparatus. The computer-readable medium carries one or more programs which, when
executed by the apparatus, cause the apparatus to: obtain to-be-processed data and an identifier of the
to-be-processed data from an upstream data processing node of the target data processing node to which a
data stream flows in a stream computing system, where the stream computing system includes a set of data
processing nodes for processing the data stream, the data stream flows in through a data processing node of
the set that serves as the entry and flows out of the stream computing system after passing through at least
one data processing node of the set, and the target data processing node includes an execution unit that
executes a program segment, submitted by a user of the stream computing system, that characterizes data
processing logic; send the to-be-processed data and the identifier of the to-be-processed data to the
execution unit; and obtain the processing result data that the execution unit generates for the
to-be-processed data together with the identifier of the processing result data.
The above description is only of the preferred embodiments of this application and an explanation of the
technical principles applied. Those skilled in the art should understand that the scope of the invention
involved in this application is not limited to technical solutions formed by the specific combination of the
above technical features, and should also cover other technical solutions formed by any combination of the
above technical features or their equivalents without departing from the above inventive concept, for
example, technical solutions formed by replacing the above features with (but not limited to) technical
features with similar functions disclosed in this application.
Claims (16)
1. A method for processing data, comprising:
obtaining to-be-processed data and an identifier of the to-be-processed data from an upstream data
processing node of a target data processing node to which a data stream flows in a stream computing system,
wherein the stream computing system comprises a set of data processing nodes for processing the data stream,
the data stream flows in through a data processing node of the set that serves as an entry and flows out of
the stream computing system after passing through at least one data processing node of the set, and the
target data processing node comprises an execution unit that executes a program segment, submitted by a user
of the stream computing system, that characterizes data processing logic;
sending the to-be-processed data and the identifier of the to-be-processed data to the execution unit; and
obtaining processing result data, corresponding to the to-be-processed data, generated by the execution unit
when run, and an identifier of the processing result data.
2. The method according to claim 1, wherein after obtaining the processing result data, corresponding to the
to-be-processed data, generated by the execution unit when run, the method further comprises:
persisting the processing result data and the identifier of the processing result data;
sending the processing result data and the identifier of the processing result data to a downstream data
processing node of the target data processing node; and
in response to obtaining indication information indicating that the processing result data has been
processed, clearing the persisted processing result data and the persisted identifier of the processing
result data.
3. The method according to claim 2, wherein after sending the processing result data and the identifier of
the processing result data to the downstream data processing node of the target data processing node, the
method further comprises:
in response to obtaining indication information indicating that processing of the processing result data
failed, replaying the persisted processing result data and the persisted identifier of the processing result
data to the downstream data processing node of the target data processing node.
4. The method according to claim 1, wherein sending the to-be-processed data and the identifier of the
to-be-processed data to the execution unit comprises:
sending the to-be-processed data and the identifier of the to-be-processed data to the execution unit
through a process pipe.
5. The method according to claim 1, wherein sending the to-be-processed data and the identifier of the
to-be-processed data to the execution unit comprises:
sending the to-be-processed data and the identifier of the to-be-processed data to the execution unit
according to a preset protocol, wherein the preset protocol specifies the fields contained in a row of data
sent to the target data processing node and the delimiter used between fields, and the fields comprise an
identifier of the data, a key of the data, a value of the data, and a flag for determining whether the data
has been processed.
6. The method according to claim 1, wherein the method further comprises:
in response to obtaining an execution request that includes an identifier of a target program segment,
creating an execution unit to execute the target program segment.
7. The method according to any one of claims 1-6, wherein the topological relationship of the data
processing nodes in the stream computing system is described by a preset configuration file, and the
configuration file comprises an Extensible Markup Language configuration file; and
the method further comprises:
generating and presenting a topology diagram of the stream computing system based on the configuration file.
8. An apparatus for processing data, comprising:
a first acquisition unit configured to obtain to-be-processed data and an identifier of the to-be-processed
data from an upstream data processing node of a target data processing node to which a data stream flows in
a stream computing system, wherein the stream computing system comprises a set of data processing nodes for
processing the data stream, the data stream flows in through a data processing node of the set that serves
as an entry and flows out of the stream computing system after passing through at least one data processing
node of the set, and the target data processing node comprises an execution unit that executes a program
segment, submitted by a user of the stream computing system, that characterizes data processing logic;
a first sending unit configured to send the to-be-processed data and the identifier of the to-be-processed
data to the execution unit; and
a second acquisition unit configured to obtain processing result data, corresponding to the to-be-processed
data, generated by the execution unit when run, and an identifier of the processing result data.
9. The apparatus according to claim 8, wherein the apparatus further comprises:
a persistence unit configured to persist the processing result data and the identifier of the processing
result data;
a second sending unit configured to send the processing result data and the identifier of the processing
result data to a downstream data processing node of the target data processing node; and
a clearing unit configured to, in response to obtaining indication information indicating that the
processing result data has been processed, clear the persisted processing result data and the persisted
identifier of the processing result data.
10. The apparatus according to claim 9, wherein the apparatus further comprises:
a replay unit configured to, in response to obtaining indication information indicating that processing of
the processing result data failed, replay the persisted processing result data and the persisted identifier
of the processing result data to the downstream data processing node of the target data processing node.
11. The apparatus according to claim 8, wherein the first sending unit is further configured to send the
to-be-processed data and the identifier of the to-be-processed data to the execution unit through a process
pipe.
12. The apparatus according to claim 8, wherein the first sending unit is further configured to send the
to-be-processed data and the identifier of the to-be-processed data to the execution unit according to a
preset protocol, wherein the preset protocol specifies the fields contained in a row of data sent to the
target data processing node and the delimiter used between fields, and the fields comprise an identifier of
the data, a key of the data, a value of the data, and a flag for determining whether the data has been
processed.
13. The apparatus according to claim 8, wherein the apparatus further comprises:
a creating unit configured to, in response to obtaining an execution request that includes an identifier of
a target program segment, create an execution unit to execute the target program segment.
14. The apparatus according to any one of claims 8-13, wherein the topological relationship of the data
processing nodes in the stream computing system is described by a preset configuration file, and the
configuration file comprises an Extensible Markup Language configuration file; and
the apparatus further comprises:
a presentation unit configured to generate and present a topology diagram of the stream computing system
based on the configuration file.
15. An electronic device, comprising:
one or more processors; and
a storage device storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more
processors to implement the method according to any one of claims 1-7.
16. A computer-readable medium storing a computer program, wherein the program, when executed by a
processor, implements the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811003311.8A CN109145023B (en) | 2018-08-30 | 2018-08-30 | Method and apparatus for processing data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811003311.8A CN109145023B (en) | 2018-08-30 | 2018-08-30 | Method and apparatus for processing data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109145023A (en) | 2019-01-04 |
CN109145023B (en) | 2020-11-27 |
Family
ID=64829523
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811003311.8A Active CN109145023B (en) | 2018-08-30 | 2018-08-30 | Method and apparatus for processing data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109145023B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110460495A (en) * | 2019-08-01 | 2019-11-15 | 北京百度网讯科技有限公司 | A kind of water level propulsion method, device, calculate node and storage medium |
CN111142925A (en) * | 2019-12-23 | 2020-05-12 | 山东浪潮通软信息科技有限公司 | Pipeline type data processing method, equipment and storage medium |
CN111324345A (en) * | 2020-03-19 | 2020-06-23 | 北京奇艺世纪科技有限公司 | Data processing mode generation method, data processing method and device and electronic equipment |
CN111435939A (en) * | 2019-01-14 | 2020-07-21 | 百度在线网络技术(北京)有限公司 | Method and device for dividing storage space of node |
CN111488495A (en) * | 2020-04-14 | 2020-08-04 | 北京字节跳动网络技术有限公司 | Information processing method and device |
CN111930748A (en) * | 2020-08-07 | 2020-11-13 | 北京百度网讯科技有限公司 | Data tracking method, device, equipment and storage medium for streaming computing system |
WO2021212385A1 (en) * | 2020-04-22 | 2021-10-28 | 深圳市欢太科技有限公司 | Data testing method and device, server, and data processing system |
CN113965511A (en) * | 2020-07-02 | 2022-01-21 | 北京瀚海云星科技有限公司 | Tag data transmission method based on RDMA (remote direct memory Access), and related device and system |
CN114090481A (en) * | 2020-07-02 | 2022-02-25 | 北京瀚海云星科技有限公司 | Data sending method, data receiving method and related device |
CN114328501A (en) * | 2020-09-29 | 2022-04-12 | 华为技术有限公司 | Data processing method, device and equipment |
CN116662325A (en) * | 2023-07-24 | 2023-08-29 | 宁波森浦信息技术有限公司 | Data processing method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090063515A1 (en) * | 2007-09-05 | 2009-03-05 | International Business Machines Corporation | Optimization model for processing hierarchical data in stream systems |
CN105959151A (en) * | 2016-06-22 | 2016-09-21 | 中国工商银行股份有限公司 | High availability stream processing system and method |
CN107046510A (en) * | 2017-01-13 | 2017-08-15 | 广西电网有限责任公司电力科学研究院 | A kind of node and its system of composition suitable for distributed computing system |
CN107229747A (en) * | 2017-06-26 | 2017-10-03 | 湖南星汉数智科技有限公司 | A kind of large-scale data processing unit and method based on Stream Processing framework |
CN107277087A (en) * | 2016-04-06 | 2017-10-20 | 阿里巴巴集团控股有限公司 | Data processing method and device |
-
2018
- 2018-08-30 CN CN201811003311.8A patent/CN109145023B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090063515A1 (en) * | 2007-09-05 | 2009-03-05 | International Business Machines Corporation | Optimization model for processing hierarchical data in stream systems |
CN107277087A (en) * | 2016-04-06 | 2017-10-20 | 阿里巴巴集团控股有限公司 | Data processing method and device |
CN105959151A (en) * | 2016-06-22 | 2016-09-21 | 中国工商银行股份有限公司 | High availability stream processing system and method |
CN107046510A (en) * | 2017-01-13 | 2017-08-15 | 广西电网有限责任公司电力科学研究院 | A kind of node and its system of composition suitable for distributed computing system |
CN107229747A (en) * | 2017-06-26 | 2017-10-03 | 湖南星汉数智科技有限公司 | A kind of large-scale data processing unit and method based on Stream Processing framework |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111435939A (en) * | 2019-01-14 | 2020-07-21 | 百度在线网络技术(北京)有限公司 | Method and device for dividing storage space of node |
CN110460495A (en) * | 2019-08-01 | 2019-11-15 | 北京百度网讯科技有限公司 | A kind of water level propulsion method, device, calculate node and storage medium |
CN110460495B (en) * | 2019-08-01 | 2024-02-23 | 北京百度网讯科技有限公司 | Water level propelling method and device, computing node and storage medium |
CN111142925A (en) * | 2019-12-23 | 2020-05-12 | 山东浪潮通软信息科技有限公司 | Pipeline type data processing method, equipment and storage medium |
CN111324345A (en) * | 2020-03-19 | 2020-06-23 | 北京奇艺世纪科技有限公司 | Data processing mode generation method, data processing method and device and electronic equipment |
CN111488495A (en) * | 2020-04-14 | 2020-08-04 | 北京字节跳动网络技术有限公司 | Information processing method and device |
WO2021212385A1 (en) * | 2020-04-22 | 2021-10-28 | 深圳市欢太科技有限公司 | Data testing method and device, server, and data processing system |
CN113965511A (en) * | 2020-07-02 | 2022-01-21 | 北京瀚海云星科技有限公司 | Tag data transmission method based on RDMA (remote direct memory Access), and related device and system |
CN114090481A (en) * | 2020-07-02 | 2022-02-25 | 北京瀚海云星科技有限公司 | Data sending method, data receiving method and related device |
CN111930748B (en) * | 2020-08-07 | 2023-08-08 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for tracking data of streaming computing system |
CN111930748A (en) * | 2020-08-07 | 2020-11-13 | 北京百度网讯科技有限公司 | Data tracking method, device, equipment and storage medium for streaming computing system |
CN114328501A (en) * | 2020-09-29 | 2022-04-12 | 华为技术有限公司 | Data processing method, device and equipment |
CN116662325A (en) * | 2023-07-24 | 2023-08-29 | 宁波森浦信息技术有限公司 | Data processing method and system |
CN116662325B (en) * | 2023-07-24 | 2023-11-10 | 宁波森浦信息技术有限公司 | Data processing method and system |
Also Published As
Publication number | Publication date |
---|---|
CN109145023B (en) | 2020-11-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109145023A (en) | Method and apparatus for handling data | |
CN109523187A (en) | Method for scheduling task, device and equipment | |
CN110245011A (en) | A kind of method for scheduling task and device | |
CN109033001A (en) | Method and apparatus for distributing GPU | |
CN108763534B (en) | Method and apparatus for handling information | |
US20200004464A1 (en) | Method and apparatus for storing data | |
CN109309736A (en) | The generation method and generating means of globally unique ID | |
CN110213614A (en) | The method and apparatus of key frame are extracted from video file | |
US11502899B2 (en) | Dynamic product installation based on user feedback | |
CN110427304A (en) | O&M method, apparatus, electronic equipment and medium for banking system | |
CN110391938A (en) | Method and apparatus for deployment services | |
CN110334109A (en) | Relational database data query method, system, medium and electronic equipment | |
CN108965098A (en) | Based on information push method, device, medium and the electronic equipment being broadcast live online | |
CN109976919A (en) | A kind of transmission method and device of message request | |
CN111610938B (en) | Distributed data code storage method, electronic device and computer readable storage medium | |
CN111044062A (en) | Path planning and recommending method and device | |
CN110109912A (en) | A kind of identifier generation method and device | |
CN108984770A (en) | Method and apparatus for handling data | |
CN109005250A (en) | Method and apparatus for accessing server-side | |
CN110381471A (en) | The method and apparatus for determining optimum base station for unmanned vehicle | |
CN110830427A (en) | Method and device for message encoding and message decoding in netty environment | |
CN111414161B (en) | Method, device, medium and electronic equipment for generating IDL file | |
CN114461582A (en) | File processing method, device, equipment and storage medium | |
CN108092858B (en) | For switching the method and device of agent node | |
CN112732835A (en) | Block chain-based heterogeneous data storage method and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||