CN108076111A - System and method for distributing data in a big data platform - Google Patents

System and method for distributing data in a big data platform

Info

Publication number
CN108076111A
Authority
CN
China
Prior art keywords
data
asynchronous
distribution
module
multi-dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611029700.9A
Other languages
Chinese (zh)
Other versions
CN108076111B (en)
Inventor
周伟
俞力
赵贵阳
周春楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
YIYANG SAFETY TECHNOLOGY Co Ltd
Original Assignee
YIYANG SAFETY TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by YIYANG SAFETY TECHNOLOGY Co Ltd filed Critical YIYANG SAFETY TECHNOLOGY Co Ltd
Priority to CN201611029700.9A priority Critical patent/CN108076111B/en
Publication of CN108076111A publication Critical patent/CN108076111A/en
Application granted granted Critical
Publication of CN108076111B publication Critical patent/CN108076111B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H04L67/50 Network services
    • H04L67/55 Push-based network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/562 Brokering proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H04L67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a system and method for distributing data in a big data platform. A big data distribution unit built on asynchronous I/O performs high-speed distribution of big data, and the separation of server-side and client threads improves distribution throughput. A multi-dimensional structured storage unit preserves data integrity, while the big data management center and the data bus module guarantee the accuracy and correctness of data distribution, so that all parts cooperate and run at high speed without waiting on one another for resources and make full use of the resources. The system as a whole also exhibits good scalability.

Description

System and method for distributing data in a big data platform
Technical field
The present invention relates to the field of data security, and more particularly to a method and system for distributing data in a big data platform.
Background technology
In the prior art, big data platforms based on the Hadoop framework offer enhanced scalability, high reliability and high fault tolerance. At this stage, in-memory databases (Memory DB), non-relational database technology (NoSQL) and caching technology (Cache) are widely used for large-scale data query and data flow, and good progress has been made. However, in actual big data business processing, such as WAP internet logs, large-scale user mail systems, blog log analysis, and user information tracking and analysis, current big data platforms have defects in their data I/O processing methods; for unstructured and semi-structured businesses with large data volumes in particular, I/O processing speed suffers from serious problems, mainly reflected in the following:
1. With large data volumes, especially when large volumes are written continuously, I/O performance is slow, and the I/O speed-up ratio is non-linear in the number of server nodes;
2. When processing unstructured and semi-structured data such as logs, blogs, video and social-relationship information, storage is not optimized by data type and characteristics, so processing is relatively slow;
3. Because multiple services write synchronously, synchronization takes a long time when the network and storage devices are in disordered states, so data-consistency processing is costly in time.
Therefore, improving high-speed data distribution and I/O operation has become the most important goal for improving big data platform performance.
Summary of the invention
The object of the present invention is achieved through the following technical solutions.
According to an embodiment of the present invention, a system for distributing data in a big data platform is proposed. The system specifically includes: a big data distribution unit, a data bus module, a management center and a big data adaptation module; wherein,
the big data distribution unit is configured to receive the data sent by multiple clients and store it in its own cache, to obtain data distribution rules from the data bus module and perform distribution allocation on the cached data, and then to distribute the data to the destination servers;
the data bus module is a distributed publish-subscribe messaging system that persistently stores the data distribution rules for concurrent reading by the big data distribution unit;
the management center formulates the data distribution rules according to resource information and sends them to the data bus module for storage;
the big data adaptation module is configured to collect the resource information of the big data distribution unit and the destination servers and submit it to the management center.
Preferably, the big data distribution unit specifically includes:
a load balancing module, configured to receive the data sent by multiple clients and, using a load-balancing algorithm, buffer the data into the cache of each asynchronous server;
an asynchronous server, configured to cache the data balanced by the load balancing module, to obtain the data distribution rules from the data bus module, and to perform structural remodeling of the data in its own cache according to the distribution rules, generating new multi-dimensional structured data that contains the distribution resource information and storing it in the multi-dimensional structured storage unit;
a multi-dimensional structured storage unit, configured to store the multi-dimensional structured data generated when the multiple asynchronous servers remodel the data;
an asynchronous client, configured to obtain the data in the multi-dimensional structured storage unit, find the resource information of the distribution destination server, and distribute the data in the multi-dimensional structured storage unit to the destination server.
In particular, the asynchronous server interacts with the asynchronous client in asynchronous I/O mode.
Preferably, the asynchronous server further specifically includes:
a configuration module, configured to configure the services provided by the asynchronous server and send the configuration information to the management center;
a data acquisition module, configured to receive the data balanced by the load balancing module and store it in the asynchronous server's own cache;
a distribution rule acquisition module, configured to obtain the data distribution rules from the data bus module;
a structural remodeling module, configured to load the distribution rules onto the data in the cache, composing distribution packets that contain source address and port, destination address and port, connection protocol and a data portion, and to pack them into a bidirectionally open continuous data queue, the queue being a double-ended queue allowing insertion and deletion at both head and tail;
an asynchronous client response module, configured to respond to requests from the asynchronous client.
Preferably, the asynchronous client further specifically includes:
a data distribution module, which obtains the data in the multi-dimensional structured storage unit through the asynchronous server, finds the resource information of the distribution destination server, and distributes the data in the multi-dimensional structured storage unit to the destination server; and
a detection module, which waits for the asynchronous-operation completion signal of the destination server and checks the signal; if the signal indicates that the data was distributed successfully, the distributed data portion is deleted from the multi-dimensional structured storage unit; if the signal indicates that the distribution failed, the data distribution module is called to retransmit the data.
According to another embodiment of the present invention, a method for distributing data in a big data platform, executed by the above system, is also provided. The method includes the following steps:
the big data distribution unit configures the services it provides and sends the configuration information to the management center;
the big data adaptation module collects the configuration information of the big data distribution unit and the resource information of the destination servers and submits them to the management center;
the management center formulates data distribution rules according to the configuration information and the resource information and sends them to the data bus module for storage; the data bus module is a distributed publish-subscribe messaging system that persistently stores the data distribution rules for concurrent reading by the big data distribution unit;
the big data distribution unit receives the data sent by multiple clients and stores it in the cache;
the big data distribution unit obtains the data distribution rules from the data bus module and performs distribution allocation on the data in the cache;
the big data distribution unit distributes the data to the destination servers.
Preferably, the big data distribution unit receives the data sent by the multiple clients through multiple load balancing modules and, using a load-balancing algorithm, buffers the data into the cache of each asynchronous server.
Preferably, the big data distribution unit obtains the data distribution rules from the data bus module through the asynchronous server, performs structural remodeling of the data in the cache according to the distribution rules, generates new multi-dimensional structured data containing the distribution resource information, and stores it in the multi-dimensional structured storage unit; the big data distribution unit obtains the data in the multi-dimensional structured storage unit through the asynchronous client, finds the resource information of the distribution destination server, and distributes the data in the multi-dimensional structured storage unit to the destination server.
The asynchronous server performing structural remodeling of the data in its own cache and generating new multi-dimensional structured data specifically includes:
the asynchronous server loads the distribution rules onto the data in its own cache, composing distribution packets that contain source address and port, destination address and port, connection protocol and a data portion, and packs them into a bidirectionally open continuous data queue, the queue being a double-ended queue allowing insertion and deletion at both head and tail.
The asynchronous server interacts with the asynchronous client in asynchronous I/O mode.
Preferably, after the asynchronous client distributes the data to the destination server, the method further includes:
the asynchronous client waits for the asynchronous-operation completion signal of the destination server and checks the signal;
if the signal indicates that the data was distributed successfully, the distributed data portion is deleted from the multi-dimensional structured storage unit;
if the signal indicates that the distribution failed, the data is distributed to the destination server again.
The system and method of the present invention for distributing data in a big data platform build the big data distribution unit on asynchronous I/O to distribute big data at high speed, separate server-side and client threads to improve distribution throughput, preserve data integrity through the multi-dimensional structured storage unit, and guarantee the accuracy and correctness of data distribution with the big data management center and the data bus module, so that all parts cooperate and run at high speed without waiting on one another for resources and make full use of the resources. The system as a whole also exhibits good scalability.
Description of the drawings
By reading the following detailed description of the preferred embodiments, various other advantages and benefits will become apparent to those of ordinary skill in the art. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered as limiting the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Figure 1 shows a schematic structural diagram of the system for distributing data in a big data platform according to an embodiment of the present invention;
Figure 2 shows a schematic structural diagram of the big data distribution unit according to an embodiment of the present invention;
Figure 3 shows a schematic structural diagram of the asynchronous server according to an embodiment of the present invention;
Figure 4 shows a flow chart of the method for distributing data in a big data platform according to another embodiment of the present invention.
Specific embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure will be thoroughly understood and its scope fully conveyed to those skilled in the art.
According to an embodiment of the present invention, a system for distributing data in a big data platform is proposed. As shown in Figure 1, the system specifically includes: a big data distribution unit M101, a data bus module M102, a management center M103 and a big data adaptation module M104; wherein,
the big data distribution unit is configured to receive the massive data sent by multiple clients and store it in its own cache, to obtain data distribution rules from the data bus module and perform distribution allocation on the cached data, and then to distribute the data to the destination servers;
the data bus module is a distributed publish-subscribe messaging system that persistently stores the data distribution rules so that the big data distribution unit can read them concurrently and efficiently at any time;
the management center formulates the data distribution rules according to resource information and sends them to the data bus module for storage;
the big data adaptation module is configured to collect the resource information of the big data distribution unit and the destination servers and submit it to the management center; the resource information includes at least a connection protocol, an IP address and a port.
Preferably, as shown in Figure 2, the big data distribution unit specifically includes:
a load balancing module, configured to receive the massive data sent by multiple clients and, using a load-balancing algorithm, buffer the data into the cache of each asynchronous server;
an asynchronous server, configured to cache the data balanced by the load balancing module, to obtain the data distribution rules from the data bus module, and to perform structural remodeling of the data in its own cache according to the distribution rules, generating new multi-dimensional structured data that contains the distribution resource information and storing it in the multi-dimensional structured storage unit;
a multi-dimensional structured storage unit, configured to store the multi-dimensional structured data generated when the multiple asynchronous servers remodel the data;
an asynchronous client, configured to obtain the data in the multi-dimensional structured storage unit, find the resource information of the distribution destination server, and distribute the data in the multi-dimensional structured storage unit to the destination server.
In particular, the asynchronous server interacts with the asynchronous client in asynchronous I/O mode.
Preferably, as shown in Figure 3, the asynchronous server further specifically includes:
a configuration module, configured to configure the services provided by the asynchronous server and send the configuration information to the management center;
a data acquisition module, configured to receive the data balanced by the load balancing module and store it in the asynchronous server's own cache;
a distribution rule acquisition module, configured to obtain the data distribution rules from the data bus module;
a structural remodeling module, configured to load the distribution rules onto the data in the cache, composing distribution packets that contain source address and port, destination address and port, connection protocol and a data portion, and to pack them into a bidirectionally open continuous data queue, the queue being an efficient double-ended queue allowing insertion and deletion at both head and tail;
an asynchronous client response module, configured to respond to requests from the asynchronous client.
Preferably, the asynchronous client further specifically includes:
a data distribution module, which obtains the data in the multi-dimensional structured storage unit through the asynchronous server, finds the resource information of the distribution destination server, and distributes the data in the multi-dimensional structured storage unit to the destination server; and
a detection module, which waits for the asynchronous-operation completion signal of the destination server and checks the signal; if the signal indicates that the data was distributed successfully, the distributed data portion is deleted from the multi-dimensional structured storage unit; if the signal indicates that the distribution failed, the data distribution module is called to retransmit the data.
According to another embodiment of the present invention, a method for distributing data in a big data platform, executed by the above system, is also provided. As shown in Figure 4, the method includes the following steps:
the big data distribution unit configures the services it provides and sends the configuration information to the management center;
the big data adaptation module collects the configuration information of the big data distribution unit and the resource information of the destination servers and submits them to the management center; the resource information includes at least a connection protocol, an IP address and a port;
the management center formulates data distribution rules according to the configuration information and the resource information and sends them to the data bus module for storage; the data bus module is a distributed publish-subscribe messaging system that persistently stores the data distribution rules so that the big data distribution unit can read them concurrently and efficiently at any time, which guarantees the consistency and correctness of the distribution rules.
For example: the management center first "formats" the resource information of the destination servers. The data formed after formatting is: connection protocol://username:password@hostname@ip-address:port, such as ssh://zhangsan:123456@localhost@127.0.0.1:22......; the management center then arranges the "formatted resource information" by protocol into distribution rules, whose concrete form is: connection protocol://username:password@gateway-name@gateway-ip-address:source-port->connection protocol://username:password@destination-hostname@destination-ip-address:destination-port, for example: {[ssh://zhangsan:123456@host1@192.168.0.1@22->ssh://lisi:654321@host2@192.168.0.2:8022][…][…]…}.
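For illustration only, the "formatting" step above could be expressed in Python roughly as follows; the helper names are hypothetical and only the string layout follows the example in the preceding paragraph (assuming the colon form protocol://user:password@host@ip:port for both ends of a rule):

    def format_resource(protocol, user, password, host, ip, port):
        # Hypothetical helper producing the "formatted resource information"
        # string: protocol://user:password@hostname@ip:port
        return "{}://{}:{}@{}@{}:{}".format(protocol, user, password, host, ip, port)

    def format_rule(source, destination):
        # Combine a formatted gateway entry and a formatted destination entry
        # into one distribution rule of the form source->destination.
        return "{}->{}".format(source, destination)

    rule = format_rule(
        format_resource("ssh", "zhangsan", "123456", "host1", "192.168.0.1", 22),
        format_resource("ssh", "lisi", "654321", "host2", "192.168.0.2", 8022))
    # "ssh://zhangsan:123456@host1@192.168.0.1:22->ssh://lisi:654321@host2@192.168.0.2:8022"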
The big data distribution unit receives the massive data sent by multiple clients and stores it in the cache;
the big data distribution unit obtains the data distribution rules from the data bus module and performs distribution allocation on the data in the cache;
the big data distribution unit distributes the data to the destination servers.
Preferably, the big data distribution unit receives the massive data sent by the multiple clients through multiple load balancing modules and, using a load-balancing algorithm, buffers the data into the cache of each asynchronous server.
Preferably, the big data distribution unit obtains the data distribution rules from the data bus module through the asynchronous server, performs structural remodeling of the data in the cache according to the distribution rules, generates new multi-dimensional structured data containing the distribution resource information, and stores it in the multi-dimensional structured storage unit; the big data distribution unit obtains the data in the multi-dimensional structured storage unit through the asynchronous client, finds the resource information of the distribution destination server, and distributes the data in the multi-dimensional structured storage unit to the destination server.
The asynchronous server performing structural remodeling of the data in its own cache and generating new multi-dimensional structured data specifically includes:
the asynchronous server loads the distribution rules onto the data in its own cache, composing distribution packets that contain source address and port, destination address and port, connection protocol and a data portion, and packs them into a bidirectionally open continuous data queue, the queue being an efficient double-ended queue allowing insertion and deletion at both head and tail. The asynchronous server interacts with the asynchronous client in asynchronous I/O mode.
Preferably, after the asynchronous client distributes the data to the destination server, the method further includes:
the asynchronous client waits for the asynchronous-operation completion signal of the destination server and checks the signal;
if the signal indicates that the data was distributed successfully, the distributed data portion is deleted from the multi-dimensional structured storage unit;
if the signal indicates that the distribution failed, the data is distributed to the destination server again.
The specific implementation of the core of the present application, i.e. the asynchronous I/O part, is described in detail below. The asynchronous I/O part is implemented as an asynchronous processing flow, which specifically includes the following:
Multiple load balancing modules receive the data sent by the clients through a Linux Virtual Server cluster (LVS) and use load-balancing techniques to send the data to the asynchronous servers. The load-balancing techniques include DNS load balancing, HTTP load balancing, IP load balancing, link-layer load balancing and hybrid load balancing.
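As a purely illustrative sketch (not LVS itself, nor any specific one of the techniques listed above), a source-IP-hash scheduler that assigns each client to one of the asynchronous server endpoints might look like this; the backend addresses are assumptions:

    import hashlib

    # Hypothetical pool of asynchronous server endpoints (ip, port).
    BACKENDS = [("10.0.0.1", 9001), ("10.0.0.2", 9001), ("10.0.0.3", 9001)]

    def pick_backend(client_ip):
        # Source-IP hashing: the same client is always routed to the same
        # asynchronous server, while different clients spread across the pool.
        digest = hashlib.md5(client_ip.encode("utf-8")).hexdigest()
        return BACKENDS[int(digest, 16) % len(BACKENDS)]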
The asynchronous I/O of the present invention uses the asyncio asynchronous module provided by the Python language. The asynchronous server uses get_event_loop in the asyncio module to rewrite the service API into an asynchronous service, listens on its own port, and cyclically calls the run_until_complete method to receive the data sent by LVS.
After the asynchronous server receives the data, it calls a background method-overloading API to write the data into its own cache.
The asynchronous server feeds a reception-complete signal back to LVS.
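A minimal sketch of such an asynchronous server built on asyncio, using get_event_loop and run_until_complete as described above; the TCP transport, port number, payload framing and the literal b"OK" acknowledgement are assumptions for illustration, not the patent's actual API:

    import asyncio

    CACHE = []  # stand-in for the asynchronous server's own cache

    async def handle(reader, writer):
        # Receive one payload forwarded by the load balancer, write it into
        # the local cache and send back a reception-complete signal.
        data = await reader.read(65536)
        CACHE.append(data)
        writer.write(b"OK")
        await writer.drain()
        writer.close()

    loop = asyncio.get_event_loop()
    server = loop.run_until_complete(asyncio.start_server(handle, "0.0.0.0", 9001))
    try:
        loop.run_forever()  # keep servicing connections from the load balancer
    finally:
        server.close()
        loop.run_until_complete(server.wait_closed())
        loop.close()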
The asynchronous server remodels the data in its own cache into a data structure that is convenient for real-time access; the deque data structure provided by the Python language is used here, which has first-in, first-out behaviour, excellent performance, and is itself resistant to deadlock.
The management center collects the destination server information through the big data adaptation module and translates it into a set of distribution rules with which the asynchronous servers can interoperate. The distribution rules may take JSON, dict or XML form, are serialized with the Protocol Buffers technology provided by Google, and are then stored in a bus built on the Kafka technology stack.
The asynchronous server uses Kafka to read and parse the distribution rules from the bus and stores the rules in its own cache.
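A minimal sketch of this publish/subscribe path using the kafka-python client, with JSON serialization standing in for Protocol Buffers; the topic name, broker address and message layout are assumptions:

    import json
    from kafka import KafkaProducer, KafkaConsumer  # kafka-python client

    RULE_TOPIC = "distribution-rules"  # hypothetical topic name

    # Management-center side: publish the current rule set onto the bus.
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda r: json.dumps(r).encode("utf-8"))
    producer.send(RULE_TOPIC, {"rules": [
        "ssh://zhangsan:123456@host1@192.168.0.1:22->ssh://lisi:654321@host2@192.168.0.2:8022"]})
    producer.flush()

    # Asynchronous-server side: read the bus and cache the latest rule set locally.
    consumer = KafkaConsumer(
        RULE_TOPIC, bootstrap_servers="localhost:9092",
        auto_offset_reset="earliest",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")))
    rule_cache = next(iter(consumer)).value  # keep the most recent rule update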
The asynchronous server loads the distribution rules onto the data held in the deque structure, and on the basis of the deque can regenerate more dimensions and greater depth according to the distribution rules.
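The loading of distribution rules onto the cached data can be illustrated with the following sketch; the field names and the build_packet helper are hypothetical, and only the use of a double-ended queue mirrors the structure described in the text:

    from collections import deque

    dispatch_queue = deque()  # double-ended queue: insert/delete at both head and tail

    def build_packet(rule, payload):
        # Hypothetical helper wrapping one cached payload with the routing fields
        # named in the text: source/destination address and port, protocol, data.
        return {"src_addr": rule["src_addr"], "src_port": rule["src_port"],
                "dst_addr": rule["dst_addr"], "dst_port": rule["dst_port"],
                "protocol": rule["protocol"], "data": payload}

    def remodel(cache, rule):
        # Load the distribution rule onto every cached payload and enqueue
        # the resulting distribution packets for the asynchronous client.
        for payload in cache:
            dispatch_queue.append(build_packet(rule, payload))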
The asynchronous server performs high-speed distribution of the data through asyncio asynchronous clients according to the rules; optional clients include aiohttp, paramiko and the like.
The asynchronous server uses the async and await keywords to make functions asynchronous: it first obtains the response asynchronously and then asynchronously reads the content of the response. Requests are initiated through the client Session as the main interface. The client Session allows cookies and related object information to be preserved across multiple requests. A Session must be closed after use, and closing a Session is itself another asynchronous operation, so the async with keywords are used each time to make it asynchronous.
The asynchronous server establishes a client session, initiates requests with it, and starts multiple other asynchronous operations. Once the asynchronous distribution program is running normally, the asynchronous server can also add other data from its cache to the event loop.
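A minimal sketch of such an asynchronous client using aiohttp (one of the optional clients named above); the URL layout, the use of HTTP POST and the dispatch_queue name reuse the hypothetical structures from the earlier sketches:

    import asyncio
    import aiohttp

    async def distribute(packets):
        # One client session is reused across all requests and closed automatically.
        async with aiohttp.ClientSession() as session:
            for packet in packets:
                url = "http://{}:{}/ingest".format(packet["dst_addr"], packet["dst_port"])
                async with session.post(url, data=packet["data"]) as resp:
                    body = await resp.read()  # asynchronously read the response content
                    print(resp.status, len(body))

    # asyncio.get_event_loop().run_until_complete(distribute(list(dispatch_queue)))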
After the data has been distributed, the asynchronous server asynchronously waits for the detection signal sent by the destination server and keeps the distributed data directly in its own cache; when the completion signal sent by the destination server is received asynchronously, the corresponding part of its own cache is released.
When the asynchronous server receives a reception-failure signal from the destination server, it reconstructs that part of the data from its own cache, pushes it into the high-speed data structure again, resends it to the destination server, and repeats the above two steps.
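The wait-for-signal and retransmission behaviour described in the last two paragraphs could be sketched as follows; send_once and release_from_cache are assumed stand-ins, not APIs defined by the patent:

    import asyncio

    async def send_once(packet):
        # Assumed stand-in for one asynchronous distribution attempt; returns True
        # when the destination server reports successful reception.
        await asyncio.sleep(0)
        return True

    def release_from_cache(packet):
        # Assumed stand-in for deleting the distributed portion from the cache /
        # multi-dimensional structured storage unit.
        pass

    async def send_with_ack(packet, max_retries=3):
        # Keep the packet cached until the destination acknowledges it; on a
        # failure signal, rebuild and resend (here simply retried in a loop).
        for _ in range(max_retries):
            if await send_once(packet):
                release_from_cache(packet)
                return True
        return False  # still failing: the packet stays cached for a later attempt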
In the above example, LVS stands for Linux Virtual Server (a Linux virtual server cluster), which comprises a load balancer (load dispatcher) responsible for collecting the clients' data requests and passing the data to a group of servers for caching, a server pool that executes the clients' data requests, and shared storage that provides data storage for the server pool. This system provides the data source for the asynchronous servers.
In the above example, Kafka technology is required to guarantee the consistency of the distribution rules. Kafka is a distributed publish-subscribe messaging system that provides high throughput for both publishing and subscribing; it supports multiple subscribers and automatically rebalances consumers on failure; and it persists messages to disk, so it can be used for batch consumption (such as data-warehouse ETL) as well as for real-time applications. Kafka is therefore a good technical carrier for providing distribution rules to the asynchronous servers; when the distribution policy is updated, the asynchronous servers automatically poll the Kafka bus periodically to guarantee the consistency of the distribution policy.
Based on the big data platform, the data distribution of the present invention uses asynchronous I/O as its technical foundation to perform high-speed distribution, which significantly improves the availability of the whole system and noticeably improves efficiency when unstructured data is frequently read and written. The embodiments of the present invention can effectively improve the I/O efficiency of a big data platform while maintaining data consistency and integrity.
The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that can readily occur to a person skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A system for distributing data in a big data platform, the system specifically comprising: a big data distribution unit, a data bus module, a management center and a big data adaptation module; wherein,
the big data distribution unit is configured to receive the data sent by multiple clients and store it in its own cache, to obtain data distribution rules from the data bus module and perform distribution allocation on the cached data, and then to distribute the data to the destination servers;
the data bus module is a distributed publish-subscribe messaging system that persistently stores the data distribution rules for concurrent reading by the big data distribution unit;
the management center formulates the data distribution rules according to resource information and sends them to the data bus module for storage;
the big data adaptation module is configured to collect the resource information of the big data distribution unit and the destination servers and submit it to the management center.
2. The system as claimed in claim 1, wherein the big data distribution unit specifically comprises:
a load balancing module, configured to receive the data sent by multiple clients and, using a load-balancing algorithm, buffer the data into the cache of each asynchronous server;
an asynchronous server, configured to cache the data balanced by the load balancing module, to obtain the data distribution rules from the data bus module, and to perform structural remodeling of the data in its own cache according to the distribution rules, generating new multi-dimensional structured data containing the distribution resource information and storing it in the multi-dimensional structured storage unit;
a multi-dimensional structured storage unit, configured to store the multi-dimensional structured data generated when the multiple asynchronous servers remodel the data;
an asynchronous client, configured to obtain the data in the multi-dimensional structured storage unit, find the resource information of the distribution destination server, and distribute the data in the multi-dimensional structured storage unit to the destination server.
3. The system as claimed in claim 2, wherein the asynchronous server interacts with the asynchronous client in asynchronous I/O mode.
4. The system as claimed in claim 2, wherein the asynchronous server further specifically comprises:
a configuration module, configured to configure the services provided by the asynchronous server and send the configuration information to the management center;
a data acquisition module, configured to receive the data balanced by the load balancing module and store it in the asynchronous server's own cache;
a distribution rule acquisition module, configured to obtain the data distribution rules from the data bus module;
a structural remodeling module, configured to load the distribution rules onto the data in the cache, composing distribution packets that contain source address and port, destination address and port, connection protocol and a data portion, and to pack them into a bidirectionally open continuous data queue, the queue being a double-ended queue allowing insertion and deletion at both head and tail;
an asynchronous client response module, configured to respond to requests from the asynchronous client.
5. The system as claimed in claim 2, wherein the asynchronous client further specifically comprises:
a data distribution module, which obtains the data in the multi-dimensional structured storage unit through the asynchronous server, finds the resource information of the distribution destination server, and distributes the data in the multi-dimensional structured storage unit to the destination server; and
a detection module, which waits for the asynchronous-operation completion signal of the destination server and checks the signal; if the signal indicates that the data was distributed successfully, the distributed data portion is deleted from the multi-dimensional structured storage unit; if the signal indicates that the distribution failed, the data distribution module is called to retransmit the data.
6. A method for distributing data in a big data platform, the method comprising the following steps:
the big data distribution unit configures the services it provides and sends the configuration information to the management center;
the big data adaptation module collects the configuration information of the big data distribution unit and the resource information of the destination servers and submits them to the management center;
the management center formulates data distribution rules according to the configuration information and the resource information and sends them to the data bus module for storage; the data bus module is a distributed publish-subscribe messaging system that persistently stores the data distribution rules for concurrent reading by the big data distribution unit;
the big data distribution unit receives the data sent by multiple clients and stores it in the cache;
the big data distribution unit obtains the data distribution rules from the data bus module and performs distribution allocation on the data in the cache;
the big data distribution unit distributes the data to the destination servers.
7. The method as claimed in claim 6, wherein the big data distribution unit receives the data sent by the multiple clients through multiple load balancing modules and, using a load-balancing algorithm, buffers the data into the cache of each asynchronous server.
8. The method as claimed in claim 7, wherein the big data distribution unit obtains the data distribution rules from the data bus module through the asynchronous server, performs structural remodeling of the data in the cache according to the distribution rules, generates new multi-dimensional structured data containing the distribution resource information, and stores it in the multi-dimensional structured storage unit; and the big data distribution unit obtains the data in the multi-dimensional structured storage unit through the asynchronous client, finds the resource information of the distribution destination server, and distributes the data in the multi-dimensional structured storage unit to the destination server.
9. The method as claimed in claim 8, wherein the asynchronous server performing structural remodeling of the data in its own cache and generating new multi-dimensional structured data specifically comprises:
the asynchronous server loads the distribution rules onto the data in its own cache, composing distribution packets that contain source address and port, destination address and port, connection protocol and a data portion, and packs them into a bidirectionally open continuous data queue, the queue being a double-ended queue allowing insertion and deletion at both head and tail; and the asynchronous server interacts with the asynchronous client in asynchronous I/O mode.
10. The method as claimed in claim 9, wherein after the asynchronous client distributes the data to the destination server, the method further comprises:
the asynchronous client waits for the asynchronous-operation completion signal of the destination server and checks the signal;
if the signal indicates that the data was distributed successfully, the distributed data portion is deleted from the multi-dimensional structured storage unit;
if the signal indicates that the distribution failed, the data is distributed to the destination server again.
CN201611029700.9A 2016-11-15 2016-11-15 System and method for distributing data in big data platform Active CN108076111B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611029700.9A CN108076111B (en) 2016-11-15 2016-11-15 System and method for distributing data in big data platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611029700.9A CN108076111B (en) 2016-11-15 2016-11-15 System and method for distributing data in big data platform

Publications (2)

Publication Number Publication Date
CN108076111A true CN108076111A (en) 2018-05-25
CN108076111B CN108076111B (en) 2021-07-09

Family

ID=62161323

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611029700.9A Active CN108076111B (en) 2016-11-15 2016-11-15 System and method for distributing data in big data platform

Country Status (1)

Country Link
CN (1) CN108076111B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103731298A (en) * 2013-11-15 2014-04-16 中国航天科工集团第二研究院七〇六所 Large-scale distributed network safety data acquisition method and system
CN104036025A (en) * 2014-06-27 2014-09-10 蓝盾信息安全技术有限公司 Distribution-base mass log collection system
CN104092767A (en) * 2014-07-21 2014-10-08 北京邮电大学 Posting/subscribing system for adding message queue models and working method thereof
CN104754036A (en) * 2015-03-06 2015-07-01 合一信息技术(北京)有限公司 Message processing system and processing method based on kafka

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110457582A (en) * 2019-08-10 2019-11-15 北京酷我科技有限公司 A kind of data distributing method and recommender system
CN112732996A (en) * 2021-01-11 2021-04-30 深圳市洪堡智慧餐饮科技有限公司 Multi-platform distributed data crawling method based on asynchronous aiohttp

Also Published As

Publication number Publication date
CN108076111B (en) 2021-07-09

Similar Documents

Publication Publication Date Title
US11647081B2 (en) Method and system for reducing connections to a database
US11218566B2 (en) Control in a content delivery network
CN105245373B (en) A kind of container cloud platform system is built and operation method
US11729260B2 (en) Internet-of-things resource access system and method
US10341196B2 (en) Reliably updating a messaging system
CN110351246A (en) Server cluster system Socket management method and device
CN101207550B (en) Load balancing system and method for multi business to implement load balancing
WO2023077952A1 (en) Data processing method and system, related device, storage medium and product
CN110177118A (en) A kind of RPC communication method based on RDMA
US20110040892A1 (en) Load balancing apparatus and load balancing method
CN104834722A (en) CDN (Content Delivery Network)-based content management system
CN107135268B (en) Distributed task computing method based on information center network
US20100058451A1 (en) Load balancing for services
CN102055771B (en) Device and method for controlling cloud service-oriented multiple concurrent service flow
CN111209364A (en) Mass data access processing method and system based on crowdsourcing map updating
CN106713391A (en) Session information sharing method and sharing system
CN108076111A (en) A kind of system and method for distributing data in big data platform
JP5945543B2 (en) System including middleware machine environment
CN109120556A (en) A kind of method and system of cloud host access object storage server
US20160301625A1 (en) Intelligent High-Volume Cloud Application Programming Interface Request Caching
Patel et al. Towards in-order and exactly-once delivery using hierarchical distributed message queues
CN105144099B (en) Communication system
CN110049081A (en) For build and using high availability Docker private library method and system
Zhang et al. A cloud queuing service with strong consistency and high availability
US10187278B2 (en) Channel management in scalable messaging system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant