CN109325200A - Method, apparatus, and computer-readable storage medium for obtaining data - Google Patents

Method, apparatus, and computer-readable storage medium for obtaining data

Info

Publication number
CN109325200A
Authority
CN
China
Prior art keywords
cluster
subject data
readset
group
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810822678.6A
Other languages
Chinese (zh)
Other versions
CN109325200B (en)
Inventor
马跃
曹东阳
孙爱林
乔超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201810822678.6A
Publication of CN109325200A
Application granted
Publication of CN109325200B
Status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a method, an apparatus, and a computer-readable storage medium for obtaining data, and relates to the field of computer technology. The method includes: receiving a subject data acquisition request sent by a user, the request including an identifier of the subject data; querying a pre-created Kafka cluster subject data information table using the identifier to determine, among the Kafka clusters, the cluster where the subject data resides, the table containing the mapping between each piece of subject data and the cluster where it resides; and sending the user the configuration parameters and consumption key of that cluster so that the user can obtain the subject data from it. The disclosure can automatically match a read cluster to a user's subject data acquisition request, thereby automating Kafka cluster configuration.

Description

Method, apparatus, and computer-readable storage medium for obtaining data
Technical field
The present disclosure relates to the field of computer technology, and in particular to a method, an apparatus, and a computer-readable storage medium for obtaining data.
Background
Kafka is an open-source stream-processing platform developed by the Apache Software Foundation and written in Scala and Java. It is a high-throughput distributed publish-subscribe messaging system that can handle all the activity stream data of a consumer-scale website.
In larger enterprises, Kafka clusters separate reads from writes. Because Kafka decouples reading from writing, a single copy of data written upstream can be consumed by multiple downstream services, so consumers outnumber producers.
For disaster recovery, data with a large volume and a high service grade is often backed up on multiple clusters, usually one write cluster and multiple read clusters. Fig. 1 is a schematic diagram of consumers consuming from a Kafka deployment with one write cluster and multiple read clusters. For business data whose volume and service grade are one level higher still, one copy of the business data is split into several parts that are written to multiple write clusters at the same time; the parts are then merged back into a single copy, which is synchronized to multiple read clusters for backup. Fig. 2 is a schematic diagram of consumers consuming from a Kafka deployment with multiple write clusters and multiple read clusters. For data that is small in volume and not business-critical, a single cluster handles both reads and writes. Fig. 3 is a schematic diagram of consumers consuming from a Kafka cluster that does not separate reads from writes.
Summary of the invention
The technical problem addressed by the present disclosure is how to automatically match a read cluster to a user's subject data acquisition request, thereby automating Kafka cluster configuration.
According to one aspect of the embodiments of the present disclosure, a method for obtaining data is provided, including: receiving a subject data acquisition request sent by a user, the request including an identifier of the subject data; querying a pre-created Kafka cluster subject data information table using the identifier to determine, among the Kafka clusters, the cluster where the subject data resides, the table containing the mapping between each piece of subject data and the cluster where it resides; and sending the user the configuration parameters and consumption key of the cluster where the subject data resides, so that the user obtains the subject data from that cluster.
In some embodiments, determining the cluster where the subject data resides includes: if a read cluster storing the subject data exists among the Kafka clusters, selecting the least-loaded such read cluster as the cluster where the subject data resides.
In some embodiments, determining the cluster where the subject data resides includes: if no read cluster storing the subject data exists among the Kafka clusters but a read-write cluster storing it does, determining the read-write cluster as the cluster where the subject data resides.
In some embodiments, determining the cluster where the subject data resides includes: if neither a read cluster nor a read-write cluster storing the subject data exists among the Kafka clusters, but a write cluster storing it does, querying a pre-created storage cluster topology graph for the read clusters associated with the write cluster; synchronizing the write cluster to the associated read clusters; and determining the cluster where the subject data resides from the read clusters obtained by synchronization.
In some embodiments, synchronizing the write cluster to the associated read clusters and determining the cluster where the subject data resides from the read clusters obtained by synchronization includes: if the write cluster is of the single-synchronization type, synchronizing it to the least-loaded associated read cluster; if the write cluster is of the multi-synchronization type, synchronizing it to all associated read clusters and selecting the least-loaded of them as the cluster where the subject data resides.
In some embodiments, synchronizing the write cluster to the associated read clusters includes: synchronizing the write cluster to each associated read cluster through multiple synchronization tasks, where the data volume synchronized by each task does not exceed a preset value; and determining the number of pieces of subject data synchronized by each task from the preset value and the data volume of each piece of subject data that the task synchronizes.
In some embodiments, the subject data acquisition request also includes the user's service grade, and the method further includes: associating the service grade with the cluster where the subject data resides; determining a comprehensive assessment grade for each cluster from the service grades associated with it; and performing cluster isolation on the clusters according to their comprehensive assessment grades.
According to another aspect of the embodiments of the present disclosure, an apparatus for obtaining data is provided, including: a message receiving module configured to receive a subject data acquisition request sent by a user, the request including an identifier of the subject data; a cluster determining module configured to query a pre-created Kafka cluster subject data information table using the identifier to determine, among the Kafka clusters, the cluster where the subject data resides, the table containing the mapping between each piece of subject data and the cluster where it resides; and a message sending module configured to send the user the configuration parameters and consumption key of the cluster where the subject data resides, so that the user obtains the subject data from that cluster.
In some embodiments, the cluster determining module is configured to: if a read cluster storing the subject data exists among the Kafka clusters, select the least-loaded such read cluster as the cluster where the subject data resides.
In some embodiments, the cluster determining module is configured to: if no read cluster storing the subject data exists among the Kafka clusters but a read-write cluster storing it does, determine the read-write cluster as the cluster where the subject data resides.
In some embodiments, the cluster determining module is configured to: if neither a read cluster nor a read-write cluster storing the subject data exists among the Kafka clusters, but a write cluster storing it does, query a pre-created storage cluster topology graph for the read clusters associated with the write cluster; synchronize the write cluster to the associated read clusters; and determine the cluster where the subject data resides from the read clusters obtained by synchronization.
In some embodiments, the cluster determining module is configured to: if the write cluster is of the single-synchronization type, synchronize it to the least-loaded associated read cluster; if the write cluster is of the multi-synchronization type, synchronize it to all associated read clusters and select the least-loaded of them as the cluster where the subject data resides.
In some embodiments, the cluster determining module is configured to: synchronize the write cluster to each associated read cluster through multiple synchronization tasks, where the data volume synchronized by each task does not exceed a preset value; and determine the number of pieces of subject data synchronized by each task from the preset value and the data volume of each piece of subject data that the task synchronizes.
In some embodiments, the subject data acquisition request also includes the user's service grade, and the apparatus further includes a cluster isolation module configured to: associate the service grade with the cluster where the subject data resides; determine a comprehensive assessment grade for each cluster from the service grades associated with it; and perform cluster isolation on the clusters according to their comprehensive assessment grades.
According to yet another aspect of the embodiments of the present disclosure, an apparatus for obtaining data is provided, including: a memory; and a processor coupled to the memory, the processor being configured to execute the foregoing method for obtaining data based on instructions stored in the memory.
According to a further aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided. The storage medium stores computer instructions that, when executed by a processor, implement the foregoing method for obtaining data.
The present disclosure can automatically match a read cluster to a user's subject data acquisition request, thereby automating Kafka cluster configuration.
Other features and advantages of the present disclosure will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present disclosure; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of consumers consuming from a Kafka deployment with one write cluster and multiple read clusters.
Fig. 2 is a schematic diagram of consumers consuming from a Kafka deployment with multiple write clusters and multiple read clusters.
Fig. 3 is a schematic diagram of consumers consuming from a Kafka cluster that does not separate reads from writes.
Fig. 4 is a flowchart of the method for obtaining data according to one embodiment of the present disclosure.
Fig. 5 is a flowchart of determining, among the Kafka clusters, the cluster where the subject data resides.
Fig. 6 is a flowchart of the method for obtaining data according to another embodiment of the present disclosure.
Fig. 7 shows the system workflow of an application example implementing the method for obtaining data.
Fig. 8 is a schematic structural diagram of the apparatus for obtaining data according to one embodiment of the present disclosure.
Fig. 9 is a schematic structural diagram of the apparatus for obtaining data according to another embodiment of the present disclosure.
Detailed description
The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the disclosure, its application, or its use. All other embodiments obtained by those of ordinary skill in the art from the embodiments in this disclosure without creative effort fall within the scope of protection of the disclosure.
The inventors studied how to determine, once a consumer has settled on a piece of subject data, which cluster's copy of that data the consumer should consume. In large enterprises, Kafka clusters are operated by a unified team, and business departments generally care only about the accuracy and performance of the data they consume, not about which cluster it is consumed from. Which cluster a business department consumes from is therefore decided by the operations team. That is, the cluster to consume from is currently assigned manually: first, the business department specifies which subject data it wants to consume, by email or by filling in a web form; next, the operations team looks up which clusters hold the subject data and, based on the business categories each cluster carries, its load, and similar factors, manually assigns the data on a particular read cluster to the business department; finally, the cluster's configuration parameters and consumption key are returned to the business department by email or web page.
The inventors' analysis shows that this related approach requires operations staff to make judgments in advance and manually assign read clusters to users, and may also require them to manually synchronize subject data from write clusters to read clusters. On the one hand, the labor cost is high. On the other hand, as the Kafka clusters grow, human error may cause operations staff to assign the wrong read cluster, so that the business department cannot consume the data or cluster resources are allocated unreasonably. To address these problems, the present disclosure provides a new method for obtaining data that automatically matches a read cluster to a user's subject data acquisition request, thereby automating Kafka cluster configuration. The method for obtaining data according to one embodiment of the present disclosure is described below with reference to Fig. 4.
Fig. 4 is a flowchart of the method for obtaining data according to one embodiment of the present disclosure. As shown in Fig. 4, the method in this embodiment includes steps S402 to S406.
In step S402, a subject data acquisition request sent by a user is received. The request includes an identifier of the subject data.
In step S404, a pre-created Kafka cluster subject data information table is queried using the identifier of the subject data to determine, among the Kafka clusters, the cluster where the subject data resides. The table contains the mapping between each piece of subject data and the cluster where it resides.
The detailed process of determining the cluster where the subject data resides is described in a later embodiment.
In step S406, the configuration parameters and consumption key of the cluster where the subject data resides are sent to the user, so that the user obtains the subject data from that cluster.
The above embodiment can automatically match a read cluster to a user's subject data acquisition request, thereby automating Kafka cluster configuration.
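Steps S402 to S406 can be illustrated with a minimal Python sketch. It assumes the subject data information table is an in-memory dictionary; the names `TOPIC_TABLE`, `CLUSTER_CONFIG`, `AcquisitionResponse`, and `handle_request`, along with all sample values, are hypothetical, since the disclosure does not specify a storage format or an interface.

```python
from dataclasses import dataclass

# Hypothetical subject data information table: subject data ID -> cluster name.
TOPIC_TABLE = {"orders": "read-cluster-a", "clicks": "rw-cluster-b"}

# Hypothetical per-cluster connection details (not taken from the disclosure).
CLUSTER_CONFIG = {
    "read-cluster-a": {"bootstrap_servers": "host-a:9092", "consumption_key": "key-a"},
    "rw-cluster-b": {"bootstrap_servers": "host-b:9092", "consumption_key": "key-b"},
}


@dataclass
class AcquisitionResponse:
    cluster: str
    config: dict
    consumption_key: str


def handle_request(topic_id: str) -> AcquisitionResponse:
    """S402: receive the request; S404: look up the cluster; S406: reply."""
    cluster = TOPIC_TABLE.get(topic_id)
    if cluster is None:
        raise KeyError(f"no cluster stores subject data {topic_id!r}")
    entry = CLUSTER_CONFIG[cluster]
    # Separate the configuration parameters from the consumption key.
    config = {k: v for k, v in entry.items() if k != "consumption_key"}
    return AcquisitionResponse(cluster, config, entry["consumption_key"])
```

In this sketch the table already names a single cluster per piece of subject data; choosing among several candidate clusters is a separate step.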
How to determine, among the Kafka clusters, the cluster where the subject data resides is described in detail below with reference to Fig. 5.
Fig. 5 is a flowchart of determining, among the Kafka clusters, the cluster where the subject data resides. As shown in Fig. 5, the process includes steps S5041 to S5048.
In step S5041, it is determined whether a read cluster storing the subject data exists among the Kafka clusters.
If a read cluster storing the subject data exists, step S5042 is executed.
In step S5042, the least-loaded read cluster among those storing the subject data is selected as the cluster where the subject data resides.
If no read cluster storing the subject data exists, step S5043 is executed.
In step S5043, it is determined whether a read-write cluster storing the subject data exists among the Kafka clusters.
If a read-write cluster storing the subject data exists, step S5044 is executed.
In step S5044, the read-write cluster is determined as the cluster where the subject data resides.
If no read-write cluster storing the subject data exists, step S5045 is executed.
In step S5045, it is determined whether a write cluster storing the subject data exists among the Kafka clusters.
If a write cluster storing the subject data exists, step S5046 is executed.
In step S5046, the read clusters associated with the write cluster are looked up in the pre-created storage cluster topology graph.
In step S5047, the write cluster is synchronized to the associated read clusters, and the cluster where the subject data resides is determined from the read clusters obtained by synchronization.
During synchronization, if the write cluster is of the single-synchronization type, it is synchronized to the least-loaded associated read cluster; if the write cluster is of the multi-synchronization type, it is synchronized to all associated read clusters, and the least-loaded of them is selected as the cluster where the subject data resides.
During synchronization, the write cluster may be synchronized to each associated read cluster through multiple synchronization tasks, with the data volume synchronized by each task not exceeding a preset value. The number of pieces of subject data each task synchronizes is then determined from the preset value and the data volume of each piece of subject data that the task synchronizes.
If no write cluster storing the subject data exists, step S5048 is executed.
In step S5048, the user is notified that no cluster among the Kafka clusters stores the subject data.
In the above embodiment, the choice depends on the read-write attributes of the clusters that store the subject data. An existing read cluster is offered to the user first, which saves system overhead and improves efficiency. A read-write cluster is the second choice: read-write clusters hold less data and mostly store subject data of relatively low importance, so this prioritized lookup order gives users who request more important subject data their cluster information sooner. As a fallback, a read cluster obtained by synchronizing a write cluster is offered, which ensures that a read cluster is available to the user even when only a write cluster stores the requested subject data.
In addition, the pre-created storage cluster topology graph records the associations between write clusters and read clusters. The topology graph enables automatic synchronization from write clusters to read clusters, which further supports automatically matching a read cluster to the user's subject data acquisition request and automating Kafka cluster configuration.
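The priority order of Fig. 5 can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Cluster` class, the `kind` strings, the numeric `load` field, and the `synchronize` callback (assumed to return the read clusters that hold the subject data after synchronization) are all hypothetical and are not defined by the disclosure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cluster:
    name: str
    kind: str                      # hypothetical labels: "read", "read-write", "write"
    load: float = 0.0              # assumed numeric load metric
    topics: set = field(default_factory=set)


def determine_cluster(topic, clusters, topology, synchronize) -> Optional[Cluster]:
    """Return the cluster to offer the user, or None if none stores the topic."""
    holders = [c for c in clusters if topic in c.topics]

    readers = [c for c in holders if c.kind == "read"]
    if readers:                                    # S5041 -> S5042
        return min(readers, key=lambda c: c.load)

    read_write = [c for c in holders if c.kind == "read-write"]
    if read_write:                                 # S5043 -> S5044
        return read_write[0]

    writers = [c for c in holders if c.kind == "write"]
    if writers:                                    # S5045 -> S5046, S5047
        associated = topology[writers[0].name]     # read clusters tied to the write cluster
        synced = synchronize(writers[0], associated, topic)
        return min(synced, key=lambda c: c.load)

    return None                                    # S5048: caller notifies the user
```

Passing `synchronize` in as a callback keeps the lookup order separate from the synchronization mechanism, which the disclosure handles through its own task management system.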
The method for obtaining data according to another embodiment of the present disclosure is described below with reference to Fig. 6.
Fig. 6 is a flowchart of the method for obtaining data according to another embodiment of the present disclosure. As shown in Fig. 6, the method in this embodiment includes steps S602 to S612.
In step S602, a subject data acquisition request sent by a user is received. The request includes an identifier of the subject data and the user's service grade.
In step S604, a pre-created Kafka cluster subject data information table is queried using the identifier of the subject data to determine, among the Kafka clusters, the cluster where the subject data resides. The table contains the mapping between each piece of subject data and the cluster where it resides.
In step S606, the configuration parameters and consumption key of the cluster where the subject data resides are sent to the user, so that the user obtains the subject data from that cluster.
In step S608, the service grade is associated with the cluster where the subject data resides.
For example, if the service grade of user U1 is grade L2, grade L2 is associated with cluster M1, where the requested subject data resides.
In step S610, a comprehensive assessment grade is determined for each cluster from the service grades associated with it.
For example, cluster M1 may be associated with the grades of multiple users U1 to U10. These 10 associated grades can be averaged to obtain the comprehensive assessment grade m1 of cluster M1.
In step S612, cluster isolation is performed on the clusters according to their comprehensive assessment grades.
For example, clusters with different comprehensive assessment grades can be isolated from one another.
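Steps S608 to S612 can be sketched as below. The sketch assumes service grades are numeric (the averaging example implies this) and, purely for illustration, treats isolation as grouping clusters by rounded assessment grade; the disclosure does not say how isolation is carried out. The function names are hypothetical.

```python
from collections import defaultdict


def assessment_grades(associations):
    """associations: (cluster, service_grade) pairs recorded in S608.

    Returns the average grade per cluster (S610).
    """
    grades = defaultdict(list)
    for cluster, grade in associations:
        grades[cluster].append(grade)
    return {cluster: sum(values) / len(values) for cluster, values in grades.items()}


def isolation_groups(grades):
    """Assumed isolation policy (S612): clusters sharing a rounded grade form one group."""
    groups = defaultdict(list)
    for cluster, grade in sorted(grades.items()):
        groups[round(grade)].append(cluster)
    return dict(groups)
```

For instance, a cluster associated with grades 2, 3, and 1 gets an assessment grade of 2.0 and lands in a different group from a cluster whose associated grades are all 1.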
In the above embodiment, the users' service grades provide an objective assessment of each cluster's grade, and cluster isolation is automated according to the comprehensive assessment grades, which further automates Kafka cluster configuration.
A specific application example implementing the method for obtaining data of the present disclosure is described below. The application example implements the method through a Kafka cluster consumer management system, a Kafka inter-cluster data synchronization task management system, and a Kafka cluster information management system.
(1) Kafka cluster consumer management system
The system records the information of every consumer that requests to consume Kafka data, specifically:
(1) the consumer's business information, which may include the business project group the consumer belongs to, the service grade, a description of the purpose, and so on. This lets Kafka operations staff manage the traffic carried by each cluster and its business importance as a whole, perform cluster isolation by service grade, and account for resource costs at business-project granularity;
(2) the parameters of the Kafka data being consumed, which may include which business data is consumed, its retention period, its storage partitions, and so on, giving the business department a clearer view of how the data it consumes is stored;
(3) the connection information of the cluster holding the data the consumer needs, which may include the user name and password for connecting to the cluster, the specific servers of the Kafka read cluster to connect to, and so on. Business departments can look this up themselves at any time without recording it elsewhere.
The Kafka cluster consumer management system holds the Kafka cluster subject data information table. When the producer of subject data creates the subject data, operations staff assign it to a cluster according to its business group, and the consumer management system automatically records the relationship between the subject data and the cluster in the process. The system is open to business departments and is the entry point through which they initiate requests to consume Kafka data.
(2) Kafka inter-cluster data synchronization task management system
The Kafka inter-cluster data synchronization task management system is the main tool for separating Kafka reads from writes. During cluster synchronization, the source cluster must be a Kafka write cluster and the target cluster must be a Kafka read cluster. In a synchronization task, besides the runtime parameters, the most important thing to configure is the subject data the task synchronizes. The configuration process is as follows:
(1) record in a table which subject data each cluster holds;
(2) determine the source cluster;
(3) select the subject data to synchronize from within the source cluster determined in step (2);
(4) determine the target cluster;
(5) save and start the synchronization task.
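The configuration steps above can be sketched as a small validation routine. This is a hedged illustration, not the system's actual interface: `SyncTask`, `build_sync_task`, the `topic_table` layout (cluster name to set of subject data), and the `cluster_kinds` labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncTask:
    source: str
    target: str
    topics: tuple


def build_sync_task(topic_table, cluster_kinds, source, target, topics):
    """Validate and assemble a synchronization task before it is saved and started.

    topic_table: step (1), cluster name -> set of subject data it holds.
    cluster_kinds: cluster name -> "write", "read", or "read-write" (assumed labels).
    """
    if cluster_kinds.get(source) != "write":          # step (2): source must be a write cluster
        raise ValueError(f"source {source!r} is not a write cluster")
    available = topic_table.get(source, set())
    missing = [t for t in topics if t not in available]
    if missing:                                       # step (3): topics must come from the source
        raise ValueError(f"subject data not on {source!r}: {missing}")
    if cluster_kinds.get(target) != "read":           # step (4): target must be a read cluster
        raise ValueError(f"target {target!r} is not a read cluster")
    return SyncTask(source, target, tuple(topics))    # step (5): ready to save and start
```

Rejecting an invalid source, target, or topic selection before the task starts mirrors the constraint that synchronization only ever flows from a write cluster to a read cluster.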
The number of pieces of subject data each synchronization task can handle is determined by the data volume of the subject data being synchronized; when the synchronized data volume exceeds a set threshold, a new synchronization task is started.
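One way to apply such a threshold is a greedy split, sketched below. The function name `plan_sync_tasks`, the volume units, and the choice to reject a single piece of subject data that alone exceeds the threshold are assumptions made for illustration; the disclosure only states that a new task starts once the threshold is exceeded.

```python
def plan_sync_tasks(topic_volumes, threshold):
    """Split subject data into tasks whose total volume stays within the threshold.

    topic_volumes: mapping of subject data name -> data volume (any consistent unit).
    Returns a list of tasks, each a list of subject data names.
    """
    tasks = []
    current, current_volume = [], 0
    for topic, volume in topic_volumes.items():
        if volume > threshold:
            # Assumption: an oversized item cannot satisfy the limit in any task.
            raise ValueError(f"{topic!r} alone exceeds the threshold")
        if current and current_volume + volume > threshold:
            tasks.append(current)          # threshold would be exceeded: start a new task
            current, current_volume = [], 0
        current.append(topic)
        current_volume += volume
    if current:
        tasks.append(current)
    return tasks
```

With volumes of 40, 50, 30, and 60 and a threshold of 100, this yields two tasks of two items each, so the item count per task follows from the volumes rather than being fixed in advance.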
(3) Kafka cluster information management system
With the Kafka cluster as the unit of granularity, this system maintains each cluster's configuration parameters, server information, business groups, cluster topology, and similar information. The cluster information management system serves as the information dictionary for the whole scheme. It is maintained by operations staff according to the actual state of the clusters, for example by manual entry or batch import when a Kafka cluster is scaled out or in, machines are added or removed, the system is upgraded, or parameters change.
This application example relies mainly on the topology maintained in the cluster information management system to assign read-cluster data to business departments automatically. In the topology, a write cluster has corresponding sub-clusters; a sub-cluster is a read cluster corresponding to the subject data on the write cluster, that is, the target cluster when a synchronization task is started. The cluster information management system also records each cluster's attribute, which is defined by operations staff and entered into the system when the cluster is created. A read cluster only lets business departments read data from it; a write cluster only lets business departments write data to it; a read-write cluster lets business departments both write data to it and consume data from it. There are two cluster synchronization modes. In single synchronization, if the cluster has multiple sub-clusters, one sub-cluster is selected by load for data synchronization. In multi-synchronization, if the cluster has multiple sub-clusters, data is synchronized to each of them.
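The two synchronization modes can be sketched as a target-selection function. The mode labels `"single"` and `"multi"`, the dictionary-based topology, and the numeric load values are hypothetical; picking the least-loaded sub-cluster in single mode follows the method embodiments described earlier.

```python
def select_sync_targets(write_cluster, topology, loads, mode):
    """Pick the sub-clusters (read clusters) a write cluster should synchronize to.

    topology: write cluster name -> list of sub-cluster names.
    loads: sub-cluster name -> current load (assumed numeric).
    mode: "single" or "multi" (hypothetical labels for the two sync modes).
    """
    if mode not in ("single", "multi"):
        raise ValueError(f"unknown synchronization mode: {mode!r}")
    children = topology.get(write_cluster, [])
    if not children:
        return []
    if mode == "single":
        # Single synchronization: only the least-loaded sub-cluster receives the data.
        return [min(children, key=lambda name: loads[name])]
    # Multi-synchronization: every sub-cluster receives the data.
    return list(children)
```

A caller would then create one or more synchronization tasks per returned target.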
Fig. 7 shows the system workflow of an application example implementing the method for obtaining data. As Fig. 7 shows, operations staff need not participate at any point: after a business department sends a request, a suitable read cluster is determined by rule and returned to the business department automatically.
This application example replaces manual allocation of Kafka read clusters by operations staff. It matches a read cluster automatically from the subject data specified by the business department, the maintained cluster dictionary, and the logic rules in the program, with no manual configuration by operations staff, thereby automating how business departments obtain cluster configuration information for consuming Kafka data.
A device for obtaining data according to an embodiment of the present disclosure is described below with reference to Fig. 8.
Fig. 8 shows a structural schematic diagram of the device for obtaining data according to an embodiment of the present disclosure. As shown in Fig. 8, the device 80 for obtaining data in this embodiment includes:
A message reception module 802, configured to receive a subject data acquisition request sent by a user, the subject data acquisition request containing an identifier of the subject data;
A cluster determining module 804, configured to query a pre-created kafka cluster subject data information table using the identifier of the subject data and to determine, in the kafka cluster, the cluster where the subject data is located, the kafka cluster subject data information table containing the mapping between each subject data in the kafka cluster and the cluster where it is located;
A message transmission module 806, configured to send to the user the configuration parameters and the consumption key of the cluster where the subject data is located, so that the user obtains the subject data from that cluster.
The above embodiment can automatically match a read cluster according to the user's subject data acquisition request, which automates the configuration of the Kafka cluster.
In some embodiments, the cluster determining module 804 is configured to: if read clusters storing the subject data exist in the kafka cluster, select from them the read cluster with the smallest load as the cluster where the subject data is located.
In some embodiments, the cluster determining module 804 is configured to: if no read cluster storing the subject data exists in the kafka cluster but a read-write cluster storing the subject data exists, determine the read-write cluster as the cluster where the subject data is located.
In some embodiments, the cluster determining module 804 is configured to: if neither a read cluster nor a read-write cluster storing the subject data exists in the kafka cluster but a write cluster storing the subject data exists, query the pre-created storage cluster topological diagram for the read cluster associated with the write cluster; synchronize the write cluster to the associated read cluster, and determine the cluster where the subject data is located according to the synchronized read cluster.
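The three modules of device 80 can be read as a receive, look up, reply pipeline. The sketch below illustrates that flow under stated assumptions: the table contents, the host name, the key value, and the names `SUBJECT_TABLE`, `CLUSTERS`, `ClusterInfo`, and `handle_request` are all hypothetical, and the choice among several candidate clusters is deliberately omitted here.

```python
from typing import Dict, List, NamedTuple


class ClusterInfo(NamedTuple):
    bootstrap_servers: str  # configuration parameter returned to the user
    consumer_key: str       # consumption key returned to the user


# Hypothetical "kafka cluster subject data information table":
# subject data identifier -> names of the clusters that store it.
SUBJECT_TABLE: Dict[str, List[str]] = {
    "order_events": ["read-a"],
}

# Hypothetical per-cluster connection details.
CLUSTERS: Dict[str, ClusterInfo] = {
    "read-a": ClusterInfo("read-a.example.internal:9092", "key-read-a"),
}


def handle_request(request: dict) -> ClusterInfo:
    """Receive the request, determine the cluster, and build the reply."""
    subject_id = request["subject_id"]             # message reception
    cluster_names = SUBJECT_TABLE.get(subject_id)  # cluster determination
    if not cluster_names:
        raise KeyError(f"unknown subject data: {subject_id}")
    # Assumption: take the first entry; real selection would follow the
    # read / read-write / write priority described in the embodiments.
    return CLUSTERS[cluster_names[0]]              # message transmission
```

A caller would pass something like `{"subject_id": "order_events"}` and use the returned parameters and key to configure its consumer.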
In the above embodiments, the lookup follows the read-write attributes of the clusters that store the subject data in the kafka cluster. An existing read cluster is offered to the user first, which saves system overhead and improves working efficiency. A read-write cluster is the second choice, because read-write clusters hold less data and are mostly used to store subject data of relatively low importance; searching clusters in this priority order lets users who request relatively more important subject data receive cluster information more quickly. As a fallback, a read cluster obtained by synchronizing a write cluster is offered to the user, which guarantees that, when only a write cluster in the Kafka cluster stores the requested subject data, a read cluster is still provided so that the user can obtain the subject data from a read cluster.
In addition, the pre-created storage cluster topological diagram records the association between write clusters and read clusters. The diagram enables automatic synchronization from a write cluster to its read clusters, which further supports automatic matching of a read cluster to the user's subject data acquisition request and further automates the configuration of the Kafka cluster.
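The lookup priority described above (read cluster first, then read-write cluster, then write cluster with synchronization) can be sketched as a single function. The tuple layout and the name `pick_cluster` are assumptions made for illustration; the actual synchronization step is left to the caller.

```python
from typing import List, Optional, Tuple

# Each candidate storing the requested subject data is (name, role, load);
# role is one of "read", "read_write", "write". The layout is hypothetical.
Candidate = Tuple[str, str, float]


def pick_cluster(candidates: List[Candidate]) -> Tuple[Optional[str], bool]:
    """Return (cluster name, needs_sync) following the stated priority."""
    reads = [c for c in candidates if c[1] == "read"]
    if reads:
        # First choice: the least-loaded existing read cluster.
        return min(reads, key=lambda c: c[2])[0], False

    read_writes = [c for c in candidates if c[1] == "read_write"]
    if read_writes:
        # Second choice: a read-write cluster, used directly.
        return read_writes[0][0], False

    writes = [c for c in candidates if c[1] == "write"]
    if writes:
        # Fallback: only a write cluster stores the data, so the caller must
        # first synchronize it to an associated read cluster.
        return writes[0][0], True

    return None, False
```

Returning a `needs_sync` flag keeps the decision separate from the synchronization itself, which in the text depends on the storage cluster topological diagram.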
In some embodiments, the cluster determining module 804 is configured to: if the write cluster is of the single synchronization type, synchronize the write cluster to the associated read cluster with the smallest load; if the write cluster is of the multi-synchronization type, synchronize the write cluster to all associated read clusters and select from them the read cluster with the smallest load as the cluster where the subject data is located.
In some embodiments, the cluster determining module 804 is configured to: synchronize the write cluster to each associated read cluster through multiple synchronization tasks, where the data volume synchronized by each synchronization task does not exceed a preset value; and determine the number of subject data synchronized by each synchronization task according to the preset value and the data volume of each subject data synchronized by that task.
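The difference between the two synchronization types comes down to which read clusters receive the data; in both cases the least-loaded one is handed to the user. A minimal sketch, assuming the associated read clusters are given as a hypothetical name-to-load mapping and that the function `sync_and_select` only plans the targets without performing any data transfer:

```python
from typing import Dict, List, Tuple


def sync_and_select(sync_type: str, read_loads: Dict[str, float]) -> Tuple[List[str], str]:
    """Return (read clusters to synchronize to, read cluster given to the user)."""
    if not read_loads:
        raise ValueError("write cluster has no associated read cluster")

    # The least-loaded associated read cluster is the one returned to the user.
    least_loaded = min(read_loads, key=read_loads.get)

    if sync_type == "single":
        # Single synchronization: only the least-loaded read cluster is a target.
        return [least_loaded], least_loaded
    if sync_type == "multi":
        # Multi-synchronization: every associated read cluster is a target.
        return sorted(read_loads), least_loaded
    raise ValueError(f"unknown synchronization type: {sync_type}")
```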
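One way to respect the per-task data volume limit is to pack subject data into tasks greedily, in input order. The text does not specify a packing strategy, so the greedy approach, the function name `plan_sync_tasks`, and the handling of a single subject data that exceeds the limit (an error here) are all assumptions.

```python
from typing import Dict, List


def plan_sync_tasks(subject_volumes: Dict[str, int], preset_value: int) -> List[List[str]]:
    """Group subject data into tasks whose total volume stays within preset_value."""
    tasks: List[List[str]] = []
    current: List[str] = []
    current_volume = 0

    for subject, volume in subject_volumes.items():
        if volume > preset_value:
            # Assumption: an oversized subject cannot be split across tasks.
            raise ValueError(f"{subject} alone exceeds the preset value")
        if current and current_volume + volume > preset_value:
            # Adding this subject would exceed the limit: close the current task.
            tasks.append(current)
            current, current_volume = [], 0
        current.append(subject)
        current_volume += volume

    if current:
        tasks.append(current)
    return tasks
```

The number of subject data handled by each task is then simply the length of each inner list.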
In some embodiments, the subject data acquisition request also contains the service level of the user. The device 80 further includes a cluster isolation module 808, configured to: associate the service level with the cluster where the subject data is located; determine a comprehensive evaluation level for each cluster according to the service levels associated with that cluster; and perform cluster isolation on the clusters according to their comprehensive evaluation levels.
In the above embodiment, the users' service levels provide an objective evaluation of each cluster's level, and cluster isolation is automated on the basis of the clusters' comprehensive evaluation levels, which further automates the configuration of the Kafka cluster.
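The text does not say how the comprehensive evaluation level is computed or what form isolation takes. The sketch below is therefore only one possible reading: it assumes the level of a cluster is the highest service level associated with it, and that isolation means grouping clusters by level. The names `assess_clusters` and `isolation_groups` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def assess_clusters(associations: Dict[str, List[int]]) -> Dict[str, int]:
    """Comprehensive level per cluster.

    Assumption: the level is the highest service level among the users
    whose requests were associated with the cluster.
    """
    return {cluster: max(levels) for cluster, levels in associations.items() if levels}


def isolation_groups(levels: Dict[str, int]) -> Dict[int, List[str]]:
    """Group clusters by comprehensive level (one isolation group per level)."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for cluster, level in sorted(levels.items()):
        groups[level].append(cluster)
    return dict(groups)
```

Other aggregation rules, such as an average or a weighted score, would fit the same two-step structure.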
Fig. 9 shows a structural schematic diagram of a device for obtaining data according to another embodiment of the present disclosure. As shown in Fig. 9, the device 90 for obtaining data of this embodiment includes a memory 910 and a processor 920 coupled to the memory 910; the processor 920 is configured to execute, based on instructions stored in the memory 910, the method for obtaining data in any one of the foregoing embodiments. The memory 910 may include, for example, a system memory and a fixed non-volatile storage medium. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs.
The device 90 for obtaining data may also include an input/output interface 930, a network interface 940, a storage interface 950, and the like. These interfaces 930, 940, 950, the memory 910, and the processor 920 may be connected, for example, by a bus 960. The input/output interface 930 provides a connection interface for input/output devices such as a display, mouse, keyboard, and touch screen. The network interface 940 provides a connection interface for various networked devices. The storage interface 950 provides a connection interface for external storage devices such as SD cards and USB flash drives.
The present disclosure further includes a computer readable storage medium storing computer instructions which, when executed by a processor, implement the method for obtaining data in any one of the foregoing embodiments.
Those skilled in the art should understand that embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to disk storage, CD-ROM, optical memory, etc.) containing computer-usable program code.
The present disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The foregoing describes only preferred embodiments of the present disclosure and is not intended to limit it. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present disclosure shall fall within its scope of protection.

Claims (16)

1. A method for obtaining data, comprising:
Receiving a subject data acquisition request sent by a user, the subject data acquisition request containing an identifier of subject data;
Querying a pre-created kafka cluster subject data information table using the identifier of the subject data, and determining, in a kafka cluster, the cluster where the subject data is located, the kafka cluster subject data information table containing the mapping between each subject data in the kafka cluster and the cluster where it is located;
Sending to the user the configuration parameters and the consumption key of the cluster where the subject data is located, so that the user obtains the subject data from the cluster where the subject data is located.
2. The method of claim 1, wherein determining, in the kafka cluster, the cluster where the subject data is located comprises:
If read clusters storing the subject data exist in the kafka cluster, selecting from them the read cluster with the smallest load as the cluster where the subject data is located.
3. The method of claim 1, wherein determining, in the kafka cluster, the cluster where the subject data is located comprises:
If no read cluster storing the subject data exists in the kafka cluster but a read-write cluster storing the subject data exists, determining the read-write cluster as the cluster where the subject data is located.
4. The method of claim 1, wherein determining, in the kafka cluster, the read cluster where the subject data is located comprises:
If neither a read cluster nor a read-write cluster storing the subject data exists in the kafka cluster but a write cluster storing the subject data exists, querying a pre-created storage cluster topological diagram for the read cluster associated with the write cluster;
Synchronizing the write cluster to the associated read cluster, and determining the cluster where the subject data is located according to the synchronized read cluster.
5. The method of claim 4, wherein synchronizing the write cluster to the associated read cluster and determining the cluster where the subject data is located according to the synchronized read cluster comprises:
If the write cluster is of the single synchronization type, synchronizing the write cluster to the associated read cluster with the smallest load;
If the write cluster is of the multi-synchronization type, synchronizing the write cluster to all associated read clusters, and selecting from them the read cluster with the smallest load as the cluster where the subject data is located.
6. The method of claim 4, wherein synchronizing the write cluster to the associated read cluster comprises:
Synchronizing the write cluster to each associated read cluster through multiple synchronization tasks, the data volume synchronized by each synchronization task not exceeding a preset value;
Determining the number of subject data synchronized by each synchronization task according to the preset value and the data volume of each subject data synchronized by that synchronization task.
7. The method of claim 1, wherein the subject data acquisition request also contains the service level of the user;
The method further comprises:
Associating the service level with the cluster where the subject data is located;
Determining a comprehensive evaluation level for each cluster according to the service levels associated with that cluster;
Performing cluster isolation on the clusters according to the comprehensive evaluation level of each cluster.
8. A device for obtaining data, comprising:
A message reception module, configured to receive a subject data acquisition request sent by a user, the subject data acquisition request containing an identifier of subject data;
A cluster determining module, configured to query a pre-created kafka cluster subject data information table using the identifier of the subject data and to determine, in a kafka cluster, the cluster where the subject data is located, the kafka cluster subject data information table containing the mapping between each subject data in the kafka cluster and the cluster where it is located;
A message transmission module, configured to send to the user the configuration parameters and the consumption key of the cluster where the subject data is located, so that the user obtains the subject data from the cluster where the subject data is located.
9. The device of claim 8, wherein the cluster determining module is configured to:
If read clusters storing the subject data exist in the kafka cluster, select from them the read cluster with the smallest load as the cluster where the subject data is located.
10. The device of claim 8, wherein the cluster determining module is configured to:
If no read cluster storing the subject data exists in the kafka cluster but a read-write cluster storing the subject data exists, determine the read-write cluster as the cluster where the subject data is located.
11. The device of claim 8, wherein the cluster determining module is configured to:
If neither a read cluster nor a read-write cluster storing the subject data exists in the kafka cluster but a write cluster storing the subject data exists, query a pre-created storage cluster topological diagram for the read cluster associated with the write cluster;
Synchronize the write cluster to the associated read cluster, and determine the cluster where the subject data is located according to the synchronized read cluster.
12. The device of claim 11, wherein the cluster determining module is configured to:
If the write cluster is of the single synchronization type, synchronize the write cluster to the associated read cluster with the smallest load;
If the write cluster is of the multi-synchronization type, synchronize the write cluster to all associated read clusters, and select from them the read cluster with the smallest load as the cluster where the subject data is located.
13. The device of claim 11, wherein the cluster determining module is configured to:
Synchronize the write cluster to each associated read cluster through multiple synchronization tasks, the data volume synchronized by each synchronization task not exceeding a preset value;
Determine the number of subject data synchronized by each synchronization task according to the preset value and the data volume of each subject data synchronized by that synchronization task.
14. The device of claim 8, wherein the subject data acquisition request also contains the service level of the user;
The device further comprises a cluster isolation module, configured to: associate the service level with the cluster where the subject data is located; determine a comprehensive evaluation level for each cluster according to the service levels associated with that cluster; and perform cluster isolation on the clusters according to the comprehensive evaluation level of each cluster.
15. A device for obtaining data, comprising:
A memory; and
A processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the method for obtaining data according to any one of claims 1 to 7.
16. A computer readable storage medium, wherein the computer readable storage medium stores computer instructions which, when executed by a processor, implement the method for obtaining data according to any one of claims 1 to 7.
CN201810822678.6A 2018-07-25 2018-07-25 Method and device for acquiring data and computer readable storage medium Active CN109325200B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810822678.6A CN109325200B (en) 2018-07-25 2018-07-25 Method and device for acquiring data and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810822678.6A CN109325200B (en) 2018-07-25 2018-07-25 Method and device for acquiring data and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN109325200A true CN109325200A (en) 2019-02-12
CN109325200B CN109325200B (en) 2021-05-25

Family

ID=65263716

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810822678.6A Active CN109325200B (en) 2018-07-25 2018-07-25 Method and device for acquiring data and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN109325200B (en)

Also Published As

Publication number Publication date
CN109325200B (en) 2021-05-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant