CN113608700A - Data transmission processing method and processing system - Google Patents
Data transmission processing method and processing system
- Publication number
- CN113608700A
- Authority
- CN
- China
- Prior art keywords
- data
- path
- storage
- node
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/062—Securing storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0629—Configuration or reconfiguration of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The application discloses a processing method and a processing system for data transmission. The processing method comprises the following steps: receiving data; constructing a data storage transmission path; storing the data into the data storage transmission path; determining an effective storage path of the data according to the storage of the data; performing anomaly detection on the effective path; and, if the anomaly detection is passed, storing the received data into the effective storage path. After the data is received, the method plans the transmission and storage path of the data, continuously optimizes the path, and thereby safeguards the reliability and security of the data transmission process.
Description
Technical Field
The present application relates to the field of data processing, and in particular, to a processing method and a processing system for data transmission.
Background
With the explosive growth of data, how a distributed storage system can efficiently query and write massive data has become a research focus in the field of data storage. Existing approaches mainly set up one large database dedicated to storing massive data. Although such a database meets the capacity requirement, the efficiency of querying and writing a given item of data in it drops sharply, so data processing efficiency is sacrificed. In addition, various uncertain factors arise during data transmission and storage, and these can leave a storage link unable to store the data.
Therefore, how to transmit and store data while balancing data processing efficiency against the high storage requirement of massive data has become a problem that those skilled in the art urgently need to solve.
Disclosure of Invention
The application provides a data transmission processing method, which comprises the following steps: receiving data; constructing a data storage transmission path; storing the data into the data storage transmission path; determining an effective storage path of the data according to the storage of the data; performing anomaly detection on the effective path; and, if the anomaly detection is passed, storing the received data into the effective storage path.
As above, if the anomaly detection is failed, a regeneration node is reselected.
As above, the storage transmission path is formed of system nodes including a root node, intermediate nodes, and leaf nodes.
As above, the received data is stored into the intermediate nodes and the leaf nodes.
As above, when the amount of received data is too large, the data is stored in blocks on different intermediate nodes or on different leaf nodes.
As above, determining the effective storage path of the data according to the storage of the data comprises the following sub-steps: determining an effective storage path according to the storage condition of the data in the data storage transmission path; checking whether the effective path can completely accommodate the received data; and constructing a new effective path.
As above, constructing the new effective path further includes checking, according to evaluation indexes, whether a node storing the data can regenerate a node.
As above, the evaluation index $b_1$ is expressed as: [formula not reproduced in this text]
where $v$ denotes the disk read/write speed, and $v_{\max}$ and $v_{\min}$ are preset comparison standards for the disk read/write speed.
As above, any one of the nodes storing the data is selected to regenerate a plurality of nodes, until the nodes storing the data and the regenerated nodes can completely accommodate the received data.
A processing system for data transmission comprises a receiving unit, a construction unit, a first storage unit, a path determination unit, an anomaly detection unit, a selection unit, and a second storage unit. The receiving unit is used for receiving data; the construction unit is used for constructing a data storage transmission path; the first storage unit is used for storing the data into the data storage transmission path; the path determination unit is used for determining an effective storage path of the data according to the storage of the data; the anomaly detection unit is used for performing anomaly detection on the effective path; the selection unit is used for reselecting a regeneration node if the anomaly detection is failed; and the second storage unit is used for storing the received data into the effective storage path if the anomaly detection is passed.
The application has the following beneficial effects:
After data is received, the data transmission processing method and system plan the transmission and storage path of the data and continuously optimize the path, so that the reliability and security of the data transmission process are ensured.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them.
FIG. 1 is a flow chart of a method for optimizing data storage according to an embodiment of the present application;
FIG. 2 is an internal structural diagram of an optimization system for data storage according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The application relates to a method and a system for optimizing data storage. The method and system identify risks that arise during data transmission and optimize data storage accordingly.
Example one
As shown in fig. 1, the method for optimizing data storage provided by the present application specifically includes the following steps:
Step S110: Receive data.
Specifically, after the data is received, the method further includes retrieving the data and determining whether it is valid data. An existing tool can be used for the retrieval, for example the HBase platform. HBase is a distributed database that can read and write large-scale data in real time, and aggregate retrieval and continuous retrieval can be performed while the data is read.
Aggregate retrieval refers to retrieving, in one pass, the data received in a specified time period; continuous retrieval refers to retrieving data in real time as each segment of data is received.
Specifically, data generally has nominal, binary, ordinal, numerical, discrete, and continuous attributes, among others, so both aggregate retrieval and continuous retrieval are essentially retrieval by data attribute.
If the received data satisfies at least one of the above attributes, step S120 is executed.
Step S120: Construct a data storage transmission path.
Specifically, data storage transmission means transmitting data to system nodes for storage; a storage path is constructed before the data is transmitted.
Specifically, the storage path is formed of system nodes including a root node, intermediate nodes, and leaf nodes. The root node S is connected to the first-layer intermediate nodes in a one-to-many relationship, and the first-layer intermediate nodes are connected to one another. Each first-layer intermediate node is connected to second-layer intermediate nodes in a one-to-one or one-to-many relationship, and so on, until the leaf nodes of the Nth layer have been arranged. The root node, the intermediate nodes, and the leaf nodes are thus connected into a topology, which is the path for data transmission and storage.
Further, data may be stored on any of the intermediate nodes and leaf nodes. If data is stored on an intermediate node, that node is a storage node, and the root node and the remaining intermediate nodes are transmission nodes that relay the data. In other words, every node that stores data is a storage node, and the remaining nodes are transmission nodes.
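The topology described in step S120 can be pictured with a small tree structure. The following is a minimal sketch under stated assumptions, not the implementation of the application: the `Node` class, its field names, and the example layout are hypothetical, and the links between nodes of the same layer are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One system node in the storage/transmission topology (hypothetical type)."""
    name: str
    children: List["Node"] = field(default_factory=list)
    stored_bytes: int = 0  # greater than 0 means the node currently holds data

    def add_child(self, child: "Node") -> "Node":
        self.children.append(child)
        return child

    @property
    def is_leaf(self) -> bool:
        return not self.children

    @property
    def role(self) -> str:
        # A node that holds data is a storage node; every other node only relays data.
        return "storage" if self.stored_bytes > 0 else "transmission"


def build_example_tree() -> Node:
    """Root S, two first-layer intermediate nodes, three leaf nodes (example layout)."""
    root = Node("S")
    m1 = root.add_child(Node("M1"))
    m2 = root.add_child(Node("M2"))
    m1.add_child(Node("L1"))
    m2.add_child(Node("L2"))
    m2.add_child(Node("L3"))
    return root


def roles(node: Node) -> Dict[str, str]:
    """Collect the role of every node in the subtree rooted at `node`."""
    result = {node.name: node.role}
    for child in node.children:
        result.update(roles(child))
    return result
```

Marking a node as a storage node is then only a matter of recording data on it; every node without data stays a transmission node.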
Step S130: Store the data into the data storage transmission path.
Specifically, the received data is stored in the data storage transmission path. Because the amount of received data varies, when it is too large the data may be stored in blocks on different intermediate nodes or on different leaf nodes.
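Block storage across several nodes can be sketched as follows. The text does not say how blocks are assigned to nodes, so the round-robin policy, the function names, and the `block_size` parameter below are all assumptions made for illustration.

```python
from typing import Dict, List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut the payload into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(data: bytes, node_names: List[str], block_size: int) -> Dict[str, List[bytes]]:
    """Assign blocks to nodes in round-robin order (assumed policy, not from the source)."""
    if not node_names:
        raise ValueError("at least one target node is required")
    placement: Dict[str, List[bytes]] = {name: [] for name in node_names}
    for index, block in enumerate(split_into_blocks(data, block_size)):
        placement[node_names[index % len(node_names)]].append(block)
    return placement
```

For example, `place_blocks(b"abcdefghij", ["A", "B", "C"], 4)` spreads three blocks over the three named nodes, and joining the blocks in order restores the original payload.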
Step S140: Determine an effective storage path of the data according to the storage of the data.
The effective data storage transmission path is determined first; that is, the exception handling strategy is determined first, and the new storage path to which data is copied is then determined according to that strategy.
Step S1401: Determine an effective storage path according to the storage condition of the data in the data storage transmission path.
The effective storage path is the path connecting the nodes that currently store data, regardless of whether those system nodes can accommodate the received data.
Step S1402: Check whether the effective path can completely accommodate the received data.
If the effective path can completely accommodate the received data, step S150 is performed; otherwise, step S1403 is performed.
Step S1403: Construct a new effective path.
Specifically, if the system nodes cannot accommodate the received data, a node is regenerated, and the regenerated node together with the nodes that already store data constitutes a new effective path. The new effective path can store the data completely, which reduces the storage burden on the system nodes.
For example, if data is stored on nodes A, B, and C, the other nodes already hold data, and the system nodes cannot completely accommodate the received data, then nodes A, B, and C regenerate one or more nodes, and each regenerated node is a child node of node A, B, or C.
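Steps S1402 and S1403 amount to a capacity check followed by adding child nodes until the data fits. The sketch below illustrates that idea only; the `StorageNode` type, the fixed `child_capacity`, and the round-robin choice of parent are assumptions that do not come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StorageNode:
    """A node on the effective path with some free capacity (hypothetical type)."""
    name: str
    free_bytes: int
    children: List["StorageNode"] = field(default_factory=list)


def path_capacity(path: List[StorageNode]) -> int:
    """Total free capacity of all nodes on the path (the S1402 check uses this)."""
    return sum(node.free_bytes for node in path)


def build_new_effective_path(
    path: List[StorageNode], data_size: int, child_capacity: int
) -> List[StorageNode]:
    """S1403 sketch: regenerate child nodes under the storing nodes until the data fits."""
    if not path:
        raise ValueError("the effective path must contain at least one node")
    if child_capacity <= 0:
        raise ValueError("child_capacity must be positive")
    new_path = list(path)
    turn = 0
    while path_capacity(new_path) < data_size:
        parent = path[turn % len(path)]  # assumed round-robin parent choice
        child = StorageNode(f"{parent.name}.{len(parent.children) + 1}", child_capacity)
        parent.children.append(child)    # the regenerated node is a child of its parent
        new_path.append(child)
        turn += 1
    return new_path
```

If the existing path already has enough capacity, the function returns it unchanged, which corresponds to going straight to step S150.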
Specifically, constructing the new effective path further includes checking, according to a predetermined regeneration-node selection standard, whether a node storing the data can regenerate a node.
The nodes storing data are taken as evaluation objects, denoted $c_1, c_2, \ldots, c_n$. For each object, $m$ evaluation indexes are set, and monitoring yields the observed value $f_{ij}$ of node $c_i$ on evaluation index $b_j$.
The evaluation indexes are specifically an index $b_1$ obtained from the disk read/write speed and an index $b_2$ obtained from the network transmission delay.
Let $v$ denote the disk read/write speed of a node. A speed above $v_{\max}$ is considered best and a speed below $v_{\min}$ worst, where $v_{\max}$ and $v_{\min}$ are preset comparison standards for the disk read/write speed. The evaluation index $b_1$ is then expressed as: [formula not reproduced in this text]
Network transmission delay is the time $t$ that elapses between one node sending a transmission request to another node and the request being answered. It measures the congestion and stability of the network between nodes. Because the delay differs widely between nodes, a node with a smaller network transmission delay is considered better.
Further, a network transmission delay below $t_{\min}$ is considered best and one above $t_{\max}$ worst, where $t_{\min}$ and $t_{\max}$ are both preset. The evaluation index $b_2$ is then expressed as: [formula not reproduced in this text]
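The expressions for the two evaluation indexes are not available in this text, so the sketch below substitutes a common choice: a clamped linear (min-max) normalization to the range 0 to 1. This is an assumption made purely for illustration and should not be read as the formula of the application; the function names are hypothetical. It only respects what the text does state: faster disks and shorter delays score better, and values beyond the preset bounds are treated as best or worst.

```python
def disk_speed_index(v: float, v_min: float, v_max: float) -> float:
    """Assumed form of b1: 1.0 at or above v_max, 0.0 at or below v_min, linear between."""
    if v_max <= v_min:
        raise ValueError("v_max must be greater than v_min")
    if v >= v_max:
        return 1.0
    if v <= v_min:
        return 0.0
    return (v - v_min) / (v_max - v_min)


def delay_index(t: float, t_min: float, t_max: float) -> float:
    """Assumed form of b2: 1.0 at or below t_min, 0.0 at or above t_max, linear between."""
    if t_max <= t_min:
        raise ValueError("t_max must be greater than t_min")
    if t <= t_min:
        return 1.0
    if t >= t_max:
        return 0.0
    return (t_max - t) / (t_max - t_min)
```

With both indexes on the same 0 to 1 scale, a higher value always means a better node, which makes them straightforward to combine into a single score.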
A comprehensive evaluation value of each node $c_i$ is obtained from the evaluation indexes. The comprehensive evaluation value $W$ is expressed as: [formula not reproduced in this text]
where $n$ denotes the number of nodes storing data, $m$ denotes the number of evaluation indexes, $k$ denotes the number of nodes passed through when storing data, $f_{ij}$ denotes the observed value of node $c_i$ on evaluation index $b_j$, and $i$ is a natural number.
If the comprehensive evaluation value $W$ is greater than the predetermined threshold, the node storing the data is considered able to regenerate a node.
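The threshold test on the comprehensive evaluation value can be sketched as below. Because the expression for W is not available in this text, the sketch uses an unweighted mean of a node's observed index values as a stand-in; that choice, the function names, and the sample numbers are assumptions, and the roles of n and k in the original expression are not modelled.

```python
from typing import Dict, List


def composite_score(observed: List[float]) -> float:
    """Stand-in for W: unweighted mean of one node's observed index values (assumption)."""
    if not observed:
        raise ValueError("at least one observed value is required")
    return sum(observed) / len(observed)


def nodes_allowed_to_regenerate(
    observations: Dict[str, List[float]], threshold: float
) -> List[str]:
    """Return the nodes whose composite score is strictly greater than the threshold."""
    return [
        name
        for name, values in observations.items()
        if composite_score(values) > threshold
    ]
```

For instance, with observed values `{"A": [0.9, 0.7], "B": [0.4, 0.5], "C": [0.5, 0.6]}` and a threshold of 0.6, only node A qualifies to regenerate a node.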
As another example, any one of nodes A, B, and C may be selected to regenerate a plurality of nodes, until nodes A, B, and C together with the regenerated nodes can accommodate the received data. That is, one eligible node among the plurality of nodes is selected to regenerate the nodes.
Specifically, the eligible node can be selected according to the evaluation indexes. For example, the node with the highest disk read/write speed and the lowest network transmission delay is selected as the eligible node, and that node regenerates the nodes.
Step S150: Perform anomaly detection on the effective path.
Specifically, anomaly detection determines the probability distribution of the hindered bandwidth across each layer of nodes (including regenerated nodes) of the effective path (or the new effective path).
Hindered bandwidth refers to the bandwidth of a node whose bandwidth is below a specified threshold. For a system node, hindered bandwidth greatly limits the data that can be stored on it, so the probability distribution of the hindered bandwidth of the system nodes needs to be evaluated.
Any two system nodes form a connected edge, which may join two nodes on the same layer or nodes on adjacent upper and lower layers.
The edge weight expresses the importance of a connected edge. Bandwidth is an important network performance index: the bandwidth of a node is the highest data rate at which it can transmit. Therefore, before the probability distribution of the hindered bandwidth is calculated, the bandwidth of each node is acquired in advance and the edge weights are set according to these bandwidths.
Because a connected edge corresponds to two nodes, the node with the smaller bandwidth is selected to determine the edge weight, which keeps the path anomaly detection effective.
The larger the bandwidth of that node, the larger the edge weight; the smaller the bandwidth, the smaller the edge weight.
It should be noted that although the edge weight is configurable, it cannot exceed a specified range; staff need to set it within a reasonable range.
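The edge-weight rule can be illustrated with a short function. The text only requires that the weight follow the smaller of the two node bandwidths, grow with it, and stay inside a permitted range; the linear scaling against a reference bandwidth `max_bw` and the default range of 0 to 1 below are assumptions.

```python
def edge_weight(
    bw_u: float, bw_v: float, max_bw: float, w_min: float = 0.0, w_max: float = 1.0
) -> float:
    """Weight of the edge (u, v), driven by the smaller bandwidth and clamped to a range."""
    if max_bw <= 0:
        raise ValueError("max_bw must be positive")
    if w_max < w_min:
        raise ValueError("w_max must not be smaller than w_min")
    limiting = min(bw_u, bw_v)  # the slower endpoint determines the edge weight
    scaled = w_min + (w_max - w_min) * (limiting / max_bw)  # assumed linear scaling
    return max(w_min, min(w_max, scaled))  # keep the weight inside the allowed range
```

Using the slower endpoint means an edge is never rated better than its bottleneck node, so a weak node cannot be hidden behind a fast neighbour.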
The probability distribution $f_{(a:N)}(x)$ of the hindered bandwidth of the system nodes is expressed as:
$$f_{(a:N)}(x) = \frac{N \cdot F^{N-a}(x)\,[1 - F(x)]^{a-1}}{(N-a)(a-1)} \qquad \text{(equation four)}$$
where $N$ denotes the number of connected edges among the system nodes, $a$ denotes the number of connected edges whose preset edge weight is greater than the designated threshold, $F(x)$ denotes the probability that a node with bandwidth below the designated threshold falls in the designated area of the system, and $F^{N-a}(x)$ denotes the probability that, after $N-a$ connected edges are removed, the nodes with bandwidth below the specified threshold among the remaining nodes fall into the designated area. The designated area is set by staff.
If the probability distribution of the hindered bandwidth is greater than the predetermined probability, the nodes with hindered bandwidth are densely distributed and errors may occur during data transmission or storage; the anomaly detection is failed, and step S160 is performed. Otherwise, the anomaly detection is passed, and step S170 is performed.
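The decision in step S150 can be sketched numerically. The code reads equation four as N · F(x)^(N−a) · [1 − F(x)]^(a−1) divided by (N − a)(a − 1); the formula is flattened in the text, so this reading is an assumption, as are the function names and the requirement 1 < a < N that keeps the denominator non-zero.

```python
def hindered_bandwidth_density(F_x: float, N: int, a: int) -> float:
    """Evaluate equation four under the reading stated in the accompanying paragraph."""
    if not 0.0 <= F_x <= 1.0:
        raise ValueError("F_x is a probability and must lie in [0, 1]")
    if not 1 < a < N:
        raise ValueError("this reading requires 1 < a < N")
    numerator = N * F_x ** (N - a) * (1.0 - F_x) ** (a - 1)
    return numerator / ((N - a) * (a - 1))


def anomaly_check_passes(F_x: float, N: int, a: int, max_probability: float) -> bool:
    """Pass when the value does not exceed the preset probability (then go to S170)."""
    return hindered_bandwidth_density(F_x, N, a) <= max_probability
```

With `F_x = 0.5`, `N = 4`, and `a = 2` the value is 0.25, so the check passes for a preset probability of 0.3 and fails for 0.2, which would send the flow to step S160.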
Step S160: Reselect a regeneration node.
Here, the reselected regeneration node differs from the node selected in step S140: in this step, a node is regenerated from a node that precedes the nodes storing the data. That is, when the data is stored on nodes A, B, and C, the node that performs the regeneration is any one or more nodes other than nodes A, B, and C.
Further, if a plurality of nodes are selected for node regeneration, each node still needs to be evaluated according to formula three to determine whether it meets the standard for node regeneration.
Step S170: Store the received data into the effective storage path.
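The control flow of steps S110 to S170 can be summarized in a short driver. This is a sketch under assumptions: each step is passed in as a callable with a hypothetical signature, step S130 is folded into `find_effective_path`, the flow is assumed to return to step S150 after step S160, and the `max_retries` limit is added here only so that the loop terminates.

```python
from typing import Callable, List

Path = List[str]  # a path is modelled as a list of node names (assumption)


def store_with_checks(
    data: bytes,
    build_path: Callable[[], Path],                      # S120
    find_effective_path: Callable[[Path], Path],         # S130 and S140
    passes_anomaly_check: Callable[[Path], bool],        # S150
    reselect_regeneration_node: Callable[[Path], Path],  # S160
    commit: Callable[[bytes, Path], None],               # S170
    max_retries: int = 3,
) -> Path:
    """Run the flow once and return the path the data was finally stored on."""
    if not data:  # S110: nothing was received
        raise ValueError("no data received")
    path = find_effective_path(build_path())
    for _ in range(max_retries + 1):
        if passes_anomaly_check(path):
            commit(data, path)
            return path
        # Detection failed: pick a different regeneration node and check again.
        path = reselect_regeneration_node(path)
    raise RuntimeError("no path passed anomaly detection")
```

Passing the steps in as callables keeps the driver independent of how paths, nodes, and the anomaly test are actually represented.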
Example two
As shown in fig. 2, the present application provides a data transmission processing system, which specifically includes: a receiving unit 210, a constructing unit 220, a first storage unit 230, a path determining unit 240, an abnormality detecting unit 250, a selecting unit 260, and a second storage unit 270.
The receiving unit 210 is used for receiving data.
The constructing unit 220 is connected to the receiving unit 210, and is configured to construct a data storage transmission path.
The first storage unit 230 is connected to the construction unit 220 for storing data into the data storage transmission path.
The path determining unit 240 is connected to the first storage unit 230, and is configured to determine an effective storage path of the data according to the storage of the data.
The anomaly detection unit 250 is connected to the path determining unit 240 and is configured to perform anomaly detection on the effective path.
The selecting unit 260 is connected to the anomaly detection unit 250, and is configured to reselect a regeneration node if the anomaly detection is not qualified.
The second storage unit 270 is connected to the anomaly detection unit 250, and is configured to store the received data in the valid storage path if the anomaly detection is qualified.
The application has the following beneficial effects:
After data is received, the data transmission processing method and system plan the transmission and storage path of the data and continuously optimize the path, so that the reliability and security of the data transmission process are ensured.
Although the present application has been described with reference to examples, these are illustrative only and do not limit the application; changes, additions, and/or deletions may be made to the embodiments without departing from the scope of the application.
The above describes only specific embodiments of the present application, and the scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope of the present application shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A data transmission processing method is characterized by comprising the following steps:
receiving data;
constructing a data storage transmission path;
storing the data into a data storage transmission path;
determining an effective storage path of the data according to the storage of the data;
performing anomaly detection on the effective path;
and, if the anomaly detection is passed, storing the received data into the effective storage path.
2. The data transmission processing method according to claim 1, wherein if the anomaly detection is failed, a regeneration node is reselected.
3. The data transmission processing method according to claim 1, wherein the storage transmission path is formed of system nodes including a root node, intermediate nodes, and leaf nodes.
4. The data transmission processing method according to claim 2, wherein the received data is stored in the intermediate nodes and the leaf nodes.
5. The data transmission processing method according to claim 4, wherein when the amount of received data is too large, the data is stored in blocks on different intermediate nodes or on different leaf nodes.
6. The method for processing data transmission according to claim 1, wherein the determining of the valid storage path of the data according to the storage of the data comprises the sub-steps of:
determining an effective storage path according to the storage condition of the data in the data storage transmission path;
checking whether the effective path can completely accommodate the received data;
and constructing a new effective path.
7. The data transmission processing method according to claim 6, wherein constructing the new effective path further includes checking, according to evaluation indexes, whether a node storing the data can regenerate a node.
9. The data transmission processing method according to claim 6, wherein any one of the nodes storing the data is selected to regenerate a plurality of nodes, until the nodes storing the data and the regenerated nodes can completely accommodate the received data.
10. A data transmission processing system is characterized by comprising a receiving unit, a construction unit, a first storage unit, a path determination unit, an abnormality detection unit, a selection unit and a second storage unit;
a receiving unit for receiving data;
the construction unit is used for constructing a data storage transmission path;
a first storage unit for storing data into the data storage transmission path;
the path determining unit is used for determining an effective storage path of the data according to the storage of the data;
an anomaly detection unit for performing anomaly detection on the effective path;
the selecting unit is used for reselecting a regeneration node if the anomaly detection is failed;
and the second storage unit is used for storing the received data into the effective storage path if the anomaly detection is passed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110909249.4A CN113608700A (en) | 2021-08-09 | 2021-08-09 | Data transmission processing method and processing system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110909249.4A CN113608700A (en) | 2021-08-09 | 2021-08-09 | Data transmission processing method and processing system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113608700A true CN113608700A (en) | 2021-11-05 |
Family
ID=78340037
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110909249.4A Withdrawn CN113608700A (en) | 2021-08-09 | 2021-08-09 | Data transmission processing method and processing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113608700A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116668521A (en) * | 2023-07-25 | 2023-08-29 | 广东广宇科技发展有限公司 | Distributed multi-element data rapid transmission method based on data structure |
-
2021
- 2021-08-09 CN CN202110909249.4A patent/CN113608700A/en not_active Withdrawn
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116668521A (en) * | 2023-07-25 | 2023-08-29 | 广东广宇科技发展有限公司 | Distributed multi-element data rapid transmission method based on data structure |
CN116668521B (en) * | 2023-07-25 | 2023-10-31 | 广东广宇科技发展有限公司 | Distributed multi-element data rapid transmission method based on data structure |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107544862B (en) | Stored data reconstruction method and device based on erasure codes and storage node | |
CN104584524B (en) | It polymerize the data in intermediary system | |
US20080052322A1 (en) | Conflict resolution in database replication through autonomous node qualified folding | |
CN108108476A (en) | The method of work of highly reliable distributed information log system | |
CN102170460A (en) | Cluster storage system and data storage method thereof | |
CN103106152A (en) | Data scheduling method based on gradation storage medium | |
CN110175070B (en) | Distributed database management method, device, system, medium and electronic equipment | |
CN113608700A (en) | Data transmission processing method and processing system | |
US20150067282A1 (en) | Copy control apparatus and copy control method | |
CN108614837A (en) | File stores and the method and device of retrieval | |
CN113986830B (en) | Cloud data management and task scheduling method and system for distributed CT | |
CN106293492A (en) | A kind of memory management method and distributed file system | |
CN110597655A (en) | Fast predictive restoration method for coupling migration and erasure code-based reconstruction and implementation | |
CN108519927A (en) | A kind of OSD Fault Locating Methods and system based on ICFS systems | |
CN117472652A (en) | Data backup method, device and system of cloud computing operation and maintenance platform | |
CN112711564B (en) | Merging processing method and related equipment | |
CN112883016B (en) | Data storage optimization method and system | |
CN116307260A (en) | Urban road network toughness optimization method and system for disturbance of defective road sections | |
CN116360687A (en) | Cluster distributed storage method, device, equipment and medium | |
CN112835851B (en) | Method and system for processing data file | |
CN111949438B (en) | Multimedia data backup method, device, server and medium | |
CN111147575B (en) | Data storage system based on block chain | |
CN113590630A (en) | Data transmission processing method and processing system | |
CN113791893A (en) | Method and device for realizing capacity balance based on disk grouping | |
CN113611117A (en) | Traffic data processing method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20211105 |