WO2023139412A1 - Streaming metrics data integrity in a communication system - Google Patents


Info

Publication number
WO2023139412A1
WO2023139412A1 (PCT/IB2022/050560)
Authority
WO
WIPO (PCT)
Prior art keywords
data
metrics
hash value
metrics data
integrity
Prior art date
Application number
PCT/IB2022/050560
Other languages
French (fr)
Inventor
Yonghui Jin
Qiang Li
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/IB2022/050560 priority Critical patent/WO2023139412A1/en
Publication of WO2023139412A1 publication Critical patent/WO2023139412A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/64Protecting data integrity, e.g. using checksums, certificates or signatures
    • G06F21/645Protecting data integrity, e.g. using checksums, certificates or signatures using a third party
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/50Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using hash chains, e.g. blockchains or hash trees
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2151Time stamp

Definitions

  • the present disclosure relates to communication systems, and in particular to communication systems in which streaming metrics data is made available within the systems.
  • Communication systems such as wireless communication networks, generate a significant amount of metrics data that can be consumed by various network entities and functions to assist in the operation of the systems.
  • metrics data may include, for example, data relating to the establishment and management of wireless connections, cells, radio bearers, sessions, resource usage, etc., within the network.
  • a cloud-based network deployment can produce a large quantity of application and platform metrics.
  • Some of the data generated within the network can be used to generate billing information, such as charging data records.
  • Other data produced in the network can include asset data, such as data describing application license usage. Because these data are generated on-the-fly during network operation, the data is typically provided by a data source to data consumers in a streaming format. The data is typically streamed to users and also stored in a time series database, TSDB, for later retrieval.
  • streaming metrics data generated in the system needs to be stored in a transparent, traceable, and immutable way so that data consumers can be assured of the completeness and integrity of the data they are receiving.
  • a method of operating a data integrity creator includes receiving streaming metrics data of a communication system originating from a metrics data producer, generating integrity ensuring data based on the streaming metrics data, and storing the integrity ensuring data on a blockchain platform.
  • a data integrity creator includes a processing circuit, and a memory that stores computer program instructions that, when executed by the processing circuit, cause the processing circuit to perform operations including receiving streaming metrics data of a communication system originating from a metrics data producer, generating integrity ensuring data based on the streaming metrics data, and storing the integrity ensuring data on a blockchain platform.
  • a data integrity creator includes a streaming data subscription module configured to receive streaming metrics data of a communication system originating from a metrics data producer, and an integrity data generator module configured to generate integrity ensuring data based on the streaming metrics data and store the integrity ensuring data on a blockchain platform.
  • a method of operating a data validator includes obtaining a data request from a metrics data consumer for metrics data of a communication system, retrieving the metrics data from a TSDB, generating a first hash value of the metrics data, obtaining a second hash value of the metrics data from a blockchain platform, comparing the first hash value and the second hash value to determine if the first hash value and the second hash value match, and in response to determining that the first hash value and the second hash value match, transmitting the metrics data to the metrics data consumer.
  • a data validator includes a processing circuit, and a memory that stores computer program instructions that, when executed by the processing circuit, cause the processing circuit to perform operations including obtaining a data request from a metrics data consumer for metrics data of a communication system, retrieving the metrics data from a TSDB, generating a first hash value of the metrics data, obtaining a second hash value of the metrics data from a blockchain platform, comparing the first hash value and the second hash value to determine if the first hash value and the second hash value match, and in response to determining that the first hash value and the second hash value match, transmitting the metrics data to the metrics data consumer.
  • a data validator includes a data retrieval module configured to obtain a data request from a metrics data consumer for metrics data of a communication system and retrieve the metrics data from a TSDB, an integrity generator module configured to generate a first hash value of the metrics data, obtain a second hash value of the metrics data from a blockchain platform, and compare the first hash value and the second hash value to determine if the first hash value and the second hash value match, and a communication interface configured, in response to determining that the first hash value and the second hash value match, to transmit the metrics data to the metrics data consumer.
  • a system for ensuring integrity of streaming metrics data includes a data integrity creator that receives streaming metrics data of a communication system originating from a metrics data producer, generates integrity ensuring data based on the streaming metrics data, and stores the integrity ensuring data on a blockchain platform, and a data validator that receives a data request for metrics data from a metrics data consumer, retrieves the metrics data from a time series database, generates a first hash value for the metrics data, retrieves a second hash value for the metrics data from the blockchain platform, compares the first hash value and the second hash value to determine if the first hash value and the second hash value match, and in response to determining that the first hash value and the second hash value match, transmits the metrics data to the metrics data consumer.
  • Figure 1 is a schematic illustration of a blockchain.
  • Figure 2A is a block diagram of a system for ensuring integrity of streaming metrics data according to some embodiments.
  • Figure 2B illustrates the generation of data blocks from data packets of streaming data according to some embodiments.
  • Figure 3A is a flow diagram that illustrates the generation and storage of integrity ensuring data according to some embodiments.
  • Figure 3B is a flow diagram that illustrates operations of retrieving validated data by a metrics data consumer according to some embodiments.
  • Figure 4 illustrates incremental calculation of a hash value for a data block according to some embodiments.
  • Figure 5 is a functional block diagram that illustrates a data validation system according to some embodiments.
  • Figure 6A is a block diagram of a data integrity creator according to some embodiments.
  • Figure 6B illustrates functional modules that may be stored in the memory of data integrity creator.
  • Figure 7A is a block diagram of a data validator according to some embodiments.
  • Figure 7B illustrates functional modules that may be stored in the memory of data validator.
  • Figure 8 illustrates operations of a data integrity creator according to some embodiments.
  • Figure 9 illustrates operations of a data validator according to some embodiments.
  • blockchain technology can be used to help ensure the security, transparency and traceability of data in many different contexts, and therefore may provide a suitable solution for ensuring the integrity of metrics data, such as streaming metrics data stored in a TSDB, without the need for a dedicated secure data vault.
  • a blockchain is a data structure that consists of a series of blocks of data.
  • the blocks contain transaction data; however, in general, a blockchain may contain any type of data.
  • the integrity of a blockchain derives from the fact that each block contains a cryptographic link to the previous block in the chain, so that every block transitively depends on all blocks before it; this makes it difficult to modify data stored in the blockchain without disrupting the cryptographic links and invalidating the chain.
  • Figure 1 illustrates a blockchain 10 at a high level of generality. Each block in the blockchain is numbered with an index. In Figure 1, blocks N-1, N and N+1 of the blockchain are illustrated. Each block includes data stored therein and also includes a cryptographic hash of the previous block in the chain.
  • Block N includes the Block N data as well as a cryptographic hash of the previous block, Block N-1.
  • the cryptographic hash stored in Block N is generated based on both the data in Block N-1 and the cryptographic hash of Block N-2 stored in Block N-1.
  • a cryptographic hash is a fixed-length signature of a string of data.
  • Cryptographic hashes, which may be generated by any of a number of well-known cryptographic hash functions, such as the 256-bit Secure Hash Algorithm (SHA-256), are numbers that are deterministically generated according to a hashing algorithm so as to be uniquely associated with the data that is input into the hashing algorithm. Any change in the input data, even a very slight change, results in an unpredictable change to the resulting hash. Moreover, for strong hash functions, it is computationally infeasible to reconstruct the input data given a resulting hash, or to generate a different set of input data that produces the same hash.
  • any change to the data in a block will result in the hash of the block, which is stored in the next block in the chain, being invalidated.
  • any tampering or changing of data in a block can be rapidly detected simply by calculating the hash of the block and comparing it to the stored hash in the next block. Since each block depends on all previous blocks due to the chain of hash values stored in the block, any tampering of one block will affect the hash of all subsequent blocks in the chain.
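The hash-chaining and tamper detection described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation; field names and payloads are hypothetical, and SHA-256 is used via the standard hashlib module.

```python
import hashlib

def block_hash(data: bytes, prev_hash: str) -> str:
    """Hash a block's data together with the previous block's hash."""
    return hashlib.sha256(prev_hash.encode() + data).hexdigest()

# Build a three-block chain (indices N-1, N, N+1 as in Figure 1).
chain = []
prev = "0" * 64  # genesis placeholder
for payload in [b"block N-1 data", b"block N data", b"block N+1 data"]:
    h = block_hash(payload, prev)
    chain.append({"data": payload, "prev_hash": prev, "hash": h})
    prev = h

def is_valid(chain) -> bool:
    """Recompute each hash; tampering with any block breaks the chain."""
    for i, blk in enumerate(chain):
        if blk["hash"] != block_hash(blk["data"], blk["prev_hash"]):
            return False
        if i > 0 and blk["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

assert is_valid(chain)
chain[0]["data"] = b"tampered"  # change one block's data
assert not is_valid(chain)      # detected by recomputing hashes
```

Because each stored hash covers the previous hash as well as the data, the single tampered byte invalidates every subsequent link.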
  • a blockchain may be public or private.
  • in a public blockchain, anyone is free to join and participate in the core activities of the blockchain network, such as submitting transactions, deploying smart contracts and executing smart contract functions.
  • a private blockchain allows only selected entry of verified participants.
  • the operator of a private blockchain typically has the right to edit or delete entries on the blockchain.
  • blockchains may be permissioned or permissionless. Permissioned blockchains require participants to have permission to access the blockchain as well as to perform selected activities, such as reading and writing information and/or interacting with smart contracts on the blockchain.
  • Public blockchains are distributed ledgers that any participant can interact with.
  • the users must typically compete for the right to add blocks to the blockchain using a consensus algorithm, such as proof-of-work, that ensures that significant computational power is needed to generate a new block.
  • the Bitcoin network uses a proof-of-work consensus algorithm that currently requires over 100 million trillion hash operations per second of computational power, applied for approximately ten minutes, to find a single block. Because so much computational power is needed to generate each block, the integrity of data in a given block of the blockchain is ensured by the difficulty of finding an alternative valid block before the network finds a new block and extends the chain.
  • a public blockchain platform may publicly expose the data structure of data stored on the blockchain, which may create a security concern depending on the application.
  • a public blockchain platform typically has a floating transaction fee, which means that it may become expensive to perform large numbers of transactions on the network. For example, the average transaction fee on the Ethereum main network is currently over two dollars per transaction. Transaction throughput may also be limited on public blockchains, which can therefore be unusable even for small amounts of metrics data.
  • Private blockchains are centralized ledgers that are managed by a trusted entity, which controls the generation and addition of new blocks to the blockchain. Since the trusted entity can change the content of data in a blockchain by recalculating hash values, the users of a blockchain rely on the integrity of the trusted entity to maintain the integrity of the blockchain. Because the trusted entity can add new blocks at any time, private blockchains can achieve relatively high throughput in terms of the rate at which data can be added to the blockchain.
  • a single metrics data producer in a communication system may produce multiple streams of data, each of which generates thousands or even tens of thousands of records per second.
  • a network node in a communication system such as a wireless communication system, may generate a stream of throughput data and a stream of frequency utilization data.
  • Some embodiments described herein provide systems/methods for ensuring the integrity of streaming metrics data that is stored in a TSDB by using off-chain transactions instead of on-chain transactions for storing the metrics data while storing integrity ensuring data on-chain on a blockchain. That is, according to some embodiments, a blockchain may be used to store a unique identifier for a metrics data producer along with integrity ensuring data associated with data produced by the metrics data producer.
  • multiple records of metrics data that are generated by the metrics data producer during a predetermined time interval are grouped into discrete data packets by a monitoring system.
  • the packets are stored in the TSDB along with a timestamp, a producer identifier that identifies the producer of the data and a metrics identifier that identifies the particular metrics data stream generated by the data producer.
  • the data packets can be retrieved from TSDB using the producer identifier, the metrics identifier, and the timestamp.
  • the packets are grouped into data blocks, and a hash value, or checksum, is generated for each block of data packets.
  • the hash value is stored on a blockchain, which may be a public or private blockchain.
  • although the terms "checksum" and "hash value" are used interchangeably herein to refer to a mathematically calculated signature of a set of data, it will be appreciated that hash values and checksums can be calculated in many different ways using many different algorithms. The embodiments described herein are not limited to particular algorithms for calculating hash values or checksums.
  • some embodiments utilize an incremental hashing algorithm for the checksum generation that repeatedly updates the hash after each data packet or group of packets is received. That is, in some embodiments, the hash of a data block is not calculated using all of the data in the data block at once, but instead is calculated incrementally for each packet until all of the packets in the data block have been received and processed.
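The incremental approach can be sketched with the streaming update API of Python's hashlib, which is one well-known way to hash data piecewise. The packet contents are illustrative:

```python
import hashlib

packets = [b"packet N-1 payload", b"packet N payload", b"packet N+1 payload"]

# Incremental: fold in each packet of the block as it arrives, so the
# whole data block never has to be buffered before hashing.
incremental = hashlib.sha256()
for pkt in packets:
    incremental.update(pkt)

# Equivalent one-shot hash over the concatenated block.
one_shot = hashlib.sha256(b"".join(packets))

assert incremental.hexdigest() == one_shot.hexdigest()
```

The incremental digest equals the one-shot digest over the concatenated packets, so the final value can be recorded once the last packet of the block has been processed.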
  • the final hash value is securely recorded on a blockchain.
  • the hash value may be recorded using a smart contract running on the blockchain network.
  • the smart contract records the integrity ensuring data on the blockchain in a transaction data structure containing the producer identifier, a metrics identifier and a timestamp along with the hash value, so that a query can be performed to obtain the integrity ensuring data.
  • the number of separate transactions that need to be stored on the blockchain can be reduced significantly.
  • the integrity of a group of data packets stored in the TSDB can be verified by calculating the hash of the group of data packets retrieved from the TSDB and comparing the calculated value to the hash value that is stored on the blockchain of a data block corresponding to the group of data packets.
  • a normalization procedure can be used for converting each datapoint representation to a canonical form to provide a more robust solution for hash calculation during the integrity validation.
  • An example normalization procedure is described in more detail below.
  • FIG. 2A illustrates a system 100 according to some embodiments for ensuring the integrity of streaming metrics data that is stored in a TSDB 170
  • Figure 2B illustrates the creation and storage of integrity ensuring data according to some embodiments.
  • measurement data 105 is generated by a metrics data producer 115 (Figure 3A) and provided to a metrics-based monitoring system 110.
  • the metrics-based monitoring system 110 receives the measurement data from the metrics data producer 115 over a communication interface 106 and assembles the measurement data into packets 122.
  • each packet 122 created by the metrics-based monitoring system 110 includes a packet index and a timestamp generated by the metrics-based monitoring system 110.
  • Each packet 122 also includes a producer ID that identifies the metrics data producer 115, a metrics ID that identifies the particular data stream in the data packet 122, and a payload containing the metrics data.
  • Three sequential data packets 122 having packet indices N-1, N and N+1 are illustrated in Figure 2B.
  • the packets 122 are generated sequentially by the metrics-based monitoring system 110 and streamed via a data streaming pipeline 120 to streaming data subscribers.
  • the metrics-based monitoring system 110 also stores the generated packets 122 in the TSDB 170.
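The fields carried by each packet 122, as listed above, can be modeled as a simple record. The field names below are illustrative (the description specifies which values a packet carries, not their names):

```python
from dataclasses import dataclass

@dataclass
class MetricsPacket:
    """One packet 122 as assembled by the monitoring system 110."""
    packet_index: int  # sequential index (N-1, N, N+1, ...)
    timestamp: int     # assigned by the monitoring system, e.g. epoch ms
    producer_id: str   # identifies the metrics data producer 115
    metrics_id: str    # identifies the data stream within the packet
    payload: bytes     # the metrics data records

# Hypothetical example packet
pkt = MetricsPacket(packet_index=1, timestamp=1_700_000_000_000,
                    producer_id="node-7", metrics_id="throughput",
                    payload=b"records")
```

The (producer_id, metrics_id, timestamp) triple is what later allows the packet to be retrieved from the TSDB.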
  • the data streaming pipeline 120 may be a message bus deployment that provides guaranteed delivery of metrics data to subscribers.
  • the data streaming pipeline 120 may provide a mechanism to easily match data stream publishers and subscribers, where the publishers are the metrics data producers and subscribers are the metrics data consumers.
  • the system 100 includes a data integrity creator 130, a blockchain interface 140 and a data validator 160.
  • the data integrity creator 130 receives streaming metrics data in the form of packets 122 from the streaming data pipeline 120 and generates integrity ensuring data 145 that will be stored on a blockchain platform 150.
  • the blockchain platform 150 may be a private blockchain platform or a public blockchain platform.
  • if the blockchain platform is a public blockchain platform, it may be desirable to encrypt some or all of the fields of the integrity ensuring data 145, such as the producer ID and/or metrics IDs, prior to storage on the blockchain to protect the privacy of the integrity ensuring data 145.
  • the blockchain interface 140 provides access to the blockchain platform 150 on which the integrity ensuring data are stored.
  • a data validator 160 provides validated metrics data to a metrics data consumer 180 in response to a request from the metrics data consumer 180.
  • the blockchain platform 150 includes a ledger 155, which may be a decentralized blockchain ledger in the case of a public blockchain platform or a centralized blockchain ledger in the case of a private blockchain platform.
  • the ledger 155 may be accessible through one or more smart contracts 153 that execute transactions to store and retrieve the integrity ensuring data from the ledger 155.
  • the blockchain platform 150 may be an open platform, such as Ethereum, Polkadot, Binance, Solana or Eos, that supports the creation and execution of smart contracts, or may be a proprietary platform that supports the creation and execution of smart contracts or that is designed to permit the storage and retrieval of data such as the integrity ensuring data.
  • the data integrity creator 130 subscribes to the streaming metrics data via the data streaming pipeline 120 so that the metrics data streams will be delivered to it.
  • the data integrity creator 130 receives the streamed packets 122 and assembles one or more of the data packets 122 containing metrics data 105 generated by a data producer 115 and having timestamps falling within a predetermined time interval into a data block 135.
  • the data integrity creator 130 calculates a hash value of the streaming metrics data in the data block 135 and generates integrity ensuring data 145 for the data block 135.
  • a single data block 135 may include data packets 122 for more than one metrics data stream.
  • a data block 135 having data block index K and including three sequential data packets 122 having packet indices N-1, N and N+1 is illustrated in Figure 2B.
  • the integrity ensuring data 145 generated based on the data block 135 includes the data block index K, a data packet start time and data packet total time associated with data block K, the producer ID, the metrics IDs of data packets included in the block, and the hash value generated by the data integrity creator 130.
  • the block timestamp may be the timestamp of the earliest data packet 122 included in the data block 135.
  • the block timestamp may include a start timestamp and an end timestamp covering the timestamps of the data packets 122 included in the data block 135.
  • the block timestamp may include a starting timestamp and a duration covering a time interval associated with the data packets 122 included in the data block 135.
  • the dataPacketStartTime field is the beginning of the time interval for creating integrity ensuring data on the aggregated blocks.
  • Some embodiments may use a fixed-size time-based sliding window for aggregation; the start and end points of the window depend on packet timestamps, not on the system clock of the data integrity creator 130.
  • the window size is selected to be a value that divides evenly into a day/hour/minute so that it is easier to perform the aggregation task. For example, window sizes of 10, 15, 30, and 60 minutes would be good values, while a window size of 7 or 25 minutes would be a less preferable choice.
  • the timestamp in a packet will fall into one of the slots.
  • the first hash value calculation is triggered when the first data packet 122 appears in the time slot, and it concludes when a packet arrives whose timestamp exceeds the time slot.
  • a window grace period can be introduced to conclude the aggregated hash calculation if there is no next packet arriving to trigger the ending.
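The timestamp-driven windowing described above can be sketched as follows. This is a simplified, in-order illustration, not the patent's implementation: a 15-minute window (which divides an hour evenly) is assumed, packets are modeled as (timestamp_ms, payload) pairs, and the trailing slot is closed at end of input rather than after a real grace-period timer.

```python
import hashlib

WINDOW_MS = 15 * 60 * 1000  # 15-minute window divides an hour evenly

def slot_start(ts_ms: int) -> int:
    """Map a packet timestamp to the start of its time slot."""
    return ts_ms - (ts_ms % WINDOW_MS)

def aggregate(packets):
    """Incrementally hash packets per time slot, finalizing a slot when a
    packet whose timestamp exceeds it arrives. Packets: (ts_ms, payload)."""
    finished, current_slot, h = {}, None, None
    for ts, payload in packets:
        s = slot_start(ts)
        if s != current_slot:
            if h is not None:
                finished[current_slot] = h.hexdigest()  # slot ended
            current_slot, h = s, hashlib.sha256()
        h.update(payload)
    if h is not None:  # stands in for the grace-period close of the tail slot
        finished[current_slot] = h.hexdigest()
    return finished
```

In a real deployment the final slot would be concluded by a grace-period timer, since no later packet may arrive to trigger the ending.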
  • the dataPacketTotalTime field indicates the total time covered by the data block. Because the TSDB 170 has a flat structure with timestamped metrics data, the system can only retrieve blocks of data by timestamps from the TSDB for given metrics.
  • the integrity ensuring data may, in some embodiments, include a field that indicates the total number of data packets included in the block for which the integrity ensuring data was generated.
  • the integrity ensuring data may also include a field that indicates the hash value of the first data packet included in the block for which the integrity ensuring data was generated.
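Collecting the fields enumerated above, one possible shape for the integrity ensuring data 145 is sketched below. Only dataPacketStartTime and dataPacketTotalTime are named in the text; the other field names are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class IntegrityEnsuringData:
    """Integrity ensuring data 145 recorded for one data block 135."""
    data_block_index: int                    # block index, e.g. K
    data_packet_start_time: int              # dataPacketStartTime (epoch ms)
    data_packet_total_time: int              # dataPacketTotalTime (ms)
    producer_id: str                         # identifies producer 115
    metrics_ids: List[str] = field(default_factory=list)
    hash_value: str = ""                     # hash over the block's packets
    total_packets: Optional[int] = None      # optional, per the text
    first_packet_hash: Optional[str] = None  # optional, per the text

ied = IntegrityEnsuringData(data_block_index=7, data_packet_start_time=0,
                            data_packet_total_time=900_000,
                            producer_id="node-7", metrics_ids=["throughput"],
                            hash_value="ab" * 32)
```

A record of this shape is what the smart contract transaction would store, keyed so that it can later be queried by producer ID, metrics ID, and time.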
  • the data integrity creator 130 then stores the integrity ensuring data 145 on the blockchain platform 150 via the blockchain interface 140.
  • the blockchain interface 140 provides access to the blockchain platform 150 using a smart contract 153 that stores the integrity ensuring data 145 for a data block 135 on the blockchain ledger 155.
  • the blockchain interface 140 also provides a query interface for accessing the recorded integrity ensuring data 145.
  • a smart contract 153 implements business logic for making a data "asset" transaction on the blockchain platform 150 by using methods and data structures defined in a software development kit (SDK) provided by the blockchain platform 150.
  • a smart contract is a software module that is stored on and can be executed by some types of blockchain platforms.
  • the Ethereum blockchain is an example of a public blockchain platform that can execute smart contracts written in the Solidity programming language.
  • Other examples of public blockchains that support smart contracts include Polkadot, Solana, and Eos.
  • Methods belonging to the smart contract can be executed by invoking the methods via transactions on the blockchain.
  • the smart contract can handle a data asset transaction to the blockchain platform.
  • the data asset may be modelled according to a representational state transfer (REST) application programming interface (API) in the blockchain interface 140.
  • a query function is also implemented for the data asset in the smart contract.
  • the blockchain interface 140 calls the smart contract 153 deployed to the blockchain platform 150 to perform transactions and make queries.
  • the blockchain interface 140 creates a data structure defined in the smart contract and invokes a new blockchain transaction for recording the integrity ensuring data 145.
  • the API of the blockchain interface 140 has a query function which in turn calls the query function in the smart contract 153 for retrieving the integrity ensuring data 145.
  • the data validator 160 is accessible to the metrics data consumer 180 via an API, such as a REST API, to enable retrieval of the metrics data stored in the TSDB 170 with data validation.
  • the data validator 160 uses the API provided by the blockchain interface 140 to read the recorded integrity ensuring data 145 from the blockchain platform 150.
  • the metrics data consumer 180 may send a request to the data validator 160 for validated metrics data.
  • the request identifies the data producer, the data stream, and a time interval for which metrics data is sought.
  • the data validator 160 determines which block or blocks 135 of metrics data correspond to the time period indicated in the request, and retrieves the corresponding data packets 122 from the TSDB 170.
  • the data validator 160 then generates a hash value for each block of the retrieved data packets using the same hashing algorithm used by the data integrity creator 130 to generate the integrity ensuring data 145.
  • the data validator 160 retrieves the integrity ensuring data 145 for the specified blocks 135 from the blockchain platform via the blockchain interface 140, and compares the hash values in the retrieved integrity ensuring data 145 to the re-calculated hash values for the data packets retrieved from the TSDB 170. If, for a given data block 135, the hash value calculated by the data validator 160 matches the hash value in the integrity ensuring data 145 for the data block 135 retrieved from the blockchain platform 150, the data packets 122 in the data block 135 are determined to be valid, and the valid data packets 122 are provided to the metrics data consumer 180.
  • otherwise, the data block 135 is considered to be invalid, and an indication may be returned to the metrics data consumer 180 that the data packets 122 in the data block 135 are considered to be invalid.
  • FIG. 3A is a flow diagram illustrating the generation and storage of integrity ensuring data according to some embodiments.
  • a metrics data producer 115 transmits data records 202 periodically or upon request to a metrics monitoring system 110.
  • the metrics monitoring system 110 assembles the records into discrete data packets 122 including a producer ID identifying the metrics data producer 115, a metrics ID identifying the particular data stream to which the records 202 belong, and a timestamp.
  • the metrics monitoring system 110 transmits (arrow 206A) the generated data packets 122 to a TSDB 170 and also streams (arrow 206B) the packets over a data streaming pipeline 120.
  • a data integrity creator 130 may subscribe (arrow 205) to a particular data stream transmitted by the metrics monitoring system 110 over the data streaming pipeline 120 and receive the data packets (arrow 208) via the data streaming pipeline 120.
  • the metrics monitoring system 110 may transmit data packets 122 directly to the data integrity creator 130 instead of the data integrity creator 130 receiving the data packets over the data streaming pipeline 120.
  • the TSDB 170 may obtain the data packets 122 using the data streaming pipeline 120 by subscribing to the stream.
  • the data integrity creator 130 assembles the data packets 122 into data blocks 135 (block 210) and generates integrity ensuring data 145 for each data block 135.
  • the integrity ensuring data 145 includes a data block index, a timestamp, a producer ID, the metrics IDs and a hash value calculated based on the data in the data packets 122 included in the data block 135.
  • the data integrity creator 130 then transmits (arrow 214) the integrity ensuring data 145 to the blockchain interface 140, which stores (arrow 216) the integrity ensuring data 145 on the blockchain platform 150.
  • FIG. 3B illustrates operations of retrieving validated data by a metrics data consumer 180 according to some embodiments.
  • the metrics data consumer 180 transmits a request (arrow 302) for validated metrics data to the data validator 160.
  • the request 302 identifies the data producer, the associated metrics in the data stream and the time period for which validated data is requested.
  • the data validator 160 determines which data packets 122 are covered by the request and which data blocks 135 contain the requested data packets 122 (block 303).
  • the data validator 160 then sends a query (arrow 304) to the TSDB 170, which responds by transmitting the requested data packets 122 to the data validator 160 (arrow 306).
  • For each data block corresponding to the received data packets, the data validator 160 then generates a hash value from the received data packets (block 308). The data validator 160 then transmits a request (arrow 310) to the blockchain interface 140 requesting the blockchain interface 140 to retrieve the integrity ensuring data 145 associated with the block 135. The blockchain interface 140 then sends a request to the blockchain platform 150, for example by invoking a smart contract, to obtain the integrity ensuring data 145 (arrow 312). The blockchain platform 150 responds with a message including the integrity ensuring data 145 (arrow 314), and the blockchain interface 140 provides the integrity ensuring data 145 to the data validator 160.
  • the data validator 160 validates the integrity of the block by comparing the hash value generated from the data packets retrieved from the TSDB 170 at block 308 with the hash value for the block contained in the integrity ensuring data 145 retrieved from the blockchain platform 150. If the hash values match, then the data validator 160 transmits the validated metrics data to the metrics data consumer 180 (arrow 320). If the hash values do not match, the data validator 160 may transmit a response to the metrics data consumer 180 indicating that the data could not be validated.
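The comparison at the heart of this validation step can be sketched like so (hypothetical function and field names; SHA3-256 assumed as the hash algorithm):

```python
import hashlib

def validate_block(packets_from_tsdb, integrity_ensuring_data):
    """Recompute the block hash from the data packets retrieved from the
    TSDB and compare it with the hash value retrieved from the blockchain
    platform. Returns True only if the two hash values match."""
    h = hashlib.sha3_256()
    for payload in packets_from_tsdb:
        h.update(payload)
    return h.hexdigest() == integrity_ensuring_data["hash"]
```

Any alteration of the stored packets, even a single byte, changes the recomputed hash and makes the comparison fail.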
  • data blocks 135 are created by the data integrity creator 130 by assembling data packets 122 having timestamps that fall within a predetermined time interval.
  • the time interval may be configured so that the total number of transactions is less than the maximum transaction throughput on the blockchain network.
  • the time interval may be calculated such that R_tx = Σ_{i=1}^{N} 1/T_i ≤ R_tx,max, where:
  • R_tx is the transaction rate
  • N is the number of data block series
  • T_i is the sampling period for a data block series i
  • R_tx,max is the maximum transaction rate which can be handled by the blockchain network.
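One reading of this constraint, consistent with the variables listed above, is that each data block series contributes one blockchain transaction per sampling period, so the aggregate transaction rate is the sum of the reciprocal sampling periods and must stay below the platform limit. A small numeric check of that reading (the sampling periods and limit below are made up):

```python
def transaction_rate(sampling_periods):
    """Aggregate blockchain transaction rate when each data block series
    submits one transaction per sampling period: R_tx = sum_i 1/T_i,
    in transactions per second."""
    return sum(1.0 / t for t in sampling_periods)

# Example: three data block series sampled every 60 s, 30 s and 10 s.
r_tx = transaction_rate([60.0, 30.0, 10.0])  # 1/60 + 1/30 + 1/10 = 0.15 tx/s
r_tx_max = 1.0  # assumed maximum transaction rate of the blockchain network
assert r_tx < r_tx_max
```

Lengthening the sampling periods (i.e., widening the time interval per data block) is the lever that keeps the transaction rate under the platform's maximum.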
  • TSDBs such as the Prometheus and InfluxDB databases support a text-based wire format.
  • In Prometheus, the default text-based format is called the Exposition Format.
  • In the Exposition Format, each time series is uniquely identified by its metric name and optional key-value pairs called labels.
  • InfluxDB uses another text-based format, called the InfluxDB Line Protocol, for metrics.
  • because the textual representation can be vendor specific and the hash value for a data block is calculated over a byte array representing the stream data, a single-byte change in the data will produce a different checksum value.
  • a normalization procedure for the metrics textual representation may be used in some embodiments.
  • a metrics datapoint normalization function may be used to normalize the metrics tag/label sets together with their values and timestamps to a standard, or normal, form so that the hash value is reproducible for the same data packets even if the textual representation format changes or the sequence changes for data points having the same timestamp.
  • metrics data can be verified independent of the wire protocol.
  • a TSDB may have a short retention time.
  • the Prometheus TSDB has a 15-day retention for the time-series data, which means that data may need to be replicated to long-term storage if access is needed after the TSDB retention period.
  • because the long-term storage might be provided by a different vendor platform and may use different text-based formats for queries, normalizing the metrics data may provide a consistent text-based (byte-compatible) format for the correct re-calculation of the hash value.
  • Label (or tag) set, which consists of a list of labels/tags with or without values; the set needs to be sorted, and entries are of string type only.
  • Metrics (or measurement or field) value, which can be one of the following types: floats, integers, strings, or Booleans.
  • Timestamp, whose format depends on the platform and the programming language.
  • For float values, scientific notation can be used, e.g., 1.2345E6, -9.876E-54, 0E0, 1E0, 123456789E0.
  • For non-real numbers and infinity, use e.g. 'NaN' for non-real numbers, and '+Inf', '-Inf' for infinity.
  • For Boolean values, the numbers '1' and '0' can be used for the normalization.
  • For timestamps, the UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970 can be used.
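A minimal normalization function along these lines might look as follows. The exact canonical text layout (label separator, scientific-notation precision, brace syntax) is an assumption for illustration, since the patent does not fix one:

```python
def normalize_datapoint(metric_name, labels, value, timestamp_ns):
    """Normalize one metrics data point to a canonical text form:
    labels sorted by key, numeric values in scientific notation,
    Booleans as '1'/'0', NaN/infinity spelled out, and the timestamp
    as UNIX epoch nanoseconds."""
    label_str = ",".join(f"{k}={labels[k]}" for k in sorted(labels))
    if isinstance(value, bool):                       # check bool before int
        value_str = "1" if value else "0"
    elif isinstance(value, float) and value != value:  # NaN is not equal to itself
        value_str = "NaN"
    elif value in (float("inf"), float("-inf")):
        value_str = "+Inf" if value > 0 else "-Inf"
    elif isinstance(value, (int, float)):
        value_str = f"{value:E}"  # one possible canonical scientific notation
    else:
        value_str = str(value)
    return f"{metric_name}{{{label_str}}} {value_str} {timestamp_ns}"
```

Because the label keys are sorted and each value type has a single rendering, the same data point always yields the same byte string, and therefore the same hash, regardless of the wire format it originally arrived in.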
  • Figure 4 illustrates incremental calculation of a hash value for a data block 135.
  • the hash value may be calculated incrementally as the data packets 122 are received by the data integrity creator 130, and data packets 122 may be discarded as the incremental hash value is generated.
  • when hash values 402-1, 402-2 are generated individually for the data packets 122-1, 122-2, a different hash value is created for each data packet.
  • when the packets 122-1, 122-2 are combined into a data block 135, the hash value 404 for the combined string in the data block 135 is different from the individual hash values 402-1, 402-2.
  • when a hash value 406 is generated via incremental string hashing of each of the data packets 122-1, 122-2 using the SHA3-256 hashing algorithm, the resulting hash value 406 is the same as the hash value 404 generated from the combined data.
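This equivalence between one-shot and incremental hashing can be verified directly with Python's `hashlib` (the packet payloads below are made up):

```python
import hashlib

packet_1 = b"metric_a{host=web1} 1.0E0 1640995200000000000\n"
packet_2 = b"metric_b{host=web1} 2.5E0 1640995200000000000\n"

# Hashing each packet individually yields two different hash values.
h1 = hashlib.sha3_256(packet_1).hexdigest()
h2 = hashlib.sha3_256(packet_2).hexdigest()
assert h1 != h2

# Hash of the combined data block, computed in one shot.
combined = hashlib.sha3_256(packet_1 + packet_2).hexdigest()

# Incremental hashing: feed packets one at a time; each packet can be
# discarded as soon as it has been fed into the hash state.
h = hashlib.sha3_256()
for packet in (packet_1, packet_2):
    h.update(packet)
incremental = h.hexdigest()

# The incremental hash equals the hash of the combined data,
# and differs from both per-packet hashes.
assert incremental == combined
assert incremental not in (h1, h2)
```

This is why the data integrity creator need not buffer an entire data block: the hash state absorbs each packet as it arrives and still produces the same final value as hashing the whole block at once.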
  • Embodiments described herein may provide certain technical advantages. For example, using blockchain technology with its enhanced security and transparency, the metrics data integrity can be ensured without the need to encrypt the data using a dedicated secure data vault.
  • Aggregating multiple metrics streams from the same producer during a time interval into a searchable data block may make the integrity ensuring process more scalable, as the multiple data streams may be validated in a bundle instead of for each data stream record.
  • Some embodiments may have low or reduced impact on existing metrics data flows, as they provide a non-intrusive solution to the metrics data pipeline.
  • the metrics data can be retrieved from the TSDB and used via existing APIs, while the data end-user/consumer can use the methods described herein to retrieve and validate the data stored in the TSDB.
  • the integrity ensuring data for the metrics data streams is generated close to the source, so there is less chance for the data to be altered once published to the data pipeline by a producer.
  • FIG. 5 is a functional block diagram that illustrates a data validation system 100 according to some embodiments for ensuring the integrity of streaming metrics data.
  • the data validation system 100 includes a data integrity creation subsystem 112 that implements functionality of the data integrity creator 130 described above, a blockchain platform interface 116 that implements functionality of the blockchain interface 140 described above, and a data validation subsystem 114 that implements functionality of the data validator 160 described above.
  • FIG. 6A is a block diagram of a data integrity creator 130 according to some embodiments.
  • the data integrity creator 130 includes a processing circuit 134, a memory 136 coupled to the processing circuit 134, and a communication interface 118 coupled to the processing circuit 134.
  • the processing circuit 134 may be a single processor or may include multiple processors, and may be a distributed or cloud-based processor in some embodiments.
  • Figure 6B illustrates functional modules that may be stored in the memory 136 of data integrity creator 130.
  • the functional modules may include a streaming data subscription module 122 that subscribes to streaming metrics data via the streaming data pipeline 120 illustrated in Figure 2A, and an integrity data generator module 124 that generates integrity ensuring data 145 to be stored on a blockchain platform 150 as described above.
  • FIG. 7A is a block diagram of a data validator 160 according to some embodiments.
  • the data validator 160 includes a processing circuit 234, a memory 236 coupled to the processing circuit 234, and a communication interface 218 coupled to the processing circuit 234.
  • the processing circuit 234 may be a single processor or may include multiple processors, and may be a distributed or cloud-based processor in some embodiments.
  • Figure 7B illustrates functional modules that may be stored in the memory 236 of data validator 160.
  • the functional modules may include a data retrieval module 222 that retrieves metrics data from a TSDB 170 illustrated in Figure 2A, and an integrity data generator module 224 that generates a hash value for comparing with integrity ensuring data 145 stored on the blockchain platform 150 as described above.
  • FIG. 8 illustrates operations of a data integrity creator 130 according to some embodiments.
  • a method of operating a data integrity creator includes receiving (block 802) streaming metrics data of a communication system originating from a metrics data producer, generating (block 804) integrity ensuring data based on the streaming metrics data, and storing (block 806) the integrity ensuring data on a blockchain platform.
  • Figure 9 illustrates operations of a data validator 160 according to some embodiments.
  • a method of operating a data validator includes obtaining (block 902) a data request from a metrics data consumer for metrics data of a communication system, and retrieving (block 904) the metrics data from a TSDB.
  • the method further includes generating (block 906) a first hash value of the metrics data, obtaining (block 908) a second hash value of the metrics data from a blockchain platform, and comparing (block 910) the first hash value and the second hash value to determine if the first hash value and the second hash value match.
  • if the first hash value and the second hash value match, the data validator transmits (block 912) the metrics data to the metrics data consumer. If the first hash value and the second hash value do not match, an error message may be returned to the data consumer (block 914).
  • Example embodiments are described herein with reference to block diagrams and/or flowchart illustrations of computer-implemented methods, apparatus (systems and/or devices) and/or computer program products. It is understood that a block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions that are performed by one or more computer circuits.
  • These computer program instructions may be provided to a processor circuit of a general purpose computer circuit, special purpose computer circuit, and/or other programmable data processing circuit to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, transform and control transistors, values stored in memory locations, and other hardware components within such circuitry to implement the functions/acts specified in the block diagrams and/or flowchart block or blocks, and thereby create means (functionality) and/or structure for implementing the functions/acts specified in the block diagrams and/or flowchart block(s).

Abstract

A system for ensuring integrity of streaming metrics data includes a data integrity creator that receives streaming metrics data of a communication system originating from a metrics data producer, generates integrity ensuring data based on the streaming metrics data, and stores the integrity ensuring data on a blockchain platform, and a data validator that receives a data request for metrics data from a metrics data consumer, retrieves the metrics data from a time series database, generates a first hash value for the metrics data, retrieves a second hash value for the metrics data from the blockchain platform, compares the first hash value and the second hash value to determine if the first hash value and the second hash value match, and in response to determining that the first hash value and the second hash value match, transmits the metrics data to the metrics data consumer.

Description

STREAMING METRICS DATA INTEGRITY IN A COMMUNICATION SYSTEM
TECHNICAL FIELD
[0001] The present disclosure relates to communication systems, and in particular to communication systems in which streaming metrics data is made available within the systems.
BACKGROUND
[0002] Communication systems, such as wireless communication networks, generate a significant amount of metrics data that can be consumed by various network entities and functions to assist in the operation of the systems. Such metrics data may include, for example, data relating to the establishment and management of wireless connections, cells, radio bearers, sessions, resource usage, etc., within the network.
[0003] In particular, a cloud-based network deployment can produce a large quantity of application and platform metrics. Some of the data generated within the network, such as resource usage data, can be used to generate billing information, such as charging data records. Other data produced in the network can include asset data, such as data describing application license usage. Because these data are generated on-the-fly during network operation, the data is typically provided by a data source to data consumers in a streaming format. The data is typically streamed to users and also stored in a time series database, TSDB, for later retrieval.
[0004] Because of the size, complexity and interoperability of modern communication systems, portions of the system may be managed or operated by different entities. Accordingly, the streaming metrics data generated in the system needs to be stored in a transparent, traceable, and immutable way so that the data consumers can be assured of the completeness and integrity of the data they are receiving.
SUMMARY
[0005] A method of operating a data integrity creator according to some embodiments includes receiving streaming metrics data of a communication system originating from a metrics data producer, generating integrity ensuring data based on the streaming metrics data, and storing the integrity ensuring data on a blockchain platform.
[0006] A data integrity creator according to some embodiments includes a processing circuit, and a memory that stores computer program instructions that, when executed by the processing circuit, cause the processing circuit to perform operations including receiving streaming metrics data of a communication system originating from a metrics data producer, generating integrity ensuring data based on the streaming metrics data, and storing the integrity ensuring data on a blockchain platform.
[0007] A data integrity creator according to some embodiments includes a streaming data subscription module configured to receive streaming metrics data of a communication system originating from a metrics data producer, and an integrity data generator module configured to generate integrity ensuring data based on the streaming metrics data and store the integrity ensuring data on a blockchain platform.
[0008] A method of operating a data validator according to some embodiments includes obtaining a data request from a metrics data consumer for metrics data of a communication system, retrieving the metrics data from a TSDB, generating a first hash value of the metrics data, obtaining a second hash value of the metrics data from a blockchain platform, comparing the first hash value and the second hash value to determine if the first hash value and the second hash value match, and in response to determining that the first hash value and the second hash value match, transmitting the metrics data to the metrics data consumer.
[0009] A data validator according to some embodiments includes a processing circuit, and a memory that stores computer program instructions that, when executed by the processing circuit, cause the processing circuit to perform operations including obtaining a data request from a metrics data consumer for metrics data of a communication system, retrieving the metrics data from a TSDB, generating a first hash value of the metrics data, obtaining a second hash value of the metrics data from a blockchain platform, comparing the first hash value and the second hash value to determine if the first hash value and the second hash value match, and in response to determining that the first hash value and the second hash value match, transmitting the metrics data to the metrics data consumer.
[0010] A data validator according to some embodiments includes a data retrieval module configured to obtain a data request from a metrics data consumer for metrics data of a communication system and retrieve the metrics data from a TSDB, an integrity generator module configured to generate a first hash value of the metrics data, obtain a second hash value of the metrics data from a blockchain platform, and compare the first hash value and the second hash value to determine if the first hash value and the second hash value match, and a communication interface configured, in response to determining that the first hash value and the second hash value match, to transmit the metrics data to the metrics data consumer.
[0011] A system for ensuring integrity of streaming metrics data according to some embodiments includes a data integrity creator that receives streaming metrics data of a communication system originating from a metrics data producer, generates integrity ensuring data based on the streaming metrics data, and stores the integrity ensuring data on a blockchain platform, and a data validator that receives a data request for metrics data from a metrics data consumer, retrieves the metrics data from a time series database, generates a first hash value for the metrics data, retrieves a second hash value for the metrics data from the blockchain platform, compares the first hash value and the second hash value to determine if the first hash value and the second hash value match, and in response to determining that the first hash value and the second hash value match, transmits the metrics data to the metrics data consumer.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] Figure 1 is a schematic illustration of a blockchain.
[0013] Figure 2A is a block diagram of a system for ensuring integrity of streaming metrics data according to some embodiments.
[0014] Figure 2B illustrates the generation of data blocks from data packets of streaming data according to some embodiments.
[0015] Figure 3A is a flow diagram that illustrates the generation and storage of integrity ensuring data according to some embodiments.
[0016] Figure 3B is a flow diagram that illustrates operations of retrieving validated data by a metrics data consumer according to some embodiments.
[0017] Figure 4 illustrates incremental calculation of a hash value for a data block according to some embodiments.
[0018] Figure 5 is a functional block diagram that illustrates a data validation system according to some embodiments.
[0019] Figure 6A is a block diagram of a data integrity creator according to some embodiments.
[0020] Figure 6B illustrates functional modules that may be stored in the memory of data integrity creator.
[0021] Figure 7A is a block diagram of a data validator according to some embodiments.
[0022] Figure 7B illustrates functional modules that may be stored in the memory of data validator.
[0023] Figure 8 illustrates operations of a data integrity creator according to some embodiments.
[0024] Figure 9 illustrates operations of a data validator according to some embodiments.
DETAILED DESCRIPTION
[0025] Although more widely known for its use as a foundational technology of cryptocurrencies, blockchain technology can be used to help ensure the security, transparency and traceability of data in many different contexts, and therefore may provide a suitable solution for ensuring the integrity of metrics data, such as streaming metrics data stored in a TSDB, without the need for a dedicated secure data vault.
[0026] A blockchain is a data structure that consists of a series of blocks of data. In typical cryptocurrency blockchains, the blocks contain transaction data; however, in general, a blockchain may contain any type of data. The integrity of a blockchain derives from the fact that each block contains a cryptographic link to every other block in the chain, which makes it difficult to modify data stored in the blockchain without disrupting the cryptographic links and invalidating the chain. For example, Figure 1 illustrates a blockchain 10 at a high level of generality. Each block in the blockchain is numbered with an index. In Figure 1, blocks N-1, N and N+1 of the blockchain are illustrated. Each block includes data stored therein and also includes a cryptographic hash of the previous block in the chain. For example, Block N includes the Block N data as well as a cryptographic hash of the previous block, Block N-1. The cryptographic hash stored in Block N is generated based on both the data in Block N-1 and the cryptographic hash of Block N-2 stored in Block N-1.
[0027] A cryptographic hash is a fixed-length signature of a string of data. Cryptographic hashes, which may be generated by any of a number of well-known cryptographic hashing functions, such as the 256-bit Secure Hashing Algorithm (SHA256), are numbers that are deterministically generated according to a hashing algorithm to be uniquely associated with the data that is input into the hashing algorithm. Any change in the input data, even a very slight change, results in an unpredictable change to the resulting hash. Moreover, for large hashing functions, it is computationally infeasible to reconstruct the input data given a resulting hash, or to generate a different set of input data that produces the same hash.
[0028] Consequently, any change to the data in a block will result in the hash of the block, which is stored in the next block in the chain, being invalidated. Thus, any tampering or changing of data in a block can be rapidly detected simply by calculating the hash of the block and comparing it to the stored hash in the next block. Since each block depends on all previous blocks due to the chain of hash values stored in the block, any tampering of one block will affect the hash of all subsequent blocks in the chain.
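The chained-hash structure described in these paragraphs can be sketched in a few lines (illustrative only; real blockchains also carry block headers, nonces, and consensus rules):

```python
import hashlib

def block_hash(data, prev_hash):
    """Hash of a block covers both its data and the previous block's hash,
    so each block is cryptographically linked to all of its predecessors."""
    return hashlib.sha256(data + prev_hash).hexdigest().encode()

# Build a three-block chain from a fixed genesis value.
genesis = b"0" * 64
blocks = [b"block 1 data", b"block 2 data", b"block 3 data"]
hashes = []
prev = genesis
for data in blocks:
    prev = block_hash(data, prev)
    hashes.append(prev)

# Tampering with block 1 changes its hash, and because each later hash
# folds in the previous one, every subsequent hash changes as well.
tampered = block_hash(b"block 1 DATA", genesis)
assert tampered != hashes[0]
assert block_hash(blocks[1], tampered) != hashes[1]
```

Verifying a chain is therefore just a matter of recomputing each block's hash and comparing it with the value stored in the following block.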
[0001] In general, a blockchain may be public or private. In a public blockchain, anyone is free to join and participate in the core activities of the blockchain network, such as submitting transactions, deploying smart contracts and executing smart contract functions. A private blockchain allows only selected entry of verified participants. The operator of a private blockchain typically has the right to edit or delete entries on the blockchain.
[0002] In addition to the public/private distinction, blockchains may be permissioned or permissionless. Permissioned blockchains require participants to have permission to access the blockchain as well as to perform selected activities, such as reading and writing information and/or interacting with smart contracts on the blockchain.
[0003] Public blockchains are distributed ledgers that any participant can interact with. However, to ensure the integrity of the blockchain, the users must typically compete for the right to add blocks to the blockchain using a consensus algorithm such as proof-of-work that ensures that significant computational power is needed to generate a new block. For example, the Bitcoin network uses a proof-of-work consensus algorithm that currently requires over 100 million trillion hash operations per second for approximately ten minutes to find a single block. Because so much computational power is needed to generate each block, the integrity of data in a given block of the blockchain is ensured by the difficulty of finding an alternative valid block before the network finds a new block and extends the chain.
[0004] Accordingly, although a public blockchain may not need to be managed by a trusted entity, the integrity of the public blockchain comes at the cost of requiring large amounts of computational power to generate new blocks, which can result in relatively low throughput due to the slow addition of new blocks to the chain.
[0005] In addition to the low throughput, a public blockchain platform may publicly expose the data structure of data stored on the blockchain, which may create a security concern depending on the application. Moreover, a public blockchain platform typically has a floating transaction fee, which means that it may become expensive to perform large numbers of transactions on the network. For example, the average transaction fee on the Ethereum main network is currently over two dollars per transaction. Transaction throughput may also be limited on public blockchains, which can therefore be unusable even for small amounts of metrics data.
[0006] Private blockchains are centralized ledgers that are managed by a trusted entity, which controls the generation and addition of new blocks to the blockchain. Since the trusted entity can change the content of data in a blockchain by recalculating hash values, the users of a blockchain rely on the integrity of the trusted entity to maintain the integrity of the blockchain. Because the trusted entity can add new blocks at any time, private blockchains can achieve relatively high throughput in terms of the rate at which data can be added to the blockchain.
[0007] Moreover, because access to the private blockchain is limited, the security of the data structure stored on the blockchain is more protected. However, in the embodiments described herein, metrics data is both stored in a time series database and transmitted over a data pipeline, making it more difficult for even a trusted entity to tamper with the integrity ensuring data. For these reasons, a private blockchain may be preferable to use in some cases.
[0008] Although blockchains can help to ensure the integrity of data, current blockchain solutions for ensuring data integrity on-chain have both throughput and data storage issues, because the blockchain platform has a scalability limitation in terms of number of transactions and record size of data that can be stored in a block.
[0009] A single metrics data producer in a communication system may produce multiple streams of data, each of which generates thousands or even tens of thousands of records per second. For example, a network node in a communication system, such as a wireless communication system, may generate a stream of throughput data and a stream of frequency utilization data.
[0010] Due to the potentially large volume of such data, conventional blockchain platforms may not be suitable for storing the data. Likewise, conventional blockchain technology may not be able to easily handle integrity-ensuring data on-chain for multiple streams, even using a permissioned/private mainstream blockchain platform.
[0011] Some solutions for ensuring data integrity with blockchain technology using permissioned blockchains have been proposed. While such approaches may provide a verified data vault service, they may nevertheless not be suitable for verifying the integrity of streaming network metrics data that is stored on an existing TSDB, or for handling metrics data streaming in real time with very little impact on the existing flow.
[0012] Some embodiments described herein provide systems/methods for ensuring the integrity of streaming metrics data that is stored in a TSDB by using off-chain transactions instead of on-chain transactions for storing the metrics data while storing integrity ensuring data on-chain on a blockchain. That is, according to some embodiments, a blockchain may be used to store a unique identifier for a metrics data producer along with integrity ensuring data associated with data produced by the metrics data producer.
[0013] To reduce the number of transactions on the blockchain, multiple records of metrics data that are generated by the metrics data producer during a predetermined time interval are grouped into discrete data packets by a monitoring system. The packets are stored in the TSDB along with a timestamp, a producer identifier that identifies the producer of the data and a metrics identifier that identifies the particular metrics data stream generated by the data producer. The data packets can be retrieved from TSDB using the producer identifier, the metrics identifier, and the timestamp.
[0014] To ensure the integrity of the stored data, the packets are grouped into data blocks, and a hash value, or checksum, is generated for each block of data packets. The hash value is stored on a blockchain, which may be a public or private blockchain. Although the terms "checksum" and "hash value" are used interchangeably herein to refer to a mathematically calculated signature of a set of data, it will be appreciated that hash values and checksums can be calculated in many different ways using many different algorithms. The embodiments described herein are not limited to particular algorithms for calculating hash values or checksums.
[0015] Because of the rate at which the metrics data may be streamed, it may not be desirable to buffer the metrics data stream for the entire time interval and then calculate the hash for all the metrics data generated during the time interval period. Accordingly, some embodiments utilize an incremental hashing algorithm for the checksum generation that repeatedly updates the hash after each data packet or group of packets is received. That is, in some embodiments, the hash of a data block is not calculated using all of the data in the data block at once, but instead is calculated incrementally for each packet until all of the packets in the data block have been received and processed.
[0016] For each data block, the final hash value is securely recorded on a blockchain. The hash value may be recorded using a smart contract running on the blockchain network. The smart contract records the integrity ensuring data on the blockchain in a transaction data structure containing the producer identifier, a metrics identifier and a timestamp along with the hash value, so that a query can be performed to obtain the integrity ensuring data. In this way, the number of separate transactions that need to be stored on the blockchain can be reduced significantly. Moreover, because only the integrity ensuring data is stored on the blockchain, there is no need to store the voluminous metrics data on the blockchain.
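One hypothetical shape for such an on-chain record and its query key might be the following (the field names are illustrative, not taken from the patent, and an in-memory dict stands in for the ledger):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntegrityRecord:
    """Transaction data structure recorded on-chain by the smart contract:
    producer identifier, metrics identifier and timestamp alongside the
    hash value for the data block."""
    producer_id: str
    metrics_id: str
    timestamp_ns: int
    hash_value: str

    def key(self):
        """Composite key a query can use to look up the record later."""
        return (self.producer_id, self.metrics_id, self.timestamp_ns)

# A dict stands in for the blockchain ledger in this sketch.
ledger = {}
rec = IntegrityRecord("producer-1", "cpu_usage",
                      1_700_000_000_000_000_000, "deadbeef")
ledger[rec.key()] = rec
assert ledger[("producer-1", "cpu_usage", 1_700_000_000_000_000_000)] is rec
```

Keying records by producer, metric and timestamp lets the data validator fetch exactly the hash value covering the requested time period without scanning the whole ledger.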
[0017] The integrity of a group of data packets stored in the TSDB can be verified by calculating the hash of the group of data packets retrieved from the TSDB and comparing the calculated value to the hash value that is stored on the blockchain of a data block corresponding to the group of data packets.
[0018] A normalization procedure can be used for converting each datapoint representation to a canonical form to provide a more robust solution for hash calculation during the integrity validation. An example normalization procedure is described in more detail below.
[0019] Figure 2A illustrates a system 100 according to some embodiments for ensuring the integrity of streaming metrics data that is stored in a TSDB 170, and Figure 2B illustrates the creation and storage of integrity ensuring data according to some embodiments. Referring to Figures 2A and 2B, measurement data 105 is generated by a metrics data producer 115 (Figure 3A) and provided to a metrics-based monitoring system 110. The metrics-based monitoring system 110 receives the measurement data from the metrics data producer 115 over a communication interface 106 and assembles the measurement data into packets 122.
[0020] Referring to Figure 2B, each packet 122 created by the metrics-based monitoring system 110 includes a packet index and a timestamp generated by the metrics-based monitoring system 110. Each packet 122 also includes a producer ID that identifies the metrics data producer 115, a metrics ID that identifies the particular data stream in the data packet 122, and a payload containing the metrics data. Three sequential data packets 122 having packet indices N-1, N and N+1 are illustrated in Figure 2B.
[0021] Referring to Figure 2A, the packets 122 are generated sequentially by the metrics-based monitoring system 110 and streamed via a data streaming pipeline 120 to streaming data subscribers. The metrics-based monitoring system 110 also stores the generated packets 122 in the TSDB 170. The data streaming pipeline 120 may be a message bus deployment that provides guaranteed delivery of metrics data to subscribers. The data streaming pipeline 120 may provide a mechanism to easily match data stream publishers and subscribers, where the publishers are the metrics data producers and subscribers are the metrics data consumers.
[0022] The system 100 includes a data integrity creator 130, a blockchain interface 140 and a data validator 160. The data integrity creator 130 receives streaming metrics data in the form of packets 122 from the streaming data pipeline 120 and generates integrity ensuring data 145 that will be stored on a blockchain platform 150.
[0023] The blockchain platform 150 may be a private blockchain platform or a public blockchain platform. When the blockchain platform is a public blockchain platform, it may be desirable to encrypt some or all of the fields of the integrity ensuring data 145, such as the producer ID and/or metrics IDs, prior to storage on the blockchain to protect the privacy of the integrity ensuring data 145.
[0024] The blockchain interface 140 provides access to the blockchain platform 150 on which the integrity ensuring data are stored. A data validator 160 provides validated metrics data to a metrics data consumer 180 in response to a request from the metrics data consumer 180.
[0025] The blockchain platform 150 includes a ledger 155, which may be a decentralized blockchain ledger in the case of a public blockchain platform or a centralized blockchain ledger in the case of a private blockchain platform. The ledger 155 may be accessible through one or more smart contracts 153 that execute transactions to store and retrieve the integrity ensuring data from the ledger 155. The blockchain platform 150 may be an open platform, such as Ethereum, Polkadot, Binance, Solana or Eos, that supports the creation and execution of smart contracts, or may be a proprietary platform that supports the creation and execution of smart contracts or that is designed to permit the storage and retrieval of data such as the integrity ensuring data.
[0026] The data integrity creator 130 subscribes to the streaming metrics data via the streaming metrics pipeline 120 so that the metrics data streams will be delivered to it. Referring to Figure 2B, the data integrity creator 130 receives the streamed packets 122 and assembles one or more of the data packets 122 containing metrics data 105 generated by a data producer 115 and having timestamps falling within a predetermined time interval into a data block 135. The data integrity creator 130 calculates a hash value of the streaming metrics data in the data block 135 and generates integrity ensuring data 145 for the data block 135.
[0027] Because the metrics ID is included in the data packet, a single data block 135 may include data packets 122 for more than one metrics data stream.
[0028] A data block 135 having data block index K and including three sequential data packets 122 having packet indices N-1, N and N+1 is illustrated in Figure 2B.
[0029] As further shown in Figure 2B, the integrity ensuring data 145 generated based on the data block 135 includes the data block index K, a data packet start time and data packet total time associated with data block K, the producer ID, the metrics IDs of data packets included in the block, and the hash value generated by the data integrity creator 130. In some embodiments, the block timestamp may be the timestamp of the earliest data packet 122 included in the data block 135. In some embodiments, the block timestamp may include a start timestamp and an end timestamp covering the timestamps of the data packets 122 included in the data block 135. In some embodiments, the block timestamp may include a starting timestamp and a duration covering a time interval associated with the data packets 122 included in the data block 135.
[0030] An example of a data structure for storing integrity ensuring data 145 according to some embodiments is shown in Table 1 below.
Table 1 - Data structure for integrity ensuring data
dataBlockIndex - Index of the data block
producerId - Identifier of the metrics data producer
metricsIds - Identifiers of the metrics data streams included in the data block
dataPacketStartTime - Start of the time interval covered by the data block
dataPacketTotalTime - Total time covered by the data block
hashValue - Aggregate hash value calculated over the data packets in the data block
[0031] In Table 1, the dataPacketStartTime field is the beginning of the time interval for creating integrity ensuring data on the aggregated blocks. Some embodiments may use a fixed-size, time-based sliding window for aggregation; the start and end points of the window depend on packet timestamps rather than on the system clock of the data integrity creator 130. The window size is selected to be a value that divides a day/hour/minute evenly, which makes the aggregation task easier to perform. For example, window sizes of 10, 15, 30, and 60 minutes are good values, while a window size of 7 or 25 minutes is a less preferable choice. Because there are fixed pre-defined window slots for each day, the timestamp in a packet will fall into exactly one of the slots. The first hash value calculation for a slot is triggered when the first data packet 122 appears in the time slot, and the calculation ends when the timestamp in a packet exceeds the time slot. A window grace period can be introduced to conclude the aggregated hash calculation if no further packet arrives to trigger the ending.
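The mapping from a packet timestamp to its fixed window slot can be sketched as follows. This is an illustrative sketch under the assumptions above (the function name and the 15-minute default are chosen for illustration):

```python
def window_slot(timestamp_s: int, window_s: int = 900) -> tuple:
    """Map a packet timestamp to its fixed pre-defined window slot.

    window_s should divide an hour or day evenly (e.g., 600, 900,
    1800, or 3600 seconds) so that slot boundaries align with
    wall-clock minutes and hours, as the text recommends.
    Returns (slot_start, slot_end) in epoch seconds.
    """
    start = (timestamp_s // window_s) * window_s
    return (start, start + window_s)
```

For example, with 15-minute windows a packet timestamped at 10:07 falls into the slot [10:00, 10:15); every packet timestamp lands in exactly one slot regardless of when the data integrity creator processes it.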
[0032] The dataPacketTotalTime field indicates the total time covered by the data block. Because the TSDB 170 has a flat structure with timestamped metrics data, the system can only retrieve blocks of data by timestamps from the TSDB for given metrics.
[0033] Other fields may be included in the integrity ensuring data. For example, the integrity ensuring data may, in some embodiments, include a field that indicates the total number of data packets included in the block for which the integrity ensuring data was generated. The integrity ensuring data may also include a field that indicates the hash value of the first data packet included in the block for which the integrity ensuring data was generated.
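The integrity ensuring data structure of Table 1, including the optional fields just mentioned, might be modelled as below. Field names are illustrative renderings of the fields described in the text:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class IntegrityEnsuringData:
    data_block_index: int
    producer_id: str
    metrics_ids: List[str]        # streams present in the data block
    data_packet_start_time: int   # start of the aggregation window
    data_packet_total_time: int   # total time covered by the block
    hash_value: str               # aggregate hash over the block's packets
    # Optional fields included in some embodiments:
    total_packets: Optional[int] = None       # number of packets in the block
    first_packet_hash: Optional[str] = None   # hash of the first packet
```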
[0034] Referring again to Figure 2A, the data integrity creator 130 then stores the integrity ensuring data 145 on the blockchain platform 150 via the blockchain interface 140. The blockchain interface 140 provides access to the blockchain platform 150 using a smart contract 153 that stores the integrity ensuring data 145 for a data block 135 on the blockchain ledger 155. The blockchain interface 140 also provides a query interface for accessing the recorded integrity ensuring data 145.
[0035] In some embodiments, a smart contract 153 implements business logic for making a data "asset" transaction on the blockchain platform 150 by using methods and data structures defined in a software development kit (SDK) provided by the blockchain platform 150. As is known in the art, a smart contract is a software module that is stored on and can be executed by some types of blockchain platforms. The Ethereum blockchain is an example of a public blockchain platform that can execute smart contracts written in the Solidity programming language. Other examples of public blockchains that support smart contracts include Polkadot, Solana, and Eos. Methods belonging to the smart contract can be executed by invoking the methods via transactions on the blockchain. Once deployed, the smart contract can handle a data asset transaction to the blockchain platform. The data asset may be modelled according to a representational state transfer (REST) application programming interface (API) in the blockchain interface 140. A query function is also implemented for the data asset in the smart contract.
[0036] The blockchain interface 140 calls the smart contract 153 deployed to the blockchain platform 150 to perform transactions and make queries. When receiving a request via an API call from the data integrity creator 130, the blockchain interface 140 creates a data structure defined in the smart contract and invokes a new blockchain transaction for recording the integrity ensuring data 145. The API of the blockchain interface 140 has a query function which in turn calls the query function in the smart contract 153 for retrieving the integrity ensuring data 145.
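The role of the blockchain interface 140 as a thin wrapper over smart-contract calls might be sketched as follows. The contract client and its method names (`invoke`, `query`, `recordIntegrityData`, `queryIntegrityData`) are hypothetical placeholders; a real platform SDK would differ in detail:

```python
class BlockchainInterface:
    """Wrapper over a blockchain platform's smart-contract SDK (sketch).

    `contract` is assumed to expose invoke(method, args) for
    transactions and query(method, args) for reads; both names
    are illustrative, not from any specific SDK.
    """

    def __init__(self, contract):
        self.contract = contract

    def record(self, integrity_data: dict) -> str:
        # Invoke a transaction that stores the integrity ensuring data
        # on the ledger; returns a transaction identifier.
        return self.contract.invoke("recordIntegrityData", integrity_data)

    def query(self, producer_id: str, metrics_id: str,
              start_time: int, end_time: int) -> list:
        # Call the smart contract's query function to retrieve
        # previously recorded integrity ensuring data.
        return self.contract.query("queryIntegrityData", {
            "producerId": producer_id, "metricsId": metrics_id,
            "startTime": start_time, "endTime": end_time})
```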
[0037] The data validator 160 is accessible to the metrics data consumer 180 via an API, such as a REST API, to enable retrieval of the metrics data stored in the TSDB 170 with data validation. The data validator 160 uses the API provided by the blockchain interface 140 to read the recorded integrity ensuring data 145 from the blockchain platform 150.
[0038] To obtain validated metrics data from the TSDB 170, the metrics data consumer 180 may send a request to the data validator 160 for validated metrics data. The request identifies the data producer, the data stream, and a time interval for which metrics data is sought. Based on the information provided by the metrics data consumer 180, the data validator 160 determines which block or blocks 135 of metrics data correspond to the time period indicated in the request, and retrieves the corresponding data packets 122 from the TSDB 170. The data validator 160 then generates a hash value for each block of the retrieved data packets using the same hashing algorithm used by the data integrity creator 130 to generate the integrity ensuring data 145. The data validator 160 retrieves the integrity ensuring data 145 for the specified blocks 135 from the blockchain platform via the blockchain interface 140, and compares the hash values in the retrieved integrity ensuring data 145 to the re-calculated hash values for the data packets retrieved from the TSDB 170. If, for a given data block 135, the hash value calculated by the data validator 160 matches the hash value in the integrity ensuring data 145 for the data block 135 retrieved from the blockchain platform 150, the data packets 122 in the data block 135 are determined to be valid, and the valid data packets 122 are provided to the metrics data consumer 180. If, however, the hash value calculated by the data validator 160 does not match the hash value retrieved from the blockchain platform 150, the data block 135 is considered to be invalid, and an indication may be returned to the metrics data consumer 180 that the data packets 122 in the data block 135 are considered to be invalid.
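The per-block comparison performed by the data validator 160 can be sketched as below, assuming the SHA3-256 algorithm that Figure 4 uses for illustration; the function name is illustrative:

```python
import hashlib

def validate_block(packet_payloads, stored_hash: str) -> bool:
    """Re-hash the packets of one data block retrieved from the TSDB
    and compare the result with the hash value recorded on the
    blockchain for that block (SHA3-256 assumed, per Figure 4)."""
    h = hashlib.sha3_256()
    for payload in packet_payloads:   # payloads in original stream order
        h.update(payload)
    return h.hexdigest() == stored_hash
```

If the function returns True, the packets of the block are provided to the metrics data consumer 180; otherwise an invalid-data indication is returned instead.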
[0039] Figure 3A is a flow diagram illustrating the generation and storage of integrity ensuring data according to some embodiments. As shown therein, a metrics data producer 115 transmits data records 202 periodically or upon request to a metrics monitoring system 110. At block 204, the metrics monitoring system 110 assembles the records into discrete data packets 122 including a producer ID identifying the metrics data producer 115, a metrics ID identifying the particular data stream to which the records 202 belong, and a timestamp.
[0040] The metrics monitoring system 110 transmits (arrow 206A) the generated data packets 122 to a TSDB 170 and also streams (arrow 206B) the packets over a data streaming pipeline 120. A data integrity creator 130 may subscribe (arrow 205) to a particular data stream transmitted by the metrics monitoring system 110 over the data streaming pipeline 120 and receive the data packets (arrow 208) via the data streaming pipeline 120.
[0041] It will be appreciated that other approaches for providing the data packets to the TSDB 170 and the metrics monitoring system 110 are possible, and the concepts described herein are not limited to the particular implementation illustrated in Figures 2A and 3A. For example, in some embodiments, the metrics monitoring system 110 may transmit data packets 122 directly to the data integrity creator 130 instead of the data integrity creator 130 receiving the data packets over the data streaming pipeline 120. Likewise, in some embodiments, the TSDB 170 may obtain the data packets 122 using the data streaming pipeline 120 by subscribing to the stream.
[0042] The data integrity creator 130 assembles the data packets 122 into data blocks 135 (block 210) and generates integrity ensuring data 145 for each data block 135. As shown in Figure 2B, the integrity ensuring data 145 includes a data block index, a timestamp, a producer ID, the metrics IDs and a hash value calculated based on the data in the data packets 122 included in the data block 135.
[0043] The data integrity creator 130 then transmits (arrow 214) the integrity ensuring data 145 to the blockchain interface 140, which stores (arrow 216) the integrity ensuring data 145 on the blockchain platform 150.
[0044] Figure 3B illustrates operations of retrieving validated data by a metrics data consumer 180 according to some embodiments. As shown therein, the metrics data consumer 180 transmits a request (arrow 302) for validated metrics data to the data validator 160. As noted above, the request 302 identifies the data producer, the associated metrics in the data stream and the time period for which validated data is requested. Based on the information provided in the request, the data validator 160 determines which data packets 122 are covered by the request and which data blocks 135 contain the requested data packets 122 (block 303). The data validator 160 then sends a query (arrow 304) to the TSDB 170, which responds by transmitting the requested data packets 122 to the data validator 160 (arrow 306).
[0045] For each data block corresponding to the received data packets, the data validator 160 then generates a hash value from the received data packets (block 308). The data validator 160 then transmits a request (arrow 310) to the blockchain interface 140 requesting the blockchain interface 140 to retrieve the integrity ensuring data 145 associated with the block 135. The blockchain interface 140 then sends a request to the blockchain platform 150, for example by invoking a smart contract, to obtain the integrity ensuring data 145 (arrow 312). The blockchain platform 150 responds with a message including the integrity ensuring data 145 (arrow 314), and the blockchain interface 140 provides the integrity ensuring data 145 to the data validator 160. The data validator 160 validates the integrity of the block by comparing the hash value generated from the data packets retrieved from the TSDB 170 at block 308 with the hash value for the block contained in the integrity ensuring data 145 retrieved from the blockchain platform 150. If the hash values match, then the data validator 160 transmits the validated metrics data to the metrics data consumer 180 (arrow 320). If the hash values do not match, the data validator 160 may transmit a response to the metrics data consumer 180 indicating that the data could not be validated.
[0046] As noted above, data blocks 135 are created by the data integrity creator 130 by assembling data packets 122 having timestamps that fall within a predetermined time interval. The time interval may be configured so that the total number of transactions is less than the maximum transaction throughput on the blockchain network. For example, the time interval may be calculated such that:
R_tx = Σ_{i=1}^{N} (1 / T_i) ≤ R_tx,max
where R_tx is the transaction rate, N is the number of data block series, T_i is the sampling period for a data block series i, and R_tx,max is the maximum transaction rate which can be handled by the blockchain network.
[0047] For example, if the time interval associated with data packets 122 in a data block 135 is 1 minute (60 seconds) for all data blocks 135, and there is a need to create 180000 data block series in the TSDB 170, then a hypothetical maximum transaction throughput of 3000 transactions per second on the blockchain network would be reached: 180000 / 60 = 3000 transactions per second.
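The rate check above can be reproduced numerically; this sketch simply evaluates the formula from paragraph [0046]:

```python
def transaction_rate(sampling_periods_s):
    """R_tx = sum over all data block series i of 1 / T_i,
    where T_i is the sampling period of series i in seconds."""
    return sum(1.0 / t for t in sampling_periods_s)

# 180000 data block series, each producing one block per 60 seconds,
# consume the full hypothetical budget of 3000 transactions per second.
rate = transaction_rate([60.0] * 180000)
```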
[0048] As noted above, a normalization procedure can be used to convert each datapoint in a metrics data stream to a canonical form to provide a more robust solution for hash calculation during the integrity validation.
[0049] Conventional metrics TSDBs, such as the Prometheus and InfluxDB databases, support a text-based wire format. In Prometheus, the default text-based format is called the Exposition Format. In the Exposition Format, each time series is uniquely identified by its metric name and optional key-value pairs called labels. InfluxDB uses another text-based format, called the InfluxDB Line Protocol, for metrics.
[0050] An example syntax for the line protocol in InfluxDB is shown in Table 2 below.
Table 2 - Example InfluxDB Syntax
measurementName,tagKey=tagValue fieldKey="fieldValue" 1465839830100400200
[0051] As the textual representation can be vendor specific and the hash value for a data block is calculated on a byte array representing the stream data, a single byte change in the data will produce a different checksum value. To be able to reproduce the same checksum value, a normalization procedure for the metrics textual representation may be used in some embodiments.
[0052] In particular, a metrics datapoint normalization function may be used to normalize a metrics datapoint's tag/label set together with its value and timestamp to a standard, or normal, form so that the hash value is reproducible for the same data packets even if the textual representation format changes or the sequence of data points having the same timestamp changes.
[0053] By using a normalization according to some embodiments, metrics data can be verified independent of the wire protocol.
[0054] A TSDB may have a short retention time. For example, the Prometheus TSDB has a 15-day retention for the time-series data, which means that data may need to be replicated to long-term storage if access is needed after the TSDB retention period. As the long-term storage might be provided by a different vendor platform and may use different text-based formats for queries, normalizing the metrics data may provide a consistent text-based (byte compatible) format for the correct re-calculation of the hash value.
[0055] An example normalization implementation will now be described. The textual representation of the metrics datapoints contained in a metrics data packet has the following components:
Metrics (or measurement) name - string type only.
Label (or tag) set, which consists of a list of labels/tags with or without values; the set needs to be sorted - string type only.
Metrics (or measurement or field) value - can be one of the following types: floats, integers, strings, or Booleans.
Timestamp - the format depends on the platform and the programming language.
[0056] The following rules can be applied for the normalization of the above types:
String: Use the default or a custom configured character encoding, e.g., UTF-8.
Numbers (integers and floating-point numbers): The E notation of normalized scientific notation can be used, e.g., 1.2345E6, -9.876E-54, 0E0, 1E0, 123456789E0. For non-real numbers and infinity, use, e.g., 'NaN' for non-real numbers, and '+Inf', '-Inf' for infinite numbers.
Boolean: The numbers '1' and '0' can be used for the normalization of Boolean values.
Timestamp: The UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970 can be used, so that Unix times in seconds, milliseconds, microseconds and nanoseconds can all preserve their precision.
[0057] With the above normalization rules, a consistent text representation can be re-produced for the metric datapoint value entry. An example normalization procedure is shown in Table 3 below.
Table 3 - Example normalization procedure
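A sketch of such a normalization procedure, applying the rules of paragraph [0056], is given below. The function names and the exact canonical layout of the output line are illustrative assumptions, not taken from the document:

```python
import math
from decimal import Decimal

def normalize_value(v) -> str:
    """Normalize one field value per the rules above (sketch)."""
    if isinstance(v, bool):            # must precede the int check in Python
        return "1" if v else "0"
    if isinstance(v, int):
        return f"{v}E0"                # integers keep their digits, e.g. 123456789E0
    if isinstance(v, float):
        if math.isnan(v):
            return "NaN"
        if math.isinf(v):
            return "+Inf" if v > 0 else "-Inf"
        # Normalized scientific E notation, e.g. 1.2345E6
        d = Decimal(repr(v)).normalize()
        sign, digits, exp = d.as_tuple()
        m = "".join(map(str, digits))
        e = exp + len(digits) - 1
        mantissa = m[0] + ("." + m[1:] if len(m) > 1 else "")
        return ("-" if sign else "") + mantissa + "E" + str(e)
    return str(v)                      # strings pass through (UTF-8 assumed)

def normalize_datapoint(name: str, labels: dict, value, timestamp_ns: int) -> str:
    """Canonical form: name, sorted label set, value, nanosecond timestamp."""
    label_part = ",".join(f"{k}={labels[k]}" for k in sorted(labels))
    return f"{name}{{{label_part}}} {normalize_value(value)} {timestamp_ns}"
```

Because the label set is sorted and the value and timestamp formats are fixed, the same datapoint yields the same byte string regardless of the wire protocol that delivered it.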
[0058] Figure 4 illustrates incremental calculation of a hash value for a data block 135. As noted above, due to the volume of data that can be generated by a metrics data producer 115, it may be impractical to store all data packets 122 that are included in a data block 135 before the hash value is calculated for the data block 135. Accordingly, in some embodiments, the hash value may be calculated incrementally as the data packets 122 are received by the data integrity creator 130, and data packets 122 may be discarded as the incremental hash value is generated.
[0059] Referring to Figure 4, when hash values 402-1, 402-2 are generated individually for the data packets 122-1, 122-2, different hash values are created for each data packet. When the packets 122-1, 122-2 are combined into a data block 135, the hash value 404 for the combined string in the data block 135 is different from the individual hash values 402-1, 402-2. However, when a hash value 406 is generated via incremental string hashing of each of the data packets 122-1, 122-2 using the SHA3-256 hashing algorithm, the resulting hash value 406 is the same as the hash value 404 generated from the combined data.
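The equivalence shown in Figure 4 can be demonstrated directly with the standard SHA3-256 implementation; the packet payloads here are illustrative:

```python
import hashlib

# Two consecutive packet payloads (illustrative content)
p1 = b"cpu{core=0,host=a} 1E0 1465839830100400200\n"
p2 = b"cpu{core=1,host=a} 2E0 1465839830100400200\n"

# Hash of the combined data block, computed in one pass
combined = hashlib.sha3_256(p1 + p2).hexdigest()

# Incremental hashing: each packet can be discarded after update(),
# so the full data block never needs to be buffered
h = hashlib.sha3_256()
h.update(p1)
h.update(p2)
incremental = h.hexdigest()

assert combined == incremental  # identical, as Figure 4 illustrates
```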
[0060] Embodiments described herein may provide certain technical advantages. For example, using blockchain technology with its enhanced security and transparency, the metrics data integrity can be ensured without the need to encrypt the data using a dedicated secure data vault.
[0061] By incrementally calculating the hash value for each received metrics data packet, there may be less need to buffer all the metrics data for a data block before calculating the hash value, so the buffer size requirements can be reduced. This may also provide a more flexible approach to the calculation of hash values, as a data block of variable size can be covered by a single checksum.
[0062] Using a normalization procedure to convert the data representations of data packets to a canonical form may provide a more robust data integrity ensuring solution, as it is less dependent on the wire protocol or storage format used.
[0063] Aggregating multiple metrics streams from the same producer during a time interval into a searchable data block may make the integrity ensuring process more scalable, as the multiple data streams may be validated in a bundle instead of for each data stream record.
[0064] Some embodiments may have low or reduced impact on existing metrics data flows, as they provide a non-intrusive solution to the metrics data pipeline. The metrics data can be retrieved from the TSDB and consumed using existing APIs, while the data end-user/consumer can use the methods described herein to retrieve and validate the data stored in the TSDB. The integrity ensuring data for the metrics data streams is generated close to the source, so there is less chance for the data to be altered once published to the data pipeline by a producer.
[0065] Figure 5 is a functional block diagram that illustrates a data validation system 100 according to some embodiments for ensuring the integrity of streaming metrics data. As shown therein, the data validation system 100 includes a data integrity creation subsystem 112 that implements functionality of the data integrity creator 130 described above, a blockchain platform interface 116 that implements functionality of the blockchain interface 140 described above, and a data validation subsystem 114 that implements functionality of the data validator 160 described above.
[0066] Figure 6A is a block diagram of a data integrity creator 130 according to some embodiments. The data integrity creator 130 includes a processing circuit 134, a memory 136 coupled to the processing circuit 134, and a communication interface 118 coupled to the processing circuit 134. The processing circuit 134 may be a single processor or may include multiple processors, and may be a distributed or cloud-based processor in some embodiments. Figure 6B illustrates functional modules that may be stored in the memory 136 of data integrity creator 130. The functional modules may include a streaming data subscription module 122 that subscribes to streaming metrics data via the streaming data pipeline 120 illustrated in Figure 2A, and an integrity data generator module 124 that generates integrity ensuring data 145 to be stored on a blockchain platform 150 as described above.
[0067] Figure 7A is a block diagram of a data validator 160 according to some embodiments. The data validator 160 includes a processing circuit 234, a memory 236 coupled to the processing circuit 234, and a communication interface 218 coupled to the processing circuit 234. The processing circuit 234 may be a single processor or may include multiple processors, and may be a distributed or cloud-based processor in some embodiments. Figure 7B illustrates functional modules that may be stored in the memory 236 of data validator 160. The functional modules may include a data retrieval module 222 that retrieves metrics data from a TSDB 170 illustrated in Figure 2A, and an integrity data generator module 224 that generates a hash value for comparing with integrity ensuring data 145 stored on the blockchain platform 150 as described above.
[0068] Figure 8 illustrates operations of a data integrity creator 130 according to some embodiments. Referring to Figure 8, a method of operating a data integrity creator includes receiving (block 802) streaming metrics data of a communication system originating from a metrics data producer, generating (block 804) integrity ensuring data based on the streaming metrics data, and storing (block 806) the integrity ensuring data on a blockchain platform.
[0069] Figure 9 illustrates operations of a data validator 160 according to some embodiments. Referring to Figure 9, a method of operating a data validator includes obtaining (block 902) a data request from a metrics data consumer for metrics data of a communication system, and retrieving (block 904) the metrics data from a TSDB. The method further includes generating (block 906) a first hash value of the metrics data, obtaining (block 908) a second hash value of the metrics data from a blockchain platform, and comparing (block 910) the first hash value and the second hash value to determine if the first hash value and the second hash value match. In response to determining that the first hash value and the second hash value match, the data validator transmits (block 912) the metrics data to the metrics data consumer. If the first hash value and the second hash value do not match, an error message may be returned to the data consumer (block 914).
[0070] In the above-description of various embodiments of present inventive concepts, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of present inventive concepts. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which present inventive concepts belong. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art.
[0071] When an element is referred to as being "connected", "coupled", "responsive", or variants thereof to another element, it can be directly connected, coupled, or responsive to the other element or intervening elements may be present. In contrast, when an element is referred to as being "directly connected", "directly coupled", "directly responsive", or variants thereof to another element, there are no intervening elements present. Furthermore, "coupled", "connected", "responsive", or variants thereof as used herein may include wirelessly coupled, connected, or responsive. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Well-known functions or constructions may not be described in detail for brevity and/or clarity. The term "and/or" includes any and all combinations of one or more of the associated listed items.
[0072] It will be understood that although the terms first, second, third, etc. may be used herein to describe various elements/operations, these elements/operations should not be limited by these terms. These terms are only used to distinguish one element/operation from another element/operation. Thus, a first element/operation in some embodiments could be termed a second element/operation in other embodiments without departing from the teachings of present inventive concepts. The same reference numerals or the same reference designators denote the same or similar elements throughout the specification.
[0073] As used herein, the terms "comprise", "comprising", "comprises", "include", "including", "includes", "have", "has", "having", or variants thereof are open-ended, and include one or more stated features, integers, elements, steps, components, or functions but does not preclude the presence or addition of one or more other features, integers, elements, steps, components, functions, or groups thereof.
[0074] Example embodiments are described herein with reference to block diagrams and/or flowchart illustrations of computer-implemented methods, apparatus (systems and/or devices) and/or computer program products. It is understood that a block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions that are performed by one or more computer circuits. These computer program instructions may be provided to a processor circuit of a general purpose computer circuit, special purpose computer circuit, and/or other programmable data processing circuit to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, transform and control transistors, values stored in memory locations, and other hardware components within such circuitry to implement the functions/acts specified in the block diagrams and/or flowchart block or blocks, and thereby create means (functionality) and/or structure for implementing the functions/acts specified in the block diagrams and/or flowchart block(s).
[0075] These computer program instructions may also be stored in a tangible computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the functions/acts specified in the block diagrams and/or flowchart block or blocks. Accordingly, embodiments of present inventive concepts may be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.) that runs on a processor such as a digital signal processor, which may collectively be referred to as "circuitry," "a module" or variants thereof.
[0076] It should also be noted that in some alternate implementations, the functions/acts noted in the blocks may occur out of the order noted in the flowcharts. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Moreover, the functionality of a given block of the flowcharts and/or block diagrams may be separated into multiple blocks and/or the functionality of two or more blocks of the flowcharts and/or block diagrams may be at least partially integrated. Finally, other blocks may be added/inserted between the blocks that are illustrated, and/or blocks/operations may be omitted without departing from the scope of inventive concepts. Moreover, although some of the diagrams include arrows on communication paths to show a primary direction of communication, it is to be understood that communication may occur in the opposite direction to the depicted arrows.
[0077] Many variations and modifications can be made to the embodiments without substantially departing from the principles of the present inventive concepts. All such variations and modifications are intended to be included herein within the scope of present inventive concepts. Accordingly, the above disclosed subject matter is to be considered illustrative, and not restrictive, and the examples of embodiments are intended to cover all such modifications, enhancements, and other embodiments, which fall within the spirit and scope of present inventive concepts. Thus, to the maximum extent allowed by law, the scope of present inventive concepts is to be determined by the broadest permissible interpretation of the present disclosure including the examples of embodiments and their equivalents, and shall not be restricted or limited by the foregoing detailed description.

Claims

CLAIMS:
1. A method of operating a data integrity creator, comprising:
receiving (802) streaming metrics data of a communication system originating from a metrics data producer;
generating (804) integrity ensuring data based on the streaming metrics data; and
storing (806) the integrity ensuring data on a blockchain platform.
2. The method of Claim 1, wherein the integrity ensuring data comprises a producer identifier that identifies a producer of the metrics data, a metrics identifier that identifies the metrics data, an aggregate hash value associated with the metrics data, and a timestamp associated with the metrics data.
3. The method of Claim 2, wherein the integrity ensuring data further comprises a duration of a time interval associated with the metrics data.
4. The method of Claim 2, wherein the integrity ensuring data further comprises an ending timestamp associated with the metrics data.
5. The method of Claim 2, wherein the integrity ensuring data further comprises a total number of packets in a time interval associated with the metrics data.
6. The method of Claim 2, wherein the integrity ensuring data further comprises a first hash value for a first packet of the metrics data.
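Claims 2 through 6 describe the fields of an integrity ensuring data record. A minimal sketch of such a record is below; the field names and types are illustrative assumptions, since the claims name the fields but do not prescribe a data format:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IntegrityRecord:
    # Core fields per Claim 2.
    producer_id: str            # identifies the metrics data producer
    metrics_id: str             # identifies the metrics data stream
    aggregate_hash: str         # aggregate hash value, as a hex digest
    timestamp: float            # timestamp associated with the metrics data
    # Optional fields per Claims 3-6 (names are hypothetical).
    interval_duration: Optional[float] = None   # Claim 3
    end_timestamp: Optional[float] = None       # Claim 4
    packet_count: Optional[int] = None          # Claim 5
    first_packet_hash: Optional[str] = None     # Claim 6

# Example record for a one-minute interval containing 120 packets.
record = IntegrityRecord("producer-1", "cpu.load", "ab12cd34", 1_650_000_000.0,
                         interval_duration=60.0, packet_count=120)
```

The optional fields let a single record type cover each of the dependent claims without separate schemas.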
7. The method of Claim 1, wherein receiving the streaming metrics data comprises receiving a plurality of data packets containing the streaming metrics data during a time interval, and wherein the integrity ensuring data is generated for the streaming metrics data in a data block comprising the data packets created during the time interval.
8. The method of Claim 7, wherein the integrity ensuring data comprises a data block index of the data block, a timestamp associated with the data block, a producer identifier of data included in the data packets comprising the data block, and a metric identifier of a metric associated with the data packets.
9. The method of Claim 7, wherein an aggregate hash value associated with the metrics data is generated incrementally for data packets created during the time interval.
10. The method of Claim 9, wherein the aggregate hash value is generated as an incremental hash of the streaming metrics data in the data packets created during the time interval.
11. The method of Claim 10, wherein the incremental hash of the streaming metrics data is generated using a 256-bit secure hashing algorithm, SHA3-256.
12. The method of Claim 10, wherein the aggregate hash value is generated by calculating a hash of a first data packet of the streaming metrics data and repeatedly updating the hash based on subsequent data packets of the streaming metrics data.
13. The method of Claim 12, further comprising discarding each data packet of streaming metrics data after updating the hash value based on the packet of streaming metrics data.
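Claims 10 through 13 describe computing the aggregate hash as an incremental SHA3-256 over the interval's packets, discarding each packet once it has been folded in. A minimal sketch of that computation is below; the function name and packet representation (raw bytes) are assumptions, not taken from the patent:

```python
import hashlib

def aggregate_hash(packets):
    """Incrementally hash a stream of data packets with SHA3-256.

    Hashes the first packet and then repeatedly updates the running
    digest with each subsequent packet (Claim 12); a packet can be
    discarded immediately after the update (Claim 13), so only the
    fixed-size hash state is held in memory.
    """
    h = hashlib.sha3_256()
    for packet in packets:      # each packet is bytes
        h.update(packet)        # fold this packet into the running hash
        # packet may be discarded here; only the digest state is kept
    return h.hexdigest()

digest = aggregate_hash([b"m1=1", b"m2=2", b"m3=3"])
```

Because SHA3-256's incremental update is equivalent to hashing the concatenation of the inputs, a validator that replays the same packets in the same order recovers the same digest.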
14. The method of Claim 7, wherein receiving the streaming metrics data, generating the integrity ensuring data for the metrics data and storing the integrity ensuring data on the blockchain platform are performed repeatedly for successive time intervals.
15. The method of Claim 1, wherein storing the integrity ensuring data on the blockchain platform comprises transmitting the integrity ensuring data to a blockchain interface that provides an application programming interface for interacting with the blockchain platform.
16. The method of Claim 15, wherein the blockchain platform comprises a private blockchain platform.
17. The method of Claim 15, wherein the blockchain platform comprises a public blockchain platform.
18. The method of Claim 15, wherein the blockchain platform supports the execution of smart contracts.
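Claim 15 introduces a blockchain interface that exposes an application programming interface for storing the integrity ensuring data. The patent does not define the interface's endpoints or payloads, so the sketch below is purely illustrative: it posts a record as JSON to a hypothetical HTTP endpoint using only the standard library:

```python
import json
import urllib.request

def store_integrity_record(record: dict, api_url: str) -> int:
    """POST an integrity record to a blockchain-interface API.

    Illustrative only: the URL, payload shape, and use of HTTP/JSON
    are assumptions; the claims leave the API unspecified.
    """
    req = urllib.request.Request(
        api_url,
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Return the HTTP status so the caller can detect a failed store.
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Routing writes through such an interface keeps the data integrity creator independent of whether the platform behind it is private or public (Claims 16 and 17).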
19. The method of Claim 1, further comprising subscribing to a metrics data stream for receiving the streaming metrics data.
20. The method of Claim 19, wherein the metrics data stream is generated by a metrics monitoring system that time-stamps packets of the streaming metrics data generated by the metrics data producer and stores the time-stamped packets in a time series database.
21. A data integrity creator (130), comprising:
a processing circuit (134); and
a memory (136) that stores computer program instructions that, when executed by the processing circuit, cause the processing circuit to perform operations comprising:
receiving (802) streaming metrics data of a communication system originating from a metrics data producer;
generating (804) integrity ensuring data based on the streaming metrics data; and
storing (806) the integrity ensuring data on a blockchain platform.
22. The data integrity creator of Claim 21, further configured to perform operations according to any of Claims 2 to 20.
23. A data integrity creator (130), comprising:
a streaming data subscription module (122) configured to receive (802) streaming metrics data of a communication system originating from a metrics data producer; and
an integrity data generator module (124) configured to generate (804) integrity ensuring data based on the streaming metrics data and store (806) the integrity ensuring data on a blockchain platform.
24. A method of operating a data validator, comprising:
obtaining (902) a data request from a metrics data consumer for metrics data of a communication system;
retrieving (904) the metrics data from a time series database, TSDB;
generating (906) a first hash value of the metrics data;
obtaining (908) a second hash value of the metrics data from a blockchain platform;
comparing (910) the first hash value and the second hash value to determine if the first hash value and the second hash value match; and
in response to determining that the first hash value and the second hash value match, transmitting (912) the metrics data to the metrics data consumer.
25. The method of Claim 24, wherein the request for validated metrics data comprises a time range for which validated metrics data is requested, the method further comprising: identifying one or more time intervals of metrics data that encompass the time range for which validated metrics data is requested.
26. The method of Claim 25, wherein obtaining the metrics data from the TSDB comprises obtaining the metrics data for the identified one or more time intervals.
27. The method of Claim 26, wherein generating the first hash value comprises generating the first hash value based on metrics data for a first time interval of the identified one or more time intervals, and wherein obtaining the second hash value comprises obtaining the second hash value for the first time interval of the identified one or more time intervals from the blockchain platform.
28. The method of Claim 27, further comprising generating first hash values for each of the identified one or more time intervals and obtaining second hash values for each of the identified one or more time intervals from the blockchain platform.
29. The method of Claim 27, wherein the first hash value is generated incrementally for metrics data in the first time interval.
30. The method of Claim 29, wherein the first hash value is generated as an incremental hash of the metrics data in the first time interval.
31. The method of Claim 30, wherein the incremental hash of the metrics data in the first time interval is generated using a 256-bit secure hashing algorithm, SHA3-256.
32. The method of Claim 30, wherein the first hash value is generated by calculating a hash of a first packet of the metrics data of the first time interval and repeatedly updating the hash based on subsequent packets of the metrics data of the first time interval.
33. The method of Claim 24, wherein the second hash value is obtained from the blockchain platform based on an identity of a metrics data producer of the metrics data, an identity of a metrics data stream of the metrics data, and a time span associated with the metrics data.
34. The method of Claim 24, wherein the first hash value is generated by iterating through data packets of metrics data received from the TSDB and incrementally updating the hash value for each received packet.
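Claims 24 through 34 describe the validator flow: recompute a first hash by iterating over the packets retrieved from the TSDB, fetch the second hash from the blockchain platform, and serve the data only if the two match. A minimal sketch of that flow is below; the function name, error handling, and constant-time comparison are illustrative choices, not requirements of the claims:

```python
import hashlib
import hmac

def validate_and_serve(tsdb_packets, blockchain_hash_hex):
    """Recompute an interval's hash from TSDB data and compare it with
    the hash stored on the blockchain platform (Claim 24 flow).

    tsdb_packets: packets (bytes) retrieved from the TSDB, in order.
    blockchain_hash_hex: hex digest obtained from the blockchain.
    """
    # First hash value: incrementally updated per packet (Claim 34).
    h = hashlib.sha3_256()
    for packet in tsdb_packets:
        h.update(packet)
    first_hash = h.hexdigest()
    # Compare the recomputed and stored hashes in constant time.
    if hmac.compare_digest(first_hash, blockchain_hash_hex):
        return tsdb_packets     # integrity verified: serve the data
    raise ValueError("metrics data failed integrity validation")
```

A mismatch indicates the TSDB copy was altered after the integrity ensuring data was anchored, so the data is withheld rather than transmitted to the consumer.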
35. A data validator (160), comprising:
a processing circuit (234); and
a memory (236) that stores computer program instructions that, when executed by the processing circuit, cause the processing circuit to perform operations comprising:
obtaining (902) a data request from a metrics data consumer for metrics data of a communication system;
retrieving (904) the metrics data from a time series database, TSDB;
generating (906) a first hash value of the metrics data;
obtaining (908) a second hash value of the metrics data from a blockchain platform;
comparing (910) the first hash value and the second hash value to determine if the first hash value and the second hash value match; and
in response to determining that the first hash value and the second hash value match, transmitting (912) the metrics data to the metrics data consumer.
36. The data validator of Claim 35, further configured to perform operations according to any of Claims 25 to 34.
37. A data validator (160), comprising:
a data retrieval module (222) configured to obtain a data request from a metrics data consumer for metrics data of a communication system and retrieve the metrics data from a time series database;
an integrity generator module (224) configured to generate a first hash value of the metrics data, obtain a second hash value of the metrics data from a blockchain platform, and compare the first hash value and the second hash value to determine if the first hash value and the second hash value match; and
a communication interface (218) configured, in response to determining that the first hash value and the second hash value match, to transmit the metrics data to the metrics data consumer.
38. A system for ensuring integrity of streaming metrics data, comprising:
a data integrity creator (130) that receives streaming metrics data of a communication system originating from a metrics data producer, generates integrity ensuring data based on the streaming metrics data, and stores the integrity ensuring data on a blockchain platform; and
a data validator (160) that receives a data request for metrics data from a metrics data consumer, retrieves the metrics data from a time series database, generates a first hash value for the metrics data, retrieves a second hash value for the metrics data from the blockchain platform, compares the first hash value and the second hash value to determine if the first hash value and the second hash value match, and in response to determining that the first hash value and the second hash value match, transmits the metrics data to the metrics data consumer.
39. The system of Claim 38, further comprising:
a blockchain interface (150) that enables interaction with the blockchain platform, wherein the data integrity creator stores the integrity ensuring data by transmitting the integrity ensuring data to the blockchain platform using the blockchain interface for storage on the blockchain platform, and the data validator retrieves the second hash value from the blockchain platform by requesting the second hash value from the blockchain platform using the blockchain interface.
40. The system of Claim 38, further comprising a metrics monitoring system (110) that time-stamps packets of metrics data generated by a metrics data producer and stores the time-stamped packets in the time series database.
PCT/IB2022/050560 2022-01-21 2022-01-21 Streaming metrics data integrity in a communication system WO2023139412A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IB2022/050560 WO2023139412A1 (en) 2022-01-21 2022-01-21 Streaming metrics data integrity in a communication system

Publications (1)

Publication Number Publication Date
WO2023139412A1 true WO2023139412A1 (en) 2023-07-27

Family

ID=80119325

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2022/050560 WO2023139412A1 (en) 2022-01-21 2022-01-21 Streaming metrics data integrity in a communication system

Country Status (1)

Country Link
WO (1) WO2023139412A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200021444A1 (en) * 2018-07-13 2020-01-16 Waters Technologies Ireland Limited Techniques for Managing Analytical Information Using Distributed Ledger Technology
US20210240858A1 (en) * 2018-05-09 2021-08-05 Centrica Plc System for protecting integrity of transaction data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU ZIRU ET AL: "A Storage Architecture of Blockchain for Time-Series Data", 2019 2ND INTERNATIONAL CONFERENCE ON HOT INFORMATION-CENTRIC NETWORKING (HOTICN), IEEE, 13 December 2019 (2019-12-13), pages 90 - 91, XP033755192, [retrieved on 20200409], DOI: 10.1109/HOTICN48464.2019.9063220 *

Similar Documents

Publication Publication Date Title
US10831902B2 (en) Data verification methods and systems using a hash tree, such as a time-centric Merkle hash tree
US11663090B2 (en) Method and system for desynchronization recovery for permissioned blockchains using bloom filters
RU2724136C1 (en) Data processing method and device
US20220141018A1 (en) Method and system for an efficient consensus mechanism for permissioned blockchains using audit guarantees
CN104699718B (en) Method and apparatus for being rapidly introduced into business datum
US20200050782A1 (en) Method and apparatus for operating database
US8260742B2 (en) Data synchronization and consistency across distributed repositories
TW201832098A (en) Transaction verification in a consensus network
WO2019097322A1 (en) Optimization of high volume transaction performance on a blockchain
WO2017219858A1 (en) Streaming data distributed processing method and device
CN114647698A (en) Data synchronization method and device and computer storage medium
CN107276912B (en) Memory, message processing method and distributed storage system
CN114490741A (en) Time sorting method and device based on trusted block chain, electronic equipment and medium
WO2023139412A1 (en) Streaming metrics data integrity in a communication system
US11675777B2 (en) Method, apparatus, and computer readable medium for generating an audit trail of an electronic data record
CN112785302B (en) Message statistics method and device, electronic equipment and readable storage medium
CN115577985B (en) Multi-equipment consensus method for power block chain
CN109828908A (en) Interface testing parameter encryption method, device, electronic equipment and storage medium
CN110933155B (en) Novel block chain network
CN110347748B (en) Data verification method, system, device and equipment based on inverted index
CN117255130A (en) Data processing method, device, equipment and medium based on block chain
CN117171272A (en) Data synchronization method and device
CN114691634A (en) Audit log system applied to cloud resource integrity operation
CN117370289A (en) Data acquisition method and device and electronic equipment
CN116028236A (en) Message queue construction method and device and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22702032

Country of ref document: EP

Kind code of ref document: A1