WO2023214909A1 - Feature selective and generative digital twin emulator machine for device verification and anomaly checking - Google Patents


Info

Publication number
WO2023214909A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
emulator
devices
model
time series
Application number
PCT/SE2023/050328
Other languages
French (fr)
Inventor
Bin Xiao
Toni MASTELIC
Darko Huljenic
Peter VON WRYCZA
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Publication of WO2023214909A1 publication Critical patent/WO2023214909A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • the present disclosure relates to machine learning, and in particular, to one or more of feature selection, generative training and verification/anomaly detection.
  • IoT may be associated with a predetermined amount (e.g., a massive amount) of physical devices and data, which provide input for digital twin models.
  • Data collected by IoT sensors may be used to create digital models for simulating equipment, a system, and even the surroundings.
  • Such digital twin (DT) models can show punctual and individual problems, as well as utilize the data to predict one or more probabilities of issues that may happen in the near future.
  • DT models may be generated automatically and may take into consideration one or more possibilities in different conditions and for different purposes.
  • Some objectives of IoT-related industrial applications are to emulate the surroundings and/or the system itself for verification of the physical assets and to check for anomalies, e.g., before any deployment changes affect existing physical assets in an IoT platform.
  • Digital Twin may refer to a process of mapping physical entities to a digital representation by involving IoT, e.g., to enable continuous communication between physical and digital twins.
  • one part of DT is data that is collected from the physical entities.
  • the data provides the ability to describe the physical entity as realistically as required, e.g., required by a use case.
  • the physical entity may be described using the DT model built as its digital representation. This is where the communication between physical and digital twins comes into focus.
  • IoT may be used to enrich and automate data collection and improve decision-making for the operation of physical entities.
  • the identified elements of physical entities can be addressable and stored to enable immediate usage or historical usage of received information. Historical data may be used in some future emulation.
  • the models e.g., emulator and models for other functionalities
  • the models are reusable to serve various industrial applications and adaptable for various use cases
  • the model management is conducted by DT operations, including the processes, running loops, and components;
  • an IoT-based device platform may be configured to generate data by capturing environment status via devices together with status data of hardware of the platform.
  • at least part of the data collected from the environment may be redundant, such as some double-measured parameters and parameters not usable for predetermined objectives.
  • the data may not be relevant to the goal of an emulation, e.g., the noise level may not relate to product quality.
  • irrelevant or redundant data may increase computation costs and/or even decrease the performance quality.
  • some systems for generating digital twin models from IoT for emulation may include one or more common features in the following aspects: (1) different digital twin modeling (i.e., more than one DT model) applies based on different domain knowledge to solve a diversity of needs, e.g., only relating to a specific asset (such as the production line of certain products) which may be exposed to multiple condition changes.
  • US 20210141870 A1 discloses the creation of a digital twin from a mechanical model.
  • An industrial CAD system is supplemented with features that allow a developer to easily convert a mechanical CAD model of an automation system to a dynamic digital twin capable of simulation within a simulation platform.
  • the features allow the user to label selected elements of a mechanical CAD drawing with “aspects” within the CAD environment. These aspect markups label the selected mechanical elements as being specific types of industrial assets or control elements.
  • the CAD platform associates mechatronic metadata with the selected elements based on the type of aspect with which each element is labeled. This mechatronic metadata defines the behavior (e.g., movements, speeds, forces, etc.) of the selected element within the context of a virtual simulation, transforming the mechanical CAD model into a dynamic digital twin that can be exported to a simulation and testing platform.
  • An object of the invention is to enable improved device verification and anomaly detection.
  • Some embodiments advantageously provide methods, systems, and apparatuses for one or more of feature selection, generative training and verification/anomaly detection.
  • DT models are based on knowledge representation, which can support reasoning (e.g., to answer “what-if”).
  • Models for DT may be reusable and flexible so that the operation and creation of the framework can have minimal dependency on human expertise.
  • Device verification and anomaly checking may be one use case that is served by running the models providing certain functionalities, namely an emulator.
  • Device verification and anomaly checking before the physical assets are hard settled plays a role in domains such as automotive and smart manufacturing, as well as in communication networks (e.g., radio access networks). That is, once the changes on hardware deployment are settled, it can be costly to adjust existing deployment(s). For example, in typical systems, changes to hardware may require frequently taking the system offline for error correction, which may result in a negative customer experience. In one or more embodiments, the deployment may be adjusted while the physical assets are online/settled (i.e., installed and/or in production and/or in use).
  • a powerful emulator may help managing vehicles by enabling verification and anomaly checking, e.g., so that security risks can be avoided before real-life driving tests.
  • data gathered by industrial IoT equipment may be consumed by an emulator for predictive maintenance of the production line, e.g., so that anomaly checking and/or verification can happen at an early stage. The data gathered as such may be used to lower downtime and increase production quality after deployment of physical assets.
  • RAN use case: In Radio Access Networks (RAN), hybrid historical data and live data collected from physical assets/devices may be used to emulate specific key parameter indicators (KPIs) that reflect the behavior and performance of a network.
  • using an emulator (i.e., an emulator device), device verification and anomaly detection may be used to identify potential problems under different conditions which can affect latency, throughput, and/or cause packet loss.
  • Smart manufacturing use case: detecting anomalies, e.g., in manufacturing lines, at an early phase and having risk control on hardware changes are very important.
  • Learning patterns and emulating the patterns for early anomaly detection may be used to minimize losses caused by hardware (e.g., associated with mistaken configurations of hardware).
  • learning and discovering potential risks of hardware changes may be used to support operators to understand possible consequences of each change in detail and/or to avoid decisions on hardware change that may produce undesirable results.
  • feature selections may be performed, e.g., based on data collected by devices before loading the feature selections onto generative trainings.
  • the loaded feature selections may avoid using irrelevant data and redundant data for the generative training.
  • an emulator device, e.g., a feature-selective and generative DT emulator for device verification and anomaly detection based on IoT, is described.
  • the emulator (i.e., the emulator functions provided by the emulator device) is built/configured upon an IoT-based Digital Twin setup using a Feature-selective Variational Generator for High-dimensional Time Series (FVGHT).
  • FVGHT may use features from concrete autoencoders (CAE), variational autoencoders (VAE), and/or gated recurrent units (GRU).
  • the emulator may be continuously trained on the collected high-dimensional device data with localized concrete (CONtinuous relaxation of disCRETE random variables) feature selection on the edge.
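The concrete relaxation referenced above can be illustrated with a minimal sketch (hypothetical names and values, not the patent's implementation): a Concrete/Gumbel-softmax sample turns a learned vector of per-feature selection logits into a differentiable, nearly one-hot selection weight, with the temperature controlling how discrete the selection is.

```python
import numpy as np

def concrete_select(logits, temperature, rng):
    """Draw a relaxed (differentiable) one-hot selection over input features
    from the Concrete / Gumbel-softmax distribution."""
    # Gumbel(0, 1) noise turns the soft argmax into a proper random sample
    gumbel = -np.log(-np.log(rng.uniform(1e-9, 1.0, size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = y - y.max()            # subtract the max for numerical stability
    w = np.exp(y)
    return w / w.sum()         # soft one-hot; approaches a hard argmax as temperature -> 0

rng = np.random.default_rng(0)
logits = np.array([2.0, 0.1, -1.0, 0.5])                     # hypothetical learned selection logits
w_soft = concrete_select(logits, temperature=5.0, rng=rng)   # diffuse weights, early in training
w_hard = concrete_select(logits, temperature=0.05, rng=rng)  # near one-hot, late in training
```

Annealing the temperature toward zero over training is what lets a gradient-based trainer end up with effectively discrete feature choices while remaining differentiable throughout.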
  • the emulator may be configured to conduct (i.e., determine and/or perform) a global process to continuously approach the probabilistic distribution that the input data obeys.
  • the mechanism/process performed by the emulator includes performing localized feature selection (e.g., on edge devices) and/or finding (i.e., determining) a model (i.e., an optimized model) which may have a minimum Kullback-Leibler divergence with an input sample (e.g., received continuously).
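As an illustration of the Kullback-Leibler criterion mentioned above (a sketch under the common VAE assumption of a diagonal-Gaussian encoder and a standard-normal prior; the function name is hypothetical and not taken from the patent), the closed-form divergence such training drives toward its minimum is:

```python
import numpy as np

def kl_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ) -- the regularizer
    a VAE-style trainer minimizes so the latent code stays close to the prior."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# An encoder output that matches the prior has zero divergence...
print(kl_standard_normal(np.zeros(3), np.zeros(3)))                # -> 0.0
# ...and any deviation in mean or variance increases it.
print(kl_standard_normal(np.array([1.0, 0.0, 0.0]), np.zeros(3)))  # -> 0.5
```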
  • the emulator (i.e., the emulator device) may be configured to continuously perform data sampling from the trained model. Further, the emulator may be configured to determine/generate one or more data sets obeying (i.e., associated with, corresponding to) the same distribution patterns (e.g., of input samples). The data sets may be based on contributive features so as to emulate input samples with no or limited domain expertise and/or no human intervention.
  • training units within the emulator also capture the temporal relations between data using Gated Recurrent Units (one difference from other generative methods).
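A minimal, illustrative GRU step (all weights and names hypothetical) shows how such units carry temporal relations: gates computed from the current input and the previous hidden state decide, at every time step, how much of the accumulated history to keep.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, W):
    """One GRU update: the update gate z blends the previous hidden state with
    a candidate state, which is how temporal relations are carried forward."""
    hx = np.concatenate([h, x])
    z = sigmoid(W["z"] @ hx)                                 # update gate
    r = sigmoid(W["r"] @ hx)                                 # reset gate
    h_tilde = np.tanh(W["h"] @ np.concatenate([r * h, x]))   # candidate state
    return (1.0 - z) * h + z * h_tilde

rng = np.random.default_rng(1)
hidden, features = 4, 3
W = {k: rng.normal(scale=0.1, size=(hidden, hidden + features)) for k in ("z", "r", "h")}

h = np.zeros(hidden)
for x in rng.normal(size=(10, features)):   # a short multivariate time series
    h = gru_step(h, x, W)                   # state summarizes the history so far
```

Because the new state is a gated convex combination of the old state and a tanh-bounded candidate, the hidden state stays bounded while still accumulating information across the sequence.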
  • One or more embodiments described herein provide an emulator configured to receive/collect data (e.g., continuously take data) from IoT devices with feature selection and without disturbing the running physical sets.
  • the emulator may be configured to generate another set of data that is different from the original set of data but obeys the same distribution patterns.
  • the generation of the sets of data may be based on contributive features.
  • Such a feature-selective and generative method may be driven by data that does not require domain knowledge; therefore, it is highly replicable under different conditions (physical environments, heterogeneous devices, etc.).
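The generation step described above can be sketched as sampling latent codes from the prior and passing them through the trained decoder. Here a stand-in linear decoder with random weights (purely hypothetical) marks where the trained generative network would go; sampled outputs then follow whatever distribution the decoder has learned.

```python
import numpy as np

def decode(z, W_dec, b_dec):
    """Stand-in linear decoder; in practice this is the trained generative network."""
    return z @ W_dec + b_dec

rng = np.random.default_rng(2)
latent_dim, data_dim = 2, 5
W_dec = rng.normal(size=(latent_dim, data_dim))   # hypothetical trained weights
b_dec = rng.normal(size=data_dim)

z = rng.standard_normal((100, latent_dim))        # sample the latent prior N(0, I)
emulated = decode(z, W_dec, b_dec)                # new data following the learned mapping
```

Each draw of `z` yields a data set that differs from any observed sample yet is shaped by the same learned mapping, which is the sense in which the emulated data "obeys the same distribution patterns."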
  • an emulator device configured to communicate with a plurality of devices.
  • the emulator device comprises processing circuitry configured to determine high-dimensional time series data based on data associated with the plurality of devices; apply a model to the high-dimensional time series data based on a determined feature selection, the applied model inheriting at least one characteristic of a variational autoencoder (VAE); train the model, using online batch-based machine learning, based at least in part on the high-dimensional time series data; and perform at least one action using the trained model.
  • the processing circuitry is further configured to determine the feature selection of the data associated with the plurality of devices, where the determined feature selection indicates at least one relation between at least two dimensions of the data.
  • the processing circuitry is further configured to determine a plurality of sliding windows, based at least on one parameter, to train the model using online-batch learning, where each sliding window of the plurality of sliding windows has a size.
  • the processing circuitry is further configured to accumulate a plurality of data frames per each sliding window of the plurality of sliding windows. The plurality of data frames is associated with the data associated with the plurality of devices.
  • applying the model includes generating emulated data based at least in part on the plurality of sliding windows.
  • the emulated data includes reconstructed information from the high-dimensional time series data. At least one set of the emulated data corresponds to one sliding window of the plurality of sliding windows.
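The sliding-window accumulation described in these embodiments can be sketched as follows (illustrative; `size` and `stride` are assumed parameters, with each window accumulating consecutive data frames for one online training batch):

```python
import numpy as np

def sliding_windows(frames, size, stride):
    """Accumulate consecutive data frames into fixed-size windows so that the
    model can be trained online, one batch of windows at a time."""
    return np.stack([frames[i:i + size]
                     for i in range(0, len(frames) - size + 1, stride)])

frames = np.arange(20).reshape(10, 2)   # 10 frames of 2-dimensional device data
windows = sliding_windows(frames, size=4, stride=2)
# windows.shape == (4, 4, 2): four overlapping windows of four frames each
```

A stride smaller than the window size gives overlapping windows, so each frame contributes to several training batches without waiting for the full stream.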
  • the model is a feature-selective variational generator for high-dimensional time series (FVGHT).
  • the model is applied based at least in part on at least one of a concrete autoencoder, CAE; and a gated recurrent unit, GRU.
  • the emulator device further includes a communication interface in communication with the processing circuitry.
  • the communication interface is configured to at least one of: transmit (e.g., expose) emulated data to at least one device of the plurality of devices; and transmit signaling to cause at least one device of the plurality of devices to perform the at least one action.
  • performing the at least one action includes at least one of performing an online device verification of at least one device of the plurality of devices without shutting down the at least one device; and determining an anomaly associated with the at least one device.
  • At least one of the plurality of devices is an internet of things (IoT) device.
  • the data associated with the plurality of devices includes data generated by at least one of a physical entity and a digital entity.
  • the emulator device is a digital twin emulator.
  • a method performed by an emulator device configured to communicate with a plurality of devices comprises determining high-dimensional time series data based on data associated with the plurality of devices; applying a model to the high-dimensional time series data based on a determined feature selection, the applied model inheriting at least one characteristic of a variational autoencoder (VAE); training the model, using online batch-based machine learning, based at least in part on the high-dimensional time series data; and performing at least one action using the trained model.
  • the method further includes determining the feature selection of the data associated with the plurality of devices, where the determined feature selection indicates at least one relation between at least two dimensions of the data.
  • the method further includes determining a plurality of sliding windows based at least on one parameter, to train the model using online-batch learning.
  • Each sliding window of the plurality of sliding windows has a size.
  • the method further includes accumulating a plurality of data frames per each sliding window of the plurality of sliding windows.
  • the plurality of data frames is associated with the data associated with the plurality of devices.
  • applying the model includes generating emulated data based at least in part on the plurality of sliding windows.
  • the emulated data includes reconstructed information from the high-dimensional time series data. At least one set of the emulated data corresponds to one sliding window of the plurality of sliding windows.
  • the model is a feature-selective variational generator for high-dimensional time series (FVGHT).
  • the model is applied based at least in part on at least one of a concrete autoencoder, CAE; and a gated recurrent unit, GRU.
  • the method further includes at least one of transmitting (e.g., making available, exposing, sharing) emulated data to at least one device of the plurality of devices; and transmitting signaling to cause at least one device of the plurality of devices to perform the at least one action.
  • performing the at least one action includes at least one of performing an online device verification of at least one device of the plurality of devices without shutting down the at least one device; and determining an anomaly associated with the at least one device.
  • at least one of the plurality of devices is an internet of things (IoT) device.
  • the data associated with the plurality of devices includes data generated by at least one of a physical entity and a digital entity.
  • the emulator device is a digital twin emulator.
  • a computer program comprises instructions which, when executed on processing circuitry of an emulator device, cause the processing circuitry to carry out any method of the present disclosure.
  • a computer program product is described.
  • the computer program product is stored on a computer storage medium and comprises instructions that, when executed by processing circuitry of an emulator device, cause the emulator device to perform any method of the present disclosure.
  • FIG. 1 is a diagram of an emulator in an IoT-based digital twin platform;
  • FIG. 2 is a schematic diagram of an example system according to principles disclosed herein;
  • FIG. 3 is a block diagram of various elements in the system according to some embodiments of the present disclosure.
  • FIG. 3 is a flowchart of an example process in an emulator device according to some embodiments of the present disclosure;
  • FIG. 4 is a flowchart of another example process in an emulator device according to some embodiments of the present disclosure;
  • FIG. 5 is a diagram of an example emulator in an IoT-based digital twin platform according to some embodiments of the present disclosure;
  • FIG. 6 is a diagram of an example feature-selective generative ML algorithm (FVGHT) for high-dimensional time-series data according to some embodiments of the present disclosure
  • FVGHT feature-selective generative ML algorithm
  • FIG. 7 is a diagram of various example steps performed by the emulator according to some embodiments of the present disclosure.
  • FIG. 8 is a diagram of example feature selection units of FVGHT for high-dimensional time-series data according to some embodiments of the present disclosure;
  • FIG. 9 is a diagram of example generative units of FVGHT for high-dimensional time-series data according to some embodiments of the present disclosure;
  • FIG. 10 is a diagram of example time series units of FVGHT for high-dimensional time series data (e.g., the decoder part) according to some embodiments of the present disclosure;
  • FIG. 11 is a diagram of example time series units of FVGHT for high-dimensional time series data (e.g., the encoder part) according to some embodiments of the present disclosure;
  • FIG. 12 is a diagram of an example implementation of an emulator machine/device based on distributed processor units according to some embodiments of the present disclosure;
  • FIG. 13 is a diagram of an example emulator device (e.g., emulator machine) with point-to-point communication between processors and with an external broker according to some embodiments of the present disclosure;
  • FIG. 14 is a flowchart of an example process implemented by the emulator device according to some embodiments of the present disclosure.
  • FIG. 15 is a diagram of an example orchestration of an emulator according to some embodiments of the present disclosure.
  • the emulator device may be configured to generate (e.g., dynamically generate) one or more models such as for verification of physical assets and/or detection of potential anomalies in various conditions.
  • the emulator may be configured to select impactful features and remove the irrelevant and redundant features. In one or more embodiments, selecting and/or removing may be important steps for the emulator to be robust.
  • the emulator (i.e., the emulator device) may be configured to fulfill/provide the following:
  • Prior domain knowledge (training labels, useful data models) regarding collected data may be sparse due to various combinations of devices and changes in processes (limited domain expertise); therefore, the solution may have minimized dependency on prior domain expertise.
  • the solution may be replicable and reusable for different use cases. That is, the solution (including the algorithms and other components) may be applicable to different use cases (with different data sets, different requirements) without introducing changes (e.g., significant changes).
  • Data processing algorithms may be diverse enough so that predetermined potential conditions and environment assumptions (that have not happened before) can be included in a model;
  • Feature selection may be used so that irrelevant and redundant data can be removed to increase the quality of the data generated by the emulator;
  • the model may be lightweight for edge devices and reduce computation costs (e.g., energy and time).
  • the emulator (and corresponding features) can be configured to meet any of these requirements/conditions.
  • a first system provides for the automatic creation of three-dimensional (3D) digital twin models of some mechanical parts that can be imported into a simulation platform and be simulated.
  • the first system creates an exact model of devices (e.g., the real thing), while in one or more embodiments of the present disclosure, the emulator device may be configured to automatically create different variations of the model states and simulate the different variations.
  • One limitation of the first system is thus a focus on a 3D model that can be used in a simulator but cannot provide “what-if” scenarios for the model.
  • a second system relates to a direct connection between the robot program and a simulation platform so that the simulation can be performed on a 3D model with the real robot logic.
  • the second system simulates the exact behavior, while one or more embodiments of the present disclosure diversifies it to explore different situations and possibilities.
  • the limitation of the second system is thus the inability to simulate “what-if” scenarios that are similar but not the same as the historical situations; rather, the second system only emulates the behavior of real things in the current situation.
  • a third system relates to a digital twin based solution for preventive maintenance of vehicles by simulating future trips and predicting car component failures during those trips, as well as suggesting car mechanic stops along the way.
  • the third system utilizes historical data for simulation.
  • the limitation of the third system is that it utilizes historical journeys as is, without expanding the parametrization space, and thus not accounting for different but very similar scenarios.
  • the third system uses expert models for simulating component failures, while one or more embodiments of the present disclosure use a data-driven approach by applying machine learning and is thus able to explore a much wider problem space.
  • a fourth system utilizes a variational autoencoder (VAE) including an encoder network, a prior network, and a decoder network to train on a set of images used as input data.
  • the technique also includes training the VAE by updating one or more parameters of the VAE based on one or more outputs produced by the VAE from the training dataset.
  • The constructed VAE provides output in the form of new images that reflect one or more visual attributes associated with the set of training images, by applying the decoder network to one or more values sampled from the second distribution of latent variables generated by the prior network.
  • the fourth system uses a hierarchical variational autoencoder to generate new images similar to the input ones, while one or more embodiments of the present disclosure use concrete autoencoders to extract features and hierarchical autoencoders to emulate real-world conditions represented in the form of time series data, such as sensor telemetry data.
  • a fifth system relates to compressing images using neural networks.
  • the fifth system utilizes variational autoencoders (VAE), namely an encoder to process the image and provide latent variables as an output, based on which the generative neural networks can recreate the original image.
  • the latent variables represent features of the image and are used to condition the generative neural networks.
  • the fifth system uses a VAE to reproduce the input image from a compressed version, while one or more embodiments of the present disclosure emulate real-world conditions represented in the form of time-series data that are not the same as the observed ones, while still being realistic.
  • a sixth system uses autoencoders including an encoder neural network and a decoder neural network to generate music.
  • the encoder is configured to receive an input audio waveform and, in response, provide an embedding descriptive of the input audio waveform.
  • the decoder is configured to receive the embedding and at least a portion of the input audio waveform and, in response, predict the next sequential audio sample for the input audio waveform.
  • the operations include evaluating a loss function that compares the next sequential audio sample predicted by the decoder neural network to a ground-truth audio sample associated with the input audio waveform.
  • the operations include adjusting one or more parameters of the autoencoder model to improve the loss function.
  • the neural synthesizer model can use deep neural networks to generate sounds at the level of individual samples. Learning directly from data, the neural synthesizer model can provide intuitive control over timbre and dynamics and enable exploration of new sounds that would be difficult or impossible to produce with a hand-tuned synthesizer.
  • the sixth system applies autoencoders to music creation, focusing on synthesizing new sounds, while one or more embodiments of the present disclosure apply VAE to emulate variable real-world conditions represented as time-series data, such as telemetry data, in a manner that the emulated data represents realistic real-world conditions.
  • the embodiments reside primarily in combinations of apparatus components and processing steps related to machine learning and in particular to one or more of feature selection, generative training and verification/anomaly detection. Accordingly, components have been represented where appropriate by conventional symbols in the drawings, showing only those specific details that are pertinent to understanding the embodiments so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.
  • relational terms such as “first” and “second,” “top” and “bottom,” and the like, may be used solely to distinguish one entity or element from another entity or element without necessarily requiring or implying any physical or logical relationship or order between such entities or elements.
  • the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the concepts described herein.
  • the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.
  • the joining term, “in communication with” and the like may be used to indicate electrical or data communication, which may be accomplished by physical contact, induction, electromagnetic radiation, radio signaling, infrared signaling or optical signaling, for example.
  • electrical or data communication may be accomplished by physical contact, induction, electromagnetic radiation, radio signaling, infrared signaling or optical signaling, for example.
  • functions described herein as being performed by an emulator device may be distributed over a plurality of devices.
  • the functions of the emulator device described herein are not limited to performance by a single physical device and, in fact, can be distributed among several physical devices.
  • determining may refer to (without being limited to) at least one of obtaining, calculating, deriving, detecting, measuring, generating, learning, developing, deciding, controlling, regulating, dictating, discovering, establishing, setting, completing, etc.
  • determining data such as high-dimensional time series data may include learning and/or generating data such as high-dimensional time series data.
  • the general description elements in the form of “one of A and B” corresponds to A or B. In some embodiments, at least one of A and B corresponds to A, B or AB, or to one or more of A and B. In some embodiments, at least one of A, B and C corresponds to one or more of A, B and C, and/or A, B, C or a combination thereof.
  • Some embodiments are directed to one or more of feature selection, generative training and verification/anomaly detection.
  • FIG. 2 is a schematic diagram of a system 10 that includes an emulator device 12 (also referred to as an emulator) in communication with one or more internet of things (IoT) devices 14 via one or more networks (e.g., wireless network, 3GPP-based network, WAN, LAN, etc.).
  • Emulator device 12 includes hardware 16 enabling it to communicate with one or more entities in system 10 such as with IoT device 14.
  • the hardware 16 may include a communication interface 18 for setting up and maintaining at least communications (e.g., wireless, wired, etc.) with one or more elements in system 10 such as with IoT device 14.
  • IoT device 14 is described as an IoT device as a matter of convenience and to aid understanding of the principles disclosed herein. It is understood that the principles described herein may apply to other devices (e.g., network devices) and elements and are not limited solely and exclusively to IoT devices. IoT device 14 may be a device for use in one or more application domains, these domains comprising, but not limited to, home, city, wearable technology, extended reality, industrial application, and healthcare.
  • the IoT device 14 for a home, an office, a building or an infrastructure may be a baking scale, a coffee machine, a grill, a fridge, a refrigerator, a freezer, a microwave oven, an oven, a toaster, a water tap, a water heater, a water geyser, a sauna, a vacuum cleaner, a washer, a dryer, a dishwasher, a door, a window, a curtain, a blind, a furniture, a light bulb, a fan, an air-conditioner, a cooler, an air purifier, a humidifier, a speaker, a television, a laptop, a personal computer, a gaming console, a remote control, a vent, an iron, a steamer, a pressure cooker, a stove, an electric stove, a hair dryer, a hair styler, a mirror, a printer, a scanner, a photocopier, a projector, a hologram projector, a 3D printer,
• the IoT device 14 for use in a city, urban, or rural areas may be connected street lighting, a connected traffic light, a traffic camera, a connected road sign, an air control/monitor, a noise level detector, a transport congestion monitoring device, a transport controlling device, an automated toll payment device, a parking payment device, a sensor for monitoring parking usage, a traffic management device, a digital kiosk, a bin, an air quality monitoring sensor, a bridge condition monitoring sensor, a fire hydrant, a manhole sensor, a tarmac sensor, a water fountain sensor, a connected closed circuit television, a scooter, a hoverboard, a ticketing machine, a ticket barrier, a metro rail, a metro station device, a passenger information panel, an onboard camera, and other connected devices on a public transport vehicle.
• the IoT device 14 may be a wearable device, or a device related to extended reality, wherein the device related to extended reality may be a device related to augmented reality, virtual reality, merged reality, or mixed reality.
• Examples of such IoT devices may be a smart-band, a tracker, a haptic glove, a haptic suit, a smartwatch, clothes, eyeglasses, a head mounted display, an ear pod, an activity monitor, a fitness monitor, a heart rate monitor, a ring, a key tracker, a blood glucose meter, and a pressure meter.
• the IoT device 14 may be an industrial application device, wherein an industrial application device may be an industrial unmanned aerial vehicle, an intelligent industrial robot, a vehicle assembly robot, and an automated guided vehicle.
• the IoT device 14 may be a transportation vehicle, wherein a transportation vehicle may be a bicycle, a motor bike, a scooter, a moped, an auto rickshaw, a rail transport, a train, a tram, a bus, a car, a truck, an airplane, a boat, a ship, a ski board, a snowboard, a snow mobile, a hoverboard, a skateboard, roller-skates, a vehicle for freight transportation, a drone, a robot, a stratospheric aircraft, an aircraft, a helicopter and a hovercraft.
• the IoT device 14 may be a health or fitness device, wherein a health or fitness device may be a surgical robot, an implantable medical device, a non-invasive medical device, and a stationary medical device which may be: an in-vitro diagnostic device, a radiology device, a diagnostic imaging device, and an x-ray device.
  • the hardware 16 of the emulator device 12 further includes processing circuitry 20.
  • the processing circuitry 20 may include a processor 22 and a memory 24.
  • the processing circuitry 20 may comprise integrated circuitry for processing and/or control, e.g., one or more processors and/or processor cores and/or FPGAs (Field Programmable Gate Array) and/or ASICs (Application Specific Integrated Circuitry) adapted to execute instructions.
  • the processor 22 may be configured to access (e.g., write to and/or read from) the memory 24, which may comprise any kind of volatile and/or nonvolatile memory, e.g., cache and/or buffer memory and/or RAM (Random Access Memory) and/or ROM (Read-Only Memory) and/or optical memory and/or EPROM (Erasable Programmable Read-Only Memory).
  • the emulator device 12 further has software 26 stored internally in, for example, memory 24, or stored in external memory (e.g., database, storage array, network storage device, etc.) accessible by the emulator device 12 via an external connection.
  • the software 26 may be executable by the processing circuitry 20.
  • the processing circuitry 20 may be configured to control any of the methods and/or processes described herein and/or to cause such methods, and/or processes to be performed, e.g., by emulator device 12.
  • Processor 22 corresponds to one or more processors 22 for performing emulator device 12 functions described herein.
  • the memory 24 is configured to store data, programmatic software code and/or other information described herein.
  • the software 26 may include instructions that, when executed by the processor 22 and/or processing circuitry 20, causes the processor 22 and/or processing circuitry 20 to perform the processes described herein with respect to emulator device 12.
  • processing circuitry 20 of the emulator device 12 may include digital twin (DT) unit 28 which is configured to perform one or more emulator/emulator device 12 functions described herein such as with respect to one or more of feature selection, generative training and verification/anomaly detection.
• the system 10 further includes the IoT device 14 already referred to.
• the IoT device 14 may have hardware 30 that may include a communication interface 32 configured to set up and maintain communication with one or more elements in system 10 such as with emulator device 12 and/or other IoT devices 14.
• the hardware 30 of the IoT device 14 further includes processing circuitry 34.
  • the processing circuitry 34 may include a processor 36 and memory 38.
  • the processing circuitry 34 may comprise integrated circuitry for processing and/or control, e.g., one or more processors and/or processor cores and/or FPGAs (Field Programmable Gate Array) and/or ASICs (Application Specific Integrated Circuitry) adapted to execute instructions.
  • the processor 36 may be configured to access (e.g., write to and/or read from) memory 38, which may comprise any kind of volatile and/or nonvolatile memory, e.g., cache and/or buffer memory and/or RAM (Random Access Memory) and/or ROM (Read-Only Memory) and/or optical memory and/or EPROM (Erasable Programmable Read-Only Memory).
• IoT device 14 may include one or more sensors 40 for generating sensor data that may be communicated to emulator device 12 as described herein. IoT device 14 may communicate data other than sensor data to emulator device 12 as described herein.
• the IoT device 14 may further comprise software 42, which is stored in, for example, memory 38 at the IoT device 14, or stored in external memory (e.g., database, storage array, network storage device, etc.) accessible by the IoT device 14.
  • the software 42 may be executable by the processing circuitry 34.
• the processing circuitry 34 may be configured to control any of the methods and/or processes described herein and/or to cause such methods and/or processes to be performed, e.g., by IoT device 14.
• the processor 36 corresponds to one or more processors 36 for performing IoT device 14 functions described herein.
• the IoT device 14 includes memory 38 that is configured to store data, programmatic software code and/or other information described herein.
• the software 42 may include instructions that, when executed by the processor 36 and/or processing circuitry 34, cause the processor 36 and/or processing circuitry 34 to perform the processes described herein with respect to IoT device 14.
• the inner workings of the emulator device 12 and IoT device 14 may be as shown in FIG. 2, independently of the surrounding network topology.
• While FIG. 2 shows DT unit 28 as being within a respective processor, it is contemplated that this unit may be implemented such that a portion of the unit is stored in a corresponding memory within the processing circuitry. In other words, the unit may be implemented in hardware or in a combination of hardware and software within the processing circuitry. Further, one or more processors, processing circuitry and/or other elements in FIG. 2 may be distributed in one or more devices.
  • FIG. 3 is a flowchart of an example process in an emulator device 12 according to one or more embodiments of the present disclosure.
  • One or more blocks described herein may be performed by one or more elements of emulator device 12 such as by one or more of processing circuitry 20 (including the DT unit 28), processor 22, and/or communication interface 18.
• Emulator device 12 is configured to receive (Block S100) data associated with the plurality of IoT devices 14, as described herein.
• Emulator device 12 is configured to generate (Block S102) variational data based on the received data, the variational data including high-dimensional time-series data sets, as described herein.
• Emulator device 12 is configured to train (Block S104) a machine learning, ML, model using the variational data, as described herein.
  • Emulator device 12 is configured to use (Block S106) the trained ML model to perform at least one detection, as described herein.
• the using of the trained ML model to perform at least one detection includes: generating data for emulation using the trained ML model based at least on a time-sliding window of time-series data, the generated data indicating at least one emulated outcome, and determining whether the emulated outcome indicates an anomaly.
  • the ML model is a Feature-selective Variational Generator for High-dimensional Time Series, FVGHT, model.
  • the processing circuitry is further configured to perform feature selection to remove at least one of redundant data and irrelevant data from the received data.
• the variational data follows a probabilistic distribution associated with the received data.
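• By way of illustration only, the detection of Block S106 can be sketched as thresholding the reconstruction error of a time-sliding window against the model's output; the moving-average reconstructor and the threshold value below are hypothetical stand-ins for the trained ML model, not the claimed FVGHT:

```python
def reconstruction_error(window, reconstruct):
    """Mean squared error between a window and its reconstruction."""
    recon = reconstruct(window)
    return sum((a - b) ** 2 for a, b in zip(window, recon)) / len(window)

def is_anomalous(window, reconstruct, threshold):
    """The emulated outcome indicates an anomaly if the error exceeds the threshold."""
    return reconstruction_error(window, reconstruct) > threshold

# Stand-in "trained model": a simple 3-point moving-average reconstructor.
def smooth(window):
    out = []
    for i in range(len(window)):
        lo, hi = max(0, i - 1), min(len(window), i + 2)
        out.append(sum(window[lo:hi]) / (hi - lo))
    return out

normal = [1.0, 1.1, 0.9, 1.0, 1.05]
spiked = [1.0, 1.1, 9.0, 1.0, 1.05]  # injected fault
```

A well-reconstructed window stays below the threshold, while the injected spike produces a large error and is flagged.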
  • FIG. 4 is a flowchart of another example process in an emulator device 12 according to one or more embodiments of the present disclosure.
  • One or more blocks described herein may be performed by one or more elements of emulator device 12 such as by one or more of processing circuitry 20 (including the DT unit 28), processor 22, and/or communication interface 18.
• Emulator device 12 is configured to determine (Block S108) high-dimensional time series data based on data associated with the plurality of devices and/or apply (Block S110) a model to the high-dimensional time series data based on a determined feature selection, the applied model inheriting at least one characteristic (e.g., a typology such as data-generative typology) of a variational autoencoder (VAE).
• Emulator device 12 is further configured to train (Block S112) the model, using online batch-based machine learning, based at least in part on the high-dimensional time series data and/or perform (Block S114) at least one action using the trained model.
  • the method further includes determining the feature selection of the data associated with the plurality of devices, where the determined feature selection indicates at least one relation between at least two dimensions of the data.
  • the method further includes determining a plurality of sliding windows based at least on one parameter, to train the model using online-batch learning. Each sliding window of the plurality of sliding windows has a size. In an embodiment, the method further includes accumulating a plurality of data frames per each sliding window of the plurality of sliding windows. The plurality of data frames is associated with the data associated with the plurality of devices.
  • applying the model includes generating emulated data based at least in part on the plurality of sliding windows.
  • the emulated data includes reconstructed information from the high-dimensional time series data. At least one set of the emulated data corresponds to one sliding window of the plurality of sliding windows.
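• A minimal sketch of how data frames may be grouped into overlapping sliding windows; the size and step values below are illustrative assumptions, not claimed parameters:

```python
def sliding_windows(frames, size, step):
    """Yield windows of `size` consecutive data frames, advancing by `step`."""
    for start in range(0, len(frames) - size + 1, step):
        yield frames[start:start + size]

frames = list(range(10))  # stand-in data frames
windows = list(sliding_windows(frames, size=4, step=2))
```

Each yielded window corresponds to one set of emulated data in the description above.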
  • the model is a feature-selective variational generator for high-dimensional time series (FVGHT).
  • the model is applied based at least in part on at least one of a concrete autoencoder, CAE; and a gated recurrent unit, GRU.
  • performing the at least one action includes at least one of performing an online device verification of at least one device of the plurality of devices without shutting down the at least one device; and determining an anomaly associated with the at least one device.
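• The online verification action can be pictured, under simplifying assumptions, as comparing live readings against emulator output while the device keeps running; the tolerance below is an illustrative parameter:

```python
def verify_online(live, emulated, tolerance):
    """Compare live device readings with emulator output sample-by-sample;
    return the indices where they deviate beyond the tolerance."""
    return [i for i, (a, b) in enumerate(zip(live, emulated))
            if abs(a - b) > tolerance]

deviations = verify_online([1.0, 2.0, 3.0], [1.0, 2.5, 3.0], tolerance=0.2)
```

A non-empty deviation list may then be treated as an anomaly associated with the device, without shutting it down.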
• At least one of the plurality of devices is an internet of things (IoT) device 14.
  • the data associated with the plurality of devices includes data generated by at least one of a physical entity and a digital entity.
• the emulator device is a digital twin emulator.
• the learning process may include one or more of the following: determining data and/or transmitting the data from sensor 40 to data model 50; determining structure data and/or transmitting the structure data from data model 50 to knowledge representations 52; determining one or more observations and/or transmitting the one or more observations from knowledge representations 52 to behavior model 54; determining a status summary and/or transmitting the status summary from behavior model 54 to decision model 56; determining one or more actions and/or transmitting the one or more actions from decision model 56 to knowledge representations 52; determining structure data and/or transmitting the structure data from knowledge representations 52 to data model 50; and determining one or more instructions and/or transmitting the one or more instructions to actuator 44, which may then perform one or more actions which may be associated at least with sensor 40.
  • the emulating process may include one or more actions and/or data transmitted/ exchanged between behavior model 54 and decision model 56.
• the updating process may include one or more of the following: determining emulated behavior and/or transmitting the emulated behavior to industrial applications 58 (e.g., for updating one or more software applications); determining one or more emulated decisions and/or transmitting the one or more emulated decisions to decision model 56; and/or determining one or more actions and/or transmitting the one or more actions to the behavior model 54.
  • any of the components/elements shown in FIG. 5 may correspond to and/or be comprised in and/or be performed by any of the components/elements/devices shown in FIG. 2.
• IoT device 14 includes sensor 40 for generating data that is communicated to the emulator device 12. Further, IoT device 14 may include an actuator 44 or element that is configured to receive an instruction and/or perform one or more actions associated with the IoT device 14.
• emulator device 12 includes one or more functional blocks such as a data model 50, knowledge representation 52 (semantic model), behavior model 54 (emulator), and decision model 56 (emulator), one or more of which may be implemented by DT unit 28 and/or various components of emulator device 12 and/or IoT device 14 and/or by entities in system 10. Further, while FIG. 5 illustrates a specific grouping of elements included in the IoT device 14 and emulator device 12, other groupings are possible in accordance with the teachings of the present disclosure.
• the emulator device 12 may be implemented in the IoT device 14 such that the IoT device 14 may perform at least one emulator device 12 function.
  • the emulator includes different models (e.g., a plurality of models or plurality of machine learning, ML, models), where each model is supported by relevant components.
• Several elements of system 10 include:
• One or more devices (e.g., IoT devices 14), where each device includes: o a processing unit (e.g., processing circuitry 34) to process the sensor data and send out the result via a communication unit (e.g., communication interface 32); the IoT devices could run only some parts of the method described below, as one or more embodiments compose the computations in a distributed manner, and the IoT devices could instead send the output from a certain processing composition unit (e.g., processing circuitry 34); o a communication unit to send the sensor data provided by the sensor unit; o sensors and a certain sensor unit (collectively referred to as sensor (e.g., sensor 40)) to collect information from the physical environment.
• One or more computing units (e.g., emulator device 12), such as devices (could be one or many), gateways, various types of apparatus, and/or a computing cloud, each of which is composed of: o a processing unit (e.g., processing circuitry 20); o storage/memory (e.g., memory 24); o one or more communication units (e.g., communication interface 18).
  • the devices and computing units may be combined to compose sub-components of the DT emulator, including the emulator itself and other external supportive components.
  • the DT emulator functions may be performed by emulator device 12.
  • the DT emulators may run/operate with support from some external components:
• Device platform may include physical entities (e.g., IoT device 14) with their sensors (e.g., sensor 40) and actuators 44. The former include devices providing data in a proprietary format about a physical entity, e.g., a temperature sensor, while the latter represent devices enforcing instructions in the proper format on a physical entity, e.g., an on/off switch. It may be composed of more than one device.
  • the device platform may be implemented as a separate device from emulator device 12 and/or be in communication with emulator device 12 via a communication link.
• the IoT device platform may be implemented using its own set of computing hardware and software, using the same types of components shown in FIG. 2 with respect to emulator device 12.
• the IoT platform may be in communication with the emulator device 12 such that functions discussed herein with respect to the IoT platform are performed by one computing device, and functions described with respect to the DT platform are performed by emulator device 12.
• Devices (e.g., IoT devices 14) may be located on-premises and/or be implemented as hardware devices in an IoT environment (e.g., a temperature sensor and on/off switch), or as a piece of software that runs on physical entities (e.g., Java code that collects CPU usage on a Raspberry Pi and can restart it).
  • the data model 50 in the device platform may represent the raw data collected from devices in a formalized and structuralized schema so that behavior and decision models with their relevant emulators can use the device data smoothly. It may be composed of one or more computing units.
• Data models 50 may be run/executed by data pre-processing components, which conduct pre-data processing to convert raw data into structured data which can be consumed by the emulator. It may be composed of one or more devices and one or more computing units (e.g., processing circuitry). Such preprocessing may first intake raw data and/or form a data model 50.
  • the data model 50 may receive data in a standardized or proprietary format from sensors and/or output uniformly structured data, e.g., JavaScript Object Notation (JSON) format.
  • Data model 50 may commonly run on-premise as well, as it may require a communication channel that is physically near sensors and actuators in order to communicate with them (e.g., Zigbee, Z-Wave and WiFi modules).
• the data model 50 can run further on the Cloud (e.g., in a cloud computing network), while being connected through the Internet with network-enabled physical devices (e.g., IoT devices 14).
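• As a hedged illustration of the data model 50, a raw proprietary reading can be converted into uniformly structured JSON; the field names and the semicolon-separated input format below are assumptions made for the example only:

```python
import json

def to_data_model(device_id, raw_line):
    """Convert a raw proprietary reading (here a hypothetical
    'temperature;humidity' line) into uniformly structured JSON
    consumable by downstream models."""
    temperature, humidity = raw_line.split(";")
    return json.dumps({
        "deviceId": device_id,
        "observations": {
            "temperature": float(temperature),
            "humidity": float(humidity),
        },
    })

record = json.loads(to_data_model("sensor-1", "21.5;40"))
```

The semantic layer (knowledge representations 52) would then attach context to such uniformly formatted records.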
• Knowledge representations 52 may be generated/determined (e.g., by emulator device 12) from data model 50 and refer to a semantic model which receives uniformly formatted data from the data model and provides a context, using schema such as Resource Description Framework (RDF), Web Ontology Language (OWL), Next Generation Service Interfaces Linked Data (NGSI-LD), etc.
• Emulator (e.g., emulator device 12)
• the behavior and decision models can run/operate on the edge (e.g., the edge of the IoT network, physically and/or logically near one or more IoT devices 14), to keep the round-trip time short.
• the behavior model 54 (e.g., emulator) and/or decision model 56 (e.g., emulator) can also run on the Cloud.
  • the emulator is based on a digital twin platform.
  • the digital twin platform connects to a device platform used for getting and storing both contextual and semantical data, respectively, as well as forwarding actions to actuators.
  • a user may send user-specific ontology/semantics, along with desired intentions on how the system should behave.
  • One or more interfaces may be provided by one or more of communication interface 18 and communication interface 32.
• Behavior model 54 (e.g., emulator)
• the action may be executed by the decision model 56 (e.g., emulator) based on semantically enhanced insights, as well as “intents” from a user; e.g., due to a robot overheating and a user intent to keep a safe environment, it decides to switch off the robot arm.
  • the action may be sent through a semantic model that knows what actuator may need to be switched off. It may be composed of one or more computing units.
• the emulator may serve the DT platform by interacting with the previously described components in three loops, namely:
• Procedure 1, Learning Loop: the emulator device 12 may be configured to take data input from the device platform to train the emulator device 12. The emulator may further be configured to retrieve observations and/or insights from the device platform to get/determine conclusions regarding what is happening and/or decide to take “Actions” to optimize the system. During this process, the emulator device 12 may also take business logic as input from the users; the business logic describes reaction/decision policies using an “if”/“then” schema.
• Procedure 2, Emulating Loop: based on the learned model (in Procedure 1), the emulator generates data for emulation. It creates emulated “Insights” within the “Behavior Model” 54 and sends them to the “Decision Model” 56 to perform decisions and actions. The decisions may be sent as emulated “Actions” to the “Behavior Model” to emulate their outcome in the form of newly created simulated Insights. Those simulated Insights are used for selecting “Actions” to be executed.
• Procedure 3, Updating Loop: the emulator (i.e., emulator device 12) is updated by comparing the result of actions to the given “Intent” to optimize the emulators. It utilizes recorded “performance” (e.g., error rates, etc.) in both the “Learning Loop” and “Emulating Loop” to update the “Behavior Model Emulator” 54 and “Decision Model Emulator” 56.
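• The three loops can be sketched, purely for intuition, with a toy scalar model standing in for the behavior/decision emulators; the learning-rate update rule below is an assumption for illustration, not the disclosed algorithm:

```python
class EmulatorLoops:
    """Toy stand-in: the 'model' is a single scalar behavior value."""
    def __init__(self, lr=0.5):
        self.behavior = 0.0
        self.lr = lr

    def learn(self, observations):
        # Learning loop: fit the behavior model to observed device data.
        self.behavior = sum(observations) / len(observations)

    def emulate(self):
        # Emulating loop: produce an emulated insight from the model.
        return self.behavior

    def update(self, intent):
        # Updating loop: compare the emulated outcome to the intent and
        # use the recorded error to adjust the model.
        error = intent - self.emulate()
        self.behavior += self.lr * error
        return abs(error)

loops = EmulatorLoops()
loops.learn([1.0, 2.0, 3.0])
first_error = loops.update(intent=4.0)
second_error = loops.update(intent=4.0)
```

Repeated updating-loop passes shrink the recorded error, which is the optimization role the loop plays above.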
  • the emulator may be composed of the Emulator algorithms and a set of computing units running the emulator algorithms, which may be stored in, for example, memory 24, DT unit 28, etc.
  • the emulator is running/operating together with other external components, namely other components in the digital twin platform.
  • the emulator itself does not have specific requirements on the DT components, and just sending data to the emulator may be enough.
• the FVGHT may be used to resolve a probabilistic distribution that asymptotically approaches the objective data sets, which are multivariable time series data, after conducting feature selection on the high-dimensional data collected from DT physical devices.
• In a Digital Twin, such data are the ones generated by the physical twin counterparts.
• the FVGHT may be a neural network typology that inherits functionalities from a concrete autoencoder (CAE), variational autoencoders (VAE), and a gated recurrent unit (GRU), and performs online training and learning.
• the FVGHT may be used to perform feature selection on the data generated by physical twins, e.g., so that the emulation may be primarily based on contributive/impactive parameters. Moreover, when irrelevant and redundant data are avoided to some extent, it both saves computation resources (such as energy) and lightens the computation burden on edge devices.
  • the FVGHT may be a variational data generator and/or inherit the data- generative typology of VAE, e.g., so that data sets having the same probabilistic features as the data set to be emulated can be then generated;
  • the FVGHT may be designed to process time-series data with sliding windows and/or may inherit the typology of GRU networks to handle the continuously generated data from the physical twins. In some embodiments, this feature is important for the Digital Twin emulator, as the timely relations between different batches of data sets may be reflected during the training.
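• The data-generative typology inherited from VAE rests on the reparameterization trick, which a short sketch can illustrate; the latent dimensions and values below are arbitrary, and this is not the disclosed FVGHT implementation:

```python
import math
import random

def sample_latent(mean, log_var, rng):
    """Reparameterized draw z = mu + sigma * eps from the VAE latent
    (mean/variation) layer; eps is standard-normal noise, so sampling
    stays differentiable with respect to mean and log-variance."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]

rng = random.Random(0)
# With a very small variance the sample collapses to the mean.
z = sample_latent([1.0, -2.0], [-20.0, -20.0], rng)
```

At training time the generator draws such z values and decodes them, so generated sets share the probabilistic features of the data being emulated.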
  • FIG. 6 shows a diagram of an example feature-selective generative ML algorithm (FVGHT) for high-dimensional time-series data.
• generator structure 64 includes collected data from IoT (e.g., physical) devices 14, a feature selection structure 70 (which may determine and/or transmit selected authentic features 72 (e.g., not pseudo features)), a first time series structure 74, VAE Latent 76 (e.g., mean/variation), VAE-DE 78, and a second time series structure 80.
  • the FVGHT (as shown in FIG. 6) also has one or more of the following features: feature-selective learning, generative learning, unsupervised learning, and online learning. Those features enable it to continuously generate emulating data sets from the data generated by physical twins which takes high-dimensional time series data as input.
  • the emulator takes datasets with any of the following features as input:
  • the data can be generated by both the physical and digital (e.g., logical) entities in the digital twin framework.
  • the emulator delivers datasets as output by demands with one or more of the following features:
  • the generated data contains variational information learned from contributive/impactive information so that potential/undetectable design faults, anomalies can be detected when the demanded amounts of data are large enough (i.e., exceed a predetermined threshold).
  • a method (including a set of different steps) provides features not provided by existing systems.
  • One or more steps included in the method provide a unique integration of some known technologies and are inventively evolving the known technologies from existing systems.
  • a method can be implemented at one or more computing units (and/or any component of emulator device 12), which may perform one or more steps illustrated in FIG. 7.
• Step S200 To trigger the feature reduction and build the models, a specific size of the time window may be defined/determined for accumulating the data. This step may group the data into batches (i.e., mini-batches).
• Each sliding window outputs a data frame in temporal order with a certain defined interval. It could be accumulated in memory or persistent storage depending on the requirements.
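• Step S200's accumulation per time window can be sketched as follows; the (timestamp, value) sample format and the window length are illustrative assumptions:

```python
def accumulate_frames(samples, window_length):
    """Group (timestamp, value) samples into one data frame per sliding
    window of `window_length` time units."""
    frames, current, start = [], [], None
    for t, v in samples:
        if start is None:
            start = t
        if t - start >= window_length:
            frames.append(current)  # flush the finished window
            current, start = [], t
        current.append((t, v))
    if current:
        frames.append(current)      # flush the trailing partial window
    return frames

samples = [(t, t * 0.1) for t in range(6)]  # one sample per time unit
frames = accumulate_frames(samples, window_length=2)
```

Each resulting frame is what the training steps below would consume in temporal order.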
  • Step S202 This step comprises the collection of available data from the devices (e.g., loT devices 14). It may integrate heterogeneous devices (e.g., loT devices 14) which may have one or multiple different communication protocols.
  • the emulator is configured to communicate with one or more devices, each having a different communication protocol, i.e., the emulator is configured to support a plurality of communication protocols.
  • the emulator may be configured to collect the data generated by physical twins (i.e., feed the data into the emulator). Once enough data is accumulated (e.g., once collected data amount meets a predefined threshold) according to the size of the sliding window, step S204 may be triggered. Step S202 may keep going/running/operating unless the emulator needs to be shut down;
  • Step S204 continuously train the emulator for feature selection on the edge (if the edge is needed). In one or more embodiments, this step may keep/continue to operate/run unless the emulator needs to be shut down;
• Step S206 continuously feed feature-selected data to the non-edge part of the emulator and/or handle any backpropagation from the non-edge part. In one or more embodiments, this step may continue to run/operate unless the emulator needs to be shut down;
• Step S208 continuously train the emulator for generating time series and perform backpropagation with the edge part. In one or more embodiments, this step may continue to operate unless the emulator needs to be shut down;
  • Step S210 provide demands (the amount of data) to the emulator and/or deliver/transmit the emulating data.
  • the amount of demanded data may not be larger than the total amount of training data fed into the emulator since the emulator starts.
  • the input of the Emulator is exposed to the sliding step for accumulation of data
  • Each sliding window accumulates data frames for a certain/predefined amount of time and/or for a certain amount of samples.
  • the computational tasks progress as the sliding windows move, while working on data accumulated during the sliding windows.
• the emulator handles (e.g., determines, collects, stores) data generated within 2 hours, where 2 hours is the size of the sliding window. Consequently, the data fed into each training step is time-series data accumulated within 2 hours. After handling data in the current sliding window, the sliding window moves to take data generated in the 2 hours after the current one.
  • Such data accumulation provides a basis for concentrating information in the time dimension.
• Feature selection units of the emulator (e.g., processing circuitry 20 and/or DT unit 28 functionality)
  • FIG. 8 shows feature selection units of FVGHT for high-dimensional time-series data.
  • the feature selection units i.e., structure
  • CAE introduces feature selection layers, where Concrete random variables can be sampled to produce continuous relaxation of the one-hot vectors.
  • CAE selects discrete features using an embedded method but without regularizations.
• the feature-selected data may be fed to the time series generator. Further, both the input and output of these units fulfill time series structures. This may be important. A minimum L2 loss in the CAE training is implemented, which may force the decoded CAE series to be highly close (i.e., close) to the input time series.
  • This part of computations may be conducted on the edge (e.g., of the network).
• the loss introduced in this part can also be used as regularization for the variational time series generation in a global sense.
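• The Concrete-variable sampling used in CAE feature-selection layers can be sketched as a Gumbel-softmax draw; the logits and temperature below are illustrative, and at low temperature the continuous relaxation approaches a one-hot selection vector:

```python
import math
import random

def concrete_sample(logits, temperature, rng):
    """One Concrete (Gumbel-softmax) draw: add Gumbel noise to the logits,
    divide by the temperature, and softmax into selection weights."""
    gumbels = [-math.log(-math.log(rng.random())) for _ in logits]
    scores = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    peak = max(scores)                      # for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
weights = concrete_sample([5.0, 0.0, 0.0], temperature=0.1, rng=rng)
```

Because the relaxation is continuous, gradients can flow through the selection during training, which is what lets the layer learn which features to keep.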
• Variational Generator units of the emulator (e.g., of DT unit 28, of emulator device 12)
  • the Variational Generator units are based on Variational autoencoder structure as shown in FIG. 9 (a diagram of example generative units of FVGHT for high-dimensional time-series).
• In FIG. 9, data 66 (e.g., data frames), selected authentic features 68, a first time series structure 74 (e.g., encoded using VAE), VAE Latent 76, VAE decoders 78, a VAE-decoded time series structure, and a second time series structure 80 are shown.
• the data input to the VAE encoder part (i.e., the first time series structure 74) is authentic parameters collected from physical twins, rather than reconstructed pseudo-features.
  • the VAE also includes some time series units which are discussed below.
• each X presented in FIG. 9 stands for a data frame having a width that is the same as the number of variables and a length that is the same as the time period (for example, a data frame with multiple variables collected within a certain hour). Boxes with dashed lines indicate the sliding windows, whose size corresponds to the size of the sliding window.
• the generator comprises a group of variational autoencoders and can take input X from the computation results of the previous sliding windows.
  • the loss is more meaningful than accuracy. That is also because the training labels and inputs are the same datasets, and each VAE tries to minimize the KL divergence between the original data 66 and the reconstructed data (e.g., time series structure 80).
• the computation accuracy/loss is evaluated using K-fold cross-validation, where K defaults to 5 unless further configuration is provided.
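The VAE training objective referred to above can be sketched in the standard ELBO style (the source's exact loss may differ; this is the common form with a mean-squared reconstruction term plus the closed-form KL divergence between the encoder's Gaussian and a standard normal prior):

```python
import numpy as np

def vae_loss(x, x_recon, mu, log_var):
    """Mean-squared reconstruction error plus the closed-form KL term
    between the encoder's Gaussian q(z|x) and a standard normal prior."""
    recon = np.mean((x - x_recon) ** 2)
    kl = -0.5 * np.mean(1.0 + log_var - mu ** 2 - np.exp(log_var))
    return recon + kl

x = np.array([0.5, -1.0, 2.0])            # original window data
x_recon = np.array([0.4, -0.9, 2.1])      # decoder output (toy values)
mu, log_var = np.zeros(2), np.zeros(2)    # encoder exactly at the prior
print(vae_loss(x, x_recon, mu, log_var))  # ≈ 0.01 (KL term vanishes here)
```

Because training labels and inputs are the same data set, minimizing this loss drives the reconstructed series toward the distribution of the original data, which is why loss is a more meaningful metric here than accuracy.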
  • FIG. 10 is a diagram of example time series units of FVGHT for high dimensional series data (e.g., the decoder part)
• FIG. 11 is a diagram of example time series units of FVGHT for high-dimensional time series data (e.g., the encoder part). More specifically, in the time-series units, data is processed in the time dimension, in contrast to the CAE structure, which processes data in the parameter dimension.
  • the RNN in VAE-encoder is also impacted by the size of sliding windows.
• the deployed sensor transmits data every 10 milliseconds; based on the descriptions of sliding windows above, the sliding window may accumulate, for example, 720,000 data items (2 hours at one item per 10 ms) if the time length of the sliding window is defined as 2 hours. Other data item counts and times may be used.
• Such time series units enable the neural network to conduct training on the input data with continuity across different time slots.
• the output of a unit X(t+2) can be used as input for another sliding window so that all the sliding windows are chained.
• the RNN (GRU) in the VAE decoder generates time series with variations using at least two layers. Within the same layer, the outputs from the same neuron may be (e.g., always) mutually connected as a time sequence.
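A minimal NumPy sketch of a single GRU step may help illustrate the recurrent units used in the VAE encoder/decoder (weights here are untrained random placeholders; a deployed FVGHT would learn them):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h, Wz, Wr, Wh):
    """One GRU update: gates decide how much hidden state to keep."""
    xh = np.concatenate([x, h])
    z = sigmoid(Wz @ xh)                              # update gate
    r = sigmoid(Wr @ xh)                              # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h]))
    return (1.0 - z) * h + z * h_tilde

n_in, n_hid = 4, 3
Wz, Wr, Wh = [rng.normal(size=(n_hid, n_in + n_hid)) for _ in range(3)]
h = np.zeros(n_hid)
for t in range(5):                                    # short time series
    h = gru_step(rng.normal(size=n_in), h, Wz, Wr, Wh)
print(h.shape)  # → (3,)
```

Chaining the hidden state `h` across steps is what lets the final state of one sliding window serve as input to the next, as described above.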
  • theoretical support for one or more embodiments of FVGHT described herein is as follows:
• GRU may be a better choice compared to LSTM, especially in the context of the IoT-based digital twin described herein;
• GRU-VAE has scientifically validated performance for processing multivariable time series data.
• CAE provides performance (e.g., that exceeds a predetermined performance threshold) on feature selection that outputs authentic parameters rather than reconstructed ones. It has been applied in multiple domains such as the natural sciences, which typically seek to explain “why” together with “what”.
• VAE together with RNN provides performance (e.g., that exceeds a predetermined performance threshold) to generate time-series data.
  • Typical systems do not provide the advantages and configurations described herein and do not include all the features of one or more embodiments described herein.
  • the emulator process is implemented based on external storage.
• the emulator exposes (e.g., provides) data to other components, which allows the presented solution to be integrated into an IoT-based digital twin platform.
  • the analyzed results can be exposed to the simulation and automation loops. Further, besides supporting simulation analysis, the solution can also be used for predictions.
  • the emulator device 12 may generate data for various industrial applications 58.
• the generated data may comprise variations of which human engineers may or may not be aware, i.e., the emulator device provides data (with stress) for checking potential anomalies and for online device verification.
  • the emulator device 12 may be configured to conduct performance/results evaluation for assumed conditions based on the data collected from physical twins.
  • Emulator device 12 may further provide feedback of the decisions to be tested to users, e.g., without stopping the physical system for testing.
  • Conducting emulation for predictive maintenance in Smart manufacturing may be one of many use cases, e.g., where devices/equipment are geographically distributed.
• the domain expert may not be able to provide enough knowledge supported by an understanding of the data, especially when the production line is newly introduced or assembled.
• the categories of deployed equipment/sensor devices can be very complex (high-dimensional data); the speed of generating data can be very high (data streams arrive at high density); the equipment/sensor devices serve different production units in different geographical locations (heterogeneity of devices and distributed topology); production can be scaled up and down to serve different environments; and the data may be ingested fresh (e.g., in real-time or near real-time) to provide online results quickly.
  • a DT-based emulator machine e.g., emulator device 12
  • emulator device 12 may be configured to provide online emulation of the physical deployment of the online data, e.g., using the architecture shown in FIG. 12.
  • a plurality of DT communication units 100 e.g., DT communication units 100a, 100b, etc.
  • DT processing units 102 e.g., DT processing units 102a, 102b, 102c, etc.
  • Any DT communication unit 100 may refer to communication interface 18 and/or communication interface 32.
  • Any DT processing unit 102 may refer to processing circuitry 20 and/or processing circuitry 34.
• Each component of FIG. 12 may correspond to a computing unit in the provided solution, which can be distributed or reside on one device (e.g., emulator device 12, IoT device 14).
• DT processing units correspond to steps performed by processing circuitry 20 of emulator device 12, DT unit 28, etc.
• DT communication unit 0, i.e., DT communication unit 100a: This unit works as a data receiver, which takes in data from physical entities.
  • DT processing unit 1 i.e., DT processing unit 102a
• This unit works as a data parser, which transforms different raw data into data frames.
  • DT processing unit (2 to n) i.e., DT process unit 102b (and/or more)
  • the total number of generator units is equivalent to the size of sliding windows.
  • Each unit is equivalent to a multi-variable input X in the time period t.
• Each of the units handles the CAE computations to generate data. It interacts with the time-series units for L2 regularization during training and inferencing, and also connects to the VAE encoder units.
  • DT processing unit (n+1 to 2n) (i.e., DT process unit 102c (e.g., up to 102n):
  • the total number of generator units is equivalent to the size of sliding windows.
  • Each unit is equivalent to a multi-variable input X in the time period t.
  • Each of the units handles data from physical entities generated in a certain period of time. It interacts with the VAE latent units for backpropagation during the training and inferencing.
• DT processing unit 2n+1 (i.e., DT processing unit 102o): This unit performs the VAE decoder computations, which first take input from a generated time-series data frame produced by the VAE latent layers and are then trained to force the final output to have minimum KL divergence with the original data sets.
  • DT communication unit (2n+2) (i.e., DT communication unit 100b): This unit works as a data exposer, which transmits the results to relevant components and/or end-users.
  • An emulator i.e., emulator device 12
  • a communication broker 104 may also be used.
  • the terms communication units and processing units have been used for ease of understanding but refer to DT communication units 100 and DT processing units 102, respectively.
  • FIG. 14 is an example flowchart of a process (i.e., method) performed by the emulator device 12 (and/or any of its components).
• step S300: data generated by physical entities (i.e., IoT devices 14) is collected (e.g., by emulator device 12 via communication interface 18).
  • the process further includes transforming and/or formalizing the data (i.e., collected data) to feed in the streams (streams of data fed to one or more components of the emulator device 12 which may include encoders and decoders).
  • data e.g., collected data, transformed data, formalized data, transformed and formalized data
  • data concentration is performed in feature selection units (which may be part of an edge network/component).
  • a first time series computation is performed in GRU units (VAE-EN).
  • generative computation in generative units are performed.
  • a second time series computation is performed in GRU units (VAE-DE).
• the results of steps S300-S312 may be exposed (i.e., transmitted, shared, made available) to other devices (e.g., IoT device 14), other DT components (i.e., emulator device 12 components), and/or users such as end users. Any one of steps S300-S314 (e.g., S306-S312) may be executed in an interactive loop.
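The flow of steps S300-S314 can be sketched as a toy pipeline (every function below is an illustrative stand-in, not from the source; real units would run the CAE, VAE, and GRU computations described earlier):

```python
def collect(raw): return list(raw)                  # S300: ingest device data
def parse(data): return [float(v) for v in data]    # data parser: raw -> data frame
def select_features(frame): return frame[:2]        # edge-side feature selection (CAE)
def gru_encode(frame): return sum(frame)            # time series computation (VAE-EN)
def generate(latent): return latent * 1.01          # variational generation (toy)
def gru_decode(latent): return [latent / 2.0] * 2   # time series computation (VAE-DE)
def expose(series): return {"emulated": series}     # expose results to consumers

out = expose(gru_decode(generate(gru_encode(select_features(parse(collect([1, 2, 3])))))))
print(out)
```

In the deployed system these stages run continuously over the moving sliding windows rather than once over a fixed list.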
• One or more embodiments described herein may be based on an outcome of a network cloud engine (NCE) program, where the digital twin is considered an enabler for various industrial applications. Therefore, as described herein, reusability (for various use cases and industrial applications) is one of the features of the present disclosure. Some examples of how one or more embodiments can be applied to different use cases follow.
  • NCE network cloud engine
  • DT emulator i.e., emulator device 12
  • the DT emulator provides diverse and variational data as an output, i.e., data that looks like the input sensor data.
  • Highlight 2 The DT emulator generates output by first training a FVGHT model to learn the pattern of input data and then generating data from the trained model where variations are added.
• Highlight 3 Device data flows into the emulator as a live stream, while the DT emulator uses sliding windows to process the incoming data. The content of each sliding window is fed into the FVGHT, and the emulator runs continuously, taking input from the moving sliding windows. The online device verification and anomaly detection for some key use cases are described below.
  • the emulator i.e., emulator device 12
  • the emulator enables engineers to perform online device verifications. That is, the emulator may be used to learn a pattern from all the collected hardware data, and then to generate data emulating the hardware systems (with possible environment conditions) with variations. Further, the emulation data sets enable tests on the hardware systems without shutting down anything. Once data is emulated, the emulated data can be used for verifying the physical device state, not only by observing its current state, but also by observing all possible states that might occur.
• the DT emulator learns a pattern that, when the heater is turned on, the temperature of houses in the district can go up logarithmically depending on the power level. The DT emulator may go through different conditions and verify states. If the desired temperature is set to 35 degrees and the DT emulator does not find any set of conditions that would achieve that temperature, the DT emulator (e.g., emulator device) can alarm the user that the desired temperature is unreachable. Moreover, when any changes to the heating system are needed, the data output by the emulator can be used to test the change proposal before making any physical impact on the existing physical entities.
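The district-heating verification idea can be illustrated with a toy sweep (the logarithmic response model and its constants are invented for illustration, not taken from the source):

```python
import math

def emulated_temp(ambient, power_level):
    """Invented logarithmic heating response (illustrative only)."""
    return ambient + 8.0 * math.log1p(power_level)

def reachable(desired, ambient, power_levels):
    """Sweep emulated conditions; True if any reaches the desired temp."""
    return any(emulated_temp(ambient, p) >= desired for p in power_levels)

levels = range(0, 11)                     # candidate power levels to emulate
print(reachable(30.0, 18.0, levels))      # → True  (achievable)
print(reachable(40.0, 18.0, levels))      # → False (emulator would alarm)
```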
  • a robotic arm and a conveyor belt are used on a factory floor, as well as camera detecting items on a conveyor belt being placed by the robotic arm.
• the DT emulator can take camera outputs, robotic arm axes, and the conveyor belt speed, learn their correlation, and continuously update them as new data arrives. By creating variations of those conditions, the DT emulator can check if any device can reach an undesirable state during its operation. For instance, the DT emulator can determine the minimum conveyor belt speed below which the robotic arm would place items one on top of the other if the robotic arm itself were set to its maximum speed. In case any changes are needed for the running physical setup, the data generated by the emulator can be used to verify the changes before any impacts are made on the physical system.
• Mobile Network: in this example, there may be 200 LTE base stations covering an area, all being digitally twinned.
  • Input data includes base station load and a number of mobile devices connected to it.
• the DT emulator begins to emulate mobile device positions and connections and finds a condition where one base station fails due to overload. Mobile devices that were connected to that base station switch to other base stations, which also get overloaded as they now have to handle new connections along with the existing ones, making them fail as well. This can create a chain reaction that knocks out all 200 base stations.
• the DT emulator is capable of verifying the setup/configuration of the entire LTE network in the area. Such verification is particularly helpful when any changes need to be made to the hardware setup, because the running hardware system does not need to be stopped to test the potential consequences of the planned changes.
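The overload chain reaction described above can be sketched as a toy cascade simulation (capacities, loads, and the redistribution rule are invented assumptions for illustration):

```python
def cascade(loads, capacity):
    """Return how many stations fail by cascading overload: when a
    station exceeds capacity it fails, and its load is redistributed
    evenly across the surviving stations, which may then fail in turn."""
    loads = list(loads)
    alive = [True] * len(loads)
    changed = True
    while changed:
        changed = False
        for i in range(len(loads)):
            if alive[i] and loads[i] > capacity:
                alive[i] = False
                changed = True
                failed_load = loads[i]
                loads[i] = 0.0
                survivors = [j for j, up in enumerate(alive) if up]
                for j in survivors:
                    loads[j] += failed_load / len(survivors)
    return alive.count(False)

print(cascade([90, 80, 70], capacity=100))   # → 0 (no failure)
print(cascade([110, 95, 95], capacity=100))  # → 3 (one overload cascades)
```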
• the emulator generates data emulating the hardware system, which can also be used to detect anomalies in the existing hardware system, e.g., because the DT emulator is also aware of the environment's various acceptable states thanks to the variations in the emulated data.
  • a temperature starts going up while a corresponding heater is turned off. This could happen in a case of fire in the room for instance.
  • This data can be used as an input to the discriminator of the DT emulator along with other emulated data representing various acceptable situations. Further, the DT emulator would be able to identify the fire situation as an anomaly amongst other conditions.
• using the emulated data with variations, it is possible to test whether the current hardware setup works as expected under different circumstances/assumptions (i.e., whether the system can respond to anomalies correctly under different circumstances/assumptions).
  • Mobile Network e.g., 3GPP based network
• one base station (e.g., network node) stops sending sensor data due to an internal error, while continuing to operate (e.g., while mobile devices (e.g., wireless devices, UEs) are still connected to the base station).
  • DT emulator i.e., emulator device 12
  • emulator device 12 would be able to emulate these conditions and realize that, if the base station has actually failed, some of the neighboring base stations would receive new connections. Hence, it would detect these conditions as the anomaly.
• using the emulated data with variations, it is possible to test whether the current hardware setup works as expected under different circumstances/assumptions (i.e., whether the system can respond to anomalies correctly under different circumstances/assumptions).
  • FIG. 15 shows a diagram of an example orchestration of emulator according to some embodiments.
  • One or more DT communication units 100a, 100b Data Receiver, Exposer, respectively
  • one or more DT processing units 102a, 102b, 102c, 102d are shown.
• one or more embodiments described herein relate to the IoT landscape consisting of devices, edge gateways, base stations, network nodes, radio nodes, network infrastructure, fog nodes, or cloud.
  • the Timeseries Units, VAE latent units, and VAE Decoder units may be implemented in a cloud network which can benefit from the computation power of the cloud network.
  • Other units, such as Feature selection units, VAE encoder units, data receiver, exposer, and data parser can be located in the edge, depending on the scenario.
  • network node can be any kind of network node comprised in a radio network which may further comprise any of base station (BS), radio base station, base transceiver station (BTS), base station controller (BSC), radio network controller (RNC), g Node B (gNB), evolved Node B (eNB or eNodeB), Node B, multi-standard radio (MSR) radio node such as MSR BS, multi-cell/multicast coordination entity (MCE), integrated access and backhaul (IAB) node, relay node, donor node controlling relay, radio access point (AP), transmission points, transmission nodes, Remote Radio Unit (RRU) Remote Radio Head (RRH), a core network node (e.g., mobile management entity (MME), self-organizing network (SON) node, a coordinating node, positioning node, MDT node, etc.), an external node (e.g., 3rd party node, a node external to the current network), nodes in distributed antenna system (
  • BS base station
  • wireless device or a user equipment (UE) are used interchangeably.
• the WD herein can be any type of wireless device capable of communicating with a network node or another WD over radio signals.
• the WD may also be a radio communication device, target device, device-to-device (D2D) WD, machine-type WD or WD capable of machine-to-machine communication (M2M), low-cost and/or low-complexity WD, a sensor equipped with a WD, tablet, mobile terminal, smart phone, laptop embedded equipment (LEE), laptop mounted equipment (LME), USB dongle, Customer Premises Equipment (CPE), an Internet of Things (IoT) device, or a Narrowband IoT (NB-IoT) device, etc.
  • D2D device to device
  • M2M machine to machine communication
  • Tablet mobile terminals
• smart phone, laptop embedded equipment (LEE), laptop mounted equipment (LME), USB dongles
  • CPE Customer Premises Equipment
• LME laptop mounted equipment
• NB-IoT Narrowband IoT
• one or more embodiments provide a digital twin based emulator using a novel online machine learning model (i.e., FVGHT) to conduct feature selection and generate high-dimensional time-series data sets which can approach (e.g., infinitely approach) the input high-dimensional time series data sets in probabilistic space.
  • FVGHT novel online machine learning model
  • the Emulator machine (device) has a minimum dependency on domain expertise, which is hence highly replicable and reusable for various industrial applications.
• the Emulator machine conducts feature selection on the edge side, so the parameters taken into the generation process are all (or mostly) contributive/impactful parameters, which also avoids the extra burden of computing irrelevant and redundant data during the generative learning phase.
• the model may be based on inheriting and integrating the advantages and core features of CAE, VAE, and GRU with clear synergy effects, which allows the emulator to gain core capabilities, e.g., that a single parent model cannot provide.
  • An emulator machine e.g., emulator device 12 having feature selections has been introduced on the edge side of the physical system so that the emulator machine can at least in part avoid ingesting/inputting irrelevant and redundant data.
  • each data generation is a process of sampling data from the learned probabilistic distributions.
• Each generated set may be different but obeys (infinitely approaches) the probabilistic distribution learned from the input data. Therefore, the process is variational in that the majority of potential conditions and environment assumptions that have not occurred before can be included in the model to verify the device setup and determine/discover potential flaws.
  • An emulator machine applies a novel model (FVGHT) which integrates the advantages from CAE, VAE, and GRU with synergy effects.
  • the emulator machine can learn a probabilistic pattern based on feature selection for expressing high-dimensional time series based on the data sets collected from physical entities.
• VAE-VAE-GRU model: there is no such VAE-VAE-GRU model that has been applied for emulating purposes.
• An emulator machine applies online batch-based machine learning: continuously applying batch-based machine learning algorithms on time series to generate data sets for simulating the data collected from the physical entities, which can concentrate information in the KPI dimension using feature selection and in the time dimension using the VAE.
  • An emulator machine may not require domain expertise for providing data labels for either training the model or creating the model such that one or more embodiments may be automated without human intervention.
  • a highly reusable and replicable digital twin emulator i.e., emulator device 12: since one or more embodiments described herein are semi-supervised reinforcement learning without the requirement on either training labels or domain expertise, the digital twin emulator described herein is highly replicable and reusable in different industrial applications.
  • One or more embodiments described herein provide one or more of the following advantages.
  • One or more embodiments provide an emulator (i.e., emulator device 12) which uses FVGHT as the core algorithm to reproduce data highly similar to the data generated by the physical entities.
  • One or more embodiments described herein utilizes novel machine learning techniques for DT emulator, which has the following advantages:
  • Emulations can be created/determined/performed from the data flow in an online manner using batch-based algorithms, where the batch-based algorithms are usually conducted in an offline manner.
• the online manner provides faster results than the offline manner, and the batch-based algorithms enable the system/emulator to handle advanced analysis.
• Data sets may be emulated from high-dimensional time-series data flows using sliding windows, a concrete autoencoder, and a variational autoencoder, where the concrete autoencoder serves to condense the high-dimensional data using feature selection.
• One or more embodiments described herein do not require pre-existing domain expertise to create models for the performance evaluation or to train/evaluate the models; these tasks are performed autonomously from the data flow.
  • the feature selection can “light-weight” the generative training by removing redundant and irrelevant data on the edge.
  • the feature selection helps to fit the emulator for devices on the edge for processing high-dimensional data close to the data resources.
  • a plurality of sliding windows may be determined, based at least on one parameter, to train the model using online-batch learning.
  • Each sliding window of the plurality of sliding windows may have a size (e.g., a parameter).
  • the size of sliding windows may be determined and/or received, e.g., given by users (customized).
  • the quantity of sliding windows (e.g., another parameter) may be determined as follows:
• Quantity of sliding windows = the total time length to monitor / the given size of the sliding windows.
  • the total time length to monitor may refer to the total length of time series to analyze using a plurality of sliding windows.
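The formula above can be written as a small helper (assuming both durations are expressed in the same unit, e.g., seconds):

```python
def window_quantity(total_time_to_monitor, window_size):
    """Quantity of sliding windows = total time to monitor / window size."""
    if window_size <= 0:
        raise ValueError("window size must be positive")
    return total_time_to_monitor // window_size

# Example: monitoring 24 hours with user-given 2-hour windows.
print(window_quantity(24 * 3600, 2 * 3600))  # → 12
```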
• an ML model is deployed via one or more organized units working together as an emulator machine.
• a digital twin emulator machine can make a digital copy of the physical world and/or be a tool to deeply emulate device sets and/or detect what can potentially happen.
• the tool may be an important tool for online verification. Online verification may refer to detecting potential anomalies (e.g., knowing what can happen after running the physical system) without disturbing the running physical system.
• generative learning models can perform data sampling from the trained model (e.g., so that it always generates a data set that complies with the pattern of the input data, although optionally with variations in one or more embodiments).
  • the more data is generated the more likely potential red-flag events may be exposed.
• Embodiment A1. An emulator device configured to communicate with a plurality of internet of things, IoT, devices, the emulator device configured to, and/or comprising a communication interface and/or comprising processing circuitry configured to: receive data associated with the plurality of IoT devices; generate variational data based on the received data, the variational data including high-dimensional time-series data sets; train a machine learning, ML, model using the variational data; and use the trained ML model to perform at least one detection.
• IoT internet of things
• Embodiment A2. The emulator device of Embodiment A1, wherein the using of the trained ML model to perform at least one detection includes: generating data for emulation using the trained ML model based at least on a time-sliding window of time-series data, the generated data indicating at least one emulated outcome; and determining whether the emulated outcome indicates an anomaly.
• Embodiment A3. The emulator device of Embodiment A1, wherein the ML model is a Feature-selective Variational Generator for High-dimensional Time Series, FVGHT, model.
• Embodiment A4. The emulator device of Embodiment A1, wherein the processing circuitry is further configured to perform feature selection to remove at least one of redundant data and irrelevant data from the received data.
• Embodiment A5. The emulator device of Embodiment A1, wherein the variational data follows a probabilistic distribution associated with the received data.
• Embodiment B1. A method implemented in a network node that is configured to communicate with a wireless device, the method comprising: receiving data associated with a plurality of IoT devices; generating variational data based on the received data, the variational data including high-dimensional time-series data sets; training a machine learning, ML, model using the variational data; and using the trained ML model to perform at least one detection.
• Embodiment B2. The method of Embodiment B1, further comprising: generating data for emulation using the trained ML model based at least on a time-sliding window of time-series data, the generated data indicating at least one emulated outcome; and determining whether the emulated outcome indicates an anomaly.
• Embodiment B3. The method of Embodiment B1, wherein the ML model is a Feature-selective Variational Generator for High-dimensional Time Series, FVGHT, model.
• Embodiment B4. The method of Embodiment B1, further comprising performing feature selection to remove at least one of redundant data and irrelevant data from the received data.
• Embodiment B5. The method of Embodiment B1, wherein the variational data follows a probabilistic distribution associated with the received data.
• the concepts described herein may be embodied as a method, data processing system, computer program product and/or computer storage media storing an executable computer program. Accordingly, the concepts described herein may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects, all generally referred to herein as a “circuit” or “module.” Any process, step, action and/or functionality described herein may be performed by, and/or associated with, a corresponding module, which may be implemented in software and/or firmware and/or hardware. Furthermore, the disclosure may take the form of a computer program product on a tangible computer usable storage medium having computer program code embodied in the medium that can be executed by a computer. Any suitable tangible computer readable medium may be utilized including hard disks, CD-ROMs, electronic storage devices, optical storage devices, or magnetic storage devices.
  • These computer program instructions may also be stored in a computer readable memory or storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
  • Computer program code for carrying out operations of the concepts described herein may be written in an object oriented programming language such as Python, Java® or C++.
  • the computer program code for carrying out operations of the disclosure may also be written in conventional procedural programming languages, such as the "C" programming language.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer.
  • the remote computer may be connected to the user's computer through a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • LAN local area network
  • WAN wide area network
  • Internet Service Provider for example, AT&T, MCI, Sprint, EarthLink, MSN, GTE, etc.
• GRU gated recurrent unit; IoT internet of things; VAE variational autoencoder

Abstract

According to one aspect, an emulator device (12) configured to communicate with a plurality of devices (14) is described. The emulator device (12) comprises processing circuitry (40) configured to determine high-dimensional time series data based on data associated with the plurality of devices (14); apply a model to the high-dimensional time series data based on a determined feature selection, where the applied model inherits at least one characteristic of a variational autoencoder (VAE); train the model, using online batch-based machine learning, based at least in part on the high-dimensional time series data; and perform at least one action using the trained model.

Description

FEATURE SELECTIVE AND GENERATIVE DIGITAL TWIN EMULATOR
MACHINE FOR DEVICE VERIFICATION AND ANOMALY CHECKING
TECHNICAL FIELD
The present disclosure relates to machine learning, and in particular, to one or more of feature selection, generative training and verification/anomaly detection.
BACKGROUND
IoT Device Platform
Internet of Things (IoT) is regarded as an enabler for the digital transformation of consumers and industries. Monitoring of parameters such as facility status (e.g., various equipment, manufacturing, logistics, autonomous vehicles, etc.), environments, and industrial processes is one of the most essential components of the digital transformation. Monitoring such as continuous monitoring is achieved by employing a variety of sensors (e.g., a large variety of sensors) in an environment.

Further, IoT may be associated with a predetermined amount (e.g., a massive amount) of physical devices and data, which provide input for digital twin models. Data collected by IoT sensors may be used to create digital models for simulating equipment, a system, and even the surroundings. Such digital twin (DT) models can reveal individual, point-in-time problems, as well as utilize the data to predict one or more probabilities of issues that may happen in the near future. DT models may be generated automatically and may take into consideration one or more possibilities in different conditions and for different purposes.

One use case of IoT-related industrial applications is to emulate the surroundings and/or the system itself for verification of the physical assets and to check for anomalies, e.g., before any deployment changes affect existing physical assets in an IoT platform.
Digital Twin based on IoT
Digital Twin (DT) may refer to a process of mapping physical entities to a digital representation by involving IoT, e.g., to enable continuous communication between physical and digital twins. Besides the physical twins and digital twins (DT), one part of DT is data that is collected from the physical entities. The data provides the ability to describe the physical entity as realistically as required, e.g., required by a use case. The physical entity may be described using the DT model built as its digital representation. This is where the communication between physical and digital twins comes into focus. In addition, the role of IoT may be used to enrich and automate data collection and improve decision-making for the operation of physical entities. Using these interconnections (i.e., communication between physical and digital twins), the identified elements of physical entities can be addressable and stored to enable immediate usage or historical usage of received information. Historical data may be used in some future emulation.
FIG. 1 is a diagram of an example emulator in an IoT-based digital twin platform. In this example, DT relies on an IoT platform to manage the operations by sharing and reusing resources (data, devices, models, etc.) across different use cases. IoT-based DT (as shown in FIG. 1) is created on top of IoT, which at the same time provides intelligent capabilities (such as emulation) for various industrial applications, e.g., so that resources (hardware resources, data resources, etc.) can be efficiently managed/reused/shared across various use cases. That further implies that:
• The models (e.g., emulator and models for other functionalities) are reusable to serve various industrial applications and adaptable for various use cases;
• The model management is conducted by DT operations, including the processes, running loops, and components;
Further, an IoT-based device platform may be configured to generate data by capturing environment status via devices, together with status data of hardware of the platform. However, at least part of the data collected from the environment may be redundant, such as some double-measured parameters and parameters not usable for predetermined objectives. Also, the data may not be relevant to the goal of an emulation; e.g., the noise level may not relate to product quality. In addition, irrelevant or redundant data may increase computation costs and/or even decrease performance quality.
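One illustrative way such redundancy (e.g., double-measured parameters) could be screened out is by correlation analysis. The numpy sketch below is not from the disclosure; the sensor names and the 0.98 threshold are invented for the example:

```python
import numpy as np

def drop_redundant(X, names, threshold=0.98):
    """Keep one representative of each group of near-duplicate columns.

    A column is dropped when its absolute Pearson correlation with an
    already-kept column exceeds `threshold` (a double-measured parameter).
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

rng = np.random.default_rng(0)
temp = rng.normal(20.0, 2.0, 500)           # temperature sensor A
temp_dup = temp + rng.normal(0, 0.01, 500)  # sensor B double-measures A
noise = rng.normal(0.0, 1.0, 500)           # unrelated channel
X = np.column_stack([temp, temp_dup, noise])

result = drop_redundant(X, ["temp_a", "temp_b", "ambient_noise"])
print(result)  # the double-measured channel temp_b is removed
```

A screening step like this would run before any model training, so the generative stage never sees the duplicated channel.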
In addition, some systems for generating digital twin models from IoT for emulation may include one or more common features in the following aspects: (1) different digital twin modeling (i.e., more than one DT model) is applied based on different domain knowledge to solve a diversity of needs, e.g., only relating to a specific asset (such as the production line of certain products) which may be exposed to multiple condition changes.
(2) different digital twin modeling solutions may be proprietary for specific use cases and may be determined based on proprietary knowledge previously gained from the specific scenario and use cases.
In sum, existing IoT DT systems provide unnecessary data, perform unnecessary computations, are limited to specific assets, and may be limited by the use of proprietary elements.
US 20210141870 Al discloses a creation of a digital twin from a mechanical model. An industrial CAD system is supplemented with features that allow a developer to easily convert a mechanical CAD model of an automation system to a dynamic digital twin capable of simulation within a simulation platform. The features allow the user to label selected elements of a mechanical CAD drawing with “aspects” within the CAD environment. These aspect markups label the selected mechanical elements as being specific types of industrial assets or control elements. Based on these markups, the CAD platform associates mechatronic metadata with the selected elements based on the type of aspect with which each element is labeled. This mechatronic metadata defines the behavior (e.g., movements, speeds, forces, etc.) of the selected element within the context of a virtual simulation, transforming the mechanical CAD model into a dynamic digital twin that can be exported to a simulation and testing platform.
SUMMARY
An object of the invention is to enable improved device verification and anomaly detection.
Some embodiments advantageously provide methods, systems, and apparatuses for one or more of feature selection, generative training and verification/anomaly detection.
In some embodiments, DT models are based on knowledge representation, which can support reasoning (e.g., to answer "what-if").
• Models for DT may be reusable and flexible so that the operation and creation of the framework can have minimal dependency on human expertise.
Device Checking
Device verification and anomaly checking may be one use case that is served by running the models providing certain functionalities, namely an emulator.
Device verification and anomaly checking before the physical assets are hard settled (i.e., installed and/or in production and/or in use) plays a role in domains such as automotive and smart manufacturing, as well as in communication networks (e.g., radio access networks). That is, once changes to a hardware deployment are settled, it can be costly to adjust the existing deployment(s). For example, in typical systems, changes to hardware may require frequently taking the system offline for error correction, which may result in a negative customer experience. In one or more embodiments, the deployment may be adjusted while the physical assets are online/settled (i.e., installed and/or in production and/or in use).
In industrial applications, such as networking, automotive, and manufacturing, a powerful emulator may help manage vehicles by enabling verification and anomaly checking, e.g., so that security risks can be avoided before real-life driving tests. Similarly, in the smart manufacturing domain, data gathered by industrial IoT equipment may be consumed by an emulator for predictive maintenance of the production line, e.g., so that anomaly checking and/or verification can happen at an early stage. The data gathered as such may be used to lower downtime and increase production quality after deployment of physical assets.
RAN use case: In Radio Access Networks (RAN), hybrid historical data and live data collected from physical assets/devices may be used to emulate specific key performance indicators (KPIs) that reflect the behavior and performance of a network. In some embodiments, using an emulator (i.e., an emulator device), device verification and anomaly detection may be used to identify potential problems under different conditions which can affect latency and/or throughput, and/or cause packet loss.
Smart manufacturing use case: detecting anomalies, e.g., in manufacturing lines, at an early phase and having risk control on hardware changes are very important. Learning patterns and emulating the patterns for early anomaly detection may be used to minimize losses caused by hardware (e.g., associated with mistaken configurations of hardware). Further, learning and discovering potential risks of hardware changes may be used to support operators to understand possible consequences of each change in detail and/or to avoid decisions on hardware change that may produce undesirable results.
Feature Selection
In one or more embodiments, feature selections may be performed, e.g., based on data collected by devices before loading the feature selections onto generative trainings. The loaded feature selections may avoid using irrelevant data and redundant data for the generative training.
Model for High-Dimensional Time Series Data
In one or more embodiments, an emulator device (e.g., a feature-selective and generative DT emulator) for device verification and anomaly detection based on IoT is provided. The emulator (i.e., the emulator functions provided by the emulator device) is built/configured upon an IoT-based Digital Twin setup using a Feature-selective Variational Generator for High-dimensional Time Series (FVGHT). In some embodiments, FVGHT may use features from a concrete autoencoder (CAE), a variational autoencoder (VAE), and/or a gated recurrent unit (GRU). The emulator may be continuously trained on the collected high-dimensional device data with localized concrete (CONtinuous relaxations of disCRETE random variables) feature selection at the edge. The emulator may be configured to conduct (i.e., determine and/or perform) a global process to continuously approach the probability distribution that the input data obeys.
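The "concrete" relaxation named above (a continuous relaxation of discrete random variables, as used in concrete autoencoders) can be sketched as a Gumbel-softmax sample. The numpy fragment below is illustrative only; the logits and temperature are invented values, and this is not the FVGHT implementation itself:

```python
import numpy as np

def concrete_sample(logits, temperature, rng):
    """Draw one sample from the concrete (Gumbel-softmax) distribution.

    As temperature -> 0 the sample approaches a one-hot vector, i.e.,
    a hard selection of a single input feature; at higher temperatures
    the selection stays soft and differentiable for training.
    """
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = y - y.max()              # subtract max for numerical stability
    expy = np.exp(y)
    return expy / expy.sum()

rng = np.random.default_rng(42)
logits = np.array([0.1, 3.0, 0.2, -1.0])  # learned selector preferences
sample = concrete_sample(logits, temperature=0.1, rng=rng)
print(sample.round(3))  # a near-one-hot weight vector over the features
```

Because the sample stays a proper probability vector, gradients can flow through the selection during training, which is what makes feature selection learnable end-to-end.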
In an embodiment, the mechanism/process performed by the emulator includes performing localized feature selection (e.g., on edge devices) and/or finding (i.e., determining) a model (i.e., an optimized model) which may have a minimum Kullback-Leibler divergence with an input sample (e.g., received continuously).
In another embodiment, the emulator (i.e., emulator device) may be configured to continuously perform data sampling from the trained model. Further, the emulator may be configured to determine/generate one or more data sets obeying (i.e., associated with, corresponding to) the same distribution patterns (e.g., of input samples). The data sets may be based on contributive features so as to emulate input samples with no or limited domain expertise and/or no human intervention.
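As a toy stand-in for sampling from a trained model, the sketch below fits per-feature Gaussians to "observed" readings and draws a fresh data set that obeys the same distribution patterns while sharing no samples with the input. All sensor values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# "Collected" device readings, e.g., temperature and pressure channels.
observed = rng.normal(loc=[20.0, 1013.0], scale=[2.0, 5.0], size=(2000, 2))

# Fit a simple per-feature Gaussian as a stand-in for the trained model ...
mu, sigma = observed.mean(axis=0), observed.std(axis=0)

# ... then emulate: draw a *different* data set from the same distribution.
emulated = rng.normal(mu, sigma, size=(2000, 2))

print(np.allclose(mu, emulated.mean(axis=0), atol=0.5))  # same pattern
print(np.shares_memory(observed, emulated))              # distinct data
```

The emulated set can then be used for verification or anomaly checking without ever replaying the original readings.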
In some embodiments, training units within the emulator also capture the temporal relations between data using Gated Recurrent Units (one difference from other generative methods).
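A single GRU step, written out from the standard update-gate/reset-gate equations, illustrates how the hidden state carries temporal relations between successive data frames. The weights below are random and purely illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, W, U, b):
    """One step of a gated recurrent unit.

    W, U and b stack the update (z), reset (r) and candidate (n)
    parameters along their first axis.
    """
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])        # update gate
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])        # reset gate
    n = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])  # candidate state
    return (1.0 - z) * n + z * h                   # new hidden state

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
W = rng.normal(0.0, 0.1, (3, d_h, d_in))
U = rng.normal(0.0, 0.1, (3, d_h, d_h))
b = np.zeros((3, d_h))

h = np.zeros(d_h)
for _ in range(10):  # unroll over a short time series of data frames
    h = gru_cell(rng.normal(size=d_in), h, W, U, b)
print(h.shape)  # (3,)
```

The update gate z decides how much of the previous hidden state to keep, which is how information from earlier time steps survives into later ones.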
Online, Minimum Dependency on Domain Expertise, Replicability
One or more embodiments described herein provide an emulator configured to receive/collect data (e.g., continuously take data) from IoT devices with feature selection and without disturbing the running physical assets.
In some embodiments, the emulator may be configured to generate another set of data that is different from the original set of data but obeys the same distribution patterns. The generation of the sets of data may be based on contributive features. Such a feature-selective and generative method is driven by data and does not require domain knowledge; therefore, it is highly replicable under different conditions (physical environments, heterogeneous devices, etc.).
According to one aspect, an emulator device configured to communicate with a plurality of devices is described. The emulator device comprises processing circuitry configured to determine high-dimensional time series data based on data associated with the plurality of devices; apply a model to the high-dimensional time series data based on a determined feature selection, the applied model inheriting at least one characteristic of a variational autoencoder (VAE); train the model, using online batch-based machine learning, based at least in part on the high-dimensional time series data; and perform at least one action using the trained model.
In some embodiments, the processing circuitry is further configured to determine the feature selection of the data associated with the plurality of devices, where the determined feature selection indicates at least one relation between at least two dimensions of the data.
In some other embodiments, the processing circuitry is further configured to determine a plurality of sliding windows, based at least on one parameter, to train the model using online-batch learning, where each sliding window of the plurality of sliding windows has a size. In an embodiment, the processing circuitry is further configured to accumulate a plurality of data frames per each sliding window of the plurality of sliding windows. The plurality of data frames is associated with the data associated with the plurality of devices.
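The sliding-window accumulation described in this aspect might be sketched as a generator that buffers incoming data frames and emits each full window as one online training batch. The window size and step are example parameters, not values from the disclosure:

```python
from collections import deque

def sliding_batches(frames, window_size, step):
    """Buffer data frames into fixed-size sliding windows; each full
    window is yielded as one batch for online batch-based training."""
    buf = deque(maxlen=window_size)
    since_emit = 0
    for frame in frames:
        buf.append(frame)
        since_emit += 1
        if len(buf) == window_size and since_emit >= step:
            yield list(buf)
            since_emit = 0

stream = range(8)  # stand-in for a stream of device data frames
batches = list(sliding_batches(stream, window_size=4, step=2))
print(batches)
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7]]
```

Because consecutive windows overlap, each frame can contribute to more than one training batch, which suits continuous online learning from a live stream.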
In another embodiment, applying the model includes generating emulated data based at least in part on the plurality of sliding windows. The emulated data includes reconstructed information from the high-dimensional time series data. At least one set of the emulated data corresponds to one sliding window of the plurality of sliding windows.
In some embodiments, the model is a feature-selective variational generator for high-dimensional time series (FVGHT).
In some other embodiments, the model is applied based at least in part on at least one of a concrete autoencoder, CAE; and a gated recurrent unit, GRU.
In an embodiment, the emulator device further includes a communication interface in communication with the processing circuitry. The communication interface is configured to at least one of: transmit (e.g., make available, expose, share) emulated data to at least one device of the plurality of devices; and transmit signaling to cause at least one device of the plurality of devices to perform the at least one action.
In another embodiment, performing the at least one action includes at least one of performing an online device verification of at least one device of the plurality of devices without shutting down the at least one device; and determining an anomaly associated with the at least one device.
In some embodiments, at least one of the plurality of devices is an internet of things (IoT) device.
In some other embodiments, the data associated with the plurality of devices includes data generated by at least one of a physical entity and a digital entity.
In an embodiment, the emulator device is a digital twin emulator.
According to another aspect, a method performed by an emulator device configured to communicate with a plurality of devices is described. The method comprises determining high-dimensional time series data based on data associated with the plurality of devices; applying a model to the high-dimensional time series data based on a determined feature selection, the applied model inheriting at least one characteristic of a variational autoencoder (VAE); training the model, using online batch-based machine learning, based at least in part on the high-dimensional time series data; and performing at least one action using the trained model.
In some embodiments, the method further includes determining the feature selection of the data associated with the plurality of devices, where the determined feature selection indicates at least one relation between at least two dimensions of the data.
In some other embodiments, the method further includes determining a plurality of sliding windows based at least on one parameter, to train the model using online-batch learning. Each sliding window of the plurality of sliding windows has a size.
In an embodiment, the method further includes accumulating a plurality of data frames per each sliding window of the plurality of sliding windows. The plurality of data frames is associated with the data associated with the plurality of devices.
In another embodiment, applying the model includes generating emulated data based at least in part on the plurality of sliding windows. The emulated data includes reconstructed information from the high-dimensional time series data. At least one set of the emulated data corresponds to one sliding window of the plurality of sliding windows.
In some embodiments, the model is a feature-selective variational generator for high-dimensional time series (FVGHT).
In some other embodiments, the model is applied based at least in part on at least one of a concrete autoencoder, CAE; and a gated recurrent unit, GRU.
In an embodiment, the method further includes at least one of transmitting (e.g., making available, exposing, sharing) emulated data to at least one device of the plurality of devices; and transmitting signaling to cause at least one device of the plurality of devices to perform the at least one action.
In another embodiment, performing the at least one action includes at least one of performing an online device verification of at least one device of the plurality of devices without shutting down the at least one device; and determining an anomaly associated with the at least one device. In some embodiments, at least one of the plurality of devices is an internet of things (IoT) device.
In some other embodiments, the data associated with the plurality of devices includes data generated by at least one of a physical entity and a digital entity.
In an embodiment, the emulator device is a digital twin emulator.
According to one aspect, a computer program is described. The computer program comprises instructions which, when executed on processing circuitry of an emulator device, cause the processing circuitry to carry out any method of the present disclosure.
According to another aspect, a computer program product is described. The computer program product is stored on a computer storage medium and comprises instructions that, when executed by processing circuitry of an emulator device, cause the emulator device to perform any method of the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
A more complete understanding of the present embodiments, and the attendant advantages and features thereof, will be more readily understood by reference to the following detailed description when considered in conjunction with the accompanying drawings wherein:
FIG. 1 is a diagram of an emulator in an IoT-based digital twin platform;
FIG. 2 is a schematic diagram of an example system according to principles disclosed herein;
FIG. 3 is a block diagram of various elements in the system according to some embodiments of the present disclosure;
FIG. 3 is a flowchart of an example process in emulator device according to some embodiments of the present disclosure;
FIG. 4 is a flowchart of another example process in emulator device according to some embodiments of the present disclosure;
FIG. 5 is a diagram of an example emulator in an IoT-based digital twin platform according to some embodiments of the present disclosure;
FIG. 6 is a diagram of an example feature-selective generative ML algorithm (FVGHT) for high-dimensional time-series data according to some embodiments of the present disclosure;
FIG. 7 is a diagram of various example steps performed by the emulator according to some embodiments of the present disclosure;
FIG. 8 is a diagram of example feature selection units of FVGHT for high-dimensional time-series data according to some embodiments of the present disclosure;
FIG. 9 is a diagram of example generative units of FVGHT for high-dimensional time-series data according to some embodiments of the present disclosure;
FIG. 10 is a diagram of example time series units of FVGHT for high-dimensional time series data (e.g., the decoder part) according to some embodiments of the present disclosure;
FIG. 11 is a diagram of example time series units of FVGHT for high-dimensional time series data (e.g., the encoder part) according to some embodiments of the present disclosure;
FIG. 12 is a diagram of an example implementation of emulator machine/device based on distributed processor units according to some embodiments of the present disclosure;
FIG. 13 is a diagram of an example emulator device (e.g., emulator machine) with point-to-point communication between processors and with an external broker according to some embodiments of the present disclosure;
FIG. 14 is a flowchart of an example process implemented by the emulator device according to some embodiments of the present disclosure; and
FIG. 15 is a diagram of an example orchestration of an emulator according to some embodiments of the present disclosure.
DETAILED DESCRIPTION
In some embodiments, emulation tools (i.e., an emulator device) are provided.
The emulator device may be configured to generate (e.g., dynamically generate) one or more models such as for verification of physical assets and/or detection of potential anomalies in various conditions.
In some other embodiments, the emulator may be configured to select features such as impactive features and remove the irrelevant and redundant features. In one or more embodiments, selecting and/or removing may be important steps for the emulator to be robust.
Unlike typical approaches, when conducting/performing emulation (based on which one can perform device verification and anomaly checking) using heterogeneous IoT devices/modules online, the emulator (i.e., emulator device) may be configured to fulfill/provide the following:
• Device verification and anomaly checking may be conducted online without physical impact on the existing physical assets;
• Prior domain knowledge (training labels, useful data models) regarding collected data may be sparse due to various combinations of devices and changes in processes (limited domain expertise); therefore, the solution may have minimized dependency on prior domain expertise.
• The solution may be replicable and reusable for different use cases. That is, the solution (including the algorithms and other components) may be applicable to different use cases (with different data sets, different requirements) without introducing changes (e.g., significant changes).
• Data processing algorithms may be diverse enough so that predetermined potential conditions and environment assumptions (that have not happened before) can be included in a model;
• Feature selection may be used so that irrelevant and redundant data can be fixed to increase the quality of the data generated by the emulator;
• The model may be light weighted for edge devices and reduce computation costs (e.g., energy and time).
Typical solutions cannot meet the requirements/conditions stated above.
However, in one or more embodiments, the emulator (and corresponding features) can be configured to meet any of these requirements/conditions. These and other problems (limitations) of some other systems can be summarized as follows: (1) The created models are highly domain-specific: one solution for one case, hard to reuse;
(2) Models need manual intervention to be adapted using domain knowledge;
(3) Limited conditions and situations are included; and
(4) Irrelevant and redundant data are loaded for creating digital twin models during the data generation process.
For example, a first system provides for the automatic creation of three-dimensional (3D) digital twin models of some mechanical parts that can be imported into a simulation platform and simulated. Compared to one or more embodiments of the present disclosure, the first system creates an exact model of devices (e.g., the real thing), while in one or more embodiments of the present disclosure, the emulator device may be configured to automatically create different variations of the model states and simulate the different variations. One limitation of the first system is thus a focus on a 3D model that can be used in a simulator but cannot provide "what-if" scenarios for the model.
A second system relates to a direct connection between the robot program and a simulation platform so that the simulation can be performed on a 3D model with the real robot logic. Compared to one or more embodiments of the present disclosure, the second system simulates the exact behavior, while one or more embodiments of the present disclosure diversify it to explore different situations and possibilities. The limitation of the second system is thus the inability to simulate "what-if" scenarios that are similar but not the same as the historical situations; rather, the second system only emulates the behavior of real things in the current situation.
A third system relates to a digital twin based solution for preventive maintenance of vehicles by simulating future trips and predicting car component failures during those trips, as well as suggesting car mechanic stops along the way. The third system utilizes historical data for simulation. The limitation of the third system is that it utilizes historical journeys as is, without expanding the parametrization space, and thus not accounting for different but very similar scenarios. To further compare the third system with one or more embodiments of the present disclosure, the third system uses expert models for simulating component failures, while one or more embodiments of the present disclosure use a data-driven approach by applying machine learning and are thus able to explore a much wider problem space.
A fourth system utilizes a variational autoencoder (VAE) including an encoder network, prior network and a decoder network to train on a set of images used as input data. The technique also includes training the VAE by updating one or more parameters of the VAE based on one or more outputs produced by the VAE from the training dataset. The constructed VAE provides output in the form of new images that reflect one or more visual attributes associated with the set of training images by applying the decoder network to one or more values sampled from the second distribution of latent variables generated by the prior network. Compared to one or more embodiments of the present disclosure, the fourth system uses a hierarchical variational autoencoder to generate new images similar to the input ones, while one or more embodiments of the present disclosure use concrete autoencoders to extract features and hierarchical autoencoders to emulate real-world conditions represented in the form of time series data, such as sensor telemetry data.
A fifth system relates to compressing images using neural networks. The fifth system utilizes variational autoencoders (VAE), namely an encoder to process the image and provide latent variables as an output, based on which the generative neural networks can recreate the original image. The latent variables represent features of the image and are used to condition the generative neural networks. Compared to one or more embodiments of the present disclosure, the fifth system uses VAE to reproduce the input image from a compressed version, while one or more embodiments of the present disclosure emulate real-world conditions represented in a form of time-series data that are not the same as the observed ones, while still being realistic.
A sixth system uses autoencoders including an encoder neural network and a decoder neural network to generate music. The encoder is configured to receive an input audio waveform and, in response, provide an embedding descriptive of the input audio waveform. The decoder is configured to receive the embedding and at least a portion of the input audio waveform and, in response, predict the next sequential audio sample for the input audio waveform. The operations include evaluating a loss function that compares the next sequential audio sample predicted by the decoder neural network to a ground-truth audio sample associated with the input audio waveform. The operations include adjusting one or more parameters of the autoencoder model to improve the loss function. Unlike a traditional synthesizer which generates audio from hand-designed components like oscillators and wavetables, the neural synthesizer model can use deep neural networks to generate sounds at the level of individual samples. Learning directly from data, the neural synthesizer model can provide intuitive control over timbre and dynamics and enable exploration of new sounds that would be difficult or impossible to produce with a hand-tuned synthesizer. Compared to one or more embodiments of the present disclosure, the sixth system applies autoencoders to music creation, focusing on synthesizing new sounds, while one or more embodiments of the present disclosure apply VAE to emulate variable real-world conditions represented as time-series data, such as telemetry data, in a manner that the emulated data represents realistic real-world conditions.
Before describing in detail exemplary embodiments, it is noted that the embodiments reside primarily in combinations of apparatus components and processing steps related to machine learning and in particular to one or more of feature selection, generative training and verification/anomaly detection. Accordingly, components have been represented where appropriate by conventional symbols in the drawings, showing only those specific details that are pertinent to understanding the embodiments so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.
As used herein, relational terms, such as “first” and “second,” “top” and “bottom,” and the like, may be used solely to distinguish one entity or element from another entity or element without necessarily requiring or implying any physical or logical relationship or order between such entities or elements. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the concepts described herein. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes” and/or “including” when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In embodiments described herein, the joining term, “in communication with” and the like, may be used to indicate electrical or data communication, which may be accomplished by physical contact, induction, electromagnetic radiation, radio signaling, infrared signaling or optical signaling, for example. One having ordinary skill in the art will appreciate that multiple components may interoperate, and modifications and variations are possible of achieving the electrical and data communication.
In some embodiments described herein, the term “coupled,” “connected,” and the like, may be used herein to indicate a connection, although not necessarily directly, and may include wired and/or wireless connections.
Note further, that functions described herein as being performed by an emulator device may be distributed over a plurality of devices. In other words, it is contemplated that the functions of the emulator device described herein are not limited to performance by a single physical device and, in fact, can be distributed among several physical devices.
In some embodiments, the term “determining” (and/or to determine”) may refer to (without being limited to) at least one of obtaining, calculating, deriving, detecting, measuring, generating, learning, developing, deciding, controlling, regulating, dictating, discovering, establishing, setting, completing, etc. In a nonlimiting example, determining data such as high-dimensional time series data may include learning and/or generating data such as high-dimensional time series data.
In some embodiments, the general description elements in the form of “one of A and B” corresponds to A or B. In some embodiments, at least one of A and B corresponds to A, B or AB, or to one or more of A and B. In some embodiments, at least one of A, B and C corresponds to one or more of A, B and C, and/or A, B, C or a combination thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms used herein should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Some embodiments are directed to one or more of feature selection, generative training and verification/anomaly detection.
Referring again to the drawing figures, in which like elements are referred to by like reference numerals, there is shown in FIG. 2 a schematic diagram of a system 10 that includes an emulator device 12 (also referred to as an emulator) in communication with one or more internet of things (IoT) devices 14 via one or more networks (e.g., wireless network, 3GPP based network, WAN, LAN, etc.). Emulator device 12 includes hardware 16 enabling it to communicate with one or more entities in system 10 such as with IoT device 14. The hardware 16 may include a communication interface 18 for setting up and maintaining at least communications (e.g., wireless, wired, etc.) with one or more elements in system 10 such as with IoT device 14. IoT device 14 is described as an IoT device as a matter of convenience and to aid understanding of the principles disclosed herein. It is understood that the principles described herein may apply to other devices (e.g., network devices) and elements and are not limited solely and exclusively to IoT devices. IoT device 14 may be a device for use in one or more application domains, these domains comprising, but not limited to, home, city, wearable technology, extended reality, industrial application, and healthcare.
By way of example, the IoT device 14 for a home, an office, a building or an infrastructure may be a baking scale, a coffee machine, a grill, a fridge, a refrigerator, a freezer, a microwave oven, an oven, a toaster, a water tap, a water heater, a water geyser, a sauna, a vacuum cleaner, a washer, a dryer, a dishwasher, a door, a window, a curtain, a blind, an item of furniture, a light bulb, a fan, an air-conditioner, a cooler, an air purifier, a humidifier, a speaker, a television, a laptop, a personal computer, a gaming console, a remote control, a vent, an iron, a steamer, a pressure cooker, a stove, an electric stove, a hair dryer, a hair styler, a mirror, a printer, a scanner, a photocopier, a projector, a hologram projector, a 3D printer, a drill, a hand-dryer, an alarm clock, a clock, a security camera, a smoke alarm, a fire alarm, a connected doorbell, an electronic door lock, a lawnmower, a thermostat, a plug, an irrigation control device, a flood sensor, a moisture sensor, a motion detector, a weather station, an electricity meter, a water meter, and a gas meter.
By further ways of example, the IoT device 14 for use in a city, urban, or rural areas may be connected street lighting, a connected traffic light, a traffic camera, a connected road sign, an air control/monitor, a noise level detector, a transport congestion monitoring device, a transport controlling device, an automated toll payment device, a parking payment device, a sensor for monitoring parking usage, a traffic management device, a digital kiosk, a bin, an air quality monitoring sensor, a bridge condition monitoring sensor, a fire hydrant, a manhole sensor, a tarmac sensor, a water fountain sensor, a connected closed circuit television, a scooter, a hoverboard, a ticketing machine, a ticket barrier, a metro rail, a metro station device, a passenger information panel, an onboard camera, and other connected device on a public transport vehicle.
As a further way of example, the IoT device 14 may be a wearable device, or a device related to extended reality, wherein the device related to extended reality may be a device related to augmented reality, virtual reality, merged reality, or mixed reality. Examples of such IoT devices may be a smart-band, a tracker, a haptic glove, a haptic suit, a smartwatch, clothes, eyeglasses, a head mounted display, an ear pod, an activity monitor, a fitness monitor, a heart rate monitor, a ring, a key tracker, a blood glucose meter, and a pressure meter. As further ways of example, the IoT device 14 may be an industrial application device, wherein an industrial application device may be an industrial unmanned aerial vehicle, an intelligent industrial robot, a vehicle assembly robot, and an automated guided vehicle.
As further ways of example, the IoT device 14 may be a transportation vehicle, wherein a transportation vehicle may be a bicycle, a motor bike, a scooter, a moped, an auto rickshaw, a rail transport, a train, a tram, a bus, a car, a truck, an airplane, a boat, a ship, a ski board, a snowboard, a snow mobile, a hoverboard, a skateboard, roller-skates, a vehicle for freight transportation, a drone, a robot, a stratospheric aircraft, an aircraft, a helicopter and a hovercraft. As further ways of example, the IoT device 14 may be a health or fitness device, wherein a health or fitness device may be a surgical robot, an implantable medical device, a non-invasive medical device, and a stationary medical device which may be: an in-vitro diagnostic device, a radiology device, a diagnostic imaging device, and an x-ray device.
In the embodiment shown, the hardware 16 of the emulator device 12 further includes processing circuitry 20. The processing circuitry 20 may include a processor 22 and a memory 24. In particular, in addition to or instead of a processor, such as a central processing unit, and memory, the processing circuitry 20 may comprise integrated circuitry for processing and/or control, e.g., one or more processors and/or processor cores and/or FPGAs (Field Programmable Gate Array) and/or ASICs (Application Specific Integrated Circuitry) adapted to execute instructions. The processor 22 may be configured to access (e.g., write to and/or read from) the memory 24, which may comprise any kind of volatile and/or nonvolatile memory, e.g., cache and/or buffer memory and/or RAM (Random Access Memory) and/or ROM (Read-Only Memory) and/or optical memory and/or EPROM (Erasable Programmable Read-Only Memory).
Thus, the emulator device 12 further has software 26 stored internally in, for example, memory 24, or stored in external memory (e.g., database, storage array, network storage device, etc.) accessible by the emulator device 12 via an external connection. The software 26 may be executable by the processing circuitry 20. The processing circuitry 20 may be configured to control any of the methods and/or processes described herein and/or to cause such methods and/or processes to be performed, e.g., by emulator device 12. Processor 22 corresponds to one or more processors 22 for performing emulator device 12 functions described herein. The memory 24 is configured to store data, programmatic software code and/or other information described herein. In some embodiments, the software 26 may include instructions that, when executed by the processor 22 and/or processing circuitry 20, cause the processor 22 and/or processing circuitry 20 to perform the processes described herein with respect to emulator device 12. For example, processing circuitry 20 of the emulator device 12 may include digital twin (DT) unit 28 which is configured to perform one or more emulator/emulator device 12 functions described herein such as with respect to one or more of feature selection, generative training and verification/anomaly detection.
The system 10 further includes the IoT device 14 already referred to. The IoT device 14 may have hardware 30 that may include a communication interface 32 configured to set up and maintain communication with one or more elements in system 10 such as with emulator device 12 and/or other IoT devices 14.
The hardware 30 of the IoT device 14 further includes processing circuitry 34. The processing circuitry 34 may include a processor 36 and memory 38. In particular, in addition to or instead of a processor, such as a central processing unit, and memory, the processing circuitry 34 may comprise integrated circuitry for processing and/or control, e.g., one or more processors and/or processor cores and/or FPGAs (Field Programmable Gate Array) and/or ASICs (Application Specific Integrated Circuitry) adapted to execute instructions. The processor 36 may be configured to access (e.g., write to and/or read from) memory 38, which may comprise any kind of volatile and/or nonvolatile memory, e.g., cache and/or buffer memory and/or RAM (Random Access Memory) and/or ROM (Read-Only Memory) and/or optical memory and/or EPROM (Erasable Programmable Read-Only Memory). IoT device 14 may include one or more sensors 40 for generating sensor data that may be communicated to emulator device 12 as described herein. IoT device 14 may communicate data other than sensor data to emulator device 12 as described herein.
Thus, the IoT device 14 may further comprise software 42, which is stored in, for example, memory 38 at the IoT device 14, or stored in external memory (e.g., database, storage array, network storage device, etc.) accessible by the IoT device 14. The software 42 may be executable by the processing circuitry 34.
The processing circuitry 34 may be configured to control any of the methods and/or processes described herein and/or to cause such methods and/or processes to be performed, e.g., by IoT device 14. The processor 36 corresponds to one or more processors 36 for performing IoT device 14 functions described herein. The IoT device 14 includes memory 38 that is configured to store data, programmatic software code and/or other information described herein. In some embodiments, the software 42 may include instructions that, when executed by the processor 36 and/or processing circuitry 34, cause the processor 36 and/or processing circuitry 34 to perform the processes described herein with respect to IoT device 14.
In some embodiments, the inner workings of the emulator device 12 and IoT device 14 may be as shown in FIG. 2, independently of the surrounding network topology.
Although FIG. 2 shows DT unit 28 as being within a respective processor, it is contemplated that this unit may be implemented such that a portion of the unit is stored in a corresponding memory within the processing circuitry. In other words, the unit may be implemented in hardware or in a combination of hardware and software within the processing circuitry. Further, one or more processors, processing circuitry and/or other elements in FIG. 2 may be distributed in one or more devices.
FIG. 3 is a flowchart of an example process in an emulator device 12 according to one or more embodiments of the present disclosure. One or more blocks described herein may be performed by one or more elements of emulator device 12 such as by one or more of processing circuitry 20 (including the DT unit 28), processor 22, and/or communication interface 18. Emulator device 12 is configured to receive (Block S100) data associated with the plurality of IoT devices 14, as described herein. Emulator device 12 is configured to generate (Block S102) variational data based on the received data, where the variational data includes high-dimensional time-series data sets, as described herein. Emulator device 12 is configured to train (Block S104) a machine learning, ML, model using the variational data, as described herein. Emulator device 12 is configured to use (Block S106) the trained ML model to perform at least one detection, as described herein. According to one or more embodiments, the using of the trained ML model to perform at least one detection includes: generating data for emulation using the trained ML model based at least on a time-sliding window of time-series data, the generated data indicating at least one emulated outcome, and determining whether the emulated outcome indicates an anomaly. According to one or more embodiments, the ML model is a Feature-selective Variational Generator for High-dimensional Time Series, FVGHT, model. According to one or more embodiments, the processing circuitry is further configured to perform feature selection to remove at least one of redundant data and irrelevant data from the received data. According to one or more embodiments, the variational data follows a probabilistic distribution associated with the received data.
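The flow of Blocks S100 through S106 can be sketched as follows. This is a minimal illustrative sketch, not the disclosed FVGHT: a simple Gaussian summary stands in for the trained ML model, and all function names and the z-score threshold are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_variational(data, n_samples):
    # Block S102 (stand-in): draw new samples following the empirical
    # probabilistic distribution of the received data.
    mu, sigma = data.mean(axis=0), data.std(axis=0) + 1e-9
    return rng.normal(mu, sigma, size=(n_samples, data.shape[1]))

def train_model(variational_data):
    # Block S104 (stand-in): "train" by summarizing the distribution.
    return variational_data.mean(axis=0), variational_data.std(axis=0) + 1e-9

def is_anomaly(model, observation, z_threshold=4.0):
    # Block S106: flag an observation far from the learned distribution.
    mu, sigma = model
    return bool(np.any(np.abs((observation - mu) / sigma) > z_threshold))

# Block S100: data "received" from devices (simulated here as 500 frames
# of 3 sensor readings around 20.0, e.g., temperatures).
received = rng.normal(20.0, 1.0, size=(500, 3))
variational = generate_variational(received, 1000)   # Block S102
model = train_model(variational)                     # Block S104
print(is_anomaly(model, received[0]))                # a typical reading
print(is_anomaly(model, np.array([80.0, 20.0, 20.0])))  # a far outlier
```

A typical reading stays below the threshold while the far outlier exceeds it, illustrating how emulated outcomes can be checked for anomalies.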
FIG. 4 is a flowchart of another example process in an emulator device 12 according to one or more embodiments of the present disclosure. One or more blocks described herein may be performed by one or more elements of emulator device 12 such as by one or more of processing circuitry 20 (including the DT unit 28), processor 22, and/or communication interface 18. Emulator device 12 is configured to determine (Block S108) high-dimensional time series data based on data associated with the plurality of devices and/or apply (Block S110) a model to the high-dimensional time series data based on a determined feature selection, the applied model inheriting at least one characteristic (e.g., a typology such as a data-generative typology) of a variational autoencoder (VAE). Emulator device 12 is further configured to train (Block S112) the model, using online batch-based machine learning, based at least in part on the high-dimensional time series data and/or perform (Block S114) at least one action using the trained model.
In some embodiments, the method further includes determining the feature selection of the data associated with the plurality of devices, where the determined feature selection indicates at least one relation between at least two dimensions of the data.
In some other embodiments, the method further includes determining a plurality of sliding windows based on at least one parameter, to train the model using online-batch learning. Each sliding window of the plurality of sliding windows has a size. In an embodiment, the method further includes accumulating a plurality of data frames per each sliding window of the plurality of sliding windows. The plurality of data frames is associated with the data associated with the plurality of devices.
In another embodiment, applying the model includes generating emulated data based at least in part on the plurality of sliding windows. The emulated data includes reconstructed information from the high-dimensional time series data. At least one set of the emulated data corresponds to one sliding window of the plurality of sliding windows.
In some embodiments, the model is a feature-selective variational generator for high-dimensional time series (FVGHT).
In some other embodiments, the model is applied based at least in part on at least one of a concrete autoencoder, CAE; and a gated recurrent unit, GRU.
In an embodiment, the method further includes at least one of transmitting (e.g., making available, exposing, sharing) emulated data to at least one device of the plurality of devices; and transmitting signaling to cause at least one device of the plurality of devices to perform the at least one action.
In another embodiment, performing the at least one action includes at least one of performing an online device verification of at least one device of the plurality of devices without shutting down the at least one device; and determining an anomaly associated with the at least one device.
In some embodiments, at least one of the plurality of devices is an internet of things (IoT) device 14.
In some other embodiments, the data associated with the plurality of devices includes data generated by at least one of a physical entity and a digital entity.
In an embodiment, the emulator device is a digital twin emulator.
Having described the general process flow of arrangements of the disclosure and having provided examples of hardware and software arrangements for implementing the processes and functions of the disclosure, the sections below provide details and examples of arrangements for one or more of feature selection, generative training and verification/anomaly detection.
Some embodiments provide for one or more of feature selection, generative training and verification/anomaly detection, such as based on one or more functions performed by emulator device 12 (also referred to as an emulator, DT emulator, etc.). One or more emulator device 12 functions described below may be performed by one or more of processor 22, DT unit 28, processing circuitry 20, etc. Further, one or more emulator device 12 functions may be distributed such that one or more of these functions are performed in another entity such as a cloud service network, IoT device 14, etc.
A Digital Twin (DT) Emulator
The emulator may operate in an IoT-based DT platform as illustrated in FIG. 5. In particular, in one or more embodiments, the DT platform includes an IoT device platform 60, a DT platform 62 (e.g., including one or more emulators), at least one IoT device 14 (e.g., physical entity), and/or industrial applications 58 (e.g., such as software 26, 42, and/or any other hardware component of system 10). One or more processes may be performed (e.g., by any of the elements/components/devices shown in FIG. 2), where the processes include, without being limited to, a learning process, an emulating process, and an updating process. The learning process may include one or more of the following: determining data and/or transmitting the data from sensor 40 to data model 50; determining structure data and/or transmitting the structure data from data model 50 to knowledge representations 52; determining one or more observations and/or transmitting the one or more observations from knowledge representations 52 to behavior model 54; determining a status summary and/or transmitting the status summary from behavior model 54 to decision model 56; determining one or more actions and/or transmitting the one or more actions from decision model 56 to knowledge representations 52; determining structure data and/or transmitting the structure data from knowledge representations 52 to data model 50; and determining one or more instructions and/or transmitting the one or more instructions to actuator 44, which may then perform one or more actions which may be associated at least with sensor 40.
The emulating process may include one or more actions and/or data transmitted/exchanged between behavior model 54 and decision model 56. The updating process may include one or more of the following: determining emulated behavior and/or transmitting the emulated behavior to industrial applications 58 (e.g., for updating one or more software applications); determining one or more emulated decisions and/or transmitting the one or more emulated decisions to decision model 56; and/or determining one or more actions and/or transmitting the one or more actions to the behavior model 54.
Any of the components/elements shown in FIG. 5 may correspond to and/or be comprised in and/or be performed by any of the components/elements/devices shown in FIG. 2.
In some embodiments, IoT device 14 includes sensor 40 for generating data that is communicated to the emulator device 12. Further, IoT device 14 may include an actuator 44 or element that is configured to receive an instruction and/or perform one or more actions associated with the IoT device 14. In one or more embodiments, emulator device 12 includes one or more functional blocks such as a data model 50, knowledge representation 52 (semantic model), behavior model 54 (emulator), and decision model 56 (emulator), one or more of which may be implemented by DT unit 28 and/or various components of emulator device 12 and/or IoT device 14 and/or by entities in system 10. Further, while FIG. 5 illustrates a specific grouping of elements included in the IoT device 14 and emulator device 12, other groupings are possible in accordance with the teachings of the present disclosure. For example, at least part of the emulator device 12 may be implemented in the IoT device 14 such that the IoT device 14 may perform at least one emulator device 12 function. In one or more embodiments, the emulator includes different models (e.g., a plurality of models or a plurality of machine learning, ML, models), where each model is supported by relevant components.
Several elements of system 10 include:
• One or many devices (e.g., IoT device 14), where each device includes:
o A processing unit (e.g., processing circuitry 34) to process the sensor data and send out the result via a communication unit (e.g., communication interface 32)
■ Optionally, the IoT devices could run only some parts of the method described below, as one or more embodiments compose computations in a distributed manner.
o A communication unit to send the sensor data provided by the sensor unit
■ Optionally, the IoT devices could send the output from a certain processing composition unit (e.g., processing circuitry 34)
o Sensors and a certain sensor unit (collectively referred to as sensor (e.g., sensor 40)) to collect information from the physical environment, for example, regular environment sensors (e.g., temperature, humidity, air pollution, acoustic, sound, vibration), sensors for navigation (e.g., altimeters, gyroscopes, inertial navigators, magnetic compasses), optical items (e.g., light sensors, thermographic cameras, photodetectors), and many other sensor types.
• One or more computing units (e.g., emulator device 12), such as devices (one or many), gateways, various types of apparatus, and/or a computing cloud, which is composed of:
o A processing unit (e.g., processing circuitry 20) with storage/memory (e.g., memory 24):
■ to implement either the whole method at once or to only implement specific steps of the method;
■ to interact with the communication units;
■ to temporarily store the data;
o One or more communication units (e.g., communication interface 18):
■ to collect data from heterogeneous radio nodes and devices via different protocols;
■ to exchange information between intelligence processor units;
■ to expose the data and/or insights to other external systems or other internal modules.
In one or more embodiments, the devices and computing units may be combined to compose sub-components of the DT emulator, including the emulator itself and other external supportive components. In one or more embodiments, the DT emulator functions may be performed by emulator device 12.
The DT emulators (e.g., emulator devices 12) may run/operate with support from some external components:
• Device platform (IoT) may include physical entities (e.g., IoT devices 14) with their sensors (e.g., sensor 40) and actuators 44. The former may include devices providing data in a proprietary format about a physical entity, e.g., a temperature sensor, while the latter represent devices enforcing instructions in the proper format on a physical entity, e.g., an on/off switch. It may be composed of more than one device. In some embodiments, the device platform may be implemented as a separate device from emulator device 12 and/or be in communication with emulator device 12 via a communication link. In other words, in some embodiments, the IoT device platform may be implemented using its own set of computing hardware and software, using the same types of components shown in FIG. 2 with respect to emulator device 12. However, the IoT platform may be in communication with the emulator device 12 such that functions discussed herein with respect to the IoT platform are performed by one computing device, and functions described with respect to the DT platform are performed by emulator device 12.
o Devices (e.g., IoT devices 14) may be located on-premises and/or be implemented as hardware devices in an IoT environment (e.g., a temperature sensor and an on/off switch), or as a piece of software that runs on physical entities (e.g., Java code that collects CPU usage on a Raspberry Pi and can restart it).
• The data model 50 in the device platform may represent the raw data collected from devices in a formalized and structuralized schema so that behavior and decision models with their relevant emulators can use the device data smoothly. It may be composed of one or more computing units.
• Knowledge representations 52 (semantic models) may represent logic and knowledge for a certain use case, which may help ensure that the emulator can be based on logical reasoning and be able to answer questions such as "what-if". It may be composed of one or more computing units.
• Data models 50 may be run/executed by data pre-processing components, which conduct pre-data processing to convert raw data into structured data that can be consumed by the emulator. It may be composed of one or more devices and one or more computing units (e.g., processing circuitry).
o Such preprocessing may first intake raw data and/or form a data model 50. The data model 50 may receive data in a standardized or proprietary format from sensors and/or output uniformly structured data, e.g., in JavaScript Object Notation (JSON) format. Data model 50 may commonly run on-premises as well, as it may require a communication channel that is physically near sensors and actuators in order to communicate with them (e.g., Zigbee, Z-Wave and WiFi modules). For software sensors and actuators (e.g., sensor 40 and/or actuator 44 may be implemented in software and may not be physical entities, i.e., they may be logical entities), the data model 50 can run further on the Cloud (e.g., in a cloud computing network), while being connected through the Internet with network-enabled physical devices (e.g., IoT devices 14).
o Knowledge representations 52 may be generated/determined (e.g., by emulator device 12) from data model 50 and refer to a semantic model which receives uniformly formatted data from the data model and provides a context, using schemas such as Resource Description Framework (RDF), Web Ontology Language (OWL), Next Generation Service Interfaces Linked Data (NGSI-LD), etc.
• Digital Twin (DT) Emulator (e.g., emulator device 12) may include a behavior model 54 (e.g., emulator) and a decision model 56 (e.g., emulator). For time-critical use cases, the behavior and decision models can run/operate on the edge (e.g., the edge of the IoT network, physically and/or logically near one or more IoT devices 14), to keep the round-trip time short. The behavior model 54 (e.g., emulator) and/or decision model 56 (e.g., emulator) can also run on the Cloud. The emulator is based on a digital twin platform. On a southbound interface, the digital twin platform connects to a device platform used for getting and storing both contextual and semantic data, respectively, as well as forwarding actions to actuators. On a northbound interface of the digital twin platform, a user may send user-specific ontology/semantics, along with desired intentions on how the system should behave. One or more interfaces may be provided by one or more of communication interface 18 and communication interface 32.
o Behavior model 54 (e.g., emulator) may be configured to take contextually enhanced observations and/or give back semantically enhanced insights, e.g., it takes multiple temperature readings from a robot arm and informs that the robot arm is overheating. This way, outcomes of certain actions observed as insights can be extrapolated, simulated, and in general planned for future goals. It may be composed of one or more computing units (e.g., processing circuitry 20).
o The actions may be executed by the decision model 56 (e.g., emulator) based on semantically enhanced insights, as well as "intents" from a user, e.g., due to a robot overheating and a user intent to keep a safe environment, it decides to switch off the robot arm. The action may be sent through a semantic model that knows which actuator may need to be switched off. It may be composed of one or more computing units.
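The data model's role of converting raw, proprietary sensor output into uniformly structured JSON can be sketched as below. The raw field layout and the JSON key names are assumptions introduced for illustration; the disclosure does not prescribe a schema.

```python
import json

def to_structured(raw_line):
    # Assumed proprietary sensor format (hypothetical):
    # "<device_id>;<sensor_type>;<value>;<unit>"
    device_id, sensor_type, value, unit = raw_line.split(";")
    # Emit uniformly structured JSON for downstream behavior/decision models.
    return json.dumps({
        "deviceId": device_id,
        "observation": {"type": sensor_type, "value": float(value), "unit": unit},
    }, sort_keys=True)

structured = to_structured("robot-arm-07;temperature;81.5;celsius")
print(structured)
```

A semantic model (knowledge representations 52) could then attach context to this uniform structure, e.g., mapping `deviceId` onto an RDF or NGSI-LD entity.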
The emulator may serve the DT platform via interacting with previously described components in these three loops, namely:
Procedure 1 — Learning Loop: The emulator device 12 may be configured to take data input from the device platform to train the emulator device 12. The emulator may further be configured to retrieve observations and/or insights from the device platform to get/determine conclusions regarding what is happening and/or decide to take "Actions" to optimize the system. During this process, the emulator device 12 may also take business logic as input from the users; the business logic describes reaction/decision policies using an "if" and "then" schema.
Procedure 2 — Emulating Loop: Based on the learned model (from Procedure 1), the emulator generates data for emulation. It creates emulated "Insights" within the "Behavior Model" 54 and sends them to the "Decision Model" 56 to perform decisions and actions. The decisions may be sent as emulated "Actions" to the "Behavior Model" to emulate their outcome in the form of newly created simulated Insights. Those simulated Insights are used for selecting "Actions" to be executed.
Procedure 3 — Updating Loop: The emulator (i.e., emulator device 12) is updated by comparing the result of actions to the given "Intent" to optimize the emulators. It utilizes recorded "performance" (e.g., error rates, etc.) in both the "Learning Loop" and "Emulating Loop" to update the "Behavior Model Emulator" 54 and "Decision Model Emulator" 56.
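The three procedures above can be sketched as a minimal skeleton. The class, method and variable names are illustrative assumptions, and a trivial running mean stands in for the behavior/decision models; the sketch only shows how the Learning, Emulating and Updating Loops hand data to one another.

```python
class DigitalTwinEmulator:
    def __init__(self, intent):
        self.intent = intent      # desired behavior, e.g., a target temperature
        self.model_mean = 0.0     # trivial stand-in for the learned models
        self.errors = []          # recorded "performance" for the Updating Loop

    def learning_loop(self, device_data):
        # Procedure 1: train on data input from the device platform.
        self.model_mean = sum(device_data) / len(device_data)

    def emulating_loop(self):
        # Procedure 2: the behavior model emits an emulated "Insight";
        # the decision model turns it into an emulated "Action".
        insight = self.model_mean
        action = "cool_down" if insight > self.intent else "no_op"
        return insight, action

    def updating_loop(self, observed_outcome):
        # Procedure 3: compare the outcome of actions to the "Intent",
        # record performance, and nudge the model accordingly.
        self.errors.append(abs(observed_outcome - self.intent))
        self.model_mean += 0.1 * (observed_outcome - self.model_mean)

emu = DigitalTwinEmulator(intent=25.0)
emu.learning_loop([24.0, 26.0, 31.0])     # Procedure 1
insight, action = emu.emulating_loop()    # Procedure 2
emu.updating_loop(observed_outcome=27.0)  # Procedure 3
print(insight, action, emu.errors)
```

In a real deployment each method would delegate to the trained behavior and decision model emulators rather than to a running mean.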
The Emulator (e.g., emulator device 12)
In one or more embodiments, the emulator may be composed of the emulator algorithms and a set of computing units running the emulator algorithms, which may be stored in, for example, memory 24, DT unit 28, etc. The emulator runs/operates together with other external components, namely other components in the digital twin platform. However, the emulator itself does not have specific requirements on the DT components; just sending data to the emulator may be enough.
(1) The FVGHT to generate emulated data
The FVGHT may be used to resolve a probabilistic distribution that infinitely approaches the objective data sets, which are multivariable time series data, after conducting feature selection on the high-dimensional data collected from DT physical devices. In the case of a Digital Twin, such data are the ones generated by physical twin counterparts. The FVGHT may be a neural network typology that inherits functionalities from the concrete autoencoder (CAE), variational autoencoder (VAE), and gated recurrent unit (GRU), and performs online training and learning.
• The FVGHT may be used to perform feature selection on the data generated by physical twins, e.g., so that the emulation may be primarily based on contributive/impactive parameters. Moreover, when irrelevant and redundant data are avoided to some extent, this both saves computation resources (such as energy) and lightens the computation burden on edge devices.
• The FVGHT may be a variational data generator and/or inherit the data-generative typology of the VAE, e.g., so that data sets having the same probabilistic features as the data set to be emulated can then be generated;
• The FVGHT may be designed to process time-series data with sliding windows and/or may inherit the typology of GRU networks to handle the continuously generated data from the physical twins. In some embodiments, this feature is important for the Digital Twin emulator, as the temporal relations between different batches of data sets may be reflected during the training. FIG. 6 shows a diagram of an example feature-selective generative ML algorithm (FVGHT) for high-dimensional time-series data. More specifically, generator structure 64 includes collected data from IoT (e.g., physical) devices 14, a feature selection structure 70 (which may determine and/or transmit selected authentic features 72 (e.g., not pseudo features)), a first time series structure 74, VAE Latent 76 (e.g., mean/variation), VAE-DE 78, and a second time series structure 80. Actual data (e.g., a data frame) may be encoded and/or converted and/or reconstructed into the second time series structure 80. Moreover, the FVGHT (as shown in FIG. 6) also has one or more of the following features: feature-selective learning, generative learning, unsupervised learning, and online learning. Those features enable it to continuously generate emulating data sets from the data generated by physical twins, taking high-dimensional time series data as input.
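The combination of CAE-style feature selection, a GRU over the selected time series, and a VAE-style latent can be sketched in plain numpy as below. This is an untrained, forward-pass-only illustration: the layer sizes, the softmax relaxation of the concrete selector, and the single linear decoder are simplifying assumptions, not the disclosed FVGHT architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

D, K, H, Z = 8, 3, 16, 4  # input dims, selected features, GRU hidden, latent

# CAE-style selector: K rows of logits over the D input dimensions; at low
# temperature the softmax approaches a hard one-hot feature selection.
selector_logits = rng.normal(size=(K, D))

def select_features(x_t, temperature=0.1):
    weights = softmax(selector_logits / temperature)  # (K, D)
    return weights @ x_t                              # (K,) selected features

# Minimal GRU cell: update gate z, reset gate r, candidate state.
Wz, Wr, Wh = (rng.normal(scale=0.1, size=(H, K + H)) for _ in range(3))

def gru_step(x_t, h):
    xh = np.concatenate([x_t, h])
    z = sigmoid(Wz @ xh)
    r = sigmoid(Wr @ xh)
    h_tilde = np.tanh(Wh @ np.concatenate([x_t, r * h]))
    return (1 - z) * h + z * h_tilde

# VAE-style latent (mean / log-variance) and a linear decoder.
W_mu = rng.normal(scale=0.1, size=(Z, H))
W_logvar = rng.normal(scale=0.1, size=(Z, H))
W_dec = rng.normal(scale=0.1, size=(D, Z))

def generate(window):
    """Encode a (T, D) sliding window and emit one emulated (D,) frame."""
    h = np.zeros(H)
    for x_t in window:
        h = gru_step(select_features(x_t), h)  # feature-selective recurrence
    mu, logvar = W_mu @ h, W_logvar @ h
    z = mu + np.exp(0.5 * logvar) * rng.normal(size=Z)  # reparameterization
    return W_dec @ z                                     # reconstructed frame

window = rng.normal(size=(20, D))  # one sliding window of time series data
emulated = generate(window)
print(emulated.shape)
```

Training such a generator would additionally require a reconstruction loss plus a KL term on the latent, with the selector temperature annealed over the online batches.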
(2) The components of the emulator
The emulator takes datasets with any of the following features as input:
• Both online and historical high-dimensional time series data, which have no obviously detectable relations between the dimensions of the data sets. The data can be generated by both the physical and digital (e.g., logical) entities in the digital twin framework.
• Both online and historical high-dimensional time-series data, which have detectable relations between the dimensions of the data sets. The data can be generated by both the physical and digital entities in the digital twin framework.
The emulator delivers datasets as output on demand, with one or more of the following features:
• Infinitely approximating the input data sets by infinitely approximating the probabilistic distribution based on contributive/impactive information (parameters);
• Generates data sets by sampling from the generator according to demanded amounts, while each portion of the generated data set is not (e.g., never) identical in actual value.
• The generated data contains variational information learned from contributive/impactive information so that potential/undetectable design faults and anomalies can be detected when the demanded amount of data is large enough (i.e., exceeds a predetermined threshold).
The Steps to use the Emulator
As described herein, a method (including a set of different steps) provides features not provided by existing systems. One or more steps included in the method provide a unique integration of some known technologies and are inventively evolving the known technologies from existing systems. A method can be implemented at one or more computing units (and/or any component of emulator device 12), which may perform one or more steps illustrated in FIG. 7.
Step S200: To trigger the feature reduction and build the models, a specific size of the time window may be defined/determined for accumulating the data. This step may group the data into batches (i.e., mini-batches). The size of the windows (i.e., sliding windows) may be use-case specific and is configurable according to different requirements. Each sliding window outputs a data frame in temporal order with a certain defined interval. The data may be accumulated in memory or persistent storage depending on the requirements.
Step S202: This step comprises the collection of available data from the devices (e.g., IoT devices 14). It may integrate heterogeneous devices (e.g., IoT devices 14) which may use one or multiple different communication protocols. In other words, the emulator is configured to communicate with one or more devices, each having a different communication protocol, i.e., the emulator is configured to support a plurality of communication protocols. Further, the emulator may be configured to collect the data generated by physical twins (i.e., feed the data into the emulator). Once enough data is accumulated (e.g., once the collected data amount meets a predefined threshold) according to the size of the sliding window, step S204 may be triggered. Step S202 may keep running unless the emulator needs to be shut down;
Step S204: continuously train the emulator for feature selection on the edge (if the edge is needed). In one or more embodiments, this step may keep/continue to operate/run unless the emulator needs to be shut down;
Step S206: continuously feed feature-selected data to the non-edge part of the emulator and/or handle any backpropagation from the non-edge part. In one or more embodiments, this step may continue to run unless the emulator needs to be shut down;
Step S208: continuously train the emulator for generating time series and perform backpropagation with the edge part. In one or more embodiments, this step may continue to operate unless the emulator needs to be shut down;
Step S210: provide demands (the amount of data) to the emulator and/or deliver/transmit the emulating data. In one nonlimiting example of this step, the amount of demanded data may not be larger than the total amount of training data fed into the emulator since the emulator started.
The sliding windows (e.g., logical time windows)
The input of the emulator is exposed to the sliding windows for accumulation of data;
Each sliding window accumulates data frames for a certain/predefined amount of time and/or for a certain amount of samples.
The computational tasks progress as the sliding windows move, working on the data accumulated during the sliding windows. For example, the emulator handles (e.g., determines, collects, stores) data generated within 2 hours, where 2 hours is the size of the sliding window. Consequently, the data fed into each training step is time-series data accumulated within 2 hours. After handling the data in the current sliding window, the sliding window moves to take data generated in the 2 hours after the current ones. Such data accumulation provides a basis for concentrating information in the time dimension.
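A count-based version of such a sliding-window accumulator might look as follows. This is a minimal sketch; the class name and the count-based trigger are illustrative assumptions, and a wall-clock variant (e.g., the 2-hour window above) would compare timestamps instead of counting samples.

```python
class SlidingWindow:
    """Accumulates incoming samples and emits one data frame, in temporal
    order, each time `size` samples have arrived."""

    def __init__(self, size):
        self.size = size
        self.buf = []

    def push(self, sample):
        self.buf.append(sample)
        if len(self.buf) == self.size:
            frame, self.buf = self.buf, []
            return frame  # window is full: hand the frame to training
        return None       # keep accumulating
```

Each returned frame would then be fed into one training step, and the emptied buffer models the window moving on to the next interval.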
Feature selection units of the emulator (e.g., processing circuitry 20 and/or DT unit 28 functionality)
In this emulator, processing of the input data is performed in two dimensions, namely the feature dimension and the time dimension, as shown in FIG. 8, which shows feature selection units of the FVGHT for high-dimensional time-series data. In the concrete autoencoder (CAE) units of the feature selection structure 70, data is processed in the feature dimension. The feature selection units (i.e., structure) inherit (i.e., use in part) the architecture of the CAE, which aims to avoid feeding redundant and irrelevant parameters to the variational generator. These units take input from the sliding windows. Unlike other autoencoders, the CAE introduces feature selection layers, where Concrete random variables can be sampled to produce a continuous relaxation of one-hot vectors. The CAE selects discrete features using an embedded method but without regularizations. It uses a relaxation of the discrete random variables over the Concrete distribution, which enables a low-variance estimate of the gradient through discrete stochastic nodes. As the output of these units, the feature-selected data may be fed to the time series generator. Further, both the input and output of these units fulfill time series structures, which may be important. A minimum L2 loss is implemented in the CAE training, which may force the decoded CAE series to be highly close (i.e., close) to the input time series.
This part of the computations may be conducted on the edge (e.g., of the network). The loss introduced in this part can also be used as regularization for the variational time series generation in a global sense.
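The Concrete-distribution sampling described above can be sketched in plain Python. This is an illustrative sketch of the Gumbel-softmax relaxation, not the patent's implementation; the logits, temperature value, and the `select_feature` helper are hypothetical.

```python
import math
import random

def concrete_sample(logits, temperature):
    """Draw one sample from the Concrete distribution: a continuous
    relaxation of a one-hot feature-selection vector.  As the temperature
    approaches 0, the sample approaches a discrete one-hot choice, which
    is how a CAE selection layer picks input features."""
    gumbels = [-math.log(-math.log(random.random() + 1e-12) + 1e-12)
               for _ in logits]
    scores = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    m = max(scores)                       # stabilised softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_feature(row, weights):
    """A selection neuron outputs the weighted sum of the input features;
    with a (near) one-hot weight vector this passes one authentic
    feature through unchanged."""
    return sum(w * x for w, x in zip(weights, row))
```

Because the relaxation is continuous, gradients can flow through `concrete_sample` during training, which is what enables the low-variance gradient estimate mentioned above.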
Variational generator units of the emulator (e.g., of DT unit 28, of emulator device 12)
The variational generator units are based on the variational autoencoder structure as shown in FIG. 9 (a diagram of example generative units of the FVGHT for high-dimensional time series). Data 66 (e.g., data frames), selected authentic features 68, a first time series structure 74 (e.g., encoded using the VAE), VAE latent 76, VAE decoders 78, the VAE-decoded time series structure, and a second time series structure 80 are shown. The input to the VAE encoder part (i.e., the first time series structure 74) has been feature-selected via the CAE so that the data dimension has been decreased. Further, the data input to the VAE encoder consists of authentic parameters collected from physical twins, rather than reconstructed pseudo-features. The VAE also includes some time series units, which are discussed below.
To approximate the real probabilistic distributions of the input data, variations may be introduced for two reasons: (1) to reconstruct the input during each training so that after each training epoch the input for the VAE encoder may be forced to move close to the samples in the targeted distributions; (2) after training the decoder, the time series structure may be used to generate emulated time series data samples. Each X presented in FIG. 9 stands for a data frame whose width is the same as the number of variables and whose length is the same as the time period (for example, a data frame with multiple variables collected within a certain hour). Boxes with dashed lines indicate the sliding windows, whose size is the size of the sliding window. The generator comprises a group of variational autoencoders and can take input X from the computation results of the previous sliding windows.
Since the entire stacked VAE serves to reconstruct information from the high-dimensional data and then generate another time series set/structure (e.g., one that is close/similar to the high-dimensional data), the loss is more meaningful than accuracy. That is also because the training labels and inputs are the same datasets, and each VAE tries to minimize the KL divergence between the original data 66 and the reconstructed data (e.g., time series structure 80). The computation accuracy/loss is evaluated using K-fold cross-validation, where K defaults to 5 unless further configuration is provided.
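In the standard VAE formulation, the training objective combines a reconstruction term with a KL term on the latent code. The sketch below is a plain-Python rendering of that standard objective (an assumption here, not the patent's exact loss), showing the closed-form KL term for a diagonal Gaussian latent:

```python
import math

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, I) ) for a diagonal
    Gaussian latent, summed over the latent dimensions."""
    return sum(0.5 * (math.exp(lv) + m * m - 1.0 - lv)
               for m, lv in zip(mu, log_var))

def vae_loss(x, x_hat, mu, log_var):
    """Per-sample objective: L2 reconstruction error plus the KL term
    that pulls the latent code towards the prior."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + kl_to_standard_normal(mu, log_var)
```

When the reconstruction matches the input exactly and the latent matches the prior (mu = 0, log_var = 0), the loss is zero, which is the direction each training epoch pushes the stacked VAE.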
Time-series units of the emulator
In this emulator, the time series are handled/processed using recurrent neural networks (RNN) constructed from GRU units. Long short-term memory (LSTM) units can fit the structure if needed. The time series RNNs are applied in both the VAE-encoder and VAE-decoder structures, as shown in FIG. 10 and FIG. 11. FIG. 10 is a diagram of example time series units of the FVGHT for high-dimensional time series data (e.g., the decoder part), and FIG. 11 is a diagram of example time series units of the FVGHT for high-dimensional time series data (e.g., the encoder part). More specifically, in the time-series units, data is processed in the time dimension, in contrast to the CAE structure, which processes data in the parameter dimension. The RNN in the VAE-encoder is also impacted by the size of the sliding windows. For example, if a deployed sensor transmits data every 10 milliseconds, then, based on the descriptions of sliding windows above, a sliding window whose time length is defined as 2 hours may accumulate 720,000 data items. Other data items and times may be used. Such time series units support the neural network in conducting training on input data with continuity across different time slots. Further, the output of a unit Xt+2 can be used as input for another sliding window so that all the sliding windows are chained. The RNN (GRU) in the VAE-decoder generates time series with variations using at least two layers. In the same layer, the outputs from the same neuron may be (e.g., always) mutually connected as a time sequence. Moreover, theoretical support for one or more embodiments of the FVGHT described herein is as follows:
• For edge devices, GRU may be a better choice compared to LSTM, especially in the context of the IoT-based Digital Twin described herein;
• GRU-VAE has scientifically demonstrated performance for processing multivariable time series data.
• CAE provides performance (e.g., that exceeds a predetermined performance threshold) on feature selection that outputs authentic parameters rather than reconstructed ones. It has been applied in multiple domains, such as the natural sciences, which typically seek to explain “why” together with “what”.
• VAE together with RNN provides performance (e.g., that exceeds a predetermined performance threshold) to generate time-series data.
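The GRU update underlying the time-series units above can be written out explicitly. The sketch below shows the standard scalar GRU equations (update gate, reset gate, candidate state) in plain Python; the parameter names are illustrative, not taken from the patent.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One GRU update for scalar input x and hidden state h; p holds the
    weights and biases (illustrative names)."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])        # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])        # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    return (1 - z) * h + z * h_cand                          # blended state

def gru_run(xs, p):
    """Fold one sliding window's sequence into a final hidden state,
    which a VAE encoder could then map to a latent mean/variance."""
    h = 0.0
    for x in xs:
        h = gru_step(x, h, p)
    return h
```

Compared to an LSTM, the GRU carries a single state and three fewer weight matrices per cell, which is one reason it is considered lighter for edge deployment.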
Typical systems do not provide the advantages and configurations described herein and do not include all the features of one or more embodiments described herein.
Expose the emulator data
In one or more embodiments, the emulator process is implemented based on external storage. In this step, the emulator exposes (e.g., provides) data to other components, which allows the presented solution to be integrated into an IoT-based Digital Twin platform. For example, the analyzed results can be exposed to the simulation and automation loops. Further, besides supporting simulation analysis, the solution can also be used for predictions.
The emulator device 12 may generate data for various industrial applications 58. The generated data may comprise variations of which human engineers may or may not be aware, i.e., the emulator device provides data (with stress) for checking potential anomalies and for online device verifications. Moreover, by integrating with the process component, the emulator device 12 may be configured to conduct performance/results evaluation for assumed conditions based on the data collected from physical twins. Emulator device 12 may further provide feedback on the decisions to be tested to users, e.g., without stopping the physical system for testing.
Implementation scenarios
Conducting emulation for predictive maintenance in smart manufacturing may be one of many use cases, e.g., where devices/equipment are geographically distributed.
In manufacturing, it may be important to emulate the production line and automatically determine any impact of possible changes. However, the domain expert cannot always provide enough knowledge supported by an understanding of the data, especially when the production line is newly introduced or assembled. In modern plants, many scenarios exist. For example, the categories of deployed equipment/sensor devices can be very complex (high-dimensional data); the speed of generating data can be very quick (data streams arrive at high density); the equipment/sensor devices serve different production units in different geographical locations (heterogeneity of devices and distributed topology); the production can be scaled up and down to serve different environments; and the data may be ingested fresh (e.g., in real time or near real time) to provide online results quickly. To address these scenarios, a DT-based emulator machine (e.g., emulator device 12) is provided herein.
In one or more embodiments, emulator device 12 may be configured to provide online emulation of the physical deployment of the online data, e.g., using the architecture shown in FIG. 12. A plurality of DT communication units 100 (e.g., DT communication units 100a, 100b, etc.) and DT processing units 102 (e.g., DT processing units 102a, 102b, 102c, etc.) are shown. Any DT communication unit 100 may refer to communication interface 18 and/or communication interface 32. Similarly, any DT processing unit 102 may refer to processing circuitry 20 and/or processing circuitry 34. Each component of FIG. 12 may correspond to a computing unit in the provided solution, which can be distributed or reside on one device (e.g., emulator device 12, IoT device 14). In one or more embodiments, the DT processing units correspond to steps performed by processing circuitry 20 of emulator device 12, DT unit 28, etc.
DT communication unit 0 (i.e., DT communication unit 100a): This unit works as a data receiver, which takes in data from physical entities.
DT processing unit 1 (i.e., DT processing unit 102a): This unit works as a data parser, which transforms different raw data into data frames.
DT processing units 2 to n (i.e., DT processing unit 102b (and/or more)): The total number of these units is equivalent to the size of the sliding windows. Each unit is equivalent to a multi-variable input X in the time period t. Each of the units handles the CAE computations to generate data. It interacts with the time-series units for L2 regularization during training and inferencing, and also connects to the VAE encoder units.
DT processing units n+1 to 2n (i.e., DT processing unit 102c (e.g., up to 102n)): The total number of generator units is equivalent to the size of the sliding windows. Each unit is equivalent to a multi-variable input X in the time period t. Each of the units handles data from physical entities generated in a certain period of time. It interacts with the VAE latent units for backpropagation during training and inferencing.
DT processing unit 2n+1 (i.e., DT processing unit 102o): This unit works for VAE decoder computations, which first take as input a generated time-series data frame from the VAE latent layers, and then train to force the final output to have minimum KL divergence with the original data sets.
DT communication unit (2n+2) (i.e., DT communication unit 100b): This unit works as a data exposer, which transmits the results to relevant components and/or end-users.
Node-level data flow and sequence
An emulator (i.e., emulator device 12) may comprise one or many processing units (e.g., processing circuitry 20) and two communication units (e.g., communication interface 18). Data exchanges between units within the emulator may be performed via a data bus, as shown at (a) in FIG. 13. Data exchanges between any external components and the emulator device 12 (e.g., DT processing units 102a, 102b) are conducted by the DT communication units 100a, 100b, 100c, 100d, as shown at (b) in FIG. 13. A communication broker 104 may also be used. The terms communication units and processing units have been used for ease of understanding but refer to DT communication units 100 and DT processing units 102, respectively.
FIG. 14 is an example flowchart of a process (i.e., method) performed by the emulator device 12 (and/or any of its components). In step S300, data generated by physical entities (i.e., IoT devices 14) is collected (e.g., by emulator device 12 via communication interface 18). At step S302, the process further includes transforming and/or formalizing the data (i.e., collected data) to feed into the streams (streams of data fed to one or more components of the emulator device 12, which may include encoders and decoders). At step S304, data (e.g., collected data, transformed data, formalized data, transformed and formalized data) may be accumulated, e.g., to create a sliding window, using the sliding window, until a time associated with the sliding window expires, etc. At step S306, data concentration is performed in the feature selection units (which may be part of an edge network/component). At step S308, a first time series computation is performed in GRU units (VAE-EN). At step S310, generative computation is performed in the generative units (global). At step S312, a second time series computation is performed in GRU units (VAE-DE). At least one of the results of steps S300-S312 (and/or data/actions associated with these steps) may be exposed (i.e., transmitted, shared, made available) to other devices (e.g., IoT device 14), other DT components (i.e., emulator device 12 components), and/or users such as end users. Any one of steps S300-S314 (e.g., S306-S312) may be executed in an iterative loop.
Use cases and Examples
One or more embodiments described herein may be based on an outcome of a network cloud engine (NCE) program, where the digital twin is considered an enabler for various industrial applications. Therefore, as described herein, reusability (for various use cases and industrial applications) is one of the features of the present disclosure. The following are some examples of how one or more embodiments can be applied to different use cases.
Adding to one or more embodiments described herein, the following highlights may be added:
• Highlight 1: The DT emulator (i.e., emulator device 12) takes device data from any IoT environment as an input, i.e., various time-series sensor data. The DT emulator provides diverse and variational data as an output, i.e., data that looks like the input sensor data.
• Highlight 2: The DT emulator generates output by first training a FVGHT model to learn the pattern of the input data and then generating data, with variations added, from the trained model.
• Highlight 3: Device data flows into the emulator as a live stream, while the DT emulator uses sliding windows to process the incoming data. The content of each sliding window is fed into the FVGHT, and the emulator runs continuously, taking input from the moving sliding windows. Online device verification and anomaly detection for some key use cases are described below.
• Online device verification:
In various use cases, hardware (e.g., complex hardware) is deployed and needs to be upgraded/changed based on different requirements. To avoid the situation that the running hardware systems have to be stopped to test the results (which may take considerable time to test all the possibilities), the emulator (i.e., emulator device 12) enables engineers to perform online device verifications. That is, the emulator may be used to learn a pattern from all the collected hardware data, and then to generate data emulating the hardware systems (with possible environment conditions) with variations. Further, the emulation data sets enable tests on the hardware systems without shutting anything down. Once data is emulated, the emulated data can be used for verifying the physical device state, not only by observing its current state, but also by observing all possible states that might occur.
• Sustainable Smart Urban:
There may be a “smart” urban area with different deployed sensors (e.g., temperature sensors) and actuators (e.g., a heater with several power levels). The DT emulator learns a pattern that when the heater is turned on, the temperature of houses in the district can go up logarithmically depending on the power level. The DT emulator may go through different conditions and verify states. If the desired temperature is set to 35 degrees and the DT emulator does not find any set of conditions that would achieve that temperature, the DT emulator (e.g., emulator device) can alert the user that the desired temperature is unreachable. Moreover, when any changes to the heating system are needed, the data output by the emulator can be used to test the change proposal before making any physical impact on the existing physical entities.
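The reachability check in this example can be sketched as follows. The logarithmic relation comes from the text above, but the exact formula, the `gain` constant, and the function names are illustrative assumptions, not a learned model.

```python
import math

def steady_temperature(ambient, power_level, gain=3.0):
    """Hypothetical learned pattern: temperature rises logarithmically
    with the heater power level (`gain` and the formula are assumed)."""
    return ambient + gain * math.log(1 + power_level)

def target_reachable(target, ambient, max_level):
    """Sweep every heater power level, as the emulator sweeps emulated
    conditions, and report whether any reaches the desired temperature."""
    return any(steady_temperature(ambient, p) >= target
               for p in range(max_level + 1))
```

With an ambient of 20 degrees and five power levels, a 35-degree target is unreachable under this toy model, which is the situation in which the emulator would alert the user.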
• Smart Manufacturing:
In this example, a robotic arm and a conveyor belt are used on a factory floor, as well as a camera detecting items on the conveyor belt being placed by the robotic arm. The DT emulator can take the camera outputs, the robotic arm axes, and the conveyor belt speed, learn their correlation, and continuously update it as new data arrives. By creating variations of those conditions, the DT emulator can check whether any device can reach an undesirable state during its operation. For instance, the DT emulator can determine the minimum speed of the conveyor belt that would cause the robotic arm to place items one on top of the other if the robotic arm itself were set to its maximum speed. In case any changes are needed for the running physical setup, the data generated by the emulator can be used to verify the changes before any impacts are made on the physical system.
• Mobile Network: In this example, there may be 200 LTE base stations covering an area, all being digitally twinned. The input data includes the base station load and the number of mobile devices connected to it. The DT emulator begins to emulate mobile device positions and connections and finds a condition where one base station fails due to overload. Mobile devices that were connected to that base station switch to other base stations, which also get overloaded as they now have to handle new connections along with the existing ones, making them fail as well. This can create a chain reaction that knocks out all 200 base stations. In this way, the DT emulator is capable of verifying the setup/configuration of the entire LTE network in the area. Such verification is particularly helpful when any changes need to be made to the hardware setup, because the running hardware system does not need to be stopped to test the potential consequences of the planned changes.
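The chain-reaction check described in this example can be sketched with a toy load model. The uniform redistribution of a failed station's load over the survivors is an assumption for illustration; a real emulator would use the learned connection pattern instead.

```python
def cascade(loads, capacity):
    """Fail every station whose load exceeds `capacity`, spreading its
    load evenly over the surviving stations, and repeat until the
    network stabilises.  Returns which stations are still alive."""
    loads = list(loads)
    alive = [True] * len(loads)
    changed = True
    while changed:
        changed = False
        for i in range(len(loads)):
            if alive[i] and loads[i] > capacity:
                alive[i] = False
                survivors = [j for j, a in enumerate(alive) if a]
                if survivors:
                    share = loads[i] / len(survivors)
                    for j in survivors:
                        loads[j] += share
                loads[i] = 0.0
                changed = True
    return alive
```

Starting from loads [120, 90, 90] with capacity 100, the first failure pushes both neighbours over capacity in turn, so the whole toy network goes down, mirroring the chain reaction described above.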
• Anomaly detection:
The emulator generates data emulating the hardware system, which can also be used to detect anomalies in the existing hardware system, e.g., because the DT emulator is also aware of the environment's various acceptable states thanks to the variations in the emulated data.
• Sustainable Smart Urban:
In this example, a temperature starts going up while the corresponding heater is turned off. This could happen in the case of a fire in the room, for instance. This data can be used as an input to the discriminator of the DT emulator along with other emulated data representing various acceptable situations. Further, the DT emulator would be able to identify the fire situation as an anomaly amongst other conditions. Moreover, using the emulated data with variations, it is possible to test whether the current hardware setup works as expected under different circumstances/assumptions (i.e., can the system respond to anomalies correctly under different circumstances/assumptions).
• Smart Manufacturing:
In this example, a conveyor belt stops operating due to a malfunction, while a robotic arm continues its operation. The DT emulator would be able to detect this as an anomaly, as these conditions would not resemble any previously emulated working conditions. Moreover, using the emulated data with variations, it is possible to test whether the current hardware setup works as expected under different circumstances/assumptions (i.e., can the system respond to anomalies correctly under different circumstances/assumptions).
• Mobile Network, e.g., 3GPP based network
In this example, one base station (e.g., network node) stops sending sensor data due to an internal error, while continuing to operate (e.g., while mobile devices (e.g., wireless devices, UEs) are still connected to the base station). The DT emulator (i.e., emulator device 12) would be able to emulate these conditions and realize that, if the base station had actually failed, some of the neighboring base stations would receive new connections. Hence, it would detect these conditions as an anomaly. Moreover, using the emulated data with variations, it is possible to test whether the current hardware setup works as expected under different circumstances/assumptions (i.e., can the system respond to anomalies correctly under different circumstances/assumptions).
Some of the units in the emulator (e.g., one or more elements and/or functions of emulator device 12) could be implemented in different locations in a distributed way, as shown in FIG. 15. More specifically, FIG. 15 shows a diagram of an example orchestration of the emulator according to some embodiments. One or more DT communication units 100a, 100b (Data Receiver and Exposer, respectively) are shown. In addition, one or more DT processing units 102a, 102b, 102c, 102d (Raw Data Parser, Feature Selector Units, Time Series Units, and VAE Units, respectively) are shown. Instead of relating to a cloud implementation only, one or more embodiments described herein relate to the IoT landscape consisting of devices, edge gateways, base stations, network nodes, radio nodes, network infrastructure, fog nodes, or the cloud. The Time Series Units, VAE latent units, and VAE decoder units may be implemented in a cloud network, which can benefit from the computation power of the cloud network. Other units, such as the feature selection units, VAE encoder units, data receiver, exposer, and data parser, can be located at the edge, depending on the scenario.
The term “network node” used herein can be any kind of network node comprised in a radio network which may further comprise any of base station (BS), radio base station, base transceiver station (BTS), base station controller (BSC), radio network controller (RNC), g Node B (gNB), evolved Node B (eNB or eNodeB), Node B, multi-standard radio (MSR) radio node such as MSR BS, multi-cell/multicast coordination entity (MCE), integrated access and backhaul (IAB) node, relay node, donor node controlling relay, radio access point (AP), transmission points, transmission nodes, Remote Radio Unit (RRU) Remote Radio Head (RRH), a core network node (e.g., mobile management entity (MME), self-organizing network (SON) node, a coordinating node, positioning node, MDT node, etc.), an external node (e.g., 3rd party node, a node external to the current network), nodes in distributed antenna system (DAS), a spectrum access system (SAS) node, an element management system (EMS), etc. The network node may also comprise test equipment. The term “radio node” used herein may be used to also denote a wireless device (WD) such as a wireless device (WD) or a radio network node.
In some embodiments, the non-limiting terms wireless device (WD) and user equipment (UE) are used interchangeably. The WD herein can be any type of wireless device capable of communicating with a network node or another WD over radio signals. The WD may also be a radio communication device, target device, device-to-device (D2D) WD, machine-type WD or WD capable of machine-to-machine communication (M2M), low-cost and/or low-complexity WD, a sensor equipped with a WD, tablet, mobile terminal, smart phone, laptop embedded equipment (LEE), laptop mounted equipment (LME), USB dongle, Customer Premises Equipment (CPE), an Internet of Things (IoT) device, or a Narrowband IoT (NB-IoT) device, etc. Accordingly, one or more embodiments provide a digital twin based emulator using a novel online machine learning model (i.e., FVGHT) to conduct feature selection and generate high-dimensional time-series data sets which can approach (e.g., infinitely approach) the input high-dimensional time series data sets in probabilistic space. The emulator machine (device) has a minimum dependency on domain expertise and is hence highly replicable and reusable for various industrial applications. The emulator machine conducts feature selection on the edge side, so the parameters taken into the generation process are all (or mostly) contributive/impactive parameters, which also avoids introducing an extra burden from computing the irrelevant and redundant data during the generative learning phase.
In one or more embodiments of the emulator, a model is used. The model may be based on inheriting and integrating the advantages and core features of CAE, VAE, and GRU, with clear synergy effects that allow the emulator to gain core essences, e.g., that a single parent model cannot provide.
One or more embodiments described herein have one or more of the following features:
• An emulator machine (e.g., emulator device 12) having feature selection has been introduced on the edge side of the physical system so that the emulator machine can at least in part avoid ingesting/inputting irrelevant and redundant data.
• An emulator machine learns and generates high-dimensional time series data in a variational way. After feature selection training and generative training, each data generation is a process of sampling data from the learned probabilistic distributions. Each generated set may be different but obeys (infinitely approaches) the probabilistic distribution learned from the input data. Therefore, the process is variational in that the majority of potential conditions and environment assumptions that have not happened before can be included in the model to verify the device setup and determine/discover potential flaws.
• An emulator machine applies a novel model (FVGHT) which integrates the advantages of CAE, VAE, and GRU with synergy effects. The emulator machine can learn a probabilistic pattern based on feature selection for expressing high-dimensional time series based on the data sets collected from physical entities. In particular, no such CAE-VAE-GRU model has previously been applied for emulating purposes.
• An emulator machine applies online batch-based machine learning: continuously applying batch-based machine learning algorithms on time series to generate data sets simulating the data collected from the physical entities, which can concentrate information in the KPI dimensions using feature selection and in the time dimension using the VAE.
• An emulator machine may not require domain expertise for providing data labels for either training the model or creating the model such that one or more embodiments may be automated without human intervention.
• A highly reusable and replicable digital twin emulator (i.e., emulator device 12): since one or more embodiments described herein use semi-supervised reinforcement learning without requiring either training labels or domain expertise, the digital twin emulator described herein is highly replicable and reusable in different industrial applications.
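For illustration only (and without limiting any embodiment), the variational generation described in the bullets above can be sketched in Python. The latent parameters mu and log_var, the sample_latent helper, and the decode stub are hypothetical stand-ins for what the emulator would learn; a real decoder would map latent samples back to a high-dimensional time-series window:

```python
import math
import random

# Hypothetical learned latent parameters (one pair per latent dimension),
# standing in for the probabilistic distribution learned from input data.
mu = [0.0, 1.5, -0.7]
log_var = [0.1, -0.3, 0.2]

def sample_latent(mu, log_var, rng):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def decode(z):
    # Placeholder decoder: stands in for the network that would reconstruct
    # a high-dimensional time-series window from the latent sample.
    return [2.0 * v + 1.0 for v in z]

rng = random.Random(42)
# Every call yields a different set that still follows the learned distribution.
samples = [decode(sample_latent(mu, log_var, rng)) for _ in range(3)]
```

Because each generation draws fresh noise, no two generated sets are identical, yet all obey the same learned distribution, which is the property the description relies on for exposing conditions that have not occurred before.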
One or more embodiments described herein provide one or more of the following advantages. One or more embodiments provide an emulator (i.e., emulator device 12) which uses FVGHT as the core algorithm to reproduce data highly similar to the data generated by the physical entities. One or more embodiments described herein utilize novel machine learning techniques for a DT emulator, which have the following advantages:
1. The emulation is variational, i.e., diverse in that the majority of potential conditions and environment assumptions that have not happened before can be included in the model.
2. The novel model FVGHT is applied to the emulator. The FVGHT is a generative learning model that may be configured to use the core functionality of CAE, VAE, and GRU, with a functional synergy that a single parent model could not provide.
3. Emulations can be created/determined/performed from the data flow in an online manner using batch-based algorithms, where batch-based algorithms are usually conducted in an offline manner. The online manner provides faster results than the offline manner, and the batch-based algorithms enable the system/emulator to handle advanced analysis.
4. Data sets may be emulated from high-dimensional time-series data flows using sliding windows, a concrete autoencoder, and a variational autoencoder, where the concrete autoencoder serves to condense the high-dimensional data using feature selection.
5. One or more embodiments described herein do not require pre-existing domain expertise to create models for the performance evaluation or to train/evaluate the models; these steps are performed autonomously from the data flow.
6. One or more embodiments described herein are highly reusable and replicable for different scenarios because they are flexible, scalable, and do not require pre-existing domain expertise.
7. One or more embodiments described herein conduct feature selection (using Concrete autoencoders) on the edge side before feeding data into the variational generation learning process, which hence removes irrelevant and redundant information for training the emulator.
8. The feature selection can “light-weight” the generative training by removing redundant and irrelevant data on the edge. The feature selection helps to fit the emulator for devices on the edge for processing high-dimensional data close to the data resources.
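As a loose, non-limiting illustration of the concrete-autoencoder idea referenced in advantages 7 and 8: a concrete selector layer holds trainable logits per input feature and, as a softmax temperature anneals toward zero, each selector concentrates on a single feature, effectively performing discrete feature selection. The sketch below shows only the temperature effect on a fixed set of hypothetical logits (no training loop; in a concrete autoencoder the logits would be learned by backpropagation):

```python
import math

def softmax(logits, temperature):
    # Lower temperature -> the distribution concentrates on the max logit,
    # i.e., the selector commits to a single input feature.
    scaled = [l / temperature for l in logits]
    mx = max(scaled)
    exps = [math.exp(s - mx) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one selector over 5 input features.
logits = [0.2, 1.7, -0.5, 0.9, 0.1]

soft = softmax(logits, temperature=10.0)   # near-uniform mixture of features
hard = softmax(logits, temperature=0.01)   # effectively one-hot selection

selected = max(range(len(hard)), key=hard.__getitem__)
```

Only the selected features would then be fed into the variational generation stage, which is how the selection stage "light-weights" the generative training on edge devices.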
In some embodiments, a plurality of sliding windows may be determined, based at least on one parameter, to train the model using online-batch learning. Each sliding window of the plurality of sliding windows may have a size (e.g., a parameter). The size of sliding windows may be determined and/or received, e.g., given by users (customized). Further, in some other embodiments, the quantity of sliding windows (e.g., another parameter) may be determined as follows:
Quantity of sliding windows = the total time length to monitor / the given size of sliding windows, where the total time length to monitor may refer to the total length of the time series to analyze using a plurality of sliding windows. In some embodiments, an ML model is deployed via one or more organized units working together as an emulator machine. In some other embodiments, a digital twin emulator machine can make a digital copy of the physical world and/or be a tool to deeply emulate device sets and/or detect what can potentially happen. The tool may be an important tool for online verification. Online verification may refer to detecting potential anomalies (e.g., knowing what can happen after running the physical system) without disturbing the running physical system. In some other embodiments, generative learning models can perform data sampling from the trained model (e.g., so that it always generates a data set that complies with the pattern of the input data, although optionally with variations in one or more embodiments). In some embodiments, the more data is generated, the more likely potential red-flag events are to be exposed.
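The window-count relation above can be sketched as follows (the helper names are illustrative only; per the description, the size of the sliding windows would be user-provided, and each window would feed one online-batch training step):

```python
def quantity_of_windows(total_time_length, window_size):
    # Quantity of sliding windows =
    #     total time length to monitor / given size of sliding windows
    if window_size <= 0:
        raise ValueError("window size must be positive")
    return total_time_length // window_size

def split_into_windows(series, window_size):
    # Non-overlapping windows over the monitored time series; each window
    # accumulates the data frames for one online-batch training step.
    return [series[i:i + window_size]
            for i in range(0, len(series) - window_size + 1, window_size)]

series = list(range(600))          # e.g., 600 time steps to monitor
windows = split_into_windows(series, 100)
```

With a 600-step series and a user-given window size of 100, the relation yields six windows, matching the number of slices produced.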
The following is a nonlimiting list of example embodiments:
Embodiment A1. An emulator device configured to communicate with a plurality of internet of things, IoT, devices, the emulator device configured to, and/or comprising a communication interface and/or comprising processing circuitry configured to: receive data associated with the plurality of IoT devices; generate variational data based on the received data, the variational data including high-dimensional time-series data sets; train a machine learning, ML, model using the variational data; and use the trained ML model to perform at least one detection.
Embodiment A2. The emulator device of Embodiment A1, wherein the using of the trained ML model to perform the at least one detection includes: generating data for emulation using the trained ML model based at least on a time-sliding window of time-series data, the generated data indicating at least one emulated outcome; and determining whether the emulated outcome indicates an anomaly.
Embodiment A3. The emulator device of Embodiment A1, wherein the ML model is a Feature-selective Variational Generator for High-dimensional Time Series, FVGHT, model.

Embodiment A4. The emulator device of Embodiment A1, wherein the processing circuitry is further configured to perform feature selection to remove at least one of redundant data and irrelevant data from the received data.
Embodiment A5. The emulator device of Embodiment A1, wherein the variational data follows a probabilistic distribution associated with the received data.
Embodiment B1. A method implemented in an emulator device that is configured to communicate with a plurality of internet of things, IoT, devices, the method comprising: receiving data associated with the plurality of IoT devices; generating variational data based on the received data, the variational data including high-dimensional time-series data sets; training a machine learning, ML, model using the variational data; and using the trained ML model to perform at least one detection.
Embodiment B2. The method of Embodiment B1, further comprising: generating data for emulation using the trained ML model based at least on a time-sliding window of time-series data, the generated data indicating at least one emulated outcome; and determining whether the emulated outcome indicates an anomaly.
Embodiment B3. The method of Embodiment B1, wherein the ML model is a Feature-selective Variational Generator for High-dimensional Time Series, FVGHT, model.
Embodiment B4. The method of Embodiment B1, further comprising performing feature selection to remove at least one of redundant data and irrelevant data from the received data.
Embodiment B5. The method of Embodiment B1, wherein the variational data follows a probabilistic distribution associated with the received data.
As will be appreciated by one of skill in the art, the concepts described herein may be embodied as a method, data processing system, computer program product and/or computer storage media storing an executable computer program. Accordingly, the concepts described herein may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects all generally referred to herein as a “circuit” or “module.” Any process, step, action and/or functionality described herein may be performed by, and/or associated to, a corresponding module, which may be implemented in software and/or firmware and/or hardware. Furthermore, the disclosure may take the form of a computer program product on a tangible computer usable storage medium having computer program code embodied in the medium that can be executed by a computer. Any suitable tangible computer readable medium may be utilized including hard disks, CD-ROMs, electronic storage devices, optical storage devices, or magnetic storage devices.
Some embodiments are described herein with reference to flowchart illustrations and/or block diagrams of methods, systems and computer program products. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer (to thereby create a special purpose computer), special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable memory or storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. It is to be understood that the functions/acts noted in the blocks may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Although some of the diagrams include arrows on communication paths to show a primary direction of communication, it is to be understood that communication may occur in the opposite direction to the depicted arrows.
Computer program code for carrying out operations of the concepts described herein may be written in an object oriented programming language such as Python, Java® or C++. However, the computer program code for carrying out operations of the disclosure may also be written in conventional procedural programming languages, such as the "C" programming language. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer. In the latter scenario, the remote computer may be connected to the user's computer through a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Many different embodiments have been disclosed herein, in connection with the above description and the drawings. It will be understood that it would be unduly repetitious and obfuscating to literally describe and illustrate every combination and subcombination of these embodiments. Accordingly, all embodiments can be combined in any way and/or combination, and the present specification, including the drawings, shall be construed to constitute a complete written description of all combinations and subcombinations of the embodiments described herein, and of the manner and process of making and using them, and shall support claims to any such combination or subcombination.
Abbreviations that may be used in the preceding description include:
CAE concrete autoencoder;
DT digital twin;
GRU gated recurrent unit;
IoT internet of things;
VAE variational autoencoder
It will be appreciated by persons skilled in the art that the embodiments described herein are not limited to what has been particularly shown and described herein above. In addition, unless mention was made above to the contrary, it should be noted that all of the accompanying drawings are not to scale. A variety of modifications and variations are possible in light of the above teachings without departing from the scope of the following claims.

CLAIMS

What is claimed is:
1. An emulator device (12) configured to communicate with a plurality of devices (14), the emulator device (12) comprising processing circuitry (20) configured to: determine high-dimensional time series data based on data associated with the plurality of devices; apply a model to the high-dimensional time series data based on a determined feature selection, the applied model inheriting at least one characteristic of a variational autoencoder, VAE; train the model, using online batch-based machine learning, based at least in part on the high-dimensional time series data; and perform at least one action using the trained model.
2. The emulator device (12) of Claim 1, wherein the processing circuitry (20) is further configured to: determine the feature selection of the data associated with the plurality of devices, the determined feature selection indicating at least one relation between at least two dimensions of the data.
3. The emulator device (12) of any one of Claims 1 and 2, wherein the processing circuitry (20) is further configured to: determine a plurality of sliding windows, based at least on one parameter, to train the model using online-batch learning, each sliding window of the plurality of sliding windows having a size.
4. The emulator device (12) of Claim 3, wherein the processing circuitry (20) is further configured to: accumulate a plurality of data frames per each sliding window of the plurality of sliding windows, the plurality of data frames being associated with the data associated with the plurality of devices.
5. The emulator device (12) of any one of Claims 3 and 4, wherein applying the model includes: generating emulated data based at least in part on the plurality of sliding windows, the emulated data including reconstructed information from the high-dimensional time series data, at least one set of the emulated data corresponding to one sliding window of the plurality of sliding windows.
6. The emulator device (12) of any one of Claims 1-5, wherein the model is a feature-selective variational generator for high-dimensional time series, FVGHT.
7. The emulator device (12) of any one of Claims 1-6, wherein the model is applied based at least in part on at least one of: a concrete autoencoder, CAE; and a gated recurrent unit, GRU.
8. The emulator device (12) of any one of Claims 1-7, wherein the emulator device further includes a communication interface in communication with the processing circuitry, the communication interface being configured to at least one of: transmit emulated data to at least one device of the plurality of devices; and transmit signaling to cause at least one device of the plurality of devices to perform the at least one action.
9. The emulator device (12) of any one of Claims 1-8, wherein performing the at least one action includes at least one of: performing an online device verification of at least one device of the plurality of devices without shutting down the at least one device; and determining an anomaly associated with the at least one device.
10. The emulator device (12) of any one of Claims 1-9, wherein at least one of the plurality of devices is an internet of things, IoT, device (14).
11. The emulator device (12) of any one of Claims 1-10, wherein the data associated with the plurality of devices includes data generated by at least one of a physical entity and a digital entity.
12. The emulator device (12) of any one of Claims 1-11, wherein the emulator device is a digital twin emulator.
13. A method performed by an emulator device (12) configured to communicate with a plurality of devices (14), the method comprising: determining (S108) high-dimensional time series data based on data associated with the plurality of devices; applying (S110) a model to the high-dimensional time series data based on a determined feature selection, the applied model inheriting at least one characteristic of a variational autoencoder, VAE; training (S112) the model, using online batch-based machine learning, based at least in part on the high-dimensional time series data; and performing (S114) at least one action using the trained model.
14. The method of Claim 13, wherein the method further includes: determining the feature selection of the data associated with the plurality of devices, the determined feature selection indicating at least one relation between at least two dimensions of the data.
15. The method of any one of Claims 13 and 14, wherein the method further includes: determining a plurality of sliding windows based at least on one parameter, to train the model using online-batch learning, each sliding window of the plurality of sliding windows having a size.
16. The method of Claim 15, wherein the method further includes: accumulating a plurality of data frames per each sliding window of the plurality of sliding windows, the plurality of data frames being associated with the data associated with the plurality of devices.
17. The method of any one of Claims 15 and 16, wherein applying the model includes: generating emulated data based at least in part on the plurality of sliding windows, the emulated data including reconstructed information from the high-dimensional time series data, at least one set of the emulated data corresponding to one sliding window of the plurality of sliding windows.
18. The method of any one of Claims 13-17, wherein the model is a feature-selective variational generator for high-dimensional time series, FVGHT.
19. The method of any one of Claims 13-18, wherein the model is applied based at least in part on at least one of: a concrete autoencoder, CAE; and a gated recurrent unit, GRU.
20. The method of any one of Claims 13-19, wherein the method further includes at least one of: transmitting emulated data to at least one device of the plurality of devices; and transmitting signaling to cause at least one device of the plurality of devices to perform the at least one action.
21. The method of any one of Claims 13-20, wherein performing the at least one action includes at least one of: performing an online device verification of at least one device of the plurality of devices without shutting down the at least one device; and determining an anomaly associated with the at least one device.
22. The method of any one of Claims 13-21, wherein at least one of the plurality of devices is an internet of things, IoT, device.
23. The method of any one of Claims 13-22, wherein the data associated with the plurality of devices includes data generated by at least one of a physical entity and a digital entity.
24. The method of any one of Claims 13-23, wherein the emulator device is a digital twin emulator.
25. A computer program, comprising instructions which, when executed on processing circuitry (20) of an emulator device (12), cause the processing circuitry (20) to carry out a method according to any one of Claims 13-24.
26. A computer program product stored on a computer storage medium and comprising instructions that, when executed by processing circuitry (20) of an emulator device (12), cause the emulator device (12) to perform a method according to any one of Claims 13-24.
PCT/SE2023/050328 2022-05-03 2023-04-11 Feature selective and generative digital twin emulator machine for device verification and anomaly checking WO2023214909A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263337855P 2022-05-03 2022-05-03
US63/337,855 2022-05-03

Publications (1)

Publication Number Publication Date
WO2023214909A1 true WO2023214909A1 (en) 2023-11-09

Family

ID=86053872


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210141870A1 (en) 2019-11-11 2021-05-13 Rockwell Automation Technologies, Inc. Creation of a digital twin from a mechanical model


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BOOYSE WIHAN ET AL: "Deep digital twins for detection, diagnostics and prognostics", MECHANICAL SYSTEMS AND SIGNAL PROCESSING, ELSEVIER, AMSTERDAM, NL, vol. 140, 1 February 2020 (2020-02-01), XP086085843, ISSN: 0888-3270, [retrieved on 20200201], DOI: 10.1016/J.YMSSP.2019.106612 *
YIFAN GUO ET AL: "Multidimensional Time Series Anomaly Detection: A GRU-based Gaussian Mixture Variational Autoencoder Approach", PROCEEDINGS OF MACHINE LEARNING RESEARCH, vol. 95, 1 January 2018 (2018-01-01), pages 97 - 112, XP055576177 *


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23718080

Country of ref document: EP

Kind code of ref document: A1