EP3871136A1 - Computer-based object detection in a video or an image - Google Patents

Computer-based object detection in a video or an image

Info

Publication number
EP3871136A1
Authority
EP
European Patent Office
Prior art keywords
interest
frame
distribution
factor
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP19798184.8A
Other languages
German (de)
English (en)
Inventor
Quoc Huy Phan
Thomas Harte
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Future Health Works Ltd
Original Assignee
Future Health Works Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US16/167,300 (US10922573B2)
Priority claimed from GB1817286.6A (GB2578325B)
Application filed by Future Health Works Ltd
Publication of EP3871136A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • Videos and images containing one or more objects may be analyzed by computers utilizing software.
  • software is used to analyze videos or images in different applications.
  • Software used in some analysis systems includes machine learning algorithms which are trained to analyze videos or images using large datasets of videos or images.
  • Described herein are software and systems for analyzing videos and/or images.
  • Software and systems described herein are configured in different embodiments to carry out different types of analyses.
  • software and systems described herein are configured to locate an object of interest within a video and/or image.
  • an object of interest or factor of interest is located by the software and systems within a series of video frames and/or images.
  • a location of an object of interest or factor of interest relative to a different object within a video frame and/or image is identified.
  • software and systems described herein are configured to identify a factor of interest within a video and/or image.
  • factors of interest include colors, sizes, shapes, dimensions, velocity, distance, angles, ages, and weights.
  • Factors of interest in some embodiments relate to an individual captured within at least one frame of a video or within an image. In some embodiments, factors of interest relate to an object captured within at least one frame of a video or within an image. In some embodiments, factors of interest relate to both an individual and an object captured within at least one frame of a video or within an image.
  • DNNs: deep neural networks
  • CNNs: convolutional neural networks
  • CNNs may make predictions through videos or images, but the predictions may be inaccurate if the videos or images to be analyzed are poor quality. Additionally, researchers may have less control over how CNNs work and these machine learning algorithms typically don’t take into account an uncertainty level around such
  • an analysis result may comprise outputs from multiple layers of a DNN to be used to predict a wide range of variables from a video or image input.
  • prior knowledge is better incorporated into the machine learning framework, thus making it more specific to the testing scenario in hand.
  • preliminary experiments show that the methods disclosed herein can predict knee angle to within 2° of a marker-based approach.
  • Traditional image analysis technology typically comprises software which utilizes machine learning algorithms trained with large datasets of images and videos.
  • Traditional technology is not particularly good at analyzing certain images where, for example, the machine learning algorithm was not trained with a similar image or video. That is, the traditional image analysis technology is poor at, for example, analyzing a video or image containing an object that it has not previously “seen” as part of its training. This poor performance in the traditional technology is compounded when an object that the technology is not familiar with has similar features to an object of interest or factor of interest.
  • Described herein is a computer-based method for identifying an object of interest or factor of interest within a video, the method comprising: (a) inputting the video comprising a plurality of frames into a software module;
  • the software module comprises a DNN.
  • the feature map comprises data from hidden layers of the DNN.
  • the DNN comprises at least one of VGG-19, ResNet, Inception, and MobileNet.
  • the factor of interest comprises at least one of a location of a color within the frame and an angle within the frame.
  • the statistical technique comprises Monte-Carlo
  • the statistical technique further comprises Bayesian Modeling, and wherein the Bayesian Modeling is used to model a change in a location of the object of interest within the frame to a different location of the object of interest within a different frame of the plurality of frames.
  • the statistical technique further comprises identifying a position of the object of interest within the frame relative to a different object of interest within the frame.
  • the position of the object of interest within the frame is expressed as an angle.
  • the object of interest comprises a joint of a body of an individual.
  • the joint comprises a shoulder, elbow, hip, knee, or ankle.
  • the video captures the individual within the frame.
  • the video captures a factor of interest from the frame to a different frame within the plurality of frames.
  • the factor of interest comprises a movement of a joint.
  • the movement of the joint is measured relative to a different joint in the body of the individual and is expressed as an angle.
  • the angle is used by a healthcare provider to evaluate the individual.
  • Described herein is a computer-based system for identifying an object of interest or a factor of interest within a video, the system comprising:
  • the software module comprises a deep neural network.
  • the feature map comprises data from hidden layers of the deep neural network.
  • the deep neural network comprises at least one of VGG-19, ResNet, Inception, and MobileNet.
  • the factor of interest comprises at least one of a location of a color within the frame and an angle within the frame.
  • the statistical technique comprises Monte Carlo Sampling, and wherein the Monte Carlo Sampling is used to sample the likelihood of the presence of the object of interest within the feature map.
  • the statistical technique further comprises Bayesian Modeling, and wherein the Bayesian Modeling is used to model a change in a location of the object of interest within the frame to a different location of the object of interest within a different frame of the plurality of frames.
  • the statistical technique comprises identifying a position of the object of interest within the frame relative to a different object of interest within the frame.
  • the position of the object of interest within the frame is expressed as an angle.
  • the object of interest comprises a joint of a body of an individual.
  • the joint comprises a shoulder, elbow, hip, knee, or ankle.
  • the video captures the individual within the frame.
  • the video captures a factor of interest from the frame to a different frame within the plurality of frames.
  • the factor of interest comprises a movement of a joint.
  • the movement of the joint is measured relative to a different joint of the body of the individual and is expressed as an angle.
  • the angle is used by a healthcare provider to evaluate the individual.
  • Described herein is a non-transitory medium comprising a computer program configured to:
  • the software module comprises a deep neural network.
  • the feature map comprises data from hidden layers of the deep neural network.
  • the deep neural network comprises at least one of VGG-19, ResNet, Inception, and MobileNet.
  • the factor of interest comprises at least one of a location of a color within the frame and an angle within the frame.
  • the statistical technique comprises Monte Carlo Sampling, and wherein the Monte Carlo Sampling is used to sample the likelihood of the presence of the object of interest within the feature map.
  • the statistical technique further comprises Bayesian Modeling, and wherein the Bayesian Modeling is used to model a change in a location of the object of interest within the frame to a different location of the object of interest within a different frame of the plurality of frames.
  • the statistical technique comprises identifying a position of the object of interest within the frame relative to a different object of interest within the frame.
  • the position of the object of interest within the frame is expressed as an angle.
  • the object of interest comprises a joint of a body of an individual.
  • the joint comprises a shoulder, elbow, hip, knee, or ankle.
  • the video captures the individual within the frame.
  • the video captures a factor of interest from the frame to a different frame within the plurality of frames.
  • the factor of interest comprises a movement of a joint.
  • the movement of the joint is measured relative to a different joint of the body of the individual and is expressed as an angle.
  • the angle is used by a healthcare provider to evaluate the individual.
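The statements above express the position of one joint relative to others as an angle. A minimal sketch of that geometry follows, assuming each joint has already been located as an (x, y) pixel coordinate in a frame. The function name `joint_angle` and the example coordinates are hypothetical and do not come from the application.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at vertex b formed by the points a-b-c.

    Each point is an (x, y) pair, e.g. hip, knee, ankle pixel coordinates.
    """
    # Vectors pointing from the middle joint to its two neighbours.
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("joint locations must be distinct")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# Hypothetical pixel coordinates for hip, knee, and ankle in one frame.
hip, knee, ankle = (100, 50), (100, 150), (200, 150)
knee_angle = joint_angle(hip, knee, ankle)
print(f"knee angle: {knee_angle:.1f} degrees")
```

Applying the same function frame by frame would give the change in the angle over a movement, which is the kind of factor of interest the statements describe.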
  • FIG. 1 shows an example of a CNN
  • FIG. 2 shows exemplary heatmaps as a result of running a neural network on an input image
  • FIG. 3 shows visual examples of the Monte Carlo sampling method on a small scale heatmap
  • FIG. 4 demonstrates an example of the process of approximating a probability distribution function (PDF) for a joint angle from multi-frame heatmaps for a single joint;
  • FIG. 5A shows an example of a video of a subject performing a leg exercise
  • FIG. 5B shows examples of actual results of applying a Gaussian process regressor (GPR) with a kernel on a real-world video;
  • FIG. 6 shows an example of a computer-based method for locating a factor of interest within a video comprising a plurality of frames
  • FIG. 7 shows an example of a feature map used to construct a probabilistic model
  • FIG. 8 shows an exemplary embodiment of a method for identifying an object of interest or a factor of interest within a video comprising a plurality of frames
  • FIG. 9 shows an exemplary embodiment of a system as described herein comprising a device such as a digital processing device.
  • Described herein are software and systems configured to analyze videos and/or images with a high level of accuracy and reliability.
  • analysis generally occurs as follows: (1) a video and/or image is inputted into (and/or ingested by) a software algorithm such as a machine learning algorithm, (2) a representation, such as a feature map (for example, a heatmap) comprising a probability of the existence of an object of interest or factor of interest within the video and/or image, is created, and (3) statistical techniques are applied to the representation of likelihoods or probabilities to accurately identify the object of interest or factor of interest and determine the presence of the object of interest or factor of interest at a location within the video and/or image.
  • the method disclosed herein comprises: employing Markov-Chain Monte Carlo methods that exploit information from hidden neural network layers; producing noise-resistant and reliable predictions for joint angles/range of motion; and providing a confidence level (certainty) about predictions, which can prove useful in clinical applications.
  • advantages of the method disclosed herein comprise: 1) building a relationship between powerful discriminative methods (such as deep CNNs) and more well-studied and controllable Bayesian methods through sampling from feature maps such as, for example, heatmaps; 2) allowing predictions with uncertainty, which can be very important for clinical applications; 3) a framework flexible enough to apply to any problem that relies on joint locations, and to object detection in general; and 4) because sampling methods approximate the true distribution, predictions that are usually more accurate.
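The idea of sampling from a heatmap to obtain a prediction with an uncertainty estimate can be illustrated with a short sketch. It assumes the heatmap is a small grid of non-negative scores and draws independent samples in proportion to those scores. This is plain Monte Carlo sampling rather than the Markov-Chain variant named above, and the names `sample_locations` and `summarize` and the toy heatmap values are hypothetical.

```python
import random
import statistics


def sample_locations(heatmap, n_samples, rng):
    """Draw (row, col) samples with probability proportional to heatmap values."""
    cells = [(r, c) for r, row in enumerate(heatmap) for c in range(len(row))]
    weights = [value for row in heatmap for value in row]
    if sum(weights) <= 0:
        raise ValueError("heatmap must contain positive mass")
    return rng.choices(cells, weights=weights, k=n_samples)


def summarize(samples):
    """Return the mean location and its spread as a simple uncertainty measure."""
    rows = [r for r, _ in samples]
    cols = [c for _, c in samples]
    return {
        "mean": (statistics.fmean(rows), statistics.fmean(cols)),
        "stdev": (statistics.pstdev(rows), statistics.pstdev(cols)),
    }


# Toy heatmap (assumed values): most of the mass sits at row 1, column 1.
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.6, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.0],
]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = sample_locations(heatmap, 5000, rng)
estimate = summarize(samples)
print(estimate)
```

The standard deviation of the samples is one simple way to attach a confidence level to the predicted location; a sharper heatmap peak yields a smaller spread.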
  • all technical and scientific terms used herein have the same meaning as is commonly understood by one skilled in the art to which the claimed subject matter belongs. It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of any subject matter claimed. In this application, the use of the singular includes the plural unless specifically stated otherwise.
  • any percentage range, ratio range, or integer range is to be understood to include the value of any integer within the recited range and, when appropriate, fractions thereof (such as one tenth and one hundredth of an integer), unless otherwise indicated.
  • the terms “a” and “an” as used herein refer to “one or more” of the enumerated components unless otherwise indicated or dictated by its context.
  • the use of the alternative (e.g., “or”) should be understood to mean either one, both, or any combination thereof of the alternatives.
  • the terms “include” and “comprise” are used synonymously.
  • the term “about” or “approximately” can mean within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, e.g., the limitations of the measurement system.
  • “about” can mean plus or minus 10%, per the practice in the art.
  • “about” can mean a range of plus or minus 20%, plus or minus 10%, plus or minus 5%, or plus or minus 1% of a given value.
  • the term “about” means within an acceptable error range for the particular value that should be assumed.
  • where ranges and/or subranges of values are provided, the ranges and/or subranges can include the endpoints of the ranges and/or subranges.
  • data comprises, for example, a video or image to be analyzed that is inputted manually into the software or systems.
  • the data may further comprise structured data, time-series data, unstructured data, and relational data.
  • the unstructured data may comprise text, audio data, image data and/or video.
  • the time-series data may comprise data from one or more of a smart meter, a smart appliance, a smart device, a monitoring system, a telemetry device, or a sensor.
  • the relational data may comprise data from one or more of a customer system, an enterprise system, an operational system, a website, or web accessible application program interface (API).
  • software and/or systems as described herein comprise a data ingestion module configured to ingest data into a processing component.
  • a processing component comprises a machine learning algorithm.
  • a data ingestion module is configured to either retrieve or receive data from one or more data sources, wherein retrieving data comprises a data extraction process and receiving data comprises receiving transmitted data from an electronic source of data.
  • some embodiments of the platforms described herein are configured to retrieve or receive data from many different data sources such as wearable devices, cameras, smartphones, laptops, databases, and cloud storage systems.
  • the wearable devices may comprise Fitbit, Apple Watch, Samsung Gear, Samsung Galaxy watch, Misfit, Huawei Mi band, and Microsoft band.
  • data that is ingested by the software or systems is sorted based on, for example, data type.
  • the data is stored in a database.
  • a database can be stored in computer readable format.
  • a computer processor may be configured to access the data stored in the computer readable memory.
  • a computer system may be used to analyze the data to obtain a result.
  • the result may be stored remotely or internally on a storage medium and communicated to personnel such as healthcare professionals.
  • the computer system may be operatively coupled with components for transmitting the result.
  • Components for transmitting can include wired and wireless components. Examples of wired communication components can include a Universal Serial Bus (USB) connection, a coaxial cable connection, an Ethernet cable such as a Cat5 or Cat6 cable, a fiber optic cable, or a telephone line.
  • wireless communication components can include a Wi-Fi receiver, a component for accessing a mobile data standard such as a 3G or 4G LTE data signal, or a Bluetooth receiver.
  • all data in the storage medium are collected and archived to build a data warehouse.
  • the database comprises an external database.
  • the external database may be a medical database, for example, but not limited to, Adverse Drug Effects Database, American Hospital Formulary Service (“AHFS”) Supplemental File, Allergen Picklist File, Average Wholesale Acquisition Cost (“WAC”) Pricing File, Brand Probability File, Canadian Drug File v2, Comprehensive Price History, Controlled Substances File, Drug Allergy Cross-Reference File, Drug Application File, Drug Dosing & Administration Database, Drug Image Database v2.0/Drug Imprint Database v2.0, Drug Inactive Date File, Drug Indications Database, Drug Lab Conflict Database, Drug Therapy Monitoring System (“DTMS”) v2.2 / DTMS Consumer Monographs, Duplicate Therapy Database, Federal Government Pricing File, Healthcare Common Procedure Coding System Codes (“HCPCS”) Database, ICD-10 Mapping Files, Immunization Cross-Reference File, Integrated A to Z Drug Facts Module, Integrated Patient Education, Master Parameters Database, Medi-Span Electronic Drug File (“MED-File”) v2,
  • a machine learning algorithm (or software module) of a platform as described herein utilizes one or more neural networks.
  • a neural network is a type of computational system that can learn the relationships between an input dataset and a target dataset.
  • a neural network may be a software representation of a human neural system (e.g. cognitive system), intended to capture “learning” and “generalization” abilities as used by a human.
  • the machine learning algorithm (or software module) comprises a neural network comprising a CNN.
  • Non-limiting examples of structural components of embodiments of the machine learning software described herein include: CNNs, recurrent neural networks, dilated CNNs, fully-connected neural networks, deep generative models, and
  • a neural network is comprised of a series of layers of nodes termed “neurons.”
  • a neural network comprises an input layer, to which data is presented; one or more internal, and/or “hidden”, layers; and an output layer.
  • a neuron may be connected to neurons in other layers via connections that have weights, which are parameters that control the strength of the connection.
  • the number of neurons in each layer may be related to the complexity of the problem to be solved. The minimum number of neurons required in a layer may be determined by the problem complexity, and the maximum number may be limited by the ability of the neural network to generalize.
  • the input neurons may receive data being presented and then transmit that data to the first hidden layer through weighted connections, whose weights are modified during training.
  • the first hidden layer may process the data and transmit its result to the next layer through a second set of weighted connections. Each subsequent layer may “pool” the results from the previous layers into more complex relationships.
  • neural networks are programmed by training them with a known sample set and allowing them to modify themselves during (and after) training so as to provide a desired output such as an output value. After training, when a neural network is presented with new input data, it is configured to generalize what was “learned” during training and apply what was learned from training to the new previously unseen input data in order to generate an output associated with that input.
  • the neural network comprises artificial neural networks (ANNs).
  • ANNs may be machine learning algorithms that may be trained to map an input dataset to an output dataset, where an ANN comprises an interconnected group of nodes organized into multiple layers of nodes.
  • the ANN architecture may comprise at least an input layer, one or more hidden layers, and an output layer.
  • the ANN may comprise any total number of layers, and any number of hidden layers, where the hidden layers function as trainable feature extractors that allow mapping of a set of input data to an output value or set of output values.
  • a deep learning algorithm (such as a DNN) is an ANN comprising a plurality of hidden layers, e.g., two or more hidden layers.
  • Each layer of the neural network may comprise a number of nodes (or “neurons”).
  • a node receives input that comes either directly from the input data or the output of nodes in previous layers, and performs a specific operation, e.g., a summation operation.
  • a connection from an input to a node is associated with a weight (or weighting factor).
  • the node may sum up the products of all pairs of inputs and their associated weights.
  • the weighted sum may be offset with a bias.
  • the output of a node or neuron may be gated using a threshold or activation function.
  • the activation function may be a linear or non-linear function.
  • the activation function may be, for example, a rectified linear unit (ReLU) activation function, a Leaky ReLU activation function, or other function such as a saturating hyperbolic tangent, identity, binary step, logistic, arctan, softsign, parametric rectified linear unit, exponential linear unit, softplus, bent identity, softexponential, sinusoid, sine, Gaussian, or sigmoid function, or any combination thereof.
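The node computation described above (a weighted sum of inputs, offset by a bias, then gated by an activation function) can be written out in a few lines. This is a minimal sketch; the function names and the example inputs, weights, and bias are assumptions chosen for illustration.

```python
import math


def relu(z):
    """Rectified linear unit: pass positive values, zero out negatives."""
    return max(0.0, z)


def leaky_relu(z, slope=0.01):
    """Leaky ReLU: small non-zero slope for negative inputs."""
    return z if z > 0 else slope * z


def sigmoid(z):
    """Logistic (sigmoid) function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, weights, bias, activation):
    """Sum the input-weight products, add the bias, apply the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# Assumed example values: the weighted sum plus bias is -0.25.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 0.25]
bias = 0.5

out_relu = node_output(inputs, weights, bias, relu)
out_leaky = node_output(inputs, weights, bias, leaky_relu)
out_sigmoid = node_output(inputs, weights, bias, sigmoid)
out_tanh = node_output(inputs, weights, bias, math.tanh)
print(out_relu, out_leaky, out_sigmoid, out_tanh)
```

Swapping the `activation` argument is enough to compare how the listed functions treat the same pre-activation value.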
  • the weighting factors, bias values, and threshold values, or other computational parameters of the neural network may be “taught” or “learned” in a training phase using one or more sets of training data.
  • the parameters may be trained using the input data from a training dataset and a gradient descent or backward propagation method so that the output value(s) that the ANN computes are consistent with the examples included in the training dataset.
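As an illustration of training parameters by gradient descent so that outputs match the training examples, the sketch below fits a single linear node to a handful of paired inputs and outputs. It is a deliberately tiny stand-in for a full network with backpropagation; the function name, learning rate, epoch count, and data are all assumptions.

```python
def train_linear_node(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every training example.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Assumed training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train_linear_node(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

In a multi-layer network the same update rule applies to every weight and bias, with backpropagation supplying the gradients layer by layer.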
  • the number of nodes used in the input layer of the ANN or DNN may be at least about 10, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000, 100,000, or greater.
  • the number of nodes used in the input layer may be at most about 100,000, 90,000, 80,000, 70,000, 60,000, 50,000, 40,000, 30,000, 20,000, 10,000, 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 50, 10, or less.
  • the total number of layers used in the ANN or DNN may be at least about 3, 4, 5, 10, 15, 20, or greater. In other instances, the total number of layers may be at most about 20, 15, 10, 5, 4, 3, or less.
  • the total number of learnable or trainable parameters, e.g., weighting factors, biases, or threshold values, used in the ANN or DNN may be at least about 10, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000, 100,000, or greater.
  • the number of learnable parameters may be at most about 100,000, 90,000, 80,000, 70,000, 60,000, 50,000, 40,000, 30,000, 20,000, 10,000, 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 50, 10, or less.
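For a fully-connected network, the number of learnable parameters follows directly from the layer sizes: each layer contributes one weight per input-output pair plus one bias per output node. The sketch below shows that arithmetic; the function name and the example layer sizes are assumptions.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully-connected network.

    layer_sizes lists the number of nodes per layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection
        total += n_out         # one bias per node in the receiving layer
    return total


# Assumed example: 10 inputs, hidden layers of 5 and 3 nodes, 1 output.
layer_sizes = [10, 5, 3, 1]
n_params = count_parameters(layer_sizes)
print(n_params)  # (10*5 + 5) + (5*3 + 3) + (3*1 + 1) = 77
```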
  • a machine learning software module comprises a neural network such as a deep CNN.
  • the network is constructed with any number of
  • the number of convolutional layers is between 1-10 and the dilated layers between 0-10.
  • the total number of convolutional layers may be at least about 1, 2, 3, 4, 5, 10, 15, 20, or greater, and the total number of dilated layers may be at least about 1, 2, 3, 4, 5, 10, 15, 20, or greater.
  • the total number of convolutional layers may be at most about 20, 15, 10, 5, 4, 3, or less, and the total number of dilated layers may be at most about 20, 15, 10, 5, 4, 3, or less.
  • the number of convolutional layers is between 1-10 and the fully-connected layers between 0-10.
  • the total number of convolutional layers (including input and output layers) may be at least about 1, 2, 3, 4, 5, 10, 15, 20, or greater, and the total number of fully-connected layers may be at least about 1, 2, 3, 4, 5, 10, 15, 20, or greater.
  • the total number of convolutional layers may be at most about 20, 15, 10, 5, 4, 3, 2, 1, or less, and the total number of fully-connected layers may be at most about 20, 15, 10, 5, 4, 3, 2, 1, or less.
  • the input data for training of the ANN may comprise a variety of input values depending on whether the machine learning algorithm is used for processing sensor signal data for a sensor device, a sensor panel, or a detection system of the present disclosure.
  • the sensor device may comprise acoustic sensors, sound sensors, vibration sensors, chemical sensors, electric current sensors, magnetic sensors, radio sensors, moisture sensors, humidity sensors, flow sensors, radiation sensors, imaging sensors, light sensors, optical sensors, pressure sensors, density sensors, thermal sensors, heat sensors, temperature sensors, and proximity sensors.
  • the ANN or deep learning algorithm may be trained using one or more training datasets comprising the same or different sets of input and paired output data.
  • a machine learning software module comprises a neural network comprising a CNN, RNN, dilated CNN, fully-connected neural networks, deep generative models and deep restricted Boltzmann machines.
  • a machine learning algorithm comprises CNNs.
  • the CNN may be a deep, feedforward ANN.
  • the CNN may be applicable to analyzing visual imagery.
  • the CNN may comprise an input layer, an output layer, and multiple hidden layers.
  • the hidden layers of a CNN may comprise convolutional layers, pooling layers, fully-connected layers and
  • the layers may be organized in 3 dimensions: width, height and depth.
  • the convolutional layers may apply a convolution operation to the input and pass results of the convolution operation to the next layer.
  • the convolution operation may reduce the number of free parameters, allowing the network to be deeper with fewer parameters.
  • each neuron may receive input from some number of locations in the previous layer.
  • neurons may receive input from only a restricted subarea of the previous layer.
  • the convolutional layer's parameters may comprise a set of learnable filters (or kernels). The learnable filters may have a small receptive field and extend through the full depth of the input volume.
  • each filter may be convolved across the width and height of the input volume, compute the dot product between the entries of the filter and the input, and produce a two-dimensional activation map of that filter.
  • the network may learn filters that activate when it detects some specific type of feature at some spatial position in the input.
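The filter operation described above, sliding a small kernel across the width and height of the input and taking a dot product at each position, can be sketched for a single-channel input. The sketch ignores depth, stride, and padding for brevity; the function name, the toy image, and the hand-picked edge-detecting kernel are assumptions (in a trained network the kernel values would be learned).

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and return the 2-D activation map.

    At each position the output is the dot product of the kernel with the
    image patch under it (no padding, stride 1).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
activation_map = conv2d_valid(image, kernel)
print(activation_map)  # [[0, 2, 0], [0, 2, 0]]
```

The activation map is non-zero only where the edge sits, which is the sense in which a filter "activates" on a feature at a spatial position.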
  • the pooling layers comprise global pooling layers.
  • the global pooling layers may combine the outputs of neuron clusters at one layer into a single neuron in the next layer.
  • max pooling layers may use the maximum value from each of a cluster of neurons in the prior layer.
  • average pooling layers may use the average value from each of a cluster of neurons at the prior layer.
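Max and average pooling as described above each reduce a cluster of neighbouring values to one value. A minimal sketch over non-overlapping square windows follows; the function names, window size, and feature map values are assumptions for illustration.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 0],
    [5, 7, 4, 2],
    [0, 2, 9, 1],
    [4, 6, 3, 3],
]
max_pooled = pool2d(feature_map, 2, max)   # [[7, 4], [6, 9]]
avg_pooled = pool2d(feature_map, 2, mean)  # [[4.0, 2.0], [3.0, 4.0]]
# Global pooling collapses the whole map into a single value.
global_max = max(value for row in feature_map for value in row)  # 9
print(max_pooled, avg_pooled, global_max)
```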
  • the fully-connected layers connect every neuron in one layer to every neuron in another layer.
  • each neuron may receive input from some number of locations in the previous layer.
  • each neuron may receive input from every element of the previous layer.
  • the normalization layer is a batch normalization layer.
  • the batch normalization layer may improve the performance and stability of neural networks.
  • the batch normalization layer may provide any layer in a neural network with inputs that have zero mean and unit variance. The advantages of using a batch normalization layer may include faster training, higher usable learning rates, easier weight initialization, a wider range of viable activation functions, and a simpler process for creating deep networks.
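The zero mean/unit variance property can be sketched as below. This is a simplified illustration: it normalizes each feature over a batch and omits the learned scale and shift parameters that practical batch normalization layers usually include; the function name and the batch values are hypothetical.

```python
import math

def batch_normalize(batch, eps=1e-5):
    """Normalize each feature (column) of `batch` to roughly zero mean and unit variance."""
    n = len(batch)
    n_features = len(batch[0])
    normalized = [[0.0] * n_features for _ in range(n)]
    for f in range(n_features):
        column = [sample[f] for sample in batch]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        for s in range(n):
            # eps guards against division by zero for constant features.
            normalized[s][f] = (batch[s][f] - mean) / math.sqrt(var + eps)
    return normalized

# Hypothetical batch of four samples with two features on very different scales.
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
normalized = batch_normalize(batch)
```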
  • FIG. 1 shows an example of CNNs.
  • a CNN architecture comprises a plurality of layers that transform the input into a prediction.
  • the CNNs may comprise convolutional layers 102, pooling layers 104, and fully-connected layers 106.
  • a machine learning software module comprises a recurrent neural network software module.
  • a recurrent neural network software module may be configured to receive sequential data as an input, such as consecutive data inputs, and the recurrent neural network software module updates an internal state at every time step.
  • a recurrent neural network can use internal state (memory) to process sequences of inputs.
  • the recurrent neural network may be applicable to tasks such as handwriting recognition or speech recognition.
  • the recurrent neural network may also be applicable to next word prediction, music composition, image captioning, time series anomaly detection, machine translation, scene labeling, and stock market prediction.
  • a recurrent neural network may comprise fully recurrent neural network,
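The per-time-step state update of a recurrent network can be illustrated with a deliberately tiny example. This is a sketch with a scalar state and fixed, hypothetical weights (`w_state`, `w_input`); a trained network would learn weight matrices instead.

```python
import math

def rnn_step(state, x, w_state=0.5, w_input=1.0):
    """One time step: the new internal state depends on the old state and the current input."""
    return math.tanh(w_state * state + w_input * x)

def run_rnn(sequence):
    """Feed a sequence one element at a time, updating the internal state at every step."""
    state = 0.0
    states = []
    for x in sequence:
        state = rnn_step(state, x)
        states.append(state)
    return states

# The input is non-zero only at the first step; later states still carry its trace (memory).
states = run_rnn([1.0, 0.0, 0.0])
```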
  • a machine learning software module comprises a supervised or unsupervised learning method such as, for example, support vector machines (“SVMs”), random forests, clustering algorithm (or software module), gradient boosting, logistic regression, and/or decision trees.
  • the supervised learning algorithms may be algorithms that rely on the use of a set of labeled, paired training data examples to infer the relationship between an input data and output data.
  • the unsupervised learning algorithms may be algorithms used to draw inferences from training datasets that do not include labeled output data.
  • the unsupervised learning algorithm may comprise cluster analysis, which may be used for exploratory data analysis to find hidden patterns or groupings in process data.
  • One example of an unsupervised learning method may be principal component analysis.
  • the principal component analysis may comprise reducing the dimensionality of one or more variables.
  • the dimensionality of a given variable may be at least 1, 5, 10, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or greater.
  • the dimensionality of a given variable may be at most 1800, 1700, 1600, 1500, 1400, 1300, 1200, 1100, 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 50, 10, or less.
  • the machine learning algorithm may comprise reinforcement learning algorithms.
  • the reinforcement learning algorithm may be used for optimizing Markov decision processes (i.e., mathematical models used for studying a wide range of optimization problems where future behavior cannot be accurately predicted from past behavior alone, but rather also depends on random chance or probability).
  • One example of reinforcement learning may be Q-learning.
  • Reinforcement learning algorithms may differ from supervised learning algorithms in that correct training data input/output pairs are never presented, nor are sub-optimal actions explicitly corrected.
  • the reinforcement learning algorithms may be implemented with a focus on real-time performance through finding a balance between exploration of possible outcomes (e.g., correct compound identification) based on updated input data and exploitation of past training.
  • training data resides in a cloud-based database that is accessible from local and/or remote computer systems on which the machine learning-based sensor signal processing algorithms are running.
  • the cloud-based database and associated software may be used for archiving electronic data, sharing electronic data, and analyzing electronic data.
  • training data generated locally may be uploaded to a cloud-based database, from which it may be accessed and used to train other machine learning-based detection systems at the same site or a different site.
  • sensor device and system test results generated locally may be uploaded to a cloud-based database and used to update the training dataset in real time for continuous improvement of sensor device and detection system test performance.
  • a neural network comprises a DNN.
  • a neural network comprises a VGG-19 as, for example, described in SIMONYAN, K., AND ZISSERMAN, A. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (2015). The DNN and VGG-19 are described elsewhere herein.
  • the likelihood is presented by one-dimensional values (e.g., probabilities).
  • the probability may be configured to measure the likelihood that an event may occur.
  • the probability may range from about 0 to 1, 0.1 to 0.9, 0.2 to 0.8, 0.3 to 0.7, or 0.4 to 0.6.
  • the event may comprise any type of situation, including, by way of non-limiting examples, whether a person will be sick based on his/her lifestyle, whether a certain day of the week will have rain, whether a patient may be successfully treated, whether the unemployment rate may be increased in 3 months, or whether one pharmaceutical composition may have FDA approval.
  • the likelihood is presented by two-dimensional values.
  • the two-dimensional values may be presented by a two-dimensional space, a feature map such as, for example, a heatmap, or a spreadsheet. If the two-dimensional values are presented by a feature map such as a heatmap, the feature map may show the likelihood that an event occurs at each location of the feature map.
  • the likelihood is presented by multi-dimensional values.
  • FIG. 2 shows an exemplary feature map such as, for example, a heatmap as a result of running a neural network on an input image.
  • the input image may have dimension 6 x 6 x 3, which means height, width and number of color channels, respectively.
  • a neural network like the VGG-19 may then output an array of heatmaps of dimensions 6 x 6, one for each joint of interest. Each pixel in a heatmap represents the likelihood of having a certain joint appearing at that location.
  • the input image 200 shows the image of a leg of a subject.
  • the exemplary heatmaps comprise hip heatmap 202, knee heatmap 204, and ankle heatmap 206.
  • the exemplary heatmaps may be obtained through a neural network.
  • the likelihood that the hip joint occurs at the position of column 5 and row 1 (208) is 0.6.
  • the likelihood that the knee joint occurs at the position of column 4 and row 3 (210) is 0.7.
  • the likelihood that the ankle joint occurs at the position of column 6 and row 5 (212) is 0.2. The lower the number of the likelihood, the less chance that the joint (e.g., hip joint, knee joint, and ankle joint) occurs in the location on the heatmap.
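Reading a location and its likelihood off a heatmap can be sketched as follows. This is an illustration only: the value 0.7 at column 4, row 3 is taken from the knee example above, while every other heatmap value and the helper name `most_likely_location` are made up for the sketch.

```python
def most_likely_location(heatmap):
    """Return (column, row, likelihood) of the largest entry, using 1-based indices as in the text."""
    best = (1, 1, heatmap[0][0])
    for r, row in enumerate(heatmap, start=1):
        for c, value in enumerate(row, start=1):
            if value > best[2]:
                best = (c, r, value)
    return best

# Hypothetical 6 x 6 knee heatmap (rows are the outer list, columns the inner list).
knee_heatmap = [[0.0] * 6 for _ in range(6)]
knee_heatmap[2][3] = 0.7   # row 3, column 4 (value from the text)
knee_heatmap[2][2] = 0.2   # made-up neighbouring likelihood
knee_heatmap[3][3] = 0.1   # made-up neighbouring likelihood

knee_column, knee_row, knee_likelihood = most_likely_location(knee_heatmap)
```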
  • statistical techniques are used to obtain one or more PDFs.
  • statistical techniques are applied to identify the risk factors for cancer, classify a recorded phoneme, predict whether a subject may have a certain disease based on a subject’s physical information, customize an email spam detection system, classify a tissue sample into one of several cancer classes, or establish the relationship between salary and demographic variables.
  • statistical techniques comprise linear regression, classification, resampling methods, subset selection, shrinkage, dimension reduction, nonlinear models, tree-based methods, support vector machines, and unsupervised learning.
  • linear regression is used as a method to predict a target variable by fitting the best linear relationship between the dependent and independent variables.
  • the best fit means that the sum of all the distances between the shape and the actual observations at each point is the least.
  • Linear regression may comprise simple linear regression and multiple linear regression.
  • the simple linear regression may use a single independent variable to predict a dependent variable.
  • the multiple linear regression may use more than one independent variable to predict a dependent variable by fitting a best linear relationship.
  • a dataset comprises ratings of multiple cereals, the number of grams of sugar contained in each serving, and the number of grams of fat contained in each serving; and a simple linear regression model uses the number of grams of sugar as the independent variable and rating as the dependent variable.
  • a multiple linear regression model uses the number of grams of sugar and the number of grams of fat as the independent variables and rating as the dependent variable.
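The simple linear regression of the cereal example can be sketched with the closed-form least squares solution. The sugar and rating values below are made up for illustration; a multiple linear regression with both sugar and fat would additionally require solving a small linear system and is not shown.

```python
def fit_simple_linear_regression(x, y):
    """Least squares fit of y = intercept + slope * x for a single independent variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: grams of sugar per serving (independent) and rating (dependent).
sugar_grams = [0.0, 3.0, 6.0, 9.0, 12.0]
ratings = [60.0, 52.0, 47.0, 38.0, 33.0]
slope, intercept = fit_simple_linear_regression(sugar_grams, ratings)
predicted_rating = intercept + slope * 5.0   # prediction for 5 g of sugar
```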
  • classification is a data mining technique that assigns categories to a collection of data in order to achieve accurate predictions and analysis.
  • a classification model is used to identify loan applicants as low, medium, or high credit risks.
  • Classification techniques may comprise logistic regression and discriminant analysis. Logistic regression may be used when the dependent variable is dichotomous (binary). Logistic regression may be used to discover and describe the relationship between one dependent binary variable and one or more nominal, ordinal, interval or ratio-level independent variables.
  • discriminant analysis is used where two or more groups, clusters or populations are known a priori and one or more new observations are classified into one of the known populations based on the measured characteristics. For instance, a discriminant model is used to determine employees’ different personality types based on data collected on employees in three different job classifications: 1) customer service personnel; 2) mechanics; and 3) dispatchers. Discriminant analysis may comprise linear discriminant analysis and quadratic discriminant analysis. Linear discriminant analysis may compute “discriminant scores” for each observation to classify what response variable class it is in. Quadratic discriminant analysis may assume that each class has its own covariance matrix.
  • resampling is a method comprising drawing repeated samples from the original data samples.
  • the resampling may not involve the utilization of the generic distribution tables in order to compute approximate probability values.
  • the resampling may generate a unique sampling distribution on the basis of the actual data.
  • the resampling may use experimental methods, rather than analytical methods, to generate the unique sampling distribution.
  • the resampling techniques may comprise bootstrapping and cross-validation. Bootstrapping may be performed by sampling with replacement from the original data and taking the “not chosen” data points as test cases. Cross-validation may be performed by splitting the training data into a plurality of parts.
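The bootstrapping step described above can be sketched as follows. This is a minimal illustration with a hypothetical helper name and a made-up dataset of 20 distinct items; it is not presented as the procedure used in any particular embodiment.

```python
import random

def bootstrap_split(data, rng):
    """Draw len(data) items with replacement; items never drawn form the test set."""
    n = len(data)
    chosen = [rng.randrange(n) for _ in range(n)]
    chosen_set = set(chosen)
    train = [data[i] for i in chosen]
    test = [data[i] for i in range(n) if i not in chosen_set]
    return train, test

rng = random.Random(0)      # fixed seed so the sketch is repeatable
data = list(range(20))      # hypothetical dataset
train, test = bootstrap_split(data, rng)
```

Because sampling is done with replacement, the training sample has the same size as the original data but typically contains repeats, and the test set is whatever was never drawn.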
  • subset selection identifies a subset of predictors related to the response.
  • the subset selection may comprise best-subset selection, forward stepwise selection, backward stepwise selection, and hybrid method.
  • shrinkage fits a model involving all predictors, but the estimated coefficients are shrunken towards zero relative to the least squares estimates. This shrinkage may reduce variance.
  • the shrinkage may comprise ridge regression and the lasso.
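The shrinking of coefficients towards zero can be illustrated for ridge regression with a single predictor, where the penalized slope has a closed form. The data values and the penalty values are hypothetical; the intercept is assumed to be unpenalized.

```python
def ridge_slope(x, y, lam):
    """Slope b minimizing sum((y - a - b*x)^2) + lam * b^2 for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    # lam = 0 gives the least squares slope; a larger lam shrinks it towards zero.
    return sxy / (sxx + lam)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slopes = [ridge_slope(x, y, lam) for lam in (0.0, 10.0, 100.0)]
```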
  • dimension reduction reduces the problem of estimating p + 1 coefficients to the simple problem of M + 1 coefficients, where $M < p$. It may be attained by computing M different linear combinations or projections of the variables. Then these M projections are used as predictors to fit a linear regression model by least squares.
  • Dimension reduction may comprise principal component regression and partial least squares.
  • the principal component regression may be used to derive a low-dimensional set of features from a large set of variables.
  • the principal components used in the principal component regression may capture the most variance in the data using linear combinations of the data in subsequently orthogonal directions.
  • the partial least squares method may be a supervised alternative to principal component regression because partial least squares may make use of the response variable in order to identify the new features.
  • nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables.
  • the nonlinear regression may comprise step function, piecewise function, spline, and generalized additive model.
  • tree-based methods are used for both regression and classification problems.
  • the regression and classification problems may involve stratifying or segmenting the predictor space into a number of simple regions.
  • the tree-based methods may comprise bagging, boosting, and random forest.
  • the bagging may decrease the variance of prediction by generating additional data for training from the original dataset using combinations with repetitions to produce multisets of the same cardinality/size as the original data.
  • the boosting may calculate the output using several different models and then average the result using a weighted average approach.
  • the random forest algorithm may draw random bootstrap samples of the training set.
  • support vector machines are classification techniques listed under supervised learning models in machine learning.
  • the support vector machines may be a constrained optimization problem where the margin is maximized subject to the constraint that it perfectly classifies the data.
  • Unsupervised methods may be methods to draw inferences from datasets comprising input data without labeled responses.
  • the unsupervised methods may comprise clustering, principal component analysis, k-means clustering, and hierarchical clustering.
  • the statistical techniques comprise a Monte Carlo sampling method.
  • the Monte Carlo sampling method may comprise one or more computational algorithms that rely on repeated random sampling to obtain numerical results.
  • the Monte Carlo sampling method may apply to optimization, numerical integration, and generation of draws from a probability distribution.
  • the Monte Carlo sampling method may be applied to stochastic problems by nature, for example, particle transport, telephone and other communication systems, and population studies based on the statistics of survival and reproduction.
  • the Monte Carlo sampling method may also be applied to deterministic problems by nature, for example, the evaluation of integrals, solving the systems of algebraic equations, and solving partial differential equations.
  • the Monte Carlo sampling method may comprise the following steps: 1) defining a domain of possible inputs; 2) generating inputs randomly from a probability distribution over the domain; 3) performing a deterministic computation on the inputs; and 4) aggregating the results.
  • the Monte Carlo sampling method may comprise: 1) PDFs by which a physical (or mathematical) system is described; 2) random number generator, which means a source of random numbers uniformly distributed on the unit interval that are available; 3) sampling rule demonstrating a prescription for sampling from the specified PDF, assuming the availability of random numbers on the unit interval; 4) scoring (or tallying), whereby the outcomes may be accumulated into overall tallies or scores for the quantities of interest; 5) error estimation, typically shown as a function of the number of trials and other quantities; 6) variance reduction techniques, comprising methods for reducing the variance in the estimated solution to reduce the computational time for Monte Carlo simulation; and 7) parallelization and vectorization, including efficient use of advanced computer architectures.
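The four generic Monte Carlo steps (define a domain, draw random inputs, compute deterministically, aggregate) can be illustrated with the classic estimate of pi. This example is unrelated to joint detection and is included only to show the pattern; the function name and sample count are arbitrary.

```python
import random

def estimate_pi(n_samples, rng):
    """Estimate pi from the fraction of random points that fall inside a quarter circle."""
    inside = 0
    for _ in range(n_samples):
        # Steps 1 and 2: the domain is the unit square; draw a point uniformly from it.
        x, y = rng.random(), rng.random()
        # Step 3: deterministic computation on the drawn input.
        if x * x + y * y <= 1.0:
            inside += 1
    # Step 4: aggregate the results (quarter-circle area is pi / 4).
    return 4.0 * inside / n_samples

pi_estimate = estimate_pi(100_000, random.Random(0))
```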
  • the locations of objects of interest are approximated by the Monte Carlo sampling method.
  • the objects of interest comprise one or more devices, the locations of which are used for analysis of usage, marketing, or other financial or business purposes.
  • the one or more devices may comprise any type of device, for example, but not limited to, consumer electronics, telecommunication devices, office devices, agricultural devices, lights, household equipment, safety equipment, or medical equipment.
  • the consumer electronics may comprise TVs, photo equipment and accessories, cameras (video or film), speaker, radio / hi-fi systems, or video projectors.
  • the telecommunication devices may comprise mobile phones, modems, router, phone cards, or telephones.
  • the office devices may comprise shredders, faxes, copiers, projectors, cutting machine, and typewriters.
  • the agricultural devices may comprise tractor, cultivator, chisel plow, harrow, subsoiler, rotator, roller, trowel, seed drill, liquid manure spreader, sprayer, sprinkler system, produce sorter, farm truck, grain dryer, conveyor belt, mower, hay rake, bulk tank, milking machine, grinder-mixture, or livestock trailer.
  • the household devices may comprise cooler, blender, fan, refrigerator, heater, oven, air-conditioner, dishwasher, washer and dryer, vacuum cleaner, and microwave.
  • the safety equipment may comprise rescue equipment, carbon monoxide detector, surveillance cameras, and surveillance monitors.
  • the medical equipment may comprise stethoscope, suction device, thermometer, tongue depressor, transfusion kit, tuning fork, ventilator, watch, stopwatch, weighing scale, crocodile forceps, bedpan, cannula, cardioverter, defibrillator, catheter, dialyzer, electrocardiograph machine, enema equipment, endoscope, gas cylinder, gauze sponge, hypodermic needle, syringe, infection control equipment, an oximeter or oximeters that monitors oxygen levels of the user, instrument sterilizer, kidney dish, measuring tape, medical halogen penlight, nasogastric tube, nebulizer, ophthalmoscope, otoscope, oxygen mask and tubes, pipette, dropper, proctoscope, reflex hammer, and sphygmomanometer.
  • the objects of interest comprise transportation systems, the locations of which are used for analysis of transportation and infrastructure.
  • the transportation system may comprise, by way of non-limiting examples, an aircraft, airplane, automobile, battleship, bus, bullet train, bike, cab, canoe, cargo ship, compact car, truck, elevated railroad, ferry, fishing boat, jet boat, kayak, limo, minibus, minivan, sail boat, school bus, tank, train, van, or yacht.
  • the objects of interest comprise organs of a subject.
  • the subject may be any living beings, for example, amphibians, reptiles, birds, mammals, fishes, insects, spiders, crabs, or snails.
  • the organ may include, by way of non-limiting examples, mouth, tongue, stomach, liver, pancreas, small intestine, large intestine, pharynx, lungs, kidney, uterus, heart, eye, ear, bones, joints, and skin.
  • the objects of interest comprise tissue of a subject.
  • a tissue may be a sample that is healthy, benign, or otherwise free of a disease.
  • a tissue may be a sample removed from a subject, such as a tissue biopsy, a tissue resection, an aspirate (such as a fine needle aspirate), a tissue washing, a cytology specimen, a bodily fluid, or any combination thereof.
  • a tissue may comprise neurons.
  • a tissue may comprise brain tissue, spinal tissue, or a combination thereof.
  • a tissue may comprise cells representative of a blood-brain barrier.
  • a tissue may comprise a breast tissue, bladder tissue, kidney tissue, liver tissue, colon tissue, thyroid tissue, cervical tissue, prostate tissue, lung tissue, heart tissue, muscle tissue, pancreas tissue, anal tissue, bile duct tissue, a bone tissue, uterine tissue, ovarian tissue, endometrial tissue, vaginal tissue, vulvar tissue, stomach tissue, ocular tissue, nasal tissue, sinus tissue, penile tissue, salivary gland tissue, gut tissue, gallbladder tissue, gastrointestinal tissue, bladder tissue, brain tissue, spinal tissue, or a blood sample.
  • the objects of interest comprise small units of ordinary matter, the locations of which are used for scientific research.
  • the small units of ordinary matter may comprise atom, nucleus, electrons, neutrons, protons, and ions.
  • the Monte Carlo sampling method is used to approximate the PDF for joint location from a single heatmap.
  • the heatmap $Y_i$ may comprise dimensions N x N for joint i.
  • N may be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater. In other embodiments, N may be at most 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or less.
  • the joint may comprise hand joints, elbow joints, wrist joints, axillary articulations, sternoclavicular joints, vertebral articulations, temporomandibular joints, sacroiliac joints, hip joints, knee joints, and articulations of foot.
  • the heatmap $Y_1$ can represent the likelihood of hand joint locations; and the heatmap $Y_2$ can represent the likelihood of elbow joint locations.
  • the heatmap can be used to approximate the distribution function p(x;
  • the Monte Carlo sampling method may be used to approximate the distribution of joint locations by alternately sampling from rows and columns of a heatmap.
  • the mean joint location can be approximated by the following steps.
  • the mean location $\mu_i$ for joint i is designated to be calculated.
  • the joint may comprise hand joints, elbow joints, wrist joints, axillary articulations, sternoclavicular joints, vertebral articulations, temporomandibular joints, sacroiliac joints, hip joints, knee joints, and articulations of foot.
  • $x^1 = [x_1^1, x_2^1]$ is initialized by sampling from a uniform distribution.
  • distributions other than the uniform distribution can be used to initialize $x^1$.
  • the other distributions include, but are not limited to, the Bernoulli distribution, the Rademacher distribution, the binomial distribution, the beta-binomial distribution, the degenerate distribution, the discrete uniform distribution, the hypergeometric distribution, the Poisson binomial distribution, the Fisher's noncentral hypergeometric distribution, Wallenius' noncentral hypergeometric distribution, the beta negative binomial distribution, the Boltzmann distribution, the Gibbs distribution, the Maxwell-Boltzmann distribution, the Borel distribution, the extended negative binomial distribution, the extended hypergeometric distribution, the generalized log-series distribution, the geometric distribution, the logarithmic (series) distribution, the negative binomial distribution, the discrete compound Poisson distribution, the parabolic fractal distribution, the Poisson distribution, the Conway-Maxwell-Poisson distribution, the zero-truncated Poisson distribution, the Polya-Eggenberger distribution, the Skellam distribution, the zeta distribution, the Zipf distribution, the Behrens-Fisher distribution, the Cauchy distribution, the Chernoff's distribution, the exponentially modified Gaussian distribution, the Fisher's z-distribution, the skewed generalized t-distribution, the generalized logistic distribution, the generalized normal distribution, the geometric stable distribution, the Gumbel distribution, the Holtsmark distribution, the hyperbolic distribution, the hyperbolic secant distribution, the Johnson SU distribution, the Landau distribution, the Laplace distribution, the Levy skew alpha-stable distribution, the Linnik distribution, the logistic distribution, the map-Airy distribution, the normal distribution, the normal-exponential-gamma distribution, the normal-inverse Gaussian distribution, the Pearson Type IV distribution, the skew normal distribution, the Student's t-distribution, useful for estimating unknown means of Gaussian populations, the noncentral t-distribution, the skew t-distribution, the Champernowne distribution, the type-1 Gumbel distribution, the Tracy-Widom distribution, the Voigt distribution, the beta prime distribution, the Birnbaum-Saunders distribution, the chi distribution, the noncentral chi distribution, the chi-squared distribution, the inverse-chi-squared distribution, the noncentral chi-squared distribution, the scaled inverse chi-squared distribution, the Dagum distribution, the exponential distribution, the exponential-logarithmic distribution, the F-distribution, the noncentral F-distribution, the folded normal distribution, the Frechet distribution, the Gamma distribution, the Erlang distribution, the inverse-gamma distribution, the generalized gamma distribution, the generalized Pareto distribution, the Gamma/Gompertz distribution, the Gompertz distribution, the half-normal distribution, the Hotelling's T-squared distribution, the inverse Gaussian distribution, the Levy distribution, the log-Cauchy distribution, the log-Laplace distribution, the log-logistic distribution, and the Lomax distribution.
  • $x_1$ may represent a column in a heatmap, and $x_2$ may represent a row in a heatmap.
  • $x_1$ may have a value ranging from 1 to 8.
  • $x_2$ may have a value ranging from 1 to 7.
  • the process of initialization may be the assignment of an initial value for a variable, e.g., $x^1$.
  • the uniform distribution may be the continuous uniform distribution.
  • the continuous uniform distribution may be a family of symmetric probability distributions such that, for each member of the family, all intervals of the same length on the distribution's support are equally probable.
  • the t represents the time.
  • T is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater.
  • T is at most 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or less.
  • categorical means a categorical distribution, which is a discrete probability distribution that describes the possible results of a random variable that can take on one of multiple possible categories, with the probability of each category separately specified.
  • the categorical distribution is the generalization of the Bernoulli distribution for a categorical random variable, i.e. for a discrete variable with more than two possible outcomes, such as the roll of a die.
  • the categorical distribution is a special case of the multinomial distribution, in that it gives the probabilities of potential outcomes of a single drawing rather than multiple drawings.
  • the distribution of $x_1^{t+1}$ is $p(x_1 \mid x_2^t)$.
  • $x_2^t$ shows a row of the heatmap at the time t.
  • the distribution of $x_2^{t+1}$ is $p(x_2 \mid x_1^{t+1})$.
  • $x_1^{t+1}$ shows a column of the heatmap at the time t+1.
  • the expectation value and covariance value may be calculated.
  • the equations for calculating expectation value and covariance value may be
  • the process of Monte Carlo sampling method may start from heatmap 302.
  • the likelihood of the location may be used to calculate the next move of the dot, $x_1^{t+1}$, by using a distribution parametrized by row 3 of the heatmap 302, which is the equation shown in step three.
  • the likelihood of the location may be used to calculate the next move of the dot by using a distribution parametrized by column 4 of the heatmap 304:
  • the next move of the dot may be at column 4 and row 4, as shown in heatmap 306.
  • the process may be continued.
  • the move of the dot shown in heatmap 308 may be at column 3 and row 4;
  • the move of the dot shown in heatmap 310 may be at column 3 and row 3;
  • the move of the dot shown in heatmap 312 may be at column 2 and row 3;
  • the move of the dot shown in heatmap 314 may be at column 2 and row 6;
  • the move of the dot shown in heatmap 316 may be at column 1 and row 6.
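The alternating row/column walk illustrated above can be sketched as follows. This is a minimal sketch under stated assumptions: indices are 0-based, the heatmap entries are strictly positive (so every row and column yields a valid categorical distribution), the expectation and covariance are taken as the plain sample mean and sample covariance of the visited locations, and the 5 x 5 heatmap values and function names are hypothetical.

```python
import random

def sample_joint_locations(heatmap, n_steps, rng):
    """Alternately resample the column (given the row) and the row (given the column)."""
    n_rows, n_cols = len(heatmap), len(heatmap[0])
    # Initialization from a discrete uniform distribution over the grid.
    col, row = rng.randrange(n_cols), rng.randrange(n_rows)
    samples = []
    for _ in range(n_steps):
        # The column distribution is parametrized by the current row of the heatmap.
        col = rng.choices(range(n_cols), weights=heatmap[row])[0]
        # The row distribution is parametrized by the newly drawn column.
        column_weights = [heatmap[r][col] for r in range(n_rows)]
        row = rng.choices(range(n_rows), weights=column_weights)[0]
        samples.append((col, row))
    return samples

def mean_and_covariance(samples):
    """Sample mean and 2 x 2 sample covariance of the (column, row) locations."""
    t = len(samples)
    mean = [sum(s[k] for s in samples) / t for k in range(2)]
    cov = [[sum((s[a] - mean[a]) * (s[b] - mean[b]) for s in samples) / t
            for b in range(2)] for a in range(2)]
    return mean, cov

# Hypothetical heatmap: a small positive floor with a peak at column 3, row 1.
heatmap = [[0.01] * 5 for _ in range(5)]
heatmap[1][3] = 0.8
heatmap[1][2] = 0.1
heatmap[2][3] = 0.1

samples = sample_joint_locations(heatmap, 2000, random.Random(0))
mean, cov = mean_and_covariance(samples)
```

The mean lands near the peak and the covariance gives a measure of how spread out the detection is, which is the uncertainty information a single arg-max location would discard.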
  • $\psi_k$ can be a single element in a row or column vector $Y$ of $K$ elements.
  • vector elements can be first normalized by $\psi_k \leftarrow \psi_k / \sum_{j=1}^{K} \psi_j$.
  • the discrete values can be computed with $\arg\max_k (\log \psi_k + g_k)$, where $\{g_k\}$ are independent and identically drawn from the Gumbel distribution Gumbel(0, 1).
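Drawing a discrete value by adding Gumbel(0, 1) noise to normalized log-weights is commonly known as the Gumbel-max trick. The sketch below assumes that reading of the passage above; the function names and the example weights are hypothetical, and zero-weight entries are simply skipped because their log is undefined.

```python
import math
import random

def sample_gumbel(rng):
    """Draw from Gumbel(0, 1) by inverse transform of a uniform sample."""
    u = rng.random()
    while u == 0.0:          # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def gumbel_max_sample(weights, rng):
    """Return index k with probability weights[k] / sum(weights)."""
    total = sum(weights)
    best_k, best_score = None, -math.inf
    for k, w in enumerate(weights):
        if w <= 0.0:
            continue         # zero-probability category can never be selected
        score = math.log(w / total) + sample_gumbel(rng)
        if score > best_score:
            best_k, best_score = k, score
    return best_k

# Hypothetical row of a heatmap (unnormalized) and empirical selection frequencies.
row_weights = [1.0, 2.0, 7.0, 0.0]
rng = random.Random(0)
n_draws = 20000
counts = [0] * len(row_weights)
for _ in range(n_draws):
    counts[gumbel_max_sample(row_weights, rng)] += 1
frequencies = [c / n_draws for c in counts]
```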
  • the function is relevant to the locations of one or more devices.
  • the function may represent the movement of the device at different times, the relative location of two or more devices at the same time, or the ratio of usage of one or more devices at the same time.
  • the function is relevant to one or more organs of a same subject. In this situation, the function may represent the distance between different organs.
  • the function is relevant to one or more organs of different subjects. In this situation, the function may represent the distance between different organs of different subjects, the relationship between different subjects, and the proximity of different subjects.
  • the function is relevant to locations of small units of ordinary matter. In this situation, the function may represent the movement of small units of ordinary matter, or the relative locations of small units of ordinary matter.
  • the Monte Carlo sampling method can be used to approximate the PDF of any function that takes joint locations as input.
  • the following steps show how to compute the mean and variance of joint angle given individual detection heatmaps for each joint.
  • $x_A = [x_1, x_2]$, $x_B = [x_3, x_4]$, and $x_C = [x_5, x_6]$ are the 2D locations of joints A, B, and C, each initialized by sampling from a uniform distribution.
  • $x_1$ represents a column in a heatmap, and $x_2$ represents a row in a heatmap.
  • $x_1$ may have a value ranging from 1 to 8.
  • $x_2$ may have a value ranging from 1 to 7.
  • the process of initialization may be the assignment of an initial value for a variable, e.g., $x_A$.
  • $x_B = [x_3, x_4]$.
  • $x_3$ represents a column in a heatmap.
  • $x_4$ represents a row in a heatmap.
  • $x_5$ represents a column in a heatmap.
  • $x_6$ represents a row in a heatmap.
  • the uniform distribution may be the continuous uniform distribution.
  • the continuous uniform distribution may be a family of symmetric probability distributions such that, for each member of the family, all intervals of the same length on the distribution's support are equally probable.
  • the t represents the time.
  • T is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater.
  • T is at most 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or less.
  • categorical is a categorical distribution, which is a discrete probability distribution that describes the possible results of a random variable that can take on one of multiple possible categories, with the probability of each category separately specified.
  • the categorical distribution is the generalization of the Bernoulli distribution for a categorical random variable, i.e. for a discrete variable with more than two possible outcomes, such as the roll of a die.
  • the categorical distribution is a special case of the multinomial distribution, in that it gives the probabilities of potential outcomes of a single drawing rather than multiple drawings.
  • the distribution of $x_1^{t+1}$ is $p(x_1 \mid x_2^t)$.
  • $x_2^t$ means a row of the heatmap at the time t.
  • the distribution of $x_2^{t+1}$ is $p(x_2 \mid x_1^{t+1})$.
  • $x_1^{t+1}$ means a column of the heatmap at the time t+1. In some embodiments, the distribution of $x_3^{t+1}$ is $p(x_3 \mid x_4^t)$, which shows the likelihood of $x_3$ under the condition of $x_4^t$. In some embodiments, $x_4^t$ means a row of the heatmap $Y_B$ at the time t. In some embodiments, the distribution of $x_4^{t+1}$ is $p(x_4 \mid x_3^{t+1})$.
  • $x_3^{t+1}$ means a column of the heatmap $Y_B$ at the time t+1. In some embodiments, the distribution of $x_5^{t+1}$ is $p(x_5 \mid x_6^t)$.
  • expectation value and covariance value may be calculated.
  • the equations for calculating expectation value and covariance value may be
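Once locations have been sampled for joints A, B, and C, the mean and variance of a joint angle can be estimated by evaluating the angle on each triple of samples. The sketch below assumes the angle of interest is the unsigned angle at joint B between the segments B-A and B-C, and it uses short made-up lists of (column, row) samples rather than samples drawn from real heatmaps.

```python
import math

def joint_angle(a, b, c):
    """Unsigned angle in degrees at joint b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.degrees(math.atan2(abs(cross), dot))

def angle_mean_and_variance(samples_a, samples_b, samples_c):
    """Evaluate the angle on each sampled triple, then take the sample mean and variance."""
    angles = [joint_angle(a, b, c)
              for a, b, c in zip(samples_a, samples_b, samples_c)]
    mean = sum(angles) / len(angles)
    variance = sum((t - mean) ** 2 for t in angles) / len(angles)
    return mean, variance

# Hypothetical (column, row) samples for hip (A), knee (B), and ankle (C).
samples_a = [(5, 1), (5, 1), (4, 1), (5, 2)]
samples_b = [(4, 3), (4, 3), (4, 3), (4, 3)]
samples_c = [(6, 5), (6, 5), (6, 5), (5, 5)]
angle_mean, angle_variance = angle_mean_and_variance(samples_a, samples_b, samples_c)
```

The same pattern applies to any function of joint locations: evaluate the function per sample, then summarize the resulting values.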
  • the proposed framework can be applied to explore the
  • the number of different heatmaps is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater. In other embodiments, the number of different heatmaps is at most 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or less. In some embodiments, the relationships between different heatmaps can be represented by a function. In some embodiments,
  • the function is relevant to the locations of one or more devices. In this situation, the function may represent the movement of the device at different times, the relative location of
  • the samples generated by the method disclosed herein can serve as both input and output for a Gaussian process regressor (GPR), which, for example, is described in RASMUSSEN, C. E. Gaussian processes in machine learning. In Advanced lectures on machine learning. Springer, 2004, pp. 63-71.
  • N is the number of frames
  • a special kernel that takes into account output uncertainty can be used.
  • FIGs. 5A-5B represent the Gaussian process regression with known output variance.
  • FIG. 5A shows an example of a video of a subject performing a leg exercise. In this figure, the subject is moving the left leg. The joints of interest are hip joints 502, knee joints 504, and ankle joints 506.
  • FIG. 5B shows graphs of the left knee's joint angle across time. FIG. 5B shows examples of actual results of applying GPR with the above kernel on a real-world video.
  • the line 508 shows the predictive means of joint angles
  • the multiple dots, such as, for example, dot 510, are the samples, with vertical lines 512 proportional to the values $\sigma_t$.
  • Line 514 shows the joint angles computed using markers on the subject.
  • samples $X = \{(x_1, \sigma_1), \ldots, (x_N, \sigma_N)\}$ (computed with the method disclosed elsewhere herein) can be used as inputs and actual joint angles can be used as outputs.
  • Gaussian processes that can model datasets with uncertain inputs are introduced, for example, in DAMIANOU, A. C., TITSIAS, M. K., AND LAWRENCE, N. D. Variational inference for latent variables and uncertain inputs in Gaussian processes. The Journal of Machine Learning Research 17, 1 (2016), 1425-1486, and in MCHUTCHON, A., AND RASMUSSEN, C. E. Gaussian process training with input noise. In Advances in Neural Information Processing Systems (2011), pp. 1341-1349.
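Gaussian process regression with a known, per-sample output variance can be sketched as follows. This is a minimal illustration and not the kernel of the disclosure, which is not reproduced in this text: it assumes a zero-mean prior, a squared-exponential (RBF) kernel, and output uncertainty entering as a diagonal term added to the training covariance. The names `rbf_kernel` and `gpr_known_noise`, the hyperparameters, and the synthetic joint-angle data are all hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)


def gpr_known_noise(t_train, y_train, noise_var, t_test,
                    length_scale=1.0, signal_var=1.0):
    """Posterior mean and variance of a zero-mean GP with per-sample noise."""
    # Assumption: known output variances sit on the diagonal of the covariance.
    k_train = rbf_kernel(t_train, t_train, length_scale, signal_var) + np.diag(noise_var)
    k_cross = rbf_kernel(t_test, t_train, length_scale, signal_var)
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_cross @ alpha
    v = np.linalg.solve(chol, k_cross.T)
    var = signal_var - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off


# Synthetic example: noisy joint angles (degrees, mean-centred) over N = 40 frames.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 40)
sigma = rng.uniform(2.0, 8.0, size=t.size)          # per-frame standard deviation
angles = 30.0 * np.sin(t) + rng.normal(0.0, sigma)  # observed samples
t_grid = np.linspace(0.0, 2.0 * np.pi, 100)
pred_mean, pred_var = gpr_known_noise(t, angles, sigma**2, t_grid,
                                      length_scale=1.0, signal_var=900.0)
```

Samples with a larger variance pull the predictive mean less strongly, which is the intended effect of supplying the output uncertainty to the regressor.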
  • FIG. 6 shows an example of a computer-based method for locating an object of interest or factor of interest within a video comprising a plurality of frames.
  • the method comprises: inputting the video 602 into a machine learning algorithm 604; generating a heatmap 606 from a frame of the plurality of frames with the machine learning algorithm, wherein the heatmap provides a likelihood of a presence of the object of interest or factor of interest at each of a plurality of locations within the frame; and analyzing the heatmap using a statistical technique 608 and 610 thereby locating 612 the object of interest or factor of interest within the video.
  • the statistical techniques may comprise Monte Carlo sampling 608 and Bayesian modeling 610.
  • FIG. 7 shows an example in which the heatmaps are used to construct a probabilistic model.
  • the heatmaps can be generated from a plurality of video frames 702.
  • the heatmaps can be used as proposal distributions. After Monte Carlo sampling, the heatmaps can be used to construct complex probabilistic models 704 over arbitrary factors.
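Drawing Monte Carlo samples from a heatmap and summarizing them by an expectation and a covariance can be sketched as below. The sketch makes a simplifying assumption: the normalized heatmap is sampled directly as a categorical distribution over pixel locations, rather than being used inside a fuller importance-sampling scheme. The helper names and the synthetic Gaussian-blob heatmap are hypothetical stand-ins for a network output.

```python
import numpy as np


def gaussian_blob(shape, center, std):
    """Synthetic heatmap: an isotropic Gaussian bump centred at `center` (row, col)."""
    rows, cols = np.indices(shape)
    sq_dist = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-0.5 * sq_dist / std**2)


def sample_locations(heatmap, n_samples, rng):
    """Draw (row, col) samples with probability proportional to the heatmap values."""
    h = np.asarray(heatmap, dtype=float)
    if h.ndim != 2 or np.any(h < 0) or h.sum() <= 0:
        raise ValueError("heatmap must be a 2-D array of non-negative values")
    probs = (h / h.sum()).ravel()
    flat = rng.choice(probs.size, size=n_samples, p=probs)
    rows, cols = np.unravel_index(flat, h.shape)
    return np.stack([rows, cols], axis=1).astype(float)


def mean_and_cov(samples):
    """Sample expectation and covariance of the drawn locations."""
    return samples.mean(axis=0), np.cov(samples, rowvar=False)


rng = np.random.default_rng(0)
heatmap = gaussian_blob((64, 64), center=(20, 40), std=3.0)
samples = sample_locations(heatmap, 5000, rng)
mean, cov = mean_and_cov(samples)
```

The mean gives a sub-pixel location estimate, and the covariance gives a measure of how diffuse the heatmap is around that estimate.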
  • FIG. 8 shows an exemplary embodiment of a method 800 for locating an object of interest or factor of interest within a video comprising a plurality of frames.
  • a video is inputted into a machine learning algorithm.
  • the machine learning algorithm is used to generate a heatmap from a frame of the plurality of frames.
  • a statistical technique is employed to analyze the heatmap to locate the object of interest or factor of interest within the video.
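One way to picture the statistical analysis step across frames is a grid-based Bayesian update, sketched below. This is an illustrative assumption and not the model of the disclosure: the belief about the location in one frame is blurred by a hypothetical motion kernel to form a prediction, then multiplied by the heatmap of the next frame and renormalized. All names, sizes, and the two-peak heatmap are invented for the example.

```python
import numpy as np


def gaussian_blob(shape, center, std):
    """Synthetic heatmap: an isotropic Gaussian bump centred at `center` (row, col)."""
    rows, cols = np.indices(shape)
    sq_dist = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-0.5 * sq_dist / std**2)


def blur(belief, kernel):
    """Separable 2-D convolution: apply the 1-D kernel along rows, then columns."""
    out = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, belief)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, out)


def bayes_step(prior, heatmap, kernel):
    """Predict with the motion kernel, then weight by the heatmap likelihood."""
    posterior = blur(prior, kernel) * heatmap
    total = posterior.sum()
    if total <= 0:
        raise ValueError("posterior has no probability mass")
    return posterior / total


shape = (32, 32)
offsets = np.arange(-4, 5)
kernel = np.exp(-0.5 * (offsets / 2.0) ** 2)   # hypothetical motion model
kernel /= kernel.sum()

prior = gaussian_blob(shape, (10, 10), 1.5)    # belief from the previous frame
prior /= prior.sum()

# Next-frame heatmap with a nearby true peak and a stronger distant distractor.
heatmap_next = 0.8 * gaussian_blob(shape, (12, 11), 1.5) \
    + 1.0 * gaussian_blob(shape, (25, 25), 1.5)

posterior = bayes_step(prior, heatmap_next, kernel)
raw_peak = np.unravel_index(np.argmax(heatmap_next), shape)
filtered_peak = np.unravel_index(np.argmax(posterior), shape)
```

In this example the raw heatmap maximum falls on the distractor, while the posterior maximum stays near the location that is consistent with the previous frame.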
  • FIG. 9 shows an exemplary embodiment of a system as described herein comprising a device such as a digital processing device 901.
  • the digital processing device 901 includes a software application configured to monitor the physical parameters of an individual.
  • the digital processing device 901 may include a central processing unit (“CPU,” also “processor” and “computer processor” herein) 905, which can be a single-core or multi-core processor, or a plurality of processors for parallel processing.
  • the digital processing device 901 also includes either memory or a memory location 910 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 915 (e.g., hard disk), communication interface 920 (e.g., network adapter, network interface) for communicating with one or more other systems, and peripheral devices, such as a cache.
  • the peripheral devices can include storage device(s) or storage medium(s) 965 which communicate with the rest of the device via a storage interface 970.
  • the memory 910, storage unit 915, interface 920 and peripheral devices are configured to communicate with the CPU 905 through a communication bus 925, such as a motherboard.
  • the digital processing device 901 can be operatively coupled to a computer network (“network”) 930 with the aid of the communication interface 920.
  • the network 930 can comprise the Internet.
  • the network 930 can be a telecommunication and/or data network.
  • the digital processing device 901 includes input device(s) 945 to receive information from a user, the input device(s) in communication with other elements of the device via an input interface 950.
  • the digital processing device 901 can include output device(s) 955 that communicates to other elements of the device via an output interface 960.
  • the CPU 905 is configured to execute machine-readable instructions embodied in a software application or module.
  • the instructions may be stored in a memory location, such as the memory 910.
  • the memory 910 may include various components (e.g., machine readable media) including, by way of non-limiting examples, a random-access memory (“RAM”) component (e.g., a static RAM “SRAM”, a dynamic RAM “DRAM”, etc.), or a read-only memory (“ROM”) component.
  • a basic input/output system (BIOS), including basic routines that help to transfer information between elements within the digital processing device, such as during device start-up, may be stored in the memory 910.
  • the storage unit 915 can be configured to store files, such as health or risk parameter data (e.g., individual health or risk parameter values, health or risk parameter value maps, value groups, movement of individuals, and individual medical histories).
  • the storage unit 915 can also be used to store an operating system, application programs, and the like.
  • storage unit 915 may be removably interfaced with the digital processing device (e.g., via an external port connector (not shown)) and/or via a storage unit interface.
  • Software may reside, completely or partially, within a computer-readable storage medium within or outside of the storage unit 915. In another example, software may reside, completely or partially, within processor(s) 905.
  • Information and data can be displayed to a user through a display 935.
  • the display is connected to the bus 925 via an interface 940, and transport of data between the display and other elements of the device 901 can be controlled via the interface 940.
  • Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the digital processing device 901, such as, for example, on the memory 910 or electronic storage unit 915.
  • the machine executable or machine-readable code can be provided in the form of a software application or software module.
  • the code can be executed by the processor 905.
  • the code can be retrieved from the storage unit 915 and stored on the memory 910 for ready access by the processor 905.
  • the electronic storage unit 915 can be precluded, and machine-executable instructions are stored on memory 910.
  • a remote device 902 is configured to communicate with the digital processing device 901, and may comprise any mobile computing device, non-limiting examples of which include a tablet computer, laptop computer, smartphone, or smartwatch.
  • the remote device 902 is a smartphone of the user that is configured to receive information from the digital processing device 901 of the device or system described herein in which the information can include a summary, sensor data, or other data.
  • the remote device 902 is a server on the network configured to send and/or receive data from the device or system described herein.
  • a computer-based method for locating an object of interest or factor of interest within a video comprising a plurality of frames comprises: inputting the video into a machine learning algorithm; generating a heatmap from a frame of the plurality of frames with the machine learning algorithm, wherein the heatmap provides a likelihood of a presence of the object of interest or factor of interest at each of a plurality of locations within the frame; and analyzing the heatmap using a statistical technique thereby locating the object of interest or factor of interest within the video.
  • the number of the plurality of frames is at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater. In other embodiments, the number of the plurality of frames is at most 10, 9, 8, 7, 6, 5, 4, 3, 2, or less.
  • the video may be obtained through a device.
  • the device may be an electronic device.
  • the electronic device may comprise a portable electronic device.
  • the electronic devices may be mobile phones, PCs, tablets, printers, consumer electronics, and appliances.
  • the machine learning algorithm comprises a DNN.
  • the machine learning algorithm comprises decision tree learning, association rule learning, ANN, inductive logic programming, support vector machines, clustering, Bayesian networks, reinforcement learning, representation learning, similarity and metric learning, sparse dictionary learning, genetic algorithm, and rule-based machine learning.
  • the heatmap comprises data from hidden layers of the DNN.
  • the number of the hidden layers is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater. In other embodiments, the number of the hidden layers is at most 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or less.
  • the DNN comprises VGG-19.
  • the VGG-19 comprises 19 weight layers (16 convolutional layers and 3 fully connected layers) with uniform architecture.
  • the DNN comprises VGG-16, AlexNet, ZFNet, GoogleNet/Inception, MobileNet, and ResNet.
  • the heatmap identifies a likelihood that multiple objects of interest are located at locations within the frame.
  • the multiple objects of interest may comprise joints of a subject, cars, tracing devices, sensors, and people of interest.
  • the statistical technique comprises Monte Carlo Sampling.
  • the Monte Carlo Sampling is used to sample the likelihood of the presence of the object of interest or factor of interest for at least one of each of the plurality of locations within the frame.
  • the Monte Carlo Sampling is used to sample the likelihood of the presence of the object of interest or factor of interest for at least 2, 3, 4, 5, 6, or greater of each of the plurality of locations within the frame.
  • the statistical technique further comprises Bayesian modeling.
  • the Bayesian modeling is used to model a change in a location of the object of interest or factor of interest within the frame to a different location of the object of interest or factor of interest within a different frame of the plurality of frames.
  • the Bayesian Modeling represents a set of variables and their conditional dependencies.
  • the number of variables is at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater. In other embodiments, the number of variables is at most about 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or less.
  • the method comprises identifying a position of the object of interest or factor of interest within the frame relative to a different object of interest or factor of interest within the frame.
  • the position of the object of interest or factor of interest within the frame is expressed as an angle.
  • the position of the object of interest or factor of interest within the frame is expressed as a distance, a ratio, a code, or a function.
  • the object of interest or factor of interest comprises a joint of a body of an individual.
  • the joints comprise hand joints, elbow joints, wrist joints, axillary articulations, sternoclavicular joints, vertebral articulations, temporomandibular joints, sacroiliac joints, hip joints, knee joints, and articulations of foot.
  • the joint comprises a shoulder, elbow, hip, knee, or ankle.
  • the video captures the individual within the frame. In some embodiments, the video captures movement of the joint from the frame to a different frame within the plurality of frames. In some embodiments, the movement of the joint from the frame to a different frame within the plurality of frames is measured relative to a different joint of the body of the individual and is expressed as an angle. In some embodiments, the angle is used by a healthcare provider to evaluate the individual.
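Expressing the position of one joint relative to others as an angle can be sketched as follows. The example computes the angle at a middle joint (for instance the knee) from three 2-D keypoints (hip, knee, ankle) and, as an assumption about how sampled locations might be used, evaluates the angle over perturbed samples to obtain a mean and a spread. The function name, the coordinates, and the noise level are hypothetical.

```python
import numpy as np


def joint_angle_deg(a, b, c):
    """Angle in degrees at vertex b, formed by the segments b->a and b->c."""
    ba = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bc = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    norm = np.linalg.norm(ba) * np.linalg.norm(bc)
    if norm == 0:
        raise ValueError("joint locations must be distinct from the vertex")
    cos_angle = np.clip(np.dot(ba, bc) / norm, -1.0, 1.0)  # guard against round-off
    return float(np.degrees(np.arccos(cos_angle)))


# Illustrative keypoints in image coordinates (arbitrary units).
hip, knee, ankle = np.array([0.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])
bent = joint_angle_deg(hip, knee, ankle)              # right angle at the knee
straight = joint_angle_deg(hip, knee, [0.0, 2.0])     # fully extended leg

# Propagate location uncertainty by evaluating the angle on sampled locations.
rng = np.random.default_rng(0)
n = 1000
hip_s = hip + rng.normal(0.0, 0.01, size=(n, 2))
knee_s = knee + rng.normal(0.0, 0.01, size=(n, 2))
ankle_s = ankle + rng.normal(0.0, 0.01, size=(n, 2))
angles = np.array([joint_angle_deg(h, k, a) for h, k, a in zip(hip_s, knee_s, ankle_s)])
angle_mean, angle_std = angles.mean(), angles.std()
```

Reporting `angle_std` alongside `angle_mean` gives a reader of the measurement an indication of how much the angle depends on the uncertainty of the joint locations.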
  • a computer based system for locating an object of interest or factor of interest within a video comprising a plurality of frames comprises a processor; a non-transitory medium comprising a computer program configured to cause the processor to: input the video into a machine learning algorithm; generate a heatmap from a frame of the plurality of frames using the machine learning algorithm, wherein the heatmap provides a likelihood of a presence of the object of interest or factor of interest at each of a plurality of locations within the frame; and analyze the heatmap using a statistical technique thereby locating the object of interest or factor of interest within the video.
  • the processor comprises a central processing unit (“CPU”), which can be a single-core or multi-core processor, or a plurality of processors for parallel processing.
  • the computer-based system further comprises one or more non-transitory computer readable storage media encoded with a program including instructions executable by the operating system to process image or video data.
  • a computer readable storage medium includes, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, cloud computing systems and services, and the like.
  • the program and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.
  • the computer-based system includes and/or utilizes one or more databases.
  • suitable databases include, by way of non-limiting examples, relational databases, non-relational databases, object-oriented databases, object databases, entity-relationship model databases, associative databases, and XML databases. Further non-limiting examples include SQL, PostgreSQL, MySQL, Oracle, DB2, and Sybase.
  • a database is internet-based.
  • a database is web-based.
  • a database is cloud computing-based.
  • a database is based on one or more local computer storage devices.
  • the machine learning algorithm comprises a DNN.
  • the machine learning algorithm comprises decision tree learning, association rule learning, ANN, inductive logic programming, support vector machines, clustering, Bayesian networks, reinforcement learning, representation learning, similarity and metric learning, sparse dictionary learning, genetic algorithm, and rule-based machine learning.
  • the heatmap comprises data from hidden layers of the DNN.
  • the number of the hidden layers is at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater. In other embodiments, the number of the hidden layers is at most 10, 9, 8, 7, 6, 5, 4, 3, 2, or less.
  • the DNN comprises VGG-19.
  • the VGG-19 comprises 19 weight layers (16 convolutional layers and 3 fully connected layers) with uniform architecture.
  • the DNN comprises VGG-16, AlexNet, ZFNet, GoogleNet/Inception, MobileNet, and ResNet.
  • the heatmap identifies a likelihood that multiple objects of interest are located at locations within the frame.
  • the multiple objects of interest may comprise joints of a subject, cars, tracing devices, sensors, and a person of interest.
  • the statistical technique comprises Monte Carlo Sampling.
  • the Monte Carlo Sampling is used to sample the likelihood of the presence of the object of interest or factor of interest for at least one of each of the plurality of locations within the frame.
  • the Monte Carlo Sampling is used to sample the likelihood of the presence of the object of interest or factor of interest for at least 2, 3, 4, 5, 6, or greater of each of the plurality of locations within the frame.
  • the statistical technique further comprises Bayesian modeling.
  • the Bayesian modeling is used to model a change in a location of the object of interest or factor of interest within the frame to a different location of the object of interest or factor of interest within a different frame of the plurality of frames.
  • the Bayesian modeling represents a set of variables and their conditional dependencies.
  • the number of variables is at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater. In other embodiments, the number of variables is at most about 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or less.
  • the computer program is further configured to cause the processor to identify a position of the object of interest or factor of interest within the frame relative to a different object of interest or factor of interest within the frame.
  • the position of the object of interest or factor of interest within the frame is expressed as an angle.
  • the position of the object of interest or factor of interest within the frame is expressed as a distance, a ratio, a code, or a function.
  • the object of interest or factor of interest comprises a joint of a body of an individual.
  • the joints comprise hand joints, elbow joints, wrist joints, axillary articulations, sternoclavicular joints, vertebral articulations, temporomandibular joints, sacroiliac joints, hip joints, knee joints, and articulations of foot.
  • the joint comprises a shoulder, elbow, hip, knee, or ankle.
  • the video captures the individual within the frame.
  • the video captures movement of the joint from the frame to a different frame within the plurality of frames.
  • the movement of the joint from the frame to a different frame within the plurality of frames is measured relative to a different joint of the body of the individual and is expressed as an angle.
  • the angle is used by a healthcare provider to evaluate the individual.
  • a non-transitory medium comprises a computer program configured to cause the processor to: input the video into a machine learning algorithm; generate a heatmap from a frame of the plurality of frames with the machine learning algorithm, wherein the heatmap provides a likelihood of a presence of the object of interest or factor of interest at each of a plurality of locations within the frame; and analyze the heatmap using a statistical technique thereby locating the object of interest or factor of interest within the video.
  • the computer program includes a sequence of instructions, executable in the digital processing device’s CPU, written to perform a specified task.
  • Computer-readable instructions may be implemented as program modules, such as functions, objects, application programming interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types.
  • a computer program may be written in various versions of various languages.
  • the functionality of the computer-readable instructions may be combined or distributed as desired in various environments.
  • a computer program comprises one sequence of instructions.
  • a computer program comprises a plurality of sequences of instructions.
  • a computer program is provided from one location.
  • a computer program is provided from a plurality of locations.
  • a computer program includes one or more software modules.
  • a computer program includes, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.
  • a software module comprises a file, a section of code, a programming object, a programming structure, or combinations thereof.
  • a software module comprises a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, or combinations thereof.
  • the one or more software modules comprise, by way of non-limiting examples, a web
  • software modules are in one computer program or application. In other embodiments, software modules are in more than one computer program or application. In some embodiments, software modules are hosted on one machine. In other embodiments, software modules are hosted on more than one machine. In further embodiments, software modules are hosted on cloud computing platforms. In some embodiments, software modules are hosted on one or more machines in one location. In other embodiments, software modules are hosted on one or more machines in more than one location.
  • the machine learning algorithm comprises a DNN.
  • the machine learning algorithm comprises decision tree learning, association rule learning, ANN, inductive logic programming, support vector machines, clustering, Bayesian networks, reinforcement learning, representation learning, similarity and metric learning, sparse dictionary learning, genetic algorithm, and rule-based machine learning.
  • the heatmap comprises data from hidden layers of the DNN.
  • the number of the hidden layers is at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater. In other embodiments, the number of the hidden layers is at most 10, 9, 8, 7, 6, 5, 4, 3, 2, or less.
  • the DNN comprises VGG-19.
  • the VGG-19 comprises 19 weight layers (16 convolutional layers and 3 fully connected layers) with uniform architecture.
  • the DNN comprises VGG-16, AlexNet, ZFNet, GoogleNet/Inception, MobileNet, and ResNet.
  • the heatmap identifies a likelihood that multiple objects of interest are located at locations within the frame.
  • the multiple objects of interest may comprise joints of a subject, cars, tracing devices, sensors, and a person of interest.
  • the statistical technique comprises Monte Carlo sampling.
  • the Monte Carlo sampling is used to sample the likelihood of the presence of the object of interest or factor of interest for at least one of each of the plurality of locations within the frame.
  • the Monte Carlo sampling is used to sample the likelihood of the presence of the object of interest or factor of interest for at least 2, 3, 4, 5, 6, or greater of each of the plurality of locations within the frame.
  • the statistical technique further comprises Bayesian modeling.
  • the Bayesian modeling is used to model a change in a location of the object of interest or factor of interest within the frame to a different location of the object of interest or factor of interest within a different frame of the plurality of frames.
  • the Bayesian modeling represents a set of variables and their conditional dependencies.
  • the number of variables is at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or greater. In other embodiments, the number of variables is at most about 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or less.
  • the computer program is further configured to cause the processor to identify a position of the object of interest or factor of interest within the frame relative to a different object of interest or factor of interest within the frame.
  • the position of the object of interest or factor of interest within the frame is expressed as an angle.
  • the position of the object of interest or factor of interest within the frame is expressed as a distance, a ratio, a code, or a function.
  • the object of interest or factor of interest comprises a joint of a body of an individual.
  • the joints comprise hand joints, elbow joints, wrist joints, axillary articulations, sternoclavicular joints, vertebral articulations, temporomandibular joints, sacroiliac joints, hip joints, knee joints, and articulations of foot.
  • the joint comprises a shoulder, elbow, hip, knee, or ankle.
  • the video captures the individual within the frame.
  • the video captures movement of the joint from the frame to a different frame within the plurality of frames.
  • the movement of the joint from the frame to a different frame within the plurality of frames is measured relative to a different joint of the body of the individual and is expressed as an angle.
  • the angle is used by a healthcare provider to evaluate the individual.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Software Systems (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to software and systems for analyzing videos and/or images. The software and systems of the invention are configured in different embodiments to perform different types of analyses. For example, in some embodiments, the software and systems of the invention are configured to locate an object of interest in a video and/or an image.
EP19798184.8A 2018-10-22 2019-10-21 Computer based object detection within a video or image Pending EP3871136A1 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/167,300 US10922573B2 (en) 2018-10-22 2018-10-22 Computer based object detection within a video or image
GB1817286.6A GB2578325B (en) 2018-10-24 2018-10-24 Computer based object detection within a video or image
PCT/EP2019/078563 WO2020083831A1 (fr) 2018-10-22 2019-10-21 Computer based object detection within a video or image

Publications (1)

Publication Number Publication Date
EP3871136A1 true EP3871136A1 (fr) 2021-09-01

Family

ID=68468658

Family Applications (1)

Application Number Title Priority Date Filing Date
EP19798184.8A Pending EP3871136A1 (fr) 2018-10-22 2019-10-21 Computer based object detection within a video or image

Country Status (3)

Country Link
EP (1) EP3871136A1 (fr)
AU (2) AU2019367163B2 (fr)
WO (1) WO2020083831A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
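CN112016401B (zh) * 2020-08-04 2024-05-17 杰创智能科技股份有限公司 Cross-modal pedestrian re-identification method and device
CN112712042B (zh) * 2021-01-04 2022-04-29 电子科技大学 End-to-end network architecture for pedestrian re-identification with embedded key frame extraction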

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10049450B2 (en) * 2015-12-03 2018-08-14 Case Western Reserve University High-throughput adaptive sampling for whole-slide histopathology image analysis
WO2018017399A1 (fr) * 2016-07-20 2018-01-25 Usens, Inc. Method and system for 3D hand skeleton tracking

Also Published As

Publication number Publication date
WO2020083831A9 (fr) 2021-04-08
WO2020083831A1 (fr) 2020-04-30
AU2019367163B2 (en) 2024-03-28
AU2024204390A1 (en) 2024-07-18
AU2019367163A1 (en) 2021-06-03

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20210512

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
RIN1 Information on inventor provided before grant (corrected)

Inventor name: HARTE, THOMAS

Inventor name: PHAN, HUY QUOC

RAP3 Party data changed (applicant data changed or rights of an application transferred)

Owner name: FUTURE HEALTH WORKS LTD.

RIN1 Information on inventor provided before grant (corrected)

Inventor name: HARTE, THOMAS

Inventor name: PHAN, HUY QUOC

P01 Opt-out of the competence of the unified patent court (upc) registered

Effective date: 20230511

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20230720