WO2020256732A1 - Domain adaptation and fusion using task-irrelevant paired data in sequential form - Google Patents

Domain adaptation and fusion using task-irrelevant paired data in sequential form

Info

Publication number
WO2020256732A1
WO2020256732A1 (PCT application No. PCT/US2019/038370)
Authority
WO
WIPO (PCT)
Prior art keywords
domain
data
pipeline
source
task
Prior art date
Application number
PCT/US2019/038370
Other languages
French (fr)
Inventor
Kuan-Chuan Peng
Srikrishna KARANAM
Ziyan Wu
Jan Ernst
Original Assignee
Siemens Aktiengesellschaft
Siemens Corporation
Priority date
Filing date
Publication date
Application filed by Siemens Aktiengesellschaft and Siemens Corporation
Priority to PCT/US2019/038370
Publication of WO2020256732A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks

Definitions

  • The present disclosure generally relates to vision technology and, more specifically, to performing domain adaptation and fusion for classifiers that apply machine learning.
  • A common task of interest in visual learning is to apply a machine learning process to one or more data distributions, which may include labeled or unlabeled images, to learn a model that can attach a label to a data sample from an unlabeled data distribution with minimal error.
  • When the unlabeled data distribution is of a different domain than the labeled distribution, a data adaptation process is required to transfer knowledge from the source domain (i.e., the data domain of the labeled data) to the target domain (i.e., the data domain of the unlabeled data).
  • For example, in the case of identifying numeric digits in images, the source data distribution may include a labeled data set of grayscale images of numeric digits, while the target distribution is color (e.g., red, green, and blue (RGB)) images of numeric digits to be labeled for identification purposes.
  • Such a transfer of knowledge between domains, or “domain adaptation,” may occur using one of several approaches, such as unsupervised, semi-supervised, or supervised.
  • In unsupervised domain adaptation, the training data typically includes a set of labeled source domain data and a set of unlabeled target domain data.
  • For semi-supervised domain adaptation, the training data typically includes a set of labeled source domain data, a set of unlabeled target domain data, and a small amount of labeled target domain data.
  • Fully supervised domain adaptation involves completely labeled training data for both the source domain and the target domain.
  • Machine learning classifiers implement classification algorithms for mapping input data to a category based on instances of observations and features/feature vectors of observation properties for a predicted class.
  • Recent progress in computer vision classifiers has been dominated by deep neural networks (DNNs) trained with large amounts of labeled data.
  • However, collecting and annotating such datasets is a tedious task, and in some contexts even impossible. Therefore, one line of approaches depends only on synthetically generated data for training DNNs (for example, rendering depth images from CAD models). Nevertheless, for certain domains it is also very difficult and non-trivial to synthesize realistic data, such as RGB images, sequential data, or meaningful documents with a specific structure.
  • As used herein, a “domain” can refer to either a modality or a dataset.
  • For example, a 3D layout of a room can be captured by a depth sensor or inferred from RGB images.
  • In real-world scenarios, however, access to data from certain domain(s) is often limited.
  • The shortage of labeled data for training classifiers in specific domains is a significant problem in machine learning applications, since the cost of acquiring data labels is often high.
  • Domain adaptation is one way to address this problem by leveraging labeled data in one or more related domains, often referred to as “source domains,” when learning a classifier for labeling unseen data in a “target domain” for a task of interest (TOI).
  • Described herein is an approach to effectively learn a feature representation that fuses the source domain and the target domain (either the source or target domain, or both, can be in sequential form) without using any task-relevant data from the target domain, in order to further enhance analytics performance.
  • An objective is to learn, from a source data distribution, a well-performing model on a target data distribution of a different domain. Unlike previous work dealing with static imagery, this disclosure includes classifier methods to process sequential data or domains involving time series.
  • Embodiments of the present disclosure include methods, systems, and computer program products for performing domain adaptation and fusion between a source domain and a target domain using task-irrelevant paired data, where at least one of the source domain data and the target domain data is in sequential form.
  • A non-limiting example includes an iterative training process for a neural network pipeline in the source domain and a neural network pipeline in the target domain.
  • An initial training for the source domain pipeline is performed using labeled task-relevant source domain data, which may be sequential data such as the words of a document.
  • To address unavailable task-relevant target domain data for training, a target domain representation may be simulated by training the target domain neural network pipeline against the trained source domain pipeline with the source domain parameters fixed, using task-irrelevant pairs of dual-domain input data.
  • To address a lack of common abstract features between the target domain and the source domain, a domain adaptation training step may include simultaneously training the source domain representation and the target domain representation with two loss functions, where the first loss function drives the source domain pipeline with the labeled task-relevant data inputs and the second loss function drives the target domain representation by domain adaptation to the source domain using the task-irrelevant dual-domain pairs of input data.
  • An extension of the end product of domain adaptation (i.e., a source-domain task-of-interest solver and a target-domain task-of-interest solver) is domain fusion, which produces a dual-domain (source and target) task solver that is robust to noise in either domain.
  • FIG. 1 is a flow diagram illustrating an example of a training data set in accordance with one or more embodiments of the present disclosure.
  • FIG. 2 is a flow diagram illustrating an example of a training process for a machine learning network to generate a single domain source representation for a task of interest in accordance with one or more embodiments of the present disclosure.
  • FIG. 3 is a flow diagram illustrating an example of a training process for a machine learning network using domain adaptation to transfer features to target domain data in accordance with one or more embodiments of the present disclosure.
  • FIG. 4 depicts a flow diagram illustrating an example of a training process for a machine learning network to simulate a target domain representation in accordance with one or more embodiments of the present disclosure.
  • FIG. 5 depicts a flow diagram illustrating a joint training process of domain adaptation according to one or more embodiments of the present disclosure.
  • FIG. 6 is a flow diagram illustrating an example of a training process for a machine learning network using domain fusion to generate a joint representation in accordance with one or more embodiments of the present disclosure.
  • FIG. 7 is a flow diagram illustrating an example of a testing process for the joint representation trained in FIG. 6 in accordance with one or more embodiments of the present disclosure.
  • FIG. 8 depicts an example of a system that facilitates machine learning in accordance with one or more embodiments of the present disclosure.
  • FIG. 9 is a schematic illustration of a cloud computing environment in accordance with one or more embodiments of the present disclosure.
  • FIG. 10 is a schematic illustration of abstraction model layers in accordance with one or more embodiments of the present disclosure.
  • FIG. 11 is a schematic illustration of a computer system in accordance with one or more embodiments of the present disclosure.
  • The term “task-irrelevant data” is used to refer to data that is not task-relevant. For example, in the case of a two-scene classification task that identifies computer room and conference room images, images that show a bedroom, bathroom, or basement would be task-irrelevant data.
  • The term “source modality” or “source domain” refers to the modality or domain that the abstract features are learned from and are to be transferred from.
  • The term “target modality” or “target domain” refers to the modality or domain that the abstract features are to be transferred to.
  • The term “task-relevant data” refers to data that is directly applicable and related to the end objective of the TOI. For example, if the task is classifying images of cats and dogs, then any image containing either a cat or a dog is considered to be task-relevant data.
  • The term “task-relevant images” is used herein to refer to task-relevant data that includes images.
  • The term “task-irrelevant data” refers to data that is not applicable to the end objective and has no relation to the end objective. For example, if the task is classifying images of cats and dogs, then any image that does not contain either a cat or a dog is considered to be task-irrelevant data.
  • The term “task-irrelevant images” is used herein to refer to task-irrelevant data that includes images.
  • FIG. 1 shows a basic flow diagram of the training data for the domain adaptation and fusion process presented in this disclosure, drawn from task-relevant data 110 and task-irrelevant data 120.
  • Task-relevant source domain data 115 and dual-domain task-irrelevant data pairs 125 are adapted and fused by the domain adaptation and fusion engine 150, which will be described in further detail below.
  • Data associated with workpieces 4, 5, and 6 are “task-irrelevant” data pairs (i.e., unrelated to classifying workpieces 1, 2, and 3 as uniquely different items and/or different types of items) to which domain adaptation and domain fusion may be applied in accordance with embodiments of the present disclosure.
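  • For illustration only (not part of the original disclosure), the sketch below shows one possible way to organize the two training streams of FIG. 1 in Python/PyTorch; the class names, field names, and tensor shapes are assumptions introduced for this sketch.

        # Hypothetical data layout for the two training streams of FIG. 1:
        # task-relevant source data (110/115) and task-irrelevant dual-domain pairs (120/125).
        from dataclasses import dataclass
        from typing import List
        import torch
        from torch.utils.data import Dataset

        @dataclass
        class TaskRelevantSample:          # source domain only (e.g., a workpiece document)
            word_ids: torch.Tensor         # sequential data units, shape (seq_len,)
            class_label: int               # e.g., the workpiece ID

        @dataclass
        class TaskIrrelevantPair:          # dual-domain pair (e.g., auto-part document + depth image)
            word_ids: torch.Tensor         # source domain sequence, shape (seq_len,)
            depth_image: torch.Tensor      # target domain sample, shape (1, H, W)
            pairwise_label: torch.Tensor   # fixed correspondence between the two domains

        class PairedIrrelevantDataset(Dataset):
            """Serves task-irrelevant pairs (125) to the adaptation and fusion engine (150)."""
            def __init__(self, pairs: List[TaskIrrelevantPair]):
                self.pairs = pairs
            def __len__(self):
                return len(self.pairs)
            def __getitem__(self, idx):
                p = self.pairs[idx]
                return p.word_ids, p.depth_image, p.pairwise_label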
  • Disclosed embodiments also pertain to other tasks of interest (e.g., regression tasks) and/or other domains (whether static imagery, sequential data, or time series).
  • One or more embodiments of the present disclosure provide a process for domain adaptation that transfers learned feature representations (i.e., transfer learning), in which a model that performs well on a source data distribution is applied to a different target data distribution.
  • The transfer occurs from one domain to another domain using only pairwise information from the two domains, where either one or both domains can be sequential data or time series.
  • The pairwise information used in the paired training data can be any kind of fixed correspondence or relationship between the source and target domains, such as, but not limited to, a spatial relation or a temporal relation of time-series data.
  • A TOI solution (e.g., a classifier or detector) of a target domain is learned with only task-irrelevant pairwise data and task-relevant source domain data, where either the source or target domain, or both, is sequential data or time series.
  • Shared abstract features are extracted from the source and target domains by jointly optimizing over an objective TOI using task-irrelevant pairwise data from the source and target domains, where either the source or target domain, or both, is sequential data or time series.
  • A versatile approach is provided that can effectively transfer learned abstract features from one modality to another without requiring objective-relevant, or task-relevant, data from the target modality, while at the same time optimizing over the target objective.
  • An approach to effectively learn a feature representation by fusing the source modality and target modality without using any task-relevant data from the target modality is provided to further enhance the performance of analytics.
  • One or more embodiments of the present disclosure include a process for learning a fused representation and a TOI solver of the source and target domains with task-relevant training data only from the source domain.
  • The source domain and a source neural network (e.g., a convolutional neural network (CNN), recurrent neural network (RNN), or deep neural network (DNN)) can be used to simulate the input of the target domain in the target domain thread.
  • The neural network may be fine-tuned to explore effective unique (i.e., not shared by the target domain) abstract features in the source domain to further boost the fusion performance.
  • FIG. 2 is a flow diagram illustrating an example of a training process for a machine learning network to generate a single domain source representation for a task of interest in accordance with one or more embodiments of the present disclosure.
  • A softmax loss function 230 receives supervisory labels, and a source representation 216 is generated by a series of neural network iterations, shown as source RNN operations 211, 213, 215, trained using the process shown in FIG. 2. Because sequential data is processed as input data, each RNN operation is a sequential operation generating an intermediate representation 212, 214, which serves as a recurrent input to the next operation together with the next data input, with the final input producing a final representation, shown as source representation 216.
  • The source domain data type A may be documents related to a workpiece ID, and the supervisory labels would be the workpiece ID.
  • The neural network is trained by source domain data with the objective of recognizing the class (or category).
  • The source domain may be documents, and the sequential data units of the source domain data may be the words w1, w2, ..., wn of the input documents 210, received as training data by the source RNN operations 211, 213, 215.
  • Process 200 may be implemented by applying any source domain data type having a sequential format.
  • The neural network unit is trained with the objective of recognizing the workpiece ID to which each document belongs, represented by source representation 216.
  • Each intermediate representation of RNN operations 211, 213, 215 and the source representation 216 may be implemented, for example, as a one-dimensional feature vector.
  • The training follows a supervised approach in which workpiece ID labels are input to the softmax loss function 230 to supervise the training.
  • The softmax loss function 230 is an objective function that provides feedback used to adjust the encoding by the source neural network.
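  • For illustration only (not part of the original disclosure), a minimal sketch of the FIG. 2 training stage is given below, assuming Python/PyTorch with a GRU as the recurrent unit; the vocabulary size, layer sizes, learning rate, class count, and the linear classification head mapping the representation to logits are all assumptions, and the comments map variables to the reference numerals used above.

        import torch
        import torch.nn as nn

        class SourceRNNPipeline(nn.Module):
            """Source pipeline: recurrent operations 211/213/215 producing source representation 216."""
            def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=256):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, embed_dim)
                self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
                self.hidden_dim = hidden_dim

            def forward(self, word_ids):                 # word_ids: (batch, seq_len) of token indices
                emb = self.embed(word_ids)
                _, h_n = self.rnn(emb)                   # final hidden state = source representation 216
                return h_n.squeeze(0)                    # one-dimensional feature vector per sample

        source = SourceRNNPipeline()
        classifier = nn.Linear(source.hidden_dim, 10)    # maps representation 216 to workpiece-ID logits
        softmax_loss = nn.CrossEntropyLoss()             # softmax loss function 230
        optimizer = torch.optim.Adam(list(source.parameters()) + list(classifier.parameters()), lr=1e-3)

        def source_train_step(word_ids, workpiece_ids):  # supervised step with workpiece-ID labels
            logits = classifier(source(word_ids))
            loss = softmax_loss(logits, workpiece_ids)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            return loss.item()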
  • FIG. 3 is a flow diagram illustrating an example of a training process for a machine learning network to transfer features from the source domain to the target domain in accordance with one or more embodiments of the present disclosure.
  • A dual-domain training is performed for a source domain neural network pipeline and a target domain neural network pipeline.
  • The processing shown in FIG. 3 takes a set of discriminative abstract features in the source domain (e.g., a document domain in this example) that were learned using a process such as that shown in FIG. 2, and adapts them to a target domain of a different type (e.g., a depth image domain).
  • In this training process, both the source domain input data (e.g., data type A, or documents) and the target domain input data (e.g., data type B, or depth images) are task-relevant.
  • Parameters for the source domain neural network pipeline, shown as source RNN operations 311, 313, 315 and source representation 316, are fixed, while the target domain neural network, shown as DNN 321, is trained.
  • The recurrent operation of source domain RNN operations 311, 313, 315 is similar to that described above for RNN operations 211, 213, 215 shown in FIG. 2.
  • The target domain for this example may be workpiece depth images, and the input data 320 may be workpiece depth images used to train the target DNN 321 to generate the target representation 326.
  • Unlike the source data, the input data 320 is not sequential, and as such the target domain neural network pipeline need not be an RNN.
  • For example, a single depth scan image of a workpiece may be processed by the target DNN 321 in a non-recurrent manner, while a document of 1,000 words in the source domain pertains to the same workpiece, the words serving as training data for the source pipeline RNN operations 311, 313, 315.
  • FIG. 3 shows the use of an adversarial learning process for domain adaptation to transfer knowledge from the source domain (i.e., the source representation 316) to the target domain (e.g., the depth image domain) with minimal errors.
  • A classifier, shown as discriminator 340, outputs a domain label 375, which is a value of zero or one depending on which domain (target or source) the input comes from.
  • The discriminator 340 accepts one input at a time, either from the source representation 316 or from the target representation 326.
  • The training process 300 converges once the discriminator 340 is unable to determine which domain is providing the input, such that the source representation 316 and the target representation 326 have become practically indistinguishable.
  • After the target DNN 321 is trained, the target representation 326 and the source representation 316 are theoretically interchangeable, as the learned target representation 326 is now adapted to the source representation 316.
  • Accordingly, the TOI solution working for the source domain should also work for the target domain.
  • Using an adversarial learning process, such as that shown in FIG. 3, is effective in domain adaptation even without much supervision.
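  • For illustration only (not part of the original disclosure), a minimal sketch of such an adversarial adaptation step is given below, continuing the Python/PyTorch sketch above: the source pipeline is frozen, a discriminator predicts the domain label 375, and the target DNN 321 is trained so that its representation 326 becomes indistinguishable from the source representation 316; the target architecture, feature size, and learning rates are assumptions.

        import torch
        import torch.nn as nn

        feat_dim = 256                                   # assumed to match the source representation size

        target_dnn = nn.Sequential(                      # target DNN 321 for single (non-sequential) depth images
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),                     # target representation 326
        )
        discriminator = nn.Sequential(                   # discriminator 340: source (1) vs. target (0)
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1),
        )
        bce = nn.BCEWithLogitsLoss()
        opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
        opt_t = torch.optim.Adam(target_dnn.parameters(), lr=1e-4)

        def adversarial_step(source_repr, depth_images):
            # source_repr: output of the frozen source pipeline (parameters 311-316 fixed)
            n = depth_images.size(0)

            # 1) train the discriminator to tell the two domains apart (domain label 375)
            d_loss = bce(discriminator(source_repr.detach()), torch.ones(n, 1)) + \
                     bce(discriminator(target_dnn(depth_images).detach()), torch.zeros(n, 1))
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()

            # 2) train the target DNN so the discriminator labels its output as "source"
            g_loss = bce(discriminator(target_dnn(depth_images)), torch.ones(n, 1))
            opt_t.zero_grad(); g_loss.backward(); opt_t.step()
            return d_loss.item(), g_loss.item()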
  • A drawback to the approach shown in FIG. 3 is that real depth images 320 from the target domain that are task-relevant are required to train the target DNN 321, and these images are often unavailable or difficult to obtain.
  • For example, workpiece images for training the target pipeline should be relevant to the workpiece documents used by the source domain pipeline, but such workpiece images may not be readily available for effective training.
  • Turning to FIG. 4, a flow diagram is shown illustrating a training process for a machine learning network to simulate a target domain representation using task-irrelevant dual-domain data pairs in accordance with one or more embodiments of the present disclosure.
  • The neural network for the source domain pipeline in FIG. 4, shown as RNN operations 411, 413, 415, operates in a recurrent manner similar to that described above for RNN operations 211, 213, 215 in FIG. 2.
  • As noted above, the adversarial learning-based domain adaptation approach shown in FIG. 3 requires task-relevant data from the target domain, which may be unavailable.
  • To address this, process 400 includes an L2 loss function 455 to remove the dependency on task-relevant data from the target domain by extracting abstract features, common to both the source domain and the target domain, that are found in task-irrelevant data pairings.
  • The task-irrelevant dual-domain data pairs 401 are related to the task-relevant dual-domain data in that they share a common source domain (data type A) and a common target domain (data type B).
  • For example, the source domain may be documents and the target domain may be depth images.
  • The task-relevant source domain data may be documents related to a workpiece, the task-relevant target domain data may be depth images related to a workpiece, the task-irrelevant source domain data may be documents related to automobile parts (e.g., data 410), and the task-irrelevant target domain data may be depth images of automobile parts (e.g., data 420), where each dual-domain pair 401 of a document 410 and a depth image 420 is labeled with a pairwise label 430.
  • The process 400 exploits the availability of a large quantity of dual-domain training data pairs 401 that are not relevant to the TOI.
  • Using these pairs, the source representation 416 and the target representation 426 can be adapted by the loss function 455.
  • Parameters for the source domain neural network pipeline, shown as RNN operations 411, 413, 415, and the source representation 416 are trainable, while parameters for the target neural network unit, shown as DNN 421, and the target representation 426 are fixed, which permits a domain adaptation for the dual-domain training process 400.
  • In other embodiments, the parameters are fixed for the source pipeline while the target pipeline is trainable.
  • The task-irrelevant dual-domain pairs are simultaneously fed to the source pipeline and the target pipeline to train the source pipeline until the source representation 416 is adapted to the target representation 426.
  • The source domain data 410 may be sequential data of type A (e.g., document data with sequential words w1, w2, ..., wn).
  • The target domain data 420 may be data of type B (e.g., depth images) related to the data 410, such as documents and depth images relating to a common feature (e.g., automobile parts) and unrelated to the TOI (e.g., classification of particular workpieces).
  • The L2 loss function 455, which takes the supervision of the pairwise label 430, is applied to the output of the source representation 416 and the output of the target representation 426.
  • The L2 loss function 455 can be replaced with any suitable loss function that encourages similarity of the two input representations (e.g., an L1 loss function).
  • The result of training process 400 is a source representation 416 that simulates a target representation (e.g., target representation 326) that would be generated if task-relevant target domain training data (e.g., data 320) had been available.
  • In this manner, the source representation 416 is trained to resemble (or “simulate”) the target representation 426.
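  • For illustration only (not part of the original disclosure), a minimal sketch of this simulation step is given below, continuing the earlier Python/PyTorch sketches (SourceRNNPipeline from the FIG. 2 sketch and target_dnn from the FIG. 3 sketch); it shows the configuration in which the source RNN (411-415) is trainable and the target DNN 421 is frozen, and it assumes that the fed batches contain corresponding pairs (pairwise label 430 indicating a match); the learning rate is also an assumption.

        import torch
        import torch.nn as nn

        source_sim = SourceRNNPipeline()                 # trainable source pipeline (411, 413, 415) and representation 416
        for p in target_dnn.parameters():                # target DNN 421 / representation 426 are fixed
            p.requires_grad_(False)

        l2_loss = nn.MSELoss()                           # L2 loss function 455; nn.L1Loss() is one alternative
        opt = torch.optim.Adam(source_sim.parameters(), lr=1e-3)

        def simulate_step(irrelevant_word_ids, irrelevant_depth_images):
            # both halves of each task-irrelevant pair 401 are fed simultaneously
            src_repr = source_sim(irrelevant_word_ids)            # source representation 416
            with torch.no_grad():
                tgt_repr = target_dnn(irrelevant_depth_images)    # target representation 426
            loss = l2_loss(src_repr, tgt_repr)
            opt.zero_grad(); loss.backward(); opt.step()
            return loss.item()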
  • One drawback to the processing shown in FIG. 4 is that there may be insufficient abstract features identifiable as common to both the source and target domains, which may significantly degrade the performance of transferring the abstract features found in the task-irrelevant data pairs between the source domain and the target domain.
  • The embodiment shown in FIG. 5 overcomes this drawback, as a refinement to process 400, by combining two loss functions in a joint training process.
  • A training process 500 applies a source domain neural network pipeline, shown as RNN operations 511, 513, 515, which is shared by both sets of source domain data (i.e., the task-relevant data 510 and the task-irrelevant data 505).
  • The source representation 516 is also shared by both source domain data input streams 510, 505.
  • The recurrent operation of RNN operations 511, 513, 515 is similar to that described above for RNN operations 211, 213, 215 in FIG. 2.
  • The target pipeline is previously trained using the target domain data 520 of the task-irrelevant data pairs 501.
  • The target neural network unit, shown as DNN 521, and the target representation 525 are fixed while training the source RNN operations 511, 513, 515 and the source representation 516.
  • The training output is two analytics pipelines, for the source modality and the target modality, which can solve the task objective effectively despite having no task-relevant data from the target domain available throughout the training process.
  • Class-labeled task-relevant data 510 (e.g., document data 510 with sequential words w1, w2, ..., wn) is a first input to the source domain pipeline, and task-irrelevant data 505 (e.g., document data 505 with sequential words w1, w2, ..., wn) is a second input to the source domain pipeline.
  • The task-irrelevant data 505 includes a plurality of task-irrelevant sequential data units that are paired with target domain data 520 to form dual-domain task-irrelevant data pairs 501, which are simultaneously fed to the source domain pipeline and the target domain pipeline.
  • Each pair 501 of the task-irrelevant data comprises a series of task-irrelevant sequential data units (e.g., words w1, w2, ..., wn) and a task-irrelevant data unit 520 (e.g., a depth scan image).
  • Pairwise labels 542 for the input pairs 501 (e.g., a property feature of the automobile part described and depicted by the dual-domain data input pair) are fed to the L2 loss function 535.
  • The result of training process 500 is a source representation 516 configured as a source domain classifier refined by knowledge of abstract features learned from the task-irrelevant data pairings.
  • The product of the domain adaptation of process 500 allows the trained source representation 516 to be applied as a simulated target representation, which could be used in a testing-phase operation for classifying task-relevant target domain data.
  • For example, the source representation 516 may be applied in a target pipeline that is fed depth image data related to source domain document data for a TOI workpiece.
  • The source representation 516 may be implemented as a classifier in place of a trained target domain classifier, where task-relevant target domain training data was unavailable to generate a proper target domain classifier.
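  • For illustration only (not part of the original disclosure), a minimal sketch of the joint training of FIG. 5 is given below, continuing the earlier Python/PyTorch sketches (target_dnn stands in for the frozen DNN 521): a classification (softmax) loss is driven by labeled task-relevant documents 510, and an L2 adaptation loss is driven by the task-irrelevant pairs 501; the weighting between the two terms, the class count, and the learning rate are assumptions.

        import torch
        import torch.nn as nn

        shared_source = SourceRNNPipeline()              # shared RNN operations 511/513/515 and representation 516
        classifier = nn.Linear(shared_source.hidden_dim, 10)
        softmax_loss, l2_loss = nn.CrossEntropyLoss(), nn.MSELoss()   # the two loss functions (softmax + L2 loss 535)
        opt = torch.optim.Adam(list(shared_source.parameters()) + list(classifier.parameters()), lr=1e-3)
        lam = 1.0                                        # assumed relative weight of the adaptation term

        def joint_step(relevant_word_ids, class_labels, irrelevant_word_ids, irrelevant_depth_images):
            # loss 1: task objective on labeled, task-relevant source data 510
            task_loss = softmax_loss(classifier(shared_source(relevant_word_ids)), class_labels)

            # loss 2: adapt the shared source representation to the fixed target representation 525
            with torch.no_grad():
                tgt_repr = target_dnn(irrelevant_depth_images)    # DNN 521 frozen
            adapt_loss = l2_loss(shared_source(irrelevant_word_ids), tgt_repr)

            loss = task_loss + lam * adapt_loss
            opt.zero_grad(); loss.backward(); opt.step()
            return task_loss.item(), adapt_loss.item()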
  • FIG. 6 is a flow diagram illustrating an example of a training process for a machine learning network using domain fusion to generate a joint classifier in accordance with one or more embodiments of the present disclosure.
  • The domain fusion pipeline is trained by concatenating the two analytics streams for the source domain and the target domain, which are represented by source representation 616 and target representation 626, respectively, to generate a concatenated representation 630, thereby optimizing a joint classifier 640 for a TOI objective function (e.g., workpiece classification).
  • The training data input 620 for the target pipeline may be in sequential form, such as words w1, w2, ..., wn of a document, to be processed by a series of neural network pipeline operations shown as RNN operations 621, 623, 625.
  • The training data input 610 for the source pipeline may be sequential data units, such as words w1, w2, ..., wn of a document, to be processed by a series of neural network operations shown as RNN operations 611, 613, 615.
  • The source pipeline is initialized using the parameters that generated the source representation 516 of the domain adaptation process 500 shown in FIG. 5.
  • The fusion training process 600 fixes the parameters of the target pipeline and, as a refinement to the domain adaptation process 500, uses the source pipeline of process 600 to learn a source representation 616 that can explore the potential of the source domain fully, without the constraint that only the abstract features shared between the source domain and the target domain be utilized for the domain fusion.
  • A task-relevant data input stream 610 (e.g., data type A) is fed to the source pipeline and a task-relevant data input stream 620 (e.g., data type A) is fed to the target pipeline, which allows the use of simulated target domain data to train the classifier 640 in the absence of target domain training data (e.g., data type B).
  • Either one of the task-relevant data inputs 610, 620 to the source domain pipeline or the target domain pipeline can optionally include empty inputs to simulate missing data and/or noise, for more robust performance of the classifier during training.
  • Output from the source representation 616 and the target representation 626 is combined by the concatenation function 630, which is input to the joint classifier 640 and trained using a loss function 650 (e.g., a softmax loss function).
  • The resultant trained joint classifier 640 is capable of generating a prediction for a dual-domain TOI objective.
  • For example, the source domain data may be sequential words of examination report documentation, and the target domain data may be depth scan images each generated for a respective examination report, to form paired data inputs.
  • The classifier 640 can then identify the condition of the workpiece using a fused representation of the paired data inputs.
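  • For illustration only (not part of the original disclosure), a minimal sketch of the domain fusion training of FIG. 6 is given below, continuing the earlier Python/PyTorch sketches: the source pipeline is initialized from the FIG. 5 result and fine-tuned, the simulated target pipeline (assumed here to reuse the adapted RNN from the FIG. 4 sketch) is frozen, and a joint classifier is trained on the concatenated representation; the random zeroing used to mimic missing data, the class count, and the learning rate are assumptions.

        import copy
        import torch
        import torch.nn as nn

        fusion_source = copy.deepcopy(shared_source)     # source pipeline 611/613/615, initialized from FIG. 5, trainable
        target_sim = copy.deepcopy(source_sim)           # simulated target pipeline 621/623/625 (frozen)
        for p in target_sim.parameters():
            p.requires_grad_(False)

        joint_classifier = nn.Linear(2 * fusion_source.hidden_dim, 10)   # joint classifier 640
        softmax_loss = nn.CrossEntropyLoss()                             # loss function 650
        opt = torch.optim.Adam(list(fusion_source.parameters()) + list(joint_classifier.parameters()), lr=1e-3)

        def fusion_step(word_ids, class_labels, drop_prob=0.1):
            src_repr = fusion_source(word_ids)                    # source representation 616
            with torch.no_grad():
                tgt_repr = target_sim(word_ids)                   # simulated target representation 626
            if torch.rand(1).item() < drop_prob:                  # optionally zero one stream to simulate missing data/noise
                tgt_repr = torch.zeros_like(tgt_repr)
            fused = torch.cat([src_repr, tgt_repr], dim=1)        # concatenated representation 630
            loss = softmax_loss(joint_classifier(fused), class_labels)
            opt.zero_grad(); loss.backward(); opt.step()
            return loss.item()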
  • FIG. 7 is a flow diagram illustrating an example of a testing process for the joint classifier trained in FIG. 6 in accordance with one or more embodiments of the present disclosure.
  • The elements of process 600 are altered by removal of the loss function 650, as the joint classifier 740 is configured to directly output a predicted class label 755, which provides the TOI objective result.
  • The simulated target analytics pipeline of RNN operations 621, 623, 625 can be replaced with a neural network unit, shown as target DNN 721, suitable for actual task-relevant target domain data.
  • Another type of DNN, such as a CNN, may be used in place of an RNN.
  • Simulated data is not applied as during the training phase; instead, actual task-relevant source domain data and task-relevant target domain data are paired according to at least one common feature (e.g., document words as sequential data units in the source domain and depth scan images in the target domain, paired because they originate from a common examination report for a TOI workpiece).
  • The joint classifier 740 is capable of generating an output prediction that relates to the TOI objective for the given input pair (e.g., a workpiece property label based on a dual-domain paired input of document information and depth scan image information from an examination report of a particular workpiece of interest).
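  • For illustration only (not part of the original disclosure), a minimal test-time sketch is given below, continuing the earlier Python/PyTorch sketches: the simulated target RNN is swapped for a depth-image network standing in for target DNN 721, the loss function is removed, and the joint classifier emits the predicted class label 755 for a dual-domain input pair; reusing target_dnn from the FIG. 3 sketch as the stand-in for DNN 721 is an assumption.

        import torch

        @torch.no_grad()
        def predict(word_ids, depth_images):
            # word_ids: task-relevant document tokens, shape (batch, seq_len)
            # depth_images: task-relevant depth scans, shape (batch, 1, H, W)
            src_repr = fusion_source(word_ids)               # source stream
            tgt_repr = target_dnn(depth_images)              # target DNN 721 on actual target domain data
            fused = torch.cat([src_repr, tgt_repr], dim=1)   # concatenated dual-domain representation
            logits = joint_classifier(fused)                 # joint classifier 740
            return logits.argmax(dim=1)                      # predicted class label 755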
  • System 800 is a machine learning system that can be utilized to solve a variety of technical issues (e.g., learning previously unknown functional relationships) in connection with technologies such as, but not limited to, machine learning technologies, time-series data technologies, data analysis technologies, data classification technologies, data clustering technologies, trajectory/journey analysis technologies, medical device technologies, collaborative filtering technologies, recommendation system technologies, signal processing technologies, word embedding technologies, topic model technologies, image processing technologies, video processing technologies, audio processing technologies, and/or other digital technologies.
  • System 800 employs hardware and/or software to solve problems that are highly technical in nature, that are not abstract and that cannot be performed as a set of mental acts by a human.
  • In some embodiments, some or all of the processes performed by system 800 are performed by one or more specialized computers (e.g., one or more specialized processing units, a specialized computer with a domain adaptation and fusion component, etc.) for carrying out defined tasks related to machine learning.
  • System 800 and/or components of the system are employed to solve new problems that arise through advancements in the technologies mentioned above, the use of image data, machine learning processes, and/or computer architecture, and the like.
  • The system 800 provides the above-described technical improvements to machine learning systems, artificial intelligence systems, data analysis systems, data analytics systems, data classification systems, data clustering systems, trajectory/journey analysis systems, medical device systems, collaborative filtering systems, recommendation systems, signal processing systems, word embedding systems, topic model systems, image processing systems, video processing systems, and/or other digital systems.
  • The system 800 also provides technical improvements to a central processing unit associated with a machine learning process by improving processing performance of the central processing unit, reducing computing bottlenecks of the central processing unit, improving processing efficiency of the central processing unit, and/or reducing an amount of time for the central processing unit to perform the machine learning process.
  • System 800 is configured to perform one or more processes, such as the ones described herein with reference to FIGS. 2-7.
  • System 800 includes a preprocessing component 804, a domain adaptation component 806, and a domain fusion component 808.
  • System 800 constitutes machine-executable component(s) embodied within machine(s) (e.g., embodied in one or more computer-readable media associated with one or more machines). Such component(s), when executed by the one or more machines (e.g., computer(s), computing device(s), virtual machine(s), etc.), cause the machine(s) to perform the operations described.
  • System 800 includes memory 810 that stores computer-executable components and instructions.
  • In some embodiments of the disclosure, system 800 includes a processor 812 to facilitate execution of the computer-executable components and instructions (e.g., computer-executable components and corresponding instructions).
  • The preprocessing component 804, domain adaptation component 806, domain fusion component 808, memory 810, and/or processor 812 are electrically and/or communicatively coupled to one another in one or more embodiments of the disclosure.
  • The preprocessing component 804 may be used to set up data for inputs to machine learning elements, such as the source RNNs described above (e.g., RNN operations 511, 513, 515).
  • For example, the preprocessing component 804 may be used to implement word parsing of documents used for source domain training data (e.g., documents 510 in FIG. 5), or pairing of task-irrelevant dual-domain data inputs (e.g., document/depth-image pairs 501 in FIG. 5).
  • Cloud computing environment 900 comprises one or more cloud computing nodes 902 with which local computing devices used by cloud consumers, such as, for example, camera 904 and computers 906, 908, may communicate.
  • In some embodiments, the computer 906 implements an IPS system as described previously.
  • Nodes 902 may communicate with one another. They may be grouped (not shown) physically or virtually, in one or more networks, such as private, community, public, or hybrid clouds as described hereinabove, or a combination thereof. This allows cloud computing environment 900 to offer infrastructure, platforms and/or software as services for which a cloud consumer does not need to maintain resources on a local computing device.
  • The computing devices 906-908 shown in FIG. 9 are intended to be illustrative only, and computing nodes 902 and cloud computing environment 900 can communicate with any type of computerized device over any type of network and/or network-addressable connection (e.g., using a web browser).
  • Hardware and software layer 1012 includes hardware and software components. Examples of hardware components include: mainframes 1014; RISC (Reduced Instruction Set Computer) architecture based servers 1016; servers 1018; blade servers 1020; storage devices 1022; and networks and networking components 1024.
  • Examples of software components include network application server software 1026 and database software 1028. Virtualization layer 1030 provides an abstraction layer from which the following examples of virtual entities may be provided: virtual servers 1032; virtual storage 1034; virtual networks 1036, including virtual private networks; virtual applications and operating systems 1038; and virtual clients 1040.
  • management layer 1042 may provide the functions described below.
  • Resource provisioning 1044 provides dynamic procurement of computing resources and other resources that are utilized to perform tasks within the cloud computing environment.
  • Metering and pricing 1046 provide cost tracking as resources are utilized within the cloud computing environment, and billing or invoicing for consumption of these resources. In one example, these resources may comprise application software licenses.
  • Security provides identity verification for cloud consumers and tasks, as well as protection for data and other resources.
  • User portal 1048 provides access to the cloud computing environment for consumers and system administrators.
  • Service level management 1050 provides cloud computing resource allocation and management such that required service levels are met.
  • Service Level Agreement (SLA) planning and fulfillment 1052 provides pre-arrangement for, and procurement of, cloud computing resources for which a future requirement is anticipated in accordance with an SLA.
  • Workloads layer 1054 provides examples of functionality for which the cloud computing environment may be utilized. Examples of workloads and functions that may be provided from this layer include: mapping and navigation 1056; software development and lifecycle management 1058; transaction processing 1060; point cloud to virtual reality data processing 1064; user defined content to point cloud processing 1066; and domain adaptation and fusion processing 1068.
  • In FIG. 11, a schematic illustration of a system 1100 is depicted upon which aspects of one or more embodiments of domain adaptation and fusion using weakly supervised target-irrelevant data may be implemented.
  • the computer 1101 includes a processing device 1105 and a memory 1110 coupled to a memory controller 1115 and an input/output controller 1135.
  • the input/output controller 1135 can be, for example, but not limited to, one or more buses or other wired or wireless connections, as is known in the art.
  • the input/output controller 1135 may have additional elements, which are omitted for simplicity, such as controllers, buffers (caches), drivers, repeaters, and receivers, to enable communications. Further, the computer 1101 may include address, control, and/or data connections to enable appropriate communications among the aforementioned components.
  • a keyboard 1150 and mouse 1155 or similar devices can be coupled to the input/output controller 1135.
  • input may be received via a touch-sensitive or motion sensitive interface (not depicted).
  • the computer 1101 can further include a display controller 1125 coupled to a display 1130.
  • A camera (e.g., camera 904 of FIG. 9) may be coupled to the system 1100.
  • the memory 1110 can include any one or combination of volatile memory elements (e.g., random access memory (RAM, such as DRAM, SRAM, SDRAM, etc.)) and nonvolatile memory elements (e.g., ROM, erasable programmable read only memory (EPROM), electronically erasable programmable read only memory (EEPROM), flash memory, programmable read only memory (PROM), tape, compact disc read only memory (CD-ROM), flash drive, disk, hard disk drive, diskette, cartridge, cassette or the like, etc.).
  • the memory 1110 is an example of a tangible computer readable storage medium 1140 upon which instructions executable by the processing device 1105 may be embodied as a computer program product.
  • the memory 1110 can have a distributed architecture, where various components are situated remote from one another, but can be accessed by the processing device 1105.
  • the instructions in memory 1110 may include one or more separate programs, each of which comprises an ordered listing of executable instructions for implementing logical functions.
  • the instructions in the memory 1110 include a suitable operating system (OS) 1111 and program instructions 1116.
  • the operating system 1111 essentially controls the execution of other computer programs and provides scheduling, input-output control, file and data management, memory management, and communication control and related services.
  • the processing device 1105 is configured to execute instructions stored within the memory 1110, to communicate data to and from the memory 1110, and to generally control operations of the computer 1101 pursuant to the instructions.
  • Examples of program instructions 1116 can include instructions to implement the processing described herein in reference to FIGs. 1-10.
  • the computer 1101 of FIG. 11 also includes a network interface 1160 that can establish communication channels with one or more other computer systems via one or more network links.
  • the network interface 1160 can support wired and/or wireless communication protocols known in the art. For example, when embodied in a user system, the network interface 1160 can establish communication channels with an application server.
  • aspects of the present disclosure may be embodied as a system, method, or computer program product and may take the form of a hardware embodiment, a software embodiment (including firmware, resident software, micro-code, etc.), or a combination thereof. Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • a computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
  • the computer readable storage medium may be a tangible medium containing or storing a program for use by or in connection with an instruction execution system, apparatus, or device.
  • A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof.
  • a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the computer readable medium may contain program code embodied thereon, which may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • computer program code for carrying out operations for implementing aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • The present disclosure may be a system, a method, and/or a computer program product at any possible technical detail level of integration.
  • The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
  • the computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
  • the computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • A computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
  • the network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
  • Computer readable program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and procedural programming languages, such as the“C” programming language or similar programming languages.
  • the computer readable program instructions may execute entirely on the user’s computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instruction by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
  • These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • the flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the blocks may occur out of the order noted in the Figures.
  • two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
  • Devices that are in communication with each other need not be in continuous communication with each other, unless expressly specified otherwise. On the contrary, such devices need only transmit to each other as necessary or desirable, and may actually refrain from exchanging data most of the time. For example, a machine in communication with another machine via the Internet may not transmit data to the other machine for weeks at a time. In addition, devices that are in communication with each other may communicate directly or indirectly through one or more intermediaries.
  • Determining something can be performed in a variety of manners and therefore the term “determining” (and like terms) includes calculating, computing, deriving, looking up (e.g., in a table, database or data structure), ascertaining and the like.
  • a "processor” generally means any one or more microprocessors, CPU devices, GPU devices, computing devices, microcontrollers, digital signal processors, or like devices, as further described herein.
  • a CPU typically performs a variety of tasks while a GPU is optimized to display images.
  • Where databases are described, (i) alternative database structures to those described may be readily employed, and (ii) other memory structures besides databases may be readily employed. Any illustrations or descriptions of any sample databases presented herein are illustrative arrangements for stored representations of information. Any number of other arrangements may be employed besides those suggested by, e.g., tables illustrated in drawings or elsewhere. Similarly, any illustrated entries of the databases represent exemplary information only; the number and content of the entries can be different from those described herein. Further, despite any depiction of the databases as tables, other formats (including relational databases, object-based models, and/or distributed databases) could be used to store and manipulate the data types described herein. Likewise, object methods or behaviors of a database can be used to implement various processes, such as those described herein. In addition, the databases may, in a known manner, be stored locally or remotely from a device that accesses data in such a database.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A process for data classification of dual-domain data includes defining neural network pipelines for a source domain and a target domain of a data classifier to learn a dual-domain task of interest. The data classifier is trained by simulating a target representation during a first training session using dual-domain task-irrelevant data pairs as input, and by performing a domain adaptation in a second training session, initialized with parameters of the source domain pipeline trained in the first training session, while simultaneously feeding the task-irrelevant data pairs and applying a first loss function. The source pipeline is shared for a joint training using a second loss function to generate a second source domain representation with task-relevant data inputs. Task-relevant data in the target domain is unavailable for training the data classifier, and at least one of the training data sets is in sequential form.

Description

DOMAIN ADAPTATION AND FUSION
USING TASK-IRRELEVANT PAIRED DATA IN SEQUENTIAL FORM
TECHNICAL FIELD
[0001] The present disclosure generally relates to vision technology, and more specifically, to performing domain adaptation and fusion for classifiers that apply machine learning.
BACKGROUND
[0002] A common task of interest in visual learning (e.g., computer vision for object recognition in images) is to apply a machine learning process to one or more data distributions, which may include labeled or unlabeled images, to learn a model that can attach a label to a data sample from an unlabeled data distribution with minimal error. When the unlabeled data distribution is of a domain different than the labeled distribution, then a data adaptation process is required to transfer knowledge from the source domain (the data domain for the labeled data) to the target domain (i.e., the data domain for the unlabeled data). For example, in the case of identifying numeric digits in images, the source data distribution may include a labeled data set of grayscale images of numeric digits, where the target distribution is color (e.g., red, green, and blue (RGB)) images of numeric digits to be labeled for identification purposes. In general, such a transfer of knowledge between domains, or “domain adaptation,” may occur using one of several approaches such as unsupervised, semi-supervised, or supervised. In unsupervised domain adaptation, the training data typically includes a set of labeled source domain data and a set of unlabeled target domain data. For semi-supervised domain adaptation, the training data typically includes a set of labeled source domain data, a set of unlabeled target domain data, and a small amount of labeled target domain data. Fully supervised domain adaptation involves completely labeled training data for both the source domain and the target domain.
[0003] Machine learning classifiers implement classification algorithms for mapping input data to a category based on instances of observations and features/feature vectors of observation properties for a predicted class. Recent progress in computer vision classifiers has been dominated by deep neural networks (DNNs) trained with large amounts of labeled data. However, collecting and annotating such datasets is a tedious task, and in some contexts even impossible. Therefore, one line of approaches depends only on synthetically generated data for training DNNs (for example, rendering depth images from CAD models). Nevertheless, for certain domains, it is also very difficult and non-trivial to synthesize realistic data, such as RGB images, sequential data, or meaningful documents with specific structure.
[0004] Information that is useful to solve practical tasks for computer vision classifiers often exists in different domains, in which the information is captured by various sensors. As used herein, a "domain" can refer to either a modality or a dataset. For example, in one scenario, a 3D layout of a room can be captured by a depth sensor or inferred from RGB images. In real-world scenarios, however, access to data from certain domain(s) is often limited. The shortage of labeled data for training classifiers in specific domains is a significant problem in machine learning applications since the cost of acquiring data labels is often high. Domain adaptation is one way to address this problem by leveraging labeled data in one or more related domains, often referred to as "source domains," when learning a classifier for labeling unseen data in a "target domain" for a task of interest (TOI).
[0005] In an earlier copending filing, US Pat. Publication 2018/03300205, a domain adaptation and fusion method was proposed as a versatile approach that can effectively transfer the learned abstract features from one image domain to another without requiring task-relevant data from the target domain, while at the same time optimizing over the end objective of the TOI.
SUMMARY
[0006] An approach to effectively learn a feature representation fusing the source domain and target domain (either the source or target domain, or both, can be in sequential form) without using any task-relevant data from the target domain to further enhance analytics performance is described herein. An objective is to learn, from a source data distribution, a well-performing model on a target data distribution of a different domain. Unlike previous work dealing with static imagery, this disclosure includes classifier methods to process sequential data or domains involving time series.
[0007] Embodiments of the present disclosure include methods, systems, and computer program products for performing domain adaptation and fusion between a source domain and a target domain using task-irrelevant paired data, where at least one of the source domain data and target domain data is in sequential form. A non-limiting example includes an iterative training process for a neural network pipeline in the source domain and a neural network pipeline in the target domain. An initial training for the source domain pipeline is performed using labeled task-relevant source domain data, which may be sequential data such as words of a document. To address unavailable task-relevant target domain data for training, a target domain representation may be simulated by training the target domain neural network pipeline against the trained source domain pipeline with the source domain parameters fixed, using task-irrelevant pairs of dual-domain input data. In another embodiment example, to address a lack of common abstract features between the target domain and source domain, a domain adaptation training step may include simultaneously training the source domain representation and the target domain representation with two loss functions, where the first loss function drives the source domain pipeline with the labeled task-relevant data inputs and the second loss function drives the target domain representation by domain adaptation to the source domain using the task-irrelevant dual-domain pairs of input data. In another embodiment example, an extension of the end product of domain adaptation (i.e., a source-domain task of interest solver and a target-domain task of interest solver) is domain fusion, which produces a dual-domain (source and target) task solver that is robust to noise in either domain.
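For illustration only (not part of the claimed subject matter), the staged training summarized above can be outlined in a few lines of PyTorch-style code; the module types, feature size, and class count below are assumptions used to make the sketch concrete.

import torch.nn as nn

FEAT = 64  # assumed size of the shared abstract feature representation

source_encoder = nn.GRU(input_size=32, hidden_size=FEAT, batch_first=True)  # sequential source pipeline
target_encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, FEAT))      # non-sequential target pipeline
task_head = nn.Linear(FEAT, 10)                                             # task-of-interest solver

# Stage 1: train source_encoder + task_head on labeled, task-relevant source data.
# Stage 2: freeze one pipeline, feed task-irrelevant (source, target) pairs to both,
#          and minimize a distance (e.g., L2) between the two representations.
# Stage 3: joint training, sharing source_encoder between the task loss of Stage 1
#          and the pairwise loss of Stage 2.
# Stage 4 (fusion): concatenate the two representations and train a joint task solver.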
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The specifics of the exclusive rights described herein are particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other features and advantages of the embodiments of the disclosure are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow diagram illustrating an example of a training data set in accordance with one or more embodiments of the present disclosure;
FIG. 2 is a flow diagram illustrating an example of a training process for a machine learning network to generate a single domain source representation for a task of interest in accordance with one or more embodiments of the present disclosure;
FIG. 3 is a flow diagram illustrating an example of a training process for a machine learning network using domain adaptation to transfer features to target domain data in accordance with one or more embodiments of the present disclosure;
FIG. 4 depicts a flow diagram illustrating an example of a training process for a machine learning network to simulate a target domain representation in accordance with one or more embodiments of the present disclosure;
FIG. 5 depicts a flow diagram illustrating a joint training process of domain adaptation according to one or more embodiments of the present disclosure;
FIG. 6 is a flow diagram illustrating an example of a training process for a machine learning network using domain fusion to generate a joint representation in accordance with one or more embodiments of the present disclosure;
FIG. 7 is a flow diagram illustrating an example of a testing process for the joint representation trained in FIG. 6 in accordance with one or more embodiments of the present disclosure;
FIG. 8 depicts an example of a system that facilitates machine learning in accordance with one or more embodiments of the present disclosure;
FIG. 9 is a schematic illustration of a cloud computing environment in accordance with one or more embodiments of the present disclosure;
FIG. 10 is a schematic illustration of abstraction model layers in accordance with one or more embodiments of the present disclosure; and
FIG. 11 is a schematic illustration of a computer system in accordance with one or more embodiments of the present disclosure.
DETAILED DESCRIPTION
[0009] Various embodiments of the disclosure are described herein with reference to the related drawings. Alternative embodiments of the disclosure can be devised without departing from the scope of this disclosure. Various connections and positional relationships (e.g., over, below, adjacent, etc.) are set forth between elements in the following description and in the drawings. These connections and/or positional relationships, unless specified otherwise, can be direct or indirect, and the present disclosure is not intended to be limiting in this respect. Accordingly, a coupling of entities can refer to either a direct or an indirect coupling, and a positional relationship between entities can be a direct or indirect positional relationship. Moreover, the various tasks and process steps described herein can be incorporated into a more comprehensive procedure or process having additional steps or functionality not described in detail herein.
[0010] Methods and systems are described for achieving a task objective related to data of more than one domain. As noted above, when applying machine learning to such data, access to data from one or more domains is often limited or not available for training the network. One of the main challenges faced by a domain adaptation solution is identifying how knowledge learned from a given set of data can be applicable to other domains in which a source domain, a target domain, and a task of interest (TOI) are given. One objective of a domain fusion solution is to obtain a dual-domain (source and target) task solver which is robust to noise in either domain.
[0011] While there are existing proposed approaches (supervised, semi-supervised, or unsupervised) to domain adaptation and fusion, they assume that the task-relevant data (e.g., the data that is directly applicable and related to the TOI) in the target domain is available at training time. However, the target domain data is typically not available. For example, acquiring a depth image inside a small, delicate component may be infeasible due to an unstable tool at hand and/or a limited amount of time or budget.
[0012] As used herein, the term "task-irrelevant data" is used to refer to data which is not task relevant. For example, in a case of applying a two-scene classification task that identifies computer room and conference room images, images that show a bedroom, bathroom, or basement would be task-irrelevant data.
[0013] As used herein, the term "source modality" or "source domain" refers to the modality or domain that the abstract features are learned from and are to be transferred from. As used herein, the term "target modality" or "target domain" refers to the modality or domain that the abstract features are to be transferred to.
[0014] As used herein, the term "task-relevant data" refers to data that is directly applicable and related to the end objective of the TOI. For example, if the task is classifying images of cats and dogs, then any image containing either a cat or a dog is considered to be task-relevant data. The term "task-relevant images" is used herein to refer to task-relevant data that includes images. As used herein, the term "task-irrelevant data" refers to data that is not applicable to the end objective and has no relation to the end objective. For example, if the task is classifying images of cats and dogs, then any image that does not contain either a cat or a dog is considered to be task-irrelevant data. The term "task-irrelevant images" is used herein to refer to task-irrelevant data that includes images.
[0015] One or more embodiments of the present disclosure address one or more of the problems identified above by providing a domain adaptation and fusion approach which learns common abstract features between source domain and target domain based on knowledge extracted from task-irrelevant dual-domain training pairs without having task-relevant target domain training data available. FIG. 1 shows a basic flow diagram for the training data for a domain adaptation and fusion process presented in this disclosure, drawn from task relevant data 110 and task irrelevant data 120. Task relevant source domain data 115 and dual domain task irrelevant data pairs 125 are adapted and fused by the domain adaptation and fusion engine 150, which will be described in further detail below.
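By way of a purely illustrative sketch (the field names, sample values, and file path below are hypothetical and not drawn from the disclosure), the two training sets of FIG. 1 can be organized in Python as a labeled task-relevant source set and a set of task-irrelevant dual-domain pairs:

from dataclasses import dataclass
from typing import List

@dataclass
class LabeledSourceSample:      # task-relevant source domain data (e.g., a document)
    words: List[str]            # sequential data units of one document
    class_label: int            # supervisory label, e.g., a workpiece ID

@dataclass
class DualDomainPair:           # task-irrelevant dual-domain data pair
    words: List[str]            # source-domain member (document)
    depth_image_path: str       # target-domain member (depth scan image)

task_relevant_source = [
    LabeledSourceSample(words=["smooth", "surface", "no", "cracks"], class_label=1),
]
task_irrelevant_pairs = [
    DualDomainPair(words=["pitted", "gear", "housing"], depth_image_path="pair_0001_depth.png"),
]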
[0016] One or more of the disclosed embodiments can be used to solve domain adaptation and sensor fusion tasks involving sequential data when the task-relevant target domain data is unavailable at training time. As an implementation example, assume the given task is workpiece classification using depth images and/or the corresponding documents (e.g., captions or inspection reports of the task-relevant workpieces). The solution enables classifying task-relevant workpieces when data from either one or both domains is available. Given (1) a set of dual-domain pairwise training data (depth-document pairs) of task-irrelevant workpieces, and (2) a set of source domain training data (documents only, without depth images) of task-relevant workpieces, the source domain is document data (sequential data units such as word sequences) and the target domain is depth images (static imagery). As a particular use case example, the objective task of interest is predicting remaining lifetime for workpieces 1, 2 and 3 (e.g., turbine blades) based on dual-domain data of examination report documents (e.g., descriptions of periodic visual inspection of turbine blades) and depth scan images for workpieces 1, 2, 3. For example, the documents may contain semantic information pertaining to the workpieces, such as descriptions of observed surface characteristics (e.g., smooth, cracked, pitted, surface imperfections, etc.) which correspond to the accompanying depth scan images. In this example, there is currently an insufficient number of depth scan images to properly train a neural network classifier. However, there are sufficient available data pairs of inspection documents and accompanying depth scan images for workpieces 4, 5 and 6 (e.g., propeller blades), which have been classified (i.e., labeled), providing an adequate set of pairwise training data for source and target domain neural network training. In this case, the data associated with workpieces 4, 5, 6 are "task-irrelevant" data pairs (i.e., being unrelated to classifying workpieces 1, 2, 3 in terms of being a uniquely different item and/or a different type of item) to which domain adaptation and domain fusion may be performed in accordance with embodiments of the present disclosure. As this example is non-limiting, disclosed embodiments also pertain to other tasks of interest (e.g., regression tasks) and/or other domains (either static imagery or sequential data or time series).
[0017] One or more embodiments of the present disclosure described herein provide a process for domain adaptation transferring learned feature representations, or transfer learning, where the learning is based on a source data distribution in a well performing model being applied to a different target data distribution. The transfer occurs from one domain to another domain using only pairwise information from the two domains, where either one or both domains can be sequential data or time series. The pairwise information used in the paired training data can be any kind of fixed correspondence or relationship between the source and target domains such as, but not limited to, spatial relation, or temporal relation of time series data.
[0018] In accordance with one or more embodiments of the present disclosure, a TOI solution (e.g., a classifier or detector) of a target domain is learned with only task-irrelevant pairwise data and task-relevant source domain data, where either the source or target domain, or both, is sequential data or time series.
[0019] In accordance with one or more embodiments of the present disclosure, shared abstract features are extracted from source and target domains by jointly optimizing over an objective TOI using task-irrelevant pairwise data pairs from source and target domains, where either the source or target domain, or both, is sequential data or time series.
[0020] In accordance with one or more embodiments of the present disclosure, where either the source or target domain, or both, is sequential data or time series, a versatile approach is provided that can effectively transfer learned abstract features from one modality to another without requiring objective-relevant, or task-relevant, data from the target modality, while at the same time optimizing over the target objective. Based on the transfer of the learned abstract features, an approach to effectively learn a feature representation by fusing the source modality and target modality without using any task-relevant data from the target modality is provided to further enhance the performance of analytics.
[0021] One or more embodiments of the present disclosure include a process for learning a fused representation and a TOI solver of the source and target domain with task-relevant training data only from the source domain. In the fusion learning, the source domain data and a source neural network (e.g., a convolutional neural network (CNN), a recurrent neural network (RNN), or a deep neural network (DNN)) can be used to simulate the input of the target domain in the target domain thread. For the source domain thread in the fusion learning, the neural network may be fine-tuned to explore effective unique (i.e., not shared by the target domain) abstract features in the source domain to further boost the fusion performance.
[0022] FIG. 2 is a flow diagram illustrating an example of a training process for a machine learning network to generate a single domain source representation for a task of interest in accordance with one or more embodiments of the present disclosure. A softmax loss function 230 receives supervisory labels, and a source representation 216 is generated by a series of neural network iterations, shown as source RNN operations 211, 213, 215, being trained using the process shown in FIG. 2. Because sequential data is being processed as input data, each RNN operation is a sequential operation generating an intermediate representation 212, 214, which is a recurrent input for the next operation, which produces a next intermediate representation from a current data input, with the final input producing a final representation, shown as source representation 216. In an embodiment for an objective task of classifying a workpiece property or characteristic (e.g., identifying surface and sub-surface inspection properties useful for predicting failure of a workpiece), the source domain data type A may be documents related to a workpiece ID, and the supervisory labels would be the workpiece ID. For the example shown in FIG. 2, the neural network is trained by source domain data with the objective of the training being to recognize the class (or category). The source domain may be documents, and the sequential data units of the source domain data may be words w1, w2, ..., wn of the input documents 210, received as training data by the source RNN operations 211, 213, 215. However, process 200 may be implemented by applying any source domain data type having a sequential format. The RNN operations 211, 213, 215 are used to encode the sequential data, where each sequential data unit w1, w2, ..., wn is sequentially fed to a neural network unit (e.g., an RNN) which takes that data unit and the encoded features from the previous neural network operation as input, and computes encoded features according to an applicable machine learning process, including but not limited to classification, clustering, regression, sequence labeling, or probabilistic classification. For the example of sequential data words, the training process continues until the last word of the document is used as input, and the final encoded features are the source representation 216 (e.g., single-dimension and/or multi-dimension vectors representing common, abstract features of the source domain and the target domain; examples of feature values may include correspondences to image pixels or to document words). The neural network unit is trained with the objective of recognizing the workpiece ID to which each document belongs, represented by source representation 216. Each intermediate representation of RNN operations 211, 213, 215 and the source representation 216 may be implemented, for example, as a one-dimensional feature vector. As shown in FIG. 2, the training is according to a supervised approach in which workpiece ID labels are input to the softmax loss function 230 to supervise the training. The softmax loss function 230 is an objective function that provides feedback that is used to adjust the encoding by the source neural network.
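A minimal sketch of this single-domain training step is shown below, assuming PyTorch, an nn.GRU standing in for RNN operations 211, 213, 215, and words already mapped to integer IDs; the vocabulary size, feature sizes, optimizer, and random batch are illustrative assumptions rather than the disclosed implementation.

import torch
import torch.nn as nn

class SourceEncoder(nn.Module):
    # Encodes a word sequence w1..wn into a source representation (cf. 216 in FIG. 2).
    def __init__(self, vocab_size=5000, embed_dim=64, feat_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, feat_dim, batch_first=True)

    def forward(self, word_ids):                 # word_ids: (batch, seq_len)
        _, h_n = self.rnn(self.embed(word_ids))  # words fed sequentially; keep the final state
        return h_n.squeeze(0)                    # final encoded features = source representation

encoder = SourceEncoder()
head = nn.Linear(128, 10)                        # 10 hypothetical workpiece IDs
criterion = nn.CrossEntropyLoss()                # plays the role of the softmax loss 230
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)

words = torch.randint(0, 5000, (8, 40))          # a batch of 8 documents, 40 word IDs each
labels = torch.randint(0, 10, (8,))              # supervisory workpiece-ID labels

loss = criterion(head(encoder(words)), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()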
[0023] FIG. 3 is a flow diagram illustrating an example of a training process for a machine learning network to transfer features from the source domain to target domain data in accordance with one or more embodiments of the present disclosure. In the process shown in FIG. 3, a dual-domain training is performed for a source domain neural network pipeline and a target domain neural network pipeline. The processing shown in FIG. 3 takes a set of discriminative abstract features in the source domain (e.g., such as a document domain for this example) that were learned using a process such as that shown in FIG. 2, and adapts them to the target domain, which is of a different domain (e.g., such as a depth image domain). Here, both the source domain input data (e.g., data type A, or documents) and the target domain input data (e.g., data type B, or depth images) are task relevant in terms of classifying data of two domains that relate to a common feature (e.g., a workpiece). In an embodiment, parameters for the source domain neural network pipeline, shown as source RNN operations 311, 313, 315 and source representation 316, are fixed, while the target domain neural network, shown as DNN 321, is trained using the process shown in FIG. 3. Here, the recurrent operation of source domain RNN operations 311, 313, 315 is similar to that described above for RNN operations 211, 213, 215 shown in FIG. 2. The target domain for this example may be workpiece depth images, and the input data 320 may be workpiece depth images used to train the target DNN 321 to generate target representation 326. In this example, the input data 320 is not sequential with respect to the source data, and as such, the target domain neural network pipeline need not be an RNN. For example, a single depth scan image of a workpiece may be processed by target DNN 321 in a non-recurrent manner, while a document of 1,000 words in the source domain pertains to the same workpiece, and the words are training data for the source pipeline RNN operations 311, 313, 315. In the case when the target domain is based on sequential data or time series, the target DNN 321 can be replaced with a series of RNNs and trained with a method similar to that of the source RNNs 311, 313, 315. FIG. 3 shows the use of an adversarial learning process for domain adaptation to transfer knowledge from the source domain (i.e., the source representation 316) to the target domain (e.g., the depth image domain) with minimal errors. As shown in FIG. 3, a classifier shown as discriminator 340 outputs a domain label 375 which is a value of zero or one depending on which domain (target or source) the input comes from. To perform the discriminator decision, discriminator 340 accepts one input at a time, either from the source representation 316 or the target representation 326. The training process 300 converges once the discriminator 340 is unable to determine which domain is providing the input, such that the source representation 316 and the target representation 326 have become practically indistinguishable.
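The adversarial adaptation of FIG. 3 can be sketched as follows, with the frozen source representation replaced by a random stand-in tensor; the target DNN, the discriminator architecture, the image size, and the alternating update are illustrative assumptions and not the disclosed implementation.

import torch
import torch.nn as nn

feat_dim = 128
target_dnn = nn.Sequential(                        # target pipeline (cf. DNN 321)
    nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU(), nn.Linear(256, feat_dim))
discriminator = nn.Sequential(                     # domain classifier (cf. discriminator 340)
    nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

bce = nn.BCEWithLogitsLoss()
opt_t = torch.optim.Adam(target_dnn.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

depth_images = torch.rand(8, 1, 64, 64)            # task-relevant target inputs (cf. 320)
src_rep = torch.randn(8, feat_dim)                 # stand-in for the frozen source representation 316

# Discriminator step: label source representations 1 and target representations 0.
tgt_rep = target_dnn(depth_images)
d_loss = bce(discriminator(src_rep), torch.ones(8, 1)) + \
         bce(discriminator(tgt_rep.detach()), torch.zeros(8, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Target step: train the target DNN to fool the discriminator, so that the two
# representations become practically indistinguishable.
g_loss = bce(discriminator(target_dnn(depth_images)), torch.ones(8, 1))
opt_t.zero_grad(); g_loss.backward(); opt_t.step()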
[0024] After target DNN 321 is trained, the target representation 326 and source representation 316 are theoretically interchangeable, as the learned target representation 326 is now adapted to the source representation 316. As such, the TOI solution working for the source domain should also work for the target domain. Using an adversarial learning process, such as that shown in FIG. 3, is effective in domain adaptation even without much supervision. A drawback to the approach shown in FIG. 3 is that real depth images 320 from the target domain that are task-relevant are required to train the target DNN 321, and these images are often unavailable or difficult to obtain. For example, workpiece images for training the target domain should be relevant to workpiece documents used by the source domain pipeline, but such workpiece images may not be readily available for effective training.
[0025] Turning now to FIG. 4, a flow diagram is shown illustrating a training process for a machine learning network to simulate a target domain representation using task-irrelevant dual-domain data pairs in accordance with one or more embodiments of the present disclosure. The neural network for the source domain pipeline in FIG. 4, shown as RNN operations 411, 413, 415, operates in a recurrent manner similar to that described above for RNN operations 211, 213, 215 in FIG. 2. As described above, the adversarial learning-based domain adaptation approach shown in FIG. 3 requires task-relevant data from the target domain, which may be unavailable. In an embodiment, process 400 includes L2 loss function 455 to remove the dependency on task-relevant data from the target domain by extracting abstract features common to both the source domain and target domain found in task-irrelevant data pairings. In an embodiment, the task-irrelevant dual-domain data pairs 401 are related to the task-relevant dual-domain data in that they share a common source domain (data type A) and a common target domain (data type B). For example, for a TOI of classifying workpiece ID from a document or a depth image, the source domain may be documents and the target domain may be depth images. As such, the task-relevant source domain data may be documents related to a workpiece, the task-relevant target domain data may be depth images related to a workpiece, the task-irrelevant source domain data may be documents related to automobile parts (e.g., data 410), and the task-irrelevant target domain data may be depth images of automobile parts (e.g., data 420), where each pair of the dual-domain pairs 401 of a document 410 and a depth image 420 is labeled with pairwise label 430. The process 400 exploits the availability of a large quantity of dual-domain training data pairs 401 which are not relevant to the TOI. By learning abstract features common to the source domain and the target domain for the paired data 401, the source representation 416 and target representation 426 can be adapted by the loss function 455.
[0026] In an embodiment, parameters for the source domain neural network pipeline, shown as RNN operations 411, 413, 415, and source representation 416 are trainable, while parameters for the target neural network unit, shown as DNN 421, and target representation 426 are fixed, which permits a domain adaptation for a dual-domain training process 400. In another embodiment, the parameters are fixed for the source pipeline while the target pipeline is trainable. The task-irrelevant dual-domain pairs are simultaneously fed to the source pipeline and the target pipeline to train the source pipeline until source representation 416 is adapted to the target representation 426. The source domain data 410 may be sequential data type A (e.g., document data with sequential words w1, w2, ..., wn). The target domain data 420 may be data type B (e.g., depth images) related to the data 410, such as documents and depth images related to a common feature (e.g., automobile parts), and unrelated to the TOI (e.g., for classification of particular workpieces). L2 loss function 455, which takes the supervision of the pairwise label 430, is applied to output from the source representation 416 and output from the target representation 426. In one or more embodiments, the L2 loss function 455 can be replaced with any suitable loss function that encourages the similarity of the two input representations (e.g., an L1 loss function). The result is a source representation 416 which simulates a target representation (e.g., target representation 326) that would be generated if task-relevant target domain training data (e.g., data 320) had been available. The result of process 400 is that source representation 416 is trained to resemble (or "simulate") the target representation 426.
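A minimal sketch of this pairwise training, assuming PyTorch, a GRU standing in for RNN operations 411, 413, 415, and pre-embedded word vectors, is given below; the shapes and optimizer are assumptions, and the pairwise correspondence of label 430 is represented implicitly by the batch alignment of the two inputs.

import torch
import torch.nn as nn

feat_dim = 128
source_encoder = nn.GRU(64, feat_dim, batch_first=True)            # source pipeline (trainable)
target_dnn = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, feat_dim))
for p in target_dnn.parameters():
    p.requires_grad_(False)                                        # target pipeline fixed

l2 = nn.MSELoss()                                                  # L2 loss 455 (an L1 loss also works)
opt = torch.optim.Adam(source_encoder.parameters(), lr=1e-3)

doc_embeddings = torch.randn(8, 40, 64)    # task-irrelevant source inputs (embedded words)
depth_images = torch.rand(8, 1, 64, 64)    # paired task-irrelevant target inputs

_, h_n = source_encoder(doc_embeddings)
src_rep = h_n.squeeze(0)                   # source representation (cf. 416)
with torch.no_grad():
    tgt_rep = target_dnn(depth_images)     # fixed target representation (cf. 426)

loss = l2(src_rep, tgt_rep)                # encourage similarity of the paired representations
opt.zero_grad(); loss.backward(); opt.step()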
[0027] One drawback to using the processing shown in FIG. 4 is that there may be insufficient abstract features identifiable as common to both source and target domains, which may significantly degrade the performance of transferring the abstract features in the task-irrelevant data pairs between the source domain and the target domain. The embodiment shown in FIG. 5 overcomes this drawback as a refinement to process 400 by combining two loss functions together in a joint training process.
[0028] FIG. 5 depicts a flow diagram illustrating a joint training process of domain adaptation according to one or more embodiments of the present disclosure. A way to overcome the lack of features identifiable as common to both source and target domains is achieved with process 500 by jointly training the source domain pipeline with both the task-relevant source domain data and the task-irrelevant source domain data. A first loss function is used for the TOI task-relevant data, and a second loss function compares the dual-domain task-irrelevant source/target pipelines of FIG. 4, which enforces similarity of abstract features extracted from source domain data and target domain data. As shown in FIG. 5, a training process 500 applies a source domain neural network pipeline shown as RNN operations 511, 513, 515, which are shared for both sets of source domain data (i.e., the task-relevant data 510 and the task-irrelevant data 505). The source representation function 516 is also shared by both source domain data input streams 510, 505. The recurrent operation of RNN operations 511, 513, 515 is similar to that described above for RNN operations 211, 213, 215 in FIG. 2. By integrating the training of the source representation according to the source domain pipeline shared by two input streams 510, 505, the task of transferring abstract features from the source domain to the target domain and optimization over the target task objective can be achieved simultaneously. In an embodiment, the target pipeline is previously trained using target domain data 520 of the task-irrelevant data pairs 501. Once trained, the target neural network unit shown as DNN 521 and target representation 525 are fixed for training the source RNN operations 511, 513, 515 and source representation 516. The training output is two analytic pipelines with source modality and target modality, which can solve the task objective effectively, despite having no task-relevant data from the target domain for use throughout the training process. Class-labeled task-relevant data 510 (e.g., document data 510 with sequential words w1, w2, ..., wn) is a first input for the source domain pipeline, while task-irrelevant data 505 (e.g., document data 505 with sequential words w1, w2, ..., wn) is a second input for the source domain pipeline. The task-irrelevant data 505 includes a plurality of task-irrelevant sequential data units that are paired with target domain data 520 to form dual-domain task-irrelevant data pairs 501, which are simultaneously fed to the source domain pipeline and target domain pipeline. Each pair 501 of the task-irrelevant data comprises a series of task-irrelevant sequential data units (e.g., words w1, w2, ..., wn) and a task-irrelevant data unit 520 (e.g., a depth scan image). Pairwise labels 542 for the input pairs 501 are fed to the L2 loss function 535 (e.g., a property feature of the automobile part described and depicted by the dual-domain data input pair). The result of training process 500 is a source representation 516 configured as a source domain classifier refined by knowledge of abstract features learned from task-irrelevant data pairings.
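The joint training of FIG. 5 can be sketched as a single update that sums the two losses over a shared source encoder; the loss weighting, shapes, and use of pre-embedded words are assumptions, and the frozen linear target pipeline merely stands in for DNN 521.

import torch
import torch.nn as nn

feat_dim, num_classes = 128, 10
source_encoder = nn.GRU(64, feat_dim, batch_first=True)      # shared source pipeline
task_head = nn.Linear(feat_dim, num_classes)                 # task-of-interest solver
target_dnn = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, feat_dim))
for p in target_dnn.parameters():
    p.requires_grad_(False)                                  # target pipeline fixed

ce, l2 = nn.CrossEntropyLoss(), nn.MSELoss()
opt = torch.optim.Adam(list(source_encoder.parameters()) + list(task_head.parameters()), lr=1e-3)

def encode(seq):                                             # final RNN state = source representation
    _, h_n = source_encoder(seq)
    return h_n.squeeze(0)

relevant_docs = torch.randn(8, 40, 64)                       # task-relevant source stream (cf. 510)
labels = torch.randint(0, num_classes, (8,))                 # class labels for the TOI
irrelevant_docs = torch.randn(8, 40, 64)                     # task-irrelevant source stream (cf. 505)
paired_depth = torch.rand(8, 1, 64, 64)                      # paired task-irrelevant target data (cf. 520)

loss_task = ce(task_head(encode(relevant_docs)), labels)     # first loss: TOI on task-relevant data
with torch.no_grad():
    tgt_rep = target_dnn(paired_depth)
loss_pair = l2(encode(irrelevant_docs), tgt_rep)             # second loss: adaptation on the pairs

loss = loss_task + 1.0 * loss_pair                           # equal weighting is an assumption
opt.zero_grad(); loss.backward(); opt.step()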
[0029] In an embodiment, the product of the domain adaptation of process 500 allows the trained source representation 516 to be applied as a simulated target representation, which could be used in a testing phase operation for classifying task-relevant target domain data. For example, in a case where the TOI relates to workpiece property classification with document data for the source domain and depth scan image data for the target domain, the source representation 516 may be applied in a target pipeline that is fed depth image data corresponding to source domain document data for a TOI workpiece. In other words, the source representation 516 may be implemented as a classifier in place of a trained target domain classifier, where task-relevant target domain training data was unavailable to generate a proper target domain classifier.
[0030] Domain fusion is a process that can construct a TOI solution for a joint classifier capable of handling input data of either the source or target domain. In an ideal training situation, task-relevant training data is available in both the target and source domains. In practice, however, the source and/or target domain data may be noisy when operating the classifier, which can impact performance even if a robust training of the neural network was performed with a sufficient amount of training data in both domains. The following describes one or more embodiments for domain fusion that train dual-domain neural network pipelines to simulate noisy real data, or to handle cases where insufficient training data is available in the target domain, in order to provide more robust joint classifier performance under such conditions.
[0031] Turning now to FIG. 6, a flow diagram illustrates an example of a training process for a machine learning network using domain fusion to generate a joint classifier in accordance with one or more embodiments of the present disclosure. As shown in FIG. 6, the domain fusion pipeline is trained by concatenating two analytics streams for the source domain and the target domain, which are represented by source representation 616 and target representation 626, respectively, to generate a concatenated representation 630, thereby optimizing a joint classifier 640 for a TOI objective function (e.g., workpiece classification). While a concatenation function 630 is shown, other combination techniques may be used, including but not limited to average pooling (computing the average of the source and target representations), or randomly dropping out a few elements of either the source or target representation, or both, before concatenation to improve robustness. In an embodiment in which task-relevant training data is not available from the target domain, the learning of abstract features shared between the domains (i.e., similarity of domains) can be taken from the previously described embodiments shown in FIGs. 4 and 5, which learned abstract features from the task-irrelevant data pairings. This allows the output of the target representation 626 to be simulated by feeding task-relevant data input 620 to a target pipeline initialized by the parameters that generated the simulated target representation 416 in the training process 400, shown in FIG. 4. As in previously described embodiments, the training data input 620 for the target pipeline may be in sequential form, such as words w1, w2, ..., wn of a document, to be processed by a series of neural network pipeline operations shown as RNN 621, 623, 625. Similarly, the training data input 610 for the source pipeline may be sequential data units, such as words w1, w2, ..., wn of a document, to be processed by a series of neural network operations shown as RNN 611, 613, 615. To address the lack of target domain training data, the source pipeline is initialized using parameters that generated the source representation 516 of the domain adaptation process 500 shown in FIG. 5. In an embodiment, the fusion training process 600 fixes the parameters for the target pipeline that learned abstract features between the source domain and target domain as a refinement to the domain adaptation process 500, and uses the source pipeline of process 600 to learn a source representation 616, which can explore the potential of the source domain fully, without the constraint that only the shared abstract features should be utilized for the domain fusion.
[0032] As shown in FIG. 6, a task-relevant data input stream 610 (e.g., data type A) is fed to the source pipeline and a task-relevant data input stream 620 (e.g., data type A) is input to the target pipeline, which allows the use of simulated target domain data to train the classifier 640 in the absence of target domain training data (e.g., data type B). This is made possible by the simulated target representation 416 being applied to initialize the target pipeline for the fusion training process 600. In an embodiment, either one of the task-relevant data inputs 610, 620 to the source domain pipeline or target domain pipeline can optionally include empty inputs to simulate missing data and/or noise for a more robust performance of the classifier during training.
[0033] As shown in FIG. 6, outputs from the source representation 616 and the target representation 626 are combined by the concatenation representation function 630, which is input to joint classifier 640 and trained using a loss function 650 (e.g., a softmax loss function). The resultant trained joint classifier 640 is capable of generating a prediction for a dual-domain TOI objective. For example, in an implementation for solving a TOI that identifies a workpiece condition for analyzing remaining lifetime prediction, the source domain data is sequential words of examination report documentation, and the target domain data is depth scan images, each generated for a respective examination report, to form paired data inputs. When testing the trained joint classifier 640 with these paired inputs, the classifier 640 can identify the condition of the workpiece using a fused representation of the document-data input pairs.
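A minimal sketch of the fusion training follows, assuming PyTorch, GRU encoders for both sequential streams, and a linear joint classifier; in practice the encoders would be initialized from the earlier training stages as described above, and all sizes here are assumptions.

import torch
import torch.nn as nn

feat_dim, num_classes = 128, 10
source_encoder = nn.GRU(64, feat_dim, batch_first=True)      # initialized from the FIG. 5 parameters
target_encoder = nn.GRU(64, feat_dim, batch_first=True)      # initialized from the FIG. 4 parameters, then fixed
for p in target_encoder.parameters():
    p.requires_grad_(False)
joint_classifier = nn.Linear(2 * feat_dim, num_classes)      # classifier 640 over the concatenation 630

ce = nn.CrossEntropyLoss()                                   # softmax loss 650
opt = torch.optim.Adam(list(source_encoder.parameters()) + list(joint_classifier.parameters()), lr=1e-3)

docs_src = torch.randn(8, 40, 64)                            # task-relevant source stream (cf. 610)
docs_tgt = torch.randn(8, 40, 64)                            # task-relevant stream simulating the target (cf. 620)
labels = torch.randint(0, num_classes, (8,))

def encode(rnn, seq):
    _, h_n = rnn(seq)
    return h_n.squeeze(0)

fused = torch.cat([encode(source_encoder, docs_src),
                   encode(target_encoder, docs_tgt)], dim=1) # concatenated representation (cf. 630)
loss = ce(joint_classifier(fused), labels)
opt.zero_grad(); loss.backward(); opt.step()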
[0034] FIG. 7 is a flow diagram illustrating an example of a testing process for the joint classifier trained in FIG. 6 in accordance with one or more embodiments of the present disclosure. After the fusion analytics pipeline learning phase in FIG. 6 is completed, the elements of process 600 are altered by removal of the loss function 650, as the joint classifier 740 is configured to directly output a predicted class label 755, which provides a TOI objective result. Also, the simulated target analytics pipeline of RNN operations 621, 623, 625 can be replaced with a neural network unit, shown as target DNN 721, suitable for actual task-relevant target domain data. For example, if the target domain data type is not in sequential form, another type of DNN, such as a CNN, may be used in place of an RNN. In operation of the testing process 700, simulated data is not applied as during the training phase, and actual task-relevant source domain data and task-relevant target domain data are paired according to at least one common feature (e.g., document words as sequential data units in the source domain and depth scan images for the target domain, paired according to having originated from a common examination report for a TOI workpiece). The joint classifier 740 is capable of generating an output prediction that relates to the TOI objective for the given input pair (e.g., a workpiece property label based on a dual-domain paired input of document information and depth scan image information from an examination report of a particular workpiece of interest).
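At test time, the loss is dropped and the joint classifier directly emits a predicted class label for a paired (document, depth image) input, as sketched below; all modules are freshly constructed stand-ins here, whereas in practice they would carry the parameters trained in the fusion stage.

import torch
import torch.nn as nn

feat_dim, num_classes = 128, 10
source_encoder = nn.GRU(64, feat_dim, batch_first=True)                   # trained source pipeline
target_cnn = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, feat_dim))    # stand-in for target DNN 721
joint_classifier = nn.Linear(2 * feat_dim, num_classes)                   # trained joint classifier 740

doc = torch.randn(1, 40, 64)        # embedded words of an examination report (source domain)
depth = torch.rand(1, 1, 64, 64)    # paired depth scan image of the same workpiece (target domain)

with torch.no_grad():
    _, h_n = source_encoder(doc)
    fused = torch.cat([h_n.squeeze(0), target_cnn(depth)], dim=1)
    predicted_label = joint_classifier(fused).argmax(dim=1)               # predicted class label (cf. 755)
print(int(predicted_label))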
[0035] Turning now to FIG. 8, a block diagram illustrates an example of a non-limiting system that facilitates machine learning in accordance with one or more embodiments of the present disclosure. System 800 is a machine learning system that can be utilized to solve a variety of technical issues (e.g., learning previously unknown functional relationships) in connection with technologies such as, but not limited to, machine learning technologies, time-series data technologies, data analysis technologies, data classification technologies, data clustering technologies, trajectory/journey analysis technologies, medical device technologies, collaborative filtering technologies, recommendation system technologies, signal processing technologies, word embedding technologies, topic model technologies, image processing technologies, video processing technologies, audio processing technologies, and/or other digital technologies. System 800 employs hardware and/or software to solve problems that are highly technical in nature, that are not abstract and that cannot be performed as a set of mental acts by a human.
[0036] In certain embodiments of the disclosure, some or all of the processes performed by system 800 are performed by one or more specialized computers (e.g., one or more specialized processing units, a specialized computer with a domain adaptation and fusion component, etc.) for carrying out defined tasks related to machine learning. In some embodiments of the disclosure, system 800 and/or components of the system are employed to solve new problems that arise through advancements in the technologies mentioned above, employment of image data, machine learning processes, and/or computer architecture, and the like. In one or more embodiments of the disclosure, the system 800 provides the above-described technical improvements to machine learning systems, artificial intelligence systems, data analysis systems, data analytics systems, data classification systems, data clustering systems, trajectory/journey analysis systems, medical device systems, collaborative filtering systems, recommendation systems, signal processing systems, word embedding systems, topic model systems, image processing systems, video processing systems, and/or other digital systems. In one or more embodiments of the disclosure, the system 800 also provides technical improvements to a central processing unit associated with a machine learning process by improving processing performance of the central processing unit, reducing computing bottlenecks of the central processing unit, improving processing efficiency of the central processing unit, and/or reducing an amount of time for the central processing unit to perform the machine learning process.
[0037] In FIG. 8, system 800 is configured to perform one or more processes, such as the ones described herein with reference to FIGS. 2-7. In this example, system 800 includes preprocessing component 804, domain adaptation component 806, and domain fusion component 808. In some embodiments of the disclosure, system 800 constitutes machine-executable component(s) embodied within machine(s) (e.g., embodied in one or more computer readable mediums (or media) associated with one or more machines). Such component(s), when executed by the one or more machines (e.g., computer(s), computing device(s), virtual machine(s), etc.), cause the machine(s) to perform the operations described. In some embodiments of the disclosure, system 800 includes memory 810 that stores computer executable components and instructions. Furthermore, system 800 in some embodiments of the disclosure includes a processor 812 to facilitate execution of the computer executable components and instructions (e.g., computer executable components and corresponding instructions). As shown, preprocessing component 804, domain adaptation component 806, domain fusion component 808, memory 810, and/or processor 812 are electrically and/or communicatively coupled to one another in one or more embodiments of the disclosure.
[0038] In some embodiments, preprocessing component 804 may be used to set up data for inputs to machine learning elements, such as the source RNNs described above (e.g., RNN operations 511, 513, 515). For example, preprocessing component 804 may be used to implement word parsing of documents used for source domain training data (e.g., documents 510 in FIG. 5), or pairing of task-irrelevant dual-domain data inputs (e.g., document-depth image pairs 501 in FIG. 5).
[0039] Referring now to FIG. 9, an illustrative cloud computing environment 900 is depicted. As shown, cloud computing environment 900 comprises one or more cloud computing nodes 902 with which local computing devices used by cloud consumers, such as, for example, camera 904 and computers 906, 908, may communicate. In an embodiment, at least one of the computers 906 implements an IPS system as described previously. Nodes 902 may communicate with one another. They may be grouped (not shown) physically or virtually, in one or more networks, such as private, community, public, or hybrid clouds as described hereinabove, or a combination thereof. This allows cloud computing environment 900 to offer infrastructure, platforms and/or software as services for which a cloud consumer does not need to maintain resources on a local computing device. It is understood that the types of computing devices 906-908 shown in FIG. 9 are intended to be illustrative only and that computing nodes 902 and cloud computing environment 900 can communicate with any type of computerized device over any type of network and/or network addressable connection (e.g., using a web browser).
[0040] Referring now to FIG. 10, a set of functional abstraction layers provided by cloud computing environment 900 (FIG. 9) is shown. It should be understood in advance that the components, layers, and functions shown in FIG. 10 are intended to be illustrative only and embodiments of the disclosure are not limited thereto. As depicted, the following layers and corresponding functions are provided: hardware and software layer 1012 includes hardware and software components. Examples of hardware components include: mainframes 1014; RISC (Reduced Instruction Set Computer) architecture based servers 1016; servers 1018; blade servers 1020; storage devices 1022; and networks and networking components 1024. In some embodiments, software components include network application server software 1026 and database software 1028; virtualization layer 1030 provides an abstraction layer from which the following examples of virtual entities may be provided: virtual servers 1032; virtual storage 1034; virtual networks 1036, including virtual private networks; virtual applications and operating systems 1038; and virtual clients 1040.
[0041] In one example, management layer 1042 may provide the functions described below. Resource provisioning 1044 provides dynamic procurement of computing resources and other resources that are utilized to perform tasks within the cloud computing environment. Metering and pricing 1046 provide cost tracking as resources are utilized within the cloud computing environment, and billing or invoicing for consumption of these resources. In one example, these resources may comprise application software licenses. Security provides identity verification for cloud consumers and tasks, as well as protection for data and other resources. User portal 1048 provides access to the cloud computing environment for consumers and system administrators. Service level management 1050 provides cloud computing resource allocation and management such that required service levels are met. Service Level Agreement (SLA) planning and fulfillment 1052 provides pre-arrangement for, and procurement of, cloud computing resources for which a future requirement is anticipated in accordance with an SLA.
[0042] Workloads layer 1054 provides examples of functionality for which the cloud computing environment may be utilized. Examples of workloads and functions that may be provided from this layer include: mapping and navigation 1056; software development and lifecycle management 1058; transaction processing 1060; point cloud to virtual reality data processing 1064; user defined content to point cloud processing 1066; and domain adaptation and fusion processing 1068.
[0043] Turning now to FIG. 11, a schematic illustration of a system 1100 is depicted upon which aspects of one or more embodiments of domain adaptation and fusion using task-irrelevant paired data may be implemented. In an embodiment, all or a portion of the system 1100 may be incorporated into one or more of the cameras and processors described herein. In one or more exemplary embodiments, in terms of hardware architecture, as shown in FIG. 11, the computer 1101 includes a processing device 1105 and a memory 1110 coupled to a memory controller 1115 and an input/output controller 1135. The input/output controller 1135 can be, for example, but not limited to, one or more buses or other wired or wireless connections, as is known in the art. The input/output controller 1135 may have additional elements, which are omitted for simplicity, such as controllers, buffers (caches), drivers, repeaters, and receivers, to enable communications. Further, the computer 1101 may include address, control, and/or data connections to enable appropriate communications among the aforementioned components.
[0044] In one or more exemplary embodiments, a keyboard 1150 and mouse 1155 or similar devices can be coupled to the input/output controller 1135. Alternatively, input may be received via a touch-sensitive or motion sensitive interface (not depicted). The computer 1101 can further include a display controller 1125 coupled to a display 1130. It should be appreciated that a camera (e.g., camera 904 of FIG. 9) may be coupled to the system 1100.
[0045] The processing device 1105 is a hardware device for executing software, particularly software stored in secondary storage 1120 or memory 1110. The processing device 1105 can be any custom made or commercially available computer processor, a central processing unit (CPU), an auxiliary processor among several processors associated with the computer 1101, a semiconductor-based microprocessor (in the form of a microchip or chip set), a macro-processor, or generally any device for executing instructions.
[0046] The memory 1110 can include any one or combination of volatile memory elements (e.g., random access memory (RAM, such as DRAM, SRAM, SDRAM, etc.)) and nonvolatile memory elements (e.g., ROM, erasable programmable read only memory (EPROM), electronically erasable programmable read only memory (EEPROM), flash memory, programmable read only memory (PROM), tape, compact disc read only memory (CD-ROM), flash drive, disk, hard disk drive, diskette, cartridge, cassette or the like, etc.). Moreover, the memory 1110 may incorporate electronic, magnetic, optical, and/or other types of storage media. Accordingly, the memory 1110 is an example of a tangible computer readable storage medium 1140 upon which instructions executable by the processing device 1105 may be embodied as a computer program product. The memory 1110 can have a distributed architecture, where various components are situated remote from one another, but can be accessed by the processing device 1105.
[0047] The instructions in memory 1110 may include one or more separate programs, each of which comprises an ordered listing of executable instructions for implementing logical functions. In the example of FIG. 11, the instructions in the memory 1110 include a suitable operating system (OS) 1111 and program instructions 1116. The operating system 1111 essentially controls the execution of other computer programs and provides scheduling, input-output control, file and data management, memory management, and communication control and related services. When the computer 1101 is in operation, the processing device 1105 is configured to execute instructions stored within the memory 1110, to communicate data to and from the memory 1110, and to generally control operations of the computer 1101 pursuant to the instructions. Examples of program instructions 1116 can include instructions to implement the processing described herein in reference to FIGs. 1-10.
[0048] The computer 1101 of FIG. 11 also includes a network interface 1160 that can establish communication channels with one or more other computer systems via one or more network links. The network interface 1160 can support wired and/or wireless communication protocols known in the art. For example, when embodied in a user system, the network interface 1160 can establish communication channels with an application server.
[0049] It will be appreciated that aspects of the present disclosure may be embodied as a system, method, or computer program product and may take the form of a hardware embodiment, a software embodiment (including firmware, resident software, micro-code, etc.), or a combination thereof. Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
[0050] One or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In one aspect, the computer readable storage medium may be a tangible medium containing or storing a program for use by or in connection with an instruction execution system, apparatus, or device.
[0051] A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
[0052] The computer readable medium may contain program code embodied thereon, which may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. In addition, computer program code for carrying out operations for implementing aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
[0053] It will be appreciated that aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block or step of the flowchart illustrations and/or block diagrams, and combinations of blocks or steps in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
[0054] These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
[0055] The present disclosure may be a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
[0056] The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide, or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
[0057] Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
[0058] Computer readable program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments of the disclosure, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
[0059] Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
[0060] These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
[0061] The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
[0062] The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
[0063] Numerous embodiments are described in this patent application, and are presented for illustrative purposes only. The described embodiments are not, and are not intended to be, limiting in any sense and may be practiced with various modifications and alterations, such as structural, logical, software, and electrical modifications. Although particular features of the disclosed embodiments may be described with reference to one or more particular embodiments and/or drawings, it should be understood that such features are not limited to usage in the one or more particular embodiments or drawings with reference to which they are described, unless expressly specified otherwise.
[0064] Devices that are in communication with each other need not be in continuous communication with each other, unless expressly specified otherwise. On the contrary, such devices need only transmit to each other as necessary or desirable, and may actually refrain from exchanging data most of the time. For example, a machine in communication with another machine via the Internet may not transmit data to the other machine for weeks at a time. In addition, devices that are in communication with each other may communicate directly or indirectly through one or more intermediaries.
[0065] Further, although process steps, algorithms or the like may be described in a sequential order, such processes may be configured to work in different orders. In other words, any sequence or order of steps that may be explicitly described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously despite being described or implied as occurring non-simultaneously (e.g., because one step is described after the other step). Moreover, the illustration of a process by its depiction in a drawing does not imply that the illustrated process is exclusive of other variations and modifications thereto, and does not imply that the illustrated process is preferred.
[0066] "Determining" something can be performed in a variety of manners and therefore the term "determining" (and like terms) includes calculating, computing, deriving, looking up (e.g., in a table, database or data structure), ascertaining and the like.
[0067] It will be readily apparent that the various methods and algorithms described herein may be implemented by, e.g., appropriately and/or specially-programmed general purpose computers and/or computing devices. Typically a processor (e.g., one or more microprocessors) will receive instructions from a memory or like device, and execute those instructions, thereby performing one or more processes defined by those instructions. Further, programs that implement such methods and algorithms may be stored and transmitted using a variety of media (e.g., computer readable media) in a number of manners. In some embodiments, hard-wired circuitry or custom hardware may be used in place of, or in combination with, software instructions for implementation of the processes of various embodiments. Thus, embodiments are not limited to any specific combination of hardware and software.
[0068] A "processor" generally means any one or more microprocessors, CPU devices, GPU devices, computing devices, microcontrollers, digital signal processors, or like devices, as further described herein. A CPU typically performs a variety of tasks while a GPU is optimized to display images.
[0069] Where databases are described, (i) alternative database structures to those described may be readily employed, and (ii) other memory structures besides databases may be readily employed. Any illustrations or descriptions of any sample databases presented herein are illustrative arrangements for stored representations of information. Any number of other arrangements may be employed besides those suggested by, e.g., tables illustrated in drawings or elsewhere. Similarly, any illustrated entries of the databases represent exemplary information only; the number and content of the entries can be different from those described herein. Further, despite any depiction of the databases as tables, other formats (including relational databases, object-based models and/or distributed databases) could be used to store and manipulate the data types described herein. Likewise, object methods or behaviors of a database can be used to implement various processes, such as those described herein. In addition, the databases may, in a known manner, be stored locally or remotely from a device that accesses data in such a database.

Claims

CLAIMS

What is claimed is:
1. A system comprising:
a memory having computer readable instructions; and
one or more processors for executing the computer readable instructions, the computer readable instructions controlling the one or more processors to perform operations comprising:
defining a source pipeline and a target pipeline of a data classifier for learning a dual domain task of interest, wherein each of the source pipeline and the target pipeline comprises a neural network unit;
wherein available training data includes task-relevant source domain data and a plurality of task-irrelevant data pairs, each task-irrelevant data pair representing one unit of source domain data and one unit of target domain data;
wherein task-relevant data in the target domain is unavailable for training the data classifier;
training the data classifier, comprising:
simulating a target representation during a first training session (s1) of the source domain pipeline by simultaneously feeding the source domain pipeline and the target domain pipeline with a plurality of dual-domain task-irrelevant data pairs and keeping parameters fixed in the target domain pipeline, while using a first loss function to adapt a source representation of the source domain pipeline to a fixed target domain representation of the target domain pipeline, wherein upon convergence of a loss value from the loss function to a predetermined threshold, the trained source representation is a simulated target representation;
performing a domain adaptation to transfer abstract features between the source domain and the target domain by performing a second training session (s2) of the source domain pipeline against the fixed target domain pipeline while simultaneously feeding task-irrelevant data pairs and applying the first loss function, wherein the second training session (s2) is initialized with the parameters of the source domain pipeline trained by the first training session (s1), wherein the source pipeline is shared for a joint training using a second loss function to generate a second source domain representation using task-relevant data inputs;
wherein at least one of the training data sets is in sequential form.
2. The system of claim 1, wherein the data classifier training further comprises:
performing a domain fusion to generate a joint classifier by performing a third training (s3) of the source domain pipeline initialized by the state of the second source domain representation and using task-relevant source domain data as input, wherein a concatenation of the source domain pipeline is performed with the target domain pipeline initialized by the state of the simulated target representation of the first training session (s1), wherein a loss function drives the training from the fixed parameters of the target domain pipeline to the source domain pipeline.
3. The system of claim 2, wherein the operations further comprise:
testing the joint classifier using task-relevant source domain data for inputs to the source domain pipeline and task-relevant target domain data, wherein the joint classifier produces a prediction of a task of interest objective.
4. The system of claim 1, wherein the neural network unit used for training data sets in sequential form is configured as a recurrent neural network (RNN).
5. The system of claim 1, wherein the source domain data is sequential words of documents, and the target domain data is depth scan images that share at least one common feature with the documents.
6. A method comprising:
defining a source pipeline and a target pipeline of a data classifier for learning a dual domain task of interest, wherein each of the source pipeline and the target pipeline comprises a neural network unit;
wherein available training data includes task-relevant source domain data and a plurality of task-irrelevant data pairs, each task-irrelevant data pair representing one unit of source domain data and one unit of target domain data;
wherein task-relevant data in the target domain is unavailable for training the data classifier;
training the data classifier, comprising:
simulating a target representation during a first training session (s1) of the source domain pipeline by simultaneously feeding the source domain pipeline and the target domain pipeline with a plurality of dual-domain task-irrelevant data pairs and keeping parameters fixed in the target domain pipeline, while using a first loss function to adapt a source representation of the source domain pipeline to a fixed target domain representation of the target domain pipeline, wherein upon convergence of a loss value from the loss function to a predetermined threshold, the trained source representation is a simulated target representation;
performing a domain adaptation to transfer abstract features between the source domain and the target domain by performing a second training session (s2) of the source domain pipeline against the fixed target domain pipeline while simultaneously feeding task-irrelevant data pairs and applying the first loss function, wherein the second training session (s2) is initialized with the parameters of the source domain pipeline trained by the first training session (s1), wherein the source pipeline is shared for a joint training using a second loss function to generate a second source domain representation using task-relevant data inputs;
wherein at least one of the training data sets is in sequential form.
7. The method of claim 6, wherein the data classifier training further comprises:
performing a domain fusion to generate a joint classifier by performing a third training (s3) of the source domain pipeline initialized by the state of the second source domain representation and using task-relevant source domain data as input, wherein a concatenation of the source domain pipeline is performed with the target domain pipeline initialized by the state of the simulated target representation of the first training session (s1), wherein a loss function drives the training from the fixed parameters of the target domain pipeline to the source domain pipeline.
8. The method of claim 7, wherein the method further comprises:
testing the joint classifier using task-relevant source domain data for inputs to the source domain pipeline and task-relevant target domain data, wherein the joint classifier produces a prediction of a task of interest objective.
9. The method of claim 6, wherein the neural network unit used for training data sets in sequential form is configured as a recurrent neural network (RNN).
10. The method of claim 6, wherein the source domain data is sequential words of documents, and the target domain data is depth scan images that share at least one common feature with the documents.
11. A computer program product for data classification comprising a computer readable storage medium having program instructions stored thereon and executable by a system comprising one or more processors, to cause the system to perform steps comprising:
defining a source pipeline and a target pipeline of a data classifier for learning a dual domain task of interest, wherein each of the source pipeline and the target pipeline comprises a neural network unit;
wherein available training data includes task-relevant source domain data and a plurality of task-irrelevant data pairs, each task-irrelevant data pair representing one unit of source domain data and one unit of target domain data;
wherein task-relevant data in the target domain is unavailable for training the data classifier;
training the data classifier, comprising:
simulating a target representation during a first training session (s1) of the source domain pipeline by simultaneously feeding the source domain pipeline and the target domain pipeline with a plurality of dual-domain task-irrelevant data pairs and keeping parameters fixed in the target domain pipeline, while using a first loss function to adapt a source representation of the source domain pipeline to a fixed target domain representation of the target domain pipeline, wherein upon convergence of a loss value from the loss function to a predetermined threshold, the trained source representation is a simulated target representation;
performing a domain adaptation to transfer abstract features between the source domain and the target domain by performing a second training session (s2) of the source domain pipeline against the fixed target domain pipeline while simultaneously feeding task-irrelevant data pairs and applying the first loss function, wherein the second training session (s2) is initialized with the parameters of the source domain pipeline trained by the first training session (s1), wherein the source pipeline is shared for a joint training using a second loss function to generate a second source domain representation using task-relevant data inputs;
wherein at least one of the training data sets is in sequential form.
12. The computer program product of claim 11, wherein the data classifier training further comprises:
performing a domain fusion to generate a joint classifier by performing a third training (s3) of the source domain pipeline initialized by the state of the second source domain representation and using task-relevant source domain data as input, wherein a concatenation of the source domain pipeline is performed with the target domain pipeline initialized by the state of the simulated target representation of the first training session (s1), wherein a loss function drives the training from the fixed parameters of the target domain pipeline to the source domain pipeline.
13. The computer program product of claim 12, wherein the steps further comprise:
testing the joint classifier using task-relevant source domain data for inputs to the source domain pipeline and task-relevant target domain data, wherein the joint classifier produces a prediction of a task of interest objective.
14. The computer program product of claim 11, wherein the neural network unit used for training data sets in sequential form is configured as a recurrent neural network (RNN).
15. The computer program product of claim 11, wherein the source domain data is sequential words of documents, and the target domain data is depth scan images that share at least one common feature with the documents.
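Illustrative example (not part of the claims)

For illustration only, the following Python sketch (PyTorch-style) gives one possible reading of the two-pipeline arrangement and the first training session (s1) recited in claims 1, 6, and 11, together with a rough analogue of the concatenation-based domain fusion of claims 2, 7, and 12. It is a minimal sketch under stated assumptions, not the claimed implementation: the GRU units, the mean-squared-error choice for the first loss function, the layer dimensions, the convergence threshold, and names such as SequenceEncoder, JointClassifier, and paired_loader are assumptions introduced here and do not appear in the application.

import torch
import torch.nn as nn

class SequenceEncoder(nn.Module):
    # Neural network unit for data in sequential form; a GRU stands in for the
    # recurrent neural network (RNN) of claims 4, 9, and 14.
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.rnn = nn.GRU(input_dim, hidden_dim, batch_first=True)

    def forward(self, x):
        _, h = self.rnn(x)            # final hidden state used as the representation
        return h.squeeze(0)

# Assumed dimensions: source = sequential word embeddings, target = depth-scan sequences.
source_pipeline = SequenceEncoder(input_dim=32, hidden_dim=128)
target_pipeline = SequenceEncoder(input_dim=64, hidden_dim=128)

for p in target_pipeline.parameters():    # parameters kept fixed in the target domain pipeline
    p.requires_grad_(False)

first_loss = nn.MSELoss()                 # stand-in for the "first loss function"
optimizer = torch.optim.Adam(source_pipeline.parameters(), lr=1e-3)
threshold = 1e-3                          # assumed predetermined convergence threshold

def train_s1(paired_loader, max_epochs=50):
    # Session s1: adapt the source representation toward the fixed target representation
    # by feeding both pipelines simultaneously with task-irrelevant data pairs.
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for src_seq, tgt_seq in paired_loader:
            src_repr = source_pipeline(src_seq)
            with torch.no_grad():
                tgt_repr = target_pipeline(tgt_seq)   # fixed target domain representation
            loss = first_loss(src_repr, tgt_repr)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss / len(paired_loader) < threshold:
            break   # the trained source representation now serves as the simulated target representation

class JointClassifier(nn.Module):
    # Rough analogue of the domain fusion: a source pipeline (as initialized after s2) is
    # concatenated with a target pipeline (as initialized from the simulated target
    # representation of s1), and a head predicts the task-of-interest objective.
    def __init__(self, src_enc, tgt_enc, hidden_dim, num_classes):
        super().__init__()
        self.src_enc, self.tgt_enc = src_enc, tgt_enc
        self.head = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, src_seq, tgt_seq):
        fused = torch.cat([self.src_enc(src_seq), self.tgt_enc(tgt_seq)], dim=1)
        return self.head(fused)

A complete pipeline following the claims would continue with the second training session (s2) under a second loss function on task-relevant source data, and with the third training (s3) of the concatenated joint classifier; those stages repeat the same loop structure and are omitted from this sketch.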
PCT/US2019/038370 2019-06-21 2019-06-21 Domain adaptation and fusion using task-irrelevant paired data in sequential form WO2020256732A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2019/038370 WO2020256732A1 (en) 2019-06-21 2019-06-21 Domain adaptation and fusion using task-irrelevant paired data in sequential form

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2019/038370 WO2020256732A1 (en) 2019-06-21 2019-06-21 Domain adaptation and fusion using task-irrelevant paired data in sequential form

Publications (1)

Publication Number Publication Date
WO2020256732A1 true WO2020256732A1 (en) 2020-12-24

Family

ID=67211910

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/038370 WO2020256732A1 (en) 2019-06-21 2019-06-21 Domain adaptation and fusion using task-irrelevant paired data in sequential form

Country Status (1)

Country Link
WO (1) WO2020256732A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180300205A1 (en) 2017-04-18 2018-10-18 Netapp, Inc. Systems and methods for backup and restore of master-less distributed database clusters
US20180330205A1 (en) * 2017-05-15 2018-11-15 Siemens Aktiengesellschaft Domain adaptation and fusion using weakly supervised target-irrelevant data
EP3404582A1 (en) * 2017-05-15 2018-11-21 Siemens Aktiengesellschaft Training an rgb-d classifier with only depth data and privileged information

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CASTREJON LLUIS ET AL: "Learning Aligned Cross-Modal Representations from Weakly Aligned Data", 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 27 June 2016 (2016-06-27), pages 2940 - 2949, XP033021475, DOI: 10.1109/CVPR.2016.321 *
HOFFMAN JUDY ET AL: "Learning with Side Information through Modality Hallucination", 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 27 June 2016 (2016-06-27), pages 826 - 834, XP033021260, DOI: 10.1109/CVPR.2016.96 *
KUAN-CHUAN PENG ET AL: "Zero-Shot Deep Domain Adaptation", 6 July 2017 (2017-07-06), XP055499392, Retrieved from the Internet <URL:https://arxiv.org/pdf/1707.01922v2.pdf> [retrieved on 20200224] *
SERGEY ZAKHAROV ET AL: "3D Object Instance Recognition and Pose Estimation Using Triplet Loss with Dynamic Margin", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 9 April 2019 (2019-04-09), XP081167241, DOI: 10.1109/IROS.2017.8202207 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11640518B2 (en) * 2017-11-17 2023-05-02 Samsung Electronics Co., Ltd. Method and apparatus for training a neural network using modality signals of different domains
WO2022206498A1 (en) * 2021-03-31 2022-10-06 华为技术有限公司 Federated transfer learning-based model training method and computing nodes
WO2024009708A1 (en) * 2022-07-08 2024-01-11 Mitsubishi Electric Corporation System and method for cross-modal knowledge transfer without task-relevant source data
CN115510926A (en) * 2022-11-23 2022-12-23 武汉理工大学 Cross-machine type diesel engine combustion chamber fault diagnosis method and system

Similar Documents

Publication Publication Date Title
US11556749B2 (en) Domain adaptation and fusion using weakly supervised target-irrelevant data
US11017271B2 (en) Edge-based adaptive machine learning for object recognition
US11556746B1 (en) Fast annotation of samples for machine learning model development
WO2020256732A1 (en) Domain adaptation and fusion using task-irrelevant paired data in sequential form
US11049239B2 (en) Deep neural network based identification of realistic synthetic images generated using a generative adversarial network
US10346782B2 (en) Adaptive augmented decision engine
US11537506B1 (en) System for visually diagnosing machine learning models
AU2020385264B2 (en) Fusing multimodal data using recurrent neural networks
US10832149B2 (en) Automatic segmentation of data derived from learned features of a predictive statistical model
US11675896B2 (en) Using multimodal model consistency to detect adversarial attacks
Helu et al. Scalable data pipeline architecture to support the industrial internet of things
US11379718B2 (en) Ground truth quality for machine learning models
US20200074267A1 (en) Data prediction
JP2022531974A (en) Dealing with rare training data for artificial intelligence
US11720846B2 (en) Artificial intelligence-based use case model recommendation methods and systems
CN111159241A (en) Click conversion estimation method and device
US20230196204A1 (en) Agnostic machine learning inference
US20200219014A1 (en) Distributed learning using ensemble-based fusion
CN114598610B (en) Network business rule identification
US10832407B2 (en) Training a neural network adapter
US20230196203A1 (en) Agnostic machine learning training integrations
Du et al. A Knowledge Transfer Method for Unsupervised Pose Keypoint Detection Based on Domain Adaptation and CAD Models
WO2024031984A1 (en) Task processing system, and task processing method and device
US20220245838A1 (en) Visual question generation with answer-awareness and region-reference
Staron Machine Learning Infrastructure and Best Practices for Software Engineers: Take your machine learning software from a prototype to a fully fledged software system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19737384

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19737384

Country of ref document: EP

Kind code of ref document: A1