WO2018032354A1 - Method and apparatus for zero-shot learning - Google Patents

Method and apparatus for zero-shot learning

Info

Publication number
WO2018032354A1
Authority
WO
WIPO (PCT)
Prior art keywords
features
dictionary
multimedia content
visual
model
Prior art date
Application number
PCT/CN2016/095512
Other languages
French (fr)
Inventor
Yunlong YU
Original Assignee
Nokia Technologies Oy
Nokia Technologies (Beijing) Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy, Nokia Technologies (Beijing) Co., Ltd. filed Critical Nokia Technologies Oy
Priority to CN201680088517.8A priority Critical patent/CN109643384A/en
Priority to PCT/CN2016/095512 priority patent/WO2018032354A1/en
Priority to EP16913114.1A priority patent/EP3500978A4/en
Publication of WO2018032354A1 publication Critical patent/WO2018032354A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/2135 Feature extraction based on approximation criteria, e.g. principal component analysis
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques based on distances to training or reference patterns
    • G06F 18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G06F 18/28 Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle


Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure provide a method, apparatus and computer program product for ZSL. The method comprises: constructing a dictionary model based on visual features and semantic features of multimedia content of seen classes, the semantic features corresponding to the visual features; reconstructing visual features of multimedia content of unseen classes using the dictionary model and semantic features of multimedia content of unseen classes; and determining a class of a testing sample based on comparison of a visual feature of the testing sample and reconstructed visual features.

Description

METHOD AND APPARATUS FOR ZERO-SHOT LEARNING
FIELD OF THE INVENTION
 Embodiments of the present disclosure generally relate to information processing, and more particularly to a method, apparatus and computer program product for Zero-Shot Learning (ZSL) .
BACKGROUND
 ZSL refers to a learning process where no training samples are available to discriminate new classes (also called unseen classes). It aims at improving the scalability of conventional classification methods. The need for ZSL arises frequently in practice because the enormous number of real-world object classes is constantly changing, and it would be too time-consuming and expensive to obtain human-annotated labels for each of these classes.
 ZSL can be widely used in applications such as natural scene understanding, object recognition, autonomous vehicles, virtual reality, and so on. For example, in the application of autonomous vehicles, surrounding objects need to be recognized. Conventional recognition methods need to predefine a set of classes and then train a model to recognize objects in these classes. However, if an object belongs to an unseen class, the model will fail to recognize it. ZSL is proposed to solve this problem. With ZSL, the model can recognize objects not only in the seen classes but also in the unseen classes.
 Conventional methods for ZSL generally apply one transformation matrix to embed the visual features of testing samples into a semantic space, or two transformation matrices to embed both the visual features and the semantic features of the testing samples into the semantic space. In this way, a connection between the visual features and the semantic features is bridged, and the class of a testing sample from an unseen class can be inferred by using the nearest neighbor method. However, the conventional methods for ZSL cannot reflect the intrinsic structures in the semantic space, leading to unsatisfying performance.
SUMMARY
 In general, example embodiments of the present disclosure include a method, apparatus and computer program product for ZSL.
 In a first aspect of the present disclosure, a method is provided. The method comprises: constructing a dictionary model based on visual features and semantic features of multimedia content of seen classes, the semantic features corresponding to the visual features; reconstructing visual features of multimedia content of unseen classes using the dictionary model and semantic features of multimedia content of unseen classes; and determining a class of a testing sample based on comparison of a visual feature of the testing sample and reconstructed visual features.
 In some embodiments, determining the class of the testing sample comprises: in response to the visual feature of the testing sample being closest to one of the reconstructed visual features, designating the class of the testing sample to be a class associated with the one of the reconstructed visual features.
 In some embodiments, constructing the dictionary model comprises: randomly initializing model parameters for the dictionary model; and updating the model parameters so as to obtain a minimum of an objective function for the dictionary model, the objective function being defined at least by the model parameters.
 In some embodiments, the model parameters include at least one of the following: a dictionary matrix, a dictionary coefficient matrix and a transformation matrix.
 In some embodiments, the objective function for the dictionary model is formulized as:
min_{D, P, C} ||X - DC||_F^2 + λ||C - PY||_F^2
wherein ||·||_F represents an operation of solving an F-norm, X represents the visual features of multimedia content of the seen classes, Y represents the semantic features of multimedia content of the seen classes, D represents a dictionary matrix, P represents a transformation matrix, C represents a dictionary coefficient matrix, and λ represents a predetermined constant.
 In some embodiments, the semantic features of the multimedia content include at least one of the following: semantic attributes and distributed text representations of the multimedia content.
 In some embodiments, the visual features of the multimedia content include at  least one of the following: color features, texture features, motion features and Convolutional Neural Network features of the multimedia content.
 In a second aspect of the present disclosure, an apparatus is provided. The apparatus comprises at least one processor and at least one memory including computer program code. The at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: construct a dictionary model based on visual features and semantic features of multimedia content of seen classes, the semantic features corresponding to the visual features; reconstruct visual features of multimedia content of unseen classes using the dictionary model and semantic features of multimedia content of unseen classes; and determine a class of a testing sample based on comparison of a visual feature of the testing sample and reconstructed visual features.
 In a third aspect of the present disclosure, an apparatus is provided. The apparatus comprises means for performing the method in the first aspect of the present disclosure.
 In a fourth aspect of the present disclosure, a computer program product is provided. The computer program product comprises at least one computer readable non-transitory memory medium having program code stored thereon, the program code which, when executed by an apparatus, causes the apparatus to perform the method in the first aspect of the present disclosure.
 It is to be understood that the Summary is not intended to identify key or essential features of embodiments of the present disclosure, nor is it intended to be used to limit the scope of the present disclosure. Other features of the present disclosure will become easily comprehensible through the description below.
BRIEF DESCRIPTION OF THE DRAWINGS
 Through the more detailed description of some embodiments of the present disclosure in the accompanying drawings, the above and other objects, features and advantages of the present disclosure will become more apparent, wherein:
 Fig. 1 schematically shows an architecture in which embodiments of the present disclosure can be implemented;
 Fig. 2 is a flowchart of a method in accordance with embodiments of the present disclosure; and
 Fig. 3 shows a block diagram of an example computer system suitable for implementing embodiments of the present invention.
 Throughout the drawings, same or similar reference numerals represent the same or similar element.
DETAILED DESCRIPTION
 Principles of the present disclosure will now be described with reference to some example embodiments. It is to be understood that these embodiments are described for the purpose of illustration only and to help those skilled in the art understand and implement the present disclosure, without suggesting any limitations as to the scope of the invention. The invention described herein can be implemented in various manners other than the ones described below.
 As used herein, the term “includes” and its variants are to be read as open-ended terms that mean “includes, but is not limited to.” The term “based on” is to be read as “based at least in part on.” The terms “one embodiment” and “an embodiment” are to be read as “at least one embodiment.” The term “another embodiment” is to be read as “at least one other embodiment.” Other definitions, explicit and implicit, may be included below.
 In general, embodiments of the present disclosure tackle the ZSL task with the idea of dictionary learning. In particular, in accordance with the embodiments of the present disclosure, a dictionary model is constructed by using visual features and semantic features of multimedia content of seen classes. With the dictionary model, semantic features of multimedia content of unseen classes are embedded into the visual space. A class of a testing sample of multimedia content of unseen classes is determined based on comparison of a visual feature of the testing sample and reconstructed visual features. Details of the embodiments of the present disclosure will be described with reference to Figs. 1 to 3.
 Reference is first made to Fig. 1, which schematically shows an architecture 100 in which embodiments of the present disclosure can be implemented. It is to be understood that the structure and functionality of the architecture 100 are described only for the purpose of illustration without suggesting any limitations as to the scope of the present disclosure described herein. The present disclosure described herein can be embodied with a different structure and/or functionality.
 The architecture 100 includes a training system 110 and a testing system 120.  The training system 110 is configured to receive visual features 112 of multimedia content of seen classes and semantic features 114 of multimedia content of seen classes. The semantic features 114 correspond to the visual features 112. The training system 110 is further configured to construct a dictionary model based on the visual features 112 and the semantic features 114.
 Examples of multimedia content include, but are not limited to, images, video and the like. Examples of the visual features 112 include, but are not limited to, color features, texture features, motion features, Convolutional Neural Network (CNN) features and the like. Examples of the semantic features 114 include, but are not limited to, semantic attributes of the multimedia content, distributed text representations of the multimedia content and the like.
 The testing system 120 is configured to receive the dictionary model from the training system 110, semantic features 126 of multimedia content of unseen classes, and a visual feature 128 of a testing sample. The testing system 120 is further configured to output a classification result of the testing sample.
 Specifically, the testing system 120 includes a reconstructing unit 122 and a classifier 124. The reconstructing unit 122 is configured to reconstruct visual features of multimedia content of unseen classes using the dictionary model and the semantic features 126 of multimedia content of unseen classes.
 The classifier 124 is configured to receive the reconstructed visual features of the unseen classes from the reconstructing unit 122 and the visual feature 128 of the testing sample. The classifier 124 is further configured to determine a class of the testing sample based on comparison of the visual feature of the testing sample and the reconstructed visual features. The classifier 124 is further configured to output the classification result of the testing sample.
 Fig. 2 shows a flowchart of a method 200 for ZSL in accordance with embodiments of the present disclosure. The method 200 may be implemented in the architecture 100 as shown in Fig. 1.
 As shown, the method 200 is entered in step 210, where the training system 110 constructs a dictionary model based on the visual features 112 and semantic features 114 of multimedia content of seen classes.
 It is to be appreciated that any known feature extraction methods may be used to extract the visual features 112 and the corresponding semantic features 114 from training samples of the seen classes, and that the description thereof is omitted for the purpose of conciseness.
 Generally, the dictionary model may be associated with one or more model parameters. In this regard, the dictionary model may be constructed by training the model parameters with the training system 110. In some embodiments, the model parameters comprise at least one of a dictionary matrix, a dictionary coefficient matrix and a transformation matrix.
 In addition, for the purpose of constructing the dictionary model, an objective function for the dictionary model may be predetermined, and the objective function may be defined at least by the model parameters for the dictionary model. As a non-limiting example, the objective function of the dictionary model may be formulized as below:
min_{D, P, C} ||X - DC||_F^2 + λ||C - PY||_F^2                (1)
where ||·||_F represents an operation of solving an F-norm, and F may be in the range of 2 to 4; X ∈ R^{d_x × N} represents the visual features of the training samples of the seen classes; Y ∈ R^{d_y × N} represents the semantic features of the training samples corresponding to the visual features; N represents the number of the training samples of the seen classes; d_x and d_y represent the dimensionalities of the matrices X and Y, respectively; D represents a dictionary matrix; P represents a transformation matrix; C ∈ R^{d × N} represents a dictionary coefficient matrix, where d represents the dimensionality of the dictionary coefficient matrix C; and λ represents a predetermined constant for balancing the importance of the two terms in Equation (1) and may be in the range of 0.001 to 1000.
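 As a rough, non-authoritative illustration of the quantities involved, the sketch below evaluates the two terms of Equation (1) with NumPy. The dimensions, the random stand-in data and the helper name objective are assumptions introduced purely for illustration.
```python
import numpy as np

# Illustrative sizes (not prescribed by the application): d_x visual dimensions,
# d_y semantic dimensions, d dictionary atoms, N seen-class training samples.
d_x, d_y, d, N = 4096, 85, 300, 1000
rng = np.random.default_rng(0)

X = rng.standard_normal((d_x, N))   # visual features of seen-class training samples
Y = rng.standard_normal((d_y, N))   # corresponding semantic features
D = rng.standard_normal((d_x, d))   # dictionary matrix
P = rng.standard_normal((d, d_y))   # transformation matrix (semantic -> coefficient space)
C = rng.standard_normal((d, N))     # dictionary coefficient matrix
lam = 1.0                           # balancing constant lambda

def objective(X, Y, D, P, C, lam):
    """Value of Equation (1): ||X - DC||_F^2 + lambda * ||C - PY||_F^2."""
    reconstruction_term = np.linalg.norm(X - D @ C, 'fro') ** 2
    alignment_term = np.linalg.norm(C - P @ Y, 'fro') ** 2
    return reconstruction_term + lam * alignment_term

print(objective(X, Y, D, P, C, lam))
```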
 In some embodiments, constructing the dictionary model comprises randomly initializing model parameters for the dictionary model, and updating the model parameters so as to obtain a minimum of an objective function for the dictionary model. In other words, the model parameters are optimized so as to obtain a minimum of an objective function for the dictionary model. For example, in the case of Equation (1), the dictionary matrix D, the dictionary coefficient matrix C and the transformation matrix P may be optimized so as to obtain a minimum of the objective function denoted by Equation (1) as below:
min_{D, P, C} ||X - DC||_F^2 + λ||C - PY||_F^2,  s.t. ||d_i||_2^2 ≤ 1 for all i                (2)
where d_i represents the i-th base vector in the dictionary matrix D, i ∈ {1, 2, ..., N}, and I represents an identity matrix.
 In some embodiments, a joint optimization process may be used for optimizing the dictionary matrix D, the dictionary coefficient matrix C and the transformation matrix P.
 Consider a non-limiting example of the joint optimization process. First, the dictionary matrix D and the transformation matrix P may be randomly initialized, respectively. Then, the dictionary coefficient matrix C may be optimized by using Equation (3) as below:
min_C ||X - DC||_F^2 + λ||C - PY||_F^2                (3)
The optimized dictionary coefficient matrix C may be represented as:
C = (D^T D + λI)^{-1} (λPY + D^T X)                (4)
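 In code, Equation (4) can be transcribed directly; the sketch below is one possible form (the helper name update_C and the use of a linear solve instead of an explicit matrix inverse are implementation choices, not something the application prescribes).
```python
import numpy as np

def update_C(X, Y, D, P, lam):
    """Closed-form coefficient update of Equation (4):
    C = (D^T D + lambda*I)^(-1) (lambda*P*Y + D^T X), with D and P held fixed."""
    d = D.shape[1]
    lhs = D.T @ D + lam * np.eye(d)   # (d x d), symmetric positive definite for lam > 0
    rhs = lam * (P @ Y) + D.T @ X     # (d x N)
    return np.linalg.solve(lhs, rhs)  # more stable than forming the inverse explicitly
```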
 Next, the dictionary matrix D and the dictionary coefficient matrix C may be fixed. Then, the transformation matrix P may be optimized by using Equation (5) as below:
min_P λ||C - PY||_F^2 + τ||P||_F^2                (5)
where τ represents a regularization constant.
The optimized transformation matrix P may be represented as:
P = λCY^T (λYY^T + τI)^{-1}                  (6)
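 Equation (6) can be transcribed in the same way. Here tau stands for the regularization constant τ; its default value below is an assumption for illustration.
```python
import numpy as np

def update_P(C, Y, lam, tau=1e-3):
    """Closed-form transformation update of Equation (6):
    P = lambda*C*Y^T (lambda*Y*Y^T + tau*I)^(-1), with C held fixed."""
    d_y = Y.shape[0]
    gram = lam * (Y @ Y.T) + tau * np.eye(d_y)          # (d_y x d_y), symmetric
    # Solve P @ gram = lambda * C @ Y^T by solving the transposed system.
    return np.linalg.solve(gram, (lam * (C @ Y.T)).T).T
```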
 Afterwards, the transformation matrix P and the dictionary coefficient matrix C may be fixed. Then, the dictionary matrix D may be optimized by using Equation (7) as below:
min_D ||X - DC||_F^2,  s.t. ||d_i||_2^2 ≤ 1 for all i                (7)
Equation (7) may be solved by the known Alternating Direction Method of Multipliers (ADMM) and the description thereof is omitted for the purpose of conciseness.
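 The ADMM details are not spelled out in the application; the following is one possible ADMM-style solver for Equation (7), in which an auxiliary copy Z of the dictionary carries the unit-norm constraint on each base vector. The penalty parameter rho and the iteration count are illustrative assumptions.
```python
import numpy as np

def update_D_admm(X, C, D_init, rho=1.0, n_iter=50):
    """One possible ADMM solver for Equation (7):
    min_D ||X - DC||_F^2  s.t.  every column of D has l2-norm <= 1."""
    D = D_init.copy()
    Z = D.copy()                       # auxiliary variable carrying the constraint
    U = np.zeros_like(D)               # scaled dual variable
    d = C.shape[0]
    for _ in range(n_iter):
        # D-step: ridge-like least squares pulling D towards Z - U.
        lhs = 2.0 * (C @ C.T) + rho * np.eye(d)          # (d x d)
        rhs = 2.0 * (X @ C.T) + rho * (Z - U)            # (d_x x d)
        D = np.linalg.solve(lhs, rhs.T).T                # solves D @ lhs = rhs
        # Z-step: project each column of D + U onto the unit l2 ball.
        V = D + U
        norms = np.maximum(np.linalg.norm(V, axis=0), 1.0)
        Z = V / norms
        # Dual update.
        U = U + D - Z
    return Z                           # feasible dictionary estimate
```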
 It is to be understood that the operations of optimizing the dictionary matrix D, the dictionary coefficient matrix C and the transformation matrix P, as described above, may be performed iteratively until a convergence condition is satisfied.
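 Tying the three updates together, the joint optimization might be organized as in the sketch below, which reuses the update_C, update_P and update_D_admm helpers from the preceding snippets; the stopping tolerance and the iteration cap are assumptions.
```python
import numpy as np

def train_dictionary_model(X, Y, d, lam=1.0, max_iter=100, tol=1e-4, seed=0):
    """Alternating optimization of D, C and P for Equation (2).
    Assumes the update_C, update_P and update_D_admm helpers sketched above."""
    rng = np.random.default_rng(seed)
    d_x, d_y = X.shape[0], Y.shape[0]
    D = rng.standard_normal((d_x, d))    # random initialization of D ...
    P = rng.standard_normal((d, d_y))    # ... and of P, as described above
    prev_obj = np.inf
    for _ in range(max_iter):
        C = update_C(X, Y, D, P, lam)    # Equation (4)
        P = update_P(C, Y, lam)          # Equation (6)
        D = update_D_admm(X, C, D)       # Equation (7), ADMM-style step
        obj = (np.linalg.norm(X - D @ C, 'fro') ** 2
               + lam * np.linalg.norm(C - P @ Y, 'fro') ** 2)
        if abs(prev_obj - obj) < tol * max(1.0, obj):    # crude convergence test
            break
        prev_obj = obj
    return D, P, C
```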
 Referring back to Fig. 2, in step 220, the reconstructing unit 122 reconstructs visual features of multimedia content of unseen classes using the dictionary model and semantic features 126 of multimedia content of unseen classes.
 Still consider the non-limiting example described above. It is assumed that the semantic features 126 may be represented by y_v, v ∈ {1, 2, ..., m}, where m represents the number of unseen classes. Then, the visual features may be reconstructed by multiplying the optimized dictionary matrix D and the optimized transformation matrix P by the semantic features y_v. That is, the reconstructed visual features of multimedia content of unseen classes may be represented as DPy_v, v ∈ {1, 2, ..., m}.
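 In code, step 220 then reduces to a single matrix product per unseen class. In the sketch below, Y_unseen is an assumed name for a matrix holding the class-level semantic features y_v as columns.
```python
import numpy as np

def reconstruct_unseen_prototypes(D, P, Y_unseen):
    """Reconstructed visual features DPy_v for every unseen class v.
    Y_unseen: (d_y x m) matrix whose v-th column is the semantic feature y_v."""
    return D @ (P @ Y_unseen)   # (d_x x m): one reconstructed visual prototype per class
```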
 In step 230, the classifier 124 determines a class of a testing sample based on comparison of a visual feature of the testing sample and reconstructed visual features.
 In some embodiments, the classifier 124 may determine the class of the testing sample by using the nearest neighbor method. In this regard, determining the class of the testing sample comprises: in response to the visual feature of the testing sample being closest to one of the reconstructed visual features, designating the class of the testing sample to be a class associated with the one of the reconstructed visual features. It is to be appreciated that the nearest neighbor method is described by way of example without suggesting any limitation to the scope of the present disclosure. The classifier 124 may determine the class of the testing sample by using other suitable methods than the nearest neighbor method.
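 A minimal nearest-neighbor version of step 230 could look as follows; the Euclidean distance is an illustrative choice, since the application does not fix a particular distance measure.
```python
import numpy as np

def classify(x_test, prototypes, class_labels):
    """Assign the testing sample to the unseen class whose reconstructed visual
    feature (a column of `prototypes`) is closest to the sample's visual feature."""
    # Euclidean distance from the test feature to every reconstructed prototype.
    dists = np.linalg.norm(prototypes - x_test[:, None], axis=0)
    return class_labels[int(np.argmin(dists))]
```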
 To sum up, in accordance with the embodiments of the present disclosure, the dictionary model is constructed based on the visual features and semantic features of multimedia content of seen classes. In other words, the dictionary model is learned from both the visual space and the semantic space. Thus, compared with conventional models, the dictionary model may reflect the intrinsic structures in the semantic space, leading to better classification performance.
 In addition, because the model parameters for the dictionary model are jointly optimized, better classification performance may also be guaranteed. Further, in the embodiments of the present disclosure, because no sparse constraints are imposed on the dictionary coefficient matrix C, the optimization process can be implemented very quickly.
 Fig. 3 shows a block diagram of an example computer system suitable for implementing embodiments of the present invention. As shown, the computer system 300 comprises a central processing unit (CPU) 301 which is capable of performing various processes in accordance with a program stored in a read only memory (ROM) 302 or a program loaded from a storage unit 308 to a random access memory (RAM) 303. In the RAM 303, data required when the CPU 301 performs the various processes or the like is also stored as required. The CPU 301, the ROM 302 and the RAM 303 are connected to one another via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
 The following components are connected to the I/O interface 305: an input unit 306 including a keyboard, a mouse, or the like; an output unit 307 including a display such as a cathode ray tube (CRT) , a liquid crystal display (LCD) , or the like, and a loudspeaker or the like; the storage unit 308 including a hard disk or the like; and a communication unit 309 including a network interface card such as a LAN card, a modem, or the like. The communication unit 309 performs a communication process via the network such as the internet. A drive 310 is also connected to the I/O interface 305 as required. A removable medium 311, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 310 as required, so that a computer program read therefrom is installed into the storage unit 308 as required.
 Specifically, in accordance with embodiments of the present invention, the processes described above with reference to Fig. 2 may be implemented as computer software programs. For example, embodiments of the present invention comprise a computer program product including a computer program tangibly embodied on a machine readable medium, the computer program including program code for performing the method 200. In such embodiments, the computer program may be downloaded and installed from the network via the communication unit 309, and/or installed from the removable medium 311.
 Generally speaking, various example embodiments of the present disclosure may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. Some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device. While various aspects of the example embodiments of the present disclosure are illustrated and described as block diagrams,  flowcharts, or using some other pictorial representation, it will be appreciated that the blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
 Additionally, various blocks shown in the flowcharts may be viewed as method steps, and/or as operations that result from operation of computer program code, and/or as a plurality of coupled logic circuit elements constructed to carry out the associated function (s) . For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine readable medium, the computer program containing program codes configured to carry out the methods as described above.
 In the context of the disclosure, a machine readable medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine readable medium may be a machine readable signal medium or a machine readable storage medium. A machine readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of the machine readable storage medium would include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
 Computer program code for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer program codes may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor of the computer or other programmable data processing apparatus, cause the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a computer, partly on the computer, as a stand-alone software package, partly on the computer and partly on a remote computer or entirely on the remote computer or server.
 Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are contained in the above discussions, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.
 Various modifications, adaptations to the foregoing example embodiments of the present disclosure may become apparent to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings. Any and all modifications will still fall within the scope of the non-limiting and example embodiments of the present disclosure. Furthermore, other embodiments of the present disclosure set forth herein will come to mind to one skilled in the art to which these embodiments of the present disclosure pertain having the benefit of the teachings presented in the foregoing descriptions and the drawings.
 It will be appreciated that the embodiments of the present disclosure are not to be limited to the specific embodiments as discussed above and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are used herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims (16)

  1. A method, comprising:
    constructing a dictionary model based on visual features and semantic features of multimedia content of seen classes, the semantic features corresponding to the visual features;
    reconstructing visual features of multimedia content of unseen classes using the dictionary model and semantic features of multimedia content of unseen classes; and
    determining a class of a testing sample based on comparison of a visual feature of the testing sample and reconstructed visual features.
  2. The method of Claim 1, wherein determining the class of the testing sample comprises:
    in response to the visual feature of the testing sample being closest to one of the reconstructed visual features, designating the class of the testing sample to be a class associated with the one of the reconstructed visual features.
  3. The method of Claim 1, wherein constructing the dictionary model comprises:
    randomly initializing model parameters for the dictionary model; and
    updating the model parameters so as to obtain a minimum of an objective function for the dictionary model, the objective function being defined at least by the model parameters.
  4. The method of Claim 3, wherein the model parameters include at least one of the following: a dictionary matrix, a dictionary coefficient matrix and a transformation matrix.
  5. The method of Claim 4, wherein the objective function for the dictionary model is formulized as:
    min_{D, P, C} ||X - DC||_F^2 + λ||C - PY||_F^2
    wherein ||·||_F represents an operation of solving an F-norm, X represents the visual features of multimedia content of the seen classes, Y represents the semantic features of multimedia content of the seen classes, D represents a dictionary matrix, P represents a transformation matrix, C represents a dictionary coefficient matrix, and λ represents a predetermined constant.
  6. The method of any of Claims 1 to 5, wherein the semantic features of the multimedia content include at least one of the following: semantic attributes and distributed text representations of the multimedia content.
  7. The method of any of Claims 1 to 5, wherein the visual features of the multimedia content include at least one of the following: color features, texture features, motion features and Convolutional Neural Network features of the multimedia content.
  8. An apparatus, comprising:
    at least one processor; and
    at least one memory including computer program code;
    the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to:
    construct a dictionary model based on visual features and semantic features of multimedia content of seen classes, the semantic features corresponding to the visual features;
    reconstruct visual features of multimedia content of unseen classes using the dictionary model and semantic features of multimedia content of unseen classes; and
    determine a class of a testing sample based on comparison of a visual feature of the testing sample and reconstructed visual features.
  9. The apparatus of Claim 8, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus to:
    designate the class of the testing sample to be a class associated with one of the reconstructed visual features in response to the visual feature of the testing sample being closest to the one of the reconstructed visual features.
  10. The apparatus of Claim 8, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus to:
    randomly initialize model parameters for the dictionary model; and
    update the model parameters to obtain a minimum of an objective function for the dictionary model so as to construct the dictionary model, the objective function being  defined at least by the model parameters.
  11. The apparatus of Claim 10, wherein the model parameters include at least one of the following: a dictionary matrix, a dictionary coefficient matrix and a transformation matrix.
  12. The apparatus of Claim 11, wherein the objective function for the dictionary model is formulized as:
    min_{D, P, C} ||X - DC||_F^2 + λ||C - PY||_F^2
    wherein ||·||_F represents an operation of solving an F-norm, X represents the visual features of multimedia content of the seen classes, Y represents the semantic features of multimedia content of the seen classes, D represents a dictionary matrix, P represents a transformation matrix, C represents a dictionary coefficient matrix, and λ represents a predetermined constant.
  13. The apparatus of any of Claims 8 to 12, wherein the semantic features of the multimedia content include at least one of the following: semantic attributes and distributed text representations of the multimedia content.
  14. The apparatus of any of Claims 8 to 12, wherein the visual features of the multimedia content include at least one of the following: color features, texture features, motion features and Convolutional Neural Network features of the multimedia content.
  15. An apparatus comprising means for performing the method according to any of Claims 1 to 7.
  16. A computer program product comprising at least one computer readable non-transitory memory medium having program code stored thereon, the program code which, when executed by an apparatus, causes the apparatus to perform the method according to any of Claims 1 to 7.
PCT/CN2016/095512 2016-08-16 2016-08-16 Method and apparatus for zero-shot learning WO2018032354A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201680088517.8A CN109643384A (en) 2016-08-16 2016-08-16 Method and apparatus for zero sample learning
PCT/CN2016/095512 WO2018032354A1 (en) 2016-08-16 2016-08-16 Method and apparatus for zero-shot learning
EP16913114.1A EP3500978A4 (en) 2016-08-16 2016-08-16 Method and apparatus for zero-shot learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/095512 WO2018032354A1 (en) 2016-08-16 2016-08-16 Method and apparatus for zero-shot learning

Publications (1)

Publication Number Publication Date
WO2018032354A1 true WO2018032354A1 (en) 2018-02-22

Family

ID=61196222

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/095512 WO2018032354A1 (en) 2016-08-16 2016-08-16 Method and apparatus for zero-shot learning

Country Status (3)

Country Link
EP (1) EP3500978A4 (en)
CN (1) CN109643384A (en)
WO (1) WO2018032354A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380374A (en) * 2020-10-23 2021-02-19 华南理工大学 Zero sample image classification method based on semantic expansion
CN114627312A (en) * 2022-05-17 2022-06-14 中国科学技术大学 Zero sample image classification method, system, equipment and storage medium
CN116109877A (en) * 2023-04-07 2023-05-12 中国科学技术大学 Combined zero-sample image classification method, system, equipment and storage medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110580501B (en) * 2019-08-20 2023-04-25 天津大学 Zero sample image classification method based on variational self-coding countermeasure network
CN112418257B (en) * 2019-08-22 2023-04-18 四川大学 Effective zero sample learning method based on potential visual attribute mining
CN110826638B (en) * 2019-11-12 2023-04-18 福州大学 Zero sample image classification model based on repeated attention network and method thereof
CN111914903B (en) * 2020-07-08 2022-10-25 西安交通大学 Generalized zero sample target classification method and device based on external distribution sample detection and related equipment
CN116051909B (en) * 2023-03-06 2023-06-16 中国科学技术大学 Direct push zero-order learning unseen picture classification method, device and medium
CN117541882B (en) * 2024-01-05 2024-04-19 南京信息工程大学 Instance-based multi-view vision fusion transduction type zero sample classification method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130148881A1 (en) * 2011-12-12 2013-06-13 Alibaba Group Holding Limited Image Classification
EP2642427A2 (en) * 2012-03-21 2013-09-25 Intellectual Ventures Fund 83 LLC Video concept classification using temporally-correlated grouplets
CN103400160A (en) * 2013-08-20 2013-11-20 中国科学院自动化研究所 Zero training sample behavior identification method
CN105701514A (en) * 2016-01-15 2016-06-22 天津大学 Multi-modal canonical correlation analysis method for zero sample classification
CN105740879A (en) * 2016-01-15 2016-07-06 天津大学 Zero-sample image classification method based on multi-mode discriminant analysis

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646256A (en) * 2013-12-17 2014-03-19 上海电机学院 Image characteristic sparse reconstruction based image classification method
CN105184260B (en) * 2015-09-10 2019-03-08 北京大学 A kind of image characteristic extracting method and pedestrian detection method and device
CN105512679A (en) * 2015-12-02 2016-04-20 天津大学 Zero sample classification method based on extreme learning machine
CN105718940B (en) * 2016-01-15 2019-03-29 天津大学 The zero sample image classification method based on factorial analysis between multiple groups

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130148881A1 (en) * 2011-12-12 2013-06-13 Alibaba Group Holding Limited Image Classification
EP2642427A2 (en) * 2012-03-21 2013-09-25 Intellectual Ventures Fund 83 LLC Video concept classification using temporally-correlated grouplets
CN103400160A (en) * 2013-08-20 2013-11-20 中国科学院自动化研究所 Zero training sample behavior identification method
CN105701514A (en) * 2016-01-15 2016-06-22 天津大学 Multi-modal canonical correlation analysis method for zero sample classification
CN105740879A (en) * 2016-01-15 2016-07-06 天津大学 Zero-sample image classification method based on multi-mode discriminant analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3500978A4 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380374A (en) * 2020-10-23 2021-02-19 华南理工大学 Zero sample image classification method based on semantic expansion
CN112380374B (en) * 2020-10-23 2022-11-18 华南理工大学 Zero sample image classification method based on semantic expansion
CN114627312A (en) * 2022-05-17 2022-06-14 中国科学技术大学 Zero sample image classification method, system, equipment and storage medium
CN114627312B (en) * 2022-05-17 2022-09-06 中国科学技术大学 Zero sample image classification method, system, equipment and storage medium
CN116109877A (en) * 2023-04-07 2023-05-12 中国科学技术大学 Combined zero-sample image classification method, system, equipment and storage medium
CN116109877B (en) * 2023-04-07 2023-06-20 中国科学技术大学 Combined zero-sample image classification method, system, equipment and storage medium

Also Published As

Publication number Publication date
CN109643384A (en) 2019-04-16
EP3500978A4 (en) 2020-01-22
EP3500978A1 (en) 2019-06-26

Similar Documents

Publication Publication Date Title
WO2018032354A1 (en) Method and apparatus for zero-shot learning
US11328171B2 (en) Image retrieval method and apparatus
US11062453B2 (en) Method and system for scene parsing and storage medium
US20190304065A1 (en) Transforming source domain images into target domain images
US11120305B2 (en) Learning of detection model using loss function
CN108154222B (en) Deep neural network training method and system and electronic equipment
CN109272043B (en) Training data generation method and system for optical character recognition and electronic equipment
US20190385086A1 (en) Method of knowledge transferring, information processing apparatus and storage medium
US20180130203A1 (en) Automated skin lesion segmentation using deep side layers
WO2019129032A1 (en) Remote sensing image recognition method and apparatus, storage medium and electronic device
CN108230346B (en) Method and device for segmenting semantic features of image and electronic equipment
KR20220122566A (en) Text recognition model training method, text recognition method, and apparatus
WO2019240964A1 (en) Teacher and student based deep neural network training
CN113139628B (en) Sample image identification method, device and equipment and readable storage medium
CN113822428A (en) Neural network training method and device and image segmentation method
US11164004B2 (en) Keyframe scheduling method and apparatus, electronic device, program and medium
US20220092407A1 (en) Transfer learning with machine learning systems
EP4303767A1 (en) Model training method and apparatus
CN108154153B (en) Scene analysis method and system and electronic equipment
US20190205728A1 (en) Method for visualizing neural network models
US20230021551A1 (en) Using training images and scaled training images to train an image segmentation model
JP2022161564A (en) System for training machine learning model recognizing character of text image
CN114330588A (en) Picture classification method, picture classification model training method and related device
JP2022185143A (en) Text detection method, and text recognition method and device
US9928408B2 (en) Signal processing

Legal Events

Date Code Title Description
121  Ep: the epo has been informed by wipo that ep was designated in this application
     Ref document number: 16913114; Country of ref document: EP; Kind code of ref document: A1
NENP  Non-entry into the national phase
     Ref country code: DE
ENP  Entry into the national phase
     Ref document number: 2016913114; Country of ref document: EP; Effective date: 20190318