WO2023231644A1 - Training method for object recognition model, object recognition method and object recognition device - Google Patents

Training method for object recognition model, object recognition method and object recognition device

Info

Publication number
WO2023231644A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature vector
object recognition
image
training
metadata
Prior art date
Application number
PCT/CN2023/090318
Other languages
English (en)
French (fr)
Inventor
徐青松
李青
Original Assignee
杭州睿胜软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州睿胜软件有限公司
Publication of WO2023231644A1 publication Critical patent/WO2023231644A1/zh


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • the present disclosure relates to the field of computer technology, and specifically to a training method for an object recognition model, an object recognition method, and an object recognition device.
  • One of the purposes of the present disclosure is to provide an object recognition model training method, an object recognition method and an object recognition device.
  • the training method includes:
  • the training set includes multiple sets of input samples for training and labeling results corresponding to the input samples, each input sample includes an object image and metadata, and the metadata is configured to describe the corresponding object image;
  • the training accuracy is greater than or equal to the preset accuracy, the training ends and the trained object recognition model is obtained.
  • before using the training set to train the object recognition model based on a neural network, the training method further includes: normalizing the metadata.
  • the metadata includes at least one of a shooting geographical location, shooting time, shooting scene, object part, and object status of the object image.
  • the neural network includes a classifier component and at least one block component, and the at least one block component includes a first block component;
  • using the training set to train the object recognition model includes:
  • the classifier component generates a classification result according to the first fused feature vector to train the object recognition model.
  • generating the first embedding feature vector according to the metadata includes:
  • a first embedding feature vector is generated based on the metadata through at least one fully connected layer.
  • generating the first embedding feature vector according to the metadata includes:
  • the first embedding feature vector is generated according to the metadata through at least one fully connected layer and at least one linear rectification layer.
  • the dimensions of the first image feature vector are the same as the dimensions of the first embedding feature vector.
  • fusing the first image feature vector with the first embedding feature vector to generate the first fused feature vector includes:
  • Each image vector component in the first image feature vector is added to a corresponding embedded vector component in the first embedded feature vector to generate each fused vector component in the first fused feature vector.
  • fusing the first image feature vector with the first embedding feature vector to generate the first fused feature vector includes:
  • the first image feature vector and the first embedding feature vector are concatenated along a preset dimension to generate the first fused feature vector.
  • the at least one block component further includes a second block component connected in series with the first block component;
  • Generating the first image feature vector according to the object image through the first block component includes:
  • the first block component generates a first image feature vector based on the second fused feature vector.
  • the neural network further includes a preprocessing component
  • using the training set to train the object recognition model includes:
  • the preprocessing component generates a preprocessed image according to the object image, and uses the preprocessed image as an input of a block component adjacent to the preprocessing component in the at least one block component.
  • the training method further includes:
  • the test set includes multiple sets of input samples for testing and labeling results corresponding to the input samples, each input sample includes an object image and metadata, the metadata is configured to describe the corresponding object image, and the test set is different from the training set;
  • the object recognition model is retrained.
  • the neural network includes a residual neural network.
  • an object recognition method is proposed, and the object recognition method includes:
  • An object recognition model is used to determine the recognition result based on the object image and metadata, wherein the object recognition model is trained using the training method as described above.
  • an object recognition device includes a memory and a processor. Instructions are stored on the memory, and when the instructions are executed by the processor, the training method or object recognition method described above is implemented.
  • a computer-readable storage medium is proposed, on which instructions are stored; when the instructions are executed by a processor, the training method or the object recognition method described above is implemented.
  • a computer program product is proposed, which includes instructions that, when executed by a processor, implement the training method or the object recognition method described above.
  • Figure 1 shows a schematic diagram of a network environment according to an exemplary embodiment of the present disclosure
  • Figure 2 shows a schematic flowchart of a training method for an object recognition model according to an exemplary embodiment of the present disclosure
  • Figure 3 shows a schematic flowchart of step S120 of the training method according to an exemplary embodiment of the present disclosure
  • Figure 4 shows a schematic diagram of a training method according to a specific example of the present disclosure
  • Figure 5 shows a schematic flowchart of step S121 of the training method according to an exemplary embodiment of the present disclosure
  • Figure 6 shows a schematic diagram of a training method according to another specific example of the present disclosure.
  • FIG. 7 shows a schematic flowchart of a training method for an object recognition model according to another exemplary embodiment of the present disclosure
  • Figure 8 shows a schematic flowchart of an object recognition method according to an exemplary embodiment of the present disclosure
  • FIG. 9 shows a schematic diagram of an object recognition device according to an exemplary embodiment of the present disclosure.
  • any specific values are to be construed as illustrative only and not as limiting. Accordingly, other examples of the exemplary embodiments may have different values.
  • FIG. 1 shows a schematic diagram of a network environment 900 according to an exemplary embodiment of the present disclosure.
  • Network environment 900 may include mobile device 902, remote server 903, training device 904, and database 905, which are wired or wirelessly coupled to each other through network 906.
  • Network 906 may embody a wide area network (such as a mobile phone network, a public switched telephone network, a satellite network, the Internet, etc.), a local area network (such as Wi-Fi, Wi-Max, ZigBeeTM, BluetoothTM, etc.), and/or other forms of networking functionality.
  • Mobile device 902 may be a mobile phone, tablet computer, laptop computer, personal digital assistant, and/or other computing device configured for capturing, storing, and/or transmitting images such as digital photos. Accordingly, mobile device 902 may include an image capture unit such as a digital camera and/or may be configured to receive images from other devices. Mobile device 902 may include a display. The display may be configured to provide the user 901 with one or more user interfaces that may include multiple interface elements with which the user 901 can interact. For example, the user 901 can use the mobile device 902 to photograph objects such as plants and insects and upload or store the object images. The mobile device 902 can output species information about plants, insects, and the like to the user.
  • the remote server 903 may be configured to analyze object images and the like received from the mobile device 902 via the network 906 to determine the type of object and the like, for example, for performing an object recognition method as described below.
  • the remote server 903 may also be configured to create and train an object recognition model as described below.
  • Training device 904 may be coupled to network 906 to facilitate training of an object recognition model, such as for performing a method of training an object recognition model as described below.
  • the training device 904 may have multiple CPUs and/or GPUs to assist in training the object recognition model. The specific training process will be elaborated below.
  • Database 905 may be coupled to network 906 and provide data required by remote server 903 to perform related calculations.
  • Database 905 may be implemented using various database technologies known in the art.
  • the remote server 903 can access the database 905 as needed to perform related operations.
  • object recognition models can be trained and established based on neural networks.
  • the training process is as follows:
  • a certain number of object images annotated with labeling results are obtained for each object type.
  • the number of object images prepared for each object type may be equal or different.
  • the labeling result annotated for each object image may include the name of the object in the object image (including scientific name, alias, category name of biological classification, etc.).
  • the object images obtained for each object type may, as far as possible, include images of that type of object at different shooting geographical locations, different shooting times, different shooting scenes, different object parts, different object states, and so on.
  • the object images processed by the above annotation are divided into a training set for training the object recognition model and a test set for testing the training results.
  • the number of samples in the training set is significantly larger than the number of samples in the test set.
  • the number of samples in the test set accounts for 5% to 20% of the total number of samples, and the number of samples in the corresponding training set accounts for 80% to 95% of the total number of samples.
  • the number of samples in the training set and the test set can be adjusted as needed.
  • the test set can also be used to test the model accuracy of the trained neural network as needed. If the model accuracy does not meet the requirements, increase the number of samples in the training set and retrain the neural network using the updated training set until the model accuracy of the trained neural network meets the requirements.
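  • As an illustrative sketch only (not part of the original disclosure), the following Python snippet shows one way such a split could be performed on a list of annotated samples; the function name, the 20% default test fraction, and the tuple layout of each sample are assumptions.

```python
import random

def split_samples(samples, test_fraction=0.2, seed=0):
    """Shuffle annotated (object_image, metadata, label) samples and split
    them into a larger training set and a smaller, disjoint test set."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)   # e.g. 20% for testing, 80% for training
    return shuffled[n_test:], shuffled[:n_test]   # (training_set, test_set)
```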
  • the above-mentioned neural network may include, for example, a convolutional neural network (CNN) or a residual neural network (Resnet).
  • the convolutional neural network is a deep feed-forward neural network, which uses a convolution kernel to scan the object image, extract the features to be identified in the object image, and then identify the features to be identified of the object.
  • the original object image can be directly input into the convolutional neural network model without preprocessing the object image.
  • the convolutional neural network model has higher recognition accuracy and recognition efficiency.
  • compared with the convolutional neural network model, the residual neural network model adds an identity mapping layer, which can avoid the phenomenon of accuracy saturation, or even decline, that occurs in a convolutional neural network as the network (the number of stacked layers in the network) grows.
  • the identity mapping function of the identity mapping layer in the residual neural network model needs to satisfy: the sum of the identity mapping function and the input of the residual neural network model is equal to the output of the residual neural network model. After the identity mapping is introduced, the residual neural network model reflects changes in the output more distinctly, so the accuracy and efficiency of object recognition can be greatly improved.
  • in the specific examples below, a residual neural network is taken as an example of the neural network for detailed explanation. However, it can be understood that other types of neural networks can also be used for training without departing from the concept of the present disclosure.
  • when training an object recognition model, the training is generally based only on the object image itself; for example, the RGB data of the object image is input into the neural network for training, which often results in lower recognition accuracy.
  • to improve the accuracy of the object recognition model, the present disclosure proposes a training method for the object recognition model that trains the model not only on the object image itself but also on the metadata configured to describe the object image, so as to improve recognition accuracy and recognition efficiency.
  • the training method may include:
  • Step S110 Obtain a training set, where the training set includes multiple sets of input samples used for training and labeling results corresponding to the input samples, each input sample including an object image and metadata, the metadata being configured to describe the corresponding object image.
  • the object images may include photos, videos, etc. obtained by users taking photos of plants, insects, and other objects.
  • the object image may be in RGB format, for example.
  • Each pixel in the object image can be composed of three color components: red (R), green (G), and blue (B). It can be understood that other data formats can also be used to represent the object image, such as RGBA format, HSV format, etc.
  • Metadata is data used to describe object images.
  • Each object image can have its corresponding metadata.
  • the metadata may include at least one of a shooting geographical location, shooting time, shooting scene, object part, and object status of the object image.
  • the shooting scene can be used to describe the environment in which the object is photographed, such as indoors or outdoors;
  • the object part can be used to describe the object parts presented or mainly presented in the object image, such as the roots, stems, leaves, flowers, fruits, etc. of plants, or the head, abdomen, etc. of insects;
  • the object status can be used to describe the current stage of the object presented in the object image, such as the seedling stage, flowering stage, or fruit stage of plants, or the larval stage or adult stage of insects.
  • Metadata may also include other data used to describe the object image, which is not limited here.
  • during training of the object recognition model, all metadata can be involved in the training process; alternatively, a part of the metadata that can play a key role in object recognition can be selected for training, so as to improve the accuracy of the model while maintaining high training efficiency and avoiding waste of training resources.
  • the training method may also include normalizing the metadata, thereby limiting the value range of the metadata to a desired range, for example, making the absolute value of the metadata fall between 0 and 1 (inclusive), to improve training results.
  • the longitude and latitude can usually be used to characterize the shooting geographical location. Then, in the process of normalizing the shooting location, the sine function and the cosine function can be used to convert the longitude and latitude into values with absolute values between 0 and 1.
  • in a specific example, if a shooting geographical location is expressed as (L1, L2), where L1 is the longitude of the shooting location (expressed in degrees) and L2 is the latitude of the shooting location (expressed in degrees), then the normalized shooting location can be expressed as (sin(πL1/180), cos(πL1/180), sin(πL2/180), cos(πL2/180)); that is, a vector with a dimension of 4 can be used to uniquely represent a specific shooting location.
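  • As a minimal, non-authoritative sketch of the normalization just described (the function and parameter names below are ours, not the disclosure's):

```python
import math

def normalize_location(l1_longitude_deg, l2_latitude_deg):
    """Map a shooting location (L1, L2), given in degrees, to the 4-dimensional
    vector (sin(pi*L1/180), cos(pi*L1/180), sin(pi*L2/180), cos(pi*L2/180)),
    whose components all have absolute values between 0 and 1."""
    lon = math.pi * l1_longitude_deg / 180.0
    lat = math.pi * l2_latitude_deg / 180.0
    return [math.sin(lon), math.cos(lon), math.sin(lat), math.cos(lat)]
```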
  • when the metadata is the shooting time of the object image, the shooting time can be divided into two parts, the shooting date and the time of day, and the time of day can be converted into Coordinated Universal Time (UTC). Then, the shooting date and the time of day are each expressed in a normalized manner.
  • the shooting time can be expressed as (sin(2πd/Y), cos(2πd/Y), sin(2πt/D), cos(2πt/D)), where d indicates that the current shooting date is the d-th day of the year, Y represents the total number of days in the year (365 or 366), t indicates that the current shooting time is the t-th moment of the day, and D represents the total number of moments in the day (for example, 24 when the time is kept to hour precision); that is, a vector with a dimension of 4 can likewise be used to uniquely represent a specific shooting time.
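  • A corresponding sketch for the shooting time, assuming hour precision (D = 24) and a Python `datetime` input (a naive datetime is treated as local time and converted to UTC); the function name is an assumption.

```python
import math
from datetime import datetime, timezone

def normalize_shooting_time(ts: datetime):
    """Map a shooting time to (sin(2*pi*d/Y), cos(2*pi*d/Y), sin(2*pi*t/D), cos(2*pi*t/D)),
    where d is the day of the year, Y the number of days in that year,
    t the hour of the day in UTC, and D = 24 (hour precision)."""
    ts = ts.astimezone(timezone.utc)
    d = ts.timetuple().tm_yday
    year = ts.year
    y_days = 366 if (year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)) else 365
    t, d_slots = ts.hour, 24
    return [math.sin(2 * math.pi * d / y_days), math.cos(2 * math.pi * d / y_days),
            math.sin(2 * math.pi * t / d_slots), math.cos(2 * math.pi * t / d_slots)]
```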
  • in summary, when the metadata is a quantity that can take continuous values, the sine function and the cosine function can be used to normalize the metadata.
  • other functions can also be used to normalize metadata that can take continuous values, and there is no limitation here.
  • a "dictionary lookup" method can be used to determine the metadata. Take value. For example, in the case where metadata represents various parts of a plant, 1 can be used to represent the root part of the plant, 2 to represent the stem part of the plant, 3 to represent the leaf part of the plant, 4 to represent the flower part of the plant, and 5 to represent the plant's flower part. Fruit parts.
  • in this way, the object part can be encoded as a [5, x] matrix, where 5 corresponds to the 5 categories; that is, each category corresponds to a vector of x values. After encoding metadata such as the object part and the object status respectively, an [n, x] vector can be obtained, where n is the number of such discrete features.
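  • One way to read the "[5, x] matrix" above is as a learned lookup table; the PyTorch-based sketch below reflects that reading, which is our assumption rather than the disclosure's wording, and the class name, the dictionary, and x = 32 are likewise illustrative.

```python
import torch
import torch.nn as nn

# "Dictionary" for one discrete metadata field (here: plant part).
PLANT_PART_IDS = {"root": 1, "stem": 2, "leaf": 3, "flower": 4, "fruit": 5}

class DiscreteMetadataEncoder(nn.Module):
    """Look up a learned [num_categories, x] table for a discrete metadata field,
    mirroring the '[5, x] matrix' encoding described above."""
    def __init__(self, num_categories=5, x=32):
        super().__init__()
        self.table = nn.Embedding(num_categories + 1, x)   # index 0 kept for "unknown"

    def forward(self, category_ids):
        return self.table(category_ids)                    # shape [batch, x]

encoder = DiscreteMetadataEncoder()
leaf_vector = encoder(torch.tensor([PLANT_PART_IDS["leaf"]]))   # one x-dimensional vector
```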
  • various metadata can be combined together, for example, vectors are formed separately based on various metadata, and then these vectors are added for fusion.
  • for example, the vectors of the various metadata can be converted into vectors of a preset dimension (for example, 2048) through several (for example, 3) fully connected layers, and then the corresponding components of these vectors are added to obtain a single vector representing all of the metadata participating in the training.
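  • A hedged sketch of that combination step follows; the three fully connected layers and the 2048-dimensional output match the example in the text, while the hidden width of 256, the class name, and the per-metadata-type branch layout are assumptions.

```python
import torch
import torch.nn as nn

class MetadataFusion(nn.Module):
    """Project every metadata vector to a common preset dimension (e.g. 2048)
    through three fully connected layers, then add the projections component-wise."""
    def __init__(self, input_dims, out_dim=2048, hidden=256):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU(),
                          nn.Linear(hidden, hidden), nn.ReLU(),
                          nn.Linear(hidden, out_dim))
            for d in input_dims)

    def forward(self, metadata_vectors):
        # metadata_vectors: one tensor of shape [batch, d_i] per metadata type
        return torch.stack([b(v) for b, v in zip(self.branches, metadata_vectors)], dim=0).sum(dim=0)

fusion = MetadataFusion(input_dims=[4, 4, 32])   # e.g. location, time, plant part
combined = fusion([torch.randn(1, 4), torch.randn(1, 4), torch.randn(1, 32)])   # [1, 2048]
```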
  • the training method may further include:
  • Step S120 Use the training set to train the object recognition model based on the neural network.
  • since the input samples of the training set contain object images and their metadata, this helps to improve the accuracy of the trained object recognition model.
  • for neural networks with different structures, metadata can be embedded into the training process in different ways.
  • the neural network may include a classifier component 830 and at least one block component, the at least one block component being included in the image feature extraction network 810, and the at least one block component may include a first block component. Accordingly, based on the neural network, using the training set to train the object recognition model can include:
  • Step S121 generate a first image feature vector according to the object image through the first block component
  • Step S122 generate the first embedded feature vector according to the metadata
  • Step S123 fuse the first image feature vector and the first embedded feature vector to generate a first fused feature vector
  • Step S124 The classifier component generates a classification result according to the first fusion feature vector to train the object recognition model.
  • the first block component may be the block component adjacent to the classifier component 830 in the image feature extraction network 810. That is to say, during the training process, the object image is converted into the first image feature vector through the complete image feature extraction network 810 (for example, the network backbone in the residual neural network), the metadata is converted into the first embedding feature vector through the embedding network 820, the first image feature vector is fused with the first embedding feature vector to generate the first fused feature vector (not shown in the figure), and the generated first fused feature vector is input to the classifier component 830 to generate the classification result.
  • in such an example, the image feature extraction network 810 and the embedding network 820 are independent of each other, and the first image feature vector and the first embedding feature vector are fused only immediately before the final classification.
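  • As a non-authoritative sketch of this Figure-4-style arrangement, the snippet below assumes PyTorch with a torchvision ResNet-50 backbone (the `weights=` argument requires torchvision 0.13 or later); the hidden width of the embedding network and the class name are our assumptions, not the disclosure's.

```python
import torch.nn as nn
import torchvision

class LateFusionRecognizer(nn.Module):
    """Image backbone 810 and metadata embedding network 820 stay independent;
    their outputs are added once, right before the classifier 830."""
    def __init__(self, metadata_dim, num_classes):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)
        feature_dim = backbone.fc.in_features                # 2048 for ResNet-50
        backbone.fc = nn.Identity()                           # keep only the feature extractor
        self.image_net = backbone                             # image feature extraction network 810
        self.embed_net = nn.Sequential(                       # embedding network 820 (assumed shape)
            nn.Linear(metadata_dim, 512), nn.ReLU(),
            nn.Linear(512, feature_dim))
        self.classifier = nn.Linear(feature_dim, num_classes)   # classifier component 830

    def forward(self, image, metadata):
        fused = self.image_net(image) + self.embed_net(metadata)   # first fused feature vector
        return self.classifier(fused)
```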
  • embedded network 820 may include at least one fully connected layer. Accordingly, generating the first embedding feature vector according to the metadata may include generating the first embedding feature vector according to the metadata through at least one fully connected layer.
  • the embedded network 820 may also include at least one linear rectification layer.
  • generating the first embedded feature vector according to the metadata may include generating the first embedded feature vector according to the metadata through at least one fully connected layer and at least one linear rectification layer.
  • various methods can be used to fuse the first image feature vector and the first embedding feature vector.
  • the dimensions of the first image feature vector may be the same as the dimensions of the first embedding feature vector.
  • in this way, fusing the first image feature vector with the first embedding feature vector to generate the first fused feature vector may include adding (add) each image vector component in the first image feature vector to the corresponding embedding vector component in the first embedding feature vector, to generate each fused vector component in the first fused feature vector; the dimension of the generated first fused feature vector then remains the same as the dimension of the first image feature vector or of the first embedding feature vector.
  • for example, if the dimension of the first image feature vector A is [ax, ay] and the dimension of the first embedding feature vector B is [bx, by], then adding A and B gives the first fused feature vector C with dimension [ax, ay], which requires that ax = bx and ay = by. In this case, the dimension of the generated first fused feature vector remains unchanged, so there is no need to adjust the subsequent network structure and the main structure of the network will not be destroyed.
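  • The two fusion options (element-wise addition here, concatenation along a preset dimension as mentioned earlier) can be contrasted in a short sketch; it is simplified to batched 1-D feature vectors and the function names are ours.

```python
import torch

def fuse_add(image_feat, embed_feat):
    """Element-wise addition: shapes must match exactly, and the fused
    dimension stays the same, so no downstream layers need resizing."""
    assert image_feat.shape == embed_feat.shape
    return image_feat + embed_feat

def fuse_concat(image_feat, embed_feat, dim=1):
    """Concatenation along a preset dimension: only the other dimensions must
    match, but the fused dimension grows, so later layers must be resized."""
    return torch.cat([image_feat, embed_feat], dim=dim)

a = torch.randn(8, 2048)        # first image feature vector (batch of 8)
b = torch.randn(8, 2048)        # first embedding feature vector
print(fuse_add(a, b).shape)     # torch.Size([8, 2048])
print(fuse_concat(a, b).shape)  # torch.Size([8, 4096])
```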
  • the image feature extraction network 810 and the embedding network 820 may not be completely independent.
  • the input of a certain block component may be the fused feature vector produced by fusing the image feature vector generated by another block component above it with the corresponding embedding feature vector.
  • the embedding network can be partially embedded into the image feature extraction network, inserting metadata at various stages to enhance the extracted features, where each embedding network can be regarded as an independent embedding network.
  • the at least one block component may further include a second block component 812 connected in series with the first block component 811.
  • each block component shown in Figure 6 is included in the image feature extraction network 810.
  • generating the first image feature vector according to the object image through the first block component may include:
  • Step S1211 generate a second image feature vector according to the object image through the second block component
  • Step S1212 generate a second embedded feature vector according to the metadata
  • Step S1213 fuse the second image feature vector and the second embedded feature vector to generate a second fused feature vector
  • Step S1214 Generate a first image feature vector based on the second fusion feature vector through the first block component.
  • similarly, the second embedding feature vector may be generated from the metadata in the same manner in which the first embedding feature vector is formed, and the second fused feature vector may be generated from the second image feature vector and the second embedding feature vector in the same manner in which the first fused feature vector is formed.
  • the image feature vector and the embedding feature vector can also be fused at more block components, and the resulting fused feature vector can be used as the input of the next level block component to improve the training effect.
  • the neural network may also include a preprocessing component 815 (stem), which is also included in the backbone.
  • using the training set to train the object recognition model may include: generating a preprocessed image from the object image through the preprocessing component, and using the preprocessed image as the input of the block component, among the at least one block component, that is adjacent to the preprocessing component. The specific configuration of the preprocessing component and the block components can follow the settings in the residual neural network and will not be described again here.
  • according to experimental results, as shown in the specific example of Figure 6, embedding the metadata in the same way at four block components can reinforce the effect of the metadata and yield a better training result. Specifically, the object image is first preprocessed by the preprocessing component 815 to generate a preprocessed image.
  • the preprocessed image is then input into the fourth block component 814 and converted into a fourth image feature vector.
  • the metadata is input into the fourth embedding network 824 and converted into a fourth embedding feature vector.
  • the fourth image feature vector is fused with the fourth embedded feature vector to generate a fourth fused feature vector, which is input into the third block component 813 to be converted into a third image feature vector.
  • the metadata is converted into a third embedding feature vector through the third embedding network 823.
  • the third image feature vector is then fused with the third embedded feature vector to produce a third fused feature vector, which is in turn input into the second block component 812 and converted into a second image feature vector.
  • the metadata is converted into a second embedding feature vector through the second embedding network 822.
  • the second image feature vector is fused with the second embedded feature vector to generate a second fused feature vector, which is further input into the first block component 811 to be converted into a first image feature vector.
  • the metadata is converted into a first embedding feature vector through the first embedding network 821.
  • the dimensions of the fourth embedding feature vector, the third embedding feature vector, the second embedding feature vector, and the first embedding feature vector output by the fourth embedding network 824, the third embedding network 823, the second embedding network 822, and the first embedding network 821 can be 256, 512, 1024, and 2048, respectively.
  • the fourth embedding network 824, the third embedding network 823, and the second embedding network 822 may each include only one fully connected layer and one linear rectification layer, while the first embedding network 821 may include three fully connected layers and three linear rectification layers, of which the last fully connected layer outputs a feature vector with a dimension of 2048.
  • the first image feature vector is fused with the first embedded feature vector to generate a first fused feature vector, which is input into the classifier component 830 to generate a classification result.
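  • A hedged sketch of this Figure-6-style arrangement follows, assuming PyTorch and a torchvision ResNet-50 (torchvision 0.13 or later for `weights=`); broadcasting the per-stage embeddings over the spatial positions, the hidden widths of the first embedding network, and the class name are our assumptions.

```python
import torch
import torch.nn as nn
import torchvision

class StageFusionResNet(nn.Module):
    """Metadata embeddings of dimension 256/512/1024 are added to the feature maps
    after the fourth, third, and second block components (ResNet-50 layer1..layer3),
    and a 2048-dimensional embedding is added to the pooled first image feature
    vector before the classifier, mirroring the four fusions described above."""
    def __init__(self, metadata_dim, num_classes):
        super().__init__()
        r = torchvision.models.resnet50(weights=None)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)   # preprocessing component 815
        self.blocks = nn.ModuleList([r.layer1, r.layer2, r.layer3])    # fourth, third, second block components
        self.first_block = r.layer4                                    # first block component 811
        self.pool = r.avgpool
        # Fourth/third/second embedding networks: one FC layer + one ReLU each.
        self.stage_embeds = nn.ModuleList(
            nn.Sequential(nn.Linear(metadata_dim, c), nn.ReLU()) for c in (256, 512, 1024))
        # First embedding network 821: three FC layers + three ReLU layers, output dim 2048.
        self.final_embed = nn.Sequential(
            nn.Linear(metadata_dim, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU())
        self.classifier = nn.Linear(2048, num_classes)                 # classifier component 830

    def forward(self, image, metadata):
        x = self.stem(image)                                           # preprocessed image
        for block, embed in zip(self.blocks, self.stage_embeds):
            x = block(x) + embed(metadata)[:, :, None, None]           # fuse, broadcast over H and W
        x = self.first_block(x)
        x = torch.flatten(self.pool(x), 1)                             # first image feature vector (2048)
        x = x + self.final_embed(metadata)                             # first fused feature vector
        return self.classifier(x)
```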
  • it can be understood that, in some other embodiments, there may be other block components between two adjacent block components that receive fused feature vectors.
  • other block components may also exist between the preprocessing component and the fourth block component, and/or between the first block component and the classifier component.
  • the training method may further include:
  • Step S130 When the training accuracy rate is greater than or equal to the preset accuracy rate, the training is ended, and the trained object recognition model is obtained.
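  • Steps S110 to S130 can be pictured with the non-authoritative training-loop sketch below; it assumes a PyTorch DataLoader yielding (images, metadata, labels) batches, an externally created optimizer and classification loss, and an illustrative preset accuracy of 0.95.

```python
def train_until_accurate(model, optimizer, loss_fn, train_loader,
                         preset_accuracy=0.95, max_epochs=100):
    """Train on the training set and stop once the training accuracy reaches
    the preset accuracy (step S130), returning the trained model."""
    model.train()
    for _ in range(max_epochs):
        correct = total = 0
        for images, metadata, labels in train_loader:
            logits = model(images, metadata)
            loss = loss_fn(logits, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.numel()
        if correct / total >= preset_accuracy:
            break
    return model
```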
  • the training method may also include:
  • Step S210 obtain a test set, where the test set includes multiple sets of input samples for testing and labeling results corresponding to the input samples.
  • Each input sample includes an object image and metadata, and the metadata is configured to describe the corresponding object image.
  • the test set is different from the training set;
  • Step S220 use the test set to determine the model accuracy of the trained object recognition model
  • Step S230 When the model accuracy is less than the preset accuracy, retrain the object recognition model.
  • the test set can be used to test whether the object recognition model also has a good recognition effect on object images outside the training set.
  • during testing, the model accuracy of the object recognition model is calculated by comparing the output results produced from the object images and their metadata in the test set with the corresponding labeling results.
  • model accuracy can be calculated the same way as training accuracy.
  • when the tested model accuracy is less than the preset accuracy, it indicates that the recognition effect of the object recognition model is not yet good enough; the training set can then be adjusted, for example by increasing the number of samples in the training set, or the object recognition model itself can be adjusted, or both can be adjusted, and the object recognition model is then retrained to improve its recognition performance.
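  • A small evaluation sketch for steps S210 to S230, under the same assumed data layout as the training sketch above (the function name and the commented retrain check are illustrative only):

```python
import torch

@torch.no_grad()
def model_accuracy(model, test_loader):
    """Accuracy on the held-out test set, computed in the same way as the
    training accuracy (step S220)."""
    model.eval()
    correct = total = 0
    for images, metadata, labels in test_loader:
        preds = model(images, metadata).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

# Step S230: if model_accuracy(model, test_loader) < PRESET_ACCURACY,
# enlarge the training set and/or adjust the model, then retrain.
```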
  • the object recognition method may include:
  • Step S310 Obtain the object image to be recognized and metadata, where the metadata is configured to describe the object image to be recognized;
  • Step S320 Use an object recognition model to determine a recognition result based on the object image and metadata, where the object recognition model is trained using the training method as described above.
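  • For illustration only, inference with the trained model could look like the sketch below; it assumes a single preprocessed image tensor and metadata vector and a model with the (image, metadata) call signature used in the sketches above.

```python
import torch

def recognize(model, image, metadata):
    """Steps S310-S320: run the trained object recognition model on one object
    image plus its metadata and return the index of the predicted class."""
    model.eval()
    with torch.no_grad():
        logits = model(image.unsqueeze(0), metadata.unsqueeze(0))   # add a batch dimension
    return logits.argmax(dim=1).item()
```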
  • the present disclosure also proposes an object recognition device.
  • the object recognition device may include a memory 720 and a processor 710. Instructions are stored on the memory 720, and when the instructions are executed by the processor 710, the training method or object recognition method described above is implemented.
  • the processor 710 can perform various actions and processes according to instructions stored in the memory 720.
  • the processor 710 may be an integrated circuit chip with signal processing capabilities.
  • the above-mentioned processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The processor may implement or execute the various methods, steps, and logical block diagrams disclosed in the embodiments of the present disclosure.
  • the general-purpose processor can be a microprocessor or the processor can be any conventional processor, etc., and can be an X86 architecture or an ARM architecture, etc.
  • the memory 720 stores executable instructions, which are used by the processor 710 to execute the training method of the object recognition model or the object recognition method described above.
  • Memory 720 may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • Non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • Volatile memory may be random access memory (RAM), which acts as an external cache.
  • By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM), and direct memory bus random access memory (DR RAM).
  • the present disclosure also proposes a computer-readable storage medium, which stores instructions.
  • when the instructions are executed by a processor, the training method of the object recognition model or the object recognition method as described above is implemented.
  • computer-readable storage media in embodiments of the present disclosure may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. It should be noted that computer-readable storage media described herein are intended to include, without limitation, these and any other suitable types of memory.
  • the present disclosure further proposes a computer program product, which may include instructions that, when executed by a processor, implement the training method of the object recognition model or the object recognition method as described above.
  • the instructions may be any set of instructions to be executed directly by one or more processors, such as machine code, or indirectly, such as a script.
  • instructions may be stored in object code format for direct processing by one or more processors, or in any other computer language, including scripts or collections of independent source code modules that are interpreted on demand or compiled in advance. The instructions may include instructions that cause, for example, one or more processors to implement the neural networks described herein.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
  • the various example embodiments of the present disclosure may be implemented in hardware or special-purpose circuits, software, firmware, logic, or any combination thereof. Some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software that may be executed by a controller, microprocessor, or other computing device. While aspects of the embodiments of the present disclosure are illustrated or described as block diagrams, flowcharts, or using some other graphical representation, it will be understood that the blocks, devices, systems, techniques, or methods described herein may be implemented, as non-limiting examples, in hardware, software, firmware, special-purpose circuitry or logic, general-purpose hardware or controllers, or other computing devices, or some combination thereof.
  • the word "exemplary” means “serving as an example, instance, or illustration” rather than as a “model” that will be accurately reproduced. Any implementation illustratively described herein is not necessarily to be construed as preferred or advantageous over other implementations. Furthermore, this disclosure is not intended to be expressed or implied by anything presented in the above technical field, background, brief summary or detailed description. limited by the theory shown.
  • the word “substantially” is meant to include any minor variations resulting from design or manufacturing defects, device or component tolerances, environmental effects, and/or other factors.
  • the word “substantially” also allows for differences from perfect or ideal conditions due to parasitic effects, noise, and other practical considerations that may be present in actual implementations.
  • as used herein, unless expressly stated otherwise, "connected" means that one element/node/feature is electrically, mechanically, logically, or otherwise directly connected to (or in direct communication with) another element/node/feature.
  • "coupled" means that one element/node/feature can be joined to another element/node/feature directly or indirectly, mechanically, electrically, logically, or otherwise, so as to allow interaction, even though the two features may not be directly connected. That is, "coupled" is intended to encompass both direct and indirect connections of elements or other features, including connections via one or more intervening elements.
  • the words "first," "second," and similar terms may also be used herein for reference purposes only and are therefore not intended to be limiting.
  • the words “first,” “second,” and other such numerical terms referring to structures or elements do not imply a sequence or order unless clearly indicated by the context.
  • the term "provide" is used in a broad sense to cover all ways of obtaining an object, so "providing an object" includes but is not limited to "purchasing", "preparing/manufacturing", "arranging/setting up", "installing/assembling", and/or "ordering" the object, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to a training method for an object recognition model, an object recognition method, and an object recognition device. The training method includes: obtaining a training set, wherein the training set includes multiple sets of input samples for training and labeling results corresponding to the input samples, each input sample includes an object image and metadata, and the metadata is configured to describe the corresponding object image; training the object recognition model, on the basis of a neural network, by using the training set; and ending the training when the training accuracy is greater than or equal to a preset accuracy, to obtain the trained object recognition model.

Description

对象识别模型的训练方法、对象识别方法和对象识别设备 技术领域
本公开涉及计算机技术领域,具体来说,涉及一种对象识别模型的训练方法、对象识别方法和对象识别设备。
背景技术
为了对例如植物、昆虫等各种对象进行识别,用户可以提供所拍摄的对象图像,并利用预先训练获得的对象识别模型来完成识别,从而获得例如物种信息等识别结果。然而,目前的对象识别模型的准确率还较低,因此存在对改进的对象识别模型的需求。
发明内容
本公开的目的之一是提供一种对象识别模型的训练方法、对象识别方法和对象识别设备。
根据本公开的第一方面,提出了一种对象识别模型的训练方法,所述训练方法包括:
获取训练集,其中,所述训练集包括用于训练的多组输入样本和与输入样本对应的标记结果,每个输入样本包括对象图像和元数据,元数据被配置为描述相应的对象图像;
基于神经网络,使用所述训练集来训练所述对象识别模型;以及
当训练准确率大于或等于预设准确率时结束训练,并得到训练后的对象识别模型。
在一些实施例中,在基于神经网络,使用所述训练集来训练所述对象识别模型之前,所述训练方法还包括:
对元数据进行归一化处理。
在一些实施例中,元数据包括对象图像的拍摄地理位置、拍摄时间、拍摄场景、对象部位和对象状态中的至少一者。
在一些实施例中,所述神经网络包括分类器组件和至少一个块组件,所 述至少一个块组件包括第一块组件;
基于神经网络,使用所述训练集来训练所述对象识别模型包括:
通过所述第一块组件根据对象图像产生第一图像特征向量;
根据元数据产生第一嵌入特征向量;
将第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量;以及
通过所述分类器组件根据第一融合特征向量产生分类结果,以训练所述对象识别模型。
在一些实施例中,根据元数据产生第一嵌入特征向量包括:
通过至少一个全连接层根据元数据产生第一嵌入特征向量。
在一些实施例中,根据元数据产生第一嵌入特征向量包括:
通过至少一个全连接层和至少一个线性整流层根据元数据产生第一嵌入特征向量。
在一些实施例中,第一图像特征向量的维度与第一嵌入特征向量的维度相同。
在一些实施例中,将第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量包括:
将第一图像特征向量中的每个图像向量分量分别与第一嵌入特征向量中的相应的嵌入向量分量相加,以产生第一融合特征向量中的各个融合向量分量。
在一些实施例中,将第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量包括:
在预设维度上将第一图像特征向量与第一嵌入特征向量进行连接,以产生第一融合特征向量。
在一些实施例中,所述至少一个块组件还包括与所述第一块组件串联的第二块组件;
通过所述第一块组件根据对象图像产生第一图像特征向量包括:
通过所述第二块组件根据对象图像产生第二图像特征向量;
根据元数据产生第二嵌入特征向量;
将第二图像特征向量与第二嵌入特征向量融合以产生第二融合特征向量;以及
通过所述第一块组件根据第二融合特征向量产生第一图像特征向量。
在一些实施例中,所述神经网络还包括预处理组件;
基于神经网络,使用所述训练集来训练所述对象识别模型包括:
通过所述预处理组件根据对象图像产生预处理图像,并将预处理图像作为所述至少一个块组件中的与所述预处理组件相邻的块组件的输入。
在一些实施例中,所述训练方法还包括:
获取测试集,其中,所述测试集包括用于测试的多组输入样本和与输入样本对应的标记结果,每个输入样本包括对象图像和元数据,元数据被配置为描述相应的对象图像,且所述测试集不同于所述训练集;
使用所述测试集来确定训练后的对象识别模型的模型准确率;以及
当模型准确率小于所述预设准确率时,对所述对象识别模型重新进行训练。
在一些实施例中,神经网络包括残差神经网络。
根据本公开的第二方面,提出了一种对象识别方法,所述对象识别方法包括:
获取待识别的对象图像和元数据,其中,元数据被配置为描述待识别的对象图像;
使用对象识别模型根据对象图像和元数据确定识别结果,其中,所述对象识别模型是采用如上所述的训练方法来训练获得的。
根据本公开的第三方面,提出了一种对象识别设备,所述对象识别设备包括存储器和处理器,所述存储器上存储有指令,当所述指令被所述处理器执行时,实现如上所述的训练方法或对象识别方法。
根据本公开的第四方面,提出了一种计算机可读存储介质,所述计算机可读存储介质上存储有指令,当所述指令被处理器执行时,实现如上所述的训练方法或对象识别方法。
根据本公开的第五方面,提出了一种计算机程序产品,所述计算机程序产品包括指令,当所述指令被处理器执行时,实现如上所述的训练方法或对象识别方法。
通过以下参照附图对本公开的示例性实施例的详细描述,本公开的其它特征及其优点将会变得更为清楚。
附图说明
构成说明书的一部分的附图描述了本公开的实施例,并且连同说明书一起用于解释本公开的原理。
参照附图,根据下面的详细描述,可以更加清楚地理解本公开,其中:
图1示出了根据本公开的一示例性实施例的网络环境的示意图;
图2示出了根据本公开的一示例性实施例的用于对象识别模型的训练方法的流程示意图;
图3示出了根据本公开的一示例性实施例的训练方法的步骤S120的流程示意图;
图4示出了根据本公开的一具体示例的训练方法的示意图;
图5示出了根据本公开的一示例性实施例的训练方法的步骤S121的流程示意图;
图6示出了根据本公开的另一具体示例的训练方法的示意图;
图7示出了根据本公开的另一示例性实施例的用于对象识别模型的训练方法的流程示意图;
图8示出了根据本公开的一示例性实施例的对象识别方法的流程示意图;
图9示出了根据本公开的一示例性实施例的对象识别设备的示意图。
注意,在以下说明的实施方式中,有时在不同的附图之间共同使用同一附图标记来表示相同部分或具有相同功能的部分,而省略其重复说明。在一些情况中,使用相似的标号和字母表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步讨论。
为了便于理解,在附图等中所示的各结构的位置、尺寸及范围等有时不 表示实际的位置、尺寸及范围等。因此,本公开并不限于附图等所公开的位置、尺寸及范围等。
具体实施方式
下面将参照附图来详细描述本公开的各种示例性实施例。应注意到:除非另外具体说明,否则在这些实施例中阐述的部件和步骤的相对布置、数字表达式和数值不限制本公开的范围。
以下对至少一个示例性实施例的描述实际上仅仅是说明性的,决不作为对本公开及其应用或使用的任何限制。也就是说,本文中的结构及方法是以示例性的方式示出,来说明本公开中的结构和方法的不同实施例。然而,本领域技术人员将会理解,它们仅仅说明可以用来实施的本公开的示例性方式,而不是穷尽的方式。此外,附图不必按比例绘制,一些特征可能被放大以示出具体组件的细节。
对于相关领域普通技术人员已知的技术、方法和设备可能不作详细讨论,但在适当情况下,所述技术、方法和设备应当被视为授权说明书的一部分。
在这里示出和讨论的所有示例中,任何具体值应被解释为仅仅是示例性的,而不是作为限制。因此,示例性实施例的其它示例可以具有不同的值。
图1示出了根据本公开的一示例性实施例的网络环境900的示意图。网络环境900可以包括移动设备902、远程服务器903、训练设备904和数据库905,它们通过网络906彼此有线或无线地耦接。网络906可以体现为广域网(诸如移动电话网络、公共交换电话网络、卫星网络、互联网等)、局域网(诸如Wi-Fi、Wi-Max、ZigBeeTM、BluetoothTM等)和/或其它形式的联网功能。
移动设备902可以为移动电话、平板计算机、膝上型计算机、个人数字助理和/或被配置用于采集、存储和/或传输诸如数字照片之类的图像的其它计算装置。因此,移动设备902可以包括诸如数字相机之类的图像采集单元和/或可以被配置为从其它装置接收图像。移动设备902可以包括显示器。显示器可以被配置用于向用户901提供一个或多个用户界面,所述用户界面可以 包括多个界面元素,用户901可以与界面元素进行交互等。例如,用户901可以使用移动设备902对植物、昆虫等对象进行拍照并上传或存储对象图像。移动设备902可以向用户输出有关植物、昆虫等对象的物种信息等。
远程服务器903可以被配置为对经由网络906从移动设备902接收的对象图像等进行分析以确定对象的种类等,例如用于执行如下文所述的对象识别方法。远程服务器903还可以被配置为创建并训练如下文所述的对象识别模型。
训练设备904可以耦合到网络906以促进对象识别模型的训练,例如用于执行如下文所述的对象识别模型的训练方法。训练设备904可以具有多个CPU和/或GPU以辅助训练对象识别模型,具体的训练过程将在下文中详细阐述。
数据库905可以耦合到网络906并提供远程服务器903进行相关计算所需的数据。数据库905可以采取本领域中已知的各种数据库技术来实现。远程服务器903可以根据需要访问数据库905以进行相关操作。
应该理解的是,本文的网络环境仅仅是一个示例。本领域技术人员可以根据需要,增加更多的装置或删减一些装置,并且可以对一些装置的功能和配置进行修改。
在一些示例中,可以基于神经网络来训练和建立对象识别模型,其训练过程如下:
为每个对象种类获取一定数量的标注有标记结果的对象图像,为每个对象种类准备的对象图像的数量可以相等也可以不等。为每个对象图像标注的标记结果可以包括该对象图像中的对象名称(包括学名、别称、生物学分类的类别名称等)。为每个对象种类获取的对象图像可以尽可能包括该种类的对象的不同拍摄地理位置、不同拍摄时间、不同拍摄场景、不同对象部位、不同对象状态等的图像。
将经过上述标注处理的对象图像划分为用于训练对象识别模型的训练集和用于对训练结果进行测试的测试集。通常训练集内的样本的数量明显大于测试集内的样本的数量,例如,测试集内的样本的数量占总样本数量的5%到 20%,相应的训练集内的样本的数量占总样本数量的80%到95%。本领域技术人员应该理解的是,训练集和测试集内的样本数量可以根据需要来调整。
利用训练集对神经网络进行训练,直至达到预设准确率。在一些情况下,还可以根据需要利用测试集对经过训练的神经网络的模型准确率进行测试。若模型准确率不满足要求,则增加训练集中的样本数量,并利用更新的训练集重新对神经网络进行训练,直到经过训练的神经网络的模型准确率满足要求为止。
上述神经网络例如可以包括卷积神经网络(CNN)或者残差神经网络(Resnet)等。其中,卷积神经网络为深度前馈神经网络,其利用卷积核扫描对象图像,提取出对象图像中待识别的特征,进而对对象的待识别的特征进行识别。另外,在对对象图像进行识别的过程中,可以直接将原始的对象图像输入卷积神经网络模型,而无需对对象图像进行预处理。卷积神经网络模型相比于其他的识别模型,具备更高的识别准确率以及识别效率。而残差神经网络模型相比于卷积神经网络模型增加了恒等映射层,可以避免随着网络(网络中叠层的数量)的增加,卷积神经网络造成的准确率饱和、甚至下降的现象。残差神经网络模型中恒等映射层的恒等映射函数需要满足:恒等映射函数与残差神经网络模型的输入之和等于残差神经网络模型的输出。引入恒等映射以后,残差神经网络模型对输出的变化更加明显,因此可以大大提高对象识别的准确率和效率。在下文的具体示例中,将以神经网络为残差神经网络为例进行具体阐述,然而可以理解的是,在不脱离本公开的构思的前提下,也可以采用其他类型的神经网络进行训练。
根据上文的描述可知,在训练对象识别模型时,一般仅根据对象图像本身来进行训练,例如将对象图像的RGB数据输入神经网络进行训练,这往往导致较低的识别准确率。为了提高对象识别模型的准确率,本公开提出了一种对象识别模型的训练方法,不仅仅基于对象图像本身,还基于被配置为描述对象图像的元数据来进行对象识别模型的训练,以提高识别准确率和识别效率。如图2所示,在本公开的一示例性实施例中,该训练方法可以包括:
步骤S110,获取训练集,其中,训练集包括用于训练的多组输入样本和 与输入样本对应的标记结果,每个输入样本包括对象图像和元数据,元数据被配置为描述相应的对象图像。
其中,对象图像可以包括用户拍摄植物、昆虫等对象所获得的相片、视频等。对象图像可以是例如RGB格式的。对象图像中的每个像素可以由红(R)、绿(G)和蓝(B)3个颜色分量构成。可以理解的是,也可以采用其他的数据格式来表示对象图像,例如RGBA格式、HSV格式等。
元数据(meta data)是用来描述对象图像的数据,每一幅对象图像可以具有其相应的元数据。在一些实施例中,元数据可以包括对象图像的拍摄地理位置、拍摄时间、拍摄场景、对象部位和对象状态中的至少一者。其中,拍摄场景可以用来描述拍摄对象时的环境,例如是室内还是室外等;对象部位可以用来描述对象图像中所呈现的或主要呈现的对象部位,例如植物的根、茎、叶、花、果实等,昆虫的头、腹等;对象状态可以用来描述对象图像中所呈现的对象当前所处的阶段,例如植物的幼苗期、花期、果实期等,昆虫的幼虫期、成虫期等。可以理解的是,在其他一些实施例中,元数据还可以包括用来描述对象图像的其他数据,在此不作限制。在训练对象识别模型的过程中,可以使所有的元数据参与到训练过程中;或者,可以挑选其中能够对对象的识别起到关键作用的一部分元数据用于训练,以在提高模型的准确率的同时保持较高的训练效率,避免训练资源的浪费。
在一些实施例中,为了方便训练过程中数据的处理,训练方法还可以包括对元数据进行归一化处理,从而将元数据的取值范围限定在期望的范围内,例如使得元数据的绝对值在0至1之间(包含0和1),以改善训练结果。可以理解的是,根据具体的元数据对识别所起的作用的大小,以及元数据和对象图像之间的关系,可以根据需要来调整元数据的取值范围,在此不作限制。此外,根据不同类型的元数据,还可以采用不同的归一化方式。
例如,在元数据为对象图像的拍摄地理位置的情况下,通常可以采用经度和纬度来表征该拍摄地理位置。那么,在对拍摄地理位置进行归一化的过程中,可以采用正弦函数和余弦函数来将经度和纬度转换为绝对值在0至1之间的值。在一具体示例中,若某一拍摄地理位置可以被表示为(L1,L2), 其中,L1为该拍摄地理位置的经度(表示为角度),L2为该拍摄地理位置的纬度(表示为角度),那么,归一化后的拍摄地理位置可以被表示为(sin(πL1/180),cos(πL1/180),sin(πL2/180),cos(πL2/180)),即可以采用一个维度为4的向量来唯一地表示一具体的拍摄地理位置。
在元数据为对象图像的拍摄时间的情况下,可以将拍摄时间分为拍摄日期和在这一天中的拍摄时刻两个部分,并将拍摄时刻转换为世界标准(utc)时间的形式。然后,分别采用归一化的方式来表示拍摄日期和拍摄时刻。在一具体示例中,拍摄时间可以被表示为(sin(2πd/Y),cos(2πd/Y),sin(2πt/D),cos(2πt/D)),其中,d表示当前的拍摄日期为一年中的第d天,Y表示一年中的总天数(为365或366),t表示当前的拍摄时刻为一天中的第t个时刻,D表示一天中的总时刻数(例如在时刻精确到小时的情况下,为24),即同样可以采用一个维度为4的向量来唯一地表示一具体的拍摄时间。
综上可知,在元数据为某一能够连续取值的量时,可以利用正弦函数和余弦函数来实现该元数据的归一化。然而可以理解的是,也可以采用其他函数来实现能够连续取值的元数据的归一化,在此不作限制。
在元数据为对象图像的拍摄场景、对象部位和对象状态中的一者的情况下,或者说在元数据为某一离散量的情况下,可以采用“查字典”的方式来确定元数据的取值。例如,在元数据表示植物的各个部位的情况下,可以用1来表示植物的根部位,2表示植物的茎部位,3表示植物的叶部位,4表示植物的花部位,以及5表示植物的果实部位。这样,对象部位可以被编码为一个[5,x]的矩阵,其中5对应于5个分类,也就是说每个分类对应于一个x个数值的向量。分别对对象部位、对象状态等元数据进行编码后,可以得到一个[n,x]的向量,n为这种离散特征的个数。
在训练过程中,可以将各种元数据组合在一起,例如根据各种元数据分别形成向量,然后将这些向量相加进行融合。例如,可以将各种元数据的向量通过若干次(例如,3次)全连接层转换为预设维度(例如,2048)的向量,然后将这些向量的相应分量分别相加得到表示所有参与训练的元数据的向量。
返回图2,在本公开的示例性实施例中,训练方法还可以包括:
步骤S120,基于神经网络,使用训练集来训练对象识别模型。
如上文所述,由于训练集的输入样本中包含了对象图像及其元数据,因此有助于改善所训练的对象识别模型的准确率。对于不同结构的神经网络而言,可以采用不同的方式将元数据嵌入到训练过程中。
在一些实施例中,如图3和图4所示,神经网络可以包括分类器组件830和至少一个块组件(block),至少一个块组件被包含在图像特征提取网络810中,且至少一个块组件可以包括第一块组件。相应地,基于神经网络,使用训练集来训练对象识别模型可以包括:
步骤S121,通过第一块组件根据对象图像产生第一图像特征向量;
步骤S122,根据元数据产生第一嵌入特征向量;
步骤S123,将第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量;以及
步骤S124,通过分类器组件根据第一融合特征向量产生分类结果,以训练对象识别模型。
在一具体示例中,第一块组件可以是图像特征提取网络810中的与分类器组件830相邻的块组件。也就是说,在训练过程中,对象图像经过完整的图像特征提取网络810(例如,残差神经网络中的网络骨架(backbone))被转换为第一图像特征向量,而元数据经过嵌入网络820被转换为第一嵌入特征向量,第一图像特征向量与第一嵌入特征向量相融合产生第一融合特征向量(图中未示出),且所产生的第一融合特征向量被输入到分类器组件830中以产生分类结果。在这样的示例中,图像特征提取网络810和嵌入网络820是彼此独立的,仅仅在做最后的分类前,第一图像特征向量与第一嵌入特征向量被融合。
在一些实施例中,嵌入网络820可以包括至少一个全连接层。相应地,根据元数据产生第一嵌入特征向量可以包括通过至少一个全连接层根据元数据产生第一嵌入特征向量。
在另一些实施例中,嵌入网络820还可以包括至少一个线性整流层。相 应地,根据元数据产生第一嵌入特征向量可以包括通过至少一个全连接层和至少一个线性整流层根据元数据产生第一嵌入特征向量。
此外,可以采用多种方式来对第一图像特征向量和第一嵌入特征向量进行融合。在一些实施例中,第一图像特征向量的维度可以与第一嵌入特征向量的维度相同。这样,将第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量可以包括将第一图像特征向量中的每个图像向量分量分别与第一嵌入特征向量中的相应的嵌入向量分量相加(add),以产生第一融合特征向量中的各个融合向量分量,且所产生的第一融合特征向量的维度依然保持与第一图像特征向量的维度或第一嵌入特征向量的维度相同。例如,若第一图像特征向量A的维度为[ax,ay],第一嵌入特征向量B的维度为[bx,by],那么,如果A和B相加得到第一融合特征向量C,其维度为[ax,ay],其中要求ax=bx,且ay=by。在这种情况下,所产生的第一融合特征向量的维度不变,因此可以无需对后续的网络结构进行调整,不会破坏网络的主体结构。
在另一些实施例中,将第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量可以包括在预设维度上将第一图像特征向量与第一嵌入特征向量进行连接(concatenate),以产生第一融合特征向量。例如,若第一图像特征向量A的维度为[ax,ay],第一嵌入特征向量B的维度为[bx,by],那么,如果A和B在第一个维度上进行连接,将得到第一融合特征向量C,其维度为[ax+bx,ay],其中要求ay=by。在这种情况下,在后续训练中可能涉及对网络结构的一些相应调整,因为第一融合特征向量的维度发生了变化,但此时不要求第一图像特征向量与第一嵌入特征向量的维度必须是相同的。
在另一些实施例中,图像特征提取网络810和嵌入网络820可以不是完全独立的,某一个块组件的输入可以是由其上方的另一个块组件产生的图像特征向量与相应的嵌入特征向量相融合所产生的融合特征向量。换句话说,嵌入网络可以部分地嵌入到图像特征提取网络中,在各个阶段插入元数据,强化提取的特征,其中,每个嵌入网络都可以看作是一个独立的嵌入网络。具体而言,如图5和图6所示,至少一个块组件还可以包括与第一块组件811串联的第二块组件812。其中,图6中所示的各个块组件都被包含在图像特征 提取网络810中。相应地,通过第一块组件根据对象图像产生第一图像特征向量可以包括:
步骤S1211,通过第二块组件根据对象图像产生第二图像特征向量;
步骤S1212,根据元数据产生第二嵌入特征向量;
步骤S1213,将第二图像特征向量与第二嵌入特征向量融合以产生第二融合特征向量;以及
步骤S1214,通过第一块组件根据第二融合特征向量产生第一图像特征向量。
类似地,可以采用形成第一嵌入特征向量的方式来根据元数据产生第二嵌入特征向量,并采用形成第一融合特征向量的方式来根据第二图像特征向量与第二嵌入特征向量产生第二融合特征向量。依此类推,还可以在更多的块组件处进行图像特征向量与嵌入特征向量的融合,并将所产生的融合特征向量作为下一级块组件的输入,以改善训练效果。
此外,如图6所示,在一些实施例中,神经网络还可以包括预处理组件815(stem),该预处理组件815也被包含在backbone中。相应地,基于神经网络,使用训练集来训练对象识别模型可以包括:通过预处理组件根据对象图像产生预处理图像,并将预处理图像作为至少一个块组件中的与预处理组件相邻的块组件的输入。其中,预处理组件和块组件的具体设置方式可以参考残差神经网络中的设置,在此不再赘述。
根据实验结果,如图6的具体示例中所示,在四个块组件处用同样的方式来嵌入可以强化元数据的作用,从而获得较好的训练效果。具体而言,对象图像经由预处理组件815的预处理后产生预处理图像。然后,预处理图像被输入到第四块组件814中,转换为第四图像特征向量。并且,元数据被输入到第四嵌入网络824中,转换为第四嵌入特征向量。然后,第四图像特征向量与第四嵌入特征向量融合以产生第四融合特征向量,该第四融合特征向量被输入到第三块组件813中,转换为第三图像特征向量。并且,元数据经过第三嵌入网络823转换为第三嵌入特征向量。然后,第三图像特征向量与第三嵌入特征向量融合以产生第三融合特征向量,该第三融合特征向量继续 被输入到第二块组件812中,转换为第二图像特征向量。并且,元数据经过第二嵌入网络822转换为第二嵌入特征向量。然后,第二图像特征向量与第二嵌入特征向量融合以产生第二融合特征向量,该第二融合特征向量继续被输入到第一块组件811中,转换为第一图像特征向量。并且,元数据经过第一嵌入网络821转换为第一嵌入特征向量。其中,根据残差神经网络的特点,第四嵌入网络824、第三嵌入网络823、第二嵌入网络822和第一嵌入网络821输出的第四嵌入特征向量、第三嵌入特征向量、第二嵌入特征向量和第一嵌入特征向量的维度可以分别为256,512,1024和2048。在一具体示例中,第四嵌入网络824、第三嵌入网络823和第二嵌入网络822可以分别包括仅一个全连接层和一个线性整流层,而第一嵌入网络821可以包括三个全连接层和三个线性整流层,其中最后一个全连接层输出维度为2048的特征向量。最后,第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量,该第一融合特征向量被输入到分类器组件830中以产生分类结果。
可以理解的是,在其他一些实施例中,在两个相邻的将接收融合特征向量的块组件之间还可以存在其他的块组件。此外,在预处理组件与第四块组件之间,和/或在第一块组件与分类器组件之间,也可以存在其他的块组件。
返回图2,在本公开的示例性实施例中,训练方法还可以包括:
步骤S130,当训练准确率大于或等于预设准确率时结束训练,并得到训练后的对象识别模型。
进一步地,为了验证训练所得的对象识别模型的模型准确率,在一些实施例中,如图7所示,训练方法还可以包括:
步骤S210,获取测试集,其中,测试集包括用于测试的多组输入样本和与输入样本对应的标记结果,每个输入样本包括对象图像和元数据,元数据被配置为描述相应的对象图像,且测试集不同于训练集;
步骤S220,使用测试集来确定训练后的对象识别模型的模型准确率;以及
步骤S230,当模型准确率小于预设准确率时,对对象识别模型重新进行训练。
如上文所述,测试集和训练集中的输入样本并不完全相同,因而可以用测试集来测试对象识别模型是否对训练集之外的对象图像也有很好的识别效果。在测试过程中,通过比较根据测试集中的对象图像及其元数据所产生的输出结果,来计算对象识别模型的模型准确率。在一些示例中,模型准确率的计算方法可以与训练准确率的计算方法相同。当测试得到的模型准确率小于预设准确率时,表明对象识别模型的识别效果还不够好,因而可以调整训练集,具体例如可以增加训练集中的样本数量,或者调整对象识别模型本身,或者对上述两者均进行调整,然后重新训练对象识别模型来改善其识别效果。
本公开还提出了一种对象识别方法,如图8所示,该对象识别方法可以包括:
步骤S310,获取待识别的对象图像和元数据,其中,元数据被配置为描述待识别的对象图像;
步骤S320,使用对象识别模型根据对象图像和元数据确定识别结果,其中,对象识别模型是采用如上所述的训练方法进行训练所获得的。
本公开还提出了一种对象识别设备,如图9所示,该对象识别设备可以包括存储器720和处理器710,存储器720上存储有指令,当指令被处理器710执行时,实现如上所述的训练方法或对象识别方法。
其中,处理器710可以根据存储在存储器720中的指令执行各种动作和处理。具体地,处理器710可以是一种集成电路芯片,具有信号的处理能力。上述处理器可以是通用处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现成可编程门阵列(FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本公开实施例中公开的各种方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等,可以是X86架构或者是ARM架构等。
存储器720存储有可执行指令,该指令在被处理器710执行上文所述的对象识别模型的训练方法或对象识别方法。存储器720可以是易失性存储器 或非易失性存储器,或可包括易失性和非易失性存储器两者。非易失性存储器可以是只读存储器(ROM)、可编程只读存储器(PROM)、可擦除可编程只读存储器(EPROM)、电可擦除可编程只读存储器(EEPROM)或闪存。易失性存储器可以是随机存取存储器(RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、同步动态随机存取存储器(SDRAM)、双倍数据速率同步动态随机存取存储器(DDRSDRAM)、增强型同步动态随机存取存储器(ESDRAM)、同步连接动态随机存取存储器(SLDRAM)和直接内存总线随机存取存储器(DR RAM)。应注意,本文描述的方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
本公开还提出了一种计算机可读存储介质,该计算机可读存储介质上存储有指令,当指令被处理器执行时,实现如上所述的对象识别模型的训练方法或对象识别方法。
类似地,本公开实施例中的计算机可读存储介质可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。应注意,本文描述的计算机可读存储介质旨在包括但不限于这些和任意其它适合类型的存储器。
本公开进一步提出了一种计算机程序产品,该计算机程序产品可以包括指令,当指令被处理器执行时,实现如上所述的对象识别模型的训练方法或对象识别方法。
指令可以是将由一个或多个处理器直接地执行的任何指令集,诸如机器代码,或者间接地执行的任何指令集,诸如脚本。本文中的术语“指令”、“应用”、“过程”、“步骤”和“程序”在本文中可以互换使用。指令可以存储为目标代码格式以便由一个或多个处理器直接处理,或者存储为任何其他计算机语言,包括按需解释或提前编译的独立源代码模块的脚本或集合。指令可以包括引起诸如一个或多个处理器来充当本文中的各神经网络的指令。本文 其他部分更加详细地解释了指令的功能、方法和例程。
需要说明的是,附图中的流程图和框图,图示了按照本公开各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,所述模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
一般而言,本公开的各种示例实施例可以在硬件或专用电路、软件、固件、逻辑,或其任何组合中实施。某些方面可以在硬件中实施,而其他方面可以在可以由控制器、微处理器或其他计算设备执行的固件或软件中实施。当本公开的实施例的各方面被图示或描述为框图、流程图或使用某些其他图形表示时,将理解此处描述的方框、装置、系统、技术或方法可以作为非限制性的示例在硬件、软件、固件、专用电路或逻辑、通用硬件或控制器或其他计算设备,或其某些组合中实施。
在说明书及权利要求中的词语“前”、“后”、“顶”、“底”、“之上”、“之下”等,如果存在的话,用于描述性的目的而并不一定用于描述不变的相对位置。应当理解,这样使用的词语在适当的情况下是可互换的,使得在此所描述的本公开的实施例,例如,能够在与在此所示出的或另外描述的那些取向不同的其他取向上操作。
如在此所使用的,词语“示例性的”意指“用作示例、实例或说明”,而不是作为将被精确复制的“模型”。在此示例性描述的任意实现方式并不一定要被解释为比其它实现方式优选的或有利的。而且,本公开不受在上述技术领域、背景技术、发明内容或具体实施方式中所给出的任何所表述的或所暗 示的理论所限定。
如在此所使用的,词语“基本上”意指包含由设计或制造的缺陷、器件或元件的容差、环境影响和/或其它因素所致的任意微小的变化。词语“基本上”还允许由寄生效应、噪声以及可能存在于实际的实现方式中的其它实际考虑因素所致的与完美的或理想的情形之间的差异。
另外,前面的描述可能提及了被“连接”或“耦接”在一起的元件或节点或特征。如在此所使用的,除非另外明确说明,“连接”意指一个元件/节点/特征与另一种元件/节点/特征在电学上、机械上、逻辑上或以其它方式直接地连接(或者直接通信)。类似地,除非另外明确说明,“耦接”意指一个元件/节点/特征可以与另一元件/节点/特征以直接的或间接的方式在机械上、电学上、逻辑上或以其它方式连结以允许相互作用,即使这两个特征可能并没有直接连接也是如此。也就是说,“耦接”意图包含元件或其它特征的直接连结和间接连结,包括利用一个或多个中间元件的连接。
另外,仅仅为了参考的目的,还可以在本文中使用“第一”、“第二”等类似术语,并且因而并非意图限定。例如,除非上下文明确指出,否则涉及结构或元件的词语“第一”、“第二”和其它此类数字词语并没有暗示顺序或次序。
还应理解,“包括/包含”一词在本文中使用时,说明存在所指出的特征、整体、步骤、操作、单元和/或组件,但是并不排除存在或增加一个或多个其它特征、整体、步骤、操作、单元和/或组件以及/或者它们的组合。
在本公开中,术语“提供”从广义上用于涵盖获得对象的所有方式,因此“提供某对象”包括但不限于“购买”、“制备/制造”、“布置/设置”、“安装/装配”、和/或“订购”对象等。
虽然已经通过示例对本公开的一些特定实施例进行了详细说明,但是本领域的技术人员应该理解,以上示例仅是为了进行说明,而不是为了限制本公开的范围。在此公开的各实施例可以任意组合,而不脱离本公开的精神和范围。本领域的技术人员还应理解,可以对实施例进行多种修改而不脱离本公开的范围和精神。本公开的范围由所附权利要求来限定。

Claims (17)

  1. 一种对象识别模型的训练方法,其特征在于,所述训练方法包括:
    获取训练集,其中,所述训练集包括用于训练的多组输入样本和与输入样本对应的标记结果,每个输入样本包括对象图像和元数据,元数据被配置为描述相应的对象图像;
    基于神经网络,使用所述训练集来训练所述对象识别模型;以及
    当训练准确率大于或等于预设准确率时结束训练,并得到训练后的对象识别模型。
  2. 根据权利要求1所述的训练方法,其特征在于,在基于神经网络,使用所述训练集来训练所述对象识别模型之前,所述训练方法还包括:
    对元数据进行归一化处理。
  3. 根据权利要求1所述的训练方法,其特征在于,元数据包括对象图像的拍摄地理位置、拍摄时间、拍摄场景、对象部位和对象状态中的至少一者。
  4. 根据权利要求1所述的训练方法,其特征在于,所述神经网络包括分类器组件和至少一个块组件,所述至少一个块组件包括第一块组件;
    基于神经网络,使用所述训练集来训练所述对象识别模型包括:
    通过所述第一块组件根据对象图像产生第一图像特征向量;
    根据元数据产生第一嵌入特征向量;
    将第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量;以及
    通过所述分类器组件根据第一融合特征向量产生分类结果,以训练所述对象识别模型。
  5. 根据权利要求4所述的训练方法,其特征在于,根据元数据产生第一嵌入特征向量包括:
    通过至少一个全连接层根据元数据产生第一嵌入特征向量。
  6. 根据权利要求4所述的训练方法,其特征在于,根据元数据产生第一嵌入特征向量包括:
    通过至少一个全连接层和至少一个线性整流层根据元数据产生第一嵌入特征向量。
  7. 根据权利要求4所述的训练方法,其特征在于,第一图像特征向量的维度与第一嵌入特征向量的维度相同。
  8. 根据权利要求7所述的训练方法,其特征在于,将第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量包括:
    将第一图像特征向量中的每个图像向量分量分别与第一嵌入特征向量中的相应的嵌入向量分量相加,以产生第一融合特征向量中的各个融合向量分量。
  9. 根据权利要求4所述的训练方法,其特征在于,将第一图像特征向量与第一嵌入特征向量融合以产生第一融合特征向量包括:
    在预设维度上将第一图像特征向量与第一嵌入特征向量进行连接,以产生第一融合特征向量。
  10. 根据权利要求4所述的训练方法,其特征在于,所述至少一个块组件还包括与所述第一块组件串联的第二块组件;
    通过所述第一块组件根据对象图像产生第一图像特征向量包括:
    通过所述第二块组件根据对象图像产生第二图像特征向量;
    根据元数据产生第二嵌入特征向量;
    将第二图像特征向量与第二嵌入特征向量融合以产生第二融合特征向量;以及
    通过所述第一块组件根据第二融合特征向量产生第一图像特征向量。
  11. 根据权利要求4所述的训练方法,其特征在于,所述神经网络还包括预处理组件;
    基于神经网络,使用所述训练集来训练所述对象识别模型包括:
    通过所述预处理组件根据对象图像产生预处理图像,并将预处理图像作为所述至少一个块组件中的与所述预处理组件相邻的块组件的输入。
  12. 根据权利要求1所述的训练方法,其特征在于,所述训练方法还包括:
    获取测试集,其中,所述测试集包括用于测试的多组输入样本和与输入样本对应的标记结果,每个输入样本包括对象图像和元数据,元数据被配置为描述相应的对象图像,且所述测试集不同于所述训练集;
    使用所述测试集来确定训练后的对象识别模型的模型准确率;以及
    当模型准确率小于所述预设准确率时,对所述对象识别模型重新进行训练。
  13. 根据权利要求1所述的训练方法,其特征在于,神经网络包括残差神经网络。
  14. 一种对象识别方法,其特征在于,所述对象识别方法包括:
    获取待识别的对象图像和元数据,其中,元数据被配置为描述待识别的对象图像;
    使用对象识别模型根据对象图像和元数据确定识别结果,其中,所述对象识别模型是采用根据权利要求1至13中任一项所述的训练方法来训练获得的。
  15. 一种对象识别设备,其特征在于,所述对象识别设备包括存储器和处理器,所述存储器上存储有指令,当所述指令被所述处理器执行时,实现根据权利要求1至13中任一项所述的训练方法或根据权利要求14所述的对象识别方法。
  16. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储有指令,当所述指令被处理器执行时,实现根据权利要求1至13中任一项所述的训练方法或根据权利要求14所述的对象识别方法。
  17. 一种计算机程序产品,其特征在于,所述计算机程序产品包括指令,当所述指令被处理器执行时,实现根据权利要求1至13中任一项所述的训练方法或根据权利要求14所述的对象识别方法。
PCT/CN2023/090318 2022-06-01 2023-04-24 对象识别模型的训练方法、对象识别方法和对象识别设备 WO2023231644A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210616572.7 2022-06-01
CN202210616572.7A CN114998682A (zh) 2022-06-01 2022-06-01 对象识别模型的训练方法、对象识别方法和对象识别设备

Publications (1)

Publication Number Publication Date
WO2023231644A1 true WO2023231644A1 (zh) 2023-12-07

Family

ID=83030857

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/090318 WO2023231644A1 (zh) 2022-06-01 2023-04-24 对象识别模型的训练方法、对象识别方法和对象识别设备

Country Status (2)

Country Link
CN (1) CN114998682A (zh)
WO (1) WO2023231644A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114998682A (zh) * 2022-06-01 2022-09-02 杭州睿胜软件有限公司 对象识别模型的训练方法、对象识别方法和对象识别设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160234023A1 (en) * 2015-02-05 2016-08-11 Sensory, Incorporated Face-Based Authentication with Situational Adaptivity
US20190325259A1 (en) * 2018-04-12 2019-10-24 Discovery Communications, Llc Feature extraction and machine learning for automated metadata analysis
CN112016531A (zh) * 2020-10-22 2020-12-01 成都睿沿科技有限公司 模型训练方法、对象识别方法、装置、设备及存储介质
US20210192293A1 (en) * 2019-12-19 2021-06-24 Insitu, Inc. Automatic Classifier Profiles from Training Set Metadata
CN114120094A (zh) * 2021-11-19 2022-03-01 广州市云景信息科技有限公司 一种基于人工智能的水污染识别方法及系统
CN114998682A (zh) * 2022-06-01 2022-09-02 杭州睿胜软件有限公司 对象识别模型的训练方法、对象识别方法和对象识别设备


Also Published As

Publication number Publication date
CN114998682A (zh) 2022-09-02

Similar Documents

Publication Publication Date Title
Dosovitskiy et al. Discriminative unsupervised feature learning with convolutional neural networks
CN109241880B (zh) 图像处理方法、图像处理装置、计算机可读存储介质
US20220262151A1 (en) Method, apparatus, and system for recognizing text in image
WO2023231644A1 (zh) 对象识别模型的训练方法、对象识别方法和对象识别设备
CN109886275A (zh) 翻拍图像识别方法、装置、计算机设备和存储介质
CN110598765A (zh) 样本生成方法、装置、计算机设备及存储介质
WO2021159990A1 (zh) 植物花期播报方法、系统及计算机可读存储介质
CN112215255A (zh) 一种目标检测模型的训练方法、目标检测方法及终端设备
US10256829B1 (en) Production of modified image inventories
WO2020036124A1 (ja) 物体認識装置、物体認識学習装置、方法、及びプログラム
WO2023130650A1 (zh) 一种图像复原方法、装置、电子设备及存储介质
Jenicek et al. No fear of the dark: Image retrieval under varying illumination conditions
WO2022218185A1 (zh) 用于植物病症诊断的方法和植物病症诊断系统
CN110347857B (zh) 基于强化学习的遥感影像的语义标注方法
WO2023098463A1 (zh) 植物识别方法、植物识别设备和植物识别系统
CN109492589A (zh) 通过二进制特征与联合层叠结构融合的人脸识别工作方法以及智能芯片
CN111324874A (zh) 一种证件真伪识别方法及装置
WO2023060434A1 (zh) 一种基于文本的图像编辑方法和电子设备
WO2021175040A1 (zh) 视频处理方法及相关装置
Nam et al. Modelling the scene dependent imaging in cameras with a deep neural network
CN106657817A (zh) 一种应用于手机平台的自动制作相册mv的处理方法
KR20200023696A (ko) 식물 이미지 분류 방법 및 장치
CN117131222A (zh) 基于开放世界大模型的半自动化标注方法和装置
Bhattacharjee et al. Two-stream convolutional network with multi-level feature fusion for categorization of human action from videos
CN115115552B (zh) 图像矫正模型训练及图像矫正方法、装置和计算机设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23814830

Country of ref document: EP

Kind code of ref document: A1