WO2021138911A1 - Image classification method and apparatus, storage medium, and electronic device - Google Patents

Image classification method and apparatus, storage medium, and electronic device

Publication number
WO2021138911A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
fine
grained
classification
feature
Prior art date
Application number
PCT/CN2020/071502
Other languages
French (fr)
Chinese (zh)
Inventor
高洪涛
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Application filed by 深圳市欢太科技有限公司, Oppo广东移动通信有限公司
Priority to CN202080087887.6A (CN114830186A)
Priority to PCT/CN2020/071502 (WO2021138911A1)
Publication of WO2021138911A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition

Abstract

An image classification method and apparatus, a storage medium, and an electronic device. The method comprises: determining a target image requiring image classification (101); performing fine-grained classification on the target image by calling a pre-trained fine-grained classification model (102); and obtaining a fine-grained category of the target image output by the fine-grained classification model (103).

Description

Image classification method and apparatus, storage medium, and electronic device
Technical field
The embodiments of this application relate to the field of image recognition technology, and in particular to an image classification method and apparatus, a storage medium, and an electronic device.
Background
Image classification mainly comprises coarse-grained image classification and fine-grained image classification. Fine-grained image classification, also called sub-category image classification, aims to divide a coarse-grained category into finer sub-categories, for example distinguishing species of birds, models of cars, or breeds of dogs.
Summary of the invention
This application provides an image classification method and apparatus, a storage medium, and an electronic device that can perform fine-grained classification of images.
In a first aspect, this application provides an image classification method, including:
determining a target image that requires image classification;
calling a pre-trained fine-grained classification model to perform fine-grained classification on the target image, wherein the fine-grained classification model includes a feature extraction module, a feature optimization module, and a fine-grained classification module, the feature extraction module is configured to extract image features of the target image, the feature optimization module is configured to optimize the image features to obtain optimized image features, and the fine-grained classification module performs fine-grained classification on the optimized image features to obtain the fine-grained category of the target image; and
obtaining the fine-grained category of the target image output by the fine-grained classification model.
In a second aspect, this application further provides an image classification apparatus, including:
an image determination component, configured to determine a target image that requires image classification;
a model calling component, configured to call a pre-trained fine-grained classification model to perform fine-grained classification on the target image, wherein the fine-grained classification model includes a feature extraction module, a feature optimization module, and a fine-grained classification module, the feature extraction module is configured to extract image features of the target image, the feature optimization module is configured to optimize the image features to obtain optimized image features, and the fine-grained classification module performs fine-grained classification on the optimized image features to obtain the fine-grained category of the target image; and
a category acquisition component, configured to obtain the fine-grained category of the target image output by the fine-grained classification model.
In a third aspect, this application further provides a storage medium storing a computer program, wherein the computer program, when executed on a computer, causes the computer to perform:
determining a target image that requires image classification;
calling a pre-trained fine-grained classification model to perform fine-grained classification on the target image, wherein the fine-grained classification model includes a feature extraction module, a feature optimization module, and a fine-grained classification module, the feature extraction module is configured to extract image features of the target image, the feature optimization module is configured to optimize the image features to obtain optimized image features, and the fine-grained classification module performs fine-grained classification on the optimized image features to obtain the fine-grained category of the target image; and
obtaining the fine-grained category of the target image output by the fine-grained classification model.
In a fourth aspect, this application further provides an electronic device including a memory and a processor, wherein the memory stores a computer program, and the processor, by calling the computer program stored in the memory, is configured to perform:
determining a target image that requires image classification;
calling a pre-trained fine-grained classification model to perform fine-grained classification on the target image, wherein the fine-grained classification model includes a feature extraction module, a feature optimization module, and a fine-grained classification module, the feature extraction module is configured to extract image features of the target image, the feature optimization module is configured to optimize the image features to obtain optimized image features, and the fine-grained classification module performs fine-grained classification on the optimized image features to obtain the fine-grained category of the target image; and
obtaining the fine-grained category of the target image output by the fine-grained classification model.
Description of the drawings
The technical solutions of this application and their beneficial effects will become apparent from the following detailed description of specific implementations, taken in conjunction with the accompanying drawings.
FIG. 1 is a schematic flowchart of the image classification method provided by an embodiment of this application.
FIG. 2 is an example diagram of triggering image classification in an embodiment of this application.
FIG. 3 is an example diagram of the image classification interface provided in an embodiment of this application.
FIG. 4 is an example diagram of the selection sub-interface provided in an embodiment of this application.
FIG. 5 is a schematic diagram of one architecture of the fine-grained classification model provided by an embodiment of this application.
FIG. 6 is an example diagram of target images provided by an embodiment of this application.
FIG. 7 is a schematic diagram of another architecture of the fine-grained classification model provided by an embodiment of this application.
FIG. 8 is a schematic diagram of an architecture of the machine learning network provided by an embodiment of this application.
FIG. 9 is another schematic flowchart of the image classification method provided by an embodiment of this application.
FIG. 10 is a schematic structural diagram of the image classification apparatus provided by an embodiment of this application.
FIG. 11 is a schematic structural diagram of the electronic device provided by an embodiment of this application.
Detailed description
Referring to the drawings, in which the same reference symbols denote the same components, the principles of this application are illustrated as implemented in a suitable computing environment. The following description is based on the illustrated specific embodiments of this application and should not be regarded as limiting other specific embodiments not detailed herein.
Artificial intelligence (AI) comprises the theories, methods, technologies, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that seeks to understand the essence of intelligence and to produce a new kind of intelligent machine that responds in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that machines can perceive, reason, and make decisions.
Artificial intelligence is a comprehensive discipline covering a wide range of fields, with both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine learning (ML) is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in every area of artificial intelligence. Machine learning and deep learning usually include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning.
The technical solutions provided by the embodiments of this application involve the machine learning technology of artificial intelligence and are described through the following embodiments:
The embodiments of this application provide an image classification method, an image classification apparatus, a storage medium, and an electronic device. The image classification method may be executed by the image classification apparatus provided in the embodiments of this application, or by an electronic device that integrates the image classification apparatus, where the apparatus may be implemented in hardware or software. The electronic device may be any device that has processing capability by virtue of a processor (including but not limited to a general-purpose processor or a customized processor), such as a smartphone, tablet computer, palmtop computer, notebook computer, or desktop computer.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of the image classification method provided by an embodiment of this application. The flow of the image classification method may include:
In 101, a target image that requires image classification is determined.
In the embodiments of this application, the electronic device may determine the target image on a preset image classification cycle according to a preset image selection rule, or, upon receiving an image classification instruction input by the user, determine the target image according to that instruction, and so on.
It should be noted that the embodiments of this application do not specifically limit how the image classification cycle, the image selection rule, and the image classification instruction are set. They may be set by the electronic device according to user input, or set by default by the manufacturer of the electronic device, and so on.
For example, suppose the image classification cycle is preconfigured as a natural week starting on Monday, and the image selection rule is configured as "select captured images for image classification". The electronic device then automatically triggers image classification every Monday and determines the captured images as the target images that require image classification.
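The weekly trigger in this example can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names `week_start` and `classification_due` and the idea of remembering the time of the last run are assumptions introduced here.

```python
from datetime import datetime, timedelta
from typing import Optional


def week_start(moment: datetime) -> datetime:
    """Return 00:00 on the Monday of the natural week containing `moment`."""
    monday = moment - timedelta(days=moment.weekday())  # weekday(): Monday == 0
    return monday.replace(hour=0, minute=0, second=0, microsecond=0)


def classification_due(now: datetime, last_run: Optional[datetime]) -> bool:
    """True when no classification has run yet in the current natural week."""
    return last_run is None or last_run < week_start(now)
```

A device that calls `classification_due` whenever it wakes would start one classification pass per week, on or after Monday, even if it was powered off at midnight on Monday.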
As another example, referring to FIG. 2, the electronic device provides, in an image browsing interface, a "classify" control for triggering image classification. The rectangles in the figure represent different images, and the circular box in each rectangle represents a "select" control for selecting the corresponding image. The user can tap the select control of an image to select that image, and tap it again to deselect the image. As shown in FIG. 2, after selecting the images to be classified, the user taps the classify control to input an image classification instruction to the electronic device, where the instruction carries indication information identifying the images selected by the user. Accordingly, the electronic device determines the images selected by the user as the target images that require image classification, based on the indication information in the instruction.
As another example, the electronic device may receive an image classification instruction through an image classification interface that includes a request input interface. As shown in FIG. 3, the request input interface may take the form of an input box. The user can type the identification information of the image to be classified into the input box and enter a confirmation (for example, by pressing the Enter key) to input the image classification instruction, which carries the identification information of the image to be classified. Accordingly, the electronic device determines the target image according to the identification information in the received instruction.
As another example, the image classification interface shown in FIG. 3 also includes an "open" control. When the electronic device detects that the open control is triggered, it displays a selection sub-interface superimposed on the image classification interface (as shown in FIG. 4). The selection sub-interface presents thumbnails of the images available for classification, such as image A, image B, image C, image D, image E, and image F, so that the user can find and select the thumbnails of the images to be classified. After selecting the thumbnails, the user can trigger the confirmation control provided by the selection sub-interface to input an image classification instruction to the electronic device. The instruction is associated with the thumbnails selected by the user and instructs the electronic device to use the selected images as the target images that require image classification.
In addition, a person of ordinary skill in the art may configure other specific ways of inputting the image classification instruction according to actual needs, and the present invention does not specifically limit this.
In 102, a pre-trained fine-grained classification model is called to perform fine-grained classification on the target image, wherein the fine-grained classification model includes a feature extraction module, a feature optimization module, and a fine-grained classification module, the feature extraction module is configured to extract image features of the target image, the feature optimization module is configured to optimize the image features to obtain optimized image features, and the fine-grained classification module performs fine-grained classification on the optimized image features to obtain the fine-grained category of the target image.
It should be noted that, in this application, a fine-grained classification model is trained in advance using a machine learning method, and the model is configured to perform fine-grained classification of images.
Referring to FIG. 5, the fine-grained classification model provided by this application consists of three parts: a feature extraction module for extracting features, a feature optimization module for optimizing the features, and a fine-grained classification module for performing fine-grained classification based on the features.
The electronic device first inputs the target image into the feature extraction module, which performs feature extraction on the target image to obtain its image features.
Exemplarily, the feature extraction module includes N convolutional layers. When extracting features from the target image, the electronic device first performs a convolution on the target image with the first convolutional layer to obtain a feature map, denoted the first-layer feature map. It then performs a convolution on the first-layer feature map with the second convolutional layer to obtain a new feature map, denoted the second-layer feature map, and so on, until the N-th convolutional layer produces the N-th-layer feature map. The N-th-layer feature map is used as the image features extracted from the target image.
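The layer-by-layer extraction described above can be sketched in plain Python. This is an illustrative simplification under stated assumptions: a single-channel image, one kernel per layer, no padding, stride 1, and no activation function. The names `conv2d` and `extract_features` are hypothetical and not taken from the application.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2-D cross-correlation of one channel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def extract_features(image, kernels):
    """Apply the N layers in order; layer k consumes the map from layer k-1."""
    feature_map = image
    for kernel in kernels:
        feature_map = conv2d(feature_map, kernel)
    return feature_map  # the N-th-layer feature map


# Two layers (N = 2) with 2x2 all-ones kernels on a 5x5 all-ones image.
image = [[1.0] * 5 for _ in range(5)]
kernels = [[[1.0, 1.0], [1.0, 1.0]]] * 2
features = extract_features(image, kernels)  # 3x3 map, every entry 16.0
```

A trained model would use learned multi-channel kernels; the sketch only shows how each layer's output becomes the next layer's input.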
It should be noted that this application does not specifically limit the value of N, that is, the number of convolutional layers in the feature extraction module. It can be set by a person of ordinary skill in the art according to actual needs. For example, N is 9 in the embodiments of this application.
After extracting the image features of the target image with the feature extraction module, the electronic device uses the feature optimization module to optimize the image features according to a preset optimization strategy, obtaining the optimized image features.
After obtaining the optimized image features, the electronic device uses the fine-grained classification module to perform fine-grained classification on them and obtain the fine-grained category of the target image. The fine-grained classification module may be a fully connected layer.
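A fully connected classification layer of the kind mentioned here can be sketched as below. The softmax normalization, the argmax decision, and all names (`fully_connected`, `softmax`, `classify`) are assumptions for illustration; the application only states that the module may be a fully connected layer.

```python
import math


def fully_connected(features, weights, biases):
    """One logit per category: dot(weights[k], features) + biases[k]."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(logits):
    """Numerically stable softmax (assumed here, not stated in the source)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases, labels):
    """Return the most probable fine-grained label and its probability."""
    probabilities = softmax(fully_connected(features, weights, biases))
    best = max(range(len(labels)), key=probabilities.__getitem__)
    return labels[best], probabilities[best]
```

With hypothetical weights `[[2.0, 0.0], [0.0, 2.0]]`, zero biases, and labels `["corgi", "husky"]`, the feature vector `[1.0, 0.0]` yields the label `"corgi"`.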
For example, referring to FIG. 6, when the image on the left of FIG. 6 is determined as the target image, its coarse-grained category is "dog". Performing fine-grained classification on the corresponding optimized image features with the fine-grained classification module yields the fine-grained category "Corgi", indicating that the dog in the target image is a Corgi. When the image on the right of FIG. 6 is determined as the target image, its coarse-grained category is likewise "dog", and fine-grained classification of the corresponding optimized image features yields the fine-grained category "Husky", indicating that the dog in the target image is a Husky.
In 103, the fine-grained category of the target image output by the fine-grained classification model is obtained.
In the embodiments of this application, after the fine-grained classification model completes the fine-grained classification of the target image, the electronic device obtains the resulting fine-grained category from the fine-grained classification module of the model.
As can be seen from the above, this application determines a target image that requires image classification and calls a pre-trained fine-grained classification model to perform fine-grained classification on it. The model includes a feature extraction module that extracts image features of the target image, a feature optimization module that optimizes the image features to obtain optimized image features, and a fine-grained classification module that performs fine-grained classification on the optimized image features to obtain the fine-grained category of the target image. The fine-grained category output by the model is then obtained. This application therefore requires no manual fine-grained classification and can classify images at a fine-grained level efficiently.
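The three-module structure summarized above can be sketched as a simple composition. The class name `FineGrainedClassificationModel` and the use of plain callables for the modules are assumptions made for illustration; the stand-in modules in the usage lines carry no meaning beyond showing the data flow.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FineGrainedClassificationModel:
    """Chains the three modules; each is any callable with a matching shape."""
    extract: Callable[[Any], Any]    # feature extraction module
    optimize: Callable[[Any], Any]   # feature optimization module
    classify: Callable[[Any], str]   # fine-grained classification module

    def __call__(self, target_image: Any) -> str:
        features = self.extract(target_image)
        optimized = self.optimize(features)
        return self.classify(optimized)


# Toy stand-ins for the three modules, only to show the order of calls.
model = FineGrainedClassificationModel(
    extract=lambda img: sum(img),
    optimize=lambda f: f * 2,
    classify=lambda f: "corgi" if f > 10 else "husky",
)
category = model([1, 2, 3])  # steps 102 and 103: call the model, read its output
```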
In an embodiment, determining the target image that requires image classification includes:
when an image classification cycle is reached, determining the images newly added within the image classification cycle as the target images.
In the embodiments of this application, reaching an image classification cycle triggers the electronic device to determine the target images that require image classification. The electronic device may directly use the images newly added within that cycle as the target images. For example, if 20 images were added to the electronic device within one image classification cycle, the electronic device uses these 20 images as the target images that require image classification.
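Selecting the images added within one cycle can be sketched as a timestamp filter. The representation of each image as a `(name, added_at)` pair and the half-open interval are assumptions introduced here, not details from the application.

```python
from datetime import datetime


def new_images_in_cycle(images, cycle_start: datetime, cycle_end: datetime):
    """Return names of images added in [cycle_start, cycle_end).

    `images` is an iterable of (name, added_at) pairs.
    """
    return [
        name for name, added_at in images
        if cycle_start <= added_at < cycle_end
    ]
```

Using a half-open interval keeps an image added exactly at a cycle boundary from being classified in two consecutive cycles.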
In an implementation, "determining the target image that requires image classification" includes:
(1) determining the images under a preset storage path as the target images; or
(2) determining the images in a preset image format as the target images; or
(3) determining the images in the preset image format under the preset storage path as the target images.
The embodiments of this application do not specifically limit how the preset storage path and the preset image format are set. They may be set by the electronic device according to user input, or set by default by the manufacturer of the electronic device. It should be noted that one or more preset storage paths may be configured, and likewise one or more preset image formats.
For example, if the user needs the electronic device to classify captured images, the preset storage path can be configured as the path where the electronic device stores captured images. Exemplarily, if the electronic device runs Android, the preset storage path is configured as "/storage/0/DCIM", so that the electronic device determines all images in the directory "DCIM" corresponding to /storage/0/DCIM as the target images that require image classification.
As another example, if the user needs the electronic device to classify images in a certain image format, the preset image format can be configured as the format specified by the user. Exemplarily, if the user needs the electronic device to classify images in "JPG" format, the preset image format is configured as "JPG", so that the electronic device determines all local images in "JPG" format as the target images that require image classification.
As another example, if the user needs the electronic device to classify captured images in a certain image format, the preset storage path can be configured as the path where the electronic device stores captured images, and the preset image format as the format specified by the user. Exemplarily, if the electronic device runs Android, the preset storage path is configured as "/storage/0/DCIM", and if the user needs the captured images in "JPG" format to be classified, the preset image format is configured as "JPG". The electronic device then determines all images in "JPG" format in the directory "DCIM" corresponding to /storage/0/DCIM as the target images that require image classification.
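The three selection rules above (by path, by format, or by both) can be sketched with the standard `pathlib` module. The function name `select_target_images`, the recursive scan, and the fallback set `KNOWN_IMAGE_SUFFIXES` are assumptions for illustration; the application does not say how a device decides that a file is an image.

```python
from pathlib import Path

# Assumed fallback for "any image" when no preset format is configured.
KNOWN_IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def select_target_images(directories, formats=None):
    """Return image files under `directories`, optionally limited to `formats`.

    directories: one or more preset storage paths (or a device-wide root
                 when only a preset format is configured).
    formats:     preset image formats such as ["JPG"]; None means any
                 suffix in KNOWN_IMAGE_SUFFIXES.
    """
    if formats:
        suffixes = {"." + f.lower().lstrip(".") for f in formats}
    else:
        suffixes = KNOWN_IMAGE_SUFFIXES
    selected = []
    for directory in directories:
        for path in sorted(Path(directory).rglob("*")):
            if path.is_file() and path.suffix.lower() in suffixes:
                selected.append(path)
    return selected
```

Calling `select_target_images(["/storage/0/DCIM"], ["JPG"])` would correspond to rule (3) with the path and format from the example, assuming that directory exists on the device.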
In an embodiment, the fine-grained classification model further includes a dimensionality reduction module, configured to perform feature dimensionality reduction on the optimized image features to obtain dimension-reduced optimized image features;
the fine-grained classification module is further configured to perform classification prediction on the dimension-reduced optimized image features to obtain the fine-grained category of the target image.
Referring to FIG. 7, the fine-grained classification model provided by the present application further includes a dimensionality reduction module, which may be a pooling layer. As shown in FIG. 7, one end of the dimensionality reduction module is connected to the feature optimization module, and the other end is connected to the fine-grained classification module.
In the embodiments of the present application, after the electronic device uses the feature optimization module to optimize the image features extracted by the feature extraction module, it does not perform fine-grained classification directly on the resulting optimized image features. Instead, it inputs the optimized image features into the dimensionality reduction module, which performs feature dimensionality reduction to obtain dimension-reduced optimized image features. The electronic device then inputs the dimension-reduced optimized image features into the fine-grained classification module for fine-grained classification, and correspondingly obtains the fine-grained category of the target image.
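As an illustration of a pooling layer acting as the dimensionality reduction module, the sketch below applies global average pooling to a feature given as a list of two-dimensional channels. The text only states that the module may be a pooling layer; the choice of global average pooling and the nested-list layout are assumptions.

```python
def global_average_pool(feature_map):
    """Collapse each channel (a 2-D list of numbers) to its mean value.

    feature_map is assumed to be a list of channels, each a list of rows.
    The result has one number per channel, so an input of C x H x W values
    is reduced to C values.
    """
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled
```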
In an embodiment, after the fine-grained category of the target image output by the fine-grained classification model is obtained, the method further includes:
allocating a storage path to the target image according to the fine-grained category, and storing the target image in that storage path.
In the embodiments of the present application, to make it easier for the user to browse images, the electronic device also stores the target images by category, according to the fine-grained categories obtained from the fine-grained classification.
The electronic device can allocate one storage path to each fine-grained category and store the corresponding target images in the allocated storage path. For example, if the target images are classified into nine categories, the electronic device allocates nine different storage paths, each used to store the target images of the corresponding category.
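Category-based storage can be sketched as below, assuming that each fine-grained category maps to a subdirectory of a common library directory. The function name store_by_category and the library_root parameter are hypothetical, and handling of file-name collisions is omitted.

```python
import shutil
from pathlib import Path


def store_by_category(image_path, category, library_root):
    """Move image_path into library_root/<category>/ and return the new path."""
    target_dir = Path(library_root) / category
    # One directory per fine-grained category, created on first use.
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / Path(image_path).name
    shutil.move(str(image_path), str(destination))
    return destination
```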
In an embodiment, after the target image is stored in the allocated storage path, the method further includes:
for the target images in each storage path, obtaining browsing behavior data of the user browsing each target image, and obtaining the creation duration of each target image;
performing a weighted summation of the browsing behavior data and the creation duration of each target image to obtain a weighted sum value for each target image;
sorting the target images according to their weighted sum values.
In the embodiments of the present application, after the target images are classified, the target images of each category (that is, the target images under each storage path) are also sorted.
The browsing behavior data includes data describing the user's browsing behavior, for example the number of times the user has browsed the target image, and the opening moment and closing moment of each browse.
In addition to the browsing behavior data of the user browsing each target image, the electronic device obtains the creation duration of each target image. The creation duration is the difference between the current moment and the moment at which the target image was generated.
It should be noted that the current moment does not refer to one particular moment; it refers to the moment at which the electronic device performs the operation of "obtaining the creation duration of each target image". In addition, the embodiments of the present application do not specifically limit how a target image is generated. For example, if a target image is generated by the electronic device through shooting, its generation moment is the moment at which the electronic device captured it; if a target image is obtained by the electronic device through downloading from the Internet, its generation moment is the moment at which the download completed, and so on.
In the embodiments of the present application, after obtaining the browsing behavior data and the creation duration of each target image, the electronic device performs a weighted summation of them according to a preset weighted summation algorithm to obtain the weighted sum value of each target image.
The browsing behavior data reflects characteristics of the user's browsing behavior, whereas the creation duration is a characteristic of the image itself. The purpose of the weighted summation is to evaluate each target image comprehensively by combining its own characteristics with user characteristics external to the image. The weighted sum value is thus the "score" of this comprehensive evaluation, and its magnitude reflects how likely the target image is to be browsed by the user.
In the embodiments of the present application, after obtaining the weighted sum value of each target image, the electronic device sorts the target images in descending order of weighted sum value.
In an embodiment, performing a weighted summation of the browsing behavior data and the creation duration of each target image to obtain the weighted sum value of each target image includes:
obtaining, from the browsing behavior data of each target image, the number of times the target image has been browsed and the browsing duration of each browse;
obtaining the average browsing duration of each target image from its number of browses and the browsing duration of each browse;
normalizing the number of browses, the average browsing duration, and the creation duration of each target image;
performing a weighted summation of the normalized number of browses, average browsing duration, and creation duration of each target image to obtain the weighted sum value of each target image.
In the embodiments of the present application, whenever a target image is browsed by the user, the electronic device records the corresponding browsing behavior data, which includes but is not limited to the number of times the user has browsed the target image and the opening moment and closing moment of each browse.
Accordingly, when performing the weighted summation, the electronic device can extract the number of browses of each target image (that is, the number of times the user has browsed it) directly from its browsing behavior data, and can obtain the browsing duration of each browse from the recorded opening moment and closing moment of that browse.
After obtaining the number of browses and the browsing duration of each browse, the electronic device calculates the average browsing duration of each target image. A person of ordinary skill in the art will understand that the average browsing duration here is the average for a single target image, not an average over multiple target images.
In addition, in the embodiments of the present application, the number of browses, the average browsing duration, and the creation duration are each assigned a weight in advance. The values of these weights are not specifically limited and can be set by a person of ordinary skill in the art according to actual needs. For example, the weight of the number of browses can be set to 0.3, the weight of the average browsing duration to 0.2, and the weight of the creation duration to 0.5.
To improve the efficiency of the weighted summation, the electronic device first normalizes the number of browses, the average browsing duration, and the creation duration of each target image, mapping the three quantities into the same numerical interval.
The electronic device then performs a weighted summation of the normalized number of browses, average browsing duration, and creation duration of each target image according to the preset weighted summation algorithm to obtain the weighted sum value of each target image.
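The scoring and sorting described above can be sketched as follows. The weights 0.3, 0.2, and 0.5 are the example values given in the text; the use of min-max normalization, the record layout, and the names min_max and rank_images are assumptions made for illustration, since the text does not fix the normalization method.

```python
def min_max(values):
    """Scale a list of numbers into [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rank_images(records, weights=(0.3, 0.2, 0.5)):
    """Return (name, score) pairs sorted by weighted sum, largest first.

    Each record is a dict with hypothetical keys:
      "name":     image identifier
      "sessions": list of (open_moment, close_moment) pairs in seconds
      "age":      creation duration (current moment minus generation moment)
    """
    if not records:
        return []
    counts = [len(r["sessions"]) for r in records]
    # Average browsing duration of a single image; 0.0 if never browsed.
    averages = [
        sum(close - opened for opened, close in r["sessions"]) / len(r["sessions"])
        if r["sessions"] else 0.0
        for r in records
    ]
    ages = [r["age"] for r in records]
    # Normalize each quantity across images so all three share one interval.
    columns = [min_max(counts), min_max(averages), min_max(ages)]
    scored = []
    for i, r in enumerate(records):
        score = sum(w * col[i] for w, col in zip(weights, columns))
        scored.append((r["name"], score))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With three images whose normalized triples are (1, 1, 0), (0.5, 0.4, 1), and (0, 0, 0.5), the weighted sums are 0.5, 0.73, and 0.25, so the second image is listed first.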
In an embodiment, the image features include a feature map, and the feature optimization module is configured to transpose the feature map to obtain a transposed feature map, perform matrix multiplication on the feature map and the transposed feature map, and use the result of the matrix multiplication as the optimized image features.
Taking the case where the image features are a feature map as an example, the present application provides a way of optimizing the image features.
The electronic device first uses the feature optimization module to transpose the extracted image features, that is, the feature map of the target image, to obtain a transposed feature map. The electronic device then uses the feature optimization module to perform matrix multiplication on the original feature map and the transposed feature map, and uses the result of the matrix multiplication as the optimized image features for fine-grained classification.
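The transpose-and-multiply step can be sketched in plain Python as below. The sketch assumes that the feature map has already been arranged as a two-dimensional matrix (for example, one row per channel and one column per spatial position); that layout and the function names are assumptions, as the text does not specify them.

```python
def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both lists of rows."""
    b_columns = transpose(b)
    return [
        [sum(x * y for x, y in zip(row, column)) for column in b_columns]
        for row in a
    ]


def optimize_feature(feature_map):
    """Multiply the feature map by its own transpose.

    For an m x k input the result is an m x m symmetric matrix whose entry
    (i, j) is the inner product of rows i and j of the input.
    """
    return matmul(feature_map, transpose(feature_map))
```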
In an embodiment, before the target image requiring image classification is determined, the method further includes:
(1) obtaining a plurality of sample images, and obtaining the fine-grained category label and the coarse-grained category label of each sample image;
(2) constructing a machine learning network that includes a first branch network and a second branch network of identical structure and a binary classification module, where the first branch network includes a feature extraction module, a feature optimization module, and a fine-grained classification module, and the binary classification module is connected to the feature extraction module of the first branch network and the feature extraction module of the second branch network;
(3) selecting a first sample image from the plurality of sample images, extracting the image features of the first sample image with the feature extraction module of the first branch network, optimizing them with the feature optimization module of the first branch network, and inputting them into the fine-grained classification module of the first branch network for fine-grained classification to obtain a first predicted fine-grained category;
(4) selecting a second sample image from the plurality of sample images, extracting the image features of the second sample image with the feature extraction module of the second branch network, optimizing them with the feature optimization module of the second branch network, and inputting them into the fine-grained classification module of the second branch network for fine-grained classification to obtain a second predicted fine-grained category;
(5) fusing the image features of the first sample image and the image features of the second sample image to obtain fused image features, and using the binary classification module to predict whether the coarse-grained categories of the first sample image and the second sample image are the same, to obtain a prediction result;
(6) obtaining a first classification loss of the first branch network from the first predicted fine-grained category and the fine-grained category label of the first sample image, obtaining a second classification loss of the second branch network from the second predicted fine-grained category and the fine-grained category label of the second sample image, and obtaining a third classification loss from the prediction result and the coarse-grained categories of the first sample image and the second sample image;
(7) obtaining a total loss from the first classification loss, the second classification loss, and the third classification loss, and adjusting the parameters of the first branch network and the second branch network according to the total loss until a preset training stop condition is met, at which point training ends;
(8) selecting one of the first branch network and the second branch network as the fine-grained classification model.
The present application also provides an optional way of training the fine-grained classification model.
The electronic device first obtains a plurality of sample images, and obtains the fine-grained category label and the coarse-grained category label of each sample image. The fine-grained category label describes the fine-grained category of the sample image, and the coarse-grained category label describes its coarse-grained category.
It should be noted that the present application does not specifically limit the manner of obtaining sample images or their number, which can be configured by a person of ordinary skill in the art according to actual needs. For example, the electronic device can crawl images from the Internet as sample images and receive manually annotated fine-grained category labels and coarse-grained category labels for them.
The electronic device also constructs a machine learning network. Referring to FIG. 8, the constructed machine learning network includes two branch networks of identical structure and a binary classification module. Each branch network includes a feature extraction module, a feature optimization module, and a fine-grained classification module. For ease of distinction, one branch network is referred to as the first branch network and the other as the second branch network. The binary classification module is connected to the feature extraction module of the first branch network and the feature extraction module of the second branch network.
After the machine learning network has been constructed, the electronic device trains it with the obtained sample images.
The electronic device selects two sample images from the obtained sample images, referring to one as the first sample image and the other as the second sample image. For the first sample image, the electronic device extracts its image features with the feature extraction module of the first branch network, optimizes them with the feature optimization module of the first branch network, and inputs them into the fine-grained classification module of the first branch network for fine-grained classification, obtaining the first predicted fine-grained category. For the second sample image, the electronic device extracts its image features with the feature extraction module of the second branch network, optimizes them with the feature optimization module of the second branch network, and inputs them into the fine-grained classification module of the second branch network for fine-grained classification, obtaining the second predicted fine-grained category.
The electronic device also fuses the image features of the first sample image and the image features of the second sample image to obtain fused image features, and uses the binary classification module to predict, from the fused image features, whether the coarse-grained categories of the first sample image and the second sample image are the same, obtaining the prediction result.
The electronic device further obtains the first classification loss of the first branch network from the first predicted fine-grained category and the fine-grained category label of the first sample image, obtains the second classification loss of the second branch network from the second predicted fine-grained category and the fine-grained category label of the second sample image, and obtains the third classification loss from the prediction result and the coarse-grained categories of the first sample image and the second sample image. The loss functions used to obtain the first, second, and third classification losses can be configured by a person of ordinary skill in the art according to actual needs, and the present application does not specifically limit them.
The electronic device then obtains the total loss from the first classification loss, the second classification loss, and the third classification loss, which can be expressed as:
L_total = Loss1 + Loss2 + Loss3;
where L_total denotes the total loss, Loss1 the first classification loss, Loss2 the second classification loss, and Loss3 the third classification loss.
After obtaining the total loss, the electronic device adjusts the parameters of the first branch network and the second branch network according to it. It should be noted that the goal of model training is to minimize the total loss; therefore, each time the total loss is determined, the parameters of the first branch network and the second branch network are adjusted in the direction that minimizes it.
The parameters of the first branch network and the second branch network are adjusted repeatedly in this way, and training ends when the preset training stop condition is met. The preset training stop condition can be set by a person of ordinary skill in the art according to actual needs, and the embodiments of the present application do not specifically limit it.
For example, the preset training stop condition is configured as: stop training when the total loss reaches its minimum value;
for another example, the preset training stop condition is configured as: stop training when the number of parameter iterations reaches a preset number.
When the preset training stop condition is met, the electronic device determines that both the first branch network and the second branch network in the machine learning network can classify images accurately at a fine-grained level, and selects one of them as the fine-grained classification model used for fine-grained classification of images.
It should be noted that the present application does not specifically limit how the fine-grained classification model is selected from the first branch network and the second branch network. For example, in the embodiments of the present application, the electronic device can select one of the two branch networks at random as the fine-grained classification model.
In an embodiment, the third classification loss is obtained according to the following formula:
Loss3 = -[η*y*log(p) + (1-y)*log(1-p)];
where Loss3 denotes the third classification loss; η denotes a correction coefficient (an empirical value, taken between 0.3 and 0.5 in the present application); y indicates whether the coarse-grained category label of the first sample image and the coarse-grained category label of the second sample image are the same, and its value can be chosen by a person of ordinary skill in the art according to actual needs; and p denotes the prediction result.
In the embodiments of the present application, the purpose of introducing the correction coefficient η is to reduce the contribution of y*log(p) to the total loss when the coarse-grained category labels of the first sample image and the second sample image are different, so that the first branch network and the second branch network learn that the first sample image and the second sample image are similar in some dimension.
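The formula for Loss3 can be evaluated as in the sketch below. The default η of 0.4 is one value inside the 0.3 to 0.5 range mentioned in the text; treating y as a 0/1 indicator and p as a probability in (0, 1), and clamping p with a small eps to avoid log(0), are assumptions added for numerical safety and are not part of the formula in the application.

```python
import math


def third_classification_loss(p, y, eta=0.4, eps=1e-12):
    """Compute Loss3 = -[eta*y*log(p) + (1-y)*log(1-p)].

    p:   prediction of the binary classification module, assumed in (0, 1)
    y:   0/1 indicator derived from the two coarse-grained category labels
         (which value encodes "same" is left open by the text)
    eta: correction coefficient scaling the y*log(p) term
    eps: hypothetical clamp that keeps log() away from zero
    """
    p = min(max(p, eps), 1.0 - eps)
    return -(eta * y * math.log(p) + (1 - y) * math.log(1.0 - p))
```

For p = 0.5, the loss is η*ln 2 when y = 1 and ln 2 when y = 0, which shows how η scales down only the y*log(p) term.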
In an embodiment, selecting one of the first branch network and the second branch network as the fine-grained classification model includes:
obtaining the classification accuracy of the first branch network and the classification accuracy of the second branch network;
selecting, from the first branch network and the second branch network, the branch network with the higher classification accuracy as the fine-grained classification model.
The present application thus provides a way of selecting the fine-grained classification model: the electronic device obtains the classification accuracy of the first branch network and of the second branch network, and then selects the branch network with the higher classification accuracy as the fine-grained classification model.
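Selecting the more accurate branch can be sketched as follows. The text does not say how classification accuracy is measured; this sketch assumes a labeled evaluation set and models each branch as a callable that maps an image to a predicted fine-grained category. Both function names are hypothetical, and a tie is resolved in favor of the first branch.

```python
def accuracy(predict, samples):
    """Fraction of (image, label) pairs for which predict(image) == label."""
    correct = sum(1 for image, label in samples if predict(image) == label)
    return correct / len(samples)


def select_branch(first_branch, second_branch, samples):
    """Return the branch with the higher accuracy on samples."""
    if accuracy(first_branch, samples) >= accuracy(second_branch, samples):
        return first_branch
    return second_branch
```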
In an embodiment, fusing the image features of the first sample image and the image features of the second sample image to obtain the fused image features includes:
performing channel merging on the image features of the first sample image and the image features of the second sample image, and using the result of the channel merging as the fused image features.
For example, the electronic device can merge the channels of the image features of the first sample image and the image features of the second sample image by concatenation (Concat), and use the result as the fused image features.
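Channel merging by concatenation can be sketched as below, assuming that each image feature is a list of channels and each channel is a list of rows. The representation and the function name concat_channels are assumptions.

```python
def concat_channels(features_a, features_b):
    """Concatenate two feature maps along the channel axis.

    Each argument is a list of channels, each channel a list of rows.
    The two inputs must share height and width; the channel counts add up.
    """
    def spatial_shape(features):
        return (len(features[0]), len(features[0][0]))

    if spatial_shape(features_a) != spatial_shape(features_b):
        raise ValueError("feature maps must share height and width")
    return features_a + features_b
```

Two features with C1 and C2 channels therefore yield a fused feature with C1 + C2 channels, which is what the binary classification module receives.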
Referring to FIG. 9, the present application also provides a model training method, the flow of which may be as follows:
In 201, a plurality of sample images are obtained, together with the fine-grained category label and the coarse-grained category label of each sample image.
The electronic device first obtains a plurality of sample images, and obtains the fine-grained category label and the coarse-grained category label of each sample image. The fine-grained category label describes the fine-grained category of the sample image, and the coarse-grained category label describes its coarse-grained category.
It should be noted that the present application does not specifically limit the manner of obtaining sample images or their number, which can be configured by a person of ordinary skill in the art according to actual needs. For example, referring to FIG. 10, some sample images can be obtained in advance from the ImageNet dataset and stored in the electronic device, some can be crawled from the network and stored in the electronic device, and some can be collected manually from the network.
After obtaining the sample images, the electronic device receives the manually annotated fine-grained category labels and coarse-grained category labels of the sample images. The fine-grained category label describes the fine-grained category of the sample image, and the coarse-grained category label describes its coarse-grained category.
在202中,构建机器学习网络,机器学习网络包括结构相同的第一分支网络、第二分支网络以及二分类模块,第一分支网络包括特征提取模块、特征优化模块和细粒度分类模块,二分类模块与第一分支网络的特征提取模块和第二分支网络的特征提取模块连接。In 202, a machine learning network is constructed. The machine learning network includes a first branch network, a second branch network, and two classification modules with the same structure. The first branch network includes a feature extraction module, a feature optimization module, and a fine-grained classification module. The module is connected with the feature extraction module of the first branch network and the feature extraction module of the second branch network.
请参照图8,构建的机器学习网络包括结构相同的两个分支网络和二分类模块,每一分支网络均包 括特征提取模块、特征优化模块和细粒度分类模块,为便于区分,将其中一分支网络记为第一分支网络,将另一分支网络记为第二分支网络。此外,二分类模块与第一分支网络的特征提取模块和第二分支网络的特征提取模块连接。Please refer to Figure 8. The constructed machine learning network includes two branch networks with the same structure and two classification modules. Each branch network includes a feature extraction module, a feature optimization module, and a fine-grained classification module. The network is recorded as the first branch network, and the other branch network is recorded as the second branch network. In addition, the two-classification module is connected to the feature extraction module of the first branch network and the feature extraction module of the second branch network.
In 203, a first sample image is selected from the plurality of sample images, image features of the first sample image are extracted by the feature extraction module of the first branch network, and the image features, after being optimized by the feature optimization module of the first branch network, are input into the fine-grained classification module of the first branch network for fine-grained classification to obtain a first predicted fine-grained category.
In 204, a second sample image is selected from the plurality of sample images, image features of the second sample image are extracted by the feature extraction module of the second branch network, and the image features, after being optimized by the feature optimization module of the second branch network, are input into the fine-grained classification module of the second branch network for fine-grained classification to obtain a second predicted fine-grained category.
Here, the electronic device selects two sample images from the acquired plurality of sample images, denoting one as the first sample image and the other as the second sample image. For the first sample image, the electronic device extracts its image features with the feature extraction module of the first branch network, optimizes them with the feature optimization module of the first branch network, and inputs the result into the fine-grained classification module of the first branch network to obtain the first predicted fine-grained category. For the second sample image, the electronic device extracts its image features with the feature extraction module of the second branch network, optimizes them with the feature optimization module of the second branch network, and inputs the result into the fine-grained classification module of the second branch network to obtain the second predicted fine-grained category.
In 205, the image features of the first sample image and the image features of the second sample image are fused to obtain fused image features, and the binary classification module predicts, from the fused image features, whether the coarse-grained categories of the first sample image and the second sample image are the same, to obtain a prediction result.
In addition, the electronic device fuses the image features of the first sample image with the image features of the second sample image to obtain fused image features, and the binary classification module predicts, from the fused image features, whether the coarse-grained categories of the first sample image and the second sample image are the same, to obtain a prediction result.
For example, the electronic device may merge the image features of the first sample image and the image features of the second sample image along the channel dimension by concatenation (Concat), and use the result of the channel merging as the fused image features.
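The channel merging described above may be sketched as follows, purely for illustration. The representation of a feature map as a nested list indexed `[channel][row][column]` and the function name `concat_channels` are assumptions of this sketch; an actual embodiment would typically use the concatenation operation of a tensor library.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # assumed layout: [channel][row][column]


def concat_channels(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Fuse two feature maps by stacking their channels (Concat)."""
    if not a or not b:
        raise ValueError("both feature maps need at least one channel")
    shape_a = (len(a[0]), len(a[0][0]))
    shape_b = (len(b[0]), len(b[0][0]))
    if shape_a != shape_b:
        raise ValueError("spatial sizes must match for channel concatenation")
    # Channel count of the result is the sum of the two inputs' channel counts.
    return a + b


first_features = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]   # 2 channels, 2x2
second_features = [[[0.0, 0.0], [0.0, 0.0]]]                            # 1 channel, 2x2
fused = concat_channels(first_features, second_features)
```

Concatenation keeps the features of each sample intact in separate channels, so the binary classification module can compare them directly.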
In 206, a first classification loss of the first branch network is obtained from the first predicted fine-grained category and the fine-grained category label of the first sample image; a second classification loss of the second branch network is obtained from the second predicted fine-grained category and the fine-grained category label of the second sample image; and a third classification loss is obtained from the prediction result and the coarse-grained categories of the first sample image and the second sample image.
A person of ordinary skill in the art may configure, according to actual needs, the loss functions used to obtain the first classification loss, the second classification loss, and the third classification loss; the present application imposes no specific limitation in this regard.
For example, the third classification loss is obtained according to the following formula:
Loss3 = -[η*y*log(p) + (1-y)*log(1-p)];
where Loss3 denotes the third classification loss; η denotes a correction coefficient (an empirical value; in the present application, for example, a value between 0.3 and 0.5); y indicates whether the coarse-grained category label of the first sample image and the coarse-grained category label of the second sample image are the same, and its values may be chosen by a person of ordinary skill in the art according to actual needs; and p denotes the prediction result.
In the embodiments of the present application, the correction coefficient η is introduced so that, when the coarse-grained category labels of the first sample image and the second sample image differ, the contribution of y*log(p) to the total loss is reduced, allowing the first branch network and the second branch network to learn that the first sample image and the second sample image are nevertheless similar in some dimension.
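For illustration only, the third classification loss may be computed as in the following sketch. The application leaves the values of y to the practitioner; this sketch assumes y = 1 when the two coarse-grained category labels are the same and y = 0 otherwise, and assumes p is a probability in (0, 1). The function name, the default η = 0.4 (chosen from the stated range of 0.3 to 0.5), and the clamping constant `eps` are likewise assumptions.

```python
import math


def third_classification_loss(p: float, y: int, eta: float = 0.4, eps: float = 1e-12) -> float:
    """Loss3 = -[eta * y * log(p) + (1 - y) * log(1 - p)].

    p   : predicted probability that the coarse-grained categories match
    y   : assumed convention, 1 if the coarse labels are the same, else 0
    eta : correction coefficient that scales the y * log(p) term
    """
    p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0; an added safeguard
    return -(eta * y * math.log(p) + (1 - y) * math.log(1.0 - p))


loss_same = third_classification_loss(p=0.5, y=1)   # only the eta-weighted term is active
loss_diff = third_classification_loss(p=0.5, y=0)   # only the unweighted term is active
```

With η below 1, the y*log(p) term contributes less to the loss than it would in an unweighted binary cross-entropy.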
In 207, a corresponding total loss is obtained from the first classification loss, the second classification loss, and the third classification loss, and the parameters of the first branch network and the second branch network are adjusted according to the total loss until a preset training stop condition is met, at which point training ends.
For example, the total loss is obtained according to the following formula:
L_total = Loss1 + Loss2 + Loss3;
where L_total denotes the total loss, Loss1 denotes the first classification loss, Loss2 denotes the second classification loss, and Loss3 denotes the third classification loss.
After obtaining the total loss, the electronic device adjusts the parameters of the first branch network and the second branch network according to the total loss. It should be noted that the goal of model training is to minimize the total loss; therefore, each time the total loss is determined, the parameters of the first branch network and the second branch network can be adjusted in the direction that minimizes the total loss.
As described above, the parameters of the first branch network and the second branch network are adjusted repeatedly, and training ends when the preset training stop condition is met. The preset training stop condition may be set by a person of ordinary skill in the art according to actual needs, and the embodiments of the present application impose no specific limitation in this regard.
For example, the preset training stop condition is configured as: stop training when the total loss reaches its minimum value.
As another example, the preset training stop condition is configured as: stop training when the number of parameter iterations reaches a preset number.
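The total loss and the two example stop conditions may be sketched as follows, under stated assumptions. The condition "the total loss reaches its minimum value" is approximated here by the total loss changing by less than a tolerance between iterations; that approximation, the function names, and the toy `step` function are assumptions of this sketch, and the actual parameter update is hidden inside the hypothetical `step` callable.

```python
from typing import Callable, Tuple


def train(step: Callable[[], Tuple[float, float, float]],
          max_iterations: int,
          tolerance: float = 1e-6) -> Tuple[int, float]:
    """Run training until the total loss stops changing or the iteration budget is used.

    `step` performs one parameter update of both branch networks and returns
    (Loss1, Loss2, Loss3) for that iteration.
    """
    previous_total = float("inf")
    total = previous_total
    for iteration in range(1, max_iterations + 1):
        loss1, loss2, loss3 = step()
        total = loss1 + loss2 + loss3          # L_total = Loss1 + Loss2 + Loss3
        if abs(previous_total - total) < tolerance:
            return iteration, total            # stop condition 1: loss has settled
        previous_total = total
    return max_iterations, total               # stop condition 2: preset iteration count


def make_toy_step() -> Callable[[], Tuple[float, float, float]]:
    # Hypothetical stand-in for a real update: each loss halves on every call.
    state = {"x": 1.0}

    def step() -> Tuple[float, float, float]:
        state["x"] *= 0.5
        return state["x"], state["x"], state["x"]

    return step


iterations_run, final_total = train(make_toy_step(), max_iterations=5)
```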
In 208, one branch network is selected from the first branch network and the second branch network as the fine-grained classification model.
When the preset training stop condition is met, the electronic device determines that both the first branch network and the second branch network in the machine learning network are able to perform fine-grained classification of images accurately, and at this point selects one branch network from the first branch network and the second branch network as the fine-grained classification model used for fine-grained classification of images.
Referring to FIG. 10, FIG. 10 is a schematic structural diagram of the image classification apparatus provided by the present application. The image classification apparatus may include an image determination component 301, a model calling component 302, and a category acquisition component 303.
The image determination component 301 is configured to determine a target image requiring image classification.
The model calling component 302 is configured to call a pre-trained fine-grained classification model to perform fine-grained classification on the target image, where the fine-grained classification model includes a feature extraction module, a feature optimization module, and a fine-grained classification module; the feature extraction module is configured to extract image features of the target image; the feature optimization module is configured to optimize the image features to obtain optimized image features; and the fine-grained classification module performs fine-grained classification on the optimized image features to obtain the fine-grained category of the target image.
The category acquisition component 303 is configured to acquire the fine-grained category of the target image output by the fine-grained classification model.
In an embodiment, when determining the target image requiring image classification, the image determination component 301 is configured to:
when an image classification period is reached, determine the images newly added within the image classification period as target images.
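As a minimal sketch of this period-based selection, the following assumes that each stored image carries a timestamp recording when it was added, and that a period is the half-open interval from its start to its end. The `StoredImage` record, its field names, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoredImage:
    path: str
    added_at: float  # assumed: time the image was added, in seconds since the epoch


def images_for_period(images: List[StoredImage],
                      period_start: float,
                      period_end: float) -> List[StoredImage]:
    """Return the images added within [period_start, period_end) as target images."""
    return [img for img in images if period_start <= img.added_at < period_end]


gallery = [
    StoredImage("a.jpg", added_at=100.0),
    StoredImage("b.jpg", added_at=250.0),
    StoredImage("c.jpg", added_at=400.0),
]
targets = images_for_period(gallery, period_start=200.0, period_end=400.0)
```

Using a half-open interval means an image added exactly at a period boundary is classified in exactly one period.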
In an embodiment, the fine-grained classification model further includes a dimensionality reduction module configured to perform feature dimensionality reduction on the optimized image features to obtain dimension-reduced optimized image features;
the fine-grained classification module is further configured to perform classification prediction on the dimension-reduced optimized image features to obtain the fine-grained category of the target image.
In an embodiment, the image classification apparatus provided by the present application further includes a classified storage component configured to, after the fine-grained category of the target image output by the fine-grained classification model is acquired, allocate a storage path to the target image according to the fine-grained category and store the target image in the storage path.
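One possible way of allocating a storage path from a fine-grained category is sketched below. The directory layout (one folder per category under a root), the sanitizing rule, and the function name `storage_path_for` are assumptions; the sketch only computes the path and does not move any file.

```python
from pathlib import PurePosixPath


def storage_path_for(root: str, fine_category: str, filename: str) -> PurePosixPath:
    """Build <root>/<category>/<filename>, one folder per fine-grained category."""
    # Replace characters that are awkward in folder names (assumed rule).
    safe_category = "".join(
        ch if ch.isalnum() or ch in "-_" else "_" for ch in fine_category
    )
    return PurePosixPath(root) / safe_category / filename


target_path = storage_path_for("/albums", "golden retriever", "img_0001.jpg")
```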
In an embodiment, the image features include a feature map, and the feature optimization module is configured to transpose the feature map to obtain a transposed feature map, perform matrix multiplication on the feature map and the transposed feature map, and use the result of the matrix multiplication as the optimized image features.
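The transpose-and-multiply operation may be sketched in plain Python as follows. The sketch assumes the feature map has already been flattened into a two-dimensional matrix X with one row per channel and one column per spatial position, so that the product of X and its transpose is a channel-by-channel matrix; this layout and the function names are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    return [list(column) for column in zip(*m)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    b_columns = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns] for row in a]


def optimize_features(feature_map: Matrix) -> Matrix:
    """Multiply the feature map by its own transpose (X times X-transposed)."""
    return matmul(feature_map, transpose(feature_map))


feature_map = [[1.0, 0.0, 2.0],
               [0.0, 1.0, 1.0]]          # 2 channels, 3 spatial positions
optimized = optimize_features(feature_map)
```

Each entry of the result is the inner product of two channels, so the optimized features capture pairwise interactions between channels, and the result is symmetric.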
In an embodiment, the image classification apparatus provided by the present application further includes a model training component configured to, before the target image requiring image classification is determined:
acquire a plurality of sample images, and acquire a fine-grained category label and a coarse-grained category label of each sample image;
construct a machine learning network, where the machine learning network includes a first branch network and a second branch network of identical structure, together with a binary classification module; the first branch network includes a feature extraction module, a feature optimization module, and a fine-grained classification module; and the binary classification module is connected to the feature extraction module of the first branch network and to the feature extraction module of the second branch network;
select a first sample image from the plurality of sample images, extract image features of the first sample image with the feature extraction module of the first branch network, and, after optimization by the feature optimization module of the first branch network, input them into the fine-grained classification module of the first branch network for fine-grained classification to obtain a first predicted fine-grained category;
select a second sample image from the plurality of sample images, extract image features of the second sample image with the feature extraction module of the second branch network, and, after optimization by the feature optimization module of the second branch network, input them into the fine-grained classification module of the second branch network for fine-grained classification to obtain a second predicted fine-grained category;
fuse the image features of the first sample image and the image features of the second sample image to obtain fused image features, and predict, by the binary classification module, whether the coarse-grained categories of the first sample image and the second sample image are the same, to obtain a prediction result;
obtain a first classification loss of the first branch network from the first predicted fine-grained category and the fine-grained category label of the first sample image, obtain a second classification loss of the second branch network from the second predicted fine-grained category and the fine-grained category label of the second sample image, and obtain a third classification loss from the prediction result and the coarse-grained categories of the first sample image and the second sample image;
obtain a corresponding total loss from the first classification loss, the second classification loss, and the third classification loss, and adjust the parameters of the first branch network and the second branch network according to the total loss until a preset training stop condition is met, at which point training ends;
select one branch network from the first branch network and the second branch network as the fine-grained classification model.
In an embodiment, the third classification loss is obtained according to the following formula:
Loss3 = -[η*y*log(p) + (1-y)*log(1-p)];
where Loss3 denotes the third classification loss, η denotes a correction coefficient, y indicates whether the coarse-grained category label of the first sample image and the coarse-grained category label of the second sample image are the same, and p denotes the prediction result.
In an embodiment, when selecting one branch network from the first branch network and the second branch network as the fine-grained classification model, the model training component is configured to:
obtain the classification accuracy of the first branch network, and obtain the classification accuracy of the second branch network;
select, from the first branch network and the second branch network, the branch network with the higher classification accuracy as the fine-grained classification model.
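The accuracy-based selection may be sketched as follows. The application does not state how classification accuracy is measured; this sketch assumes it is the fraction of correctly classified images in a set of labeled evaluation samples, and resolves ties in favor of the first branch. The function names and the toy branches are hypothetical.

```python
from typing import Callable, Hashable, Sequence, Tuple

Classifier = Callable[[object], Hashable]
Samples = Sequence[Tuple[object, Hashable]]   # (image, fine-grained label) pairs


def accuracy(predict: Classifier, samples: Samples) -> float:
    """Fraction of samples whose predicted fine-grained category equals the label."""
    if not samples:
        raise ValueError("at least one evaluation sample is required")
    correct = sum(1 for image, label in samples if predict(image) == label)
    return correct / len(samples)


def select_branch(first: Classifier, second: Classifier, samples: Samples) -> Classifier:
    """Return the branch with the higher accuracy (ties go to the first branch)."""
    return first if accuracy(first, samples) >= accuracy(second, samples) else second


def toy_first_branch(image):
    return "a"                                 # always predicts category "a"


def toy_second_branch(image):
    return "b" if image == 2 else "a"


evaluation = [(1, "a"), (2, "b"), (3, "a")]
chosen = select_branch(toy_first_branch, toy_second_branch, evaluation)
```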
In an embodiment, when fusing the image features of the first sample image and the image features of the second sample image to obtain the fused image features, the model training component is configured to:
merge the image features of the first sample image and the image features of the second sample image along the channel dimension, and use the result of the channel merging as the fused image features.
It should be noted that the image classification apparatus provided by the embodiments of the present application is based on the same concept as the image classification method in the foregoing embodiments. Any of the methods provided in the image classification method embodiments can be run on the image classification apparatus; for the specific implementation process, refer to the above embodiments of the image classification method, which are not repeated here.
The present application further provides a computer-readable storage medium on which a computer program is stored; when the stored computer program is executed on a computer, the computer is caused to perform the image classification method provided by the embodiments of the present application. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The present application further provides an electronic device including a memory and a processor, where the memory stores a computer program and the processor, by calling the computer program stored in the memory, performs the image classification method provided by the present application.
For example, the electronic device may be a mobile terminal such as a tablet computer or a smartphone. Referring to FIG. 11, FIG. 11 is a schematic structural diagram of the electronic device provided by an embodiment of the present application.
The processor 401 is electrically connected to the memory 402.
The processor 401 is the control center of the electronic device. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device and processes data by running or loading the computer programs stored in the memory 402 and calling the data stored in the memory 402.
The memory 402 may be used to store software programs and modules. The processor 401 performs various functional applications and data processing by running the computer programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, computer programs required by at least one function (such as a sound playback function or an image playback function), and the like, and the data storage area may store data created through use of the electronic device, and the like. In addition, the memory 402 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.
In the embodiments of the present application, the processor 401 in the electronic device loads instructions corresponding to the processes of one or more computer programs into the memory 402 according to the following steps, and the processor 401 runs the computer programs stored in the memory 402 to implement various functions, for example:
determining a target image requiring image classification;
calling a pre-trained fine-grained classification model to perform fine-grained classification on the target image, where the fine-grained classification model includes a feature extraction module, a feature optimization module, and a fine-grained classification module; the feature extraction module is configured to extract image features of the target image; the feature optimization module is configured to optimize the image features to obtain optimized image features; and the fine-grained classification module performs fine-grained classification on the optimized image features to obtain the fine-grained category of the target image;
acquiring the fine-grained category of the target image output by the fine-grained classification model.
It should be noted that the electronic device provided by the embodiments of the present application is based on the same concept as the image classification method in the foregoing embodiments. Any of the methods provided in the image classification method embodiments can be run on the electronic device; for the specific implementation process, refer to the image classification method embodiments, which are not repeated here.
It should be noted that, for the image classification method of the embodiments of the present application, a person of ordinary skill in the art will understand that all or part of the flow of the image classification method may be carried out by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, for example in the memory of an electronic device, and executed by at least one processor in the electronic device; its execution may include the flow of the embodiments of the image classification method. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.
For the image classification apparatus of the embodiments of the present application, the functional modules may be integrated into one processing chip, may each exist as a separate physical unit, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.

Claims (20)

  1. An image classification method, wherein the image classification method comprises:
    determining a target image requiring image classification;
    calling a pre-trained fine-grained classification model to perform fine-grained classification on the target image, wherein the fine-grained classification model comprises a feature extraction module, a feature optimization module, and a fine-grained classification module, the feature extraction module is configured to extract image features of the target image, the feature optimization module is configured to optimize the image features to obtain optimized image features, and the fine-grained classification module performs fine-grained classification on the optimized image features to obtain a fine-grained category of the target image;
    acquiring the fine-grained category of the target image output by the fine-grained classification model.
  2. The image classification method according to claim 1, wherein the determining a target image requiring image classification comprises:
    when an image classification period is reached, determining an image newly added within the image classification period as the target image.
  3. The image classification method according to claim 1, wherein the fine-grained classification model further comprises a dimensionality reduction module, and the dimensionality reduction module is configured to perform feature dimensionality reduction on the optimized image features to obtain dimension-reduced optimized image features;
    the fine-grained classification module is further configured to perform fine-grained classification on the dimension-reduced optimized image features to obtain the fine-grained category of the target image.
  4. The image classification method according to claim 1, wherein after the acquiring the fine-grained category of the target image output by the fine-grained classification model, the method further comprises:
    allocating a storage path to the target image according to the fine-grained category, and storing the target image in the storage path.
  5. The image classification method according to claim 1, wherein the image features comprise a feature map, and the feature optimization module is configured to transpose the feature map to obtain a transposed feature map, perform matrix multiplication on the feature map and the transposed feature map, and use the result of the matrix multiplication as the optimized image features.
  6. The image classification method according to claim 1, wherein before the determining a target image requiring image classification, the method further comprises:
    acquiring a plurality of sample images, and acquiring a fine-grained category label and a coarse-grained category label of each sample image;
    constructing a machine learning network, wherein the machine learning network comprises a first branch network and a second branch network of identical structure, together with a binary classification module, the first branch network comprises a feature extraction module, a feature optimization module, and a fine-grained classification module, and the binary classification module is connected to the feature extraction module of the first branch network and to the feature extraction module of the second branch network;
    selecting a first sample image from the plurality of sample images, extracting image features of the first sample image with the feature extraction module of the first branch network, and, after optimization by the feature optimization module of the first branch network, inputting them into the fine-grained classification module of the first branch network for fine-grained classification to obtain a first predicted fine-grained category;
    selecting a second sample image from the plurality of sample images, extracting image features of the second sample image with the feature extraction module of the second branch network, and, after optimization by the feature optimization module of the second branch network, inputting them into the fine-grained classification module of the second branch network for fine-grained classification to obtain a second predicted fine-grained category;
    fusing the image features of the first sample image and the image features of the second sample image to obtain fused image features, and predicting, by the binary classification module according to the fused image features, whether the coarse-grained categories of the first sample image and the second sample image are the same, to obtain a prediction result;
    obtaining a first classification loss of the first branch network from the first predicted fine-grained category and the fine-grained category label of the first sample image, obtaining a second classification loss of the second branch network from the second predicted fine-grained category and the fine-grained category label of the second sample image, and obtaining a third classification loss from the prediction result and the coarse-grained categories of the first sample image and the second sample image;
    obtaining a corresponding total loss from the first classification loss, the second classification loss, and the third classification loss, and adjusting the parameters of the first branch network and the second branch network according to the total loss until a preset training stop condition is met, at which point training ends;
    selecting one branch network from the first branch network and the second branch network as the fine-grained classification model.
  7. 根据权利要求6所述的图像分类方法,其中,按照如下公式获取第三分类损失:The image classification method according to claim 6, wherein the third classification loss is obtained according to the following formula:
    Loss3=-[η*y*log(p)+(1-y)*log(1-p)];Loss3=-[η*y*log(p)+(1-y)*log(1-p)];
    其中,Loss3表示第三分类损失,η表示修正系数,y用于表征第一样本图像的粗粒度类别标签和第二样本图像的粗粒度类别标签是否相同,p表示预测结果。Among them, Loss3 represents the third classification loss, η represents the correction coefficient, y is used to characterize whether the coarse-grained category label of the first sample image and the coarse-grained category label of the second sample image are the same, and p represents the prediction result.
  8. The image classification method according to claim 6, wherein the selecting one branch network from the first branch network and the second branch network as the fine-grained classification model comprises:
    obtaining a classification accuracy of the first branch network, and obtaining a classification accuracy of the second branch network;
    selecting, from the first branch network and the second branch network, the branch network with the higher classification accuracy as the fine-grained classification model.
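The selection step can be sketched as follows, assuming each trained branch is available as a callable that maps an image to a predicted fine-grained label, and that a labelled evaluation set exists. Both assumptions, the function names, and the tie-breaking rule are illustrative; the claim only requires that the more accurate branch be chosen.

```python
from typing import Callable, Sequence, Tuple

Classifier = Callable[[object], str]
Sample = Tuple[object, str]  # (image, fine-grained category label)


def classification_accuracy(branch: Classifier,
                            samples: Sequence[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        raise ValueError("need at least one labelled sample")
    correct = sum(1 for image, label in samples if branch(image) == label)
    return correct / len(samples)


def select_branch(first: Classifier, second: Classifier,
                  samples: Sequence[Sample]) -> Classifier:
    """Return the branch with the higher classification accuracy."""
    acc_first = classification_accuracy(first, samples)
    acc_second = classification_accuracy(second, samples)
    # Ties go to the first branch; this is an arbitrary choice of the sketch.
    return first if acc_first >= acc_second else second
```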
  9. The image classification method according to claim 6, wherein the fusing the image feature of the first sample image and the image feature of the second sample image to obtain a fused image feature comprises:
    performing channel merging on the image feature of the first sample image and the image feature of the second sample image, and using the result of the channel merging as the fused image feature.
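One plausible reading of "channel merging" is concatenation along the channel axis. The sketch below assumes that reading and represents each feature map as a nested list of shape channels × height × width. The function name and the data layout are assumptions made for illustration.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # channels x height x width


def channel_merge(feat_a: FeatureMap, feat_b: FeatureMap) -> FeatureMap:
    """Concatenate two feature maps along the channel axis."""
    def spatial_shape(feat: FeatureMap):
        return (len(feat[0]), len(feat[0][0]))

    # Channel concatenation only makes sense when height and width agree.
    if spatial_shape(feat_a) != spatial_shape(feat_b):
        raise ValueError("feature maps must share height and width")
    # Copy each row so the fused feature does not alias the inputs.
    return [[row[:] for row in channel] for channel in feat_a + feat_b]
```

Under this reading, fusing a 1-channel map with a 2-channel map yields a 3-channel map of the same spatial size, which the binary classification module can then consume.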
  10. An image classification apparatus, comprising:
    an image determination component, configured to determine a target image requiring image classification;
    a model calling component, configured to call a pre-trained fine-grained classification model to perform fine-grained classification on the target image, wherein the fine-grained classification model comprises a feature extraction module, a feature optimization module, and a fine-grained classification module, the feature extraction module is configured to extract an image feature of the target image, the feature optimization module is configured to optimize the image feature to obtain an optimized image feature, and the fine-grained classification module performs fine-grained classification on the optimized image feature to obtain a fine-grained category of the target image;
    a category acquisition component, configured to acquire the fine-grained category of the target image output by the fine-grained classification model.
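The three-module model recited above is a simple composition: extract, optimize, classify. The following sketch shows only that data flow. The class name, field names, and the use of plain callables as stand-ins for trained modules are assumptions of this illustration, not the applicant's implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FineGrainedClassificationModel:
    """Composition of the three modules named in the claim."""
    extract_features: Callable[[Any], Any]   # feature extraction module
    optimize_features: Callable[[Any], Any]  # feature optimization module
    classify: Callable[[Any], str]           # fine-grained classification module

    def __call__(self, image: Any) -> str:
        features = self.extract_features(image)
        optimized = self.optimize_features(features)
        return self.classify(optimized)
```

In this sketch the model calling component corresponds to invoking the object on the target image, and the category acquisition component corresponds to reading its return value.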
  11. A storage medium, wherein a computer program is stored in the storage medium, and when the computer program runs on a computer, the computer is caused to execute:
    determining a target image requiring image classification;
    calling a pre-trained fine-grained classification model to perform fine-grained classification on the target image, wherein the fine-grained classification model comprises a feature extraction module, a feature optimization module, and a fine-grained classification module, the feature extraction module is configured to extract an image feature of the target image, the feature optimization module is configured to optimize the image feature to obtain an optimized image feature, and the fine-grained classification module performs fine-grained classification on the optimized image feature to obtain a fine-grained category of the target image;
    acquiring the fine-grained category of the target image output by the fine-grained classification model.
  12. An electronic device, wherein the electronic device comprises a processor and a memory, a computer program is stored in the memory, and the processor, by calling the computer program stored in the memory, is configured to execute:
    determining a target image requiring image classification;
    calling a pre-trained fine-grained classification model to perform fine-grained classification on the target image, wherein the fine-grained classification model comprises a feature extraction module, a feature optimization module, and a fine-grained classification module, the feature extraction module is configured to extract an image feature of the target image, the feature optimization module is configured to optimize the image feature to obtain an optimized image feature, and the fine-grained classification module performs fine-grained classification on the optimized image feature to obtain a fine-grained category of the target image;
    acquiring the fine-grained category of the target image output by the fine-grained classification model.
  13. The electronic device according to claim 12, wherein, when determining the target image requiring image classification, the processor is configured to execute:
    when an image classification period is reached, determining an image newly added within the image classification period as the target image.
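Selecting the images added within a classification period can be sketched as a timestamp filter. The sketch assumes each stored image carries an "added at" timestamp and treats the period as the half-open interval (period_end - period, period_end]. The function name, the tuple layout, and the interval convention are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def images_added_in_period(images: Iterable[Tuple[str, datetime]],
                           period_end: datetime,
                           period: timedelta) -> List[str]:
    """Return paths of images added during the period ending at period_end."""
    period_start = period_end - period
    # Exclude the start instant so back-to-back periods never classify
    # the same image twice.
    return [path for path, added_at in images
            if period_start < added_at <= period_end]
```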
  14. The electronic device according to claim 12, wherein the fine-grained classification model further comprises a dimensionality reduction module, and the dimensionality reduction module performs feature dimensionality reduction on the optimized image feature to obtain a dimension-reduced optimized image feature;
    the fine-grained classification module is further configured to perform fine-grained classification on the dimension-reduced optimized image feature to obtain the fine-grained category of the target image.
  15. The electronic device according to claim 12, wherein, after acquiring the fine-grained category of the target image output by the fine-grained classification model, the processor is further configured to execute:
    allocating a storage path to the target image according to the fine-grained category, and storing the target image in the storage path.
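A simple way to realize category-based storage is one directory per fine-grained category. The sketch below assumes that layout; the function names and the decision to move (rather than copy) the file are illustrative choices that the claim does not specify.

```python
import shutil
from pathlib import Path


def allocate_storage_path(root: str, category: str, image_path: str) -> Path:
    """Build <root>/<category>/<original file name> for the target image."""
    return Path(root) / category / Path(image_path).name


def store_by_category(root: str, category: str, image_path: str) -> Path:
    """Move the image into the directory allocated for its category."""
    destination = allocate_storage_path(root, category, image_path)
    # Create the category directory on first use.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(image_path, str(destination))
    return destination
```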
  16. The electronic device according to claim 12, wherein the image feature comprises a feature map, and the feature optimization module is configured to transpose the feature map to obtain a transposed feature map, perform matrix multiplication on the feature map and the transposed feature map, and use the result of the matrix multiplication as the optimized image feature.
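The transpose-and-multiply step can be illustrated on a small matrix. The sketch assumes the feature map has been arranged as a C × N matrix X (C channels, N flattened spatial positions) and computes X·Xᵀ, a C × C Gram matrix of channel-to-channel inner products. The flattening and the function name are assumptions; the claim itself only recites transposition followed by matrix multiplication.

```python
from typing import List

Matrix = List[List[float]]


def optimize_feature_map(feature_map: Matrix) -> Matrix:
    """Return feature_map multiplied by its own transpose (C x C result)."""
    # Transposed feature map: N x C.
    transposed = [list(column) for column in zip(*feature_map)]
    # Entry (i, j) is the inner product of row i of the feature map
    # with column j of the transposed map (that is, channel j).
    return [[sum(a * b for a, b in zip(row, column))
             for column in zip(*transposed)]
            for row in feature_map]
```

For the 2 × 2 input [[1, 2], [3, 4]] the result is [[5, 11], [11, 25]]: the diagonal holds each channel's energy and the off-diagonal entries hold pairwise channel correlations.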
  17. The electronic device according to claim 12, wherein, before determining the target image requiring image classification, the processor is further configured to execute:
    acquiring a plurality of sample images, and acquiring a fine-grained category label and a coarse-grained category label of each sample image;
    constructing a machine learning network, the machine learning network comprising a first branch network and a second branch network that have the same structure, and a binary classification module, wherein the first branch network comprises a feature extraction module, a feature optimization module, and a fine-grained classification module, and the binary classification module is connected to the feature extraction module of the first branch network and the feature extraction module of the second branch network;
    selecting a first sample image from the plurality of sample images, extracting an image feature of the first sample image by the feature extraction module of the first branch network, and, after optimization by the feature optimization module of the first branch network, inputting the result into the fine-grained classification module of the first branch network for fine-grained classification to obtain a first predicted fine-grained category;
    selecting a second sample image from the plurality of sample images, extracting an image feature of the second sample image by the feature extraction module of the second branch network, and, after optimization by the feature optimization module of the second branch network, inputting the result into the fine-grained classification module of the second branch network for fine-grained classification to obtain a second predicted fine-grained category;
    fusing the image feature of the first sample image and the image feature of the second sample image to obtain a fused image feature, and predicting, by the binary classification module according to the fused image feature, whether the coarse-grained categories of the first sample image and the second sample image are the same, to obtain a prediction result;
    obtaining a first classification loss of the first branch network according to the first predicted fine-grained category and the fine-grained category label of the first sample image, obtaining a second classification loss of the second branch network according to the second predicted fine-grained category and the fine-grained category label of the second sample image, and obtaining a third classification loss according to the prediction result and the coarse-grained categories of the first sample image and the second sample image;
    obtaining a corresponding total loss according to the first classification loss, the second classification loss, and the third classification loss, and adjusting parameters of the first branch network and the second branch network according to the total loss, until training ends when a preset training stop condition is met;
    selecting one branch network from the first branch network and the second branch network as the fine-grained classification model.
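The loss bookkeeping for one training pair can be sketched as below. The claim does not state how the three losses are combined or which loss the two branches use, so this sketch assumes cross-entropy for the fine-grained losses, an η-weighted binary cross-entropy for the coarse-category pair loss, and an unweighted sum for the total. All names are hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], label: int,
                  eps: float = 1e-12) -> float:
    """Negative log-probability assigned to the true fine-grained label."""
    return -math.log(max(probs[label], eps))


def total_loss(probs1: Sequence[float], label1: int,
               probs2: Sequence[float], label2: int,
               p_same: float, coarse1: str, coarse2: str,
               eta: float = 1.0, eps: float = 1e-12) -> float:
    """Sum of the two branch losses and the coarse-category pair loss."""
    loss1 = cross_entropy(probs1, label1)  # first branch, first sample
    loss2 = cross_entropy(probs2, label2)  # second branch, second sample
    # y is 1 when both samples share a coarse-grained category label.
    y = 1 if coarse1 == coarse2 else 0
    p = min(max(p_same, eps), 1.0 - eps)
    loss3 = -(eta * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    # Unweighted sum: an assumption of this sketch.
    return loss1 + loss2 + loss3
```

Because the binary classification module reads features from both feature extraction modules, the third term gives both branches a shared training signal about coarse-category agreement, in addition to their own fine-grained losses.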
  18. The electronic device according to claim 17, wherein the third classification loss is obtained according to the following formula:
    Loss3 = -[η*y*log(p) + (1-y)*log(1-p)];
    where Loss3 denotes the third classification loss, η denotes a correction coefficient, y indicates whether the coarse-grained category label of the first sample image and the coarse-grained category label of the second sample image are the same, and p denotes the prediction result.
  19. The electronic device according to claim 17, wherein, when selecting one branch network from the first branch network and the second branch network as the fine-grained classification model, the processor is configured to execute:
    obtaining a classification accuracy of the first branch network, and obtaining a classification accuracy of the second branch network;
    selecting, from the first branch network and the second branch network, the branch network with the higher classification accuracy as the fine-grained classification model.
  20. The electronic device according to claim 17, wherein, when fusing the image feature of the first sample image and the image feature of the second sample image to obtain the fused image feature, the processor is configured to execute:
    performing channel merging on the image feature of the first sample image and the image feature of the second sample image, and using the result of the channel merging as the fused image feature.
PCT/CN2020/071502 2020-01-10 2020-01-10 Image classification method and apparatus, storage medium, and electronic device WO2021138911A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080087887.6A CN114830186A (en) 2020-01-10 2020-01-10 Image classification method and device, storage medium and electronic equipment
PCT/CN2020/071502 WO2021138911A1 (en) 2020-01-10 2020-01-10 Image classification method and apparatus, storage medium, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/071502 WO2021138911A1 (en) 2020-01-10 2020-01-10 Image classification method and apparatus, storage medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2021138911A1 (en) 2021-07-15

Family

ID=76788470

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/071502 WO2021138911A1 (en) 2020-01-10 2020-01-10 Image classification method and apparatus, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN114830186A (en)
WO (1) WO2021138911A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304847A (en) * 2017-11-30 2018-07-20 腾讯科技(深圳)有限公司 Image classification method and device, personalized recommendation method and device
CN108875827A (en) * 2018-06-15 2018-11-23 广州深域信息科技有限公司 A kind of method and system of fine granularity image classification
CN109165699A (en) * 2018-10-17 2019-01-08 中国科学技术大学 Fine granularity image classification method
CN109886321A (en) * 2019-01-31 2019-06-14 南京大学 A kind of image characteristic extracting method and device for icing image fine grit classification
CN110084285A (en) * 2019-04-08 2019-08-02 安徽艾睿思智能科技有限公司 Fish fine grit classification method based on deep learning
CN110378356A (en) * 2019-07-16 2019-10-25 北京中科研究院 Fine granularity image-recognizing method based on multiple target Lagrange canonical

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANG JUAN;CAO HAOYU;WANG RONGGUI;XUE LIXIA: "Fine-Grained Car Recognition Model Based on Semantic DCNN Features Fusion", JOURNAL OF COMPUTER-AIDED DESIGN & COMPUTER GRAPHICS, vol. 31, no. 1, 5 January 2019 (2019-01-05), pages 141 - 157, XP055827883, ISSN: 1003-9775, DOI: 10.3724/SP.J.1089.2019.17130 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452896A (en) * 2023-06-16 2023-07-18 中国科学技术大学 Method, system, device and medium for improving fine-grained image classification performance
CN116452896B (en) * 2023-06-16 2023-10-20 中国科学技术大学 Method, system, device and medium for improving fine-grained image classification performance

Also Published As

Publication number Publication date
CN114830186A (en) 2022-07-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 20912233; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established
    Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 08.12.2022)
122 Ep: pct application non-entry in european phase
    Ref document number: 20912233; Country of ref document: EP; Kind code of ref document: A1