CN110414561A - A Construction Method of Natural Scene Dataset Suitable for Machine Vision - Google Patents

A Construction Method of Natural Scene Dataset Suitable for Machine Vision

Info

Publication number
CN110414561A
Authority
CN
China
Prior art keywords
data set
ecosystems
pictures
classification
category
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910561320.7A
Other languages
Chinese (zh)
Inventor
王嘉乐
邹炼
范赐恩
程谟凡
陈丽琼
魏文澜
张捷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-06-26
Filing date
2019-06-26
Publication date
2019-11-05
Application filed by Wuhan University WHU
Priority to CN201910561320.7A
Publication of CN110414561A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method for constructing a natural scene dataset suitable for machine vision, comprising the following steps: determining the classification categories of the dataset based on ecosystem types; downloading original images from the Internet using keywords; performing a preliminary matching check on the downloaded images and reclassifying images of ambiguous category to form an image dataset; and validating the dataset with a convolutional neural network, then merging ambiguous categories according to the confusion matrix. The method addresses the difficulty that natural scene categories, unlike object categories, are hard to define accurately and objectively. It provides reliable data support for training deep convolutional neural networks and can be used in natural scene recognition systems.

Description

A Construction Method of Natural Scene Dataset Suitable for Machine Vision

Technical Field

The invention relates to the technical field of image processing, and in particular to a method for constructing a natural scene dataset suitable for machine vision.

Background

In the field of image processing, image recognition has received increasing attention in recent years. Image recognition underlies practical technologies such as motion analysis, stereo vision, and data fusion. Within it, object recognition has improved greatly in accuracy thanks to convolutional neural networks and very-large-scale datasets, whereas recognizing the scene of an image with convolutional neural networks has struggled to reach the same level of accuracy. A large part of the reason is that scene datasets themselves are inaccurately categorized.

Recognizing the objects in an image with machine vision indicates roughly what is happening in the image, while recognizing the scene provides more information about the picture and lets us infer what may have happened in the past and what may happen in the future. With the development of autonomous driving, unmanned vehicles, and unmanned aerial vehicles, accurate detection of the surrounding environment helps ensure safety, so a standard scene database is particularly important.

Most scene databases suffer from two problems: 1) The hierarchy between categories is confused. Some datasets contain both "farmland" and "plain" categories, yet a plain is a landform type and farmland can be established on a plain, so the two are not independent. 2) Meanings overlap and are easily confused. For example, "forest farm" and "forest" are distinguished only in special circumstances.

Unlike object datasets, whose category labels can be assessed objectively during construction, scene datasets often apply different criteria to the same image because subjective judgments differ. As a result, images within one category vary widely, which limits improvements in algorithm accuracy. A scene dataset therefore needs a unified classification criterion.

For a natural scene, organisms and the environment help determine the specific type of scene. The unified whole formed by organisms and their environment is called an ecosystem in ecology, and ecosystems come in a wide range of types and sizes. Mapping the categories of a natural scene dataset onto the classification of ecosystems lets the dataset's categories satisfy a single, unified criterion.

Determining the natural scene to which an image belongs requires combining the objects and the background in the image; determining the ecosystem to which a space belongs requires combining the organisms and the environment in that space. The two judgments are very similar. The invention therefore proposes using ecosystems as the criterion for dataset categories, so that ecosystem categories correspond to categories in the natural scene dataset. This keeps the classification essentially consistent both in subjective perception and in objective requirements.

Summary of the Invention

To solve the above problems, the invention provides a method for constructing a natural scene database. It aims to remedy the defects in existing dataset categorization, so that a unified classification criterion is available when defining the categories of a natural scene dataset, and so that the final dataset is better suited to machine vision.

To achieve the above object, the technical solution of the invention comprises the following steps:

1. A method for constructing a natural scene dataset suitable for machine vision, characterized by comprising the following steps:

Step 1: Preliminarily determine the categories and corresponding subcategories contained in the dataset according to ecosystem type, where ecosystem types are divided into:

Terrestrial ecosystems: including natural terrestrial ecosystems and artificial terrestrial ecosystems;

Aquatic ecosystems: including inland aquatic ecosystems, marine ecosystems, and artificial aquatic ecosystems;

Underground ecosystems: including cave ecosystems;

Step 2: Search the Internet using each subcategory name as a keyword, retrieving at least 1000 pictures corresponding to that subcategory, to complete the picture collection;

Step 3: Perform a preliminary manual screening of the collected pictures, and further screen the pictures that do not match their category, to obtain a dataset with scene category labels;

Step 4: Split the dataset into a training set and a validation set, classify the images with a deep convolutional neural network, combine the predictions with the true category labels in a confusion matrix to generate a similarity matrix, and merge categories whose similarity is too high according to the similarity matrix to obtain the final dataset. Specifically:

Step 4.1: Randomly split the original dataset into a training set and a validation set in an 80%/20% ratio;

Step 4.2: Resize the pictures in the training set to 224×224 resolution, and use the 13-layer bn-inception network structure as the scene classification network. The original bn-inception network, with two convolutional layers and a 3×3 pooling layer, converts a 224×224 picture into a 28×28 feature map, which allows faster processing in the subsequent 10 inception layers. Two of these layers have stride 2 and the rest have stride 1, so the 28×28 feature map is finally reduced to a 7×7 feature map. Finally, a global average pooling layer aggregates the activations across the spatial dimensions;

Step 4.3: Use the model obtained in step 4.2 to classify the validation set, and combine the predictions with the true classes to obtain the confusion matrix $C$. The confusion matrix shows the cross-errors between categories and thus implicitly indicates how similar they are. Formally, this similarity is defined as:

$$S = C + C^{T}$$

where $C \in \mathbb{R}^{N \times N}$ is the confusion matrix, $N$ is the number of natural scene categories, and $C_{ij}$ is the probability of classifying category $i$ as category $j$; a larger value indicates higher ambiguity between the two categories. The similarity matrix $S$ is generated by this formula, which guarantees that similarity is a symmetric measure and provides a cue for merging two similar categories;

Step 4.4: Set a suitable threshold $\tau$, here $\tau = 0.5$, and merge any two categories whose similarity exceeds $\tau$. Analyze whether the merge is reasonable. If the i-th and j-th rows are merged into one category, update the similarity matrix accordingly: delete the i-th and j-th rows and columns, and add [expression missing from the source] as a new row and column. Repeat this step until no similarity in the final dataset exceeds the threshold or other requirements are met. The resulting dataset is better suited to machine vision processing.

2. The method for constructing a natural scene dataset suitable for machine vision according to claim 1, characterized in that the ecosystem subcategories must satisfy:

Condition 1: each subcategory belongs to a certain ecosystem;

Condition 2: the subcategories are mutually independent.

3. The method for constructing a natural scene dataset suitable for machine vision according to claim 1, characterized in that the subcategories are divided into:

Natural terrestrial ecosystems: polar and alpine tundra, semi-desert, coniferous forest, deciduous forest, natural grassland, wasteland, evergreen broad-leaved forest, Mediterranean-type shrubland, desert, mountain, tropical rainforest;

Artificial terrestrial ecosystems: artificial grassland, farmland, greenhouse, city;

Inland aquatic ecosystems: swamp, wetland, river, lake;

Marine ecosystems: coastal, coral reef, deep sea;

Artificial aquatic ecosystems: pond;

Cave ecosystems: cave.
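The grouping in claim 3 can be written down as a nested mapping, which makes the two subcategory conditions easy to check mechanically. The sketch below is illustrative only: the names `TAXONOMY` and `all_subcategories` and the English label strings are assumptions, not part of the patent, and "mutually independent" is approximated here as "no label is listed under two ecosystems".

```python
# Hypothetical encoding of the ecosystem -> subcategory grouping from claim 3.
# The English label strings are translations chosen for illustration.
TAXONOMY = {
    "terrestrial": {
        "natural terrestrial": [
            "polar and alpine tundra", "semi-desert", "coniferous forest",
            "deciduous forest", "natural grassland", "wasteland",
            "evergreen broad-leaved forest", "Mediterranean-type shrubland",
            "desert", "mountain", "tropical rainforest",
        ],
        "artificial terrestrial": [
            "artificial grassland", "farmland", "greenhouse", "city",
        ],
    },
    "aquatic": {
        "inland aquatic": ["swamp", "wetland", "river", "lake"],
        "marine": ["coastal", "coral reef", "deep sea"],
        "artificial aquatic": ["pond"],
    },
    "underground": {
        "cave": ["cave"],
    },
}


def all_subcategories(taxonomy):
    """Flatten the taxonomy into (ecosystem, subcategory) pairs."""
    return [
        (ecosystem, label)
        for groups in taxonomy.values()
        for ecosystem, labels in groups.items()
        for label in labels
    ]


pairs = all_subcategories(TAXONOMY)
labels = [label for _, label in pairs]

# Condition 1 holds by construction: each label sits under one ecosystem.
# Condition 2 (approximated): no label appears twice.
assert len(labels) == len(set(labels))
print(len(labels))  # number of scene classes in the dataset
```

Keeping the ecosystem alongside each label also makes it straightforward to re-check both conditions whenever a new subcategory is added.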

4. The method for constructing a natural scene dataset suitable for machine vision according to claim 1, characterized in that a weather adjective is added to the category name in step 2, the weather adjectives including: sunny, cloudy, rainy, snowy.

5. The method for constructing a natural scene dataset suitable for machine vision according to claim 1, characterized in that in step 3 the pictures that do not match their category are picked out and pooled as unclassified pictures; if M pictures are collected in total across N categories, then when the number of pictures that cannot be classified exceeds N/M, a new category is constructed, and the new category satisfies the subcategory conditions stated in step 1.

The invention therefore has the following advantages: (1) in a dataset constructed by this method, the picture samples within each subcategory are diverse and the overall quality of the dataset is high; (2) different subcategories are not easily confused, which can improve test accuracy.

Description of Drawings

Fig. 1 is the overall block diagram of the invention.

Fig. 2 is the structure diagram of the 13-layer bn-inception network used in the embodiment.

Detailed Description

To set out the purpose, technical solution, and beneficial effects of the invention more clearly, they are further described below with reference to the drawings and an embodiment. It should be understood that the invention is not limited to what the embodiment discloses; the scope of protection is that defined by the claims.

As shown in Fig. 1, the invention discloses a method for constructing a natural scene dataset suitable for machine vision. The steps in a specific implementation are as follows:

Step 1: Preliminarily determine the categories contained in the dataset according to ecosystem type:

Step 1.1: Ecosystem types are divided into three basic categories: terrestrial ecosystems, aquatic ecosystems, and underground ecosystems;

Step 1.2: Each category is divided further. Terrestrial ecosystems are divided into natural and artificial terrestrial ecosystems; aquatic ecosystems into inland aquatic, marine, and artificial aquatic ecosystems; underground ecosystems include cave ecosystems;

Step 1.3: These categories are further divided into subcategories. The subcategories can be defined according to the number of categories required, subject to two conditions: (1) each subcategory belongs to a certain ecosystem; (2) the subcategories are mutually independent;

Preferably, the subcategories are: coastal, polar and alpine tundra, swamp, semi-desert, coniferous forest, deciduous forest, natural grassland, wasteland, evergreen broad-leaved forest, Mediterranean-type shrubland, desert, mountain, tropical rainforest, wetland, artificial grassland, farmland, greenhouse, river, lake, pond, coral reef, deep sea, city, cave.

Step 2: Search for and collect the corresponding pictures:

For a given category C, enter C as the keyword in an online image search engine and download the resulting pictures. To increase the diversity of the pictures, add weather-related words (for example sunny, rainy, cloudy) to the keyword. Discard pictures that are too small, and complete the download for all categories.
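The keyword construction and size filter in step 2 might be sketched as follows. The actual search and download call is left out because the source names no specific search engine. The helper names `build_queries` and `large_enough`, the sample category names, and the `min_side=224` cutoff are all assumptions made for illustration; the source only says that pictures that are too small are excluded.

```python
from itertools import product

# Assumed inputs: a few subcategory names and the weather words from the text.
subcategories = ["desert", "coral reef", "wetland"]
weather_words = ["sunny", "rainy", "cloudy"]


def build_queries(names, modifiers):
    """Return each bare name plus every 'modifier name' combination."""
    queries = list(names)
    queries += [f"{m} {n}" for n, m in product(names, modifiers)]
    return queries


def large_enough(width, height, min_side=224):
    """Size filter; min_side=224 is an assumed cutoff, the source gives none."""
    return min(width, height) >= min_side


queries = build_queries(subcategories, weather_words)
print(queries[:5])
```

Each query string would then be passed to whichever image search tool is in use, and `large_enough` applied to the dimensions of every downloaded file before it enters manual screening.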

Step 3: Screen the collected pictures:

Step 3.1: Manually screen the downloaded pictures, keeping those that match their category and moving those that do not into a pool of unclassified pictures;

Step 3.2: Classify the remaining unclassified pictures. If a large number of pictures cannot be assigned a suitable category, consider constructing a new category that satisfies the subcategory conditions stated in step 1;

Step 3.3: Remove the few images whose scene cannot be determined, completing the category confirmation for all pictures.

Step 4: Merge similar categories using the similarity matrix:

Step 4.1: Randomly split the original dataset into a training set and a validation set in an 80%/20% ratio;
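A minimal sketch of the 80%/20% random split in step 4.1, using only the Python standard library. The function name `split_dataset`, the fixed seed, and the placeholder `(path, label)` pairs are assumptions; the source does not say whether the split is stratified by category, so this is a plain random split over all samples.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of `samples` and cut it into (train, validation)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical (path, label) pairs standing in for the screened pictures.
samples = [(f"img_{i:04d}.jpg", i % 24) for i in range(1000)]
train, val = split_dataset(samples)
print(len(train), len(val))
```

If some categories are much smaller than others, splitting each category separately would keep the 80/20 ratio within every class; that variant is not described in the source.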

Step 4.2: Resize the pictures in the training set to 224×224 resolution and, as shown in Fig. 2, use the 13-layer bn-inception network structure as the scene classification network. The original bn-inception network, with two convolutional layers and a 3×3 pooling layer, converts a 224×224 picture into a 28×28 feature map. This small feature map allows faster processing in the subsequent 10 inception layers. Two of these layers have stride 2 and the rest have stride 1, so the 28×28 feature map is finally reduced to a 7×7 feature map. Finally, a global average pooling layer aggregates the activations across the spatial dimensions;
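The feature-map sizes quoted in step 4.2 can be checked with a few lines of arithmetic. This sketch does not build the network; it only tracks the spatial size. The positions of the two stride-2 layers in `inception_strides` are an assumption (the text states only that two of the ten layers have stride 2), and `global_average_pool` is a plain-Python stand-in for the pooling layer acting on a single channel.

```python
import math


def downsample(size, stride):
    """Spatial size after a stride-`stride` layer with 'same'-style padding."""
    return math.ceil(size / stride)


input_size = 224
stem_output = 28  # stated in the text: 224x224 -> 28x28
stem_factor = input_size // stem_output  # overall downsampling of the stem

# Ten inception layers, two with stride 2. Their positions are assumed here.
inception_strides = [1, 1, 2, 1, 1, 1, 1, 2, 1, 1]

size = stem_output
for stride in inception_strides:
    size = downsample(size, stride)


def global_average_pool(feature_map):
    """Average one channel's HxW grid down to a single number."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


print(stem_factor, size)
print(global_average_pool([[1.0] * size for _ in range(size)]))
```

Wherever the two stride-2 layers sit, the final size is the same: 28 halves twice to 7, and global average pooling then reduces each 7×7 channel to one value.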

Step 4.3: Use the model obtained in step 4.2 to classify the validation set, and combine the predictions with the true classes to obtain the confusion matrix $C$. The confusion matrix shows the cross-errors between categories and thus implicitly indicates how similar they are. Formally, this similarity is defined as:

$$S = C + C^{T}$$

where $C \in \mathbb{R}^{N \times N}$ is the confusion matrix, $N$ is the number of natural scene categories, and $C_{ij}$ is the probability of classifying category $i$ as category $j$; a larger value indicates higher ambiguity between the two categories. The similarity matrix $S$ is generated by this formula, which guarantees that similarity is a symmetric measure and provides a cue for merging two similar categories;
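The construction of the similarity matrix from validation predictions might look like the sketch below. It assumes that the confusion matrix is row-normalized so that each entry can be read as a probability; the source calls the entries probabilities but does not state the normalization. The label lists are made-up toy data, and the function names are illustrative.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Row-normalized confusion matrix: C[i][j] ~ P(predict j | true i)."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def similarity_matrix(C):
    """S = C + C^T, symmetric by construction."""
    n = len(C)
    return [[C[i][j] + C[j][i] for j in range(n)] for i in range(n)]


# Made-up validation labels for three classes; classes 0 and 1 get confused.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 0, 0, 1, 0, 1, 2, 2, 2, 2]

C = confusion_matrix(y_true, y_pred, 3)
S = similarity_matrix(C)
print(S[0][1], S[1][0], S[0][2])
```

Only the off-diagonal entries of `S` are relevant when looking for categories to merge; the diagonal simply doubles each class's own accuracy.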

Step 4.4: Set a suitable threshold $\tau$, here $\tau = 0.5$, and merge any two categories whose similarity exceeds $\tau$. Analyze whether the merge is reasonable. If the i-th and j-th rows are merged into one category, update the similarity matrix accordingly: delete the i-th and j-th rows and columns, and add [expression missing from the source] as a new row and column. Repeat this step until no similarity in the final dataset exceeds the threshold or other requirements are met. The resulting dataset is better suited to machine vision processing.
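The iterative merge in step 4.4 can be sketched as a loop over the similarity matrix. Two points here are assumptions rather than the patent's method. First, the source text omits the expression for the new row and column, so this sketch uses the element-wise mean of the two merged rows. Second, it merges the most similar qualifying pair first. The manual "is this merge reasonable" check is represented by a hypothetical `is_reasonable` callback that accepts everything by default, and the category names are made up.

```python
def merge_once(S, groups, tau=0.5, is_reasonable=lambda a, b: True):
    """Merge the most similar pair above tau; return False if none qualifies."""
    n = len(S)
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            if S[i][j] > tau and is_reasonable(groups[i], groups[j]):
                if best is None or S[i][j] > S[best[0]][best[1]]:
                    best = (i, j)
    if best is None:
        return False
    i, j = best
    keep = [k for k in range(n) if k not in (i, j)]
    # ASSUMPTION: similarity of the merged class to class k is the mean of
    # S[i][k] and S[j][k]; the source text omits this expression.
    merged = [(S[i][k] + S[j][k]) / 2 for k in keep]
    new_S = [[S[a][b] for b in keep] + [merged[idx]]
             for idx, a in enumerate(keep)]
    new_S.append(merged + [0.0])  # self-similarity is unused, set to 0
    new_groups = [groups[k] for k in keep] + [groups[i] + groups[j]]
    S[:] = new_S          # update the matrix in place
    groups[:] = new_groups
    return True


# Toy similarity matrix with made-up category names.
S = [[0.0, 1.0, 0.1],
     [1.0, 0.0, 0.2],
     [0.1, 0.2, 0.0]]
groups = [["forest farm"], ["forest"], ["desert"]]

while merge_once(S, groups):
    pass
print(groups)
```

Passing a stricter `is_reasonable` function is one way to encode the manual review, for example by refusing to merge categories that belong to different top-level ecosystems.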

The specific embodiment described herein merely illustrates the spirit of the invention. Those skilled in the art to which the invention belongs may make various modifications or additions to the described embodiment, or substitute it in similar ways, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims (5)

1. A method for constructing a natural scene dataset suitable for machine vision, characterized by comprising the following steps:

Step 1: Preliminarily determine the categories and corresponding subcategories contained in the dataset according to ecosystem type, where ecosystem types are divided into:

Terrestrial ecosystems: including natural terrestrial ecosystems and artificial terrestrial ecosystems;

Aquatic ecosystems: including inland aquatic ecosystems, marine ecosystems, and artificial aquatic ecosystems;

Underground ecosystems: including cave ecosystems;

Step 2: Search the Internet using each subcategory name as a keyword, retrieving at least 1000 pictures corresponding to that subcategory, to complete the picture collection;

Step 3: Perform a preliminary manual screening of the collected pictures, and further screen the pictures that do not match their category, to obtain a dataset with scene category labels;

Step 4: Split the dataset into a training set and a validation set, classify the images with a deep convolutional neural network, combine the predictions with the true category labels in a confusion matrix to generate a similarity matrix, and merge categories whose similarity is too high according to the similarity matrix to obtain the final dataset. Specifically:

Step 4.1: Randomly split the original dataset into a training set and a validation set in an 80%/20% ratio;

Step 4.2: Resize the pictures in the training set to 224×224 resolution, and use the 13-layer bn-inception network structure as the scene classification network. The original bn-inception network, with two convolutional layers and a 3×3 pooling layer, converts a 224×224 picture into a 28×28 feature map, which allows faster processing in the subsequent 10 inception layers. Two of these layers have stride 2 and the rest have stride 1, so the 28×28 feature map is finally reduced to a 7×7 feature map. Finally, a global average pooling layer aggregates the activations across the spatial dimensions;

Step 4.3: Use the model obtained in step 4.2 to classify the validation set, and combine the predictions with the true classes to obtain the confusion matrix $C$. The confusion matrix shows the cross-errors between categories and thus implicitly indicates how similar they are. Formally, this similarity is defined as:

$$S = C + C^{T}$$

where $C \in \mathbb{R}^{N \times N}$ is the confusion matrix, $N$ is the number of natural scene categories, and $C_{ij}$ is the probability of classifying category $i$ as category $j$; a larger value indicates higher ambiguity between the two categories. The similarity matrix $S$ is generated by this formula, which guarantees that similarity is a symmetric measure and provides a cue for merging two similar categories;

Step 4.4: Set a suitable threshold $\tau$, here $\tau = 0.5$, and merge any two categories whose similarity exceeds $\tau$. Analyze whether the merge is reasonable. If the i-th and j-th rows are merged into one category, update the similarity matrix accordingly: delete the i-th and j-th rows and columns, and add [expression missing from the source] as a new row and column. Repeat this step until no similarity in the final dataset exceeds the threshold or other requirements are met. The resulting dataset is better suited to machine vision processing.

2. The method for constructing a natural scene dataset suitable for machine vision according to claim 1, characterized in that the ecosystem subcategories must satisfy:

Condition 1: each subcategory belongs to a certain ecosystem;

Condition 2: the subcategories are mutually independent.

3. The method for constructing a natural scene dataset suitable for machine vision according to claim 1, characterized in that the subcategories are divided into:

Natural terrestrial ecosystems: polar and alpine tundra, semi-desert, coniferous forest, deciduous forest, natural grassland, wasteland, evergreen broad-leaved forest, Mediterranean-type shrubland, desert, mountain, tropical rainforest;

Artificial terrestrial ecosystems: artificial grassland, farmland, greenhouse, city;

Inland aquatic ecosystems: swamp, wetland, river, lake;

Marine ecosystems: coastal, coral reef, deep sea;

Artificial aquatic ecosystems: pond;

Cave ecosystems: cave.

4. The method for constructing a natural scene dataset suitable for machine vision according to claim 1, characterized in that a weather adjective is added to the category name in step 2, the weather adjectives including: sunny, cloudy, rainy, snowy.

5. The method for constructing a natural scene dataset suitable for machine vision according to claim 1, characterized in that in step 3 the pictures that do not match their category are picked out and pooled as unclassified pictures; if M pictures are collected in total across N categories, then when the number of pictures that cannot be classified exceeds N/M, a new category is constructed, and the new category satisfies the subcategory conditions stated in step 1.
CN201910561320.7A 2019-06-26 2019-06-26 A Construction Method of Natural Scene Dataset Suitable for Machine Vision Pending CN110414561A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910561320.7A CN110414561A (en) 2019-06-26 2019-06-26 A Construction Method of Natural Scene Dataset Suitable for Machine Vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910561320.7A CN110414561A (en) 2019-06-26 2019-06-26 A Construction Method of Natural Scene Dataset Suitable for Machine Vision

Publications (1)

Publication Number Publication Date
CN110414561A true CN110414561A (en) 2019-11-05

Family

ID=68359871

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910561320.7A Pending CN110414561A (en) 2019-06-26 2019-06-26 A Construction Method of Natural Scene Dataset Suitable for Machine Vision

Country Status (1)

Country Link
CN (1) CN110414561A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111062441A (en) * 2019-12-18 2020-04-24 武汉大学 Scene classification method and device based on self-supervision mechanism and regional suggestion network
CN111104898A (en) * 2019-12-18 2020-05-05 武汉大学 Image scene classification method and device based on target semantics and attention mechanism
CN111444819A (en) * 2020-03-24 2020-07-24 北京百度网讯科技有限公司 Cutting frame determining method, network training method, device, equipment and storage medium
CN112231535A (en) * 2020-10-23 2021-01-15 山东科技大学 A method, processing device and storage medium for making a multimodal data set in the field of agricultural pests and diseases
CN112766427A (en) * 2021-04-06 2021-05-07 浙江口碑网络技术有限公司 Training method, device and equipment of image recognition model
CN113569986A (en) * 2021-08-18 2021-10-29 网易(杭州)网络有限公司 Computer vision data classification method and device, electronic equipment and storage medium
CN113902984A (en) * 2021-09-06 2022-01-07 特赞(上海)信息科技有限公司 Optimization method and device for scene classification model

Citations (4)

Publication number Priority date Publication date Assignee Title
US20110026840A1 (en) * 2009-07-28 2011-02-03 Samsung Electronics Co., Ltd. System and method for indoor-outdoor scene classification
CN103390046A (en) * 2013-07-20 2013-11-13 西安电子科技大学 Multi-scale dictionary natural scene image classification method based on latent Dirichlet model
CN108921037A (en) * 2018-06-07 2018-11-30 四川大学 A kind of Emotion identification method based on BN-inception binary-flow network
CN108921190A (en) * 2018-05-24 2018-11-30 北京飞搜科技有限公司 A kind of image classification method, device and electronic equipment

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US20110026840A1 (en) * 2009-07-28 2011-02-03 Samsung Electronics Co., Ltd. System and method for indoor-outdoor scene classification
CN103390046A (en) * 2013-07-20 2013-11-13 西安电子科技大学 Multi-scale dictionary natural scene image classification method based on latent Dirichlet model
CN108921190A (en) * 2018-05-24 2018-11-30 北京飞搜科技有限公司 A kind of image classification method, device and electronic equipment
CN108921037A (en) * 2018-06-07 2018-11-30 四川大学 A kind of Emotion identification method based on BN-inception binary-flow network

Non-Patent Citations (2)

Title
Yanfei Liu et al.: "Scene Semantic Classification Based on Random-Scale Stretched Convolutional Neural Network for High-Spatial Resolution Remote Sensing Imagery", IGARSS 2016 *
Gong Xi et al.: "Scene classification method for high-resolution remote sensing images fusing global and local deep features", Acta Optica Sinica *

Cited By (11)

Publication number Priority date Publication date Assignee Title
CN111062441A (en) * 2019-12-18 2020-04-24 武汉大学 Scene classification method and device based on self-supervision mechanism and regional suggestion network
CN111104898A (en) * 2019-12-18 2020-05-05 武汉大学 Image scene classification method and device based on target semantics and attention mechanism
CN111104898B (en) * 2019-12-18 2022-03-25 武汉大学 Image scene classification method and device based on target semantics and attention mechanism
CN111444819A (en) * 2020-03-24 2020-07-24 北京百度网讯科技有限公司 Cutting frame determining method, network training method, device, equipment and storage medium
CN111444819B (en) * 2020-03-24 2024-01-23 北京百度网讯科技有限公司 Cut frame determining method, network training method, device, equipment and storage medium
CN112231535A (en) * 2020-10-23 2021-01-15 山东科技大学 A method, processing device and storage medium for making a multimodal data set in the field of agricultural pests and diseases
CN112231535B (en) * 2020-10-23 2022-11-15 山东科技大学 Method for making multi-modal data set in field of agricultural diseases and insect pests, processing device and storage medium
CN112766427A (en) * 2021-04-06 2021-05-07 浙江口碑网络技术有限公司 Training method, device and equipment of image recognition model
CN113569986A (en) * 2021-08-18 2021-10-29 网易(杭州)网络有限公司 Computer vision data classification method and device, electronic equipment and storage medium
CN113569986B (en) * 2021-08-18 2023-06-30 网易(杭州)网络有限公司 Computer vision data classification method, device, electronic equipment and storage medium
CN113902984A (en) * 2021-09-06 2022-01-07 特赞(上海)信息科技有限公司 Optimization method and device for scene classification model

Similar Documents

Publication Publication Date Title
CN110414561A (en) A Construction Method of Natural Scene Dataset Suitable for Machine Vision
CN115934990B (en) Remote sensing image recommendation method based on content understanding
CN111291639B (en) Cross-source ship feature fusion learning and recognition method based on hierarchical variational autoencoder
CN112418117A (en) A small target detection method based on UAV images
CN108446312A (en) Remote sensing image search method based on depth convolution semantic net
CN108764018A (en) Multi-task vehicle re-identification method and device based on convolutional neural networks
CN116524189A (en) High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization
Guo et al. Adaln: a vision transformer for multidomain learning and predisaster building information extraction from images
Zhang et al. An improved multi‐task pointwise network for segmentation of building roofs in airborne laser scanning point clouds
CN114519819A (en) Remote sensing image target detection method based on global context awareness
Lin et al. An unsupervised transformer-based multivariate alteration detection approach for change detection in VHR remote sensing images
Kleban et al. Global annotation on georeferenced photographs
Oñate-Valdivieso et al. Spatiotemporal dynamics of soil impermeability and its impact on the hydrology of an urban basin
Ye et al. M2f2-net: Multi-modal feature fusion for unstructured off-road freespace detection
Xu et al. A large-scale remote sensing scene dataset construction for semantic segmentation
Lu et al. A geoscience-aware network (GASlumNet) combining UNet and ConvNeXt for slum mapping
Shi et al. Fine object change detection based on vector boundary and deep learning with high-resolution remote sensing images
CN113963271A (en) Model for identifying impervious surfaces from remote sensing images and method for training the model
CN112836590A (en) Flood disaster monitoring method, device, electronic device and storage medium
CN115063692B (en) Remote sensing image scene classification method based on active learning
CN115456957B (en) Method for detecting change of remote sensing image by full-scale feature aggregation
CN114677831B (en) Channel security situation analysis method and system based on high-resolution remote sensing image
CN116524258A (en) A landslide detection method and system based on multi-label classification
CN117132914A (en) Method and system for identifying large model of universal power equipment
Liu et al. A Memory Guided Network and A Novel Dataset for Cropland Semantic Change Detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191105