CN112364844B - Data acquisition method and system based on computer vision technology - Google Patents

Data acquisition method and system based on computer vision technology

Info

Publication number
CN112364844B
Authority
CN
China
Prior art keywords
image
data
training
acquiring
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110032387.9A
Other languages
Chinese (zh)
Other versions
CN112364844A (en)
Inventor
王兆君
金震
张京日
康进港
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SunwayWorld Science and Technology Co Ltd
Original Assignee
Beijing SunwayWorld Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing SunwayWorld Science and Technology Co Ltd filed Critical Beijing SunwayWorld Science and Technology Co Ltd
Priority to CN202110032387.9A
Publication of CN112364844A
Application granted
Publication of CN112364844B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/62Analysis of geometric attributes of area, perimeter, diameter or volume
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Abstract

The invention provides a data acquisition method and system based on a computer vision technology, comprising the following steps: acquiring a target object identified by a computer, and photographing the target object to acquire a target image; training the target image in a preset convolutional neural network, and storing the training result into a data acquisition model; and identifying, analyzing and acquiring the data of the target image based on the computer vision technology and the data acquisition model. By acquiring the target image and training it through the convolutional neural network, accurate and efficient data analysis and acquisition can be achieved through the computer vision technology and a data acquisition model built on the training result.

Description

Data acquisition method and system based on computer vision technology
Technical Field
The invention relates to the technical field of computer vision, in particular to a data acquisition method and system based on a computer vision technology.
Background
Image data acquisition is an intelligent detection technology for detecting an object to be detected by acquiring and analyzing image data of the object to be detected.
However, image data is usually acquired manually, which not only leads to long acquisition times and low efficiency, but also to high labor costs and low accuracy of the acquired image data.
Disclosure of Invention
The invention provides a data acquisition method and a data acquisition system based on a computer vision technology, which are used for accurately analyzing and acquiring data of a target image through the computer vision technology.
The invention provides a data acquisition method based on a computer vision technology, which comprises the following steps:
acquiring a target object identified by a computer, and photographing the target object to acquire a target image;
training the target image in a preset convolutional neural network, and storing a training result into a data acquisition model;
and identifying, analyzing and collecting the data of the target image based on a computer vision technology and the data collection model.
Preferably, after the target object identified by the computer is obtained and before the photographing and sampling of the target object, the data acquisition method based on the computer vision technology further includes:
determining effective information of the target object based on the gray matrix texture, and determining the position characteristic of the target object based on an Euclidean distance mapping method;
preprocessing the position characteristics of the target object and the effective information to acquire processing information;
establishing an analysis file, and analyzing and processing the processing information by using the analysis file so as to mark each element point in the target object;
marking each element point in the target object to obtain a marking result of the element point;
evaluating the marking result of the element points based on a preset evaluation index, and acquiring an evaluation result;
meanwhile, based on the evaluation result, the target object is subjected to area delineation, and the delineated target object is photographed through the computer to obtain the target image.
Preferably, a data acquisition method based on computer vision technology,
the valid information includes: the color of the target object, the texture window of the target object, the target angle, the target gray texture and the color texture feature.
Preferably, a data acquisition method based on computer vision technology, after the target object is photographed, further includes:
acquiring an initial image of the target object, acquiring a gray value of each pixel in the initial image, and acquiring a gray matrix based on the gray value;
extracting the gray distribution trend of the row/column of the gray matrix, and acquiring a balance weight array of the initial image according to the gray distribution trend;
correcting the gray matrix based on the balance weight array to obtain an initial gray image;
acquiring M classification sets and a filtering parameter corresponding to each classification set;
the classification set is obtained by classifying the gray pixels of the current initial gray image according to a preset classification mode;
filtering each gray pixel of the initial gray image according to the filtering parameters corresponding to the classification set to obtain an initial gray filtering image;
partitioning the initial gray level filtering image to obtain sub image blocks of N initial gray level filtering images;
acquiring the main direction of the texture of the sub image block based on a preset image texture analysis file;
meanwhile, according to a preset algorithm, obtaining gradient values of pixel gray values corresponding to the sub-image blocks in the x direction or the y direction, and calculating an average value of the gradient values in the x direction or the y direction;
meanwhile, a gradient threshold value of the gradient value is obtained, and binarization processing is carried out on the main direction of the texture of the sub-image block based on the gradient threshold value;
taking the average value of the gradient values as a clustering center, and combining the sub-image blocks into a complete binary image according to the clustering center;
removing impurities in the area of the binary image according to a preset area threshold value, and setting the length-width ratio of the binary image;
removing non-strip-shaped impurities in the binary image according to the length-width ratio to obtain a final processed image;
and the final processed image is the target image.
Preferably, a data acquisition method based on computer vision technology, a process of training the target image in a preset convolutional neural network, includes:
acquiring a feature map area in a preset convolutional neural network, extracting a preset size in the feature map area,
dividing the complete region of the target image according to the preset size, and acquiring a plurality of divided independent sub-image regions;
generating the coordinates of each pixel point of the sub-graph area into the relative coordinates of the pixel points at the corresponding position of the feature graph, and acquiring a coordinate channel corresponding to the feature graph;
and carrying out deep learning on the target image with the coordinate channel according to the parameters of the convolutional neural network, and generating a training result containing the characteristic diagram.
Preferably, the working process of storing the training result to the data acquisition model comprises the following steps:
extracting feature data of the target image based on the training result;
analyzing and judging whether redundant data exist in the characteristic data according to a preset algorithm;
if the characteristic data contains redundant data, removing the redundant data in the characteristic data to obtain simplified data;
classifying the simplified data according to a preset format type;
if the characteristic data does not contain redundant data, directly classifying the characteristic data;
sequencing the sorted characteristic data according to data types, and matching the sequenced characteristic data with a matching segment in the data acquisition model;
and storing the characteristic data into the data acquisition model according to the matching result.
Preferably, after the feature data is stored in the data acquisition model, the method for acquiring data based on computer vision technology further includes:
and screening the characteristic data in the data acquisition model based on the current task of acquiring the specific position data of the target object by the computer, and acquiring final data.
Preferably, before the target image is trained in a preset convolutional neural network, the data acquisition method based on a computer vision technology further includes:
based on the target image, acquiring the spatial density of the target image in the convolutional neural network, and meanwhile, constructing a training space of the target image in the convolutional neural network according to the spatial density, wherein the specific working process comprises the following steps:
obtaining a labeled sample set of the target image D = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}, wherein x_i is a training sample, y_i is the label corresponding to the training sample x_i, y_i ∈ {1, 2, ..., c}, m is the number of training samples, and c is the number of categories;
based on the labeled sample set D, obtaining the sample distance d_ij between the training samples of the labeled sample set D, wherein d_ij denotes the distance from the i-th labeled sample x_i to the j-th labeled sample x_j;
arranging the sample distances d_ij between the m training samples in ascending order to obtain d_1 ≤ d_2 ≤ ... ≤ d_M, wherein the subscripts 1, 2, ..., M indicate that the ordered distances increase sequentially, and d_M is the maximum sample distance;
determining a truncation distance d_c of the training samples based on the ranking result:
d_c = d_⌈αM⌉
wherein d_c represents the truncation distance of the training samples; M represents the number of ordered sample distances; α is a constant with the value range (0, 1);
based on the truncation distance d_c of the training samples and the sample distances d_ij between the training samples, acquiring the spatial density ρ_i of the target image in the convolutional neural network:
ρ_i = Σ_{j≠i} χ(d_c − d_ij) · exp(−d_ij² / (2σ²))
wherein ρ_i represents the spatial density of the target image in the convolutional neural network; d_ij represents the distance between x_i and x_j, with i ≠ j; d_c represents the truncation distance of the training samples; σ² represents the training sample variance; ρ_0 represents a density threshold; χ(·) is an impulse function taking the value 1 when its argument is positive and 0 otherwise;
comparing the spatial density ρ_i of the target image in the convolutional neural network with the density threshold ρ_0;
if ρ_i ≥ ρ_0, determining that the training sample x_i is a core point;
if ρ_i < ρ_0 and a core point exists within the truncation-distance (d_c) neighborhood of the training sample x_i, the training sample x_i is a boundary point;
if the training sample x_i is neither a core point nor a boundary point, the training sample x_i is a noise point;
and constructing a training space of the target image in the convolutional neural network based on the core points, the boundary points and the noise points.
The invention provides a data acquisition system based on computer vision technology, comprising:
a target image acquisition module: acquiring a target object identified by a computer, and photographing the target object based on a computer vision technology to acquire a target image;
a training module: training the target image in a preset convolutional neural network, and storing a training result into a data acquisition model;
a data acquisition module: and acquiring the data of the target image based on the data acquisition model.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of a data collection method based on computer vision technology according to an embodiment of the present invention;
fig. 2 is a structural diagram of a data acquisition system based on computer vision technology in an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Example 1:
the invention provides a data acquisition method based on a computer vision technology, as shown in fig. 1, comprising the following steps:
step 1: acquiring a target object identified by a computer, and photographing the target object to acquire a target image;
step 2: training the target image in a preset convolutional neural network, and storing a training result into a data acquisition model;
and step 3: and identifying, analyzing and collecting the data of the target image based on a computer vision technology and the data collection model.
In this embodiment, the convolutional neural network is a feedforward neural network whose artificial neurons respond to surrounding units, making it suitable for large-scale image processing. The convolutional neural network comprises convolutional layers and pooling layers, and may be a one-dimensional, two-dimensional or three-dimensional convolutional neural network.
In this embodiment, the image data is identified and analyzed mainly by analyzing the pixels and gray levels of the image and its red, green and blue primary color components.
In this embodiment, the target object may be a real object, such as a desk, a textbook, etc.;
in this embodiment, the target image may be a target image obtained by photographing a target object.
The beneficial effects of the above technical scheme are:
by acquiring the target image and training through the convolutional neural network, accurate and efficient data analysis and acquisition can be effectively realized through a computer vision technology and a data acquisition model based on a training result.
Example 2:
on the basis of embodiment 1, the present invention provides a data acquisition method based on a computer vision technology, which further includes, after acquiring a target object identified by a computer and before taking a picture of the target object for sampling:
determining effective information of the target object based on the gray matrix texture, and determining the position characteristic of the target object based on an Euclidean distance mapping method;
preprocessing the position characteristics of the target object and the effective information to acquire processing information;
establishing an analysis file, and analyzing and processing the processing information by using the analysis file so as to mark each element point in the target object;
marking each element point in the target object to obtain a marking result of the element point;
evaluating the marking result of the element points based on a preset evaluation index, and acquiring an evaluation result;
meanwhile, based on the evaluation result, the target object is subjected to area delineation, and the delineated target object is photographed through the computer to obtain the target image.
In this embodiment, the valid information includes: color of the target object, texture window of the target object, target angle, target gray-scale texture, and color texture features.
In this embodiment, the Euclidean distance mapping method may operate on a binary image in which white is taken as the foreground color and black as the background color: the value of each foreground pixel is converted into the distance from that point to the nearest background point, from which the position feature of the target object can be determined;
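As an illustrative sketch of this Euclidean distance mapping step, assuming OpenCV, a simple fixed threshold and a hypothetical input file:

```python
# Hedged sketch: Euclidean distance map of a binary image (white = foreground).
import cv2

gray = cv2.imread("target.png", cv2.IMREAD_GRAYSCALE)         # hypothetical input
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # assumed threshold

# Each foreground pixel's value becomes its Euclidean distance to the
# nearest background (black) pixel, as described in this embodiment.
dist_map = cv2.distanceTransform(binary, cv2.DIST_L2, 5)

# The maximum of the map lies deepest inside the object and can serve as
# one simple position feature of the target object.
_, max_val, _, max_loc = cv2.minMaxLoc(dist_map)
```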
in this embodiment, the position feature may be the posture of the target object and the proportion of the photo space occupied by the target object.
In this embodiment, the effective information is preprocessed mainly to remove interference information of the effective information, and the effective information is converted into information that can be recognized by the analysis file.
In this embodiment, the analysis file may be a data packet of file processing, where the data types included in the data packet include int type, float type, string type, and the like.
In this embodiment, the evaluation index is used to evaluate the marking result of the element point, and the index is used to measure whether the marking result of the element point is accurate.
In this embodiment, the element points are determined based on the position of the target object, and for example, the position of the target object may be in a hospital acquisition room or the like.
In this embodiment, the marking result may be a marking result based on the determined position after the element point is marked.
In this embodiment, the area delineation of the target object may be performed by delineating an effective image, for example, the target object includes: books and pencils, and ultimately books, need to be circled to obtain the final target image.
The beneficial effects of the above technical scheme are:
the position characteristics of the target object can be accurately determined by determining the effective information of the target object and the Euclidean distance mapping method, and the target image is positioned by analyzing and evaluating, so that the target object is photographed and sampled.
Example 3:
on the basis of the embodiment 2, the invention provides a data acquisition method based on computer vision technology,
the valid information includes: the color of the target object, the texture window of the target object, the target angle, the target gray texture and the color texture feature.
The beneficial effects of the above technical scheme are:
the target object can be accurately analyzed by acquiring the effective information, and the high efficiency of acquiring the target image is realized.
Example 4:
on the basis of embodiment 1, the present invention provides a data acquisition method based on a computer vision technology, wherein after the target object is photographed, the method further includes:
acquiring an initial image of the target object, acquiring a gray value of each pixel in the initial image, and acquiring a gray matrix based on the gray value;
extracting the gray distribution trend of the row/column of the gray matrix, and acquiring a balance weight array of the initial image according to the gray distribution trend;
correcting the gray matrix based on the balance weight array to obtain an initial gray image;
acquiring M classification sets and a filtering parameter corresponding to each classification set;
the classification set is obtained by classifying the gray pixels of the current initial gray image according to a preset classification mode;
filtering each gray pixel of the initial gray image according to the filtering parameters corresponding to the classification set to obtain an initial gray filtering image;
partitioning the initial gray level filtering image to obtain sub image blocks of N initial gray level filtering images;
acquiring the main direction of the texture of the sub image block based on a preset image texture analysis file;
meanwhile, according to a preset algorithm, obtaining gradient values of pixel gray values corresponding to the sub-image blocks in the x direction or the y direction, and calculating an average value of the gradient values in the x direction or the y direction;
meanwhile, a gradient threshold value of the gradient value is obtained, and binarization processing is carried out on the main direction of the texture of the sub-image block based on the gradient threshold value;
taking the average value of the gradient values as a clustering center, and combining the sub-image blocks into a complete binary image according to the clustering center;
removing impurities in the area of the binary image according to a preset area threshold value, and setting the length-width ratio of the binary image;
removing non-strip-shaped impurities in the binary image according to the length-width ratio to obtain a final processed image;
and the final processed image is the target image.
In this embodiment, the gray distribution trend may be the distribution, among blacks of different saturations, of the ratio of the black biased toward a given saturation to the maximum black.
In this embodiment, the equalization weight array may be the set formed by equalizing the weights of the pixels in the image according to the different saturation levels in the gray distribution trend.
In this embodiment, the preset classification manner may include at least one of the following: classification based on the size of the pixels themselves; and classification based on the relationship between a pixel and its adjacent L pixels, where M and L are both positive integers.
In this embodiment, the image texture analysis file may be an image texture analysis file composed of ready-made rich patterns and textures provided by Fireworks (hereinafter, FW).
In this embodiment, the preset algorithm is to obtain a gradient value of a pixel gray value corresponding to the x direction or the y direction of the sub image block; wherein the preset algorithm may be an OTSU algorithm.
In this embodiment, the cluster center may be the cluster center of K-means.
In this embodiment, the gradient threshold refers to the maximum gradient value among the gradient values.
In this embodiment, an impurity refers to an excess portion whose area exceeds the area threshold.
In this embodiment, the non-strip-shaped impurities refer to impurities whose length-width ratio falls outside the set range.
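A hedged sketch of the tail of this pipeline: binarization with the OTSU algorithm named above, then removal of impurities by the area threshold and of non-strip-shaped impurities by the length-width ratio. The equalization, filtering and block-merging steps are omitted, and the threshold values and the keep/remove convention are assumptions.

```python
# Minimal sketch, assuming OpenCV/NumPy and hypothetical threshold values.
import cv2
import numpy as np

gray = cv2.imread("initial.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input

# Binarization; OTSU is named in this embodiment as one possible preset algorithm.
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

AREA_THRESHOLD = 50.0     # assumed preset area threshold
MIN_ASPECT_RATIO = 3.0    # assumed length-width ratio defining "strip-shaped"

num, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
clean = np.zeros_like(binary)
for i in range(1, num):                      # label 0 is the background
    x, y, w, h, area = stats[i]
    aspect = max(w, h) / max(min(w, h), 1)   # length-width ratio
    if area >= AREA_THRESHOLD and aspect >= MIN_ASPECT_RATIO:
        clean[labels == i] = 255             # keep strip-shaped regions only

# 'clean' plays the role of the final processed image, i.e. the target image.
```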
The beneficial effects of the above technical scheme are:
the method comprises the steps of obtaining a gray level image of an initial image of a target object, carrying out image filtering processing on the gray level image, determining a texture main direction of a sub-image block and a gradient threshold of a pixel gray level value, accurately determining a binary image, and carrying out impurity removal processing on the binary image to accurately obtain the target image, so that the image processing efficiency is improved, and the extraction precision of image data is improved.
Example 5:
on the basis of embodiment 1, the invention provides a data acquisition method based on a computer vision technology, which is a process of training a target image in a preset convolutional neural network, and comprises the following steps:
acquiring a feature map area in a preset convolutional neural network, extracting a preset size in the feature map area,
dividing the complete region of the target image according to the preset size, and acquiring a plurality of divided independent sub-image regions;
generating the coordinates of each pixel point of the sub-graph area into the relative coordinates of the pixel points at the corresponding position of the feature graph, and acquiring a coordinate channel corresponding to the feature graph;
and carrying out deep learning on the target image with the coordinate channel according to the parameters of the convolutional neural network, and generating a training result containing the characteristic diagram.
In this embodiment, the feature map region in the convolutional neural network is a reference map for training the target image, and the target image is placed in the region of the feature map.
In this embodiment, the coordinates of each pixel point in the sub-image region and the relative coordinates of the pixel points at the corresponding positions of the feature map are used to obtain the coordinate channel corresponding to the feature map, so that the deep learning of the target image is facilitated.
In this embodiment, the coordinate channel refers to the channel formed from the coordinates of each pixel point of the sub-image region together with the relative coordinates of the pixel points at the corresponding positions of the feature map; the correspondence between these two sets of coordinates is what is referred to as the coordinate channel.
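A minimal sketch of generating such a coordinate channel: the relative coordinates of each pixel are attached to a sub-image region as extra channels. The normalization to [0, 1] and the function name are assumptions.

```python
# Hedged sketch, assuming PyTorch tensors of shape (batch, channels, H, W).
import torch

def add_coordinate_channels(image: torch.Tensor) -> torch.Tensor:
    b, _, h, w = image.shape
    ys = torch.linspace(0.0, 1.0, h).view(1, 1, h, 1).expand(b, 1, h, w)
    xs = torch.linspace(0.0, 1.0, w).view(1, 1, 1, w).expand(b, 1, h, w)
    # Each pixel now carries its relative coordinates at the corresponding
    # feature-map position, forming the coordinate channel.
    return torch.cat([image, ys, xs], dim=1)

sub_image = torch.rand(1, 3, 32, 32)              # one divided sub-image region
with_coords = add_coordinate_channels(sub_image)  # shape (1, 5, 32, 32)
```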
The beneficial effects of the above technical scheme are:
the target image is segmented according to the size of the characteristic image area, so that a plurality of independent sub-images can be acquired, the deep learning of the target image is realized by acquiring the coordinate channel, and the training of the target image is accurately realized.
Example 6:
on the basis of embodiment 1, the invention provides a data acquisition method based on computer vision technology, and the working process of storing the training result to the data acquisition model comprises the following steps:
extracting feature data of the target image based on the training result;
analyzing and judging whether redundant data exist in the characteristic data according to a preset algorithm;
if the characteristic data contains redundant data, removing the redundant data in the characteristic data to obtain simplified data;
classifying the simplified data according to a preset format type;
if the characteristic data does not contain redundant data, directly classifying the characteristic data;
sequencing the sorted characteristic data according to data types, and matching the sequenced characteristic data with a matching segment in the data acquisition model;
and storing the characteristic data into the data acquisition model according to the matching result.
In this embodiment, redundant data refers to duplicated or superfluous entries occurring in the feature data.
In this embodiment, the simplified data is the feature data remaining after the redundant data has been removed.
In this embodiment, the classification may be performed according to the data type to which the data belongs; for example, data of int type may be grouped into one class, data of String type into another, and so on.
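For illustration only, a small sketch of this storage pipeline (redundancy removal, classification by format type, and sorting) on an assumed record format:

```python
# Hedged sketch; the (name, value) record format is an assumption.
from collections import defaultdict

feature_data = [("width", 12), ("width", 12), ("label", "desk"), ("height", 8)]

# Remove redundant (duplicated) entries, preserving order -> simplified data.
seen, simplified = set(), []
for record in feature_data:
    if record not in seen:
        seen.add(record)
        simplified.append(record)

# Classify by value type (int-like, string-like, ...) per the preset format types.
classified = defaultdict(list)
for name, value in simplified:
    classified[type(value).__name__].append((name, value))

# Sort each class so it can be matched against segments of the acquisition model.
for type_name in classified:
    classified[type_name].sort()
```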
The beneficial effects of the above technical scheme are:
the method is beneficial to improving the precision of the feature data by removing redundant data in the feature data, and can accurately realize the storage of the feature data in the data acquisition model by classifying the simplified data and performing sequencing matching.
Example 7:
on the basis of embodiment 6, the present invention provides a data acquisition method based on computer vision technology, and after storing the feature data in the data acquisition model, the method further includes:
and screening the characteristic data in the data acquisition model based on the current task of acquiring the specific position data of the target object by the computer, and acquiring final data.
In this embodiment, the acquisition task may be, for example: acquiring the image data of the yellow portion of an image.
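As one hedged illustration of such a screening task, the yellow portion of an image can be extracted with an HSV color mask; the hue bounds and file name are assumptions.

```python
# Minimal sketch, assuming OpenCV; the bounds roughly bracket yellow hues.
import cv2
import numpy as np

image = cv2.imread("target.png")                  # hypothetical input (BGR)
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
lower = np.array([20, 100, 100], dtype=np.uint8)
upper = np.array([35, 255, 255], dtype=np.uint8)
mask = cv2.inRange(hsv, lower, upper)             # yellow pixels only
final_data = cv2.bitwise_and(image, image, mask=mask)
```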
The beneficial effects of the above technical scheme are:
and the characteristic data is screened through a specific acquisition task, so that the final data can be effectively extracted.
Example 8:
on the basis of embodiment 1, the present invention provides a data acquisition method based on a computer vision technology, wherein before training the target image in a preset convolutional neural network, the method further includes:
based on the target image, acquiring the spatial density of the target image in the convolutional neural network, and meanwhile, constructing a training space of the target image in the convolutional neural network according to the spatial density, wherein the specific working process comprises the following steps:
obtaining a labeled sample set of the target image D = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}, wherein x_i is a training sample, y_i is the label corresponding to the training sample x_i, y_i ∈ {1, 2, ..., c}, m is the number of training samples, and c is the number of categories;
based on the labeled sample set D, obtaining the sample distance d_ij between the training samples of the labeled sample set D, wherein d_ij denotes the distance from the i-th labeled sample x_i to the j-th labeled sample x_j;
arranging the sample distances d_ij between the m training samples in ascending order to obtain d_1 ≤ d_2 ≤ ... ≤ d_M, wherein the subscripts 1, 2, ..., M indicate that the ordered distances increase sequentially, and d_M is the maximum sample distance;
determining a truncation distance d_c of the training samples based on the ranking result:
d_c = d_⌈αM⌉
wherein d_c represents the truncation distance of the training samples; M represents the number of ordered sample distances; α is a constant with the value range (0, 1);
based on the truncation distance d_c of the training samples and the sample distances d_ij between the training samples, acquiring the spatial density ρ_i of the target image in the convolutional neural network:
ρ_i = Σ_{j≠i} χ(d_c − d_ij) · exp(−d_ij² / (2σ²))
wherein ρ_i represents the spatial density of the target image in the convolutional neural network; d_ij represents the distance between x_i and x_j, with i ≠ j; d_c represents the truncation distance of the training samples; σ² represents the training sample variance; ρ_0 represents a density threshold; χ(·) is an impulse function taking the value 1 when its argument is positive and 0 otherwise;
comparing the spatial density ρ_i of the target image in the convolutional neural network with the density threshold ρ_0;
if ρ_i ≥ ρ_0, determining that the training sample x_i is a core point;
if ρ_i < ρ_0 and a core point exists within the truncation-distance (d_c) neighborhood of the training sample x_i, the training sample x_i is a boundary point;
if the training sample x_i is neither a core point nor a boundary point, the training sample x_i is a noise point;
and constructing a training space of the target image in the convolutional neural network based on the core points, the boundary points and the noise points.
In this embodiment, obtaining a labeled sample set of the target image refers to carrying out a specific part of the training on the target image, the labels of the sample set being used to identify the samples.
In this embodiment, the truncation distance is obtained from the ascending ordering of the distances between training samples and from the number of sample points, and is used to calculate the spatial density in the convolutional neural network.
In this embodiment, the core points represent the central constituent points of the training space in the convolutional neural network.
In this embodiment, the boundary points may be points used to bound the boundary of the training space in the convolutional neural network.
In this embodiment, a noise point is an interference point in the training space of the convolutional neural network; noise points are removed before sample training to avoid interference.
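A sketch of the core/boundary/noise classification above, following the reconstructed formulas; the Gaussian kernel, the value of α and the density threshold are assumptions.

```python
# Hedged sketch, assuming NumPy and training samples as rows of a matrix.
import numpy as np

def build_training_space(samples: np.ndarray, alpha: float = 0.02,
                         density_threshold: float = 2.0):
    n = len(samples)
    d = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=-1)

    # Truncation distance d_c: the ceil(alpha*M)-th smallest pairwise distance.
    pairwise = np.sort(d[np.triu_indices(n, k=1)])
    d_c = pairwise[int(np.ceil(alpha * len(pairwise))) - 1]

    sigma2 = samples.var()                    # training sample variance
    # Spatial density: Gaussian-weighted count of neighbours within d_c.
    chi = (d < d_c) & ~np.eye(n, dtype=bool)  # impulse function as a mask
    rho = (np.exp(-d ** 2 / (2.0 * sigma2)) * chi).sum(axis=1)

    core = rho >= density_threshold
    # Boundary: density below the threshold but a core point within the
    # truncation-distance neighbourhood.
    boundary = ~core & (chi & core[None, :]).any(axis=1)
    noise = ~core & ~boundary
    return core, boundary, noise

core, boundary, noise = build_training_space(np.random.rand(100, 2))
```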
The beneficial effects of the above technical scheme are:
by obtaining the labeled sample set and arranging the distances between its training samples in ascending order, the truncation distance of the training samples can be accurately obtained; from the truncation distance, the spatial density in the convolutional neural network can be accurately acquired, so that by comparing the spatial density with the density threshold, the training space in the convolutional neural network can be accurately constructed, which not only provides a space for sample training but also enables efficient analysis of the data.
Example 9:
the invention provides a data acquisition system based on computer vision technology, as shown in fig. 2, comprising:
a target image acquisition module: acquiring a target object identified by a computer, and photographing the target object based on a computer vision technology to acquire a target image;
a training module: training the target image in a preset convolutional neural network, and storing a training result into a data acquisition model;
a data acquisition module: and acquiring the data of the target image based on the data acquisition model.
The beneficial effects of the above technical scheme are:
by acquiring the target image and training through the convolutional neural network, accurate and efficient data analysis and acquisition can be effectively realized through a computer vision technology and a data acquisition model based on a training result.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A data acquisition method based on computer vision technology is characterized by comprising the following steps:
acquiring a target object identified by a computer, and photographing the target object to acquire a target image;
training the target image in a preset convolutional neural network, and storing a training result into a data acquisition model;
based on a computer vision technology and the data acquisition model, carrying out recognition analysis and acquisition on the data of the target image;
before training the target image in a preset convolutional neural network, the method further comprises:
based on the target image, acquiring the spatial density of the target image in the convolutional neural network, and meanwhile, constructing a training space of the target image in the convolutional neural network according to the spatial density, wherein the specific working process comprises the following steps:
obtaining a labeled sample set of the target image D = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}, wherein x_i is a training sample, y_i is the label corresponding to the training sample x_i, y_i ∈ {1, 2, ..., c}, m is the number of training samples, and c is the number of categories;
based on the labeled sample set D, obtaining the sample distance d_ij between the training samples of the labeled sample set D, wherein d_ij denotes the distance from the i-th labeled sample x_i to the j-th labeled sample x_j;
arranging the sample distances d_ij between the m training samples in ascending order to obtain d_1 ≤ d_2 ≤ ... ≤ d_M, wherein the subscripts 1, 2, ..., M indicate that the ordered distances increase sequentially, and d_M is the maximum sample distance;
determining a truncation distance d_c of the training samples based on the ranking result:
d_c = d_⌈αM⌉
wherein d_c represents the truncation distance of the training samples; M represents the number of ordered sample distances; α is a constant with the value range (0, 1);
based on the truncation distance d_c of the training samples and the sample distances d_ij between the training samples, acquiring the spatial density ρ_i of the target image in the convolutional neural network:
ρ_i = Σ_{j≠i} χ(d_c − d_ij) · exp(−d_ij² / (2σ²))
wherein ρ_i represents the spatial density of the target image in the convolutional neural network; d_ij represents the distance between x_i and x_j, with i ≠ j; d_c represents the truncation distance of the training samples; σ² represents the training sample variance; ρ_0 represents a density threshold; χ(·) is an impulse function taking the value 1 when its argument is positive and 0 otherwise;
comparing the spatial density ρ_i of the target image in the convolutional neural network with the density threshold ρ_0;
if ρ_i ≥ ρ_0, determining that the training sample x_i is a core point;
if ρ_i < ρ_0 and a core point exists within the truncation-distance (d_c) neighborhood of the training sample x_i, the training sample x_i is a boundary point;
if the training sample x_i is neither a core point nor a boundary point, the training sample x_i is a noise point;
and constructing a training space of the target image in the convolutional neural network based on the core points, the boundary points and the noise points.
2. The data acquisition method based on computer vision technology as claimed in claim 1, wherein after the target object identified by the computer is obtained and before the photographing and sampling of the target object, the method further comprises:
determining effective information of the target object based on the gray matrix texture, and determining the position characteristic of the target object based on an Euclidean distance mapping method;
preprocessing the position characteristics of the target object and the effective information to acquire processing information;
establishing an analysis file, and analyzing and processing the processing information by using the analysis file so as to mark each element point in the target object;
marking each element point in the target object to obtain a marking result of the element point;
evaluating the marking result of the element points based on a preset evaluation index, and acquiring an evaluation result;
meanwhile, based on the evaluation result, the target object is subjected to area delineation, and the delineated target object is photographed through the computer to obtain the target image.
3. A method for data acquisition based on computer vision technology as claimed in claim 2,
the valid information includes: the color of the target object, the texture window of the target object, the target angle, the target gray texture and the color texture feature.
4. The data acquisition method based on computer vision technology as claimed in claim 1, wherein after the target object is photographed, the method further comprises:
acquiring an initial image of the target object, acquiring a gray value of each pixel in the initial image, and acquiring a gray matrix based on the gray value;
extracting the gray distribution trend of the row/column of the gray matrix, and acquiring a balance weight array of the initial image according to the gray distribution trend;
correcting the gray matrix based on the balance weight array to obtain an initial gray image;
acquiring M classification sets and a filtering parameter corresponding to each classification set;
the classification set is obtained by classifying the gray pixels of the current initial gray image according to a preset classification mode;
filtering each gray pixel of the initial gray image according to the filtering parameters corresponding to the classification set to obtain an initial gray filtering image;
partitioning the initial gray level filtering image to obtain sub image blocks of N initial gray level filtering images;
acquiring the main direction of the texture of the sub image block based on a preset image texture analysis file;
meanwhile, according to a preset algorithm, obtaining gradient values of pixel gray values corresponding to the sub-image blocks in the x direction or the y direction, and calculating an average value of the gradient values in the x direction or the y direction;
meanwhile, a gradient threshold value of the gradient value is obtained, and binarization processing is carried out on the main direction of the texture of the sub-image block based on the gradient threshold value;
taking the average value of the gradient values as a clustering center, and combining the sub-image blocks into a complete binary image according to the clustering center;
removing impurities in the area of the binary image according to a preset area threshold value, and setting the length-width ratio of the binary image;
removing non-strip-shaped impurities in the binary image according to the length-width ratio to obtain a final processed image;
and the final processed image is the target image.
5. The data acquisition method based on computer vision technology as claimed in claim 1, wherein the process of training the target image in a preset convolutional neural network comprises:
acquiring a feature map area in a preset convolutional neural network, extracting a preset size in the feature map area,
dividing the complete region of the target image according to the preset size, and acquiring a plurality of divided independent sub-image regions;
generating the coordinates of each pixel point of the sub-graph area into the relative coordinates of the pixel points at the corresponding position of the feature graph, and acquiring a coordinate channel corresponding to the feature graph;
and carrying out deep learning on the target image with the coordinate channel according to the parameters of the convolutional neural network, and generating a training result containing the characteristic diagram.
6. A data collection method based on computer vision technology as claimed in claim 1, wherein the working process of storing the training result to the data collection model comprises:
extracting feature data of the target image based on the training result;
analyzing and judging whether redundant data exist in the characteristic data according to a preset algorithm;
if the characteristic data contains redundant data, removing the redundant data in the characteristic data to obtain simplified data;
classifying the simplified data according to a preset format type;
if the characteristic data does not contain redundant data, directly classifying the characteristic data;
sequencing the sorted characteristic data according to data types, and matching the sequenced characteristic data with a matching segment in the data acquisition model;
and storing the characteristic data into the data acquisition model according to the matching result.
7. The data acquisition method based on computer vision technology as claimed in claim 6, wherein after storing the feature data in the data acquisition model, further comprising:
and screening the characteristic data in the data acquisition model based on the current task of acquiring the specific position data of the target object by the computer, and acquiring final data.
8. A data acquisition system based on computer vision technology, comprising:
a target image acquisition module: acquiring a target object identified by a computer, and photographing the target object based on a computer vision technology to acquire a target image;
a training module: training the target image in a preset convolutional neural network, and storing a training result into a data acquisition model;
a data acquisition module: and acquiring the data of the target image based on the data acquisition model.
CN202110032387.9A 2021-01-12 2021-01-12 Data acquisition method and system based on computer vision technology Active CN112364844B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110032387.9A CN112364844B (en) 2021-01-12 2021-01-12 Data acquisition method and system based on computer vision technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110032387.9A CN112364844B (en) 2021-01-12 2021-01-12 Data acquisition method and system based on computer vision technology

Publications (2)

Publication Number Publication Date
CN112364844A (en) 2021-02-12
CN112364844B (en) 2021-05-18

Family

ID=74534722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110032387.9A Active CN112364844B (en) 2021-01-12 2021-01-12 Data acquisition method and system based on computer vision technology

Country Status (1)

Country Link
CN (1) CN112364844B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117607642B (en) * 2024-01-23 2024-03-26 中铁电气化铁路运营管理有限公司 Rail transit data analysis method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016115778A1 (en) * 2015-01-20 2016-07-28 中兴通讯股份有限公司 Method and apparatus for adjusting projection area
CN109977891A (en) * 2019-03-30 2019-07-05 哈尔滨理工大学 A kind of object detection and recognition method neural network based
CN110838173A (en) * 2019-11-15 2020-02-25 天津医科大学 Three-dimensional texture feature-based individual brain covariant network construction method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016115778A1 (en) * 2015-01-20 2016-07-28 中兴通讯股份有限公司 Method and apparatus for adjusting projection area
CN109977891A (en) * 2019-03-30 2019-07-05 哈尔滨理工大学 A kind of object detection and recognition method neural network based
CN110838173A (en) * 2019-11-15 2020-02-25 天津医科大学 Three-dimensional texture feature-based individual brain covariant network construction method

Also Published As

Publication number Publication date
CN112364844A (en) 2021-02-12

Similar Documents

Publication Publication Date Title
CN108830188B (en) Vehicle detection method based on deep learning
CN111325203B (en) American license plate recognition method and system based on image correction
CN107665492B (en) Colorectal panoramic digital pathological image tissue segmentation method based on depth network
CN111738064B (en) Haze concentration identification method for haze image
CN108564085B (en) Method for automatically reading of pointer type instrument
CN110032932B (en) Human body posture identification method based on video processing and decision tree set threshold
CN109948566A (en) A kind of anti-fraud detection method of double-current face based on weight fusion and feature selecting
CN108509950B (en) Railway contact net support number plate detection and identification method based on probability feature weighted fusion
CN114067444A (en) Face spoofing detection method and system based on meta-pseudo label and illumination invariant feature
CN112307919B (en) Improved YOLOv 3-based digital information area identification method in document image
CN111260645A (en) Method and system for detecting tampered image based on block classification deep learning
CN114445879A (en) High-precision face recognition method and face recognition equipment
CN111291818B (en) Non-uniform class sample equalization method for cloud mask
CN111046838A (en) Method and device for identifying wetland remote sensing information
CN115294377A (en) System and method for identifying road cracks
CN112364844B (en) Data acquisition method and system based on computer vision technology
CN115019294A (en) Pointer instrument reading identification method and system
CN114092456A (en) Cell fluorescence image distinguishing method and system
CN106548195A (en) A kind of object detection method based on modified model HOG ULBP feature operators
CN115830514B (en) Whole river reach surface flow velocity calculation method and system suitable for curved river channel
CN109741351A (en) A kind of classification responsive type edge detection method based on deep learning
CN110910497B (en) Method and system for realizing augmented reality map
CN113269136B (en) Off-line signature verification method based on triplet loss
CN108734158B (en) Real-time train number identification method and device
CN111612045B (en) Universal method for acquiring target detection data set

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant