CN116935192A - Data acquisition method and system based on computer vision technology - Google Patents

Data acquisition method and system based on computer vision technology

Info

Publication number: CN116935192A
Application number: CN202310943038.1A
Authority: CN (China)
Prior art keywords: data, acquisition, computer vision, algorithm, point cloud
Legal status: Pending (the status listed is an assumption, not a legal conclusion)
Other languages: Chinese (zh)
Inventor: 王智武
Current and original assignee: Beijing Yuanjing Digital Technology Co., Ltd.
Application filed by Beijing Yuanjing Digital Technology Co., Ltd.
Priority and publication: CN202310943038.1A / CN116935192A

Classifications

    • G06V 10/955: Hardware or software architectures specially adapted for image or video understanding using specific electronic processors
    • G06T 1/0007: Image acquisition
    • G06T 5/50: Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06V 10/75: Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; coarse-fine approaches, e.g. multi-scale approaches; context analysis; selection of dictionaries
    • G06V 10/993: Evaluation of the quality of the acquired pattern
    • G06V 20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V 20/64: Three-dimensional objects
    • G06T 2207/10016: Video; image sequence
    • G06T 2207/20221: Image fusion; image merging


Abstract

The invention relates to the technical field of data acquisition, and in particular to a data acquisition method and system based on computer vision technology, comprising the following steps: preparing a data scene and introducing an asynchronous data acquisition strategy while selecting camera parameters; and introducing a depth sensor into the data scene to acquire three-dimensional point cloud data. In the invention, the depth sensor acquires the three-dimensional structure and geometric information of the target, providing a comprehensive data source. The asynchronous acquisition strategy ensures that data from multiple sensors are accurately synchronized and fused, yielding more comprehensive and accurate data. Automatic labeling with computer vision algorithms improves labeling efficiency and accuracy. The long-sequence acquisition strategy captures the temporal nature and dynamic characteristics of the data, providing comprehensive and representative data. A real-time quality control and adjustment mechanism optimizes the acquisition process and improves data quality. Measures such as data enhancement, quality assessment and abnormal data processing improve the quality and usability of the data.

Description

Data acquisition method and system based on computer vision technology
Technical Field
The invention relates to the technical field of data acquisition methods, and in particular to a data acquisition method and system based on computer vision technology.
Background
Data acquisition methods are processes and techniques for collecting, recording and acquiring data. Data acquisition is a precondition for data processing and analysis, which can provide a basis for decision making, problem solving, and insight discovery. Depending on the type, source and purpose of the data, it is important to select a suitable data acquisition method.
Wherein computer vision technology based data acquisition methods involve the use of computer vision algorithms and tools to collect, process and analyze image or video data. These methods may be used to extract useful data from visual information, such as object recognition, object detection, image segmentation, and the like.
In existing data acquisition methods based on computer vision technology, the traditional approach ignores the three-dimensional structure and geometric information of the target and considers only factors such as illumination, background and camera angle, so that complete information about the target cannot be obtained in some applications. Secondly, the traditional approach focuses mainly on camera parameter adjustment and neglects data synchronization and fusion across multiple sensors or cameras, which may cause missing or inaccurate data and limits subsequent data analysis and algorithm application. In addition, data labeling in the traditional approach generally depends on manual operation, which is time-consuming and error-prone, affecting the efficiency and quality of the labeling. The data acquisition strategy is also limited: it focuses mainly on the quantity and frequency of data and ignores the time-series nature and dynamic characteristics of the data, which may leave the data incomplete and unrepresentative of the real scene. Finally, data quality control in the traditional approach is mainly checked after acquisition, so data quality problems cannot be found and corrected in time, and a large amount of low-quality or invalid data accumulates during the acquisition process.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, and provides a data acquisition method and system based on computer vision technology.
In order to achieve the above purpose, the present invention adopts the following technical scheme: a data acquisition method based on computer vision technology comprises the following steps:
preparing a data scene, and introducing an asynchronous data acquisition strategy while selecting camera parameters;
introducing a depth sensor into the data scene to acquire three-dimensional point cloud data;
an image processing technology is introduced to perform denoising and distortion correction operation on the three-dimensional point cloud data, and the accuracy of the data is ensured through a data quality evaluation mechanism;
acquiring video streams or dynamic scenes in the three-dimensional point cloud data by adopting a long-time sequence data acquisition strategy, and extracting key frames;
performing automatic data annotation on the key frames by using a computer vision algorithm, establishing a real-time feedback and adjustment mechanism, and displaying image quality indexes and target detection results of the key frames in real time;
identifying and processing abnormal data in the key frames, and optimizing the storage and management process of the key frames by adopting a data enhancement technology;
and storing metadata information comprising acquisition conditions, equipment parameters and object categories, providing additional information for data analysis and algorithm training.
As a further scheme of the present invention, the step of preparing a data scene, and introducing an asynchronous data acquisition strategy while selecting camera parameters specifically includes:
setting related objects, backgrounds and illumination conditions, and selecting a representative data acquisition scene;
an industrial camera is adopted as acquisition equipment, and resolution, frame rate, exposure time and focal length parameters are debugged;
asynchronous data acquisition is carried out by using a plurality of groups of industrial cameras; specifically, the plurality of groups of industrial cameras are configured based on the requirements and geometric layout of the acquisition scene, and a software synchronization mechanism is adopted to ensure that the cameras remain synchronized physically and in time;
meanwhile, the asynchronously configured industrial cameras are started to acquire visual data;
based on camera calibration, feature matching and a stereoscopic vision algorithm, vision calibration and alignment are carried out on asynchronously acquired vision data;
and fusing or superposing visual data acquired by a plurality of groups of industrial cameras based on an image fusion algorithm to generate multi-view visual information.
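The steps above leave the software synchronization mechanism abstract. One minimal sketch of how frames from asynchronously triggered cameras can be paired afterwards is nearest-timestamp matching; the function name and the tolerance parameter below are illustrative assumptions, not taken from the patent:

```python
from bisect import bisect_left

def match_frames(ts_a, ts_b, tolerance):
    """Pair each timestamp in ts_a with the nearest timestamp in ts_b.

    ts_a, ts_b: sorted lists of capture timestamps (seconds) from two
    asynchronously triggered cameras. Pairs further apart than
    `tolerance` are dropped. Returns a list of (i, j) index pairs.
    """
    pairs = []
    for i, t in enumerate(ts_a):
        k = bisect_left(ts_b, t)
        # the nearest neighbour is either ts_b[k-1] or ts_b[k]
        candidates = [c for c in (k - 1, k) if 0 <= c < len(ts_b)]
        j = min(candidates, key=lambda c: abs(ts_b[c] - t))
        if abs(ts_b[j] - t) <= tolerance:
            pairs.append((i, j))
    return pairs
```

Only frame pairs that pass this tolerance test would then be handed to the calibration, alignment, and fusion stages described above.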
As a further scheme of the invention, the step of introducing the depth sensor to acquire the three-dimensional point cloud data comprises the following steps:
adopting a structured light sensor as the depth sensor, and using it to collect depth information in the scene;
through hardware synchronization or software calibration, the synchronization between the depth sensor and the industrial camera and the corresponding relation between the depth and the color image are ensured;
extracting depth information for the structured light sensor based on a triangulation algorithm;
calibrating the acquired depth data by using camera internal parameter calibration, external parameter calibration and laser plane calibration to eliminate errors and distortion of a depth sensor;
based on a three-dimensional reconstruction algorithm, the depth information and the corresponding color image are fused to obtain accurate three-dimensional point cloud data with a geometric structure.
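The fusion of depth information into a point cloud can be illustrated with the standard pinhole back-projection model. This is a generic sketch, not the patent's specific three-dimensional reconstruction algorithm; the intrinsics fx, fy, cx, cy would come from the camera internal parameter calibration mentioned above:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (H x W, metres) into an N x 3 point cloud
    using the pinhole camera model:
        X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  Z = depth[v, u].
    Pixels with zero depth (no sensor return) are discarded.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    valid = z > 0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=1)
```

Per-point colour could then be attached by sampling the registered colour image at the same (u, v) coordinates, which is where the depth-to-colour correspondence established above matters.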
As a further scheme of the present invention, the step of denoising and distortion correcting the three-dimensional point cloud data by introducing an image processing technology and ensuring the accuracy of the data by a data quality assessment mechanism specifically comprises:
removing noise points in the three-dimensional point cloud data by using a statistical filtering algorithm, particularly Gaussian filtering;
performing camera distortion correction on the three-dimensional point cloud data based on epipolar correction, and removing lens distortion by using calibration information;
taking the point cloud density, point cloud stability and curvature consistency as data quality evaluation indexes, judging the accuracy and usability of the three-dimensional point cloud data, and obtaining a data quality evaluation result;
and repairing and compensating the part with poor quality by adopting a data interpolation and missing region filling method based on the data quality evaluation result.
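As one illustration of statistical filtering on a point cloud, the sketch below removes points whose mean distance to their nearest neighbours is anomalously large. The patent names Gaussian filtering; this statistical outlier filter is a related but distinct technique, and the defaults for k and std_ratio are assumptions:

```python
import numpy as np

def remove_outliers(points, k=8, std_ratio=2.0):
    """Statistical outlier removal: for each point, compute the mean
    distance to its k nearest neighbours; points whose mean distance
    exceeds (global mean + std_ratio * global std) are treated as noise.
    `points` is an N x 3 array; returns the filtered array.
    Brute-force O(N^2) distances, fine for small demo clouds only.
    """
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # ignore self-distance
    k = min(k, len(points) - 1)
    knn = np.sort(d, axis=1)[:, :k]      # k smallest distances per point
    mean_d = knn.mean(axis=1)
    threshold = mean_d.mean() + std_ratio * mean_d.std()
    return points[mean_d <= threshold]
```

A production pipeline would use a KD-tree (e.g. an Open3D or PCL outlier filter) rather than the dense distance matrix shown here.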
As a further scheme of the present invention, the step of acquiring the video stream or the dynamic scene in the three-dimensional point cloud data by adopting the long-time sequence data acquisition strategy specifically includes:
setting the acquisition time range, frame rate and sampling rate parameters;
storing and managing three-dimensional point cloud data acquired for a long time to be used as long-time sequence data;
tracking a moving object by using a target tracking algorithm aiming at a dynamic scene in the long-time sequence data, or modeling dynamically-changed objects and structures by using a scene reconstruction algorithm;
calibrating and aligning the long time sequence data by using a time stamp calibration algorithm;
and dividing the long time sequence data to extract key frames.
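Key-frame extraction from the segmented long-time-sequence data is not specified further here. A minimal sketch keeps a frame as a key frame when its mean absolute difference from the last kept key frame exceeds a threshold; the threshold is an assumed tuning parameter:

```python
import numpy as np

def extract_key_frames(frames, threshold):
    """Select key-frame indices from a long sequence: the first frame is
    always kept, and a later frame becomes a key frame when its mean
    absolute pixel difference from the last key frame exceeds
    `threshold`. `frames` is a list of equally shaped numpy arrays.
    """
    keys = [0]
    for i in range(1, len(frames)):
        diff = np.abs(frames[i].astype(float)
                      - frames[keys[-1]].astype(float)).mean()
        if diff > threshold:
            keys.append(i)
    return keys
```

Comparing against the last key frame (rather than the previous frame) avoids missing slow cumulative scene changes, one common design choice for this kind of detector.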
As a further scheme of the invention, the step of using a computer vision algorithm to automatically annotate the data of the key frame, establishing a real-time feedback and adjustment mechanism, and displaying the image quality index and the target detection result of the key frame in real time comprises the following steps:
performing automatic data annotation on the key frames by adopting computer vision algorithms comprising target detection, target segmentation, pose estimation and key point detection;
extracting the position, category, bounding box and key point information of the target as an annotated training data set;
training a computer vision model based on the noted training data set to enable the computer vision model to automatically identify target objects and required noted information in the key frames;
applying the trained computer vision model to the key frames to automatically perform data annotation, and performing positioning, classification and bounding-box drawing operations on targets in the key frames according to image features and the model's prediction results to generate annotation information;
performing image quality index calculation on the key frames by using an image processing algorithm to obtain an automatically marked target detection result;
and displaying the target detection result of the automatic labeling on the key frame in real time, and feeding back and adjusting in real time according to the display of the image quality index and the target detection result.
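The image quality index is not defined in the steps above. One common choice that could serve as such an index is the variance of the Laplacian, a standard sharpness measure whose low values can trigger the real-time feedback mechanism on a blurred key frame. The metric below is an illustrative assumption, not the patent's specific index:

```python
import numpy as np

def sharpness_score(gray):
    """Image quality index: variance of the 4-neighbour discrete
    Laplacian response. Higher values indicate sharper images.
    `gray` is a 2-D float array; the Laplacian is formed from shifted
    copies so no external image library is needed.
    """
    lap = (-4 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())
```

The feedback loop would compare this score against a per-camera threshold (tuned during scene preparation) and prompt the operator to refocus or adjust exposure when frames fall below it.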
As a further scheme of the present invention, the steps of identifying and processing the abnormal data in the key frame and optimizing the storage and management process of the key frame by adopting the data enhancement technology specifically include:
identifying abnormal data in the key frames by using an anomaly detection algorithm based on statistical analysis, which identifies and marks the abnormal data by analyzing data characteristics and learning abnormal patterns;
according to the type and the characteristics of the abnormal data, selecting an abnormal processing method, wherein the abnormal processing method comprises data restoration, data rejection and data interpolation;
processing the key frame by using a data enhancement technology comprising image rotation, scaling, translation and overturning, and expanding a data set;
and (3) optimizing storage and management of the key frames, performing data compression by using a compression algorithm, establishing an index and database system, and improving the efficiency of data storage and access.
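A hedged sketch of the statistical anomaly detection step above: compute a per-frame feature (for example, mean brightness) and flag frames whose z-score is extreme; flagged frames are then routed to repair, rejection, or interpolation. The feature and the threshold are illustrative, as the patent does not fix a particular detector:

```python
import numpy as np

def flag_anomalies(values, z_threshold=3.0):
    """Flag anomalous entries in a per-frame feature series by z-score.
    `values`: sequence of floats (e.g. mean brightness per key frame).
    Returns a boolean mask; True marks a frame whose absolute z-score
    exceeds `z_threshold`. A constant series yields no anomalies.
    """
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        return np.zeros(len(values), dtype=bool)
    z = np.abs(values - values.mean()) / std
    return z > z_threshold
```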
As a further aspect of the present invention, the step of storing metadata information including collection conditions, device parameters, and object types, and providing additional information for data analysis and algorithm training specifically includes:
defining metadata information to be saved, including acquisition conditions, equipment parameters and object categories, and determining the type, format and structure of the metadata;
in the data acquisition process, recording metadata information related to acquisition conditions and equipment parameters, wherein the metadata information comprises acquisition time, location, illumination conditions, camera parameters and sensor parameters;
In the data labeling process, recording metadata information related to the object type and the labeling process, wherein the metadata information comprises the object type, the information of a labeling person and the labeling time;
associating and storing metadata in the acquisition process and the labeling process with actual data, and associating and storing the metadata as an additional file with a data file;
in the process of data analysis and algorithm training, the stored metadata information is applied to perform data segmentation, label selection and model evaluation tasks.
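Associating metadata with the actual data "as an additional file", as the step above puts it, can be sketched with a JSON sidecar convention. The field names and the sidecar naming scheme here are assumptions for illustration; the patent only requires that acquisition conditions, device parameters and object categories be recorded:

```python
import json
from pathlib import Path

def save_metadata(data_file, metadata):
    """Store acquisition/annotation metadata as a JSON sidecar next to
    the data file (e.g. frame_0001.ply -> frame_0001.json), so the
    metadata stays associated with the actual data. Returns the sidecar
    path. `ensure_ascii=False` preserves non-ASCII annotator names.
    """
    side = Path(data_file).with_suffix(".json")
    side.write_text(json.dumps(metadata, ensure_ascii=False, indent=2))
    return side

def load_metadata(data_file):
    """Read back the sidecar written by save_metadata."""
    return json.loads(Path(data_file).with_suffix(".json").read_text())
```

For the analysis and training tasks mentioned above, the loaded metadata can drive data splits (by location or lighting condition), label selection (by object category), and per-condition model evaluation.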
A data acquisition system based on computer vision technology comprises a data preparation module, a depth sensor module, an image processing module, a long-time-sequence acquisition module, an automatic labeling module, an abnormality processing module and a metadata management module, and is configured to execute the data acquisition method based on computer vision technology according to any one of claims 1-8.
As a further scheme of the invention, the functional items of the data preparation module comprise scene setting, camera parameter configuration and asynchronous acquisition;
the functional items of the depth sensor module comprise sensor selection, synchronization and calibration, and three-dimensional point cloud acquisition;
the functional items of the image processing module comprise denoising, correction and data quality evaluation;
the functional items of the long-time-sequence acquisition module comprise parameter setting, data storage and management, and dynamic scene processing;
the functional items of the automatic labeling module comprise target identification and labeling, and image quality evaluation;
the functional items of the exception handling module comprise exception identification and handling, and data enhancement and optimization;
the functional items of the metadata management module comprise metadata recording and association, and data analysis and training support.
Compared with the prior art, the invention has the advantages and positive effects that:
According to the invention, the three-dimensional structure and geometric information of the target can be obtained by introducing the depth sensor, providing a more comprehensive data source. The asynchronous data acquisition strategy ensures that data from multiple sensors or cameras are accurately synchronized and fused, yielding more comprehensive and accurate data. Automatic data labeling with computer vision algorithms improves labeling efficiency and accuracy. The long-time-sequence data acquisition strategy better captures the temporal nature and dynamic characteristics of the data, providing more comprehensive and representative data. The real-time quality control and adjustment mechanism helps acquisition personnel optimize the acquisition process in time and improve data quality. The quality and usability of the data are further improved through measures such as data enhancement, data quality evaluation and improvement, and abnormal data identification and processing.
Drawings
FIG. 1 is a schematic workflow diagram of the data acquisition method and system based on computer vision technology according to the present invention;
FIG. 2 is a detailed flowchart of step 1 of the data acquisition method and system based on computer vision technology according to the present invention;
FIG. 3 is a detailed flowchart of step 2 of the data acquisition method and system based on computer vision technology according to the present invention;
FIG. 4 is a detailed flowchart of step 3 of the data acquisition method and system based on computer vision technology according to the present invention;
FIG. 5 is a detailed flowchart of step 4 of the data acquisition method and system based on computer vision technology according to the present invention;
FIG. 6 is a detailed flowchart of step 5 of the data acquisition method and system based on computer vision technology according to the present invention;
FIG. 7 is a detailed flowchart of step 6 of the data acquisition method and system based on computer vision technology according to the present invention;
FIG. 8 is a detailed flowchart of step 7 of the data acquisition method and system based on computer vision technology according to the present invention;
FIG. 9 is a schematic diagram of the system framework of the data acquisition method and system based on computer vision technology according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In the description of the present invention, it should be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely to facilitate describing the present invention and simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and therefore should not be construed as limiting the present invention. Furthermore, in the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
Example 1
Referring to fig. 1, the present invention provides a technical solution: a data acquisition method based on computer vision technology comprises the following steps:
preparing a data scene, and introducing an asynchronous data acquisition strategy while selecting camera parameters;
introducing a depth sensor into a data scene to acquire three-dimensional point cloud data;
an image processing technology is introduced to perform denoising and distortion correction operation on the three-dimensional point cloud data, and the accuracy of the data is ensured through a data quality assessment mechanism;
acquiring video streams or dynamic scenes in the three-dimensional point cloud data by adopting a long-time sequence data acquisition strategy, and extracting key frames;
performing automatic data annotation on the key frames by using a computer vision algorithm, establishing a real-time feedback and adjustment mechanism, and displaying the image quality indexes and target detection results of the key frames in real time;
identifying and processing abnormal data in the key frames, and optimizing the storage and management processes of the key frames by adopting a data enhancement technology;
and storing metadata information comprising acquisition conditions, equipment parameters and object types, and providing additional information for data analysis and algorithm training.
First, camera parameters are selected while preparing the data scene, and an asynchronous data acquisition strategy is introduced. Next, a depth sensor is introduced into the data scene to acquire three-dimensional point cloud data. Then, denoising and distortion correction are performed on the three-dimensional point cloud data using image processing technology, and the accuracy of the data is ensured through a data quality evaluation mechanism. A long-time-sequence data acquisition strategy is adopted to acquire video streams or dynamic scenes in the three-dimensional point cloud data, and key frames are extracted. Automatic data annotation is performed on the key frames by a computer vision algorithm, and a real-time feedback and adjustment mechanism is established, including displaying the image quality indexes and target detection results of the key frames in real time. By identifying and processing abnormal data in the key frames, the storage and management process of the key frames is optimized with data enhancement technology. Meanwhile, metadata information including acquisition conditions, equipment parameters and object categories is saved, providing additional information for data analysis and algorithm training. The method has beneficial effects in data quality assurance, diversified data collection, automatic data labeling, abnormal data processing, data management optimization and the addition of metadata information, and is of great significance for promoting the development of computer vision applications.
Referring to fig. 2, preparing a data scene, and selecting camera parameters while introducing an asynchronous data acquisition strategy specifically includes:
setting related objects, backgrounds and illumination conditions, and selecting a representative data acquisition scene;
an industrial camera is adopted as acquisition equipment, and resolution, frame rate, exposure time and focal length parameters are debugged;
asynchronous data acquisition is carried out by using a plurality of groups of industrial cameras; specifically, the plurality of groups of industrial cameras are configured based on the requirements and geometric layout of the acquisition scene, and a software synchronization mechanism is adopted to ensure that the cameras remain synchronized physically and in time;
meanwhile, the asynchronously configured industrial cameras are started to acquire visual data;
based on camera calibration, feature matching and a stereoscopic vision algorithm, vision calibration and alignment are carried out on asynchronously acquired vision data;
and fusing or superposing visual data acquired by a plurality of groups of industrial cameras based on an image fusion algorithm to generate multi-view visual information.
First, the related objects, background and lighting conditions are set, and a representative data acquisition scene is selected. Then, an industrial camera is used as the acquisition equipment, and parameters such as resolution, frame rate, exposure time and focal length are tuned. Next, multiple groups of industrial cameras are configured, and a software synchronization mechanism is employed to ensure that they remain synchronized physically and in time. The camera groups are then started simultaneously for asynchronous data acquisition, capturing visual data from multiple viewing angles. Based on camera calibration, feature matching and stereoscopic vision algorithms, the asynchronously acquired visual data are calibrated and aligned to ensure consistency and accuracy. Finally, the visual data acquired by the camera groups are fused or superimposed by an image fusion algorithm to generate multi-view visual information. From an implementation perspective, introducing an asynchronous data collection strategy can improve data quality, diversity and integrity: synchronized multi-group acquisition yields multi-view data that increases scene coverage and information richness; reasonable camera configuration plus visual calibration and alignment ensures accuracy and consistency; and fusing the visual data produces comprehensive, stereoscopic visual information, providing a rich data basis for the research and application of computer vision algorithms in various fields.
Referring to fig. 3, the step of acquiring three-dimensional point cloud data by introducing a depth sensor specifically includes:
adopting a structured light sensor as the depth sensor, and using it to collect depth information in the scene;
through hardware synchronization or software calibration, the synchronization between the depth sensor and the industrial camera and the corresponding relation between the depth and the color image are ensured;
extracting depth information for the structured light sensor based on a triangulation algorithm;
calibrating the acquired depth data by using camera internal parameter calibration, external parameter calibration and laser plane calibration to eliminate errors and distortion of a depth sensor;
based on a three-dimensional reconstruction algorithm, the depth information and the corresponding color image are fused to obtain accurate three-dimensional point cloud data with a geometric structure.
First, a structured light sensor is employed as a depth sensor, by which depth information in a scene is acquired. Meanwhile, it is necessary to ensure synchronization between the depth sensor and the industrial camera, which can be achieved through hardware synchronization or software calibration, and to ensure correspondence between the depth and color images. Next, for a structured light sensor, depth information may be extracted based on a triangulation algorithm. And then, calibrating the acquired depth data by using methods such as camera internal parameter calibration, external parameter calibration, laser plane calibration and the like, and eliminating errors and distortion of the depth sensor. And finally, fusing the depth information with the corresponding color image based on a three-dimensional reconstruction algorithm, so as to obtain accurate three-dimensional point cloud data with a geometric structure. From an implementation perspective, the introduction of depth sensors and acquisition of three-dimensional point cloud data may have a number of beneficial effects. First, the depth sensor can provide accurate depth information of a scene, enriching the content and dimension of data. Secondly, through synchronization and calibration between the depth sensor and the camera, accurate correspondence between depth information and color images can be ensured, and consistency and usability of data are improved. In addition, in the process of calibrating and fusing the depth data, errors and distortion of the depth sensor can be eliminated, and quality and accuracy of the obtained three-dimensional point cloud data are improved. In summary, the introduction of the depth sensor and the acquisition of the three-dimensional point cloud data bring beneficial effects in aspects of data quality, data accuracy, data richness and the like, and provide an important data basis for the research of computer vision algorithms and applications.
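The triangulation behind a structured-light (or stereo) depth sensor reduces to the classic relation Z = f * B / d. The sketch below illustrates that standard model only; it is not the patent's full triangulation algorithm, and the function and parameter names are assumptions. The real sensor additionally needs the intrinsic, extrinsic and laser-plane calibration steps described above:

```python
def triangulate_depth(focal_px, baseline_m, disparity_px):
    """Depth from triangulation:
        Z = f * B / d
    where f is the focal length in pixels, B the projector-camera (or
    camera-camera) baseline in metres, and d the measured disparity in
    pixels. Returns depth Z in metres.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

The inverse relationship between disparity and depth is also why depth error grows quadratically with distance, one reason the calibration steps above matter for far-range accuracy.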
Referring to fig. 4, the steps of denoising and distortion correcting the three-dimensional point cloud data by introducing an image processing technology and ensuring the accuracy of the data by a data quality evaluation mechanism are specifically as follows:
removing noise points in the three-dimensional point cloud data by using a statistical filtering algorithm, specifically Gaussian filtering;
carrying out camera distortion correction on the three-dimensional point cloud data based on epipolar correction, and removing lens distortion by using calibration information;
the accuracy and the usability of the three-dimensional point cloud data are judged by taking the point cloud density, the point cloud stability and the curvature consistency as data quality evaluation indexes, and a data quality evaluation result is obtained;
based on the data quality evaluation result, repairing and compensating the part with poor quality by adopting a data interpolation and missing region filling method.
Firstly, the three-dimensional point cloud data are denoised with a statistical filtering algorithm (such as Gaussian filtering) to remove noise points and improve the clarity and accuracy of the data. Then, camera distortion correction is performed on the three-dimensional point cloud data: an epipolar correction method uses the camera's calibration information to remove lens distortion, making the data geometrically more accurate. Next, a data quality assessment mechanism is established, using indexes such as point cloud density, point cloud stability, and curvature consistency to assess the quality of the data. By analyzing the values of these evaluation indexes, the accuracy and usability of the three-dimensional point cloud data can be judged and a data quality evaluation result obtained. Finally, according to the data quality evaluation result, the parts with poor quality are repaired and compensated by methods such as data interpolation and missing-region filling, making the data more complete and reliable as a whole. From an implementation perspective, introducing image processing techniques to denoise and correct distortion in the three-dimensional point cloud data, with a data quality evaluation mechanism ensuring the accuracy of the data, improves the quality and usability of the data. Removing noise and distortion helps improve the clarity and accuracy of the data, while the data quality assessment mechanism can determine the accuracy and credibility of the data through quantitative analysis. Repairing and compensating the poor-quality portions further improves the integrity and reliability of the data.
In summary, implementation of these steps may bring beneficial effects, improve accuracy and quality of three-dimensional point cloud data, and provide a more reliable data base for subsequent computer vision algorithms and applications.
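As a minimal sketch of the statistical denoising step (illustrative only, with all names assumed), the classic statistical-outlier-removal idea flags points whose mean distance to their nearest neighbours is anomalously large. This brute-force O(N²) version is for clarity; a practical implementation would use a KD-tree or a library such as Open3D.

```python
import numpy as np

def remove_statistical_outliers(points, k=8, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbours is
    more than `std_ratio` standard deviations above the global mean.
    Returns (filtered_points, keep_mask)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    np.fill_diagonal(dist, np.inf)            # ignore self-distance
    k = min(k, len(points) - 1)
    knn = np.sort(dist, axis=1)[:, :k]        # k nearest-neighbour distances
    mean_d = knn.mean(axis=1)
    thresh = mean_d.mean() + std_ratio * mean_d.std()
    keep = mean_d <= thresh
    return points[keep], keep
```

Points in a dense cluster survive; an isolated noise point far from the cluster exceeds the threshold and is removed.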
Referring to fig. 5, the steps for acquiring a video stream or a dynamic scene in three-dimensional point cloud data by adopting a long-time sequence data acquisition strategy are specifically as follows:
setting the acquired time range, frame rate and sampling rate parameters;
storing and managing three-dimensional point cloud data acquired for a long time to be used as long-time sequence data;
tracking a moving object by using a target tracking algorithm aiming at a dynamic scene in long-time sequence data, or modeling dynamically-changed objects and structures by using a scene reconstruction algorithm;
using a time stamp calibration algorithm to calibrate and align the long time sequence data;
and dividing the long time sequence data to extract key frames.
Firstly, parameters such as the time range, frame rate, and sampling rate of acquisition are set to determine the duration and frequency of acquisition. Then, the three-dimensional point cloud data acquired over a long period are stored and managed with an appropriate storage scheme and management strategy. For dynamic scenes, moving objects are tracked using a target tracking algorithm, or dynamically changing objects and structures are modeled using a scene reconstruction algorithm; in this way, key object information in the dynamic scene can be extracted or a model of the dynamic scene can be built. During acquisition, the time synchronization of each device or sensor must be ensured; the data can be calibrated and aligned with a timestamp calibration algorithm, guaranteeing the temporal consistency of the data. Meanwhile, key time points or events are selected through key-frame extraction to reduce the amount of data stored and processed. From an implementation perspective, the long-time-sequence data acquisition strategy benefits video streams and dynamic scenes in three-dimensional point cloud data. It can provide more comprehensive and continuous scene information, suitable for analyzing and modeling dynamic scenes. Through target tracking and scene reconstruction, key dynamic object information can be extracted or a geometric model of the dynamic scene established. Through time alignment and key-frame extraction, the temporal consistency of the data can be ensured and the amount of data stored and processed reduced. Implementing these steps increases the processing efficiency and usability of the data, providing a more reliable data basis for subsequent computer vision algorithms and applications.
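For illustration only, timestamp alignment and key-frame extraction can be sketched as follows. Both function names, the nearest-timestamp strategy, and the mean-absolute-difference key-frame criterion are assumptions for this sketch; the method itself does not mandate a particular algorithm.

```python
import numpy as np

def align_to_reference(ref_ts, sensor_ts, tolerance=0.02):
    """For each reference timestamp, pick the index of the closest sensor
    sample (sensor_ts must be sorted); return -1 when nothing falls
    within `tolerance` seconds."""
    idx = np.searchsorted(sensor_ts, ref_ts)
    idx = np.clip(idx, 1, len(sensor_ts) - 1)
    left = idx - 1
    choose_left = (ref_ts - sensor_ts[left]) < (sensor_ts[idx] - ref_ts)
    best = np.where(choose_left, left, idx)
    ok = np.abs(sensor_ts[best] - ref_ts) <= tolerance
    return np.where(ok, best, -1)

def extract_keyframes(frames, change_threshold=0.1):
    """Keep frame 0, then every frame whose mean absolute difference from
    the last kept frame exceeds the threshold."""
    keys = [0]
    for i in range(1, len(frames)):
        if np.abs(frames[i] - frames[keys[-1]]).mean() > change_threshold:
            keys.append(i)
    return keys
```

Static stretches of the sequence collapse into a single key frame, while every significant scene change is retained, reducing storage without losing dynamic content.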
Referring to fig. 6, the steps of using a computer vision algorithm to perform automatic data labeling on a key frame, establishing a real-time feedback and adjustment mechanism, and displaying an image quality index and a target detection result of the key frame in real time are specifically as follows:
performing automatic data annotation on the key frames by adopting a computer vision algorithm comprising target detection, target segmentation, pose estimation and key point detection;
extracting the position, category, bounding box and key point information of the target as a labeled training data set;
training a computer vision model based on the marked training data set to enable the computer vision model to automatically identify target objects and required marking information in the key frames;
based on the trained computer vision model applied to the key frames, automatically labeling data: localizing, classifying and drawing bounding boxes for targets in the key frames according to image features and the prediction result of the model, and generating labeling information;
performing image quality index calculation on the key frames by using an image processing algorithm to obtain an automatically marked target detection result;
and displaying the automatically marked target detection result on the key frame in real time, and feeding back and adjusting in real time according to the image quality index and the display of the target detection result.
First, automatic data labeling is carried out on the key frames by applying computer vision algorithms such as target detection, target segmentation, pose estimation, and key point detection; information such as the position, category, bounding box, and key points of each target is extracted, and a labeled training dataset is constructed. Next, a computer vision model is trained on the labeled dataset, enabling it to automatically identify target objects in the key frames and the required labeling information. The trained model is then applied to the key frames for automatic data labeling: according to the image features and the model's predictions, targets are localized, classified, and enclosed in bounding boxes, generating the annotation information. Meanwhile, an image processing algorithm is used to compute image quality indexes for the key frames, and the automatically labeled target detection results are displayed on the key frames in real time. Real-time feedback and adjustment are performed by observing the image quality indexes and the displayed detection results, ensuring the accuracy and quality of the labeling. This implementation reduces the manual labeling workload, improves labeling efficiency, and provides fast and accurate labeling results through the automated labeling of computer vision models. The real-time feedback and adjustment mechanism allows labeling results to be checked and optimized on the spot, improving their accuracy and reliability. Therefore, integrating computer vision algorithms for automatic data labeling and establishing a real-time feedback and adjustment mechanism yields a notable implementation effect, improving labeling efficiency and quality.
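As a hedged illustration of two pieces of this step, the sketch below (a) converts raw detector output into annotation records with a confidence cutoff and (b) computes a common no-reference image quality index, the variance of the Laplacian response (higher means sharper). The detector itself is assumed to exist; the `(box, class, score)` tuple format and all names are assumptions of this sketch.

```python
import numpy as np

LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def sharpness(gray):
    """Variance of the Laplacian response of a grayscale image, a simple
    image quality index for flagging blurry key frames."""
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):                       # valid 3x3 cross-correlation
        for dx in range(3):
            out += LAPLACIAN[dy, dx] * gray[dy:dy + h - 2, dx:dx + w - 2]
    return out.var()

def auto_label(detections, score_threshold=0.5):
    """Turn raw detector output into annotation records, keeping only
    confident predictions. `detections` is a list of
    (box, class_name, score) tuples with box = (x1, y1, x2, y2)."""
    return [
        {"bbox": list(box), "category": cls, "score": float(score)}
        for box, cls, score in detections
        if score >= score_threshold
    ]
```

In the feedback loop described above, a low sharpness value or a sudden drop in the number of confident detections would trigger re-acquisition or re-labeling of the affected key frame.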
Referring to fig. 7, the steps for identifying and processing abnormal data in a key frame and optimizing the storage and management process of the key frame by adopting the data enhancement technology are specifically as follows:
identifying abnormal data in the key frame by using an abnormal detection algorithm based on statistical analysis, and identifying and marking the abnormal data in the key frame by analyzing the data characteristics and learning an abnormal mode;
according to the type and characteristics of the abnormal data, selecting an abnormal processing method, wherein the abnormal processing method comprises data restoration, data rejection and data interpolation;
processing the key frame by using a data enhancement technology comprising image rotation, scaling, translation and overturning, and expanding a data set;
and optimizing storage and management of the key frames, performing data compression by using a compression algorithm, establishing an index and database system, and improving the efficiency of data storage and access.
Abnormal data in the key frames are identified and processed, and data enhancement techniques are used to optimize the storage and management of the key frames. First, the key frames are examined with an anomaly detection algorithm based on statistical analysis, and abnormal data such as noise, missing values, or corruption are accurately marked. According to the type and characteristics of the abnormal data, a suitable handling method is selected, including data restoration, rejection, or interpolation, to ensure the accuracy and integrity of the data. Then, data enhancement techniques such as image rotation, scaling, translation, and flipping are applied to the key frames to expand the dataset, increase the diversity and quantity of the data, and improve the generalization ability and robustness of the model. To optimize the storage and management of key frames, a data compression algorithm reduces storage usage, and an index and database system improves data access efficiency. Taken together, this implementation improves data quality, enhances data diversity, and optimizes the storage and management of key-frame data. Identifying and processing the abnormal data ensures accuracy and integrity; the data enhancement techniques increase data diversity and improve model robustness; and optimizing the storage and management process improves data access efficiency, comprehensively raising the quality and applicability of the key-frame data.
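For illustration only, the simplest statistical anomaly test (a z-score threshold) and a minimal geometric augmentation generator can be sketched as below. Both function names and the choice of z-score over robust variants (median/MAD) are assumptions of this sketch, not requirements of the method.

```python
import numpy as np

def flag_anomalies(values, z_thresh=3.0):
    """Mark samples whose z-score exceeds the threshold, i.e. samples far
    from the mean in units of the standard deviation."""
    mu, sigma = values.mean(), values.std()
    if sigma == 0:
        return np.zeros(len(values), dtype=bool)
    return np.abs(values - mu) / sigma > z_thresh

def augment(image):
    """Yield simple geometric variants of a frame (flips and a 90-degree
    rotation) to expand the dataset."""
    yield image
    yield np.fliplr(image)   # horizontal flip
    yield np.flipud(image)   # vertical flip
    yield np.rot90(image)    # 90-degree rotation
```

A per-frame statistic (e.g. mean intensity or point count) fed to `flag_anomalies` isolates corrupted key frames, while `augment` quadruples the labeled set without new acquisition.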
Referring to fig. 8, the step of storing metadata information including acquisition conditions, device parameters, and object types and providing additional information for data analysis and algorithm training specifically includes:
defining metadata information to be stored, including acquisition conditions, equipment parameters and object types, and determining the type, format and structure of the metadata;
in the data acquisition process, recording metadata information related to acquisition conditions and equipment parameters, wherein the metadata information comprises acquisition time, location, illumination conditions, camera parameters and sensor parameters;
in the data labeling process, recording metadata information related to the object type and the labeling process, wherein the metadata information comprises the object type, the information of a labeling person and the labeling time;
associating and storing the metadata with the actual data in the acquisition process and the labeling process, and associating and storing the metadata with the data file as an additional file;
in the process of data analysis and algorithm training, the stored metadata information is applied to perform data segmentation, label selection and model evaluation tasks.
First, the metadata information to be saved is defined, including acquisition conditions, device parameters, and object categories, and the format and structure of the metadata are determined. During data acquisition, metadata related to the acquisition conditions and device parameters is recorded, such as acquisition time, location, illumination conditions, camera parameters, and sensor parameters. This information provides the context and environment of data acquisition, helping to interpret the results of subsequent data analysis and algorithm training. Meanwhile, during data labeling, metadata related to the object categories and the labeling process is recorded, such as the object category, annotator information, and labeling time. Such metadata makes the source and processing history of the data traceable, providing a reliable reference for subsequent data analysis and algorithm training. The metadata from the acquisition and labeling processes is associated with the actual data and stored together with the data files as additional files; this associative storage ensures the consistency and integrity of the metadata. During data analysis and algorithm training, the stored metadata supports tasks such as data segmentation, label selection, and model evaluation. Applying the stored metadata not only improves the accuracy and reliability of data analysis and algorithm training, but also provides additional background information and context, so that analysts and researchers can better understand the data and the results derived from it.
In summary, the storage of metadata information of acquisition conditions, equipment parameters and object types has important implementation effects on data analysis and algorithm training, can provide additional information support, and enhances the interpretation and analysis capability of data, thereby improving the overall data analysis and algorithm training quality.
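The "store metadata as an additional file associated with the data file" step above can be sketched as a JSON sidecar convention. The sidecar naming scheme and field names below are illustrative assumptions; the method only requires that metadata be stored in association with the data.

```python
import json
import time
from pathlib import Path

def save_with_metadata(data_path, metadata):
    """Write acquisition/labeling metadata as a JSON sidecar file next to
    a data file (e.g. frame_0001.ply -> frame_0001.meta.json)."""
    meta = {"recorded_at": time.strftime("%Y-%m-%dT%H:%M:%S"), **metadata}
    sidecar = Path(data_path).with_suffix(".meta.json")
    sidecar.write_text(json.dumps(meta, indent=2))
    return sidecar

def load_metadata(data_path):
    """Read the sidecar metadata back for data analysis or training."""
    sidecar = Path(data_path).with_suffix(".meta.json")
    return json.loads(sidecar.read_text())
```

Because the sidecar shares the data file's stem, the association survives file moves as long as the pair is moved together, and downstream tools can filter or split datasets by any recorded field.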
Referring to fig. 9, a data acquisition system based on computer vision technology is composed of a data preparation module, a depth sensor module, an image processing module, a long time sequence acquisition module, an automatic labeling module, an exception processing module and a metadata management module, wherein the data acquisition system based on computer vision technology is responsible for executing the data acquisition method based on computer vision technology according to any one of claims 1 to 8.
The function items of the data preparation module comprise scene setting, camera parameter configuration and asynchronous acquisition;
the function items of the depth sensor module comprise sensor selection, synchronization and calibration and three-dimensional point cloud acquisition;
the functional items of the image processing module comprise denoising, correction and data quality evaluation;
the function items of the long time sequence acquisition module comprise parameter setting, data storage and management and dynamic scene processing;
the function items of the automatic labeling module comprise target identification and labeling and image quality evaluation;
the function items of the exception handling module comprise exception identification and handling, data enhancement and optimization;
the functional items of the metadata management module include metadata records and associations, data analysis, and training support.
The data acquisition system based on computer vision technology comprises a data preparation module, a depth sensor module, an image processing module, a long-time-sequence acquisition module, an automatic labeling module, an exception handling module, and a metadata management module. Implementing the system brings the following benefits. The data preparation module ensures the accuracy and flexibility of data acquisition through scene setting, camera parameter configuration, and asynchronous acquisition. The depth sensor module's sensor selection, synchronization and calibration, and three-dimensional point cloud acquisition functions provide the ability to accurately acquire depth information. The image processing module's denoising, correction, and data quality evaluation functions improve image quality and data quality. The long-time-sequence acquisition module supports long-term, stable data acquisition through parameter setting, data storage and management, and dynamic scene processing. The automatic labeling module's target identification and labeling and image quality evaluation functions improve labeling efficiency and accuracy. The exception handling module's anomaly identification and processing, data enhancement, and optimization functions ensure the accuracy and integrity of the data. The metadata management module's metadata recording and association, data analysis, and training support functions provide additional information and support for data analysis and algorithm training. In summary, through the implementation of each functional module, the data acquisition system based on computer vision technology improves the efficiency, accuracy, and reliability of data acquisition, and provides a reliable data basis for subsequent data analysis and algorithm training.
Working principle: the data acquisition system based on the computer vision technology consists of a data preparation module, a depth sensor module, an image processing module, a long time sequence acquisition module, an automatic labeling module, an abnormality processing module and a metadata management module. The data preparation module is responsible for scene setting, camera parameter configuration and asynchronous acquisition, and ensures the accuracy and flexibility of data acquisition. The depth sensor module selects a sensor and performs synchronization and calibration to acquire accurate three-dimensional point cloud data. The image processing module performs denoising and correction and data quality assessment, and improves image quality and data quality. The long time sequence acquisition module sets parameters, performs data storage and management, processes dynamic scenes and supports long-time and stable data acquisition. The automatic labeling module realizes target identification and labeling and image quality evaluation, and improves labeling efficiency and accuracy. The exception handling module identifies and handles exceptions, optimizes data enhancement, and ensures data accuracy and integrity. The metadata management module records and associates metadata, supports data analysis and training. The implementation of integrating each functional module improves the efficiency, accuracy and reliability of data acquisition, and provides a reliable data basis for subsequent data analysis and algorithm training. The data acquisition method based on the computer vision technology comprises the steps of preparing a data scene, acquiring three-dimensional point cloud data by a depth sensor, processing an image, acquiring a long time sequence, automatically labeling, carrying out exception handling and managing metadata. 
The method ensures the benefits of data quality, diversified data collection, automatic labeling, abnormal data processing, data management optimization, additional metadata information provision and the like through a plurality of steps, and has important significance for the development of computer vision application.
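For illustration only, the cooperation of the modules described above can be sketched as a sequential pipeline threading one payload (frames, point clouds, annotations, metadata) through each stage. The `Module`/`Pipeline` interface is hypothetical and not part of the claimed system; it merely shows one way the module composition could be organized in software.

```python
class Module:
    """Minimal interface each functional module implements; concrete
    subclasses would be DataPreparation, DepthSensor, ImageProcessing,
    LongSequenceAcquisition, AutoLabeling, ExceptionHandling and
    MetadataManagement (all names hypothetical)."""
    def process(self, payload):
        return payload

class Pipeline:
    """Run the acquisition modules in order, passing one payload dict
    through all of them."""
    def __init__(self, modules):
        self.modules = modules

    def run(self, payload):
        for module in self.modules:
            payload = module.process(payload)
        return payload
```

A sequential design keeps each module independently testable and replaceable, which matches the method's step-by-step structure; a real system might instead stream frames through the stages concurrently.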
The present invention is not limited to the above embodiments. Equivalent embodiments obtained by changing or modifying the technical content disclosed above may be applied to other fields; however, any simple modification, equivalent change, or adaptation of the above embodiments made according to the technical substance of the present invention still falls within the protection scope of the technical solution of the present invention.

Claims (10)

1. The data acquisition method based on the computer vision technology is characterized by comprising the following steps of:
preparing a data scene, and introducing an asynchronous data acquisition strategy while selecting camera parameters;
introducing a depth sensor into the data scene to acquire three-dimensional point cloud data;
an image processing technology is introduced to perform denoising and distortion correction operation on the three-dimensional point cloud data, and the accuracy of the data is ensured through a data quality evaluation mechanism;
acquiring video streams or dynamic scenes in the three-dimensional point cloud data by adopting a long-time sequence data acquisition strategy, and extracting key frames;
performing automatic data annotation on the key frames by using a computer vision algorithm, establishing a real-time feedback and adjustment mechanism, and displaying image quality indexes and target detection results of the key frames in real time;
identifying and processing abnormal data in the key frames, and optimizing the storage and management process of the key frames by adopting a data enhancement technology;
and storing metadata information comprising acquisition conditions, equipment parameters and object types, and providing additional information for data analysis and algorithm training.
2. The method for data collection based on computer vision technology according to claim 1, wherein the step of preparing a data scene and selecting camera parameters while introducing an asynchronous data collection strategy specifically comprises:
setting related objects, backgrounds and illumination conditions, and selecting a representative data acquisition scene;
an industrial camera is adopted as acquisition equipment, and resolution, frame rate, exposure time and focal length parameters are debugged;
asynchronous data acquisition is carried out by using a plurality of groups of industrial cameras; specifically, the plurality of groups of industrial cameras are configured based on the requirements and geometric layout of the acquisition scene, and a software synchronization mechanism is adopted to ensure that the plurality of groups of industrial cameras are synchronized physically and in time;
meanwhile, an industrial camera which is asynchronously arranged is started to acquire visual data;
based on camera calibration, feature matching and a stereoscopic vision algorithm, vision calibration and alignment are carried out on asynchronously acquired vision data;
and fusing or superposing visual data acquired by a plurality of groups of industrial cameras based on an image fusion algorithm to generate multi-view visual information.
3. The method for acquiring data based on the computer vision technology according to claim 1, wherein the step of acquiring three-dimensional point cloud data by introducing a depth sensor specifically comprises:
adopting a structured light sensor as the depth sensor, and acquiring depth information in a scene by means of the depth sensor;
through hardware synchronization or software calibration, the synchronization between the depth sensor and the industrial camera and the corresponding relation between the depth and the color image are ensured;
extracting depth information for the structured light sensor based on a triangulation algorithm;
calibrating the acquired depth data by using camera intrinsic calibration, extrinsic calibration and laser plane calibration to eliminate errors and distortion of the depth sensor;
based on a three-dimensional reconstruction algorithm, the depth information and the corresponding color image are fused to obtain accurate three-dimensional point cloud data with a geometric structure.
4. The data acquisition method based on the computer vision technology according to claim 1, wherein the step of denoising and distortion correcting the three-dimensional point cloud data by using the image processing technology and ensuring the accuracy of the data by using a data quality assessment mechanism is specifically as follows:
removing noise points in the three-dimensional point cloud data by using a statistical filtering algorithm, specifically Gaussian filtering;
performing camera distortion correction on the three-dimensional point cloud data based on epipolar correction, and removing lens distortion by using calibration information;
taking the point cloud density, the point cloud stability and the curvature consistency as data quality evaluation indexes, judging the accuracy and the usability of the three-dimensional point cloud data, and acquiring a data quality evaluation result;
and repairing and compensating the part with poor quality by adopting a data interpolation and missing region filling method based on the data quality evaluation result.
5. The method for collecting data based on computer vision technology according to claim 1, wherein the step of collecting video streams or dynamic scenes in the three-dimensional point cloud data by using a long-time sequence data collection strategy specifically comprises:
setting the acquired time range, frame rate and sampling rate parameters;
storing and managing three-dimensional point cloud data acquired for a long time to be used as long-time sequence data;
tracking a moving object by using a target tracking algorithm aiming at a dynamic scene in the long-time sequence data, or modeling dynamically-changed objects and structures by using a scene reconstruction algorithm;
calibrating and aligning the long time sequence data by using a time stamp calibration algorithm;
and dividing the long time sequence data to extract key frames.
6. The method for collecting data based on computer vision technology according to claim 1, wherein the step of using computer vision algorithm to automatically label the key frame, and establishing a real-time feedback and adjustment mechanism, and displaying the image quality index and the target detection result of the key frame in real time comprises the following steps:
performing automatic data annotation on the key frames by adopting a computer vision algorithm comprising target detection, target segmentation, pose estimation and key point detection;
extracting the position, category, bounding box and key point information of the target as a labeled training data set;
training a computer vision model based on the noted training data set to enable the computer vision model to automatically identify target objects and required noted information in the key frames;
based on applying the trained computer vision model to the key frames, automatically performing data annotation, and performing localization, classification and bounding box drawing operations on targets in the key frames according to image features and the prediction result of the model to generate annotation information;
performing image quality index calculation on the key frames by using an image processing algorithm to obtain an automatically marked target detection result;
and displaying the target detection result of the automatic labeling on the key frame in real time, and feeding back and adjusting in real time according to the display of the image quality index and the target detection result.
7. The method for data collection based on computer vision technology according to claim 1, wherein the step of identifying and processing abnormal data in the key frame and optimizing the process of storing and managing the key frame by using data enhancement technology specifically comprises:
identifying abnormal data in the key frame by using an abnormal detection algorithm based on statistical analysis, and identifying and marking the abnormal data in the key frame by analyzing the data characteristics and learning an abnormal mode;
according to the type and the characteristics of the abnormal data, selecting an abnormal processing method, wherein the abnormal processing method comprises data restoration, data rejection and data interpolation;
processing the key frame by using a data enhancement technology comprising image rotation, scaling, translation and overturning, and expanding a data set;
and optimizing storage and management of the key frames, performing data compression by using a compression algorithm, establishing an index and database system, and improving the efficiency of data storage and access.
8. The method for data collection based on computer vision technology according to claim 1, wherein the step of storing metadata information including collection conditions, equipment parameters, object categories, and providing additional information for data analysis and algorithm training is specifically as follows:
defining metadata information to be saved, including acquisition conditions, equipment parameters and object categories, and determining the type, format and structure of the metadata;
in the data acquisition process, recording metadata information related to acquisition conditions and equipment parameters, wherein the metadata information comprises acquisition time, location, illumination conditions, camera parameters and sensor parameters;
in the data labeling process, recording metadata information related to the object type and the labeling process, wherein the metadata information comprises the object type, the information of a labeling person and the labeling time;
associating and storing metadata in the acquisition process and the labeling process with actual data, and associating and storing the metadata as an additional file with a data file;
in the process of data analysis and algorithm training, the stored metadata information is applied to perform data segmentation, label selection and model evaluation tasks.
9. A data acquisition system based on computer vision technology, which is characterized by comprising a data preparation module, a depth sensor module, an image processing module, a long time sequence acquisition module, an automatic labeling module, an exception processing module and a metadata management module, wherein the data acquisition system based on computer vision technology is responsible for executing the data acquisition method based on computer vision technology according to any one of claims 1-8.
10. The computer vision technology based data acquisition system of claim 9, wherein the functional items of the data preparation module comprise scene settings, camera parameter configuration and asynchronous acquisition;
the functional items of the depth sensor module comprise sensor selection, synchronization and calibration, and three-dimensional point cloud acquisition;
the functional items of the image processing module comprise denoising, correction and data quality evaluation;
the functional items of the long time sequence acquisition module comprise parameter setting, data storage and management, and dynamic scene processing;
the functional items of the automatic labeling module comprise target recognition and labeling, and image quality evaluation;
the functional items of the exception handling module comprise exception identification and handling, and data enhancement and optimization;
the functional items of the metadata management module comprise metadata recording and association, and data analysis and training support.
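The module-to-function mapping of claims 9-10 can be summarized as a lookup table, with a helper for finding which module provides a given functional item. The Python identifiers below are illustrative names chosen for this sketch, not part of the claimed system:

```python
# Modules and their functional items, mirroring claims 9-10.
MODULE_FUNCTIONS = {
    "data_preparation": [
        "scene_settings", "camera_parameter_configuration",
        "asynchronous_acquisition"],
    "depth_sensor": [
        "sensor_selection", "synchronization_and_calibration",
        "point_cloud_acquisition"],
    "image_processing": [
        "denoising", "correction", "data_quality_evaluation"],
    "long_time_sequence_acquisition": [
        "parameter_setting", "data_storage_and_management",
        "dynamic_scene_processing"],
    "automatic_labeling": [
        "target_recognition_and_labeling", "image_quality_evaluation"],
    "exception_handling": [
        "exception_identification_and_handling",
        "data_enhancement_and_optimization"],
    "metadata_management": [
        "metadata_recording_and_association",
        "analysis_and_training_support"],
}

def modules_providing(function_item: str) -> list:
    """Return the modules that expose a given functional item."""
    return [m for m, items in MODULE_FUNCTIONS.items()
            if function_item in items]
```

Such a table lets a dispatcher route each pipeline stage to its owning module without hard-coding the seven modules at every call site.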
CN202310943038.1A 2023-07-28 2023-07-28 Data acquisition method and system based on computer vision technology Pending CN116935192A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310943038.1A CN116935192A (en) 2023-07-28 2023-07-28 Data acquisition method and system based on computer vision technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310943038.1A CN116935192A (en) 2023-07-28 2023-07-28 Data acquisition method and system based on computer vision technology

Publications (1)

Publication Number Publication Date
CN116935192A true CN116935192A (en) 2023-10-24

Family

ID=88384199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310943038.1A Pending CN116935192A (en) 2023-07-28 2023-07-28 Data acquisition method and system based on computer vision technology

Country Status (1)

Country Link
CN (1) CN116935192A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117268474A (en) * 2023-11-21 2023-12-22 江西中汇云链供应链管理有限公司 Device and method for estimating volume, number and weight of objects in scene

Similar Documents

Publication Publication Date Title
CN103324937B (en) The method and apparatus of label target
CN103366250B (en) City appearance environment detection method based on three-dimensional live-action data and system
CN110580723B (en) Method for carrying out accurate positioning by utilizing deep learning and computer vision
CN113192206B (en) Three-dimensional model real-time reconstruction method and device based on target detection and background removal
CN109685078B (en) Infrared image identification method based on automatic annotation
CN105741379A (en) Method for panoramic inspection on substation
CN109239076A (en) A kind of sewing thread trace defect inspection method based on machine vision
Santos et al. 3D plant modeling: localization, mapping and segmentation for plant phenotyping using a single hand-held camera
CN111523545B (en) Article searching method combined with depth information
CN109919007B (en) Method for generating infrared image annotation information
CN113112504A (en) Plant point cloud data segmentation method and system
CN111091538A (en) Method and device for automatically identifying and detecting pipeline welding seam and defect
CN116935192A (en) Data acquisition method and system based on computer vision technology
CN111160261A (en) Sample image labeling method and device for automatic sales counter and storage medium
CN113870267B (en) Defect detection method, defect detection device, computer equipment and readable storage medium
CN113222913B (en) Circuit board defect detection positioning method, device and storage medium
CN115619738A (en) Detection method for module side seam welding after welding
Zhang et al. TPMv2: An end-to-end tomato pose method based on 3D key points detection
CN116935013A (en) Circuit board point cloud large-scale splicing method and system based on three-dimensional reconstruction
CN113784026B (en) Method, apparatus, device and storage medium for calculating position information based on image
CN113344782B (en) Image stitching method and device, storage medium and electronic device
Berezhnoy et al. Approaches for automated monitoring and evaluation of in vitro plant’s morphometric parameters
CN108875825A (en) A kind of Grainhouse injurious insect detection method based on image block
CN113627255A (en) Mouse behavior quantitative analysis method, device, equipment and readable storage medium
CN105787514B (en) Temperature checking method based on infrared vision matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination