CN114419018A - Image sampling method, system, device and medium

Image sampling method, system, device and medium

Info

Publication number
CN114419018A
Authority
CN
China
Prior art keywords
image
sampled
definition
sampling
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210088759.4A
Other languages
Chinese (zh)
Inventor
姜恒
张桂荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Unisinsight Technology Co Ltd
Original Assignee
Chongqing Unisinsight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Unisinsight Technology Co Ltd filed Critical Chongqing Unisinsight Technology Co Ltd
Priority to CN202210088759.4A priority Critical patent/CN114419018A/en
Publication of CN114419018A publication Critical patent/CN114419018A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Abstract

The invention provides an image sampling method, system, device and medium. The method comprises: sequentially acquiring images to be sampled; determining a definition difference value according to the current definition of the image to be sampled and the previous definition of the previous image to be sampled; and judging whether the image to be sampled meets a preset sampling condition, the image to be sampled being sampled if it contains a target object and/or its definition difference value is greater than a preset definition difference value.

Description

Image sampling method, system, device and medium
Technical Field
The present invention relates to the field of data sampling technologies, and in particular, to an image sampling method, system, device, and medium.
Background
AI (artificial intelligence) technology is now widely applied in industries such as security monitoring. Target detection, one of the important branches of AI, is also a core technology driving the upgrade of security monitoring. A target detection model usually requires massive image data as input to its training set, which in turn improves the generalization of the model.
However, manual image data acquisition is difficult and time-consuming, which makes collecting training-set images costly. Taking a single scene as an example, material has to be shot and extracted under different imaging conditions (illumination, smoke, noise), at different times of day and night, and in spring, summer, autumn and winter under sunny, rainy, snowy and foggy weather; the whole process, from setting up video recording equipment to extracting frames from the video, consumes considerable manpower and financial resources.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, the present invention provides an image sampling method, system, device and medium to solve the above-mentioned technical problems.
The invention provides an image sampling method, which comprises the following steps:
sequentially acquiring images to be sampled;
determining a definition difference value according to the current definition of the image to be sampled and the previous definition of the previous image to be sampled;
if the image to be sampled meets a preset sampling condition, sampling the image to be sampled;
the preset sampling condition comprises at least one of the following conditions, wherein the image to be sampled comprises a target object; the definition difference is greater than a preset definition difference.
Optionally, if the preset sampling condition includes that the definition difference of the image to be sampled is greater than the preset definition difference and the image to be sampled includes the target object, the image to be sampled satisfying the preset sampling condition includes:
respectively acquiring the current definition of the image to be sampled and the previous definition of the previous image to be sampled;
determining a definition difference value according to the current definition and the previous definition;
if the definition difference value is larger than a preset definition difference value, performing target detection on the image to be sampled;
and if the image to be sampled comprises the target object, the image to be sampled meets a preset sampling condition.
Optionally, the definition determining method includes:
$$\mathrm{Ten} = \frac{1}{n}\sum_{x,y} S(x,y)^{2} \qquad (1)$$
where Ten is the sharpness, n is the total number of pixels in the image, and S (x, y) is the gradient at image pixel I (x, y).
Optionally, the target detection is performed on the image to be sampled by using a target detection model, and before the target detection is performed on the image to be sampled, the method further includes at least one of the following steps:
if the current definition is lower than the previous definition, reducing a detection threshold value of the target detection model;
if the current definition is higher than the previous definition, the detection threshold value of the target detection model is increased;
wherein the detection threshold comprises at least one of a confidence and a cross-over ratio.
Optionally, the method further comprises at least one of:
the target detection model carries out target detection on the foreground of the image to be sampled;
and the target detection model is pre-configured with a plurality of groups of confidence degrees and cross-over ratios.
Optionally, before the images to be sampled are sequentially acquired, the method further includes:
the method comprises the steps of obtaining a plurality of initial images, grouping the initial images according to a shooting scene, sequencing the initial images according to group classes, sequentially using the initial images in each group as alternative images, and waiting for the alternative images to be obtained sequentially.
Optionally, if the sharpness difference of the image to be sampled is smaller than a preset sharpness difference, and the plurality of initial images are determined according to a plurality of continuous video frames, the method further includes:
and taking the alternative image located a preset number of positions after the image to be sampled as the next image to be sampled.
The present invention also provides an image sampling system, the system comprising:
the image acquisition module is used for sequentially acquiring images to be sampled;
the determining module is used for determining a definition difference value according to the current definition of the image to be sampled and the previous definition of the previous image to be sampled;
the sampling module is used for sampling the image to be sampled if the image to be sampled meets a preset sampling condition;
wherein the preset sampling condition comprises at least one of the following: the image to be sampled comprises a target object; the definition difference is greater than a preset definition difference.
The invention also provides an electronic device, which comprises a processor, a memory and a communication bus;
the communication bus is used for connecting the processor and the memory;
the processor is configured to execute the computer program stored in the memory to implement the method according to any one of the embodiments described above.
The present invention also provides a computer-readable storage medium, having stored thereon a computer program,
the computer program is for causing a computer to perform a method as in any one of the embodiments described above.
The invention has the beneficial effects that: according to the image sampling method, system, device and medium, images to be sampled are acquired sequentially and checked against a preset sampling condition; if an image to be sampled contains a target object and/or its definition difference value is greater than the preset definition difference value, the image is sampled. Image sampling can thus be realized conveniently, accurately and quickly, reducing the difficulty of sampling training-set images, the cost of image acquisition and the dependence on manual work.
Drawings
FIG. 1 is a schematic flow chart of an image sampling method provided in an embodiment of the present invention;
fig. 2 is a schematic diagram of a structure of a Sobel convolution kernel provided in an embodiment of the present invention;
fig. 3 is another schematic diagram of the structure of the Sobel convolution kernel provided in an embodiment of the present invention;
FIG. 4 is a flow chart illustrating an exemplary image sampling method provided in an embodiment of the present invention;
FIG. 5 is a schematic flow chart of a specific image sampling method provided in an embodiment of the present invention;
FIG. 6 is a schematic diagram of an embodiment of an image sampling system;
FIG. 7 is a schematic diagram of a specific structure of an image sampling system according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
In the following description, numerous details are set forth to provide a more thorough explanation of embodiments of the present invention, however, it will be apparent to one skilled in the art that embodiments of the present invention may be practiced without these specific details, and in other embodiments, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring embodiments of the present invention.
As shown in fig. 1, the present embodiment provides an image sampling method, including:
step S101: and sequentially acquiring images to be sampled.
Step S102: and determining a definition difference value according to the current definition of the image to be sampled and the previous definition of the previous image to be sampled.
Step S103: and if the image to be sampled meets the preset sampling condition, sampling the image to be sampled.
In an embodiment, the obtaining manner of the image to be sampled may be the following method, that is, before step S101, the method further includes:
the method comprises the steps of obtaining a plurality of initial images, grouping the initial images according to a shooting scene, sequencing the initial images according to group types, sequentially taking the initial images in each group as alternative images, and waiting for the alternative images to be obtained sequentially. The candidate images are acquired in sequence, and the currently acquired candidate image is used as an image to be sampled. The previously acquired alternative image is used as a history image to be sampled, an alternative image acquired before the image to be sampled is used as a previous image to be sampled, or a plurality of acquired alternative images before the image to be sampled are used as previous images to be sampled, and at this time, the previous definition of the previous image to be sampled can be determined by the average value of the plurality of alternative images. For example, after the initial images are grouped and sorted, the candidate image sequence 1, 2, 3, 4, 5, 6, 7, 8, 1 is obtained first. Currently, the acquired image to be sampled is 6, and the previous image to be sampled may be 5, or 2, 3, 4, 5, or other candidate images that have been acquired by a preset number by those skilled in the art.
The initial images may be derived from one or more videos. When the initial images are derived from a plurality of videos, the videos may be shot in a single shooting scene or in a plurality of shooting scenes. The shooting scene may be the image background; images may be regarded as belonging to the same shooting scene when their backgrounds are identical or when the background similarity is higher than a preset background similarity threshold.
Optionally, the initial images in each group may be ordered according to the acquisition time order of the initial images. When the initial image is derived from a video, the ordering may be according to the order of the video frames.
Optionally, sorting by group means sorting the grouped sets of initial images, and the sorting rule may be set as needed by those skilled in the art. For example, a rough manual screening of shooting scenes may first produce several groups of initial images; the groups are then ordered by shooting scene so that groups with similar or identical scenes are adjacent, which improves the efficiency of subsequent image sampling.
Optionally, only one candidate image is determined each time as an image to be sampled for use in the image sampling method.
Alternatively, the initial image may be obtained from open source data, or may be acquired by legal means, or the like, as known to those skilled in the art.
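The candidate-queue construction described above can be summarized in a short sketch. This is a minimal illustration only, assuming each initial image already carries a scene label (for example from the manual pre-screening) and an acquisition or frame order; the function and field names are not from the patent.

```python
# Minimal sketch of building the candidate-image queue: group initial images by
# shooting scene, order the groups so similar/identical scenes are adjacent,
# then order the images inside each group by acquisition (or video-frame) order.
from collections import defaultdict
from typing import Any, Iterable, List, Tuple

def build_candidate_queue(initial_images: Iterable[Tuple[str, int, Any]]) -> List[Any]:
    """initial_images yields (scene_label, capture_order, image) triples."""
    groups = defaultdict(list)
    for scene, order, image in initial_images:
        groups[scene].append((order, image))
    queue: List[Any] = []
    for scene in sorted(groups):                               # keep similar scenes adjacent
        for _, image in sorted(groups[scene], key=lambda t: t[0]):  # chronological within a scene
            queue.append(image)
    return queue
```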
Wherein the preset sampling condition comprises at least one of the following conditions:
the image to be sampled comprises a target object;
and the definition difference value of the image to be sampled is greater than the preset definition difference value.
In one embodiment, the sharpness difference is determined based on a current sharpness of the image to be sampled and a previous sharpness of a previous image to be sampled.
For example, denote the currently acquired image to be sampled as A and the one or more previously acquired images to be sampled as B; the definition of A is taken as the current definition a, the definition of B (or the mean definition over B) as the previous definition b, and the definition difference is then |a − b|.
The preset definition difference may be a value set by those skilled in the art, and there may be one or more preset definition differences; for example, there may be two, one used when the current definition is greater than the previous definition and another used when the current definition is less than the previous definition.
It should be noted that the above way of determining the definition difference is only one optional example; the relative change of the definition may also be used as the definition difference. Continuing the example above, the definition difference may instead be computed as |a − b|/b, in which case the preset definition difference is adjusted accordingly. The definition difference may also be determined in other ways known to those skilled in the art.
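As a minimal illustration of the two difference measures just mentioned (the absolute difference |a − b| and the relative change |a − b|/b), the sketch below assumes the definitions a and b have already been computed; the function names are illustrative, not part of the patent.

```python
from typing import Sequence

def definition_difference(current: float, previous: float, relative: bool = False) -> float:
    """Absolute difference |a - b|, or the relative change |a - b| / b if requested."""
    diff = abs(current - previous)
    return diff / previous if relative else diff

def previous_definition(history: Sequence[float]) -> float:
    """Previous definition taken as the mean over one or more earlier images, as described above."""
    return sum(history) / len(history)
```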
In one embodiment, the sharpness may be determined based on the gradient of pixels in the image to be sampled, the total number of pixels in the image.
Alternatively, the sharpness may be determined based on the total number of pixels of the image to be sampled and the gradient at each pixel of the image.
Specifically, the definition determination method includes:
$$\mathrm{Ten} = \frac{1}{n}\sum_{x,y} S(x,y)^{2} \qquad (1)$$
where Ten is the sharpness, n is the total number of pixels in the image, and S (x, y) is the gradient at image pixel I (x, y).
That is, either the current sharpness or the previous sharpness may be determined in the manner described above.
Alternatively, the gradient S (x, y) at the image pixel I (x, y) may be determined in the following manner:
$$S(x,y) = \sqrt{\big(G_x * I(x,y)\big)^{2} + \big(G_y * I(x,y)\big)^{2}} \qquad (2)$$
where Gx and Gy denote the Sobel convolution kernels; an example of their structure can be seen in fig. 2 and fig. 3.
Optionally, the definition may be determined using models such as a Tenengrad definition evaluation model.
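A minimal sketch of a Tenengrad-style definition evaluation consistent with equations (1) and (2) is given below. It assumes OpenCV and NumPy are available and uses the standard 3×3 Sobel kernels, which Figs. 2 and 3 presumably depict; it is an illustration, not the patented implementation.

```python
import cv2
import numpy as np

def tenengrad_definition(image_bgr: np.ndarray) -> float:
    """Ten = (1/n) * sum of S(x, y)^2 over all n pixels, with S the Sobel gradient magnitude."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float64)
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)   # convolution with kernel Gx
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)   # convolution with kernel Gy
    s = np.sqrt(gx ** 2 + gy ** 2)                    # S(x, y), equation (2)
    return float(np.mean(s ** 2))                     # equation (1), n = total pixel count
```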
Optionally, by determining the definition difference, it can be judged whether there is a significant scene change, or a significant change in imaging conditions, between the currently acquired image to be sampled and the previously acquired image to be sampled. It can be understood that when the definition difference is smaller than the preset definition difference, or falls within a preset difference range, the shooting scene can be considered almost unchanged between the two images; if every such image were sampled, repeated images would be collected, causing data redundancy and wasting resources, so the currently acquired image to be sampled is not sampled. When the definition difference is greater than the preset definition difference, or does not fall within the preset difference range, the shooting scene can be considered to have changed, for example through scene switching, a change of scene background, a change of illumination or smoke occlusion; the currently acquired image to be sampled then needs to be sampled, so that different samples are added. Using whether the definition difference exceeds the preset definition difference as the criterion is based mainly on the principle of no-reference image definition evaluation, namely gradient calculation over the image: an image that contains more detail, i.e. a sharper image, exhibits more gradient variation.
In one embodiment, if the sharpness difference of the to-be-sampled image is smaller than a preset sharpness difference, and a plurality of the initial images are determined according to a plurality of consecutive video frames, the method further includes:
and taking the preset number of the alternative images behind the image to be sampled as the next image to be sampled.
It can be understood that, as mentioned in the foregoing embodiment, the initial images are grouped and ordered; that is, before the images to be sampled are acquired one by one, a candidate image queue formed by the initial images is prepared. If the definition difference of the image to be sampled is smaller than the preset definition difference, the shooting scene has hardly changed between the currently acquired image to be sampled and the previously acquired one; when the initial images come from consecutive video frames, continuing to take images strictly in queue order would, with very high probability, keep producing definition differences below the preset definition difference and waste computation. In that case, a preset number N of initial images in the candidate queue is skipped and the candidate image after them is used as the next image to be sampled, which effectively saves computing power and other resources. Optionally, the preset number may be set as needed by those skilled in the art, or determined from the definition difference: for example, a mapping relationship between the definition difference and the preset number may be preconfigured and the appropriate preset number looked up from it, or a preset function relating the definition difference to the preset number may be evaluated with the definition difference as input. The preset number may also be determined in other ways known to those skilled in the art.
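The skip-ahead behaviour described in this paragraph could be realized as sketched below. The linear mapping from definition difference to skip count is only an assumed example of the "preset mapping relationship"; the patent leaves the exact rule to the implementer.

```python
def next_candidate_index(current_index: int, definition_diff: float,
                         preset_diff: float, max_skip: int = 10) -> int:
    """Index of the next image to be sampled in the candidate queue.

    When the definition difference is below the preset difference (scene almost
    unchanged) and the candidates come from consecutive video frames, skip a
    preset number of candidates; otherwise move on to the very next candidate.
    """
    if definition_diff >= preset_diff:
        return current_index + 1
    # Assumed mapping: the smaller the change, the more candidates are skipped.
    ratio = definition_diff / preset_diff if preset_diff > 0 else 0.0
    skip = max(1, int(round(max_skip * (1.0 - ratio))))
    return current_index + skip
```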
In one embodiment, the mode of judging whether the image to be sampled includes the target object may be that the image to be sampled is subjected to target detection through a pre-trained target detection model, and then a detection result of whether the image to be sampled includes the target object is obtained. Optionally, in the training process of the target detection model, corresponding confidence degrees and Intersection ratios IoU (Intersection over Union) are set for different shooting scenes, and at this time, a target detection model is preset with multiple sets of confidence degrees and Intersection ratios IoU.
In one embodiment, the target detection is performed on the image to be sampled through the target detection model, and before the target detection is performed on the image to be sampled, the method further includes at least one of the following steps:
if the current definition is lower than the previous definition, reducing the detection threshold of the target detection model;
if the current definition is higher than the previous definition, the detection threshold value of the target detection model is increased;
wherein the detection threshold comprises at least one of a confidence and a cross-over ratio.
It can be understood that when the current definition is lower than the previous definition, for example because illumination, rain, snow or fog degrades the current image in road monitoring, the detection threshold of the target detection model (such as the confidence or IoU) may be lowered in order to improve the detection rate of target detection. Although some precision is sacrificed, more images to be sampled containing the target object can be detected; this avoids the situation where a detection threshold that is too high for the reduced definition causes a large number of images containing the target object to be missed, or even none to be detected, which would limit the applicability of the method. Similarly, when the current definition is higher than the previous definition, the reason for raising the detection threshold mirrors the case above: as the definition increases, the imaging quality improves, so a higher detection precision is required of the target detection model in order to avoid false detections.
Optionally, as in the foregoing embodiment, if the target detection model is configured with a preset confidence level and an intersection ratio, the confidence level and/or the intersection ratio of the target detection model may be reduced or improved according to a magnitude relationship between the current definition and the previous definition.
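One possible realization of this definition-guided adjustment of the detection thresholds is sketched below; the step size and clamping bounds are illustrative assumptions, and in practice the preset confidence/IoU pairs configured per shooting scene would be chosen instead.

```python
from typing import Tuple

def adjust_detection_thresholds(confidence: float, iou: float,
                                current_def: float, previous_def: float,
                                step: float = 0.05) -> Tuple[float, float]:
    """Lower the thresholds when the definition drops, raise them when it improves."""
    if current_def < previous_def:        # imaging quality got worse
        confidence, iou = confidence - step, iou - step
    elif current_def > previous_def:      # imaging quality improved
        confidence, iou = confidence + step, iou + step
    clamp = lambda v: min(0.95, max(0.05, v))   # keep thresholds in a sane range (assumed bounds)
    return clamp(confidence), clamp(iou)
```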
In one embodiment, the method further comprises at least one of:
the target detection model carries out target detection on the foreground of the image to be sampled;
the target detection model is pre-configured with multiple sets of confidence degrees and cross-over ratios.
In one embodiment, the target detection model may be a model trained with a target detection algorithm based on a convolutional neural network and YOLO. The target detection model may also be a neural network detection and classification model based on a deep learning algorithm: ResNet-50 is adopted as the backbone to extract image features, an FPN network is adopted to extract multi-scale feature information, and YOLOv3 is used as the detection head; the model structure is built with the deep learning framework PyTorch or TensorFlow. The target detection model may be trained on sampling images previously obtained through manual work and simple definition detection, and/or on images acquired by those skilled in the art through other known methods, and the trained target detection model is then used to perform target detection on the image to be sampled.
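A minimal PyTorch sketch of the described architecture (ResNet-50 backbone, FPN neck, YOLO-style heads) is shown below. It only illustrates the wiring of the components named in the text; the head layout, anchor count and class count are assumptions, and anchor decoding, loss computation and training are omitted.

```python
from collections import OrderedDict
import torch
import torch.nn as nn
import torchvision

class SamplingDetector(nn.Module):
    """ResNet-50 backbone + FPN neck + simple YOLO-style prediction heads (illustrative only)."""
    def __init__(self, num_classes: int = 2, num_anchors: int = 3):
        super().__init__()
        resnet = torchvision.models.resnet50()
        # Stem and the four residual stages of ResNet-50.
        self.stem = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool)
        self.layer1, self.layer2 = resnet.layer1, resnet.layer2
        self.layer3, self.layer4 = resnet.layer3, resnet.layer4
        # FPN over the C3/C4/C5 feature maps (their channel widths in ResNet-50).
        self.fpn = torchvision.ops.FeaturePyramidNetwork([512, 1024, 2048], 256)
        # One YOLO-style 1x1 prediction conv per pyramid level:
        # (x, y, w, h, objectness) + class scores for each anchor.
        out_ch = num_anchors * (5 + num_classes)
        self.heads = nn.ModuleList([nn.Conv2d(256, out_ch, 1) for _ in range(3)])

    def forward(self, x):
        c2 = self.layer1(self.stem(x))
        c3 = self.layer2(c2)
        c4 = self.layer3(c3)
        c5 = self.layer4(c4)
        feats = self.fpn(OrderedDict([("c3", c3), ("c4", c4), ("c5", c5)]))
        return [head(f) for head, f in zip(self.heads, feats.values())]

# preds = SamplingDetector()(torch.randn(1, 3, 416, 416))  # three raw prediction maps
```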
In one embodiment, if the preset sampling condition includes that the definition difference of the image to be sampled is greater than the preset definition difference and the image to be sampled includes the target object, the step of satisfying the preset sampling condition by the image to be sampled includes:
respectively acquiring the current definition of an image to be sampled and the previous definition of a previous image to be sampled;
determining a definition difference value according to the current definition and the previous definition;
if the definition difference value is larger than the preset definition difference value, performing target detection on the image to be sampled;
and if the image to be sampled comprises the target object, the image to be sampled meets the preset sampling condition.
In other words, in an image sampling method, the definition of an acquired image to be sampled is detected, when the definition fluctuates greatly, the image is subjected to target detection, and if the image to be sampled includes a target object, the image to be sampled is subjected to image sampling. Therefore, data sampling can be realized more efficiently, and the accuracy and the usability of sampling data (images) are improved.
For example, mass data is input into a definition evaluation module in sequence, the definition of each image is calculated, and the historical definition is recorded. (1) Definition evaluation: when the definition changes greatly and its difference from the historical definition exceeds a set threshold, it is judged that the image scene has changed significantly (for example, scene switching or a change of scene illumination), and this image and a set number of subsequent images are sampled; if the definition difference is smaller than the set threshold, it is judged that the image scene has not changed noticeably, so these repeated images are skipped without sampling. (2) Foreground target detection: a deep learning detection model is used to detect the specified targets; when an image contains a specified foreground of interest to the user, such as a vehicle or a human body, the image is sampled, otherwise sampling is skipped. The invention also uses definition to provide auxiliary guidance for foreground target detection: when the definition of the current image is obviously lower than the historical definition, the threshold of target detection is lowered to improve the detection rate; when the definition is higher than the historical definition, the detection threshold is raised.
In one embodiment, when the initial images come from the same video sequence, the fluctuation of the detection confidence over that sequence can be fed back to the sharpness evaluation model as a supervisory signal, so that the sharpness evaluation model can be optimized.
The embodiment is based on a definition evaluation mechanism and a foreground target detection algorithm, realizes automatic scene state judgment of mass data, automatically specifies target detection judgment, completes automatic and high-efficiency data sampling, reduces the working difficulty of model training personnel in data preparation, and improves the data set construction efficiency.
This embodiment provides an image sampling method in which images to be sampled are acquired sequentially and checked against a preset sampling condition; if an image to be sampled contains a target object and/or its definition difference value is greater than the preset definition difference value, the image is sampled. Image sampling can thus be realized conveniently, accurately and quickly, reducing the difficulty of sampling training-set images, the cost of image acquisition and the dependence on manual work.
Optionally, by applying definition evaluation and target detection, the image to be sampled is screened in at least one of the two dimensions of definition and target detection, and only images that pass the screening are sampled. This improves data sampling efficiency, reduces the frequency of collecting repeated invalid data, lowers the difficulty of constructing the data set, saves a large amount of data collection, labeling and correction work, facilitates the construction of a balanced data set, and saves labor.
Optionally, the image sampling method described in the above embodiments may also be applied to a conventional target detection pipeline: for example, where video imaging quality in road monitoring is affected by weather conditions such as light, rain, snow and fog, key parameters of the target detection technology such as the confidence and IoU thresholds may be adjusted automatically according to the definition evaluation, improving the detection rate and accuracy of target detection.
Optionally, the images to be sampled may first be screened in the definition dimension (whether the definition difference of the image to be sampled is greater than the preset definition difference), and the images whose definition difference is greater than the preset definition difference are used for the training set of the target detection model, which reduces the labeling difficulty of the training set and improves the stability of the confidence of the target detection model.
In addition, data acquisition in the current AI industry mainly depends on manual work by data engineers and model engineers, and is therefore easily influenced by personal collection habits, subjective preferences, depth of understanding of the model, and other factors. For example, in a traffic-scene data collection task, engineer A may collect the first half of the data; engineer B, unaware of the earlier collection, then collects data under similar weather conditions at the same checkpoint and omits the data recorded after the weather changes, even though the model needs precisely that second half because it enriches the training scenes far more. The image sampling method provided by this embodiment can automatically and preferentially sample monitored scene material, greatly reduce the influence of external factors such as a change of data engineer, and reduce manual participation. The method also generalizes well, for example to automatic alarm video recording in the monitoring field, reducing cost and improving efficiency.
The inventor has found that in recent years AI technology has brought a new wave of rapid development to the security monitoring industry, and city monitoring networks around the world face a comprehensive "artificial intelligence" upgrade. Target detection, as one of the important branches of AI, is a core technology driving this upgrade, and its ever faster iteration profoundly influences the development of the security monitoring industry. The scenes handled by the security monitoring industry have expanded from the narrow recognition of traffic vehicles and human bodies on urban main roads and the snapshot of people and vehicles running red lights, to generalized urban road vehicle recognition, forest-region personnel recognition, border-line human and vehicle detection, human body recognition in shopping malls, electric bicycle (storage battery vehicle) recognition in elevators, and so on. Artificial intelligence technologies such as target detection are therefore gradually replacing the older generation of manual monitoring technologies and becoming important technical components of modern public security systems.
The security monitoring equipment of a modern public security system places high, almost strict, requirements on the generalization of a target detection model, and such generalization requires massive image information as a training set from which the feature information of the detection target is abstracted. Taking the human brain as an example: from birth, the eyes continuously receive external scene image information for decades and the brain analyzes and refines it, so that when an object such as a vehicle appears in a scene, its appearance combined with the brain's judgment leads to the conclusion that the object is a vehicle. In the same way, a model learns the appearance characteristics of objects under different backgrounds, angles and imaging conditions from training on massive data.
Over a whole year, the number of scene images received by human eyes runs into the hundreds of millions, whereas current target detection training data sets in the security monitoring industry contain on the order of tens of thousands to millions of images, orders of magnitude less than what the human brain receives and processes. For this reason, manual data acquisition is limited by its difficulty and time consumption. Taking one scene as an example, material has to be photographed and extracted under different imaging conditions (illumination, smoke, noise), at different times of day and night, in spring, summer, autumn and winter, and in clear, rainy, snowy and foggy weather; the whole process, from setting up video recording equipment, to extracting frames from the video, to manual labeling, consumes considerable manpower and financial resources. Similarly, when a data set is constructed at present, the screening of a large amount of data is done manually: data of different scenes is selected from the mass data, and data of different foregrounds (containing different targets) and different imaging conditions (affected by illumination and smoke) is selected within the same scene. Scene screening and matching, video frame extraction and video labeling all require a great deal of manpower, so the process is inefficient.
To solve the above problems, the image sampling method is described below through a specific embodiment. A scene-adaptive data sampling method, combining a deep learning target detection model with a definition evaluation function, completes data sampling over the same scene sequence for different foreground targets and background changes (such as illumination and rain or fog), realizing automatic screening and sampling of scene data, greatly reducing the difficulty of data sampling, saving manpower and improving the efficiency of constructing data sets for target detection models.
For example, referring to fig. 4, fig. 4 is a flowchart of a specific image sampling method. Based on massive image data that can be acquired by those skilled in the art, and after manual preprocessing or other preparation known in the art, the image sampling method can automatically screen out sample data (sampled images) with different foreground targets, different imaging conditions (illumination, smoke) and so on that are useful for a deep learning model, saving screening labor, reducing the difficulty of constructing deep learning data sets, and improving model training efficiency. Specifically, massive image data is first collected from open source data or user data as the data store, and is manually preprocessed to obtain the initial images for automatic sampling; the manual preprocessing may consist of a rough scene screening and a sorting of the data by scene, so that similar or identical scenes are arranged consecutively to form the images to be sampled, which improves the efficiency of the subsequent automatic sampling. Automatic sampling is then carried out, and comprises two parts: definition calculation and foreground target detection. (1) Definition calculation: the mass data is input into the definition evaluation module in sequence, the definition of each image is calculated and the historical definition is recorded; when the definition changes greatly and its difference from the historical definition is larger than the set threshold (the definition difference of the image to be sampled is greater than the preset definition difference), it is judged that the image scene has changed significantly, such as scene switching or a change of scene illumination, and this image and a set number of subsequent images are sampled; if the definition difference is smaller than the set threshold, it is judged that the image scene has not changed noticeably, and these repeated images are skipped without sampling. (2) Foreground target detection: a deep learning detection model is used to detect the specified targets, and an image is sampled when it contains a specified foreground of interest to the user, such as a vehicle or a human body; otherwise sampling is skipped. Images to be sampled that pass both the definition calculation and the foreground target detection are sampled to obtain the sampled images. Optionally, the definition may also be used to guide foreground target detection: if the definition of the current image is obviously lower than the historical definition, the target detection threshold is lowered to improve the detection rate; if it is higher than the historical definition, the detection threshold is raised. Based on the definition evaluation mechanism and the foreground target detection algorithm, the scene state of massive image data is judged automatically, the specified targets are detected automatically, and automatic, efficient data sampling is completed, reducing the workload of model trainers in data preparation and improving the efficiency of data set construction.
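Putting the pieces together, the scene-adaptive sampling flow of fig. 4 might be organized as in the sketch below. The helpers tenengrad_definition, adjust_detection_thresholds and next_candidate_index are the illustrative ones sketched earlier, detect_targets stands in for the trained foreground detection model, and the relative-change threshold of 0.15 is an arbitrary assumed value; none of these names or numbers come from the patent itself.

```python
def scene_adaptive_sampling(candidates, detect_targets, preset_diff=0.15,
                            confidence=0.5, iou=0.5):
    """Sketch of the fig. 4 flow: definition screening, then foreground target detection."""
    sampled, previous_def, i = [], None, 0
    while i < len(candidates):
        image = candidates[i]
        current_def = tenengrad_definition(image)
        if previous_def is None:                      # first image: just record its definition
            previous_def, i = current_def, i + 1
            continue
        diff = abs(current_def - previous_def) / max(previous_def, 1e-9)  # relative change
        if diff <= preset_diff:                       # scene almost unchanged: skip ahead
            i = next_candidate_index(i, diff, preset_diff)
            previous_def = current_def
            continue
        # Definition changed noticeably: adapt the thresholds, then run detection.
        confidence, iou = adjust_detection_thresholds(confidence, iou,
                                                      current_def, previous_def)
        if detect_targets(image, confidence, iou):    # image contains a specified target
            sampled.append(image)
        previous_def, i = current_def, i + 1
    return sampled
```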
Referring now to fig. 5, the image sampling method described above is illustrated schematically by another specific embodiment.
S501: and acquiring mass image data.
For example, according to the deep learning model training requirement, a large amount of image data is collected from open source data and user data and used as a data set construction basis.
S502: and constructing a scene image database.
For example, the mass data is manually preprocessed, and similar and same scene data are sequentially arranged to obtain an image to be sampled.
S503: and (5) evaluating the definition.
The image to be sampled is subjected to primary screening by sequentially determining the definition of the image to be sampled, and when the definition difference is larger than a set threshold (preset definition difference), the image to be sampled is used as a sample image in a training set of a target detection model.
S504: and (5) training a foreground target detection model.
For example, a neural network detection and classification model based on a deep learning algorithm is constructed: ResNet-50 is adopted as the backbone to extract image features, an FPN network is adopted to extract multi-scale feature information, and YOLOv3 is adopted as the detection head; the model structure is built with the deep learning framework PyTorch or TensorFlow.
Optionally, the data obtained in step S503 may be used to train the model for about 200 rounds to obtain a good foreground target detection model, for example for vehicle detection or pedestrian detection.
S505: scene adaptive data sampling.
Scene adaptive data sampling can be completed based on the sharpness evaluation in step S503 and the target detection model trained in step S504, and a sampled image is obtained.
Referring to fig. 6, the present embodiment provides an image sampling system 600, including:
an image obtaining module 601, configured to sequentially obtain images to be sampled;
a determining module 602, configured to determine a sharpness difference according to a current sharpness of an image to be sampled and a previous sharpness of a previous image to be sampled;
the sampling module 603 is configured to sample the image to be sampled if the image to be sampled meets a preset sampling condition;
the preset sampling condition comprises at least one of the following conditions, wherein the image to be sampled comprises a target object; and the definition difference value of the image to be sampled is greater than the preset definition difference value.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a specific image sampling system, in which the determining module 602 is realized by a scene state judgment module 701 based on a definition evaluation mechanism and a foreground target detection module 702 based on a convolutional neural network, together with a scene-adaptive data sampling module 703. The specific data processing of each module is as follows:
1. and the scene state judging module is based on a definition evaluation mechanism.
The main principle of no-reference image definition evaluation is gradient calculation over the image: an image that contains more detail, i.e. a sharper image, exhibits more gradient variation. Based on this principle, the scene state judgment module 701 automatically judges the state of the scene through a definition evaluation function, detecting, for example, scene switching or scene background changes (illumination, smoke occlusion). When the image definition changes little, the scene is almost unchanged, so the sampling interval is enlarged during sampling to reduce the sampling of repeated images; when the image definition changes greatly, for example due to an illumination change, smoke occlusion or even a scene switch, the sampling interval is reduced and the changed scene is sampled. For example, the definition calculation may use a definition evaluation function based on the Tenengrad algorithm.
The Tenengrad algorithm is calculated as follows:
Let the Sobel convolution kernels be Gx and Gy, as shown for example in fig. 2 and fig. 3; the gradient at the image pixel I(x, y) is then given by equation (2) above, and the definition is determined according to formula (1) above.
2. And the foreground object detection module is based on a convolutional neural network.
The target detection module 702 mainly detects specified foreground targets, so that images containing a specified target can be sampled; if images containing a human body or a vehicle need to be acquired, this module performs real-time target detection and samples the images that yield a detection result. The invention adopts a target detection algorithm based on a convolutional neural network and YOLO to detect the foreground targets.
3. Data sampling based on scene adaptation.
The sampling module 703 is configured to sample the image to be sampled, which is determined by the scene state determining module 701 and the target detecting module 702.
By combining the definition evaluation technology and the deep learning target detection technology, data sampling of specified targets, of different scenes, and of the same scene under changing conditions is realized over the mass data.
In this embodiment, the image sampling system is substantially provided with a plurality of modules for executing the method in the above embodiments, and specific functions and technical effects may refer to the above method embodiments, which are not described herein again.
Referring to fig. 8, an embodiment of the present invention further provides an electronic device 1000, which includes a processor 1001, a memory 1002, and a communication bus 1003;
the communication bus 1003 is used to connect the processor 1001 and the memory 1002;
the processor 1001 is configured to execute the computer program stored in the memory 1002 to implement the method according to one or more of the first embodiment.
Embodiments of the present invention also provide a computer-readable storage medium, having a computer program stored thereon,
the computer program is for causing a computer to perform the method as in any one of the above embodiments one.
Embodiments of the present application also provide a non-transitory readable storage medium, where one or more modules (programs) are stored in the storage medium, and when the one or more modules are applied to a device, the device may execute instructions (instructions) included in an embodiment of the present application.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims (10)

1. A method of image sampling, the method comprising:
sequentially acquiring images to be sampled;
determining a definition difference value according to the current definition of the image to be sampled and the previous definition of the previous image to be sampled;
if the image to be sampled meets a preset sampling condition, sampling the image to be sampled;
the preset sampling condition comprises at least one of the following: the image to be sampled comprises a target object; the definition difference is greater than a preset definition difference.
2. The image sampling method according to claim 1, wherein if the preset sampling condition includes that the sharpness difference of the image to be sampled is greater than a preset sharpness difference and the image to be sampled includes a target object, the image to be sampled satisfying the preset sampling condition comprises:
respectively acquiring the current definition of the image to be sampled and the previous definition of the previous image to be sampled;
determining a definition difference value according to the current definition and the previous definition;
if the definition difference value is larger than a preset definition difference value, performing target detection on the image to be sampled;
and if the image to be sampled comprises the target object, the image to be sampled meets a preset sampling condition.
3. The image sampling method of claim 1, wherein the definition is determined according to:
Ten = (1/n) Σ_(x,y) S(x, y)^2
where Ten is the definition, n is the total number of pixels in the image, and S(x, y) is the gradient at image pixel I(x, y).
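The definition measure of claim 3 has the form of the Tenengrad focus measure. A minimal sketch follows, assuming OpenCV, a grayscale input, and 3x3 Sobel kernels for the gradient S(x, y); the choice of gradient operator is an assumption, since the published text does not name one.

import cv2
import numpy as np

def tenengrad_definition(gray):
    # Gradient S(x, y) at each pixel I(x, y), here taken from Sobel derivatives (assumed operator)
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    s = np.sqrt(gx ** 2 + gy ** 2)
    # Ten: the squared gradient magnitude summed over the image and divided by n, the pixel count
    return float(np.mean(s ** 2))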
4. The image sampling method of any one of claims 1-3, wherein the image to be sampled is subjected to target detection by a target detection model, and before the target detection of the image to be sampled, the method further comprises at least one of:
if the current definition is lower than the previous definition, reducing a detection threshold value of the target detection model;
if the current definition is higher than the previous definition, increasing the detection threshold value of the target detection model;
wherein the detection threshold comprises at least one of a confidence and an intersection-over-union (IoU) ratio.
5. The image sampling method of claim 4, wherein the method further comprises at least one of:
the target detection model carries out target detection on the foreground of the image to be sampled;
and the target detection model is pre-configured with a plurality of groups of confidences and intersection-over-union (IoU) ratios.
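A minimal sketch of the threshold adaptation in claims 4 and 5: the detector's confidence and intersection-over-union (IoU) thresholds move in the same direction as the change in definition. The step size, the bounds, and the example threshold groups are illustrative assumptions not specified in the claims.

def adjust_detection_thresholds(current_definition, previous_definition,
                                confidence_thr, iou_thr, step=0.05):
    if current_definition < previous_definition:       # definition dropped: loosen the detector
        confidence_thr = max(0.1, confidence_thr - step)
        iou_thr = max(0.1, iou_thr - step)
    elif current_definition > previous_definition:      # definition improved: tighten the detector
        confidence_thr = min(0.9, confidence_thr + step)
        iou_thr = min(0.9, iou_thr + step)
    return confidence_thr, iou_thr

# Claim 5 additionally allows several pre-configured (confidence, IoU) groups, for example:
THRESHOLD_GROUPS = [(0.3, 0.4), (0.5, 0.5), (0.7, 0.6)]  # illustrative values only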
6. The image sampling method of any of claims 1-3, wherein prior to sequentially acquiring images to be sampled, the method further comprises:
the method comprises the steps of obtaining a plurality of initial images, grouping the initial images according to a shooting scene, sequencing the initial images according to group classes, sequentially using the initial images in each group as alternative images, and waiting for the alternative images to be obtained sequentially.
7. The image sampling method of claim 6, wherein, if the definition difference value of the image to be sampled is less than the preset definition difference value and the plurality of initial images are taken from a plurality of consecutive video frames, the method further comprises:
and taking the candidate image a preset number of positions after the image to be sampled as the next image to be sampled.
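Claims 6 and 7 can be read as a candidate-queue strategy: the initial images are grouped by shooting scene and queued group by group, and when consecutive video frames show a definition difference below the preset value, the queue skips ahead by a preset number of candidates. The sketch below assumes a scene label is available for every initial image and uses an illustrative skip count.

from collections import defaultdict

def order_candidates_by_scene(initial_images, scene_of):
    # Group the initial images by shooting scene and queue them group by group (claim 6)
    groups = defaultdict(list)
    for image in initial_images:
        groups[scene_of(image)].append(image)
    candidates = []
    for scene_images in groups.values():
        candidates.extend(scene_images)
    return candidates

def next_candidate_index(current_index, total_candidates, preset_skip=5):
    # Claim 7: jump a preset number of candidates ahead when consecutive video frames
    # barely differ in definition (preset_skip is an illustrative value)
    return min(current_index + preset_skip, total_candidates - 1)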
8. An image sampling system, characterized in that the system comprises:
the image acquisition module is used for sequentially acquiring images to be sampled;
the determining module is used for determining a definition difference value according to the current definition of the image to be sampled and the previous definition of the previous image to be sampled;
the sampling module is used for sampling the image to be sampled if the image to be sampled meets a preset sampling condition;
wherein the preset sampling condition comprises at least one of the following: the image to be sampled comprises a target object; the definition difference value is greater than a preset definition difference value.
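A structural sketch of the system of claim 8, with the three modules expressed as methods of one class; the constructor arguments and the absolute-difference comparison are illustrative assumptions rather than details fixed by the claim.

class ImageSamplingSystem:
    def __init__(self, compute_definition, detect_target, preset_diff=5.0):
        self.compute_definition = compute_definition
        self.detect_target = detect_target
        self.preset_diff = preset_diff
        self.previous_definition = None

    def acquire(self, image_source):
        # image acquisition module: yields the images to be sampled in order
        yield from image_source

    def definition_difference(self, image):
        # determining module: difference between the current and the previous definition
        current = self.compute_definition(image)
        diff = None if self.previous_definition is None else abs(current - self.previous_definition)
        self.previous_definition = current
        return diff

    def maybe_sample(self, image):
        # sampling module: sample when at least one preset condition is met
        diff = self.definition_difference(image)
        if self.detect_target(image) or (diff is not None and diff > self.preset_diff):
            return image
        return None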
9. An electronic device comprising a processor, a memory, and a communication bus;
the communication bus is used for connecting the processor and the memory;
the processor is configured to execute a computer program stored in the memory to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program,
the computer program is for causing a computer to perform the method of any one of claims 1-7.
CN202210088759.4A 2022-01-25 2022-01-25 Image sampling method, system, device and medium Pending CN114419018A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210088759.4A CN114419018A (en) 2022-01-25 2022-01-25 Image sampling method, system, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210088759.4A CN114419018A (en) 2022-01-25 2022-01-25 Image sampling method, system, device and medium

Publications (1)

Publication Number Publication Date
CN114419018A (en) 2022-04-29

Family

ID=81278314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210088759.4A Pending CN114419018A (en) 2022-01-25 2022-01-25 Image sampling method, system, device and medium

Country Status (1)

Country Link
CN (1) CN114419018A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109248378A (en) * 2018-09-09 2019-01-22 深圳硅基仿生科技有限公司 Video process apparatus, method and the retina stimulator of retina stimulator
CN110807491A (en) * 2019-11-05 2020-02-18 上海眼控科技股份有限公司 License plate image definition model training method, definition detection method and device
CN111325096A (en) * 2020-01-19 2020-06-23 北京字节跳动网络技术有限公司 Live stream sampling method and device and electronic equipment
CN111339842A (en) * 2020-02-11 2020-06-26 深圳壹账通智能科技有限公司 Video jamming identification method and device and terminal equipment
CN113763432A (en) * 2021-09-10 2021-12-07 北京理工大学 Target detection tracking method based on image definition and tracking stability conditions

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116644296A (en) * 2023-07-27 2023-08-25 北京斯年智驾科技有限公司 Data enhancement method and device
CN116644296B (en) * 2023-07-27 2023-10-03 北京斯年智驾科技有限公司 Data enhancement method and device

Similar Documents

Publication Publication Date Title
CN111444821B (en) Automatic identification method for urban road signs
US20040151374A1 (en) Video segmentation using statistical pixel modeling
CN108039044B (en) Vehicle intelligent queuing system and method based on multi-scale convolutional neural network
CN109543691A (en) Ponding recognition methods, device and storage medium
CN113792606B (en) Low-cost self-supervision pedestrian re-identification model construction method based on multi-target tracking
CN114092917B (en) MR-SSD-based shielded traffic sign detection method and system
CN114926766A (en) Identification method and device, equipment and computer readable storage medium
CN112115975A (en) Deep learning network model fast iterative training method and equipment suitable for monitoring device
CN111046773A (en) Method for judging water retention in pavement based on image technology
CN103886609B (en) Vehicle tracking method based on particle filtering and LBP features
CN112396042A (en) Real-time updated target detection method and system, and computer-readable storage medium
CN115049841A (en) Depth unsupervised multistep anti-domain self-adaptive high-resolution SAR image surface feature extraction method
CN114495060B (en) Road traffic marking recognition method and device
CN116012815A (en) Traffic element identification method, multi-task network model, training method and training device
CN115546742A (en) Rail foreign matter identification method and system based on monocular thermal infrared camera
CN115187945A (en) Lane line recognition method, lane line recognition device, electronic device, and storage medium
CN114419018A (en) Image sampling method, system, device and medium
CN114742996A (en) Image semantic segmentation method and device, electronic equipment and storage medium
CN112509321A (en) Unmanned aerial vehicle-based driving control method and system for urban complex traffic situation and readable storage medium
Liu et al. Multi-lane detection by combining line anchor and feature shift for urban traffic management
CN115439815A (en) Driving condition identification method, device, equipment, medium and vehicle
CN112001211B (en) Object detection method, device, equipment and computer readable storage medium
CN114038044A (en) Face gender and age identification method and device, electronic equipment and storage medium
CN113963310A (en) People flow detection method and device for bus station and electronic equipment
CN111310660A (en) Target detection false alarm suppression method and device for ADAS scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220429