CN116824445A - Scene classification method and device based on feature points and image transformation frequency

Scene classification method and device based on feature points and image transformation frequency

Info

Publication number
CN116824445A
Authority
CN
China
Prior art keywords
value
threshold
feature points
response
distribution variance
Prior art date
Legal status
Pending
Application number
CN202310714287.3A
Other languages
Chinese (zh)
Inventor
段鹏瑞
马华东
张韫
Current Assignee
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202310714287.3A priority Critical patent/CN116824445A/en
Publication of CN116824445A publication Critical patent/CN116824445A/en
Pending legal-status Critical Current

Abstract

The application provides a scene classification method and device based on feature points and image transformation frequency, comprising the following steps: acquiring a video to be classified, converting the video into a sequence of consecutive frame images, and extracting key frames from the sequence based on a histogram variation method; extracting a first preset number of FAST feature points from each key frame based on a preset gray value threshold; calculating the response value average and the distribution variance of the first preset number of FAST feature points; comparing the response value average and the distribution variance, according to a preset classification rule, against preset response value average thresholds and distribution variance thresholds for natural scenes and screen scenes, so as to classify the key frames; and, when classification cannot be decided by threshold comparison or the number of FAST feature points is insufficient, acquiring a second preset number of frame images following the corresponding key frame and classifying the key frame by comparing the image transformation frequency with a preset frequency threshold. The method provided by the application requires no trained classification model and achieves high classification speed and high precision.

Description

Scene classification method and device based on feature points and image transformation frequency
Technical Field
The application relates to the technical field of scene classification, in particular to a scene classification method and device based on feature points and image transformation frequency.
Background
Scene classification refers to classifying an input image or video into different scene categories, for example natural scenes and screen scenes: a natural scene generally refers to a real outdoor or indoor environment, while a screen scene generally refers to an interface displayed on a computer screen, mobile phone screen or other device.
Feature points are widely used in scene classification tasks. In the field of image processing, feature points are points in an image that carry distinctive local structure, such as edges, corners and textures; their locations and descriptors can be used to represent local features of the image. The patent with publication number CN114373145A, entitled "Method for classifying monitoring video scenes based on key frame acquisition of ORB algorithm", describes obtaining the label of a video by extracting ORB feature points and matching them against a pre-trained model. The patent with publication number CN104680173A, entitled "A remote sensing image scene classification method", describes first pre-classifying images according to the distribution of local invariant feature points into two types, uniform and non-uniform feature point distribution, and then training and classifying each type with a different model. However, these existing scene classification schemes require a classification model or classifier to be trained in advance; the training process is cumbersome and the actual classification speed is low.
The patent with publication number CN107666610A, entitled "Desktop video content analysis method", describes using the image transformation frequency as the criterion for video classification: block hash values of video frames within a window period are computed and compared with those of the first frame, and the video is classified according to the number of differing frames. This method captures the speed of image change but has poor classification precision.
Disclosure of Invention
In view of this, embodiments of the application provide a scene classification method and device based on feature points and image transformation frequency, so as to eliminate or mitigate one or more defects of the prior art, namely that existing classification techniques require a classification model or classifier to be trained in advance, that the training process is cumbersome, and that classification speed and precision are poor.
In one aspect, the present application provides a scene classification method based on feature points and image transformation frequencies, which is characterized in that the method includes the following steps:
acquiring videos to be classified, converting the videos into a continuous frame image sequence, and extracting key frames from the continuous frame image sequence based on a histogram variation method;
extracting a first preset number of FAST feature points from each key frame based on a preset gray value threshold;
calculating the response value average value and the distribution variance of the first preset number of FAST feature points;
acquiring a first response value average value threshold value and a first distribution variance threshold value in a preset natural scene, and acquiring a second response value average value threshold value and a second distribution variance threshold value in a screen scene; classifying the corresponding keyframes as natural scenes if the response value average value is smaller than the first response value average value threshold and the distribution variance is smaller than the first distribution variance threshold, and classifying the corresponding keyframes as screen scenes if the response value average value is larger than the second response value average value threshold and the distribution variance is larger than the second distribution variance threshold;
when the response value average value and the distribution variance of the FAST feature points do not meet the classification rule, or the number of the FAST feature points is smaller than the first preset number, acquiring a second preset number of frame images after corresponding key frames, and calculating the image transformation frequency by judging whether two adjacent frame images in the second preset number of frame images are transformed, if the image transformation frequency is larger than a preset frequency threshold, classifying the corresponding key frames as natural scenes, and if the image transformation frequency is smaller than the frequency threshold, classifying the corresponding key frames as screen scenes.
In some embodiments of the present application, extracting key frames from the sequence of consecutive frame images based on a histogram variance method further comprises:
converting the continuous frame image sequence into gray level images and calculating a gray level histogram;
and calculating the gray histogram difference between adjacent frames, and taking the current frame as the key frame if the difference is larger than a preset difference threshold.
In some embodiments of the present application, extracting a first preset number of FAST feature points from each key frame based on a preset gray value threshold value further includes:
in one cycle, based on a preset gray value threshold, extracting FAST feature points from each key frame; sorting the extracted FAST feature points according to the response values; modifying the preset gray value threshold value to be the gray value of the FAST feature point with the lowest response value;
and circulating the steps until the FAST feature points with the first preset number of high response values are obtained.
In some embodiments of the present application, the extracted FAST feature points are ordered according to response values, where the method for calculating the response values of the FAST feature points includes the following steps:
acquiring a preset gray value threshold value for extracting the FAST feature points, and calculating an intermediate value between the preset gray value threshold value and a maximum gray value to serve as an initial intermediate value;
in one cycle, if the initial intermediate value is smaller than the gray value of the FAST feature point, calculating the intermediate value of the initial intermediate value and the maximum gray value, and taking the intermediate value as a new initial intermediate value; if the initial intermediate value is larger than the gray value of the FAST feature point, calculating an intermediate value of the preset gray value threshold and the initial intermediate value, and taking the intermediate value as a new initial intermediate value;
and circulating the steps until a boundary gray value is obtained, and taking the boundary gray value as a response value of the FAST feature point.
In some embodiments of the present application, the method for calculating the distribution variance of the first preset number of FAST feature points includes the following steps:
dividing the corresponding key frame into a plurality of areas, and counting the number of FAST feature points in each area;
and calculating the variance of the quantity of the FAST feature points in all the areas to obtain the distribution variance.
In some embodiments of the present application, after the scene classification is implemented based on the image transformation frequency, the first response value average threshold and the first distribution variance threshold in the natural scene or the second response value average threshold and the second distribution variance threshold in the screen scene are adjusted according to the scene type.
In some embodiments of the application, further comprising:
when the corresponding key frame is judged to be a natural scene, if the response value average value is larger than the first response value average value threshold value, the first response value average value threshold value is increased, and the calculation formula is as follows:
wherein Tne represents the first response value average threshold; e represents the response value average value;
if the response value average value is smaller than the first response value average value threshold, and the distribution variance is larger than the first distribution variance threshold, the first distribution variance threshold is raised, and the calculation formula is as follows:
wherein Tnd represents the first distribution variance threshold; d represents the distribution variance;
when the corresponding key frame is judged to be a screen scene, if the response value average value is smaller than the second response value average value threshold, the second response value average value threshold is reduced, and the calculation formula is as follows:
wherein Tse represents the second response value average threshold; e represents the response value average value;
if the response value average value is greater than the second response value average value threshold, and the distribution variance is smaller than the second distribution variance threshold, the second distribution variance threshold is reduced, and the calculation formula is as follows:
wherein Tsd represents the second distribution variance threshold; d represents the distribution variance.
In some embodiments of the present application, calculating the image transformation frequency by judging whether two adjacent frame images in the second preset number of frame images are transformed further includes:
acquiring two adjacent frame images in the second preset number of frame images;
dividing two adjacent frames of images into a plurality of areas uniformly, and counting a gray level histogram of each area;
comparing the regional gray histograms of two adjacent frames and recording the number of transformed regions; if more than half of the regions are transformed, marking the current frame image as transformed;
wherein the image transformation frequency represents the number of frame images, among the second preset number of frame images, in which a transformation occurs.
In another aspect, the present application provides a scene classification device based on feature points and image transformation frequencies, comprising a processor and a memory, wherein the memory has stored therein computer instructions, the processor being configured to execute the computer instructions stored in the memory, the device implementing the steps of the scene classification method based on feature points and image transformation frequencies as described in any one of the above, when the computer instructions are executed by the processor.
In another aspect, the present application also provides a computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the scene classification method based on feature points and image transformation frequencies as described in any one of the above.
The application has the advantages that:
the application provides a scene classification method and device based on feature points and image transformation frequency, comprising the following steps: acquiring videos to be classified, converting the videos into a continuous frame image sequence, and extracting key frames from the continuous frame image sequence based on a histogram variation method; extracting a first preset number of FAST feature points from each key frame based on a preset gray value threshold; calculating the response value average value and the distribution variance of a first preset number of FAST feature points; comparing the response value average value and the distribution variance with a response value average value threshold value and a distribution variance threshold value in a preset natural scene and a screen scene according to a preset classification rule so as to classify the key frames; when the classification can not be performed through the comparison threshold or the quantity of the FAST feature points is insufficient, the transformation frequency of the images is adopted to complement, and the transformation frequency of the images is compared with a preset frequency threshold to classify the key frames by acquiring a second preset quantity of frame images after corresponding key frames. The scene classification method provided by the application does not need to train a classification model, avoids a complicated training process, and has high classification speed and high precision.
Furthermore, when scene classification cannot be decided by threshold comparison and must instead be obtained from the image transformation frequency, the response value average threshold and distribution variance threshold of the corresponding scene can be dynamically adjusted according to the finally determined scene type, so as to improve classification precision.
Additional advantages, objects, and features of the application will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and drawings.
It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present application are not limited to the above-described specific ones, and that the above and other objects that can be achieved with the present application will be more clearly understood from the following detailed description.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate and together with the description serve to explain the application. In the drawings:
fig. 1 is a schematic diagram of a scene classification method based on feature points and image transformation frequency according to an embodiment of the application.
Fig. 2 is a flowchart of a scene classification method based on feature points and image transformation frequency according to an embodiment of the application.
FIG. 3 is a flowchart of a method for calculating feature point response values according to an embodiment of the present application.
FIG. 4 is a flowchart illustrating the dynamic adjustment of scene threshold based on the image transformation frequency determination result according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following embodiments and the accompanying drawings, in order to make the objects, technical solutions and advantages of the present application more apparent. The exemplary embodiments of the present application and the descriptions thereof are used herein to explain the present application, but are not intended to limit the application.
It should be noted here that, in order to avoid obscuring the present application due to unnecessary details, only structures and/or processing steps closely related to the solution according to the present application are shown in the drawings, while other details not greatly related to the present application are omitted.
It should be emphasized that the term "comprises/comprising" when used herein is taken to specify the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.
It is also noted herein that the term "coupled" may refer to not only a direct connection, but also an indirect connection in which an intermediate is present, unless otherwise specified.
Hereinafter, embodiments of the present application will be described with reference to the accompanying drawings. In the drawings, the same reference numerals represent the same or similar components, or the same or similar steps.
It should be emphasized that the references to steps below are not intended to limit the order of the steps, but rather should be understood to mean that the steps may be performed in a different order than in the embodiments, or that several steps may be performed simultaneously.
To solve the problems that existing classification techniques require a classification model or classifier to be trained in advance, that the training process is cumbersome, and that classification speed and precision are poor, the application provides a scene classification method based on feature points and image transformation frequency. As shown in fig. 1 and 2, the method comprises the following steps S101 to S105:
step S101: and acquiring videos to be classified, converting the videos into a continuous frame image sequence, and extracting key frames from the continuous frame image sequence based on a histogram variation method.
Step S102: based on a preset gray value threshold, a first preset number of FAST feature points are extracted from each key frame.
Step S103: and calculating the response value average value and the distribution variance of the first preset number of FAST feature points.
Step S104: acquiring a first response value average value threshold value and a first distribution variance threshold value in a preset natural scene, and acquiring a second response value average value threshold value and a second distribution variance threshold value in a screen scene; and classifying the corresponding key frames as natural scenes if the response value average value is smaller than the first response value average value threshold and the distribution variance is smaller than the first distribution variance threshold, and classifying the corresponding key frames as screen scenes if the response value average value is larger than the second response value average value threshold and the distribution variance is larger than the second distribution variance threshold.
Step S105: when the average value and the distribution variance of the response values of the FAST feature points do not meet the classification rule, or the number of the FAST feature points is smaller than the first preset number, obtaining a second preset number of frame images after corresponding key frames, and judging whether two adjacent frame images in the second preset number of frame images are transformed or not to calculate image transformation frequency, if the image transformation frequency is larger than a preset frequency threshold, classifying the corresponding key frames as natural scenes, and if the image transformation frequency is smaller than the frequency threshold, classifying the corresponding key frames as screen scenes.
In step S101, a video requiring scene classification is acquired, and the video is converted into a sequence of successive frame images using video editing software.
In some embodiments, video may be read and converted into a sequence of sequential frame images using the VideoCapture function in the open source computer vision library OpenCV.
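By way of illustration, a minimal sketch of this step in Python using OpenCV's VideoCapture might be as follows (the function name video_to_frames and its structure are our own, not specified by the application):

```python
import cv2

def video_to_frames(path):
    """Read a video file and return its frames as a list of BGR images."""
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:  # end of stream or read error
            break
        frames.append(frame)
    cap.release()
    return frames
```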
After the continuous frame image sequence is acquired, key frames are extracted from the continuous frame image sequence based on a histogram variation method. The histogram is also called gray level histogram, and is a statistical chart used for representing the number distribution of each gray level pixel in the image. A gray level histogram is typically composed of 256 bars, representing all gray levels from 0 to 255. The horizontal axis represents the gray level, and the vertical axis represents the number of pixels in the image where the gray level appears. The gray level distribution condition of the image can be intuitively known through the gray level histogram.
In some embodiments, the method of extracting the key frame includes the following steps S1011 to S1012:
step S1011: the sequence of successive frame images is converted into gray images, and a gray histogram of each gray image is calculated.
Step S1012: and calculating the difference of the gray level histograms between adjacent frames, and taking the current frame as a key frame if the difference is larger than a preset difference threshold value.
In some embodiments, the gray histogram difference between adjacent frames may be calculated using a histogram similarity metric, such as cross entropy or the correlation coefficient.
In some embodiments, the gray histogram for each gray image may be calculated using the calcHist function in the open source computer vision library OpenCV.
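A possible implementation of steps S1011 to S1012, assuming the correlation-based histogram comparison mentioned above, is sketched below; the difference threshold of 0.3 is an illustrative value, as the application only specifies a preset threshold:

```python
import cv2

def extract_key_frames(frames, diff_threshold=0.3):
    """Keep frames whose gray histogram differs enough from the previous frame."""
    key_frames, prev_hist = [], None
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [256], [0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        if prev_hist is not None:
            # Correlation is 1.0 for identical histograms, so 1 - correlation
            # serves as the histogram difference between adjacent frames.
            diff = 1.0 - cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if diff > diff_threshold:
                key_frames.append(frame)
        prev_hist = hist
    return key_frames
```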
In some embodiments, step S101 only converts the video into a sequence of consecutive frame images without extracting key frames; in that case, the following steps S102 to S105 perform feature point extraction and scene classification on every frame image in the sequence.
In step S102, a gray value threshold is preset, and N FAST feature points with high response values are extracted from each key frame by using the gray value threshold, where N represents a first preset number. The FAST feature point extraction algorithm is an algorithm for extracting image corner points, and can rapidly extract feature points capable of representing image information, and in the algorithm, the FAST corner points (feature points) are defined as: if a pixel differs significantly from a sufficient number of pixels in its surrounding neighborhood, the pixel may be a corner, and the degree of difference between the corner and the surrounding neighborhood pixels is the intensity of the corner.
Step S102 is specifically divided into the following steps S1021 to S1023:
step S1021: the gray scale image of the key frame in step S101 is acquired, and in some embodiments, the gray scale image may be denoised using a smoothing filter to reduce the false detection rate.
Step S1022: and performing feature point detection by using a FAST feature point extraction algorithm, extracting feature points in the key frame according to a preset gray value threshold, and calculating response values of the feature points.
Step S1023: sort the extracted feature points by response value from high to low, and select the N feature points with the highest response values as the FAST feature points required by the application.
In some embodiments, N FAST feature points may not be extracted at one time according to a preset gray value threshold, and the gray value threshold needs to be continuously adjusted to meet the number of required FAST feature points, specifically:
In one cycle, FAST feature points are extracted from each key frame using the FAST feature point extraction algorithm based on a preset gray value threshold (at this point fewer than N FAST feature points are extracted); the extracted FAST feature points are sorted by response value; and the preset gray value threshold is modified to the gray value of the FAST feature point with the lowest response. These steps are looped until N FAST feature points with high response values are obtained.
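A sketch of this loop with OpenCV's FastFeatureDetector follows. Note one assumption: OpenCV's FAST threshold is a pixel-difference value rather than an absolute gray value, so the application's rule of resetting the threshold to the gray value of the weakest point is approximated here by stepping the threshold down; init_threshold and step are illustrative values:

```python
import cv2

def top_n_fast_points(gray, n, init_threshold=40, step=5):
    """Collect at least n FAST feature points, then keep the n strongest."""
    threshold = init_threshold
    detector = cv2.FastFeatureDetector_create(threshold=threshold)
    key_points = detector.detect(gray)
    while len(key_points) < n and threshold > step:
        threshold -= step  # loosen the threshold and re-detect
        detector.setThreshold(threshold)
        key_points = detector.detect(gray)
    # Sort by response from high to low and keep the top n (step S1023).
    key_points = sorted(key_points, key=lambda kp: kp.response, reverse=True)
    return key_points[:n]
```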
In some embodiments, as shown in fig. 3, the method of calculating the response value of each FAST feature point in step S1022 may further include the following steps S10221 to S10223:
step S10221: and acquiring a preset gray value threshold value for extracting the corresponding FAST feature points, and calculating an intermediate value between the preset gray value threshold value and the maximum gray value to serve as an initial intermediate value.
Step S10222: in one cycle, if the initial intermediate value is smaller than the gray value of the FAST feature point, calculating the intermediate value of the initial intermediate value and the maximum gray value, and taking the intermediate value as a new initial intermediate value; if the initial intermediate value is larger than the gray value of the FAST feature point, calculating the intermediate value of the preset gray value threshold and the initial intermediate value, and taking the intermediate value as a new initial intermediate value.
Step S10223: and circulating the steps until a boundary gray value is obtained, and taking the boundary gray value as a response value of the FAST feature point.
In step S10221, the acquired preset gray value threshold is first explained with an example: if the corresponding FAST feature point was not extracted in the first round of feature point extraction, and the gray value threshold was then adjusted to 95 by the threshold-adjustment method above before the point was extracted, the preset gray value threshold acquired in step S10221 is 95. The maximum gray value is the largest possible pixel value of the image, which is 255 for an 8-bit gray image. The starting intermediate value is then (95 + 255) / 2 = 175.
In step S10222, comparing the initial intermediate value with the gray value of the FAST feature point amounts to testing whether the point would still be extracted, and judged a feature point, with the initial intermediate value as threshold. Specifically, if the initial intermediate value is smaller than the gray value of the FAST feature point, i.e. the point is still a feature point, the intermediate value of the initial intermediate value and the maximum gray value becomes the new initial intermediate value: with an initial intermediate value of 175 and a maximum gray value of 255, the new initial intermediate value is 215. If the initial intermediate value is greater than the gray value of the FAST feature point, i.e. the point is no longer a feature point, the intermediate value of the preset gray value threshold and the initial intermediate value becomes the new initial intermediate value: with an initial intermediate value of 175, the new initial intermediate value is 135.
In step S10223, step S10222 is looped: a new initial intermediate value is repeatedly obtained and used to test whether the FAST feature point is still extracted and judged a feature point, continually narrowing the gray value interval until a boundary gray value is reached, which is taken as the response value of the feature point.
In fig. 3, T in the initialization denotes the preset gray value threshold acquired in step S10221, and bmax is the maximum gray value 255; step S10222 is then looped, continuously updating the values of bmax and bmin and narrowing the interval between them, until finally bmax = bmin, which yields the boundary gray value.
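The bisection of steps S10221 to S10223 can be sketched as follows; is_feature_at is a hypothetical callback, not part of the application, that re-runs the FAST corner test for the single pixel at a given threshold:

```python
def response_by_bisection(is_feature_at, t_preset, t_max=255):
    """Return the boundary gray value at which the point stops being a corner."""
    bmin, bmax = t_preset, t_max
    while bmax - bmin > 1:
        mid = (bmin + bmax) // 2
        if is_feature_at(mid):
            bmin = mid  # still extracted: the boundary lies higher
        else:
            bmax = mid  # no longer extracted: the boundary lies lower
    return bmin  # boundary gray value, used as the response value
```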
In step S103, the response value average and the distribution variance of the N FAST feature points on the corresponding key frame are calculated.
The response values of the FAST feature points obtained in step S102 are summed and divided by the first preset number N to obtain the response value average.
The calculation method of the distribution variance includes the following steps S1031 to S1032:
step S1031: dividing the corresponding key frame into a plurality of areas, and counting the number of FAST feature points in each area. In some embodiments, the respective key frames may be divided into a plurality of regions using a meshing approach.
Step S1032: and calculating the variance of the quantity of the FAST feature points in all the areas to obtain distribution variance. Specifically, firstly, calculating the sum of the quantity of FAST feature points in all areas, dividing the sum by the quantity of the areas, and obtaining an average value; and calculating the square sum of the quantity of FAST feature points in all the areas, dividing the square sum by the quantity of the areas, and subtracting the square of the average value to finally obtain the variance.
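A sketch of steps S1031 to S1032 with an assumed 4x4 grid (the application only says "a plurality of areas"):

```python
import numpy as np

def distribution_variance(key_points, frame_shape, grid=(4, 4)):
    """Variance of FAST feature point counts over a uniform grid of regions."""
    h, w = frame_shape[:2]
    rows, cols = grid
    counts = np.zeros((rows, cols), dtype=np.float64)
    for kp in key_points:
        x, y = kp.pt
        r = min(int(y * rows / h), rows - 1)
        c = min(int(x * cols / w), cols - 1)
        counts[r, c] += 1
    # np.var computes the mean of squares minus the square of the mean,
    # matching the formula described in step S1032.
    return counts.var()
```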
In step S104, a preset first response value average value threshold and a first distribution variance threshold in a natural scene are obtained, a second response value average value threshold and a second distribution variance threshold in a screen scene are obtained, and the response value average value and the distribution variance of N FAST feature points on the corresponding key frame are compared with the thresholds in the two scenes to judge the scene type of the key frame. Specifically, the classification rule is:
if the response value average value is smaller than the first response value average value threshold value, and meanwhile, the distribution variance is smaller than the first distribution variance threshold value, classifying the corresponding key frame into a natural scene; and if the response value average value is larger than the second response value average value threshold value, and meanwhile, the distribution variance is larger than the second distribution variance threshold value, classifying the corresponding key frame as a screen scene.
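As a compact restatement of this rule (the names are ours: E and D are the response value average and distribution variance, Tne/Tnd and Tse/Tsd the natural- and screen-scene thresholds):

```python
def classify_by_thresholds(e, d, tne, tnd, tse, tsd):
    """Apply the step S104 rule; None means fall through to step S105."""
    if e < tne and d < tnd:
        return "natural"
    if e > tse and d > tsd:
        return "screen"
    return None  # undecided: use the image transformation frequency
```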
In step S105, when the response value average and distribution variance of the FAST feature points on the corresponding key frame do not satisfy the classification rule of step S104, i.e. the scene type cannot be determined by threshold comparison, or when the first preset number N of FAST feature points could not be extracted in step S102, the image transformation frequency is used as a supplement. Failure to determine the scene type by threshold comparison means that either the response value average is smaller than the first response value average threshold while the distribution variance is larger than the first distribution variance threshold, or the response value average is larger than the first response value average threshold while the distribution variance is smaller than the first distribution variance threshold.
Specifically, a second preset number of frame images following the key frame whose scene type could not be judged, or from which enough FAST feature points could not be extracted, are obtained; the second preset number is denoted M.
In some embodiments, the method of determining whether an image is transformed includes the following steps S1051-S1053:
step S1051: and acquiring two adjacent frame images in the M frame images.
Step S1052: dividing two adjacent frames of images into a plurality of areas, and counting the gray level histogram of each area.
Step S1053: compare the regional gray histograms of two adjacent frames and record the number of transformed regions; if more than half of the regions are transformed, mark the current frame image as transformed.
Whether each of the M frame images is transformed is judged, yielding the image transformation frequency. If the image transformation frequency is greater than a preset frequency threshold, the corresponding key frame is classified as a natural scene; if it is smaller than the frequency threshold, the corresponding key frame is classified as a screen scene. For example, with M = 10, the 10 frame images following the corresponding key frame are examined; if 7 of them are marked as transformed, the image transformation frequency is 7. The frequency threshold is typically set to half the second preset number, i.e. 5; since the image transformation frequency 7 is greater than the frequency threshold 5, the corresponding key frame is classified as a natural scene.
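Steps S1051 to S1053 and the frequency count might be sketched as follows, again assuming a 4x4 grid and a correlation-based region comparison; the 0.9 cut-off is an illustrative choice, not specified by the application:

```python
import cv2

def transformation_frequency(frames, grid=(4, 4), region_sim=0.9):
    """Count how many of the given frames changed versus their predecessor."""
    rows, cols = grid
    changed_frames, prev_hists = 0, None
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        h, w = gray.shape
        hists = []
        for r in range(rows):
            for c in range(cols):
                region = gray[r * h // rows:(r + 1) * h // rows,
                              c * w // cols:(c + 1) * w // cols]
                hist = cv2.calcHist([region], [0], None, [256], [0, 256])
                hists.append(cv2.normalize(hist, hist).flatten())
        if prev_hists is not None:
            changed = sum(cv2.compareHist(a, b, cv2.HISTCMP_CORREL) < region_sim
                          for a, b in zip(prev_hists, hists))
            if changed > (rows * cols) / 2:  # more than half the regions changed
                changed_frames += 1
        prev_hists = hists
    return changed_frames
```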
In some embodiments, as shown in fig. 4, after the scene classification is implemented based on the image transformation frequency in step S105, the first response value average threshold and the first distribution variance threshold in the natural scene or the second response value average threshold and the second distribution variance threshold in the screen scene may be adjusted according to the scene type.
Specifically, when the corresponding key frame is judged to be a natural scene, if the response value average value is greater than the first response value average value threshold value, the first response value average value threshold value is increased, and the calculation formula is shown in formula (1):
wherein Tne represents a first response value average threshold; e represents the response value average.
If the response value average value is smaller than the first response value average value threshold, and the distribution variance is larger than the first distribution variance threshold, the first distribution variance threshold is raised, and the calculation formula is shown in formula (2):
wherein Tnd represents a first distribution variance threshold; d represents the distribution variance.
When the corresponding key frame is judged to be a screen scene, if the response value average value is smaller than the second response value average value threshold, the second response value average value threshold is reduced, and the calculation formula is shown in formula (3):
wherein Tse represents a second response value average threshold; e represents the response value average value;
If the response value average value is greater than the second response value average value threshold, and the distribution variance is smaller than the second distribution variance threshold, the second distribution variance threshold is reduced, and the calculation formula is shown in formula (4):
where Tsd represents a second distribution variance threshold; d represents the distribution variance.
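Formulas (1) to (4) are not reproduced in this text. One update rule consistent with the behavior described above, moving each threshold halfway toward the observed statistic, would be the following; this is our assumption, not the application's stated formula:

```latex
T_{ne} \leftarrow \tfrac{1}{2}(T_{ne} + E), \quad
T_{nd} \leftarrow \tfrac{1}{2}(T_{nd} + D), \quad
T_{se} \leftarrow \tfrac{1}{2}(T_{se} + E), \quad
T_{sd} \leftarrow \tfrac{1}{2}(T_{sd} + D)
```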
The thresholds in the two scenes are dynamically adjusted based on the above steps and formulas, improving the classification precision.
As shown in Table 1, over a large amount of experimental data, the scene classification method based on feature points and image transformation frequency provided by the application achieves a classification accuracy above 90%.
TABLE 1
The application also provides a scene classification device based on the characteristic points and the image transformation frequency, which comprises a processor and a memory, and is characterized in that the memory is stored with computer instructions, the processor is used for executing the computer instructions stored in the memory, and the device realizes the steps of the scene classification method based on the characteristic points and the image transformation frequency when the computer instructions are executed by the processor.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a scene classification method based on feature points and image transformation frequencies.
Accordingly, the present application also provides an apparatus comprising a computer apparatus including a processor and a memory, the memory having stored therein computer instructions for executing the computer instructions stored in the memory, the apparatus implementing the steps of the method as described above when the computer instructions are executed by the processor.
The embodiments of the present application also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the scene classification method based on feature points and image transformation frequency described above. The computer-readable storage medium may be a tangible storage medium such as Random Access Memory (RAM), memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a floppy disk, a hard disk, a removable memory disk, a CD-ROM, or any other form of storage medium known in the art.
In summary, the application provides a scene classification method and device based on feature points and image transformation frequency, comprising: acquiring a video to be classified, converting the video into a sequence of consecutive frame images, and extracting key frames from the sequence based on a histogram variation method; extracting a first preset number of FAST feature points from each key frame based on a preset gray value threshold; calculating the response value average and the distribution variance of the first preset number of FAST feature points; comparing the response value average and the distribution variance, according to a preset classification rule, against preset response value average thresholds and distribution variance thresholds for natural scenes and screen scenes, so as to classify the key frames; and, when classification cannot be decided by threshold comparison or the number of FAST feature points is insufficient, using the image transformation frequency as a supplement: a second preset number of frame images following the corresponding key frame are acquired, and the key frame is classified by comparing the image transformation frequency with a preset frequency threshold. The scene classification method provided by the application requires no trained classification model, avoids a cumbersome training process, and achieves high classification speed and high precision.
Furthermore, when scene classification cannot be decided by threshold comparison and must instead be obtained from the image transformation frequency, the response value average threshold and distribution variance threshold of the corresponding scene can be dynamically adjusted according to the finally determined scene type, so as to improve classification precision.
Those of ordinary skill in the art will appreciate that the various illustrative components, systems, and methods described in connection with the embodiments disclosed herein can be implemented as hardware, software, or a combination of both. Whether a particular implementation is hardware or software depends on the specific application and the design constraints of the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave.
It should be understood that the application is not limited to the particular arrangements and instrumentality described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and shown, and those skilled in the art can make various changes, modifications and additions, or change the order between steps, after appreciating the spirit of the present application.
In this disclosure, features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments and/or in combination with or instead of the features of the other embodiments.
The above description is only of the preferred embodiments of the present application and is not intended to limit the present application, and various modifications and variations can be made to the embodiments of the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A scene classification method based on feature points and image transformation frequency, the method comprising the steps of:
acquiring videos to be classified, converting the videos into a continuous frame image sequence, and extracting key frames from the continuous frame image sequence based on a histogram variation method;
extracting a first preset number of FAST feature points from each key frame based on a preset gray value threshold;
calculating the response value average value and the distribution variance of the first preset number of FAST feature points;
acquiring a first response value average value threshold value and a first distribution variance threshold value in a preset natural scene, and acquiring a second response value average value threshold value and a second distribution variance threshold value in a screen scene; classifying the corresponding keyframes as natural scenes if the response value average value is smaller than the first response value average value threshold and the distribution variance is smaller than the first distribution variance threshold, and classifying the corresponding keyframes as screen scenes if the response value average value is larger than the second response value average value threshold and the distribution variance is larger than the second distribution variance threshold;
when the response value average value and the distribution variance of the FAST feature points do not meet the classification rule, or the number of the FAST feature points is smaller than the first preset number, acquiring a second preset number of frame images after corresponding key frames, and calculating the image transformation frequency by judging whether two adjacent frame images in the second preset number of frame images are transformed, if the image transformation frequency is larger than a preset frequency threshold, classifying the corresponding key frames as natural scenes, and if the image transformation frequency is smaller than the frequency threshold, classifying the corresponding key frames as screen scenes.
2. The scene classification method based on feature points and image transformation frequency according to claim 1, wherein key frames are extracted from the sequence of consecutive frame images based on a histogram variation method, further comprising:
converting the continuous frame image sequence into gray level images and calculating a gray level histogram;
and calculating the gray histogram difference between adjacent frames, and taking the current frame as the key frame if the difference is larger than a preset difference threshold.
3. The scene classification method based on feature points and image transformation frequency according to claim 1, wherein extracting a first preset number of FAST feature points from each key frame based on a preset gray value threshold, further comprises:
in one cycle, based on a preset gray value threshold, extracting FAST feature points from each key frame; sorting the extracted FAST feature points according to the response values; modifying the preset gray value threshold value to be the gray value of the FAST feature point with the lowest response value;
and circulating the steps until the FAST feature points with the first preset number of high response values are obtained.
4. A scene classification method based on feature points and image transformation frequency according to claim 3, characterized in that extracted FAST feature points are ordered according to response values, wherein the response value calculation method of FAST feature points comprises the steps of:
acquiring a preset gray value threshold value for extracting the FAST feature points, and calculating an intermediate value between the preset gray value threshold value and a maximum gray value to serve as an initial intermediate value;
in one cycle, if the initial intermediate value is smaller than the gray value of the FAST feature point, calculating the intermediate value of the initial intermediate value and the maximum gray value, and taking the intermediate value as a new initial intermediate value; if the initial intermediate value is larger than the gray value of the FAST feature point, calculating an intermediate value of the preset gray value threshold and the initial intermediate value, and taking the intermediate value as a new initial intermediate value;
and circulating the steps until a boundary gray value is obtained, and taking the boundary gray value as a response value of the FAST feature point.
5. The scene classification method based on feature points and image transformation frequency according to claim 1, wherein the calculation method of the distribution variance of the first preset number of FAST feature points comprises the steps of:
dividing the corresponding key frame into a plurality of areas, and counting the number of FAST feature points in each area;
and calculating the variance of the quantity of the FAST feature points in all the areas to obtain the distribution variance.
6. The scene classification method based on feature points and image transformation frequency according to claim 1, wherein after scene classification is performed based on the image transformation frequency, a first response value average threshold and a first distribution variance threshold in a natural scene or a second response value average threshold and a second distribution variance threshold in a screen scene are adjusted according to scene types.
7. The scene classification method based on feature points and image transformation frequency according to claim 6, further comprising:
when the corresponding key frame is judged to be a natural scene, if the response value average value is larger than the first response value average value threshold value, the first response value average value threshold value is increased, and the calculation formula is as follows:
wherein Tne represents the first response value average threshold; e represents the response value average value;
if the response value average value is smaller than the first response value average value threshold, and the distribution variance is larger than the first distribution variance threshold, the first distribution variance threshold is raised, and the calculation formula is as follows:
wherein Tnd represents the first distribution variance threshold; d represents the distribution variance;
when the corresponding key frame is judged to be a screen scene, if the response value average value is smaller than the second response value average value threshold, the second response value average value threshold is reduced, and the calculation formula is as follows:
wherein Tse represents the second response value average threshold; e represents the response value average value;
if the response value average value is greater than the second response value average value threshold, and the distribution variance is smaller than the second distribution variance threshold, the second distribution variance threshold is reduced, and the calculation formula is as follows:
wherein Tsd represents the second distribution variance threshold; d represents the distribution variance.
8. The scene classification method based on feature points and image transformation frequency according to claim 1, wherein the image transformation frequency is calculated by judging whether or not two adjacent frame images among the second preset number of frame images are transformed, further comprising:
acquiring two adjacent frame images in the second preset number of frame images;
dividing two adjacent frames of images into a plurality of areas uniformly, and counting a gray level histogram of each area;
comparing the regional gray histograms of two adjacent frames, and recording the number of the transformed regions; if more than half of the areas are transformed, marking the current frame image to be transformed;
wherein the image transformation frequency represents the number of frame images, among the second preset number of frame images, in which a transformation occurs.
9. A scene classification device based on feature points and image transformation frequency, comprising a processor and a memory, characterized in that the memory has stored therein computer instructions for executing the computer instructions stored in the memory, which device, when executed by the processor, implements the steps of the method according to any of claims 1 to 8.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
CN202310714287.3A 2023-06-15 2023-06-15 Scene classification method and device based on feature points and image transformation frequency Pending CN116824445A (en)

Publications (1)

CN116824445A, published 2023-09-29


Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination