WO2021087773A1 - Recognition method and apparatus, electronic device, and storage medium - Google Patents

Publication number
WO2021087773A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
recognized
intermediate feature
image
identifier
Prior art date
Application number
PCT/CN2019/115800
Other languages
French (fr)
Chinese (zh)
Inventor
郭子亮
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市欢太科技有限公司, Oppo广东移动通信有限公司
Priority to CN201980099689.9A (CN114341946A)
Priority to PCT/CN2019/115800 (WO2021087773A1)
Publication of WO2021087773A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Definitions

  • The embodiments of the present application relate to computer technology, and in particular to an identification method, an identification device, an electronic device, and a storage medium.
  • When recognizing a video, an electronic device usually extracts multiple frames of images from the video and uses these frames to characterize the video as the recognition subject, thereby recognizing stationary objects in the video.
  • This application provides an identification method, device, electronic equipment, and storage medium, which can improve the accuracy of identification of static objects in a video.
  • an identification method including:
  • An object formed by pixels whose pixel value is not equal to zero in the intermediate feature image is determined as a static object in the video to be recognized.
  • an identification device including:
  • the first acquisition module is used to extract multiple frames of original images from the video to be recognized, acquire the edge gradient image corresponding to each frame of the original image, and obtain multiple frames of edge gradient images;
  • the first determining module is configured to obtain the pixel value of each pixel in the edge gradient image of each frame, and determine the median of the pixel value of the pixel at the same position in the multiple frames of edge image;
  • the generating module is used to generate an intermediate feature image according to each median and the pixel position corresponding to each median;
  • the second determining module is configured to determine an object formed by pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
  • an embodiment of the present application also provides an electronic device, including: a processor, a memory, and a computer program stored in the memory and running on the processor, and the processor realizes recognition when the computer program is executed.
  • An object formed by pixels whose pixel value is not equal to zero in the intermediate feature image is determined as a static object in the video to be recognized.
  • an embodiment of the present application also provides a storage medium containing executable instructions of an electronic device.
  • When executed by the processor of the electronic device, the executable instructions are used to perform the identification method described in the embodiments of the present application.
  • FIG. 1 is a schematic diagram of the first flow of an identification method provided by an embodiment of the present application.
  • FIG. 2 is an original image a and an edge gradient map A corresponding to the original image a provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a scene of an identification method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the second flow of the identification method provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the third process of the identification method provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of region division of an intermediate feature image provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an identification device provided by an embodiment of the present application.
  • FIG. 8 is a first schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a second structure of an electronic device provided by an embodiment of the present application.
  • the embodiment of the present application provides an identification method, and the identification method is applied to an electronic device.
  • the execution subject of the identification method may be the identification device provided in the embodiment of the present application, or an electronic device integrated with the identification device.
  • The identification device may be implemented in hardware or software, and the electronic device may be any device that is equipped with a processor and has processing capability, such as a smartphone, a tablet computer, a handheld computer, a notebook computer, or a desktop computer.
  • FIG. 1 is a schematic diagram of a first flow of an identification method provided by an embodiment of this application.
  • the identification method is applied to the electronic device provided in the embodiment of the present application.
  • the flow of the identification method provided in the embodiment of the present application may be as follows:
  • Extract multiple frames of original images from the video to be recognized, and obtain an edge gradient image corresponding to each frame of the original images, thereby obtaining multiple frames of edge gradient images.
  • the electronic device obtains a video to be recognized, extracts multiple frames of original images from the video to be recognized, obtains an edge gradient image corresponding to each frame of the original image, and obtains multiple frames of edge gradient images.
  • the edge gradient image corresponding to each frame of the original image is an image obtained after edge extraction is performed on the frame of the original image.
  • An edge is a location where a region attribute changes abruptly; it is the boundary between one image region and another region with a different attribute.
  • Edges include step-shaped edges and roof-shaped edges. The pixel values of the pixels on both sides of the step-shaped edge are obviously different.
  • the roof-like edge is at the turning point where the pixel value changes from small to large to small.
  • FIG. 2 is an original image a and an edge gradient map A corresponding to the original image a provided by an embodiment of the application.
  • the edge gradient image A corresponding to the original image a is obtained after edge extraction of the original image a.
  • In the edge gradient image A, only the pixels that constitute an edge have non-zero pixel values; all other pixels have a pixel value of 0.
  • the method of obtaining the edge gradient image corresponding to each frame of the original image is not specifically limited in the embodiment of the present application.
  • the edge gradient image corresponding to each frame of the original image is obtained through the Laplacian edge detection operator.
  • the edge gradient image corresponding to each frame of the original image is obtained through the Roberts edge detection operator.
  • the edge gradient image corresponding to each frame of the original image is obtained through the Sobel edge detection operator.
  • the edge gradient image corresponding to each frame of the original image is obtained through the Kirsch edge detection operator.
  • The edge gradient image corresponding to an original image is obtained through an edge detection operator, and different edge detection operators yield different edge gradient images.
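The edge extraction step can be illustrated with a minimal sketch of the Sobel operator, one of the operators named above. The text does not specify how the operator is applied, so the function name `sobel_edge_gradient`, the use of |gx| + |gy| as the gradient magnitude, the zeroed border, and the `threshold` parameter are all assumptions made for illustration.

```python
def sobel_edge_gradient(gray, threshold=0):
    """Return an edge gradient image for a 2D list of grayscale values.

    Assumptions (not from the source): border pixels are left at 0, the
    magnitude is approximated as |gx| + |gy|, and `threshold` is a
    hypothetical parameter that zeroes weak responses.
    """
    gx_kernel = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
    gy_kernel = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = gray[y + dy - 1][x + dx - 1]
                    gx += gx_kernel[dy][dx] * value
                    gy += gy_kernel[dy][dx] * value
            magnitude = abs(gx) + abs(gy)
            # Non-edge pixels stay at 0, as in the edge gradient images described above.
            out[y][x] = magnitude if magnitude > threshold else 0
    return out


# A vertical step edge: left half dark, right half bright.
step = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_edge_gradient(step)
```

A uniform image produces an all-zero result, while the step image above produces non-zero values only along the column where the brightness changes.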
  • After acquiring the edge gradient image corresponding to each frame of the original images, and thereby obtaining multiple frames of edge gradient images, the electronic device acquires the pixel value of each pixel in each frame of edge gradient image, and determines the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images.
  • FIG. 3 is a schematic diagram of a scene of an identification method provided by an embodiment of this application.
  • Assuming the multiple frames of edge gradient images are 3 frames of edge gradient images, they are denoted as edge gradient image B1, edge gradient image B2, and edge gradient image B3.
  • The acquisition of the median is described below using the center position of the 3 frames of edge gradient images as an example.
  • The electronic device obtains the pixel value of each pixel in edge gradient image B1, from which the pixel value of the pixel at the center of edge gradient image B1 is known to be P1.
  • Similarly, the pixel value of the pixel at the center of edge gradient image B2 is known to be P2, and the pixel value of the pixel at the center of edge gradient image B3 is known to be P3.
  • In an edge gradient image, only the pixels that constitute an edge have non-zero pixel values; all other pixels have a pixel value of 0.
  • The median of the pixel values P1, P2, and P3 is then obtained. In this example, the median of the pixel values of the pixels at the center position in the 3 frames of edge gradient images is P1 or P3, that is, the median is 0.
  • the sizes of the 3 frames of edge gradient images are the same. That is, in this embodiment of the present application, multiple frames of edge gradient images obtained from one video have the same size.
  • The electronic device may generate a frame of intermediate feature image according to each median and the pixel position corresponding to each median. The size of the intermediate feature image is the same as the size of each frame of edge gradient image.
  • After obtaining, in the above manner, the median of the pixel values of the pixels at each same position in the 3 frames of edge gradient images, the electronic device generates an intermediate feature image based on each median and the pixel position corresponding to each median.
  • the pixel value of the pixel at the center position of the intermediate feature image is the median P1 or P3.
  • The intermediate feature image generated according to each median and the pixel position corresponding to each median eliminates objects whose positions change across the multiple frames of edge gradient images, and retains objects whose positions do not change.
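The per-position median can be sketched as follows. This is a minimal illustration, assuming each edge gradient image is a 2D list of numbers of identical size; the function name `intermediate_feature_image` and the sample frames are hypothetical.

```python
from statistics import median


def intermediate_feature_image(edge_frames):
    """Build one image whose pixels are the per-position medians of the frames.

    `edge_frames` is a list of equally sized 2D lists (edge gradient images).
    With an even number of frames, statistics.median averages the two middle
    values, so the result may contain floats.
    """
    height, width = len(edge_frames[0]), len(edge_frames[0][0])
    for frame in edge_frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all edge gradient images must have the same size")
    return [
        [median(frame[y][x] for frame in edge_frames) for x in range(width)]
        for y in range(height)
    ]


# Three hypothetical edge gradient images: the top-left edge pixel is present
# in every frame, while the other edge pixels appear in one frame only.
b1 = [[9, 0, 0], [0, 0, 0], [0, 0, 0]]
b2 = [[9, 0, 0], [0, 7, 0], [0, 0, 0]]
b3 = [[9, 0, 0], [0, 0, 0], [0, 0, 5]]
feature = intermediate_feature_image([b1, b2, b3])
```

In this sample the pixel that is an edge in every frame keeps its value, while pixels that are edges in only one of the three frames have a median of 0 and disappear, which mirrors the center-position example with P1, P2, and P3.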
  • the electronic device may determine the object formed by pixels whose pixel value is not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
  • When recognizing a stationary object in the video to be recognized, the stationary object is not determined directly from object changes across the multiple frames of images captured from the video, but from object changes across the multiple frames of edge gradient images. Because an edge gradient image retains only the edges of each object, there are fewer interference factors, which helps improve the recognition accuracy of stationary objects in the video to be recognized.
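The text does not say how non-zero pixels are grouped into an "object". One plausible reading, shown here purely as an assumption, is to treat each 4-connected group of non-zero pixels in the intermediate feature image as one stationary object. The function name `stationary_objects` is hypothetical.

```python
from collections import deque


def stationary_objects(feature_image):
    """Group non-zero pixels into 4-connected components.

    Returns a list of objects, each a sorted list of (row, column) positions.
    The 4-connectivity rule is an illustrative assumption, not from the source.
    """
    height, width = len(feature_image), len(feature_image[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for y in range(height):
        for x in range(width):
            if feature_image[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first search over neighbouring non-zero pixels.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and feature_image[ny][nx] != 0):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append(sorted(pixels))
    return objects


found = stationary_objects([[5, 5, 0], [0, 0, 0], [0, 0, 3]])
```

Here `found` holds two objects: the two adjacent pixels in the top row and the single pixel in the bottom-right corner.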
  • FIG. 4 is a schematic diagram of a second process of the identification method provided by an embodiment of this application.
  • Extract multiple frames of original images from a video to be recognized, and obtain an edge gradient image corresponding to each frame of the original images, thereby obtaining multiple frames of edge gradient images.
  • the electronic device obtains a video to be recognized, extracts multiple frames of original images from the video to be recognized, obtains an edge gradient image corresponding to each frame of the original image, and obtains multiple frames of edge gradient images.
  • the edge gradient image corresponding to each frame of the original image is an image obtained after edge extraction is performed on the frame of the original image.
  • the edge is the location where the attribute of the area changes suddenly, and it is the intersection of the image area and another attribute area.
  • Edges include step-shaped edges and roof-shaped edges. The pixel values of the pixels on both sides of the step-shaped edge are obviously different.
  • the roof-like edge is at the turning point where the pixel value changes from small to large to small.
  • Each type of edge gradient image includes multiple frames of edge gradient images.
  • The first type of edge gradient image, obtained in the first way, includes S frames of edge gradient images.
  • A frame of edge gradient image corresponding to each frame of the original images is obtained in the second way, yielding the second type of edge gradient image.
  • The second type of edge gradient image includes S frames of edge gradient images. The first way is different from the second way: for the same original image, the edge gradient image obtained in the first way is different from the edge gradient image obtained in the second way.
  • Taking each type of edge gradient image as the processing object, the pixel value of each pixel in each frame of edge gradient image of that type is obtained, and the median of the pixel values of the pixels at the same position in the multiple frames of edge gradient images is determined; a frame of intermediate feature image is then generated according to each median and the pixel position corresponding to each median.
  • In this way, multiple frames of intermediate feature images are obtained. An object composed of pixels whose pixel value is not equal to zero in each frame of intermediate feature image is determined as a candidate stationary object in the video to be recognized.
  • The number of ways of obtaining the edge gradient image corresponding to each frame of the original images is equal to the number of intermediate feature images.
  • When the recognition rate of an object reaches the preset ratio, the object is determined to be a stationary object in the video to be recognized.
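The text does not define "recognition rate". The sketch below takes one possible reading, stated here as an assumption: the recognition rate of a pixel is the fraction of intermediate feature images (one per edge extraction method) in which that pixel is non-zero. The function name `vote_stationary_mask` and the default `preset_ratio` are hypothetical.

```python
def vote_stationary_mask(feature_images, preset_ratio=0.5):
    """Mark pixels that are non-zero in at least `preset_ratio` of the images.

    `feature_images` holds one intermediate feature image per edge extraction
    method, all of the same size. Returns a 2D list of booleans.
    """
    count = len(feature_images)
    height, width = len(feature_images[0]), len(feature_images[0][0])
    return [
        [
            sum(1 for image in feature_images if image[y][x] != 0) / count >= preset_ratio
            for x in range(width)
        ]
        for y in range(height)
    ]


# Two hypothetical intermediate feature images from two different operators.
from_first_way = [[4, 0]]
from_second_way = [[6, 2]]
strict = vote_stationary_mask([from_first_way, from_second_way], preset_ratio=1.0)
lenient = vote_stationary_mask([from_first_way, from_second_way], preset_ratio=0.5)
```

With a ratio of 1.0 only the pixel confirmed by both methods survives; with 0.5 a pixel confirmed by either method is kept.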
  • When extracting multiple frames of original images from the video to be recognized, the electronic device may extract the multiple frames of original images continuously.
  • For example, the electronic device intercepts a short video with a playing time of 20 minutes to 30 minutes from the video to be recognized, and uses all the original images in the short video as the multiple frames of original images extracted from the video to be recognized.
  • When extracting multiple frames of original images from the video to be recognized, the electronic device may also extract the multiple frames of original images at intervals according to the time axis of the video to be recognized.
  • For example, according to the time axis of the video to be recognized, the electronic device obtains the original image played at a playback time of 1 minute, the original image played at 21 minutes, the original image played at 41 minutes, and the original image played at 61 minutes, and uses them as the multiple frames of original images extracted from the video to be recognized.
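Interval sampling along the time axis can be sketched as a list of playback times. The 1-minute start and 20-minute step reproduce the example above; the function name `sample_times` and its parameters are assumptions for illustration, and decoding the frame at each time is left out.

```python
def sample_times(duration_min, start_min=1, step_min=20):
    """Return the playback times (in minutes) at which frames are extracted."""
    if step_min <= 0:
        raise ValueError("step_min must be positive")
    return list(range(start_min, duration_min + 1, step_min))


times = sample_times(61)
```

For a 61-minute video this yields the times 1, 21, 41, and 61 used in the example.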
  • After acquiring the edge gradient image corresponding to each frame of the original images, and thereby obtaining multiple frames of edge gradient images, the electronic device acquires the pixel value of each pixel in each frame of edge gradient image, and determines the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images.
  • The electronic device may generate a frame of intermediate feature image according to each median and the pixel position corresponding to each median. The size of the intermediate feature image is the same as the size of each frame of edge gradient image.
  • The intermediate feature image generated according to each median and the pixel position corresponding to each median eliminates objects whose positions change across the multiple frames of edge gradient images, and retains objects whose positions do not change.
  • the electronic device may determine the object formed by pixels whose pixel value is not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
  • At this point, the type of each stationary object is not yet known.
  • The electronic device can match each stationary object with multiple preset identifiers, and determine the number of preset identifiers successfully matched with each stationary object. It should be noted that if there is a logo in the video, the logo is generally a kind of stationary object.
  • For example, the electronic device matches the stationary object R1 with the multiple preset identifiers to obtain the similarity between the stationary object R1 and each preset identifier, and uses the preset identifiers whose similarity meets the preset condition as the candidate identifiers of the stationary object R1.
  • The identifier with the highest similarity among the candidate identifiers is used as the preset identifier successfully matched with the stationary object R1.
  • Similarly, the preset identifier successfully matched with the stationary object R2 is determined, and the number of preset identifiers successfully matched with the two stationary objects is then determined. It should be noted that the number of preset identifiers successfully matched with each stationary object can only be 0 or 1.
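The matching step can be sketched as follows, assuming the similarity between a stationary object and each preset identifier has already been computed by some means the text does not specify. The "preset condition" is modeled here as a minimum similarity; that threshold, the function names, and the identifier names are all hypothetical.

```python
def match_preset_identifier(similarities, min_similarity=0.8):
    """Return the best-matching preset identifier, or None if none qualifies.

    `similarities` maps preset identifier name -> similarity score.
    Identifiers meeting the threshold are candidates; the most similar wins,
    so each stationary object matches 0 or 1 preset identifiers.
    """
    candidates = {name: score for name, score in similarities.items()
                  if score >= min_similarity}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def count_matched_identifiers(per_object_similarities, min_similarity=0.8):
    """Count the stationary objects that match some preset identifier."""
    return sum(
        match_preset_identifier(similarities, min_similarity) is not None
        for similarities in per_object_similarities
    )


r1 = {"logo_a": 0.93, "logo_b": 0.85, "logo_c": 0.20}  # hypothetical scores
r2 = {"logo_a": 0.10, "logo_b": 0.30, "logo_c": 0.25}
matched_r1 = match_preset_identifier(r1)
matched_total = count_matched_identifiers([r1, r2])
```

In this sample R1 has two candidates and matches the more similar one, while R2 matches nothing, so the total number of matched preset identifiers is 1.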
  • a plurality of preset identifications are pre-stored in the electronic device, and the plurality of preset identifications can be added or removed by the user. For example, when the electronic device plays a video on the display interface, if a user's preset logo storage instruction is received, the new preset logo is stored in the memory according to the preset logo storage instruction.
  • The user can trigger the preset identifier storage instruction in a preset manner.
  • For example, the user slides three fingers on the display screen while watching a video, which triggers a preset identifier storage instruction.
  • When the electronic device receives the preset identifier storage instruction, it acquires the image displayed at the moment the instruction was triggered, recognizes the identifier in the displayed image, and saves that identifier to the memory as a new preset identifier.
  • For another example, the user performs a circling operation on the display screen while watching a video, which triggers a preset identifier storage instruction.
  • When the electronic device receives the preset identifier storage instruction, it acquires the area delineated by the circling operation, and saves the object in the delineated area to the memory as a new preset identifier.
  • The electronic device may determine the recommendation priority of the video to be recognized according to the number of matched preset identifiers, where the number of preset identifiers is inversely related to the recommendation priority. That is, the greater the number of preset identifiers, the lower the recommendation priority and the less likely the electronic device is to recommend the video to be recognized; the smaller the number of preset identifiers, the higher the recommendation priority and the more likely the electronic device is to recommend the video to be recognized.
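The relationship between the number of matched preset identifiers and the recommendation priority can be sketched as a simple ordering. The text gives no formula, so sorting by ascending count is an assumption, and the function name and video titles are hypothetical.

```python
def order_by_recommendation_priority(matched_counts):
    """Order videos from highest to lowest recommendation priority.

    `matched_counts` maps video title -> number of matched preset identifiers.
    Fewer matched identifiers means a higher priority.
    """
    return sorted(matched_counts, key=lambda title: matched_counts[title])


ranking = order_by_recommendation_priority(
    {"video_a": 2, "video_b": 0, "video_c": 1}
)
```

The video with no matched identifiers comes first and the video with the most comes last.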
  • When it is detected that video recommendation is needed, the electronic device obtains the user's historical browsing records, determines the target type of the videos to be recommended from the historical browsing records, searches the first preset video library for videos corresponding to the target type, uses those videos as candidate videos, and displays the candidate videos on the display interface in descending order of recommendation priority.
  • Alternatively, when it is detected that video recommendation is needed, the electronic device obtains the user's historical browsing records and determines from them the target type of video that most needs to be recommended. It searches the first sub-video library of the second preset video library (which stores the videos with the highest recommendation priority) for videos corresponding to the target type, uses those videos as candidate videos, and displays the candidate videos on the display interface. When the ratio (such as 50%) is reached, it searches the second sub-video library of the second preset video library (which stores the videos with the second highest recommendation priority) for videos corresponding to the target type, uses those videos as candidate videos, displays the candidate videos on the display interface, and so on.
  • FIG. 5 is a schematic diagram of the third process of the identification method provided by an embodiment of the application.
  • Extract multiple frames of original images from a video to be recognized, and obtain an edge gradient image corresponding to each frame of the original images, thereby obtaining multiple frames of edge gradient images.
  • the electronic device obtains a video to be recognized, extracts multiple frames of original images from the video to be recognized, obtains an edge gradient image corresponding to each frame of the original image, and obtains multiple frames of edge gradient images.
  • the edge gradient image corresponding to each frame of the original image is an image obtained after edge extraction is performed on the frame of the original image.
  • the edge is the location where the attribute of the area changes suddenly, and it is the intersection of the image area and another attribute area.
  • Edges include step-shaped edges and roof-shaped edges. The pixel values of the pixels on both sides of the step-shaped edge are obviously different.
  • the roof-like edge is at the turning point where the pixel value changes from small to large to small.
  • After acquiring the edge gradient image corresponding to each frame of the original images, and thereby obtaining multiple frames of edge gradient images, the electronic device acquires the pixel value of each pixel in each frame of edge gradient image, and determines the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images.
  • The electronic device may generate a frame of intermediate feature image according to each median and the pixel position corresponding to each median. The size of the intermediate feature image is the same as the size of each frame of edge gradient image.
  • The intermediate feature image generated according to each median and the pixel position corresponding to each median eliminates objects whose positions change across the multiple frames of edge gradient images, and retains objects whose positions do not change.
  • the electronic device may determine the object formed by pixels whose pixel value is not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
  • Determine whether the intermediate feature image includes the identifier of the video to be recognized.
  • The electronic device may determine, according to a pre-trained recognition model, whether the intermediate feature image includes the identifier of the video to be recognized.
  • In this solution, the video to be recognized may have one or more identifiers.
  • For example, the identifiers of the video to be recognized include two identifiers, "XXTV" and "XX Theater".
  • the electronic device inputs the intermediate feature image into the pre-trained recognition model, and obtains an output result of 0 or 1.
  • When the output result is 0, it is determined that the intermediate feature image does not include the identifier of the video to be recognized; when the output result is 1, it is determined that the intermediate feature image includes the identifier of the video to be recognized.
  • 0 indicates that the intermediate feature image does not include the identification of the video to be recognized, that is, the stationary object Y1, the stationary object Y2, and the stationary object Y3 are objects other than the identification; 1 indicates that the intermediate feature image includes the identification of the video to be recognized.
  • Alternatively, the electronic device inputs the intermediate feature image into the pre-trained recognition model, and the model outputs the object type of each stationary object.
  • For example, the object type of stationary object Y4 is "building", the object type of stationary object Y5 is "identification", and the object type of stationary object Y6 is "person".
  • According to the object type of each stationary object, it is determined whether the intermediate feature image includes the identifier of the video to be recognized.
  • the input of the recognition model in this solution is the intermediate feature image obtained from the original image. Compared with directly inputting the original image in the recognition model, it is beneficial to improve the accuracy of the identification judgment of the recognition model.
  • The electronic device may obtain multiple frames of intermediate feature images from multiple training videos to form a training set, use the training set to train a preset convolutional neural network model, and use the trained convolutional neural network model as the recognition model.
  • the use of intermediate feature images to train the recognition model can improve the recognition accuracy of the model.
  • If the intermediate feature image does not include the identifier of the video to be recognized, determine the highest level among multiple preset levels as the recommendation level of the video to be recognized.
  • The electronic device may determine the highest level among the multiple preset levels as the recommendation level of the video to be recognized. It should be noted that the higher the recommendation level of the video to be recognized, the more likely the electronic device is to recommend it. Compared with a video that does not include a logo, a video that includes a logo may have its playback content obscured by the logo, which degrades the viewing experience. Therefore, in this solution, a video that does not include a logo has the highest recommendation level.
  • If the intermediate feature image includes the identifier of the video to be recognized, determine the area proportion of the identifier in the video to be recognized.
  • When there are multiple identifiers, the electronic device may determine the area proportion of the multiple identifiers in the video to be recognized. The area proportion of each identifier in the video to be recognized is a candidate area proportion, and the electronic device may determine the largest of the multiple candidate area proportions as the area proportion of the multiple identifiers in the video to be recognized. The electronic device can also calculate the area proportion of the multiple identifiers in the video to be recognized.
  • For example, the electronic device may determine the first area proportion of the first identifier in the video to be recognized, and the second area proportion of the second identifier in the video to be recognized. The larger of the first area proportion and the second area proportion is used as the area proportion for the video to be recognized. If the first area proportion is greater than the second area proportion, the area proportion of the multiple identifiers in the video to be recognized is determined to be the first area proportion.
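The selection of the largest area proportion can be sketched as follows. The text does not say how an identifier's area is measured, so representing each identifier by an axis-aligned bounding box `(left, top, right, bottom)` in pixels is an assumption, as are the function names.

```python
def area_proportion(identifier_box, frame_width, frame_height):
    """Fraction of the frame covered by one identifier's bounding box."""
    left, top, right, bottom = identifier_box
    return ((right - left) * (bottom - top)) / (frame_width * frame_height)


def video_area_proportion(identifier_boxes, frame_width, frame_height):
    """Use the largest candidate area proportion for the whole video."""
    return max(
        area_proportion(box, frame_width, frame_height)
        for box in identifier_boxes
    )


first_identifier = (0, 0, 10, 10)    # hypothetical boxes in a 100 x 100 frame
second_identifier = (0, 0, 20, 10)
proportion = video_area_proportion([first_identifier, second_identifier], 100, 100)
```

Here the second identifier covers 2% of the frame and the first covers 1%, so 2% is used for the video.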
  • If the video to be recognized includes only one logo, the electronic device can determine the area proportion of that logo in the video to be recognized.
  • According to the area proportion, the recommendation level of the video to be recognized is determined from the preset levels other than the highest level. The larger the area proportion, the lower the recommendation level and the less likely the electronic device is to recommend the video to be recognized; the smaller the area proportion, the higher the recommendation level and the more likely the electronic device is to recommend the video to be recognized.
  • the electronic device may also determine the position of the identifier in the video to be recognized. According to the location, the recommendation level of the video to be recognized is determined from preset levels other than the highest level.
  • The electronic device may compute the difference V between the number of preset levels and 1, and determine from the difference V that the number of divisions of the intermediate feature image is V.
  • The intermediate feature image is divided into V regions by rectangles centered on the center of the intermediate feature image, and each region corresponds to a preset level. The closer a region is to the edge, the higher the preset level.
  • FIG. 6 is a schematic diagram of region division of an intermediate feature image provided by an embodiment of the application.
  • Assume the preset levels are denoted as D1, D2, D3, D4, D5, and D6, ordered from high to low as D1 > D2 > D3 > D4 > D5 > D6.
  • The electronic device can then determine that the number of divisions of the intermediate feature image is 5.
  • Starting from the center of the intermediate feature image, the image is divided by rectangles into 5 areas, denoted as area Q1, area Q2, area Q3, area Q4, and area Q5.
  • Each area corresponds to a preset level. The closer an area is to the edge, the higher the preset level.
  • the area Q1 corresponds to the preset level D6
  • the area Q2 corresponds to the preset level D5
  • the area Q3 corresponds to the preset level D4
  • the area Q4 corresponds to the preset level D3, and the area Q5 corresponds to the preset level D2.
  • the recommendation level of the video to be recognized is determined from preset levels (from D2, D3, D4, D5, and D6) other than the highest level. If it is assumed that the location of the identifier of the video to be recognized is the area Q3, it is determined that the recommendation level of the video to be recognized is the preset level D4.
  • If the intermediate feature image includes multiple identifiers of the video to be recognized, multiple levels can be obtained according to the positions of the multiple identifiers, and the lowest level among them is used as the recommendation level of the video to be recognized.
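The position-based rule can be sketched as below. Several details are assumptions not stated in the text: the nested rectangles are taken to be equally spaced between the center and the border, an identifier is represented by a single point, and the outermost area Q5 is mapped to D2 so that D1 stays reserved for videos without an identifier (consistent with the statement that the level is chosen from D2 through D6). Levels are returned as integers, where 1 stands for D1.

```python
def region_index(x, y, width, height, divisions):
    """Return k for the area Qk containing point (x, y); Q1 is the center."""
    center_x, center_y = width / 2, height / 2
    # Normalized distance from the center: 0 at the center, 1 at the border.
    distance = max(abs(x - center_x) / center_x, abs(y - center_y) / center_y)
    return min(divisions, int(distance * divisions) + 1)


def recommendation_level(identifier_positions, width, height, level_count=6):
    """Return the recommendation level index (1 = D1, the highest level)."""
    if not identifier_positions:
        return 1  # no identifier: highest preset level
    divisions = level_count - 1  # V = number of preset levels minus 1
    # Qk maps to D(level_count + 1 - k): Q1 -> D6, ..., Q5 -> D2.
    levels = [
        level_count + 1 - region_index(x, y, width, height, divisions)
        for x, y in identifier_positions
    ]
    return max(levels)  # the largest index is the lowest level


single = recommendation_level([(75, 50)], 100, 100)
several = recommendation_level([(75, 50), (50, 50)], 100, 100)
```

In a 100 x 100 image, a point at (75, 50) falls in area Q3 and gives level D4, matching the example above; adding a second identifier at the center (area Q1) lowers the result to D6.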
  • Fig. 7 is a schematic structural diagram of an identification device provided by an embodiment of the present application.
  • the device is used to execute the identification method provided in the above-mentioned embodiment and has functional modules and beneficial effects corresponding to the execution method.
  • the identification device 400 specifically includes: a first acquisition module 401, a first determining module 402, a generating module 403, and a second determining module 404, wherein:
  • the first acquisition module 401 is configured to extract multiple frames of original images from the video to be recognized, acquire the edge gradient image corresponding to each frame of the original image, and obtain multiple frames of edge gradient images;
  • the first determining module 402 is configured to obtain the pixel value of each pixel in each frame of edge gradient image, and determine the median of the pixel values of the pixels at the same position in the multiple frames of edge gradient images;
  • the generating module 403 is used to generate an intermediate feature image according to each median and the position of the pixel corresponding to each median;
  • the second determining module 404 is configured to determine an object formed by pixels whose pixel value is not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
  • when extracting multiple frames of original images from the to-be-recognized video, the first acquisition module 401 is configured to: extract multiple frames of original images from the to-be-recognized video at intervals along the time axis of the to-be-recognized video.
  • the recognition device 400 further includes a matching module and a third determining module;
  • the matching module is configured to match the stationary object with a plurality of preset identifiers, and determine the number of preset identifiers that are successfully matched with the stationary object;
  • the third determining module is configured to determine the recommended priority of the video to be recognized according to the number of successfully matched preset identifiers, where the number of matched preset identifiers is inversely proportional to the recommended priority.
  • for use after an object formed by pixels whose pixel values are not equal to zero in the intermediate feature image has been determined as a stationary object in the video to be recognized, the recognition device 400 further includes a judgment module and a fourth determining module;
  • the judgment module is configured to judge, according to a pre-trained recognition model, whether the intermediate feature image includes the identifier of the video to be recognized;
  • the fourth determining module is configured to determine the highest level among a plurality of preset levels as the recommendation level of the video to be recognized if the intermediate feature image does not include the identifier of the video to be recognized.
  • the recognition apparatus 400 further includes a fifth determining module and a sixth determining module;
  • the fifth determining module is configured to determine the position of the identifier in the video to be recognized if the intermediate feature image includes the identifier of the video to be recognized;
  • the sixth determining module is configured to determine the recommendation level of the to-be-recognized video from preset levels other than the highest level according to the location.
  • the fifth determining module is configured to, if the intermediate feature image includes the identifier of the video to be recognized, determine the area proportion of the identifier in the video to be recognized;
  • the sixth determining module is configured to determine the recommendation level of the to-be-recognized video from preset levels other than the highest level according to the area ratio.
  • for use before determining whether the intermediate feature image includes the identifier of the video to be recognized, the recognition device 400 further includes a second acquisition module and a training module;
  • the second acquisition module is used to acquire multiple frames of intermediate feature images obtained from multiple training videos to form a training set;
  • the training module is used to train a preset convolutional neural network model using the training set, and use the trained convolutional neural network model as a recognition model.
  • the identification device provided in this embodiment of the application belongs to the same concept as the identification method in the above embodiments, and any method provided in the identification method embodiments can be run on the identification device. For the specific implementation process, see the identification method embodiments, which will not be repeated here.
  • the embodiment of the present application provides a computer-readable storage medium on which a computer program is stored.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (Read Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
  • the electronic device 500 includes a processor 501 and a memory 502. Wherein, the processor 501 and the memory 502 are electrically connected.
  • the processor 501 is the control center of the electronic device 500. It uses various interfaces and lines to connect the various parts of the entire electronic device, and performs the various functions of the electronic device 500 and processes data by running or loading the computer programs stored in the memory 502 and calling the data stored in the memory 502.
  • the memory 502 may be used to store software programs and modules.
  • the processor 501 executes various functional applications and data processing by running the computer programs and modules stored in the memory 502.
  • the memory 502 may mainly include a storage program area and a storage data area.
  • the storage program area may store an operating system, a computer program required by at least one function (such as a sound playback function, an image playback function, etc.), and the like; the storage data area may store data created through the use of the electronic device, and the like.
  • the memory 502 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
  • the memory 502 may further include a memory controller to provide the processor 501 with access to the memory 502.
  • the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more computer programs into the memory 502 according to the following steps, and the processor 501 runs the computer programs stored in the memory 502, thereby implementing:
  • An object formed by pixels whose pixel value is not equal to zero in the intermediate feature image is determined as a static object in the video to be recognized.
  • FIG. 9 is a schematic diagram of a second structure of an electronic device provided by an embodiment of the application.
  • the electronic device further includes: a camera component 603, a display component 604, an audio circuit 605, a radio frequency circuit 606, and a power supply 607.
  • the camera component 603, the display component 604, the audio circuit 605, the radio frequency circuit 606, and the power supply 607 are electrically connected to the processor 601, respectively.
  • the camera component 603 may include an image processing circuit, which may be implemented by hardware and/or software components, and may include various processing units that define an image signal processing (Image Signal Processing) pipeline.
  • the image processing circuit may at least include: multiple cameras, an image signal processor (Image Signal Processor, ISP processor), a control logic, an image memory, a display, and the like.
  • Each camera may include at least one or more lenses and image sensors.
  • the image sensor may include a color filter array (such as a Bayer filter). The image sensor can obtain the light intensity and wavelength information captured by each imaging pixel of the image sensor, and provide a set of raw image data that can be processed by the image signal processor.
  • the display component 604 can be used to display information input by the user or information provided to the user, and various graphical user interfaces. These graphical user interfaces can be composed of graphics, text, icons, videos, and any combination thereof.
  • the audio circuit 605 can be used to provide an audio interface between the user and the electronic device through a speaker or a microphone.
  • the radio frequency circuit 606 may be used to transmit and receive radio frequency signals, so as to establish wireless communication with network equipment or other electronic equipment and to exchange signals with the network equipment or other electronic equipment.
  • the power supply 607 can be used to supply power to various components of the electronic device 600.
  • the power supply 607 may be logically connected to the processor 601 through a power management system, so that functions such as charging, discharging, and power consumption management can be managed through the power management system.
  • the processor 601 in the electronic device 600 loads the instructions corresponding to the processes of one or more computer programs into the memory 602 according to the following steps, and the processor 601 runs the computer programs stored in the memory 602, thereby implementing:
  • An object formed by pixels whose pixel value is not equal to zero in the intermediate feature image is determined as a static object in the video to be recognized.
  • the processor 601 may execute:
  • the highest level among a plurality of preset levels is determined as the recommendation level of the video to be recognized.
  • the processor 601 may execute:
  • if the intermediate feature image includes the identifier of the video to be recognized, determine the position of the identifier in the video to be recognized;
  • according to the position, the recommendation level of the video to be recognized is determined from the preset levels other than the highest level.
  • the processor 601 may execute:
  • if the intermediate feature image includes the identifier of the video to be recognized, determine the area proportion of the identifier in the video to be recognized;
  • according to the area proportion, the recommendation level of the to-be-recognized video is determined from the preset levels other than the highest level.
  • the processor 601 may execute:
  • after extracting multiple frames of original images from the video to be recognized, the electronic device obtains the edge gradient image corresponding to each frame of the original image to obtain multiple frames of edge gradient images, then determines the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images, generates an intermediate feature image from the medians, and determines the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image to be a stationary object in the video to be recognized, which can improve the recognition accuracy of the stationary object in the video to be recognized.
  • the embodiments of the present application also provide a storage medium that stores a computer program. When the computer program is run on a computer, the computer is caused to execute the recognition method in any of the above-mentioned embodiments, for example: extract multiple frames of original images from a video to be recognized, obtain the edge gradient image corresponding to each frame of original image, and obtain multiple frames of edge gradient images; obtain the pixel value of each pixel in each frame of edge gradient image, and determine the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images; generate an intermediate feature image according to each median and the position of the pixel corresponding to each median; and determine the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image to be a stationary object in the video to be recognized.
  • the storage medium may be a magnetic disk, an optical disk, a read only memory (Read Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
  • the computer program may be stored in a computer readable storage medium, such as stored in the memory of an electronic device, and executed by at least one processor in the electronic device.
  • the execution process may include the flow of an embodiment of the identification method.
  • the storage medium can be a magnetic disk, an optical disk, a read-only memory, a random access memory, and the like.
  • for the identification device of the embodiment of the present application, its functional modules may be integrated into one processing chip, or each module may exist alone physically, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software function modules. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
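The recommendation-level rules described for FIG. 6 in the definitions above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `recommendation_level` and the table `AREA_TO_LEVEL` are hypothetical, the entry for area Q5 is assumed to continue the Q1 to Q4 pattern, and how an identifier is located in an area is left out.

```python
# Preset levels from highest (D1) to lowest (D6), as in the FIG. 6 example.
LEVELS = ["D1", "D2", "D3", "D4", "D5", "D6"]

# Hypothetical area-to-level table: Q1 is the innermost area, Q5 the outermost.
# Q1..Q4 follow the text; Q5 -> D2 is an assumption that continues the pattern.
AREA_TO_LEVEL = {"Q1": "D6", "Q2": "D5", "Q3": "D4", "Q4": "D3", "Q5": "D2"}


def recommendation_level(identifier_areas):
    """Return the recommendation level of a video.

    identifier_areas holds one area name per identifier found in the
    intermediate feature image; it is empty when no identifier was found.
    """
    if not identifier_areas:
        # No identifier: the highest preset level is used.
        return LEVELS[0]
    levels = [AREA_TO_LEVEL[area] for area in identifier_areas]
    # Several identifiers: keep the lowest level (largest index in LEVELS).
    return max(levels, key=LEVELS.index)
```

With this table, a video whose only identifier lies in area Q3 gets level D4, matching the example in the text, and a video with identifiers in Q3 and Q1 gets the lower of the two levels, D6.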

Abstract

A recognition method and apparatus, an electronic device, and a storage medium. The method comprises: extracting multiple original images from a video to be recognized, and acquiring an edge gradient image of each original image (101); determining median values of pixel values of pixel points located at the same positions in the multiple edge images (102); generating an intermediate feature image according to each median value and the pixel point position of each median value (103); and determining an object, which is composed of pixel points in the intermediate feature image the pixel values of which are not zero, as a stationary object (104).

Description

Identification method, device, electronic device, and storage medium
Technical field
The embodiments of the present application relate to computer technology, and in particular to an identification method, an identification device, an electronic device, and a storage medium.
Background
With the development of science and technology, video resources are becoming increasingly abundant. Each video contains many objects, and how an electronic device can identify objects that share the same type of characteristics in a video has become an important research topic.
At present, when recognizing a video, an electronic device usually extracts multiple frames of images from the video and uses those frames to represent the video as the recognition subject, thereby recognizing stationary objects in the video.
Summary of the invention
This application provides an identification method, an identification device, an electronic device, and a storage medium, which can improve the accuracy with which stationary objects in a video are recognized.
In a first aspect, an embodiment of the present application provides an identification method, including:
extracting multiple frames of original images from a video to be recognized, and obtaining the edge gradient image corresponding to each frame of original image, to obtain multiple frames of edge gradient images;
obtaining the pixel value of each pixel in each frame of edge gradient image, and determining the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images;
generating an intermediate feature image according to each median and the pixel position corresponding to each median;
determining an object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
In a second aspect, an embodiment of the present application also provides an identification device, including:
a first acquisition module, configured to extract multiple frames of original images from a video to be recognized, and obtain the edge gradient image corresponding to each frame of original image, to obtain multiple frames of edge gradient images;
a first determining module, configured to obtain the pixel value of each pixel in each frame of edge gradient image, and determine the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images;
a generating module, configured to generate an intermediate feature image according to each median and the pixel position corresponding to each median;
a second determining module, configured to determine an object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
In a third aspect, an embodiment of the present application also provides an electronic device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the processor implements the following identification method when executing the computer program:
extracting multiple frames of original images from a video to be recognized, and obtaining the edge gradient image corresponding to each frame of original image, to obtain multiple frames of edge gradient images;
obtaining the pixel value of each pixel in each frame of edge gradient image, and determining the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images;
generating an intermediate feature image according to each median and the pixel position corresponding to each median;
determining an object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
In a fourth aspect, an embodiment of the present application also provides a storage medium containing instructions executable by an electronic device, where the instructions, when executed by a processor of the electronic device, are used to perform the identification method described in the embodiments of the present application.
Description of the drawings
Other features, purposes, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, given with reference to the drawings below.
FIG. 1 is a schematic diagram of the first flow of an identification method provided by an embodiment of the present application.
FIG. 2 shows an original image a and the edge gradient image A corresponding to the original image a, provided by an embodiment of the present application.
FIG. 3 is a schematic diagram of a scene of an identification method provided by an embodiment of the present application.
FIG. 4 is a schematic diagram of the second flow of the identification method provided by an embodiment of the present application.
FIG. 5 is a schematic diagram of the third flow of the identification method provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of region division of an intermediate feature image provided by an embodiment of the present application.
FIG. 7 is a schematic structural diagram of an identification device provided by an embodiment of the present application.
FIG. 8 is a first schematic structural diagram of an electronic device provided by an embodiment of the present application.
FIG. 9 is a second schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed description
The application is described in further detail below with reference to the drawings and embodiments. It can be understood that the specific embodiments described here are used to explain the application, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present application rather than the entire structure.
An embodiment of the present application provides an identification method, which is applied to an electronic device. The execution subject of the identification method may be the identification device provided in the embodiments of the present application, or an electronic device integrating the identification device. The identification device may be implemented in hardware or software, and the electronic device may be any device that is equipped with a processor and has processing capability, such as a smartphone, a tablet computer, a handheld computer, a notebook computer, or a desktop computer.
Please refer to FIG. 1, which is a schematic diagram of the first flow of the identification method provided by an embodiment of the present application. The identification method is applied to the electronic device provided in the embodiments of the present application. As shown in FIG. 1, the flow of the identification method may be as follows:
101. Extract multiple frames of original images from the video to be recognized, and obtain the edge gradient image corresponding to each frame of original image, to obtain multiple frames of edge gradient images.
For example, the electronic device obtains the video to be recognized, extracts multiple frames of original images from it, and obtains the edge gradient image corresponding to each frame of original image, thereby obtaining multiple frames of edge gradient images.
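As a rough illustration of how the frames might be chosen, the sketch below spreads the sampled frames evenly along the time axis, in line with the interval-based extraction described for the first acquisition module. The function name `sample_frame_indices` and the choice of taking the middle frame of each segment are assumptions; the source only says that frames are extracted at intervals.

```python
def sample_frame_indices(total_frames, num_samples):
    """Pick num_samples frame indices spread evenly over the time axis."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    # Never ask for more frames than the video contains.
    num_samples = min(num_samples, total_frames)
    step = total_frames / num_samples
    # Take the frame in the middle of each of the num_samples equal segments.
    return [int(step * i + step / 2) for i in range(num_samples)]
```

The returned indices would then be passed to whatever video decoder is in use to read the corresponding original images.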
The edge gradient image corresponding to each frame of original image is the image obtained by performing edge extraction on that frame. An edge is a location where a region attribute changes abruptly, that is, the boundary between one attribute region of the image and another. Edges include step-shaped edges and roof-shaped edges. The pixel values on the two sides of a step-shaped edge differ markedly. A roof-shaped edge lies at the turning point where pixel values rise from small to large and then fall back to small.
For example, please refer to FIG. 2, which shows an original image a and the edge gradient image A corresponding to the original image a, provided by an embodiment of the present application. The edge gradient image A is obtained by performing edge extraction on the original image a. Compared with the original image a, in the edge gradient image A only the pixels that constitute an edge have non-zero pixel values; all other pixels have a pixel value of 0.
The embodiments of the present application do not specifically limit how the edge gradient image corresponding to each frame of original image is obtained. For example, it may be obtained with the Laplacian edge detection operator, the Roberts edge detection operator, the Sobel edge detection operator, or the Kirsch edge detection operator.
It should be noted that, when the edge gradient image corresponding to an original image is obtained with an edge detection operator, different operators yield different edge gradient images.
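As one possible illustration, the sketch below computes an edge gradient image with the Sobel operator, one of the operators named above, on a grayscale image stored as a list of rows. The function name `sobel_edge_gradient`, the L1 approximation of the gradient magnitude, and the `threshold` that forces weak responses to 0 are all assumptions made for the example; the source does not specify these details.

```python
def sobel_edge_gradient(image, threshold=100):
    """Return an edge gradient image for a grayscale image (list of rows).

    Border pixels and pixels whose gradient magnitude is below the
    threshold are set to 0, so only edge pixels keep a non-zero value.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal and vertical Sobel responses on the 3x3 neighborhood.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            magnitude = abs(gx) + abs(gy)  # L1 approximation of the gradient norm
            if magnitude >= threshold:
                result[y][x] = min(255, magnitude)
    return result
```

Swapping in a different kernel (Roberts, Kirsch, Laplacian) would change which pixels end up non-zero, which is the point made in the note above.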
102. Obtain the pixel value of each pixel in each frame of edge gradient image, and determine the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images.
For example, after obtaining the edge gradient image corresponding to each frame of original image and thus the multiple frames of edge gradient images, the electronic device obtains the pixel value of each pixel in each frame of edge gradient image, and determines the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images.
For example, please refer to FIG. 3, which is a schematic diagram of a scene of the identification method provided by an embodiment of the present application. Assume that the multiple frames of edge gradient images are 3 frames, denoted as edge gradient image B1, edge gradient image B2, and edge gradient image B3. The pixel value of each pixel in B1, B2, and B3 is obtained.
The way the median is obtained is explained below using the center position of the 3 frames of edge gradient images. After the electronic device obtains the pixel value of each pixel in edge gradient image B1, the pixel value of the pixel at the center of B1 is known to be P1; likewise, the pixel value of the pixel at the center of B2 is P2, and the pixel value of the pixel at the center of B3 is P3. Following the principle that, in an edge gradient image, only the pixels constituting an edge have non-zero pixel values and all other pixels have a pixel value of 0: P1 = 0 (the pixel at the center of B1 is not an edge pixel), P2 ≠ 0 (the pixel at the center of B2 is an edge pixel), and P3 = 0 (the pixel at the center of B3 is not an edge pixel). Taking the median of P1, P2, and P3 shows that the median of the pixel values of the pixels at the center position of the 3 frames of edge gradient images is P1 or P3, that is, 0.
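The center-pixel example can be checked with a few lines of Python. The concrete numbers are hypothetical: 180 stands in for the non-zero value P2, and the second triple illustrates a pixel that lies on an edge in all three frames.

```python
import statistics

# Center pixel of B1, B2, B3: only B2 has an edge there (P2 != 0).
p1, p2, p3 = 0, 180, 0
center_median = statistics.median([p1, p2, p3])

# A pixel that is an edge pixel in every frame keeps a non-zero median.
static_median = statistics.median([200, 190, 210])
```

Here `center_median` is 0, so the moving edge is dropped, while `static_median` is 200, so the edge that stays in place survives.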
It can be understood that, because the 3 frames of edge gradient images are obtained from the same video, they have the same size. That is, in the embodiments of the present application, the multiple frames of edge gradient images obtained from one video have the same size.
103. Generate an intermediate feature image according to each median and the pixel position corresponding to each median.
For example, after determining the medians of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images, the electronic device may generate one frame of intermediate feature image according to each median and the pixel position corresponding to each median. The size of the intermediate feature image is the same as the size of each frame of edge gradient image.
For example, continuing the example above in which the median of the pixel values at the center position is P1 or P3: after obtaining, in the manner described above, the medians of the pixel values of the pixels located at the same position in the 3 frames of edge gradient images, the electronic device generates the intermediate feature image from the medians and their corresponding pixel positions. For instance, the pixel at the center of the intermediate feature image has the median P1 or P3 as its pixel value.
It should be noted that the intermediate feature image generated from each median and its corresponding pixel position removes the objects whose positions change across the multiple frames of edge gradient images, and keeps the objects whose positions do not change.
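A minimal sketch of this step, assuming each edge gradient image is a list of rows of equal size, is shown below. The function name `intermediate_feature_image` is hypothetical.

```python
import statistics


def intermediate_feature_image(edge_images):
    """Per-pixel median over same-sized edge gradient images (lists of rows)."""
    height, width = len(edge_images[0]), len(edge_images[0][0])
    return [
        # For each position, take the median of that pixel across all frames.
        [statistics.median(img[y][x] for img in edge_images) for x in range(width)]
        for y in range(height)
    ]
```

With an odd number of frames the result contains only values that occur in the inputs; with an even number, `statistics.median` averages the two middle values, so a pixel that is an edge in exactly half of the frames would get a non-zero value. Using an odd frame count avoids that ambiguity in this sketch.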
104. Determine an object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
For example, after generating the intermediate feature image, the electronic device may determine the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized. There may be one or more stationary objects.
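The source does not say how non-zero pixels are grouped into separate objects. One plausible sketch, assuming that pixels belong to the same object when they are 4-connected, is a simple connected-component search; the function name `stationary_objects` is hypothetical.

```python
from collections import deque


def stationary_objects(feature_image):
    """Group non-zero pixels into objects using 4-connectivity (an assumption).

    Returns a list of objects, each a list of (row, column) pixel positions.
    """
    height, width = len(feature_image), len(feature_image[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for y in range(height):
        for x in range(width):
            if feature_image[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first search collects one connected group of pixels.
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and feature_image[ny][nx] != 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append(pixels)
    return objects
```

Each returned pixel group corresponds to one candidate stationary object, such as a logo or watermark outline, and could be passed on to the identifier matching described elsewhere in the application.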
It can be seen from the above that, in the embodiments of the present application, a stationary object in the video to be recognized is not determined directly from how objects change across the multiple frames captured from the video, but from how objects change across the multiple frames of edge gradient images. Because an edge gradient image retains only the edges of each object, there are fewer interfering factors, which helps improve the recognition accuracy of stationary objects in the video to be recognized.
请参阅图4,图4为本申请实施例提供的识别方法的第二流程示意图。Please refer to FIG. 4, which is a schematic diagram of a second process of the identification method provided by an embodiment of this application.
201、从待识别视频中提取多帧原始图像,获取每帧原始图像对应的边缘梯度图像,得到多帧边缘梯度图像。201. Extract multiple frames of original images from a video to be recognized, obtain an edge gradient image corresponding to each frame of the original image, and obtain multiple frames of edge gradient images.
比如,电子设备获取待识别视频,从待识别视频中提取多帧原始图像,获取每帧原始图像对应的边缘梯度图像,得到多帧边缘梯度图像。其中,每帧原始图像对应的边缘梯度图像是对该帧原始图像进行边缘提取后得到的图像。边缘是区域属性发生突变的位置,是图像性区域和另一个属性区域的交接处。边缘包括阶跃状边缘和屋顶状边缘。阶跃状边缘两侧像素的像素值明显不同。屋顶状边缘处于像素值由小到大再到小的变化转折点处。For example, the electronic device obtains a video to be recognized, extracts multiple frames of original images from the video to be recognized, obtains an edge gradient image corresponding to each frame of the original image, and obtains multiple frames of edge gradient images. Wherein, the edge gradient image corresponding to each frame of the original image is an image obtained after edge extraction is performed on the frame of the original image. The edge is the location where the attribute of the area changes suddenly, and it is the intersection of the image area and another attribute area. Edges include step-shaped edges and roof-shaped edges. The pixel values of the pixels on both sides of the step-shaped edge are obviously different. The roof-like edge is at the turning point where the pixel value changes from small to large to small.
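As a rough illustration of edge extraction, the sketch below computes a gradient magnitude by central differences on a grayscale image stored as nested lists. The operator is an assumption chosen for brevity; this passage does not name a specific edge operator, and any standard one (Sobel, Laplacian, and so on) could be substituted.

```python
import math

def edge_gradient(image):
    """Gradient magnitude by central differences; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2  # horizontal change
            gy = (image[y + 1][x] - image[y - 1][x]) / 2  # vertical change
            out[y][x] = math.hypot(gx, gy)
    return out

# A step edge between columns 1 and 2: dark on the left, bright on the right.
image = [[0, 0, 10, 10]] * 4
edges = edge_gradient(image)
print(edges[1])
```

Flat regions produce zeros and the step produces a non-zero response, which is the property the later median step relies on.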
As another example, multiple frames of original images are extracted from the video to be recognized, and the edge gradient image corresponding to each frame of original image is obtained in several different ways, yielding several types of edge gradient images. Each type of edge gradient image includes multiple frames of edge gradient images. For the same frame of original image, the edge gradient image obtained in the first way differs from the edge gradient image obtained in the second way.
For example, S frames of original images are extracted from the video to be recognized. One edge gradient image is obtained for each frame of original image in the first way, yielding the first type of edge gradient image, which includes S frames of edge gradient images. One edge gradient image is obtained for each frame of original image in the second way, yielding the second type of edge gradient image, which also includes S frames of edge gradient images. The first way differs from the second way, so for the same frame of original image the two resulting edge gradient images differ.
Then, taking each type of edge gradient image as the processing object, the pixel value of each pixel in each frame of that type is obtained, and the median of the pixel values of the pixels at the same position across the frames is determined; one frame of intermediate feature image is generated from each median and its corresponding pixel position. After the multiple types of edge gradient images are processed in this manner, multiple frames of intermediate feature images are obtained. The objects formed by pixels whose pixel values are not equal to zero in each intermediate feature image are determined as candidate stationary objects in the video to be recognized.
It should be noted that the number of intermediate feature images equals the number of ways in which the edge gradient image of each frame of original image is obtained.
Next, the recognition rate of each object among the candidate stationary objects is calculated, where recognition rate of an object = number of times the object is recognized / number of intermediate feature images. When the recognition rate of an object reaches a preset ratio, the object is determined to be a stationary object in the video to be recognized.
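The recognition-rate vote can be sketched as follows. How the "same object" is identified across intermediate feature images is not specified, so this sketch assumes each candidate already carries a comparable label; the labels, the function name `confirmed_objects`, and the threshold value are all hypothetical.

```python
from collections import Counter

def confirmed_objects(candidates_per_image, preset_ratio):
    """Keep objects whose recognition rate reaches preset_ratio.

    candidates_per_image holds one set of object labels per intermediate
    feature image (one image per edge-extraction method).
    """
    total = len(candidates_per_image)
    counts = Counter(label for found in candidates_per_image for label in set(found))
    # recognition rate = times recognized / number of intermediate feature images
    return {label for label, n in counts.items() if n / total >= preset_ratio}

# Candidates found with three hypothetical edge-extraction methods.
candidates = [{"logo", "caption"}, {"logo"}, {"logo", "clock"}]
result = confirmed_objects(candidates, preset_ratio=2 / 3)
print(result)
```

Here "logo" is found in all three intermediate feature images and is kept, while objects found only once fall below the preset ratio.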
In some embodiments, when extracting multiple frames of original images from the video to be recognized, the electronic device may extract consecutive frames of original images from the video.
For example, the electronic device cuts a short clip from the video to be recognized, running from the 20-minute mark to the 30-minute mark of playback, and uses all the original images in that clip as the multiple frames of original images extracted from the video.
In some embodiments, when extracting multiple frames of original images from the video to be recognized, the electronic device may extract the frames at intervals along the time axis of the video.
For example, following the time axis of the video to be recognized, the electronic device obtains the original images displayed at the 1-minute, 21-minute, 41-minute, and 61-minute marks of playback as the multiple frames of original images extracted from the video.
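The two extraction strategies amount to choosing which playback times or frame indices to read. The sketch below only computes those positions; the helper names, the 70-minute duration, and the 25 fps frame rate are assumptions for illustration, and decoding the frames themselves is left to whatever video library is in use.

```python
def interval_timestamps(duration_s, start_s, step_s):
    """Timestamps (seconds) sampled every step_s from start_s up to duration_s."""
    return list(range(start_s, duration_s + 1, step_s))

def continuous_frame_indices(start_s, end_s, fps):
    """Indices of every frame between two playback times at a fixed frame rate."""
    return range(int(start_s * fps), int(end_s * fps))

# Interval sampling: 1, 21, 41 and 61 minutes of a (hypothetical) 70-minute video.
stamps = interval_timestamps(duration_s=70 * 60, start_s=60, step_s=20 * 60)

# Continuous sampling: every frame from minute 20 to minute 30 at an assumed 25 fps.
clip = continuous_frame_indices(start_s=20 * 60, end_s=30 * 60, fps=25)

print([s // 60 for s in stamps], len(clip))
```

Interval sampling spreads the frames over the whole video, which gives moving content more time to change position between frames than a short continuous clip does.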
202. Obtain the pixel value of each pixel in each frame of edge gradient image, and determine the median of the pixel values of the pixels at the same position across the multiple frames of edge gradient images.
For example, after obtaining the edge gradient image corresponding to each frame of original image and thus the multiple frames of edge gradient images, the electronic device obtains the pixel value of each pixel in each frame of edge gradient image and determines the median of the pixel values of the pixels at the same position across the multiple frames.
It can be understood that, in the embodiments of the present application, the multiple frames of edge gradient images obtained from one video have the same size.
203. Generate an intermediate feature image according to each median and the pixel position corresponding to each median.
For example, after determining the median of the pixel values of the pixels at the same position across the multiple frames of edge gradient images, the electronic device may generate one frame of intermediate feature image according to each median and its corresponding pixel position. The intermediate feature image has the same size as each frame of edge gradient image.
It should be noted that the intermediate feature image generated from each median and its corresponding pixel position eliminates objects whose position changes across the multiple frames of edge gradient images and retains objects whose position does not change.
204. Determine the object formed by pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
For example, after generating the intermediate feature image, the electronic device may determine the object formed by pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized. There may be one or more stationary objects.
205. Match the stationary object against multiple preset identifiers, and determine the number of preset identifiers successfully matched with the stationary object.
For example, once the stationary objects in the video to be recognized have been determined, their types are still unknown. The electronic device may match each stationary object against multiple preset identifiers and determine the number of preset identifiers successfully matched with each stationary object. It should be noted that if an identifier is present in a video, it is generally one type of stationary object.
For example, suppose the video to be recognized contains two stationary objects, denoted stationary object R1 and stationary object R2. The electronic device matches stationary object R1 against the multiple preset identifiers to obtain the similarity between R1 and each preset identifier, takes the preset identifiers whose similarity satisfies a preset condition as candidate identifiers for R1, and takes the candidate with the highest similarity as the preset identifier successfully matched with R1. The preset identifier successfully matched with R2 is determined in the same way, and finally the number of preset identifiers successfully matched with the two stationary objects is determined. It should be noted that the number of preset identifiers successfully matched with any single stationary object can only be 0 or 1.
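The matching step can be sketched as a thresholded best-match search. The similarity measure is not specified in the text, so this example assumes a Jaccard overlap between pixel-coordinate sets and a threshold of 0.5 as the preset condition; the preset names and shapes are made up for illustration.

```python
def jaccard(a, b):
    """Similarity of two pixel sets: intersection size over union size."""
    return len(a & b) / len(a | b) if a | b else 0.0

def best_match(obj, presets, threshold):
    """Return the name of the most similar preset at or above threshold, else None."""
    scores = {name: jaccard(obj, shape) for name, shape in presets.items()}
    candidates = {name: s for name, s in scores.items() if s >= threshold}
    return max(candidates, key=candidates.get) if candidates else None

def matched_identifier_count(objects, presets, threshold=0.5):
    """Number of stationary objects that match some preset identifier (0 or 1 each)."""
    return sum(best_match(obj, presets, threshold) is not None for obj in objects)

presets = {"logo_a": {(0, 0), (0, 1), (1, 0)}, "logo_b": {(5, 5), (5, 6)}}
r1 = {(0, 0), (0, 1), (1, 0), (1, 1)}  # close to logo_a
r2 = {(9, 9)}                          # matches nothing
count = matched_identifier_count([r1, r2], presets)
print(best_match(r1, presets, 0.5), count)
```

Because each object contributes at most one match, the total count is bounded by the number of stationary objects, consistent with the 0-or-1 remark above.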
The electronic device stores multiple preset identifiers in advance, and the user can add or remove preset identifiers. For example, when the electronic device is playing a video on the display interface and receives a preset identifier storage instruction from the user, it saves a new preset identifier in the memory according to that instruction.
It should be noted that the user can trigger the preset identifier storage instruction in a preset manner. For example, while watching a video, the user swipes down on the display screen with three fingers, which triggers the preset identifier storage instruction. On receiving the instruction, the electronic device obtains the image displayed when the instruction was triggered, recognizes the identifier in the displayed image, and saves it to the memory as a new preset identifier. As another example, while watching a video, the user circles a region on the display screen, which triggers the preset identifier storage instruction. On receiving the instruction, the electronic device obtains the circled region and saves the object in that region to the memory as a new preset identifier, and so on.
206. Determine the recommendation priority of the video to be recognized according to the number of preset identifiers, where the number of preset identifiers is inversely proportional to the recommendation priority.
For example, after determining the number of preset identifiers successfully matched with the stationary objects, the electronic device may determine the recommendation priority of the video to be recognized according to that number, where the number of preset identifiers is inversely proportional to the recommendation priority. That is, the more preset identifiers are matched, the lower the recommendation priority and the less likely the electronic device is to recommend the video; the fewer preset identifiers are matched, the higher the recommendation priority and the more likely the electronic device is to recommend the video.
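A minimal sketch of this ordering follows. The exact mapping from identifier count to priority is not given, so `1 / (1 + n)` is used here only as one hypothetical decreasing function; the video names and counts are invented.

```python
def recommendation_priority(identifier_count):
    """Hypothetical mapping: more matched identifiers gives a lower priority."""
    return 1 / (1 + identifier_count)

# name -> number of preset identifiers matched in that video (made-up data)
videos = {"video_a": 2, "video_b": 0, "video_c": 1}

ranked = sorted(
    videos,
    key=lambda name: recommendation_priority(videos[name]),
    reverse=True,  # highest priority first
)
print(ranked)
```

Any strictly decreasing function of the count would give the same ordering; the `+ 1` simply avoids division by zero for videos with no matched identifier.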
In some embodiments, when it detects that a video recommendation is needed, the electronic device obtains the user's browsing history, determines from it the target type of video to recommend, searches a first preset video library for videos of that target type, takes those videos as candidate videos, and displays the candidate videos on the display interface in descending order of recommendation priority.
In some embodiments, when it detects that a video recommendation is needed, the electronic device obtains the user's browsing history and determines from it the single target type most in need of recommendation. It searches the first sub-library of a second preset video library (which stores the videos with the highest recommendation priority) for videos of that target type, takes them as candidate videos, and displays them on the display interface. When it detects that the user has browsed a certain proportion (for example, 50%) of them, it searches the second sub-library of the second preset video library (which stores the videos with the second-highest recommendation priority) for videos of that target type, takes them as candidate videos, and displays them on the display interface, and so on.
Please refer to FIG. 5, which is a third schematic flowchart of the recognition method provided by an embodiment of the present application.
301. Extract multiple frames of original images from the video to be recognized, and obtain the edge gradient image corresponding to each frame of original image to obtain multiple frames of edge gradient images.
For example, the electronic device obtains the video to be recognized, extracts multiple frames of original images from it, and obtains the edge gradient image corresponding to each frame of original image, thereby obtaining multiple frames of edge gradient images. The edge gradient image corresponding to a frame of original image is the image obtained by performing edge extraction on that frame. An edge is a location where a region attribute changes abruptly, that is, the boundary between one attribute region of the image and another. Edges include step edges and roof edges. The pixel values on the two sides of a step edge differ markedly. A roof edge lies at the turning point where pixel values rise from small to large and then fall again.
302. Obtain the pixel value of each pixel in each frame of edge gradient image, and determine the median of the pixel values of the pixels at the same position across the multiple frames of edge gradient images.
For example, after obtaining the edge gradient image corresponding to each frame of original image and thus the multiple frames of edge gradient images, the electronic device obtains the pixel value of each pixel in each frame of edge gradient image and determines the median of the pixel values of the pixels at the same position across the multiple frames.
It can be understood that, in the embodiments of the present application, the multiple frames of edge gradient images obtained from one video have the same size.
303. Generate an intermediate feature image according to each median and the pixel position corresponding to each median.
For example, after determining the median of the pixel values of the pixels at the same position across the multiple frames of edge gradient images, the electronic device may generate one frame of intermediate feature image according to each median and its corresponding pixel position. The intermediate feature image has the same size as each frame of edge gradient image.
It should be noted that the intermediate feature image generated from each median and its corresponding pixel position eliminates objects whose position changes across the multiple frames of edge gradient images and retains objects whose position does not change.
304. Determine the object formed by pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
For example, after generating the intermediate feature image, the electronic device may determine the object formed by pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized. There may be one or more stationary objects.
305. Determine, according to a pre-trained recognition model, whether the intermediate feature image includes an identifier of the video to be recognized.
For example, after determining that stationary objects exist in the video to be recognized, the electronic device determines, according to the pre-trained recognition model, whether the intermediate feature image includes an identifier of the video to be recognized. It should be noted that in this solution the video to be recognized may have one or more identifiers; for example, its identifiers may include the two identifiers "XXTV" and "XX Theater".
For example, suppose it is determined from the intermediate feature image that the video to be recognized contains 3 stationary objects, denoted stationary object Y1, stationary object Y2, and stationary object Y3. The electronic device inputs the intermediate feature image into the pre-trained recognition model and obtains an output of 0 or 1. An output of 0 indicates that the intermediate feature image does not include an identifier of the video to be recognized, that is, stationary objects Y1, Y2, and Y3 are all objects other than identifiers. An output of 1 indicates that the intermediate feature image includes an identifier of the video to be recognized.
As another example, suppose it is determined from the intermediate feature image that the video to be recognized contains 3 stationary objects, denoted stationary object Y4, stationary object Y5, and stationary object Y6. The electronic device inputs the intermediate feature image into the pre-trained recognition model, which outputs the object type of each stationary object; for instance, the object type of Y4 is "building", the object type of Y5 is "identifier", and the object type of Y6 is "person". Whether the intermediate feature image includes an identifier of the video to be recognized is then determined from the object types of the stationary objects.
It should be noted that in this solution the input to the recognition model is the intermediate feature image derived from the original images. Compared with feeding the original images directly into the recognition model, this helps improve the accuracy with which the model judges whether an identifier is present.
In some embodiments, before determining, according to the pre-trained recognition model, whether the intermediate feature image includes an identifier of the video to be recognized, the electronic device may obtain multiple frames of intermediate feature images derived from multiple training videos to form a training set, train a preset convolutional neural network model on the training set, and use the trained convolutional neural network model as the recognition model. In this solution, training the recognition model on intermediate feature images can improve the recognition accuracy of the model.
306. If the intermediate feature image does not include an identifier of the video to be recognized, determine the highest of multiple preset levels as the recommendation level of the video to be recognized.
For example, after determining that the intermediate feature image does not include an identifier of the video to be recognized, the electronic device may determine the highest of the multiple preset levels as the recommendation level of the video. It should be noted that the higher the recommendation level of the video to be recognized, the more likely the electronic device is to recommend it. Compared with a video without an identifier, a video with an identifier may have part of its content obscured, which degrades the viewing experience. In this solution, therefore, a video without an identifier receives the highest recommendation level.
307. If the intermediate feature image includes an identifier of the video to be recognized, determine the area ratio of the identifier in the video to be recognized.
For example, after determining that the intermediate feature image includes one identifier of the video to be recognized, the electronic device may determine the area occupied by that identifier and calculate the area ratio of the identifier in the video according to the formula: area ratio = area occupied by the identifier / area of the intermediate feature image.
As another example, after determining that the intermediate feature image includes multiple identifiers of the video to be recognized, the electronic device may determine the area ratio of the multiple identifiers in the video. The electronic device may take the largest of the candidate area ratios (the area ratio of each individual identifier in the video being a candidate area ratio) as the area ratio of the multiple identifiers. Alternatively, the electronic device may calculate a combined area ratio for the multiple identifiers, and so on.
For example, suppose the video to be recognized includes a first identifier and a second identifier. The electronic device may determine the first area ratio of the first identifier in the video and the second area ratio of the second identifier in the video, and take the larger of the two as the area ratio for the video. If the first area ratio is greater than the second area ratio, the area ratio of the multiple identifiers in the video is the first area ratio.
As another example, suppose the video to be recognized includes a first identifier and a second identifier. The electronic device may determine the area occupied by the first identifier and the area occupied by the second identifier, and calculate the area ratio according to the formula: area ratio = (area occupied by the first identifier + area occupied by the second identifier) / area of the intermediate feature image. The result is taken as the area ratio of the multiple identifiers in the video to be recognized.
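The three area-ratio variants described above (single identifier, largest of several, and sum of several) reduce to a few lines of arithmetic. The frame size and identifier sizes below are made-up values, and the combined variant follows the stated formula literally, so it does not correct for overlapping identifiers.

```python
def area_ratio(identifier_area, image_area):
    """Area ratio of one identifier in the intermediate feature image."""
    return identifier_area / image_area

def max_area_ratio(identifier_areas, image_area):
    """Largest candidate area ratio among several identifiers."""
    return max(area_ratio(a, image_area) for a in identifier_areas)

def combined_area_ratio(identifier_areas, image_area):
    """Sum of identifier areas over the image area (overlap is not handled)."""
    return sum(identifier_areas) / image_area

image_area = 1920 * 1080          # assumed frame size in pixels
areas = [200 * 100, 120 * 60]     # assumed areas of the first and second identifier

print(max_area_ratio(areas, image_area), combined_area_ratio(areas, image_area))
```

The maximum variant ranks a video by its most intrusive identifier, while the combined variant reflects the total screen area covered.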
308. Determine the recommendation level of the video to be recognized from the preset levels other than the highest level, according to the area ratio.
For example, when the video to be recognized includes only one identifier, after determining the area ratio of that identifier in the video, the electronic device may determine the recommendation level of the video from the preset levels other than the highest level according to that area ratio. The larger the area ratio, the lower the recommendation level and the lower the probability that the electronic device recommends the video; the smaller the area ratio, the higher the recommendation level and the higher the probability that the electronic device recommends the video.
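Steps 306 to 308 can be combined into one decision function. The text only requires that a larger area ratio gives a lower level, so the equal-width bins and the `max_ratio` cap of 0.1 below are assumptions, as are the level names and the convention that `None` means no identifier was found.

```python
def recommendation_level(area_ratio, levels, max_ratio=0.1):
    """Pick a level; levels[0] is the highest and is reserved for no identifier.

    area_ratio is None when the intermediate feature image has no identifier.
    """
    if area_ratio is None:
        return levels[0]
    lower = levels[1:]
    # Equal-width bins over [0, max_ratio]; larger ratios clamp to the lowest level.
    index = min(int(area_ratio / max_ratio * len(lower)), len(lower) - 1)
    return lower[index]

levels = ["D1", "D2", "D3", "D4", "D5", "D6"]  # highest first
for ratio in (None, 0.01, 0.05, 0.5):
    print(ratio, recommendation_level(ratio, levels))
```

A real system would likely tune the bin edges on data; the point of the sketch is only that the highest level is unreachable once any identifier is present.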
In some embodiments, after determining that the intermediate feature image includes an identifier of the video to be recognized, the electronic device may further determine the position of the identifier in the video, and determine the recommendation level of the video from the preset levels other than the highest level according to that position.
When determining the recommendation level from the preset levels other than the highest level according to the position, the electronic device may compute the difference V between the number of preset levels and 1, take V as the number of regions into which the intermediate feature image is divided, and divide the intermediate feature image into V regions using rectangles centered on the center of the image. Each region corresponds to one preset level, and the closer a region is to the edge of the image, the higher its preset level.
As shown in FIG. 6, which is a schematic diagram of the region division of an intermediate feature image provided by an embodiment of the present application, suppose there are 6 preset levels, denoted D1, D2, D3, D4, D5, and D6, ordered from highest to lowest as D1 > D2 > D3 > D4 > D5 > D6. When determining the recommendation level of the video to be recognized from the preset levels other than the highest level (that is, from D2, D3, D4, D5, and D6) according to the position, the electronic device may determine that the intermediate feature image is to be divided into 5 regions, and divide it from its center with rectangles into region Q1, region Q2, region Q3, region Q4, and region Q5. Each region corresponds to one preset level, and the closer a region is to the edge, the higher its preset level: region Q1 corresponds to preset level D6, region Q2 to preset level D5, region Q3 to preset level D4, region Q4 to preset level D3, and region Q5 to preset level D2.
Then, according to the region in which the identifier of the video to be recognized is located, the recommendation level of the video to be recognized is determined from the preset levels other than the highest level (from D2, D3, D4, D5, and D6). For example, if the identifier is located in region Q3, the recommendation level of the video to be recognized is determined to be preset level D4.
It should be noted that when the intermediate feature image includes multiple identifiers of the video to be recognized, multiple levels can be obtained from the positions of the identifiers, and the lowest of these levels is used as the recommendation level of the video to be recognized.
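The position-based rule can be illustrated with a minimal Python sketch. It assumes, beyond what the text states, that the V nested rectangles are equally spaced between the image center and the border. The function names and the integer level encoding (n stands for Dn, with 1 the highest level) are hypothetical and chosen only for illustration.

```python
def region_index(x, y, width, height, num_regions):
    """Return k in 1..num_regions: 1 is the innermost rectangle,
    num_regions is the outermost ring (assumes equally spaced rectangles)."""
    cx, cy = width / 2.0, height / 2.0
    # Normalised distance from the centre: 0 at the centre, 1 on the border.
    d = max(abs(x - cx) / cx, abs(y - cy) / cy)
    return min(num_regions, int(d * num_regions) + 1)


def level_for_position(x, y, width, height, num_levels):
    """Map an identifier position to a level number among D2..D{num_levels}."""
    v = num_levels - 1                      # number of divisions V
    k = region_index(x, y, width, height, v)
    return num_levels + 1 - k               # Q1 -> lowest level, QV -> D2


def recommendation_level(positions, width, height, num_levels):
    """No identifier gives the highest level (1); otherwise take the lowest
    level (largest number) over all identifier positions."""
    if not positions:
        return 1
    return max(level_for_position(x, y, width, height, num_levels)
               for x, y in positions)
```

Under these assumptions, with a 100 x 100 image and six levels, an identifier at the image center maps to D6 and an identifier close to the right border maps to D2.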
FIG. 7 is a schematic structural diagram of a recognition apparatus provided by an embodiment of this application. The apparatus is used to execute the recognition method provided in the above embodiments and has the functional modules and beneficial effects corresponding to that method. As shown in FIG. 7, the recognition apparatus 400 includes a first acquisition module 401, a first determining module 402, a generating module 403, and a second determining module 404, wherein:
The first acquisition module 401 is configured to extract multiple frames of original images from the video to be recognized and acquire the edge gradient image corresponding to each frame of original image, obtaining multiple frames of edge gradient images;
The first determining module 402 is configured to acquire the pixel value of each pixel in each frame of edge gradient image and determine the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images;
The generating module 403 is configured to generate an intermediate feature image according to each median and the pixel position corresponding to each median;
The second determining module 404 is configured to determine the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
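The four modules can be read as one processing chain, sketched below in plain Python. The sketch assumes grayscale frames given as equally sized lists of rows and approximates the edge gradient with forward differences; this passage does not fix a gradient operator, so that choice and all function names are assumptions.

```python
from statistics import median


def edge_gradient(frame):
    """Approximate gradient magnitude as |dx| + |dy| using forward differences
    (an assumed operator; any edge-gradient filter could be substituted)."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            dx = frame[r][min(c + 1, w - 1)] - frame[r][c]
            dy = frame[min(r + 1, h - 1)][c] - frame[r][c]
            out[r][c] = abs(dx) + abs(dy)
    return out


def intermediate_feature_image(frames):
    """Per-pixel median of the edge gradient images of all frames."""
    grads = [edge_gradient(f) for f in frames]
    h, w = len(grads[0]), len(grads[0][0])
    return [[median(g[r][c] for g in grads) for c in range(w)]
            for r in range(h)]


def stationary_mask(feature):
    """True where the intermediate feature image is non-zero."""
    return [[value != 0 for value in row] for row in feature]
```

Edges that stay at the same position in most frames survive the median, while edges of moving content appear in only a few frames at any given pixel and are suppressed to zero.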
In some embodiments, when extracting multiple frames of original images from the video to be recognized, the first acquisition module 401 is configured to extract the multiple frames of original images from the video to be recognized at intervals along the time axis of the video to be recognized.
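Interval extraction along the time axis can be sketched as follows. The fixed sampling interval in seconds and the function name are assumptions; the text only requires that frames be taken at intervals.

```python
def sample_frame_indices(total_frames, fps, interval_seconds):
    """Return the indices of the frames to extract, one every
    interval_seconds along the time axis (at least one frame apart)."""
    step = max(1, round(fps * interval_seconds))
    return list(range(0, total_frames, step))
```

For a 100-frame clip at 25 frames per second and a one-second interval, this selects frames 0, 25, 50, and 75.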
In some embodiments, the recognition apparatus 400 further includes a matching module and a third determining module, which operate after the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image has been determined as a stationary object in the video to be recognized;
The matching module is configured to match the stationary object against multiple preset identifiers and determine the number of preset identifiers that are successfully matched with the stationary object;
The third determining module is configured to determine the recommendation priority of the video to be recognized according to the number of preset identifiers, wherein the number of preset identifiers is inversely proportional to the recommendation priority.
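A minimal sketch of the matching and priority steps is given below. The matching predicate is_match is supplied by the caller and is hypothetical, as is the specific formula 1 / (1 + count), which is just one way to make the priority fall as the number of matched preset identifiers rises.

```python
def count_matches(stationary_object, preset_identifiers, is_match):
    """Count the preset identifiers that match the stationary object.
    is_match(stationary_object, identifier) is a caller-supplied predicate."""
    return sum(1 for ident in preset_identifiers
               if is_match(stationary_object, ident))


def recommendation_priority(match_count):
    """Higher match counts give lower priority; the +1 avoids division by
    zero when nothing matches (an assumed formula, not from the text)."""
    return 1.0 / (1 + match_count)
```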
In some embodiments, the recognition apparatus 400 further includes a judgment module and a fourth determining module, which operate after the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image has been determined as a stationary object in the video to be recognized;
The judgment module is configured to determine, according to a pre-trained recognition model, whether the intermediate feature image includes the identifier of the video to be recognized;
The fourth determining module is configured to determine the highest level among multiple preset levels as the recommendation level of the video to be recognized if the intermediate feature image does not include the identifier of the video to be recognized.
In some embodiments, the recognition apparatus 400 further includes a fifth determining module and a sixth determining module, which operate after it has been determined whether the intermediate feature image includes the identifier of the video to be recognized;
The fifth determining module is configured to determine the position of the identifier in the video to be recognized if the intermediate feature image includes the identifier of the video to be recognized;
The sixth determining module is configured to determine, according to the position, the recommendation level of the video to be recognized from the preset levels other than the highest level.
In some embodiments, after it has been determined whether the intermediate feature image includes the identifier of the video to be recognized, the fifth determining module is configured to determine the area ratio of the identifier in the video to be recognized if the intermediate feature image includes the identifier of the video to be recognized;
The sixth determining module is configured to determine, according to the area ratio, the recommendation level of the video to be recognized from the preset levels other than the highest level.
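The area-ratio variant can be sketched in the same way. This passage does not state how the area ratio maps to a level, so the sketch assumes that a larger identifier yields a lower level and that the cut points are supplied by the caller; the threshold values used in the example, the function names, and the integer level encoding (n stands for Dn) are all hypothetical.

```python
from bisect import bisect_right


def area_ratio(box_width, box_height, frame_width, frame_height):
    """Fraction of the frame area covered by the identifier's bounding box."""
    return (box_width * box_height) / float(frame_width * frame_height)


def level_for_area_ratio(ratio, thresholds, num_levels):
    """Pick a level number among D2..D{num_levels}.

    thresholds: ascending cut points, one fewer than the number of candidate
    levels. Assumption: the larger the ratio, the lower the level."""
    if len(thresholds) != num_levels - 2:
        raise ValueError("need num_levels - 2 thresholds")
    return 2 + bisect_right(thresholds, ratio)
```

With six levels and the illustrative cut points 0.01, 0.03, 0.06, and 0.10, a ratio of 0.005 gives D2, a ratio of 0.05 gives D4, and a ratio of 0.2 gives D6.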
In some embodiments, the recognition apparatus 400 further includes a second acquisition module and a training module, which operate before it is determined whether the intermediate feature image includes the identifier of the video to be recognized;
The second acquisition module is configured to acquire multiple frames of intermediate feature images obtained from multiple training videos to form a training set;
The training module is configured to train a preset convolutional neural network model with the training set and use the trained convolutional neural network model as the recognition model.
It should be noted that the recognition apparatus provided in this embodiment belongs to the same concept as the recognition method in the above embodiments. Any method provided in the recognition method embodiments can run on the recognition apparatus. For the specific implementation process, see the recognition method embodiments; the details are not repeated here.
An embodiment of this application provides a computer-readable storage medium storing a computer program. When the stored computer program is executed on a computer, the computer executes the steps of the recognition method provided in the embodiments of this application. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
An embodiment of this application also provides an electronic device. Referring to FIG. 8, the electronic device 500 includes a processor 501 and a memory 502, and the processor 501 is electrically connected to the memory 502.
The processor 501 is the control center of the electronic device 500. It connects the parts of the entire electronic device through various interfaces and lines, and it performs the functions of the electronic device 500 and processes data by running or loading the computer program stored in the memory 502 and calling the data stored in the memory 502.
The memory 502 may be used to store software programs and modules. The processor 501 executes various functional applications and data processing by running the computer programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the computer programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the electronic device.
In addition, the memory 502 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Correspondingly, the memory 502 may further include a memory controller to provide the processor 501 with access to the memory 502.
In this embodiment, the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more computer programs into the memory 502 and runs the computer programs stored in the memory 502, thereby implementing the following functions:
extracting multiple frames of original images from the video to be recognized, and acquiring the edge gradient image corresponding to each frame of original image to obtain multiple frames of edge gradient images;
acquiring the pixel value of each pixel in each frame of edge gradient image, and determining the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images;
generating an intermediate feature image according to each median and the pixel position corresponding to each median;
determining the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
Referring to FIG. 9, which is a second schematic structural diagram of the electronic device provided by an embodiment of this application, the difference from the electronic device shown in FIG. 8 is that the electronic device further includes a camera component 603, a display component 604, an audio circuit 605, a radio frequency circuit 606, and a power supply 607. The camera component 603, the display component 604, the audio circuit 605, the radio frequency circuit 606, and the power supply 607 are each electrically connected to the processor 601.
The camera component 603 may include an image processing circuit. The image processing circuit may be implemented with hardware and/or software components and may include various processing units that define an image signal processing (ISP) pipeline. The image processing circuit may include at least multiple cameras, an image signal processor (ISP processor), control logic, an image memory, and a display. Each camera may include at least one or more lenses and an image sensor. The image sensor may include a color filter array (such as a Bayer filter). The image sensor can acquire the light intensity and wavelength information captured by each of its imaging pixels and provide a set of raw image data that can be processed by the image signal processor.
The display component 604 may be used to display information input by the user or information provided to the user, as well as various graphical user interfaces, which may be composed of graphics, text, icons, video, and any combination thereof.
The audio circuit 605 may be used to provide an audio interface between the user and the electronic device through a speaker and a microphone.
The radio frequency circuit 606 may be used to transmit and receive radio frequency signals, so as to establish wireless communication with network devices or other electronic devices and exchange signals with them.
The power supply 607 may be used to supply power to the components of the electronic device 600. In some embodiments, the power supply 607 may be logically connected to the processor 601 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
In this embodiment, the processor 601 in the electronic device 600 loads the instructions corresponding to the processes of one or more computer programs into the memory 602 and runs the computer programs stored in the memory 602, thereby implementing the following functions:
extracting multiple frames of original images from the video to be recognized, and acquiring the edge gradient image corresponding to each frame of original image to obtain multiple frames of edge gradient images;
acquiring the pixel value of each pixel in each frame of edge gradient image, and determining the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images;
generating an intermediate feature image according to each median and the pixel position corresponding to each median;
determining the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
In some embodiments, when extracting multiple frames of original images from the video to be recognized, the processor 601 may execute:
extracting the multiple frames of original images from the video to be recognized at intervals along the time axis of the video to be recognized.
In some embodiments, after determining the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized, the processor 601 may execute:
matching the stationary object against multiple preset identifiers, and determining the number of preset identifiers that are successfully matched with the stationary object;
determining the recommendation priority of the video to be recognized according to the number of preset identifiers, wherein the number of preset identifiers is inversely proportional to the recommendation priority.
In some embodiments, after determining the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized, the processor 601 may execute:
determining, according to a pre-trained recognition model, whether the intermediate feature image includes the identifier of the video to be recognized;
if the intermediate feature image does not include the identifier of the video to be recognized, determining the highest level among multiple preset levels as the recommendation level of the video to be recognized.
In some embodiments, after determining whether the intermediate feature image includes the identifier of the video to be recognized, the processor 601 may execute:
if the intermediate feature image includes the identifier of the video to be recognized, determining the position of the identifier in the video to be recognized;
determining, according to the position, the recommendation level of the video to be recognized from the preset levels other than the highest level.
In some embodiments, after determining whether the intermediate feature image includes the identifier of the video to be recognized, the processor 601 may execute:
if the intermediate feature image includes the identifier of the video to be recognized, determining the area ratio of the identifier in the video to be recognized;
determining, according to the area ratio, the recommendation level of the video to be recognized from the preset levels other than the highest level.
In some embodiments, before determining whether the intermediate feature image includes the identifier of the video to be recognized, the processor 601 may execute:
acquiring multiple frames of intermediate feature images obtained from multiple training videos to form a training set;
training a preset convolutional neural network model with the training set, and using the trained convolutional neural network model as the recognition model.
As can be seen from the above, the electronic device provided in this embodiment extracts multiple frames of original images from the video to be recognized, acquires the edge gradient image corresponding to each frame of original image to obtain multiple frames of edge gradient images, determines the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images, generates an intermediate feature image according to each median and the pixel position corresponding to each median, and finally determines the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized. This can improve the accuracy of recognizing stationary objects in the video to be recognized.
An embodiment of this application also provides a storage medium storing a computer program. When the computer program runs on a computer, the computer executes the recognition method in any of the above embodiments, for example: extracting multiple frames of original images from the video to be recognized, and acquiring the edge gradient image corresponding to each frame of original image to obtain multiple frames of edge gradient images; acquiring the pixel value of each pixel in each frame of edge gradient image, and determining the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images; generating an intermediate feature image according to each median and the pixel position corresponding to each median; and determining the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
In the embodiments of this application, the storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
It should be noted that, for the recognition method of the embodiments of this application, a person of ordinary skill in the art can understand that all or part of the process of implementing the recognition method can be completed by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, such as the memory of an electronic device, and executed by at least one processor in the electronic device, and its execution may include the process of the embodiments of the recognition method. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.
For the recognition apparatus of the embodiments of this application, the functional modules may be integrated in one processing chip, each module may exist physically on its own, or two or more modules may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.
The recognition method, apparatus, storage medium, and electronic device provided by the embodiments of this application have been described in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the above embodiments is only intended to help in understanding the method of this application and its core idea. A person skilled in the art may, based on the idea of this application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting this application.

Claims (20)

  1. A recognition method, comprising:
    extracting multiple frames of original images from a video to be recognized, and acquiring the edge gradient image corresponding to each frame of original image to obtain multiple frames of edge gradient images;
    acquiring the pixel value of each pixel in each frame of edge gradient image, and determining the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images;
    generating an intermediate feature image according to each median and the pixel position corresponding to each median;
    determining the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
  2. The recognition method according to claim 1, wherein the extracting multiple frames of original images from a video to be recognized comprises:
    extracting the multiple frames of original images from the video to be recognized at intervals along the time axis of the video to be recognized.
  3. The recognition method according to claim 1, wherein after the determining the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized, the method further comprises:
    matching the stationary object against multiple preset identifiers, and determining the number of preset identifiers that are successfully matched with the stationary object;
    determining the recommendation priority of the video to be recognized according to the number of preset identifiers, wherein the number of preset identifiers is inversely proportional to the recommendation priority.
  4. The recognition method according to claim 1, wherein after the determining the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized, the method further comprises:
    determining, according to a pre-trained recognition model, whether the intermediate feature image includes the identifier of the video to be recognized;
    if the intermediate feature image does not include the identifier of the video to be recognized, determining the highest level among multiple preset levels as the recommendation level of the video to be recognized.
  5. The recognition method according to claim 4, wherein after the determining whether the intermediate feature image includes the identifier of the video to be recognized, the method further comprises:
    if the intermediate feature image includes the identifier of the video to be recognized, determining the position of the identifier in the video to be recognized;
    determining, according to the position, the recommendation level of the video to be recognized from the preset levels other than the highest level.
  6. The recognition method according to claim 4, wherein after the determining whether the intermediate feature image includes the identifier of the video to be recognized, the method further comprises:
    if the intermediate feature image includes the identifier of the video to be recognized, determining the area ratio of the identifier in the video to be recognized;
    determining, according to the area ratio, the recommendation level of the video to be recognized from the preset levels other than the highest level.
  7. The recognition method according to claim 4, wherein before the determining whether the intermediate feature image includes the identifier of the video to be recognized, the method further comprises:
    acquiring multiple frames of intermediate feature images obtained from multiple training videos to form a training set;
    training a preset convolutional neural network model with the training set, and using the trained convolutional neural network model as the recognition model.
  8. A recognition apparatus, comprising:
    a first acquisition module, configured to extract multiple frames of original images from a video to be recognized and acquire the edge gradient image corresponding to each frame of original image, obtaining multiple frames of edge gradient images;
    a first determining module, configured to acquire the pixel value of each pixel in each frame of edge gradient image and determine the median of the pixel values of the pixels located at the same position in the multiple frames of edge gradient images;
    a generating module, configured to generate an intermediate feature image according to each median and the pixel position corresponding to each median;
    a second determining module, configured to determine the object formed by the pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
  9. The recognition apparatus according to claim 8, wherein the first acquisition module is configured to extract the multiple frames of original images from the video to be recognized at intervals along the time axis of the video to be recognized.
  10. The recognition apparatus according to claim 8, wherein the recognition apparatus further comprises:
    a matching module, configured to match the stationary object against multiple preset identifiers and determine the number of preset identifiers that are successfully matched with the stationary object;
    a third determining module, configured to determine the recommendation priority of the video to be recognized according to the number of preset identifiers, wherein the number of preset identifiers is inversely proportional to the recommendation priority.
  11. The recognition apparatus according to claim 8, wherein the recognition apparatus further comprises:
    a judgment module, configured to determine, according to a pre-trained recognition model, whether the intermediate feature image includes the identifier of the video to be recognized;
    a fourth determining module, configured to determine the highest level among multiple preset levels as the recommendation level of the video to be recognized if the intermediate feature image does not include the identifier of the video to be recognized.
  12. The recognition device according to claim 11, wherein the recognition device further comprises:
    a second acquisition module, configured to acquire multiple frames of intermediate feature images obtained from multiple training videos to form a training set; and
    a training module, configured to train a preset convolutional neural network model using the training set, and to use the trained convolutional neural network model as the recognition model.
  13. An electronic device, comprising: a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements a recognition method comprising:
    extracting multiple frames of original images from a video to be recognized, and obtaining the edge gradient image corresponding to each frame of original image, to obtain multiple frames of edge gradient images;
    obtaining the pixel value of each pixel in each frame of edge gradient image, and determining the median of the pixel values of the pixels located at the same position in the multiple frames of edge images;
    generating an intermediate feature image according to each median and the pixel position corresponding to each median; and
    determining an object formed by pixels whose pixel values are not equal to zero in the intermediate feature image as a stationary object in the video to be recognized.
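The four steps of claim 13 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the claim does not name an edge operator, so a simple forward-difference gradient is assumed here, frames are assumed to be 2-D lists of grayscale values, and the function names `edge_gradient`, `intermediate_feature_image`, and `stationary_mask` are hypothetical.

```python
from statistics import median


def edge_gradient(frame):
    """Return |dx| + |dy| per pixel, using forward differences (an assumed operator)."""
    height, width = len(frame), len(frame[0])
    gradient = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Differences are taken as zero on the last column and last row.
            gx = frame[y][x + 1] - frame[y][x] if x + 1 < width else 0
            gy = frame[y + 1][x] - frame[y][x] if y + 1 < height else 0
            gradient[y][x] = abs(gx) + abs(gy)
    return gradient


def intermediate_feature_image(gradients):
    """Per-position median of the pixel values across all edge gradient images."""
    height, width = len(gradients[0]), len(gradients[0][0])
    return [
        [median(g[y][x] for g in gradients) for x in range(width)]
        for y in range(height)
    ]


def stationary_mask(feature):
    """Mark positions whose median value is non-zero as belonging to a stationary object."""
    return [[value != 0 for value in row] for row in feature]
```

The median is what separates still content from moving content in this sketch: an edge that appears at a given position in only a minority of the sampled frames contributes mostly zeros at that position, so its median is zero, whereas an edge present in every frame keeps a non-zero median.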
  14. The electronic device according to claim 13, wherein, when extracting the multiple frames of original images from the video to be recognized, the processor is configured to execute:
    extracting multiple frames of original images from the video to be recognized at intervals according to the time axis of the video to be recognized.
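Interval extraction along the time axis can be sketched as follows. The claim does not fix the interval or say whether it is constant; a constant interval in seconds is assumed, and `sample_timestamps` is a hypothetical helper that only computes the timestamps at which frames would be decoded.

```python
def sample_timestamps(duration_s, interval_s):
    """Return timestamps (in seconds) spaced interval_s apart, starting at 0."""
    if duration_s < 0 or interval_s <= 0:
        raise ValueError("duration must be >= 0 and interval must be > 0")
    # Number of whole intervals that fit in the video, plus the frame at t = 0.
    count = int(duration_s // interval_s) + 1
    return [i * interval_s for i in range(count)]
```

For a 10-second video sampled every 4 seconds, this yields the timestamps 0, 4, and 8.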
  15. The electronic device according to claim 13, wherein, after the object formed by pixels whose pixel values are not equal to zero in the intermediate feature image is determined as the stationary object in the video to be recognized, the processor is configured to execute:
    matching the stationary object against a plurality of preset identifiers, and determining the number of preset identifiers that are successfully matched with the stationary object; and
    determining a recommendation priority of the video to be recognized according to the number of preset identifiers, wherein the number of preset identifiers is inversely proportional to the recommendation priority.
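A rough sketch of the matching and priority steps of claim 15 follows. The claim does not say how a match is decided or give a formula for the priority, so everything here is an assumption: the stationary object and each preset identifier are represented as sets of `(y, x)` pixel coordinates, a match is declared when at least `threshold` of an identifier's pixels are covered, and the priority is `1 / (1 + count)`, which decreases as the count grows and avoids division by zero when nothing matches.

```python
def count_matched_identifiers(stationary_pixels, preset_identifiers, threshold=0.8):
    """Count preset identifiers whose pixels are mostly covered by the stationary object."""
    matched = 0
    for template in preset_identifiers.values():
        if not template:
            continue  # An empty template cannot be matched.
        overlap = len(template & stationary_pixels) / len(template)
        if overlap >= threshold:
            matched += 1
    return matched


def recommendation_priority(matched_count):
    """More matched identifiers gives a lower priority (assumed formula)."""
    return 1.0 / (1 + matched_count)
```

Under these assumptions, a video with no matched identifier gets priority 1.0, one match gives 0.5, and three matches give 0.25.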
  16. The electronic device according to claim 13, wherein, after the object formed by pixels whose pixel values are not equal to zero in the intermediate feature image is determined as the stationary object in the video to be recognized, the processor is configured to execute:
    judging, according to a pre-trained recognition model, whether the intermediate feature image includes an identifier of the video to be recognized; and
    if the intermediate feature image does not include the identifier of the video to be recognized, determining the highest level among a plurality of preset levels as the recommendation level of the video to be recognized.
  17. The electronic device according to claim 16, wherein, after judging whether the intermediate feature image includes the identifier of the video to be recognized, the processor is configured to execute:
    if the intermediate feature image includes the identifier of the video to be recognized, determining the position of the identifier in the video to be recognized; and
    determining, according to the position, the recommendation level of the video to be recognized from the preset levels other than the highest level.
  18. The electronic device according to claim 16, wherein, after judging whether the intermediate feature image includes the identifier of the video to be recognized, the processor is configured to execute:
    if the intermediate feature image includes the identifier of the video to be recognized, determining the area proportion of the identifier in the video to be recognized; and
    determining, according to the area proportion, the recommendation level of the video to be recognized from the preset levels other than the highest level.
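The level selection of claims 16 and 18 might look like the sketch below. The claims only state that a video without an identifier receives the highest preset level and that a video with an identifier receives one of the remaining levels based on the area proportion. The direction of that mapping (a larger identifier gives a lower level), the equal-width bins, the bounding-box representation `(x0, y0, x1, y1)`, and the names `area_ratio` and `recommendation_level` are all assumptions made for illustration.

```python
def area_ratio(bbox, frame_width, frame_height):
    """Fraction of the frame covered by the identifier's bounding box."""
    x0, y0, x1, y1 = bbox
    return ((x1 - x0) * (y1 - y0)) / (frame_width * frame_height)


def recommendation_level(bbox, frame_width, frame_height, levels=(1, 2, 3)):
    """Pick a level; `levels` is sorted ascending and needs at least two entries."""
    if bbox is None:
        return levels[-1]  # No identifier found: highest preset level.
    remaining = levels[:-1]  # Preset levels other than the highest one.
    ratio = area_ratio(bbox, frame_width, frame_height)
    # Equal-width bins over [0, 1]; a larger ratio selects a lower level.
    index = min(int(ratio * len(remaining)), len(remaining) - 1)
    return remaining[len(remaining) - 1 - index]
```

With the default three levels, a video with no identifier is assigned level 3, a small corner identifier covering 1% of the frame gives level 2, and an identifier covering 60% of the frame gives level 1.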
  19. The electronic device according to claim 16, wherein, before judging whether the intermediate feature image includes the identifier of the video to be recognized, the processor is configured to execute:
    acquiring multiple frames of intermediate feature images obtained from multiple training videos to form a training set; and
    training a preset convolutional neural network model using the training set, and using the trained convolutional neural network model as the recognition model.
  20. A storage medium containing instructions executable by an electronic device, wherein the instructions, when executed by a processor of the electronic device, are used to perform the recognition method according to any one of claims 1 to 7.
PCT/CN2019/115800 2019-11-05 2019-11-05 Recognition method and apparatus, electronic device, and storage medium WO2021087773A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980099689.9A CN114341946A (en) 2019-11-05 2019-11-05 Identification method, identification device, electronic equipment and storage medium
PCT/CN2019/115800 WO2021087773A1 (en) 2019-11-05 2019-11-05 Recognition method and apparatus, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/115800 WO2021087773A1 (en) 2019-11-05 2019-11-05 Recognition method and apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021087773A1 (en) 2021-05-14

Family

ID=75849407

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/115800 WO2021087773A1 (en) 2019-11-05 2019-11-05 Recognition method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN114341946A (en)
WO (1) WO2021087773A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361458A (en) * 2021-06-29 2021-09-07 北京百度网讯科技有限公司 Target object identification method and device based on video, vehicle and road side equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1801930A (en) * 2005-12-06 2006-07-12 南望信息产业集团有限公司 Dubious static object detecting method based on video content analysis
US8411974B2 (en) * 2008-12-16 2013-04-02 Sony Corporation Image processing apparatus, method, and program for detecting still-zone area
CN105046653A (en) * 2015-06-12 2015-11-11 中国科学院深圳先进技术研究院 Method and system for removing raindrops in videos
CN105205459A (en) * 2015-09-16 2015-12-30 东软集团股份有限公司 Method and device for identifying type of image feature point
CN110111347A (en) * 2019-04-19 2019-08-09 深圳市华星光电技术有限公司 Logos extracting method, device and storage medium

Also Published As

Publication number Publication date
CN114341946A (en) 2022-04-12

Similar Documents

Publication Publication Date Title
US10134165B2 (en) Image distractor detection and processing
TWI564791B (en) Broadcast control system, method, computer program product and computer readable medium
KR20230013243A (en) Maintain a fixed size for the target object in the frame
CN110189378A (en) A kind of method for processing video frequency, device and electronic equipment
WO2021143624A1 (en) Video tag determination method, device, terminal, and storage medium
WO2016187888A1 (en) Keyword notification method and device based on character recognition, and computer program product
CN113395542B (en) Video generation method and device based on artificial intelligence, computer equipment and medium
CN104508680B (en) Improved video signal is tracked
CN108961267B (en) Picture processing method, picture processing device and terminal equipment
US20210335391A1 (en) Resource display method, device, apparatus, and storage medium
CN108513069B (en) Image processing method, image processing device, storage medium and electronic equipment
CN105430269B (en) A kind of photographic method and device applied to mobile terminal
US20230368461A1 (en) Method and apparatus for processing action of virtual object, and storage medium
WO2018184260A1 (en) Correcting method and device for document image
TWI698117B (en) Generating method and playing method of multimedia file, multimedia file generation apparatus and multimedia file playback apparatus
US11295416B2 (en) Method for picture processing, computer-readable storage medium, and electronic device
WO2021087773A1 (en) Recognition method and apparatus, electronic device, and storage medium
WO2018129955A1 (en) Electronic device control method and electronic device
US10936878B2 (en) Method and device for determining inter-cut time range in media item
WO2023066373A1 (en) Sample image determination method and apparatus, device, and storage medium
US10832369B2 (en) Method and apparatus for determining the capture mode following capture of the content
CN111475677A (en) Image processing method, image processing device, storage medium and electronic equipment
CN108495038B (en) Image processing method, image processing device, storage medium and electronic equipment
CN108898081B (en) Picture processing method and device, mobile terminal and computer readable storage medium
JP2014085845A (en) Moving picture processing device, moving picture processing method, program and integrated circuit

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19951447

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19951447

Country of ref document: EP

Kind code of ref document: A1
