WO2020048324A1

WO2020048324A1 - Video abstract generation method and apparatus, and electronic device and readable storage medium

Info

Publication number: WO2020048324A1
Application number: PCT/CN2019/102073
Authority: WO
Inventors: 韩巧玲
Original assignee: 杭州海康威视数字技术股份有限公司
Priority date: 2018-09-04
Filing date: 2019-08-22
Publication date: 2020-03-12

Abstract

Provided are a video abstract generation method and apparatus, and an electronic device and a readable storage medium. The method comprises: receiving a target search request, wherein the target search request carries feature information of a target to be searched; searching for a first target picture matching the feature information of the target to be searched; and generating a video abstract according to the first target picture and a collection time corresponding to the first target picture.

Description

Video abstract generating method, device, electronic device and readable storage medium

Technical field

The present application relates to video surveillance technology, and in particular, to a method, a device, an electronic device, and a readable storage medium for generating a video summary.

Background

Video surveillance systems, as an important technical means of social security management, are increasingly deployed in the field of public security. As the number of deployed surveillance devices grows and the scope of deployment expands, the amount of stored video recording data also increases. To find out at what time and place a specific target (a person, a vehicle, etc.) appears in the video recording data, it is often necessary to search manually through a large amount of recordings, which takes a long time and is prone to omissions. Video positioning and integrated display therefore suffer from efficiency bottlenecks and a risk of incomplete results.

Summary of the Invention

In view of this, the present application provides a method and a device for generating a video summary.

Specifically, the present application is implemented through the following technical solutions.

According to a first aspect of the embodiments of the present application, a method for generating a video summary is provided, including: receiving a target search request, where the target search request carries feature information of a target to be searched; searching for a first target picture that matches the feature information of the target to be searched; and generating a video summary according to the first target picture and the acquisition time corresponding to the first target picture.

Optionally, before searching for the first target picture that matches the feature information of the target to be searched, the method further includes: obtaining target picture information in video data of a video source device, where the target picture information includes the target picture, the acquisition time of the target picture, and attribute information of the target picture; and saving the target picture information to a picture information database.
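The record described above can be pictured as a small data structure. The following is a minimal sketch only: the class names, field names, and the in-memory list standing in for the picture information database are all hypothetical and are not specified by the application.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TargetPictureInfo:
    """One record of target picture information (field names are hypothetical)."""
    picture_id: str              # reference to the stored target picture
    acquisition_time: datetime   # when the target picture was captured
    attributes: dict             # e.g. {"gender": "female", "glasses": True}
    channel: int = 0             # video data channel the picture came from


class PictureInfoDatabase:
    """In-memory stand-in for the picture information database."""

    def __init__(self):
        self._records = []

    def save(self, info: TargetPictureInfo) -> None:
        # Persist one record; a real system would write to durable storage.
        self._records.append(info)

    def all(self) -> list:
        # Return a copy so callers cannot mutate the internal list.
        return list(self._records)
```

A real deployment would presumably index the records by acquisition time and channel so that the optional filtering steps described later can be performed efficiently.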

Optionally, acquiring the target picture information in the video data of the video source device includes receiving the target picture information sent by the video source device.

Optionally, acquiring the target picture information in the video data of the video source device includes: receiving the target picture and the acquisition time of the target picture sent by the video source device; and modeling the target picture and extracting attribute information of the target picture.

Optionally, acquiring the target picture information in the video data of the video source device includes: receiving the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device; modeling the target picture and extracting second attribute information of the target picture; and determining the attribute information of the target picture according to the first attribute information and the second attribute information of the target picture.
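The application does not state how the first and second attribute information are combined. As an illustration only, the sketch below assumes a simple precedence rule in which one source wins whenever both report the same attribute; the function name and the `prefer_first` parameter are hypothetical.

```python
def merge_attributes(first: dict, second: dict, prefer_first: bool = True) -> dict:
    """Combine device-reported (first) and locally extracted (second) attributes.

    Assumption: on conflict, one source simply overrides the other. The
    source text does not specify the actual rule.
    """
    if prefer_first:
        # Later entries override earlier ones, so `first` wins on conflict.
        return {**second, **first}
    return {**first, **second}
```

For example, `merge_attributes({"gender": "male"}, {"gender": "female", "glasses": True})` keeps the device-reported gender and adds the locally extracted `glasses` attribute.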

Optionally, acquiring the target picture information in the video data of the video source device includes: performing target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and modeling the target picture and extracting attribute information of the target picture.

Optionally, the feature information of the target to be searched includes attribute information of the target to be searched; and searching for the first target picture that matches the feature information of the target to be searched includes: searching the picture information database for the matching first target picture according to the attribute information of the target to be searched.

Optionally, the feature information of the target to be searched includes a target picture to be searched; and searching for the first target picture that matches the feature information of the target to be searched includes: modeling the target picture to be searched and extracting attribute information of the target picture to be searched; and searching the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.

Optionally, the feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; and searching for the first target picture that matches the feature information of the target to be searched includes: modeling the target picture to be searched and extracting fourth attribute information of the target picture to be searched; determining the attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and searching the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.

Optionally, the target search request further carries a search time range; and searching for the first target picture that matches the feature information of the target to be searched includes: filtering the target pictures in the picture information database according to the search time range to obtain second target pictures whose acquisition time falls within the search time range; and searching the second target pictures for the matching first target picture according to the feature information of the target to be searched.

Optionally, the target search request further carries a search channel number, and the target picture information further includes a channel number of the target picture; and searching for the first target picture that matches the feature information of the target to be searched includes: filtering the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is consistent with the search channel number; and searching the third target pictures for the matching first target picture according to the feature information of the target to be searched.
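The optional time-range and channel filters narrow the candidate set before attribute matching. The sketch below illustrates that order of operations under stated assumptions: records are plain dictionaries with hypothetical keys `time`, `channel`, and `attrs`, and matching is exact equality on every requested attribute, whereas a real system would more likely compare model features against a similarity threshold.

```python
from datetime import datetime


def search_pictures(records, wanted_attrs, time_range=None, channel=None):
    """Return records matching wanted_attrs, optionally pre-filtered.

    records: iterable of dicts with keys "time", "channel", "attrs" (hypothetical).
    time_range: optional (start, end) pair of datetimes, both inclusive.
    channel: optional search channel number.
    """
    candidates = list(records)
    if time_range is not None:
        start, end = time_range
        # "Second target pictures": acquisition time within the search range.
        candidates = [r for r in candidates if start <= r["time"] <= end]
    if channel is not None:
        # "Third target pictures": channel number equals the search channel.
        candidates = [r for r in candidates if r["channel"] == channel]
    # Exact attribute equality is an assumption made for this sketch.
    return [
        r for r in candidates
        if all(r["attrs"].get(k) == v for k, v in wanted_attrs.items())
    ]


example = [{"time": datetime(2018, 9, 4, 8), "channel": 1, "attrs": {"color": "red"}}]
found = search_pictures(example, {"color": "red"}, channel=1)
```

Filtering first keeps the comparatively expensive matching step away from pictures that can be excluded on time or channel alone.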

Optionally, the target picture is a face picture, and the target search request is a face search request.

Optionally, the target picture is a vehicle picture, and the target search request is a vehicle search request.

Optionally, generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: sorting the first target pictures in order of acquisition time from earliest to latest; and generating the video summary according to the sorted first target pictures.

Optionally, generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: for each first target picture, determining a target video clip corresponding to the first target picture, where the target video clip is the video data between the n-th second before and the m-th second after the acquisition time corresponding to the first target picture; and generating the video summary according to each target video clip.
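The clip boundaries follow directly from the acquisition time and the two offsets n and m. A minimal sketch, in which the function name is hypothetical and n and m are taken to be non-negative numbers of seconds:

```python
from datetime import datetime, timedelta


def clip_window(acquisition_time: datetime, n: float, m: float):
    """Return (start, end) of the clip: n seconds before to m seconds after."""
    return (acquisition_time - timedelta(seconds=n),
            acquisition_time + timedelta(seconds=m))
```

With n = 5 and m = 10, a picture captured at 08:00:10 corresponds to the clip from 08:00:05 to 08:00:20.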

Optionally, for each first target picture, determining the target video clip corresponding to the first target picture includes: when there are multiple first target pictures with the same acquisition time, for any first target picture among the multiple first target pictures, determining a start time point and an end time point of the video clip corresponding to that first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture and the end time point is the m-th second after that acquisition time; searching the recording data of the video data channel to which the first target picture belongs for an I frame at the start time point and an I frame at the end time point; and if both the I frame at the start time point and the I frame at the end time point exist, discarding the remaining first target pictures among the multiple first target pictures and determining the video clip corresponding to the first target picture as the target video clip.

Optionally, the method further includes: if there is no I frame at the start time point, increasing the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeating the above search step until an I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time; if there is no I frame at the end time point, decreasing the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeating the above search step until an I frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time; selecting, among the video clips corresponding to the multiple first target pictures, the video clip with the longest duration as the target video clip; and discarding the first target pictures, among the multiple first target pictures, that correspond to the remaining video clips.
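The boundary adjustment above amounts to shrinking the clip from both ends in steps of x seconds until each end lands on an I frame or reaches the acquisition time. The sketch below is an illustration under stated assumptions: times are plain numbers of seconds, `has_i_frame` is a hypothetical callable that reports whether the picture's channel has an I frame at a given time, and each boundary is clamped to the acquisition time so that a step size that does not divide n or m cannot overshoot it (the clamping is an assumption, not stated in the text).

```python
def snap_to_i_frames(acq, n, m, x, has_i_frame):
    """Shrink [acq - n, acq + m] until both ends fall on I frames.

    acq, n, m, x: numbers of seconds, with x > 0.
    has_i_frame: hypothetical callable t -> bool for the picture's channel.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    start, end = acq - n, acq + m
    # Move the start later by x seconds until an I frame exists there
    # or the start reaches the acquisition time.
    while start < acq and not has_i_frame(start):
        start = min(start + x, acq)
    # Move the end earlier by x seconds in the same way.
    while end > acq and not has_i_frame(end):
        end = max(end - x, acq)
    return start, end


def pick_longest(clips):
    """Among clips of pictures sharing an acquisition time, keep the longest."""
    return max(clips, key=lambda c: c[1] - c[0])
```

When several first target pictures share an acquisition time but belong to different channels, `snap_to_i_frames` would be called once per picture with that channel's `has_i_frame`, and `pick_longest` then selects the clip that is kept.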

Optionally, generating the video summary according to each target video clip includes: filtering the target video clips according to the start time point and the end time point of each target video clip to remove time-repeated video data; and generating the video summary according to the filtered target video clips.

Optionally, filtering the target video clips according to the start time point and the end time point of each target video clip includes: sorting the target video clips according to the start time point of each target video clip; and, for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merging the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip and the end time point of the merged video clip is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, using the end time point of the first target video clip as the start time point of the second target video clip, or using the start time point of the second target video clip as the end time point of the first target video clip; where the start time point of the first target video clip is earlier than the start time point of the second target video clip.
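The overlap handling above can be sketched as a single pass over the clips sorted by start time. This is a minimal illustration, not the application's implementation: clips are hypothetical `(start, end, channel)` tuples with numeric times, the cross-channel case uses only the first of the two options in the text (the second clip's start is moved to the first clip's end), and two details are assumptions added for robustness: the merged end is the later of the two ends, and a cross-channel clip left with no duration after trimming is dropped.

```python
def dedupe_clips(clips):
    """Remove time-overlapping video data from a list of clips.

    clips: list of (start, end, channel) tuples; times are comparable numbers.
    """
    result = []
    for start, end, channel in sorted(clips, key=lambda c: c[0]):
        if result and result[-1][1] >= start:
            p_start, p_end, p_channel = result[-1]
            if p_channel == channel:
                # Same channel: merge into one clip (max() is an assumption
                # that covers a second clip fully contained in the first).
                result[-1] = (p_start, max(p_end, end), p_channel)
                continue
            # Different channels: start the second clip where the first ends.
            start = p_end
            if start >= end:
                continue  # nothing left of the second clip (assumption)
        result.append((start, end, channel))
    return result
```

For example, `dedupe_clips([(0, 10, 1), (8, 15, 1), (12, 20, 2)])` merges the first two clips into `(0, 15, 1)` and trims the third to `(15, 20, 2)`.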

According to a second aspect of the embodiments of the present application, a video summary generating device is provided, including: a receiving unit, configured to receive a target search request, where the target search request carries feature information of a target to be searched; a search unit, configured to search for a first target picture that matches the feature information of the target to be searched; and a processing unit, configured to generate a video summary according to the first target picture and the acquisition time corresponding to the first target picture.

Optionally, the device further includes: an obtaining unit, configured to obtain target picture information in the video data of the video source device, where the target picture information includes the target picture, the acquisition time of the target picture, and attribute information of the target picture; and a saving unit, configured to save the target picture information to a picture information database.

Optionally, the obtaining unit is specifically configured to receive the target picture information sent by the video source device.

Optionally, the obtaining unit is specifically configured to: receive the target picture and the acquisition time of the target picture sent by the video source device; and model the target picture and extract attribute information of the target picture.

Optionally, the obtaining unit is specifically configured to: receive the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device; model the target picture and extract second attribute information of the target picture; and determine the attribute information of the target picture according to the first attribute information and the second attribute information of the target picture.

Optionally, the obtaining unit is specifically configured to: perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and model the target picture and extract attribute information of the target picture.

Optionally, the feature information of the target to be searched includes attribute information of the target to be searched; and the search unit is specifically configured to search the picture information database for the matching first target picture according to the attribute information of the target to be searched.

Optionally, the feature information of the target to be searched includes a target picture to be searched; and the search unit is specifically configured to: model the target picture to be searched and extract attribute information of the target picture to be searched; and search the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.

Optionally, the feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; and the search unit is specifically configured to: model the target picture to be searched and extract fourth attribute information of the target picture to be searched; determine the attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and search the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.

Optionally, the target search request further carries a search time range; and the search unit is specifically configured to: filter the target pictures in the picture information database according to the search time range to obtain second target pictures whose acquisition time falls within the search time range; and search the second target pictures for the matching first target picture according to the feature information of the target to be searched.

Optionally, the target search request further carries a search channel number, and the target picture information further includes a channel number of the target picture; and the search unit is specifically configured to: filter the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is consistent with the search channel number; and search the third target pictures for the matching first target picture according to the feature information of the target to be searched.

Optionally, the processing unit is specifically configured to: sort the first target pictures in order of acquisition time from earliest to latest; and generate the video summary according to the sorted first target pictures.

Optionally, the processing unit is specifically configured to: determine, for each first target picture, a target video clip corresponding to the first target picture, where the target video clip is the recording data between the n-th second before and the m-th second after the acquisition time corresponding to the first target picture; and generate the video summary according to each target video clip.

Optionally, the processing unit is specifically configured to: when there are multiple first target pictures with the same acquisition time, for any first target picture among the multiple first target pictures, determine a start time point and an end time point of the video clip corresponding to that first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture and the end time point is the m-th second after that acquisition time; search the recording data of the video data channel to which the first target picture belongs for an I frame at the start time point and an I frame at the end time point; and if both the I frame at the start time point and the I frame at the end time point exist, discard the remaining first target pictures among the multiple first target pictures and determine the video clip corresponding to the first target picture as the target video clip.

Optionally, the processing unit is further configured to: if there is no I frame at the start time point, increase the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeat the above search step until an I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time; if there is no I frame at the end time point, decrease the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeat the above search step until an I frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time; select, among the video clips corresponding to the multiple first target pictures, the video clip with the longest duration as the target video clip; and discard the first target pictures, among the multiple first target pictures, that correspond to the remaining video clips.

Optionally, the processing unit is specifically configured to: filter the target video clips according to the start time point and the end time point of each target video clip to remove time-repeated video data; and generate the video summary according to the filtered target video clips.

Optionally, the processing unit is specifically configured to: sort the target video clips according to the start time point of each target video clip; and, for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merge the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip and the end time point of the merged video clip is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, use the end time point of the first target video clip as the start time point of the second target video clip, or use the start time point of the second target video clip as the end time point of the first target video clip; where the start time point of the first target video clip is earlier than the start time point of the second target video clip.

According to a third aspect of the embodiments of the present application, an electronic device is provided, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is configured to store a computer program; and the processor is configured to implement the steps of the above video summary generating method when executing the computer program stored in the memory.

According to a fourth aspect of the embodiments of the present application, a computer-readable storage medium is provided, in which a computer program is stored, and when the computer program is executed by a processor, the steps of the above video summary generating method are implemented.

The video summary generating method in the embodiments of the present application receives a target search request, searches for first target pictures that match the feature information of the target to be searched carried in the target search request, and generates a video summary of the target to be searched according to each first target picture and its corresponding acquisition time. This improves the efficiency and accuracy of locating targets in video recordings. Recordings that do not match the target to be searched are removed, while the consistency of target video tracking is preserved.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic structural diagram of a video digest generating system according to an exemplary embodiment of the present application;

FIG. 2 is a schematic flowchart of a video abstract generating method according to an exemplary embodiment of the present application;

FIG. 3 is a schematic flowchart of generating a picture information database according to an exemplary embodiment of the present application;

FIG. 4 is a schematic flowchart of a repeated picture filtering process according to an exemplary embodiment of the present application;

FIG. 5 is a schematic flowchart of a video digest generating system according to another exemplary embodiment of the present application;

FIG. 6 is a schematic flowchart of extracting attributes of a face picture according to an exemplary embodiment of the present application;

FIG. 7 is a schematic flowchart of extracting attributes of a face picture according to another exemplary embodiment of the present application;

FIG. 8 is a schematic flowchart of a video digest generating system according to still another exemplary embodiment of the present application;

FIG. 9 is a schematic flowchart of extracting a picture attribute of a vehicle according to an exemplary embodiment of the present application;

FIG. 10 is a schematic flowchart of extracting a picture attribute of a vehicle according to another exemplary embodiment of the present application;

FIG. 11 is a schematic flowchart illustrating a video digest generating system according to another exemplary embodiment of the present application;

FIG. 12 is a schematic structural diagram of a video digest generating apparatus according to an exemplary embodiment of the present application;

FIG. 13 is a schematic structural diagram of a video digest generating apparatus according to another exemplary embodiment of the present application;

FIG. 14 is a schematic diagram of a hardware structure of an electronic device according to an exemplary embodiment of the present application.

Detailed Description

Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are merely examples of devices and methods consistent with certain aspects of the application as detailed in the appended claims.

The terminology used in this application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this application and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

In order to enable those skilled in the art to better understand the technical solutions provided in the embodiments of the present application, the following briefly describes the system architecture applicable to the embodiments of the present application.

Please refer to FIG. 1, which is a schematic structural diagram of a video summary generating system according to an embodiment of the present application. As shown in FIG. 1, the video summary generating system may include a video source device 110 and a search device 120. The video source device 110 may provide video data, and the video data may include real-time video data or video recording data (referred to herein as recording data). The search device 120 may receive a target search request, search the video data of the video source device 110, according to the feature information of the target to be searched carried in the target search request, for a target picture that matches that feature information (referred to herein as a first target picture), and generate a video summary of the target to be searched according to the first target picture found.

It should be noted that, in the embodiment of the present application, the video source device 110 may be a front-end video capture device (such as an IPC (Internet Protocol Camera)) or a video recording storage device (such as an NVR (Network Video Recorder)); the search device 120 may be an NVR (with a target search function) or a device deployed in a video surveillance system dedicated to target search. When the video source device 110 is an NVR, the video source device 110 and the search device 120 may be the same device.

In addition, one video source device 110 may provide video data for multiple search devices 120, and one search device 120 may also obtain video data from multiple video source devices 110 (a one-to-one arrangement is taken as an example in the figure). When the video source device 110 is an IPC, one video source device 110 may correspond to one video data channel; when the video source device 110 is a video recording storage device such as an NVR, one video source device 110 may provide video data (recording data) of multiple video data channels.

In order to make the foregoing objects, features, and advantages of the embodiments of the present application more comprehensible, the technical solutions in the embodiments of the present application will be further described in detail below with reference to the accompanying drawings.

Please refer to FIG. 2, which is a schematic flowchart of a video abstract generating method according to an embodiment of the present application. The video abstract generating method may be applied to a search device (taking an NVR as an example). As shown in FIG. 2, the video summary generating method may include the following steps.

Step S200: Receive a target search request, where the target search request carries characteristic information of a target to be searched.

In the embodiment of the present application, the target may include, but is not limited to, a human face, a vehicle, or a license plate. When the target to be searched is a human face, the feature information of the target to be searched may include, but is not limited to, one or more of a face picture and structured information of the face (such as whether the person is smiling, whether the person wears glasses, gender, age range, etc.). When the target to be searched is a vehicle, the feature information of the target to be searched may include, but is not limited to, one or more of a vehicle picture and characteristic information of the vehicle (such as color, type, logo, brand, etc.). When the target to be searched is a license plate, the feature information of the target to be searched may include, but is not limited to, one or more of a license plate picture and characteristic information of the license plate (such as color, position, license plate number, etc.).

In the embodiment of the present application, the target may further include a human body, an animal, and the like. When the target to be searched is a human body, the feature information of the target to be searched may include, but is not limited to, one or more of a human body picture and human body characteristic information (such as height, weight, gender, skin color, clothing, etc.). When the target to be searched is an animal, the feature information of the target to be searched may include, but is not limited to, one or more of an animal picture and animal characteristic information (such as type, hair color, size, etc.).
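Since a request may carry a picture, structured attributes, or both, a search request can be modeled as a small record with a validity check. The sketch below is purely illustrative: the class name, field names, and target type strings are hypothetical and are not defined by the application.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TargetSearchRequest:
    """Hypothetical shape of a target search request."""
    target_type: str                                 # e.g. "face", "vehicle", "license_plate"
    picture: Optional[bytes] = None                  # picture of the target to be searched
    attributes: dict = field(default_factory=dict)   # structured or characteristic information

    def validate(self) -> None:
        # The request must carry at least one kind of feature information.
        if self.picture is None and not self.attributes:
            raise ValueError("request must carry a picture, attributes, or both")
```

A face search by structured information alone could then be expressed as `TargetSearchRequest("face", attributes={"glasses": True, "gender": "female"})`.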

Step S210: Search for a first target picture that matches the feature information of the target to be searched.

In the embodiment of the present application, when receiving a target search request, the search device may search for a first target picture in the video data of the video source device that matches the characteristic information of the target to be searched for according to the characteristic information of the target to be searched. At the same time, the first target picture and the acquisition time of the first target picture may be used as the first target picture information that matches the feature information of the target to be searched.

In the embodiment of the present application, when the search device searches for the first target picture that matches the feature information of the target to be searched, the search result may be one or more first target pictures, or the search result may be empty. When the search result is empty, that is, when no first target picture matching the feature information of the target to be searched is found, the search device may determine that the target search fails, and return a search failure response message.
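The empty-result case can be handled where the search response is assembled. A minimal sketch, in which the function name and the response fields `status` and `pictures` are hypothetical:

```python
def build_search_response(matches):
    """Wrap the search result; an empty result is reported as a failure."""
    if not matches:
        # No first target picture matched: the target search fails.
        return {"status": "failed", "pictures": []}
    return {"status": "ok", "pictures": list(matches)}
```

Returning an explicit failure response lets the caller stop before attempting to generate a video summary from zero pictures.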

In the embodiment of the present application, the acquisition time of the target picture may be the time when the front-end video capture device collects (e.g., captures) the target picture, or the time when the target picture appears in the video image collected by the front-end video capture device.

It should be noted that, in the embodiment of the present application, the acquisition time of the target picture may be carried in the target picture (for example, the acquisition time may be displayed at a specific position in the target picture, such as the lower left corner or the lower right corner), or the acquisition time of the target picture may be stored independently of the target picture. The specific implementation is not described herein.

Step S220: Generate a video summary according to the first target picture and the acquisition time corresponding to the first target picture.

In the embodiment of the present application, after the search device determines the first target picture that matches the feature information of the target to be searched, it can generate a video summary of the target to be searched according to the first target picture and its corresponding acquisition time.

In one embodiment of the present application, generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: sorting the multiple first target pictures in order of acquisition time from earliest to latest; and generating the video summary from the sorted first target pictures. In this embodiment, the search device may directly sort the first target pictures in order of acquisition time from earliest to latest to generate the video summary.

For example, when the number of first target pictures found by the search device exceeds a preset number threshold (which can be set according to actual needs, such as 200 or 500), and, after the pictures are sorted in order of acquisition time from earliest to latest, the difference between the acquisition times of any two adjacent first target pictures does not exceed a preset time threshold (which can be set according to actual needs, such as 1 second or 2 seconds), the search device may generate the video summary directly from the sorted first target pictures.
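
As a non-limiting illustration, the sorting step and the example condition above can be sketched in Python as follows. The `TargetPicture` type, its field names, the function names, and the default threshold values are assumptions introduced for this sketch only and are not taken from the present application.

```python
from dataclasses import dataclass


@dataclass
class TargetPicture:
    picture_id: str          # hypothetical identifier of the picture
    acquisition_time: float  # acquisition time in seconds (assumed representation)


def sort_by_acquisition_time(pictures):
    """Return the pictures ordered from earliest to latest acquisition time."""
    return sorted(pictures, key=lambda p: p.acquisition_time)


def can_build_summary_directly(pictures, count_threshold=200, gap_threshold=1.0):
    """Check the example condition: the picture count exceeds the number
    threshold and no gap between adjacent sorted pictures exceeds the
    time threshold."""
    ordered = sort_by_acquisition_time(pictures)
    if len(ordered) <= count_threshold:
        return False
    gaps = (b.acquisition_time - a.acquisition_time
            for a, b in zip(ordered, ordered[1:]))
    return all(gap <= gap_threshold for gap in gaps)
```

When the check returns true, the sorted pictures themselves may serve as the frames of the video summary in this sketch.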

In another embodiment of the present application, generating a video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: for each first target picture, determining a target video clip corresponding to the first target picture, where the target video clip is the video data from the n-th second before the acquisition time corresponding to the first target picture to the m-th second after that acquisition time; and generating the video summary according to each target video clip.

In the embodiment of the present application, when the search device finds multiple first target pictures that match the feature information of the target to be searched, for each first target picture the search device may determine the recording data between the n-th second before and the m-th second after the acquisition time of the first target picture as the target video clip corresponding to that first target picture, so as to obtain the video recordings that match the feature information of the target to be searched. The acquisition time of the first target picture is the time at which the first target picture appears in the video data from which it was extracted (that is, the time when the first target picture was collected by the video capture device).

In the embodiment of the present application, the start time point of the target video clip corresponding to the first target picture (the n-th second before the acquisition time of the first target picture) and the end time point (the m-th second after the acquisition time of the first target picture) may be pre-configured in the search device or carried in the target search request (set by the user according to actual needs, or left at default values). Here, n and m are non-negative numbers.

It should be noted that when the video data provided by the video source device to the search device includes video data of multiple video data channels, the search device, when determining the target video clip corresponding to the first target picture, may first determine the video data channel to which the first target picture belongs, and then determine the target video clip corresponding to the first target picture in the video data of that video data channel. That is, among the recording data of that video data channel, the recording data between the n-th second before and the m-th second after the acquisition time of the first target picture is determined as the target video clip corresponding to the first target picture.

Accordingly, in this case, the first target picture information may further include a channel number of the first target picture, that is, a channel number of a video data channel to which the first target picture belongs.
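
As a non-limiting illustration, the determination of a target video clip from a channel number and an acquisition time can be sketched in Python as follows. The `Clip` type, the function name `target_clip`, and the default values of `n` and `m` are assumptions made for this sketch only.

```python
from typing import NamedTuple


class Clip(NamedTuple):
    channel: int   # channel number of the video data channel
    start: float   # start time point, in seconds
    end: float     # end time point, in seconds


def target_clip(channel, acquisition_time, n=5.0, m=5.0):
    """Return the clip running from n seconds before to m seconds after
    the acquisition time, on the channel the picture belongs to."""
    if n < 0 or m < 0:
        raise ValueError("n and m must be non-negative")
    return Clip(channel, acquisition_time - n, acquisition_time + m)
```

In this sketch, `n` and `m` could equally be read from a configuration of the search device or from the target search request.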

In this embodiment, after the search device determines the target video clips corresponding to the first target pictures that match the feature information of the target to be searched, the search device may fuse the target video clips, that is, sort and splice the target video clips according to the start time and/or end time of each target video clip, to generate a video summary of the target to be searched.
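
As a non-limiting illustration, the ordering used for splicing can be sketched in Python as follows. Representing each clip as a dictionary with `start` and `end` keys, and breaking ties on the start time by the end time, are assumptions of this sketch; the actual concatenation of video data is not shown.

```python
def splice_order(clips):
    """Order clips by start time, breaking ties by end time."""
    return sorted(clips, key=lambda clip: (clip["start"], clip["end"]))


def summary_duration(clips):
    """Total playback length, in seconds, of the clips when spliced end to end."""
    return sum(clip["end"] - clip["start"] for clip in clips)
```

Overlapping clips are not merged in this sketch, so their shared portion would be counted and played once per clip.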

It can be seen that, in this embodiment, the first target pictures that match the feature information of the target to be searched carried in the target search request are searched for, and the recording data from the n-th second before to the m-th second after the acquisition time corresponding to each first target picture is determined as the target video clip corresponding to that first target picture; the video summary of the target to be searched is then obtained by fusing the target video clips. Therefore, the efficiency and accuracy of locating targets in video recordings are improved, and the consistency of target video tracking is ensured while video recordings that do not match the target to be searched are removed.

After the search device generates the video summary of the target to be searched, the video summary may be further played back. That is, the video summary generated from the target video clips can be decoded and displayed.

Further, in the embodiment of the present application, if the matching first target picture were extracted from the video data provided by the video source device each time a target search is performed, the target search would take a long time and search efficiency would be low. To improve the efficiency of target search, target picture information may be extracted in advance from the video data provided by the video source device and saved to a picture information database. When a target search is needed, the matching first target picture is searched for directly in the picture information database.

Correspondingly, in one embodiment of the present application, before searching for the first target picture that matches the feature information of the target to be searched, the method may further include: obtaining target picture information in the video data of the video source device, where the target picture information includes the target picture, the acquisition time of the target picture, and the attribute information of the target picture; and saving the obtained target picture information to the picture information database.

In this embodiment, the search device may obtain the target picture information in the video data of the video source device in advance, and save the obtained target picture information to the picture information database. When a target search is required, the search device may then search the picture information database directly for a matching first target picture according to the feature information of the target to be searched, which improves target search efficiency. The target picture information may include, but is not limited to, the target picture, feature information of the target picture, and the acquisition time of the target picture. The picture information database may be a specified storage space in the search device, or it may be a third-party database.

In one embodiment of the present application, acquiring the target picture information in the video data of the video source device may include: receiving the target picture information sent by the video source device.

In this embodiment, when the video source device has a target picture acquisition function (such as a target picture capture function or a target detection function) and a target picture analysis function, the video source device can directly obtain target picture information such as the target picture, the acquisition time of the target picture, and the attribute information of the target picture, and send the target picture information to the search device. The search device can receive the target picture information sent by the video source device.

For example, assuming that the video source device is an IPC with a target picture capture function and a target picture analysis function, the video source device can capture the target picture (and record the capture time of the target picture, that is, the acquisition time), and analyze the captured target picture to extract the attribute information of the target picture. The video source device can then send target picture information such as the target picture, the acquisition time of the target picture, and the attribute information of the target picture to a search device, such as an NVR, and the search device stores the received target picture information.

In another embodiment of the present application, obtaining the target picture information in the video data of the video source device may include: receiving the target picture and the acquisition time of the target picture sent by the video source device; and modeling the target picture and extracting the attribute information of the target picture.

In this embodiment, when the video source device has a target picture acquisition function (such as a target picture capture function or a target detection function), the video source device can obtain the target picture and the acquisition time of the target picture, and send the target picture and the acquisition time of the target picture to the search device. When the search device receives the target picture and the acquisition time of the target picture sent by the video source device, it can model the target picture and extract the attribute information of the target picture. Therefore, the search device can obtain the target picture, the acquisition time of the target picture, and the attribute information of the target picture in the video data of the video source device.

For example, assuming that the video source device is an IPC with a target picture capture function, the video source device can capture the target picture (and record the capture time of the target picture, that is, the acquisition time), and send the captured target picture and its acquisition time to a search device such as an NVR. When the search device receives the target picture, it can model the target picture and extract the attribute information of the target picture. The search device can then store the target picture, the acquisition time of the target picture, and the attribute information of the target picture.

In still another embodiment of the present application, obtaining target picture information in the video data of the video source device may include: receiving the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device; modeling the target picture and extracting second attribute information of the target picture; and determining the attribute information of the target picture according to the first attribute information of the target picture and the second attribute information of the target picture.

In this embodiment, when the video source device has a target picture acquisition function (such as a target picture capture function or a target detection function) and a target picture analysis function, the video source device can directly obtain the target picture, the acquisition time of the target picture, and the attribute information of the target picture (referred to herein as the first attribute information of the target picture), and send the target picture, the acquisition time of the target picture, and the first attribute information of the target picture to the search device. When the search device receives the target picture sent by the video source device, it can model the received target picture and extract the attribute information of the target picture (referred to herein as the second attribute information of the target picture).

For any target picture, the search device may compare the first attribute information of the target picture with the second attribute information of the target picture. Attribute information that exists in the first attribute information but not in the second attribute information, or that exists in the second attribute information but not in the first attribute information, is added to the attribute information of the target picture. For attribute information that exists in both the first attribute information and the second attribute information, the attribute information in the second attribute information is added to the attribute information of the target picture. The attribute information of the target picture is thereby obtained.

It should be noted that, in this embodiment, the search device may also directly use the second attribute information of the target picture as the attribute information of the target picture.
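
As a non-limiting illustration, the combination of the first attribute information and the second attribute information can be sketched in Python as follows. Representing attribute information as a dictionary of attribute names to values, and the function name `merge_attributes`, are assumptions of this sketch.

```python
def merge_attributes(first, second):
    """Combine two attribute dictionaries for the same target picture.

    Attributes present in only one dictionary are kept as they are.
    Attributes present in both take the value from `second`, i.e. the
    attribute information extracted by the search device.
    """
    merged = dict(first)    # copy so that the inputs are left unchanged
    merged.update(second)   # second attribute information wins on overlap
    return merged
```

The alternative noted above, in which the second attribute information is used directly, corresponds to returning a copy of `second` alone.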

In still another embodiment of the present application, obtaining the target picture, the acquisition time of the target picture, and the attribute information of the target picture in the video data of the video source device may include: performing target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and modeling the target picture and extracting the attribute information of the target picture.

In this embodiment, when the video source device does not have a target picture acquisition function, or the video source device and the search device are the same device (such as an NVR), the search device may directly perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture. After the search device obtains the target picture in the video data, it can further model the target picture and extract the attribute information of the target picture.

For example, assuming that the video source device is an IPC that does not have a target picture capture function, the video source device may send the acquired video data to a search device, such as an NVR. When the search device receives the video data sent by the video source device, it can perform target detection on the received video data to obtain the target picture in the video data and the acquisition time of the target picture (the time when the target picture appears in the video data), and model the target picture to extract the attribute information of the target picture. The search device can then store the target picture, the acquisition time of the target picture, and the attribute information of the target picture.

In the embodiment of the present application, after the search device obtains the target picture, the acquisition time of the target picture, and the attribute information of the target picture in the video data of the video source device, the search device may store the obtained target picture, the acquisition time of the target picture, and the attribute information of the target picture.

Further, in the embodiment of the present application, in order to reduce the storage of redundant picture information, for multiple pieces of target picture information that contain the same target and are detected in the video data of the same scene, if the acquisition times of the target pictures in those pieces of target picture information differ only slightly, only one piece of the target picture information may be saved to the picture information database.

Correspondingly, in one of the embodiments of the present application, the above-mentioned saving of the target picture information to the picture information database may include: for any target picture information of any video data channel, judging whether the picture information database already stores other target picture information that contains the same target as the target picture information, where the other target picture information and the target picture information are from the same video data channel, and the difference between the acquisition time included in the other target picture information and the acquisition time included in the target picture information is less than a preset time threshold; and when no such other target picture information exists in the picture information database, saving the target picture information to the picture information database.

In this embodiment, it is considered that the video data in a video data channel (such as the video data obtained by one IPC) is usually video data of a fixed scene. Therefore, for any target picture information of any video data channel (whether obtained by the search device performing target detection on the video data provided by the video source device, or received by the search device from the video source device), before saving the target picture information to the picture information database, the search device may determine whether the picture information database stores other target picture information that meets all of the following conditions: it contains the same target as the target picture information; it is from the same video data channel; and the difference between its acquisition time and the acquisition time included in the target picture information is less than a preset time threshold.

When the search device determines that there is no other target picture information in the picture information database that meets the above conditions, the search device may save the target picture information to the picture information database. When the search device determines that there is other target picture information in the picture information database that meets the above conditions, it refuses to save the target picture information to the picture information database, such as discarding the target picture information directly to reduce redundant picture storage. Therefore, the workload of searching the first target picture in the picture information database can be reduced, and the search efficiency can be improved.
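
As a non-limiting illustration, the redundancy check performed before saving can be sketched in Python as follows. The `TargetPictureInfo` type, the `target_id` field used to decide whether two pictures contain the same target, the in-memory list standing in for the picture information database, and the default threshold are all assumptions of this sketch; how the same target is recognized is not specified here.

```python
from dataclasses import dataclass


@dataclass
class TargetPictureInfo:
    target_id: str           # hypothetical identity of the detected target
    channel: int             # channel number of the video data channel
    acquisition_time: float  # acquisition time in seconds


def save_if_not_redundant(library, info, time_threshold=2.0):
    """Append `info` to `library` unless an entry with the same target,
    the same channel, and an acquisition time closer than the threshold
    is already stored. Return True if the entry was saved."""
    for other in library:
        if (other.target_id == info.target_id
                and other.channel == info.channel
                and abs(other.acquisition_time - info.acquisition_time) < time_threshold):
            return False  # redundant entry: discard it
    library.append(info)
    return True
```

A linear scan is used here for clarity; a real picture information database would more likely index entries by channel and time.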

In the embodiment of the present application, the feature information of the target to be searched may include a target picture to be searched and/or attribute information of the target picture to be searched.

In one embodiment of the present application, the feature information of the target to be searched includes the target picture to be searched; correspondingly, searching for the first target picture that matches the feature information of the target to be searched includes: modeling the target picture to be searched and extracting the attribute information of the target picture to be searched; and searching for a matching first target picture in the picture information database according to the attribute information of the target picture to be searched.

In this embodiment, the search device may provide a target search function that, according to the target picture to be searched carried in the received target search request, searches for a matching target picture in a search-by-image mode.

For example, the search device may provide a target search request interface that includes an area for inputting and/or selecting a target picture to be searched. A user inputs and/or selects a target picture to be searched in the target search request interface and submits a target search request.

When the search device receives the target search request, it models the target picture to be searched and extracts the attribute information of the target picture to be searched. The search device can then query the stored target picture information according to the attribute information of the target picture to be searched, and determine, as the first target picture information, the target picture information whose attribute information matches the attribute information of the target picture to be searched.

For example, suppose the target is a human face and the feature information of the target to be searched is a face picture. When the search device receives the target search request, it can search for a matching first target face picture in the search-by-image mode. That is, the face picture carried in the target search request is modeled to obtain a feature model of the face picture; a face picture in the video data of the video source device whose similarity to the feature model of the face picture is greater than or equal to a preset similarity threshold is then determined as a first target face picture, and the first target face picture together with its acquisition time is used as the first target face picture information.

The similarity threshold may be configured in a search device in advance, or may be carried in a target search request (can be set by a user according to actual needs or a default value is used).
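
As a non-limiting illustration, matching against a similarity threshold can be sketched in Python as follows. The present application does not specify the form of the feature model or the similarity measure; representing the feature model as a list of numbers and using cosine similarity are assumptions of this sketch, as are the function names and the default threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_by_picture(query_model, library, similarity_threshold=0.8):
    """Return (picture_id, acquisition_time) for every stored entry whose
    feature model is at least `similarity_threshold` similar to the query.

    `library` is an iterable of (picture_id, acquisition_time, feature_model).
    An empty list corresponds to a failed target search.
    """
    return [(picture_id, acquisition_time)
            for picture_id, acquisition_time, model in library
            if cosine_similarity(query_model, model) >= similarity_threshold]
```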

In one embodiment of the present application, the feature information of the target to be searched includes the target picture to be searched and third attribute information of the target picture to be searched; correspondingly, searching for a first target picture that matches the feature information of the target to be searched includes: modeling the target picture to be searched and extracting fourth attribute information of the target picture to be searched; determining the attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information of the target picture to be searched; and searching for a matching first target picture in the picture information database according to the attribute information of the target picture to be searched.

In this embodiment, when the feature information of the target to be searched includes the target picture to be searched and the third attribute information of the target picture to be searched, the search device may further model the target picture to be searched and extract the attribute information of the target picture to be searched (referred to herein as the fourth attribute information of the target picture to be searched).

After the search device obtains the fourth attribute information of the target picture to be searched, it may determine the attribute information of the target picture to be searched according to the third attribute information of the target picture to be searched and the fourth attribute information of the target picture to be searched.

For example, the search device may compare the third attribute information of the target picture to be searched with the fourth attribute information of the target picture to be searched. Attribute information that exists in the third attribute information but not in the fourth attribute information, or that exists in the fourth attribute information but not in the third attribute information, is added to the attribute information of the target picture to be searched. For attribute information that exists in both the third attribute information and the fourth attribute information, the attribute information in the fourth attribute information is added to the attribute information of the target picture to be searched. The attribute information of the target picture to be searched is thereby obtained.

When the search device obtains the attribute information of the target picture to be searched, it can query the target picture information in the picture information database according to that attribute information, and determine, as the first target picture information, the target picture information whose attribute information matches the attribute information of the target picture to be searched.

In one embodiment of the present application, the feature information of the target to be searched includes attribute information of the target to be searched. Correspondingly, searching for a first target picture that matches the feature information of the target to be searched includes: searching for the matching first target picture in the picture information database according to the attribute information of the target to be searched.

In the embodiment of the present application, the feature information of the target to be searched may also be attribute information of the target picture to be searched. When the search device receives the target search request, it may search the picture information database directly for a matching first target picture according to the attribute information of the target picture to be searched carried in the target search request.

Further, in the embodiment of the present application, in order to further improve the search efficiency of the first target picture, the target search request may also carry a specific filtering attribute, which instructs the search device to first filter the target picture information and then search for the first target picture. The specific filtering attribute may include, but is not limited to, a search time range and/or a search channel number.

Correspondingly, in one of the embodiments of the present application, the target search request also carries a search time range; searching for the first target picture matching the feature information of the target to be searched may include: filtering the target pictures in the picture information database according to the search time range to obtain second target pictures whose acquisition time is within the search time range; and searching the second target pictures for a matching first target picture according to the feature information of the target to be searched.

In this embodiment, when the search device receives the target search request, the search device may first filter the target pictures in the picture information database according to the search time range carried in the target search request to obtain the second target pictures whose acquisition time is within the search time range. For example, assuming that the search time range is [t1, t2] (t2 > t1), a second target picture is a target picture whose acquisition time t satisfies t1 ≤ t ≤ t2.

In this embodiment, when the search device obtains the second target picture, it may search for a matching first target picture from the second target picture according to the feature information of the target to be searched.
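
As a non-limiting illustration, filtering by search time range before matching can be sketched in Python as follows. Representing each stored entry as a dictionary with an `acquisition_time` key, and passing the matching step as a caller-supplied predicate `matches`, are assumptions of this sketch.

```python
def filter_by_time_range(pictures, t1, t2):
    """Keep the pictures whose acquisition time t satisfies t1 <= t <= t2."""
    if t2 < t1:
        raise ValueError("t2 must not be earlier than t1")
    return [p for p in pictures if t1 <= p["acquisition_time"] <= t2]


def search_in_time_range(pictures, t1, t2, matches):
    """Filter by time range first, then apply the matching predicate."""
    second_target_pictures = filter_by_time_range(pictures, t1, t2)
    return [p for p in second_target_pictures if matches(p)]
```

Filtering first means the comparatively expensive matching step runs only on the second target pictures.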

In another embodiment of the present application, the target search request also carries a search channel number, and the target picture information further includes a channel number of the target picture (that is, the channel number of the video data channel to which the target picture belongs); searching for the first target picture that matches the feature information of the target to be searched includes: filtering the target picture information in the picture information database according to the search channel number to obtain third target pictures whose channel number is the same as the search channel number; and searching the third target pictures for a matching first target picture according to the feature information of the target to be searched.

In this embodiment, before searching for the first target picture, the search device may first filter the target pictures in the picture information database according to the search channel number and the channel number of each target picture in the picture information database to obtain the third target pictures whose channel number is the same as the search channel number, and then search for the first target picture among the third target pictures.
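
As a non-limiting illustration, filtering by search channel number before matching can be sketched in Python as follows. Representing each stored entry as a dictionary with a `channel` key, and passing the matching step as a caller-supplied predicate `matches`, are assumptions of this sketch.

```python
def filter_by_channel(pictures, search_channel):
    """Keep the pictures whose channel number equals the search channel number."""
    return [p for p in pictures if p["channel"] == search_channel]


def search_in_channel(pictures, search_channel, matches):
    """Filter by channel number first, then apply the matching predicate."""
    third_target_pictures = filter_by_channel(pictures, search_channel)
    return [p for p in third_target_pictures if matches(p)]
```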

Further, in the embodiment of the present application, it is considered that when there is video data of multiple video data channels and the scenes covered by the multiple video data channels overlap (for example, the monitoring fields of view of multiple IPCs overlap in some area), the same target may appear in the video data of multiple video data channels at the same point in time. In this case, in order to avoid excessively similar video data in the video summary, the first target pictures may be deduplicated.

Correspondingly, in one embodiment of the present application, for any first target picture, determining a target video clip corresponding to the first target picture may include: when there are multiple first target pictures with the same acquisition time, for any one of the multiple first target pictures, determining a start time point and an end time point of a video clip corresponding to the first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture, and the end time point is the m-th second after the acquisition time corresponding to the first target picture; searching the recording data of the video data channel to which the first target picture belongs for whether an I frame exists at the start time point and whether an I frame exists at the end time point; and if both I frames exist, discarding the remaining first target pictures among the multiple first target pictures, and determining the video clip corresponding to the first target picture as the target video clip.

In this embodiment, when there are multiple first target pictures with the same acquisition time, the search device may determine, according to a preset policy, the start time point and the end time point of the video clip corresponding to each of the multiple first target pictures. For any first target picture among the multiple first target pictures, the start time point of the corresponding video clip is the n-th second before the acquisition time corresponding to the first target picture, and the end time point is the m-th second after that acquisition time.

After the search device determines the start time point and the end time point, it can search the recording data of the video data channel to which the first target picture belongs for whether an I frame exists at the start time point (that is, whether the video data channel has a key frame at the start time point) and whether an I frame exists at the end time point (that is, whether the video data channel has a key frame at the end time point). If both I frames exist, the search device may directly determine the video clip corresponding to the first target picture as the target video clip, and discard the remaining first target pictures among the multiple first target pictures.

Further, in this embodiment, when, for the multiple first target pictures, the I frame at the start time point (the n-th second before the acquisition time of the first target picture) and/or the I frame at the end time point (the m-th second after the acquisition time of the first target picture) does not exist, the following applies to any first target picture among them. If no I frame exists at the start time point in the recording data of the video data channel to which the first target picture belongs, the search device may increase the start time point of the video clip corresponding to the first target picture by x seconds and search for an I frame at the new start time point. If none exists, it increases the start time point by x seconds again and searches again, repeating this operation until an I frame is found at the (updated) start time point in the recording data of that video data channel, or until the start time point of the video clip equals the acquisition time of the first target picture. Here, x is a positive number.

Similarly, if no I frame exists at the end time point in the recording data of the video data channel to which the first target picture belongs, the search device may decrease the end time point of the video clip corresponding to the first target picture by x seconds and search for an I frame at the new end time point. If none exists, it decreases the end time point by x seconds again and searches again, repeating this operation until an I frame is found at the (updated) end time point in the recording data of that video data channel, or until the end time point of the video clip equals the acquisition time of the first target picture.

For example, when the acquisition time of the first target picture is 1 minute 0 seconds, the start time point may be 0 minutes 58 seconds and the end time point may be 1 minute 3 seconds. The search device checks whether the recording data of the video data channel to which the first target picture belongs has an I frame at 0 minutes 58 seconds. If not, the start time point is increased by 1 second to obtain a new start time point, 0 minutes 59 seconds, and the search device checks whether an I frame exists at 0 minutes 59 seconds. If there is still no I frame, the start time point is increased by 1 second again to obtain a new start time point, 1 minute 0 seconds. Since the start time point now equals the acquisition time, no further check is performed and the acquisition time is used as the start time point. Similarly, the search device checks whether an I frame exists at 1 minute 3 seconds in the recording data of the video data channel to which the first target picture belongs. If not, the end time point is decreased by 1 second to obtain a new end time point, 1 minute 2 seconds, and the search device checks whether an I frame exists at 1 minute 2 seconds. If an I frame exists, the recording data between 1 minute 0 seconds and 1 minute 2 seconds is used as the video clip corresponding to the first target picture. If there is still no I frame, the end time point is decreased by 1 second again and the check is repeated, until the new end time point is 1 minute 0 seconds. In that case, the recording data at 1 minute 0 seconds is used as the video clip corresponding to the first target picture.
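The adjustment described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the application: the function name `snap_clip_bounds`, the callback `has_iframe` (which stands in for the search of the channel's recording data), and the use of whole seconds are all assumptions made for the example.

```python
from typing import Callable, Tuple


def snap_clip_bounds(
    acquisition: int,
    n: int,
    m: int,
    has_iframe: Callable[[int], bool],
    step: int = 1,
) -> Tuple[int, int]:
    """Return (start, end) in seconds for the clip around `acquisition`.

    `has_iframe(t)` is a hypothetical lookup that reports whether the
    channel's recording data has an I frame at second `t`.
    """
    start = acquisition - n
    end = acquisition + m
    # Move the start forward by `step` until an I frame is found
    # or the acquisition time is reached.
    while start < acquisition and not has_iframe(start):
        start = min(start + step, acquisition)
    # Move the end backward by `step` in the same way.
    while end > acquisition and not has_iframe(end):
        end = max(end - step, acquisition)
    return start, end


# Worked example from the text: acquisition at 1:00 (60 s), n = 2, m = 3,
# assuming the only I frame in range is at 1:02 (62 s).
iframes = {62}
bounds = snap_clip_bounds(60, 2, 3, lambda t: t in iframes)
```

With the assumed I frame at 62 seconds, `bounds` is `(60, 62)`, matching the clip from 1 minute 0 seconds to 1 minute 2 seconds in the example.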

After the search device determines, in the foregoing manner, the start time points and end time points of the video clips corresponding to the multiple first target pictures, it can identify the first target picture whose corresponding video clip has the longest duration (that is, the largest difference between the end time point and the start time point), determine the video clip corresponding to that first target picture as the target video clip, and discard the remaining first target pictures among the multiple first target pictures.

It should be noted that, in this embodiment, when more than one of the multiple first target pictures corresponds to a video clip of the longest duration, one first target picture may be selected according to a preset strategy, and the video clip corresponding to the selected first target picture is determined as the target video clip. For example, the first target picture with the earliest start time point or the latest end time point may be selected, or one may be selected at random.
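As an illustration of the selection rule, the following Python sketch keeps the candidate with the longest clip and breaks ties by the earliest start time point, which is one of the strategies mentioned above. The `Candidate` class and its field names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    picture_id: str   # hypothetical identifier of a first target picture
    channel: int      # video data channel the picture belongs to
    start: int        # clip start time point, in seconds
    end: int          # clip end time point, in seconds

    @property
    def duration(self) -> int:
        return self.end - self.start


def pick_longest(candidates: List[Candidate]) -> Candidate:
    """Keep the candidate with the longest clip; ties go to the earliest start."""
    return max(candidates, key=lambda c: (c.duration, -c.start))


# Three pictures with the same acquisition time on different channels.
same_time = [
    Candidate("a", channel=1, start=60, end=62),
    Candidate("b", channel=2, start=58, end=62),
    Candidate("c", channel=3, start=59, end=63),
]
kept = pick_longest(same_time)
discarded = [c for c in same_time if c is not kept]
```

Here candidates "b" and "c" both have a 4-second clip, so the tie-break on the earliest start time point retains "b".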

In addition, in the embodiment of the present application, when there are multiple first target pictures with the same acquisition time, deduplication processing of the multiple first target pictures may also be implemented manually. For example, the search device may display the plurality of first target pictures in a specified interface, the user selects the first target picture to be retained, and discards the remaining first target pictures with the same acquisition time.

Further, in the embodiment of the present application, when there are multiple first target pictures, the time periods of the target video clips corresponding to the first target pictures may overlap, which affects the normal playback of the video summary. Therefore, in order to optimize the playback effect of the video summary, time deduplication may be performed on the target video clips.

Correspondingly, in one embodiment of the present application, generating a video summary according to each target video clip may include: filtering the target video clips according to the start time point and the end time point of each target video clip to remove recording data with duplicated time; and generating the video summary based on the filtered target video clips.

In this embodiment, after the search device determines the target video clip corresponding to each first target picture, it can filter the target video clips according to the start time point and the end time point of each target video clip to remove recording data with duplicated time.

In an example, filtering the target video clips according to the start time point and the end time point of each target video clip includes: sorting the target video clips according to the start time point of each target video clip; and, for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merging the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip and the end time point of the merged video clip is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, using the end time point of the first target video clip as the start time point of the second target video clip, or using the start time point of the second target video clip as the end time point of the first target video clip.

Specifically, in this example, the search device may sort each target video clip according to a start time point of each target video clip. For example, the search device may sort each target video clip by using a bubble sorting method according to the start time point of each target video clip.

For two adjacent target video clips (referred to herein as the first target video clip and the second target video clip, where the start time point of the first target video clip is less than the start time point of the second target video clip), when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip, corresponding processing may be performed according to the video data channels to which the first target video clip and the second target video clip belong.

If the first target video clip and the second target video clip belong to the same video data channel, the search device may merge the first target video clip and the second target video clip, and the start time point of the merged video clip is the first target. The start time point of the video clip, and the end time is the end time point of the second target video clip.

If the first target video clip and the second target video clip belong to different video data channels, the search device may use the end time point of the first target video clip as the start time point of the second target video clip, or it may use the start time point of the second target video clip as the end time point of the first target video clip. The specific implementation is described below in conjunction with a specific example.

In order to enable those skilled in the art to better understand the technical solutions provided by the embodiments of the present application, the technical solutions provided by the embodiments of the present application are described below with reference to specific examples and drawings.

In this embodiment, the search device is an NVR, the video source device is an IPC, the target is a face, and the target search is a face picture search. The specific implementation process of the video summary generation scheme is as follows.

1. Face recognition and picture storage

If the NVR is connected to an ordinary IPC, the NVR can use structured video analysis on the GPU (Graphics Processing Unit) to perform face recognition on the real-time video stream transmitted by the IPC and obtain the face picture information in the real-time video stream. If the NVR is connected to a professional face capture IPC, the NVR can receive the face picture information from the IPC. The face picture information may include, but is not limited to, a face picture, structured information of the face picture, the acquisition time of the face picture, the channel number of the face picture, and the channel name of the face picture (used to record address information).

The NVR maintains a buffer for each video data channel (referred to as a channel) between the IPC and the NVR, which stores the face picture information received within the last 3 seconds. Once face picture information has been stored in the buffer for 3 seconds, the NVR deletes it from the buffer.

For any channel, when the NVR obtains new face picture information from the channel (either received directly from the IPC or obtained through face detection), it compares the new face picture information with the face picture information in the buffer corresponding to the channel. If face picture information of the same face already exists in the buffer, the newly obtained face picture information is discarded; otherwise, the newly obtained face picture information is added to the buffer and saved to the picture information database.

It should be noted that if the NVR obtains new face picture information and determines that no face picture information of the same face exists in the buffer, but the buffer has no free space, the NVR can overwrite the earliest stored face picture information in the buffer with the newly obtained face picture information, and save the newly obtained face picture information to the picture information database.
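The per-channel buffer behaviour can be sketched as follows. This is only an illustration under stated assumptions: the class `ChannelFaceBuffer`, its `capacity` default, and the string `face_key` are hypothetical. In particular, `face_key` equality stands in for the "same face" comparison, which in practice would be a similarity comparison between face models.

```python
from collections import OrderedDict


class ChannelFaceBuffer:
    """Per-channel buffer of recently seen faces (illustrative sketch)."""

    def __init__(self, capacity: int = 8, ttl: float = 3.0) -> None:
        self.capacity = capacity   # hypothetical buffer size
        self.ttl = ttl             # retention period in seconds
        self._entries: "OrderedDict[str, float]" = OrderedDict()

    def _expire(self, now: float) -> None:
        # Entries are kept in insertion order, so the oldest is at the front.
        while self._entries:
            _, stored_at = next(iter(self._entries.items()))
            if now - stored_at < self.ttl:
                break
            self._entries.popitem(last=False)

    def offer(self, face_key: str, now: float) -> bool:
        """Return True if the face should be saved to the picture database."""
        self._expire(now)
        if face_key in self._entries:
            return False           # same face already buffered: discard
        if len(self._entries) >= self.capacity:
            self._entries.popitem(last=False)   # overwrite the earliest entry
        self._entries[face_key] = now
        return True


buf = ChannelFaceBuffer(capacity=2)
results = [
    buf.offer("alice", 0.0),   # new face: saved
    buf.offer("alice", 1.0),   # duplicate within 3 s: discarded
    buf.offer("bob", 1.5),     # new face: saved
    buf.offer("carol", 2.0),   # buffer full: replaces the earliest entry
]
```

One buffer instance would be kept per channel; a face that reappears after its entry has expired or been overwritten is saved again.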

The flow diagram of the face recognition and picture storage performed by the NVR can be shown in FIG. 3.

2. Video fusion parameter configuration

The configurable recording fusion parameters may include, but are not limited to, the following parameters.

Channel selection: if a single channel is selected, face picture information search and video fusion are performed on that single channel; if multiple specified channels are checked, face picture information search and video fusion are performed across those channels.

Search time range: Represents the time range information of the face image to be searched for.

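Similarity threshold: only results whose similarity is greater than or equal to the threshold are returned. For example, if the threshold is 90%, results with a similarity of 90% or above are returned.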

Video fusion time period: the range from n seconds before to m seconds after the face picture acquisition time. For example, configuring the first 2 seconds (that is, n = 2) and the last 3 seconds (that is, m = 3) indicates that the video clip to be extracted covers the time range from 2 seconds before to 3 seconds after the acquisition time of the face picture.

Duplicate picture filtering mode: There are two modes: automatic and manual. The processing methods of different modes are explained in point 3.
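For illustration, the parameters listed above could be grouped into a single configuration object, as in the Python sketch below. The class name `FusionConfig`, the field names, the use of epoch seconds, and the default values are assumptions made for this example; they are not defined by the application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FusionConfig:
    channels: List[int]                 # one entry means single-channel mode
    search_range: Tuple[int, int]       # (start, end), assumed epoch seconds
    similarity_threshold: float = 0.90  # keep results at or above this value
    seconds_before: int = 2             # n
    seconds_after: int = 3              # m
    filter_mode: str = "automatic"      # "automatic" or "manual"

    def __post_init__(self) -> None:
        if not self.channels:
            raise ValueError("at least one channel must be selected")
        if self.search_range[0] > self.search_range[1]:
            raise ValueError("search range start is after its end")
        if not 0.0 <= self.similarity_threshold <= 1.0:
            raise ValueError("similarity threshold must be within [0, 1]")
        if self.seconds_before < 0 or self.seconds_after < 0:
            raise ValueError("n and m must be non-negative")
        if self.filter_mode not in ("automatic", "manual"):
            raise ValueError("unknown duplicate picture filtering mode")

    def clip_window(self, acquisition: int) -> Tuple[int, int]:
        """Nominal clip range before any I-frame adjustment."""
        return acquisition - self.seconds_before, acquisition + self.seconds_after


config = FusionConfig(channels=[1, 3], search_range=(0, 3600))
window = config.clip_window(60)
```

With n = 2 and m = 3, `window` is `(58, 63)` for a picture acquired at second 60.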

3. Duplicate picture filtering

The NVR receives a face search request, and the face search request carries a target face picture and a video fusion parameter.

Picture search: the NVR models the target face picture, compares it against the face picture information in the picture information database that matches the channel number and search time range carried in the face search request, and calculates the similarity. Face pictures whose similarity with the target face picture is greater than or equal to the similarity threshold are listed. Each such face picture and its acquisition time constitute face picture information (hereinafter referred to as first face picture information), which may be recorded in a first face picture information list.

The first face pictures in the first face picture information list are sorted by the acquisition time in the first face picture information, from earliest to latest.
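The search and sort steps can be sketched as below. The text does not specify how similarity is computed, so cosine similarity over feature vectors is used here purely as an assumed stand-in; the record layout and the function names `cosine_similarity` and `search_faces` are likewise hypothetical.

```python
import math
from typing import Iterable, List, Sequence, Set, Tuple

# (channel number, acquisition time in seconds, feature vector)
FaceRecord = Tuple[int, int, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_faces(
    target: Sequence[float],
    records: Iterable[FaceRecord],
    channels: Set[int],
    time_range: Tuple[int, int],
    threshold: float,
) -> List[Tuple[int, int]]:
    """Return (channel, acquisition time) hits sorted from earliest to latest."""
    lo, hi = time_range
    hits = [
        (channel, acquired)
        for channel, acquired, features in records
        if channel in channels
        and lo <= acquired <= hi
        and cosine_similarity(target, features) >= threshold
    ]
    return sorted(hits, key=lambda hit: hit[1])


records = [
    (1, 120, [1.0, 0.0]),
    (2, 60, [0.9, 0.1]),
    (1, 30, [0.0, 1.0]),   # dissimilar face
    (3, 90, [1.0, 0.0]),   # channel not selected
]
first_list = search_faces([1.0, 0.0], records, {1, 2}, (0, 200), 0.9)
```

In this toy data set, `first_list` contains the hit on channel 2 at second 60 followed by the hit on channel 1 at second 120.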

If there are multiple first face pictures with the same acquisition time (which may happen when the fields of view of several IPCs overlap), duplicate picture filtering is performed on them. For any one of these first face pictures, n seconds are subtracted from the acquisition time of the first face picture to obtain the start time point x of the corresponding video clip, and m seconds are added to the acquisition time to obtain the end time point y of the corresponding video clip. The NVR searches the recording of the channel for I frames at time points x and y respectively. If there is no I frame at time point x and/or no I frame at time point y, then let x = x + 1 and/or y = y - 1, and continue to search for I frames at time points x and/or y, until I frames are found at both x and y, or x = y. When x = y, the duration of the video clip corresponding to the first face picture is 1 second.

The duplicate picture filtering mode is either manual or automatic. In manual filtering mode, the multiple first face pictures are displayed on a specified interface, the user checks the first face picture to be retained, and the other first face pictures with the same acquisition time are discarded. In automatic filtering mode, among the multiple first face pictures, the one corresponding to the longest video clip is retained, and the other first face pictures with the same acquisition time are discarded. When several first face pictures correspond to the longest video clip, one of them is retained and the rest are discarded.

The first face image information after filtering the repeated pictures is formed into a final first face image information list.

The schematic flowchart of the NVR generating the final first face picture information list can be shown in FIG. 4.

4. Duplicate video clip filtering

Based on the final first face picture information list, a set of video clip time period elements is created. Each element includes the following information: the channel number of the channel to which the first face picture belongs, the start time point of the video clip corresponding to the first face picture (a start time point at which an I frame exists), and the end time point of the video clip corresponding to the first face picture (an end time point at which an I frame exists). The elements are sorted by start time point, from earliest to latest.

Duplicate video clip filtering: for two adjacent elements of the video clip time period element set (hereinafter referred to as the first element and the second element), assume that their start and end time points are [A, B] and [C, D] respectively, where A < C ≤ B. If the first element and the second element include the same channel number, they are merged into one element whose start time point is A and whose end time point is D. If the channel numbers included in the first element and the second element are different, it is determined whether there is an I frame at time point B in the recording of the channel to which the second element belongs. If there is an I frame at time point B, the start time point of the second element is updated to B, so that the start and end time points of the first element and the second element become [A, B] and [B, D]. If there is no I frame at time point B, the end time point of the first element is updated to C, so that the start and end time points of the first element and the second element become [A, C] and [C, D].

Traverse the entire set of video clip time period elements to form a video clip element set used to generate a video summary.
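The traversal can be sketched in Python as follows. This is an illustrative reading of the rules above, not a definitive implementation: `Segment`, `filter_overlaps`, and the `has_iframe(channel, t)` callback are hypothetical names. Two defensive choices go beyond the text and are assumptions: the merged end uses `max(B, D)`, and the [A, B] / [B, D] split is applied only when B < D, since the text does not discuss a second clip that ends before the first one.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    channel: int
    start: int   # seconds
    end: int     # seconds


def filter_overlaps(
    segments: List[Segment],
    has_iframe: Callable[[int, int], bool],
) -> List[Segment]:
    """Remove duplicated time ranges from clips, processed in start-time order."""
    ordered = sorted(segments, key=lambda s: s.start)
    result: List[Segment] = []
    for seg in ordered:
        current = Segment(seg.channel, seg.start, seg.end)
        if not result or result[-1].end < current.start:
            result.append(current)       # no overlap with the previous clip
            continue
        prev = result[-1]
        if prev.channel == current.channel:
            # Same channel: merge into [A, D]; max() guards the case D < B.
            prev.end = max(prev.end, current.end)
        elif has_iframe(current.channel, prev.end) and prev.end < current.end:
            current.start = prev.end     # [A, B] and [B, D]
            result.append(current)
        else:
            prev.end = current.start     # [A, C] and [C, D]
            result.append(current)
    return result


segments = [
    Segment(1, 58, 63),
    Segment(1, 61, 66),
    Segment(2, 65, 70),
    Segment(3, 69, 74),
]
# Assume channel 2 has an I frame at second 66 and channel 3 has none at 70.
known_iframes = {(2, 66)}
summary = filter_overlaps(segments, lambda ch, t: (ch, t) in known_iframes)
```

For this input, the two channel 1 clips merge into [58, 66], the channel 2 clip is moved to start at 66 and then trimmed to end at 69, and the channel 3 clip keeps [69, 74].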

5. Video summary generation

Based on the set of video clip elements used to generate the video summary, the corresponding recording data is obtained from the recording of the corresponding channel, and a video summary is generated from the obtained recording data. The user can download the video summary and can also export the set of video clip elements used to generate it.

In the embodiment of the present application, a target search request is received, first target pictures matching the feature information of the target to be searched carried in the target search request are searched for, and a video summary of the target to be searched is generated according to each first target picture and its corresponding acquisition time. This improves the efficiency and accuracy of locating the target in the recording and, by removing recording data that does not match the target to be searched, preserves the consistency of target tracking in the video.

Please refer to FIG. 5, which is a schematic flowchart of a video summary generation method according to another embodiment of the present application. The video summary generation method may be applied to a retrieval device. In this example, the video summary generation method is directed to a face retrieval request. As shown in FIG. 5, the video summary generation method may include the following steps.

Step S500: Acquire and store a face picture in the video data of the video source device, a collection time of the face picture, and attribute information of the face picture.

In this embodiment, in order to improve the efficiency of face retrieval, the retrieval device may acquire and store the face picture information in the video data of the video source device. The face picture information may include, but is not limited to, a face picture, a collection time of the face picture, and attribute information of the face picture.

In the embodiment of the present application, the attribute information of the face picture may include, but is not limited to, one or more of the following: facial expression (such as whether to smile), whether to wear glasses, gender, age range, and ethnicity.

The specific implementation of obtaining the face picture, the collection time of the face picture, and the attribute information of the face picture from the video data of the video source device is similar to the foregoing method of obtaining the target picture information, with the target picture information replaced by the face picture information, and is not repeated here.

In this example, after the retrieval device obtains the face picture in the video data of the video source device, the collection time of the face picture, and the attribute information of the face picture, it can store the obtained face picture, the collection time of the face picture, and the attribute information of the face picture.

Storing the face picture in the video data of the video source device, the collection time of the face picture, and the attribute information of the face picture may include: storing the face picture; and recording the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture in a face picture information table.

After the retrieval device obtains the face picture, the collection time of the face picture, and the attribute information of the face picture, it can store the obtained face picture, and record the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture in the face picture information table, whose format can be as shown in Table 1:

Table 1

Location information of face picture	Collection time of face picture	Attribute information of face picture
Location information of face picture 1	Collection time of face picture 1	Attribute information of face picture 1
Location information of face picture 2	Collection time of face picture 2	Attribute information of face picture 2
...	...	...

The position information of the face picture may be a position offset and a length of the face picture in a storage space (such as a hard disk).
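A minimal Python sketch of this storage scheme is given below, assuming an in-memory byte stream in place of the hard disk. The names `FaceEntry` and `FacePictureStore`, the attribute dictionary layout, and the use of epoch seconds are hypothetical choices for the example only.

```python
import io
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FaceEntry:
    offset: int                 # position offset in the storage space
    length: int                 # length of the stored picture in bytes
    acquired: int               # collection time, assumed epoch seconds
    attributes: Dict[str, str]  # e.g. {"gender": "female", "glasses": "yes"}


class FacePictureStore:
    def __init__(self, storage: io.BytesIO) -> None:
        self.storage = storage            # stands in for a hard disk region
        self.table: List[FaceEntry] = []  # the face picture information table

    def save(self, picture: bytes, acquired: int, attributes: Dict[str, str]) -> FaceEntry:
        offset = self.storage.seek(0, io.SEEK_END)   # append at the end
        self.storage.write(picture)
        entry = FaceEntry(offset, len(picture), acquired, dict(attributes))
        self.table.append(entry)
        return entry

    def load(self, entry: FaceEntry) -> bytes:
        self.storage.seek(entry.offset)
        return self.storage.read(entry.length)


store = FacePictureStore(io.BytesIO())
first = store.save(b"JPEG-1", 100, {"gender": "male"})
second = store.save(b"JPEG-22", 101, {"gender": "female", "glasses": "yes"})
restored = store.load(second)
```

Each table row records only where the picture bytes live, so the picture itself is read back on demand from the offset and length.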

The above manner of storing the face picture in the video data of the video source device, the collection time of the face picture, and the attribute information of the face picture is only one specific example in this application and does not limit the protection scope of the present application. That is, in the embodiments of the present application, the face picture, the collection time of the face picture, and the attribute information of the face picture may also be stored in other ways.

For example, in one example, the face picture, the acquisition time of the face picture, and the attribute information of the face picture can be stored in the same database (that is, the face picture is directly stored in the database in a binary form). In this example, the face picture, the collection time of the face picture, and the attribute information of the face picture can be stored in the same data table. At this time, there is no need to additionally record the storage location of the face picture.

In another example, the face picture can still be stored first to obtain its storage location, but the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture are stored in a form other than a data table, such as a tree structure or a file. The specific implementation is not described here.

Step S510: When a face retrieval request is received, a first target face picture matching the face retrieval filter condition is determined according to the face retrieval filter condition carried in the face retrieval request.

The retrieval device can provide a face retrieval function, which retrieves matching face pictures according to the face retrieval filter conditions carried in the received face retrieval request, and at the same time can obtain the collection time of the matching face pictures.

For example, the retrieval device may provide a face retrieval request interface, which may include a face retrieval filter condition input area and/or face retrieval filter condition options. The user enters and/or selects the face retrieval filter conditions in the face retrieval request interface and submits a face retrieval request.

In one example, the face retrieval filter condition is attribute information of the face picture to be retrieved (referred to herein as the third attribute information of the face picture to be retrieved), which may include, but is not limited to, the facial expression, whether glasses are worn, the gender, and the age of the face to be retrieved.

In another example, the face retrieval filtering condition may include a face picture to be retrieved and third attribute information of the face picture to be retrieved.

When the retrieval device receives a face retrieval request, it can obtain the face retrieval filter condition carried in the request, query the stored face pictures, collection times, and attribute information of face pictures according to the filter condition, and determine the face picture whose attribute information matches the filter condition as the face picture matching the face retrieval filter condition (referred to herein as the first target face picture).

For example, assuming that the retrieval device stores the face picture, the collection time of the face picture, and the attribute information of the face picture in the form of a face picture information table (see the related description in step S500), the retrieval device may query the attribute information of face pictures in the face picture information table according to the face retrieval filter condition to obtain the face picture information entry matching the filter condition, and obtain from that entry the storage location of the face picture (that is, the storage location of the first target face picture) and the collection time of the first target face picture.

Further, the retrieval device may obtain the first target face picture from the specified storage space according to the storage location of the first target face picture.
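The table lookup can be illustrated with the sketch below. The text does not define what counts as a match, so the example assumes that an entry matches when it contains every attribute of the filter condition with an equal value; the row layout and the function name `find_matching` are hypothetical.

```python
from typing import Dict, List, Tuple

# (storage location, collection time in seconds, attribute information)
Row = Tuple[str, int, Dict[str, str]]


def find_matching(rows: List[Row], wanted: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (storage location, collection time) for rows whose attributes
    contain every key/value pair of the filter condition."""
    return [
        (location, acquired)
        for location, acquired, attrs in rows
        if all(attrs.get(key) == value for key, value in wanted.items())
    ]


table = [
    ("offset=0,len=6", 100, {"gender": "male", "glasses": "yes"}),
    ("offset=6,len=7", 101, {"gender": "female", "glasses": "yes"}),
    ("offset=13,len=5", 102, {"gender": "female", "glasses": "no"}),
]
matches = find_matching(table, {"gender": "female", "glasses": "yes"})
```

Only the second row satisfies both conditions, so `matches` holds its storage location and collection time, from which the picture can then be read.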

When the face retrieval filter condition includes the face picture to be retrieved and the third attribute information of the face picture to be retrieved, comparing the face retrieval filter condition with the attribute information of the face pictures recorded in the face picture information table may include: modeling the face picture to be retrieved and extracting the fourth attribute information of the face picture to be retrieved; determining the attribute information of the face picture to be retrieved based on the third attribute information and the fourth attribute information of the face picture to be retrieved; and comparing the attribute information of the face picture to be retrieved with the attribute information of the face pictures recorded in the face picture information table.

When the face retrieval filter condition includes the face picture to be retrieved and the third attribute information of the face picture to be retrieved, the retrieval device can model the face picture to be retrieved and extract its attribute information (referred to herein as the fourth attribute information of the face picture to be retrieved).

After the retrieval device obtains the fourth attribute information, it can determine the attribute information of the face to be retrieved according to the third attribute information and the fourth attribute information.

For example, the retrieval device may compare the third attribute information and the fourth attribute information of the face picture to be retrieved. Attribute information that exists in the third attribute information but not in the fourth attribute information, or that exists in the fourth attribute information but not in the third attribute information, is added to the attribute information of the face picture to be retrieved. For attribute information that exists in both the third attribute information and the fourth attribute information, the value in the fourth attribute information is added to the attribute information of the face picture to be retrieved. The result is the attribute information of the face picture to be retrieved.
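The combination rule amounts to a dictionary merge in which the fourth attribute information takes precedence on shared keys. The sketch below assumes attributes are simple key/value strings; the function name `combine_attributes` and the sample attribute names are hypothetical.

```python
from typing import Dict


def combine_attributes(third: Dict[str, str], fourth: Dict[str, str]) -> Dict[str, str]:
    """Merge user-supplied (third) and extracted (fourth) attribute information."""
    combined = dict(third)     # attributes only in the third information are kept
    combined.update(fourth)    # attributes in the fourth information are added;
                               # on a shared key, the fourth value is used
    return combined


third = {"gender": "female", "age": "20-30"}
fourth = {"age": "30-40", "glasses": "yes"}
query_attributes = combine_attributes(third, fourth)
```

Here `query_attributes` keeps the gender from the third attribute information, takes the glasses attribute from the fourth, and uses the fourth attribute information's age range because the key appears in both.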

After the retrieval device obtains the attribute information of the face picture to be retrieved, it can query the stored face pictures, collection times, and attribute information of face pictures according to that attribute information, and determine the face picture whose attribute information matches the attribute information of the face picture to be retrieved as the first target face picture matching the face retrieval filter condition.

The above retrieval of face picture information is only a specific example for the case where face picture information is stored in the form of a face picture information table, and does not limit the protection scope of the present application. That is, in the embodiment of the present application, retrieval of face picture information can also be achieved in other ways.

For example, when the face picture, the collection time of the face picture, and the attribute information of the face picture are stored in the same data table in the database, the retrieval device may directly query the database, based on the attribute information of the face picture to be retrieved, for the entry containing the matching attribute information, and obtain the face picture information from the queried entry.

Step S520: Generate a video summary according to the first target face picture and the collection time of the first target face picture, and play back the video summary.

In the embodiment of the present application, after the retrieval device obtains the first target face picture matching the face retrieval filter condition and the collection time of the first target face picture, it may generate a video summary of the retrieved face according to the first target face picture and its collection time, and play back the video summary. The specific implementation of this step is similar to step S220 above, with the first target picture and its corresponding acquisition time replaced by the first target face picture and its collection time, and is not repeated here.

In this example, the video source device is an IPC and the retrieval device is an NVR equipped with a smart chip that has an intelligent analysis function. The implementation process of the video summary generation scheme is as follows.

1. Face picture information acquisition

When the IPC has a face picture capture function

The IPC captures a face picture and transmits the captured face picture and the collection time of the face picture to the NVR. In this method, while obtaining the real-time video stream, the IPC captures face pictures and transmits each captured face picture together with its collection time (that is, the capture time of the face picture) to the NVR. It should be noted that in this method the IPC also transmits the real-time video stream to the NVR, and the NVR saves the video recording according to a preset policy.

The NVR extracts feature values in the face picture, models the face picture according to the feature values of the face picture, and extracts attribute information of the face picture. In this method, when the NVR receives the face picture transmitted by the IPC, it can intelligently analyze the face picture through a smart chip. The smart chip can use the algorithm library to extract the feature values of the face in the face picture, and use the algorithm library to model the face picture according to the extracted feature values and extract the attribute information of the face picture. The flow chart of the NVR obtaining the face picture information can be shown in FIG. 6.

When the IPC does not have a face picture capture function

The NVR performs target detection on a video recording or a real-time video stream to obtain a face picture and the collection time of the face picture. In this method, the NVR can perform target detection on the video recording or real-time video stream through the smart chip to obtain the face pictures in the video recording or real-time video stream and the collection time of each face picture (that is, the time at which the face picture appears in the video data).

The NVR extracts feature values from the face picture, models the face picture according to those feature values, and extracts attribute information of the face picture. In this method, after the NVR obtains a face picture through target detection, it can intelligently analyze the face picture through the smart chip. The smart chip can use the algorithm library to extract the feature values of the face in the face picture, and use the algorithm library to model the face picture according to the extracted feature values and extract the attribute information of the face picture. A flow chart of the NVR obtaining the face picture information is shown in FIG. 7.

The NVR stores the face picture, the collection time of the face picture, and the attribute information of the face picture. For specific implementation, refer to the subsequent description.

2. Face picture information storage

The NVR stores the face picture to obtain the storage location of the face picture. For the face picture information, a database table FaceTable (face table) is established, whose main fields are the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture. It should be noted that the NVR can also store model data of face pictures; the specific implementation is not described here. After the NVR stores the face picture on the hard disk, it can obtain the position offset and length of the face picture on the hard disk (that is, the storage location of the face picture).

The NVR records the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture in the FaceTable.
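The storage scheme above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: an in-memory `io.BytesIO` buffer stands in for the hard disk, SQLite stands in for the database, and the column names (`pic_offset`, `pic_length`, `collect_time`, `attributes`), the helper `store_face_picture`, and the sample attributes are all hypothetical.

```python
import io
import json
import sqlite3

# Stand-in for the hard disk (assumption); a real NVR writes to its own storage.
disk = io.BytesIO()

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE FaceTable ("
    " pic_offset INTEGER, pic_length INTEGER,"  # storage location of the picture
    " collect_time REAL,"                       # collection time (epoch seconds)
    " attributes TEXT)"                         # attribute information as JSON
)


def store_face_picture(picture: bytes, collect_time: float, attributes: dict):
    """Append the picture to the disk and record its entry in FaceTable."""
    offset = disk.seek(0, io.SEEK_END)  # the picture is appended at the current end
    disk.write(picture)
    db.execute(
        "INSERT INTO FaceTable VALUES (?, ?, ?, ?)",
        (offset, len(picture), collect_time, json.dumps(attributes)),
    )
    return offset, len(picture)


# Hypothetical sample data.
loc1 = store_face_picture(b"face-one", 1000.0, {"gender": "female", "glasses": True})
loc2 = store_face_picture(b"face-two!", 1012.5, {"gender": "male", "glasses": False})
```

Recording only the offset and length keeps the table small; the picture bytes themselves stay in the bulk storage area.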

3. Face retrieval

A face retrieval request is received, and the face retrieval request carries face retrieval filter conditions. The NVR can provide a face retrieval interface that includes an input area and/or options for face retrieval filter conditions. The user can fill in and/or select face retrieval filter conditions through the face retrieval interface and submit a face retrieval request.

The NVR queries the FaceTable table, according to the face retrieval filter conditions, for the storage location and collection time of each matching face picture. Specifically, the NVR compares the face retrieval filter conditions with the attribute information of the face pictures recorded in the FaceTable table. For each FaceTable entry whose recorded attribute information matches the face retrieval filter conditions, the storage location and collection time recorded in that entry are determined as the storage location of a matching face picture (that is, the storage location of the first target face picture) and the collection time of that face picture (that is, the collection time of the first target face picture).

The NVR obtains the first target face picture according to the storage location of the first target face picture. The NVR can read the first target face picture from the hard disk according to its storage location (position offset + length), so that the NVR obtains both the first target face picture and the collection time of the first target face picture.
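The retrieval steps above can be illustrated with a short sketch. It assumes that filter conditions and recorded attributes are plain key/value pairs and that a match means every filter condition equals the recorded value; the application does not specify the matching rule, so this rule, the names `matches` and `retrieve_faces`, and the sample entries are hypothetical.

```python
import io

# Hypothetical stand-ins: a byte buffer for the hard disk and a list for FaceTable.
disk = io.BytesIO(b"face-oneface-two!")
face_table = [
    {"offset": 0, "length": 8, "collect_time": 1000.0,
     "attributes": {"gender": "female", "glasses": True}},
    {"offset": 8, "length": 9, "collect_time": 1012.5,
     "attributes": {"gender": "male", "glasses": False}},
]


def matches(attributes: dict, filters: dict) -> bool:
    """Assumed rule: every filter condition must equal the recorded attribute."""
    return all(attributes.get(key) == value for key, value in filters.items())


def retrieve_faces(filters: dict) -> list:
    """Return (picture bytes, collection time) for every matching entry."""
    results = []
    for entry in face_table:
        if matches(entry["attributes"], filters):
            disk.seek(entry["offset"])            # position offset
            picture = disk.read(entry["length"])  # length
            results.append((picture, entry["collect_time"]))
    return results


first_targets = retrieve_faces({"gender": "male"})
```

Because only the table is scanned during matching, the picture bytes are read from storage only for entries that actually match.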

4. Video summary generation

For any first target face picture, obtain the target video clip that spans from 5 seconds before to 5 seconds after the collection time of the first target face picture. The NVR can create a VideoTable1 table whose main fields are the storage location of the video (hard disk position offset + length) and the start and end times of the video data. After a completed video recording is stored on the hard disk, a new record (that is, a new entry) is inserted into the VideoTable1 table, recording the storage location and the start and end times of the video. For any first target face picture, the NVR determines the moment 5 seconds before the collection time of the first target face picture as the start time of the target video clip and the moment 5 seconds after that collection time as the end time of the target video clip, and queries the VideoTable1 table according to this start time and end time to obtain the target video clip.

Generate a video summary based on each target video clip, and decode and display it.
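The clip selection described above can be sketched as follows. The 5-second margins come from the example; the overlap test used to pick recordings from the video table is an assumption, since the text only states that the table is queried by start and end time. The names `Recording`, `clip_window`, and `find_recordings` and the sample recordings are hypothetical.

```python
from typing import List, NamedTuple, Tuple

CLIP_MARGIN_S = 5.0  # seconds before and after the collection time, per the example


class Recording(NamedTuple):
    offset: int   # hard disk position offset
    length: int   # length in bytes
    start: float  # start time (epoch seconds)
    end: float    # end time (epoch seconds)


def clip_window(collect_time: float) -> Tuple[float, float]:
    """Start and end time of the target video clip for one target picture."""
    return collect_time - CLIP_MARGIN_S, collect_time + CLIP_MARGIN_S


def find_recordings(video_table: List[Recording], start: float, end: float) -> List[Recording]:
    """Assumed query: recordings whose time range overlaps [start, end]."""
    return [r for r in video_table if r.start <= end and r.end >= start]


# Hypothetical video table with three consecutive recordings.
video_table = [
    Recording(0, 4096, 900.0, 1002.0),
    Recording(4096, 4096, 1002.0, 1100.0),
    Recording(8192, 4096, 1100.0, 1200.0),
]
start, end = clip_window(1000.0)
hits = find_recordings(video_table, start, end)
```

In this sample the window 995.0 to 1005.0 straddles the boundary between the first two recordings, so both are returned and the clip would be cut from each.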

In this example, the face pictures in the video data of the video source device, the collection time of each face picture, and the attribute information of each face picture are obtained and stored in advance. When a face retrieval request is received, the first target face picture that matches the face retrieval filter conditions carried in the request is determined, a video summary is generated based on the first target face picture and its collection time, and the video summary is played back. This avoids extracting matching face pictures from the video data for each face retrieval, improves the efficiency and accuracy of face retrieval, and ensures the continuity of face video tracking.

Please refer to FIG. 8, which is a schematic flowchart of a video abstract generating method according to another embodiment of the present application. The video abstract generating method may be applied to a retrieval device. In this example, the video digest generation method is directed to a vehicle search request. As shown in FIG. 8, the video summary generating method may include the following steps.

Step S800: Acquire and store a vehicle picture, a collection time of the vehicle picture, and attribute information of the vehicle picture in the video data of the video source device.

In this example, in order to improve the efficiency of vehicle retrieval, the retrieval device can acquire and store the vehicle picture information in the video data of the video source device. The vehicle picture information may include, but is not limited to, a vehicle picture, a collection time of the vehicle picture, and attribute information of the vehicle picture.

In the embodiment of the present application, the attribute information of the vehicle picture may include, but is not limited to, one or more of the following: the location of the vehicle in the vehicle picture, the location of the license plate in the vehicle picture, the license plate number, the license plate color, the country type, the body color, the vehicle brand, the vehicle model (such as a large passenger car, truck, or van), whether the driver is wearing a seat belt, and whether the driver is making a phone call.

The specific implementation for obtaining the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the video data of the video source device is similar to the foregoing method of obtaining the target picture information, with the target picture information replaced by the vehicle picture information; details are not repeated here.

In this example, after the retrieval device obtains the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the video data of the video source device, it may store the obtained vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture.

Storing the vehicle picture in the video data of the video source device, the collection time of the vehicle picture, and the attribute information of the vehicle picture may include: storing the vehicle picture; and recording the storage location of the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in a vehicle picture information table.

After the retrieval device obtains the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture, it can store the obtained vehicle picture and record the storage location of the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the vehicle picture information table, whose format can be as shown in Table 2:

Table 2

Location information of vehicle picture	Collection time of vehicle picture	Attribute information of vehicle picture
Location information of vehicle picture 1	Collection time of vehicle picture 1	Attribute information of vehicle picture 1
Location information of vehicle picture 2	Collection time of vehicle picture 2	Attribute information of vehicle picture 2
…	…	…

The location information of the vehicle picture may be a position offset and a length of the vehicle picture in a storage space (such as a hard disk).

The foregoing manner of storing the vehicle picture in the video data of the video source device, the collection time of the vehicle picture, and the attribute information of the vehicle picture is only one specific example and does not limit the protection scope of the present application. That is, in the embodiments of the present application, the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture may also be stored in other ways.

For example, in one example, the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture may be stored in the same database (that is, the vehicle picture is directly stored in the database in a binary form). In this example, the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture can be stored in the same data table. At this time, there is no need to additionally record the storage location of the vehicle picture.
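A minimal sketch of this single-table variant is given below, assuming SQLite as the database and a BLOB column for the binary picture. The table name `VehiclePicture`, the column names, and the sample values are hypothetical and are not taken from the application.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE VehiclePicture ("
    " picture BLOB,"        # the picture itself, stored in binary form
    " collect_time REAL,"   # collection time (epoch seconds)
    " attributes TEXT)"     # attribute information as JSON
)

# Hypothetical sample entry; no separate storage location is recorded.
db.execute(
    "INSERT INTO VehiclePicture VALUES (?, ?, ?)",
    (b"\xff\xd8vehicle-bytes", 2000.0, json.dumps({"body_color": "white"})),
)

picture, collect_time, attributes_json = db.execute(
    "SELECT picture, collect_time, attributes FROM VehiclePicture"
).fetchone()
attributes = json.loads(attributes_json)
```

A single query returns the picture together with its collection time and attributes, at the cost of a larger database file.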

In another example, the vehicle picture may still be stored first to obtain the storage location of the vehicle picture, but the storage location of the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture are no longer stored in the form of a data table. Instead, they are stored in another form, such as a tree structure or a file. The specific implementation is not described here.

Step S810: When a vehicle retrieval request is received, a first vehicle picture that matches the vehicle picture to be retrieved is determined according to the vehicle picture to be retrieved carried in the vehicle retrieval request.

In this example, the retrieval device may provide a vehicle retrieval function and, according to the picture of the vehicle to be retrieved carried in the received vehicle retrieval request, retrieve the matching vehicle pictures and their collection times by means of search by picture.

For example, the retrieval device may provide a vehicle retrieval request interface that includes an input and/or selection area for the picture of the vehicle to be retrieved. The user enters and/or selects the picture of the vehicle to be retrieved in the vehicle retrieval request interface and submits a vehicle retrieval request.

When the retrieval device receives a vehicle retrieval request, it models the picture of the vehicle to be retrieved and extracts the attribute information of that picture. The retrieval device can then query the stored vehicle pictures, collection times, and attribute information based on the attribute information of the picture of the vehicle to be retrieved, and determine each stored vehicle picture whose attribute information matches the attribute information of the picture of the vehicle to be retrieved as a matching vehicle picture (referred to herein as the first vehicle picture).

For example, assuming that the retrieval device stores the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the form of a vehicle picture information table (see the relevant description in step S800), the retrieval device may query the attribute information recorded in the vehicle picture information table according to the attribute information of the picture of the vehicle to be retrieved, obtain the vehicle picture information entry whose attribute information matches, and obtain the storage location of the vehicle picture recorded in that entry (that is, the storage location of the first vehicle picture).

Further, the retrieval device may acquire the first vehicle picture from the specified storage space according to the storage location of the first vehicle picture.

The above retrieval of vehicle picture information is only a specific example for the case in which vehicle picture information is stored in the form of a vehicle picture information table, and does not limit the protection scope of the present application. That is, in the embodiments of the present application, retrieval of vehicle picture information can also be achieved in other ways.

For example, when the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture are stored in the same data table in the database, the retrieval device may directly query the database, according to the attribute information of the picture of the vehicle to be retrieved, for the entry containing the matching attribute information, and obtain the vehicle picture information from the queried entry.

Step S820: Generate a video summary according to the first vehicle picture and the collection time of the first vehicle picture, and play back the video summary.

In the embodiment of the present application, after the retrieval device obtains the first vehicle picture that matches the picture of the vehicle to be retrieved and the collection time of the first vehicle picture, the retrieval device may generate a video summary of the retrieved vehicle according to the first vehicle picture and its collection time, and play back that video summary. The specific implementation of this step is similar to step S220 above, except that the first target picture and the collection time corresponding to the first target picture are replaced with the first vehicle picture and the collection time of the first vehicle picture; details are not repeated here.

1. Vehicle picture information acquisition

When the IPC has a vehicle picture capture function

The IPC captures a vehicle picture and transmits the captured vehicle picture and the collection time of the vehicle picture to the NVR. In this method, while obtaining the real-time video stream, the IPC also captures vehicle pictures and transmits each captured vehicle picture together with its collection time (that is, the capture time of the vehicle picture) to the NVR. It should be noted that in this method the IPC also transmits the real-time video stream to the NVR, and the NVR saves the video recording according to a preset policy.

The NVR extracts feature values from the vehicle pictures, models the vehicle pictures according to the feature values of the vehicle pictures, and extracts attribute information of the vehicle pictures. In this mode, when the NVR receives the vehicle picture transmitted by the IPC, it can intelligently analyze the vehicle picture through a smart chip. The smart chip can use an algorithm library to extract the feature values of the vehicle from the vehicle picture, and use the algorithm library to model the vehicle picture according to the extracted feature values, and extract the attribute information of the vehicle picture. The schematic diagram of the NVR's process of obtaining vehicle picture information can be shown in FIG. 9.

When the IPC does not have a vehicle picture capture function

The NVR performs target detection on the video recording or real-time video stream to obtain the vehicle picture and the collection time of the vehicle picture. In this method, the NVR can perform target detection on the video recording or real-time video stream through the smart chip to obtain the vehicle pictures in the video recording or real-time video stream and the collection time of each vehicle picture (that is, the time at which the vehicle picture appears in the video data).

The NVR extracts feature values from the vehicle picture, models the vehicle picture according to those feature values, and extracts attribute information of the vehicle picture. In this method, after the NVR obtains a vehicle picture through target detection, it can intelligently analyze the vehicle picture through the smart chip. The smart chip can use the algorithm library to extract the feature values of the vehicle from the vehicle picture, and use the algorithm library to model the vehicle picture according to the extracted feature values and extract the attribute information of the vehicle picture. A schematic diagram of the process by which the NVR obtains vehicle picture information is shown in FIG. 10.

The NVR stores the vehicle picture, the time when the vehicle picture was collected, and the attribute information of the vehicle picture. For specific implementation, see the subsequent description.

2. Vehicle picture information storage

The NVR stores the vehicle picture to obtain the storage location of the vehicle picture. For the vehicle picture information, a database table VehicleTable (vehicle table) is established, whose main fields are the storage location of the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture. It should be noted that the NVR can also store model data of vehicle pictures; the specific implementation is not described here. After the NVR stores the vehicle picture on the hard disk, it can obtain the position offset and length of the vehicle picture on the hard disk (that is, the storage location of the vehicle picture).

The NVR records the storage location of the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the VehicleTable.

3. Vehicle retrieval

A vehicle retrieval request is received, and the vehicle retrieval request carries a picture of the vehicle to be retrieved. The NVR can provide a vehicle retrieval interface that includes an input and/or selection area for the picture of the vehicle to be retrieved. The user can input and/or select the picture of the vehicle to be retrieved through the vehicle retrieval interface and submit a vehicle retrieval request.

The NVR models the vehicle pictures to be retrieved and extracts the attribute information of the vehicle pictures to be retrieved.

The NVR queries the VehicleTable table, according to the attribute information of the picture of the vehicle to be retrieved, for the storage location and collection time of each matching vehicle picture. Specifically, the NVR compares the attribute information of the picture of the vehicle to be retrieved with the attribute information of the vehicle pictures recorded in the VehicleTable table. For each VehicleTable entry whose recorded attribute information matches the attribute information of the picture of the vehicle to be retrieved, the storage location and collection time recorded in that entry are determined as the storage location of a matching vehicle picture (that is, the storage location of the first vehicle picture) and the collection time of that vehicle picture (that is, the collection time of the first vehicle picture).

The NVR obtains the first vehicle picture according to the storage location of the first vehicle picture. The NVR can read the first vehicle picture from the hard disk according to the storage position (position offset + length) of the first vehicle picture, so that the NVR can obtain the first vehicle picture and the collection time of the first vehicle picture.
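The VehicleTable query above can be sketched with SQLite. This is an illustration under stated assumptions: the attribute information is modeled as three hypothetical columns (`plate_number`, `body_color`, `brand`), a match is assumed to mean equality on every attribute supplied, and the attributes of the picture to be retrieved are given directly rather than extracted by modeling, which is not shown.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE VehicleTable ("
    " pic_offset INTEGER, pic_length INTEGER, collect_time REAL,"
    " plate_number TEXT, body_color TEXT, brand TEXT)"
)
# Hypothetical sample entries.
db.executemany(
    "INSERT INTO VehicleTable VALUES (?, ?, ?, ?, ?, ?)",
    [
        (0, 100, 2000.0, "A12345", "white", "BrandX"),
        (100, 120, 2030.0, "B67890", "black", "BrandY"),
        (220, 90, 2100.0, "A12345", "white", "BrandX"),
    ],
)

ATTRIBUTE_COLUMNS = ("plate_number", "body_color", "brand")


def query_matching(attributes: dict) -> list:
    """Return (offset, length, collection time) of entries matching every given attribute."""
    # Only whitelisted column names are placed in the SQL text; values are bound.
    keys = [k for k in ATTRIBUTE_COLUMNS if k in attributes]
    where = " AND ".join(f"{k} = ?" for k in keys) or "1"
    sql = (
        "SELECT pic_offset, pic_length, collect_time FROM VehicleTable "
        f"WHERE {where} ORDER BY collect_time"
    )
    return db.execute(sql, [attributes[k] for k in keys]).fetchall()


first_vehicle_entries = query_matching({"plate_number": "A12345", "body_color": "white"})
```

Each returned row carries the storage location needed to read the first vehicle picture from the hard disk, along with its collection time.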

4. Video summary generation

For any first vehicle picture, obtain the target video clip that spans from 5 seconds before to 5 seconds after the collection time of the first vehicle picture. The NVR can create a VideoTable2 table whose main fields are the storage location of the video (hard disk position offset + length) and the start and end times of the video data. After a completed video recording is stored on the hard disk, a new record (that is, a new entry) is inserted into the VideoTable2 table, recording the storage location and the start and end times of the video. For any first vehicle picture, the NVR determines the moment 5 seconds before the collection time of the first vehicle picture as the start time of the target video clip and the moment 5 seconds after that collection time as the end time of the target video clip, and queries the VideoTable2 table according to this start time and end time to obtain the target video clip.

In this example, the vehicle pictures in the video data of the video source device, the collection time of each vehicle picture, and the attribute information of each vehicle picture are obtained and stored in advance. When a vehicle retrieval request is received, the first vehicle picture that matches the picture of the vehicle to be retrieved carried in the request is determined, a video summary is generated based on the first vehicle picture and its collection time, and the video summary is played back. This avoids extracting matching vehicle pictures from the video data for each vehicle retrieval, improves the efficiency and accuracy of vehicle retrieval, and ensures the continuity of vehicle video tracking.

Please refer to FIG. 11, which is a schematic flowchart of a video abstract generating method according to another embodiment of the present application. The video abstract generating method may be applied to a retrieval device. In this example, the video digest generation method is directed to a vehicle search request. As shown in FIG. 11, the video digest generating method may include the following steps.

Step S1100: Acquire and store a vehicle picture, a collection time of the vehicle picture, and attribute information of the vehicle picture in the video data of the video source device.

For a specific implementation method of this step, reference may be made to step S800, and details are not described herein again.

Step S1110: When a vehicle search request is received, a second vehicle picture matching the vehicle search filter condition is determined according to the vehicle search filter condition carried in the vehicle search request.

In this example, the retrieval device may provide a vehicle retrieval function, and retrieve a matching vehicle picture and a collection time of the vehicle picture according to a vehicle retrieval filter condition carried in the received vehicle retrieval request.

For example, the retrieval device may provide a vehicle retrieval request interface that includes an input area and/or options for vehicle retrieval filter conditions. The user enters and/or selects vehicle retrieval filter conditions in the vehicle retrieval request interface and submits a vehicle retrieval request.

In one example, the vehicle retrieval filter condition is attribute information of the picture of the vehicle to be retrieved (referred to herein as the third attribute information of the picture of the vehicle to be retrieved), which may include, but is not limited to, one or more of the license plate number, body color, vehicle model, and vehicle brand.

In another example, the vehicle retrieval filter condition may include the third attribute information of the image of the vehicle to be retrieved and the image of the vehicle to be retrieved.

When a retrieval device receives a vehicle retrieval request, it can obtain the vehicle retrieval filter conditions carried in the vehicle retrieval request, and query the stored vehicle pictures, the collection time of the vehicle pictures, and the attribute information of the vehicle pictures according to the vehicle retrieval filter conditions, and The vehicle picture corresponding to the attribute information of the vehicle picture matching the vehicle search filter condition is determined as the vehicle picture matching the vehicle search filter condition (referred to herein as the second vehicle picture).

For example, assuming that the retrieval device stores the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the form of a vehicle picture information table, the retrieval device can query the attribute information recorded in the vehicle picture information table according to the vehicle retrieval filter condition, obtain the vehicle picture information entry that matches the vehicle retrieval filter condition, and obtain from that entry the storage location of the vehicle picture (that is, the storage location of the second vehicle picture) and the collection time of the second vehicle picture.

Further, the retrieval device may acquire the second vehicle picture from the designated storage space according to the storage location of the second vehicle picture.

In one embodiment of the present application, when the vehicle retrieval filter condition includes both the third attribute information of the picture of the vehicle to be retrieved and the picture of the vehicle to be retrieved itself, the above comparison of the vehicle retrieval filter condition with the attribute information of the vehicle pictures recorded in the vehicle picture information table may include: modeling the picture of the vehicle to be retrieved and extracting fourth attribute information of the picture of the vehicle to be retrieved; determining the attribute information of the picture of the vehicle to be retrieved based on the third attribute information and the fourth attribute information; and comparing the attribute information of the picture of the vehicle to be retrieved with the attribute information of the vehicle pictures recorded in the vehicle picture information table.

In this embodiment, when the vehicle retrieval filter condition includes a picture of the vehicle to be retrieved and third attribute information of the picture of the vehicle to be retrieved, the retrieval device may model the picture of the vehicle to be retrieved and extract attribute information of that picture (referred to herein as the fourth attribute information of the picture of the vehicle to be retrieved).

After the retrieval device obtains the fourth attribute information of the picture of the vehicle to be retrieved, it can determine the attribute information of the vehicle to be retrieved according to the third attribute information of the picture of the vehicle to be retrieved and the fourth attribute information of the picture of the vehicle to be retrieved.

For example, the retrieval device may compare the third attribute information of the picture of the vehicle to be retrieved with the fourth attribute information of that picture. Attribute information that exists in the third attribute information but not in the fourth attribute information, or that exists in the fourth attribute information but not in the third attribute information, is added to the attribute information of the picture of the vehicle to be retrieved. For attribute information that exists in both the third attribute information and the fourth attribute information, the fourth attribute information is added to the attribute information of the picture of the vehicle to be retrieved. In this way, the attribute information of the picture of the vehicle to be retrieved is obtained.
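The merge rule just described can be written as a short sketch, assuming both sets of attribute information are key/value mappings. The function name `merge_attributes` and the sample attribute names are hypothetical. Following the text, when an attribute appears in both sets, the value from the fourth attribute information (extracted from the picture) is the one kept.

```python
def merge_attributes(third: dict, fourth: dict) -> dict:
    """Combine user-supplied (third) and picture-derived (fourth) attribute information."""
    merged = dict(third)   # attributes present only in the third set are kept
    merged.update(fourth)  # attributes only in the fourth set are added;
                           # for attributes in both sets, the fourth value is used
    return merged


# Hypothetical sample values.
third_info = {"plate_number": "A12345", "body_color": "white"}
fourth_info = {"body_color": "silver", "brand": "BrandX"}
query_attributes = merge_attributes(third_info, fourth_info)
```

The merged mapping is then used as the attribute information of the picture of the vehicle to be retrieved when querying the stored entries.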

After the retrieval device obtains the attribute information of the picture of the vehicle to be retrieved, it can query the stored vehicle pictures, collection times, and attribute information according to that attribute information, and determine each stored vehicle picture whose attribute information matches the attribute information of the picture of the vehicle to be retrieved as a second vehicle picture that matches the vehicle retrieval filter condition.

It should be recognized that the retrieval of vehicle picture information described above is only a specific example for the case in which vehicle picture information is stored in the form of a vehicle picture information table, and does not limit the protection scope of the present application. That is, in the embodiments of the present application, vehicle picture information can also be retrieved in other ways.

Step S1120: Generate a video summary according to the second vehicle picture and the collection time of the second vehicle picture, and play back the video summary.

For a specific implementation method of this step, reference may be made to step S820, and details are not described herein again.

In the embodiment of the present application, the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the video data of the video source device are obtained and stored. When a vehicle retrieval request is received, the second vehicle picture that matches the vehicle search filter conditions carried in the vehicle retrieval request is determined, a video summary is generated according to the second vehicle picture and the collection time of the second vehicle picture, and the video summary is played back. This avoids extracting matching vehicle pictures from the video data for each vehicle retrieval, improves the efficiency and accuracy of vehicle retrieval, and ensures the consistency of vehicle video tracking.

The method provided in the present application has been described above. The device provided in this application is described below.

Please refer to FIG. 12, which is a schematic structural diagram of a video summary generating apparatus according to an embodiment of the present application. The video summary generating apparatus may be applied to the search device in the foregoing embodiment. As shown in FIG. 12, the video summary generating apparatus may include the following units.

The receiving unit 1210 is configured to receive a target search request, where the target search request carries characteristic information of a target to be searched.

The search unit 1220 is configured to search for a first target picture that matches feature information of the target to be searched.

The processing unit 1230 is configured to generate a video summary according to the first target picture and the acquisition time corresponding to the first target picture.

In an optional embodiment, as shown in FIG. 13, the device further includes the following units.

The obtaining unit 1240 is configured to obtain target picture information in the video data of the video source device, where the target picture information includes the target picture, a collection time of the target picture, and attribute information of the target picture.

The saving unit 1250 is configured to save the target picture information to a picture information database.

In an optional implementation manner, the obtaining unit 1240 is specifically configured to receive target picture information sent by the video source device.

In an optional implementation manner, the obtaining unit 1240 is specifically configured to receive the target picture and the collection time of the target picture sent by the video source device; and model the target picture and extract attribute information of the target picture.

In an optional implementation manner, the obtaining unit 1240 is specifically configured to receive the target picture, the collection time of the target picture, and first attribute information of the target picture sent by the video source device; model the target picture and extract second attribute information of the target picture; and determine the attribute information of the target picture according to the first attribute information of the target picture and the second attribute information of the target picture.

In an optional implementation manner, the obtaining unit 1240 is specifically configured to perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and model the target picture and extract attribute information of the target picture.

In an optional implementation manner, the feature information of the target to be searched includes attribute information of the target to be searched; and the search unit 1220 is specifically configured to search for a matching first target picture in the picture information database according to the attribute information of the target to be searched.

In an optional implementation manner, the feature information of the target to be searched includes a target picture to be searched; the search unit 1220 is specifically configured to model the target picture to be searched and extract attribute information of the target picture to be searched; and search for the matching first target picture in the picture information database according to the attribute information of the target picture to be searched.

In an optional implementation manner, the feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; and the search unit 1220 is specifically configured to model the target picture to be searched and extract fourth attribute information of the target picture to be searched; determine attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and search for a matching first target picture in the picture information database according to the attribute information of the target picture to be searched.

In an optional implementation manner, the target search request further carries a search time period range; the search unit 1220 is specifically configured to filter the target pictures in the picture information database according to the search time period range to obtain a second target picture whose acquisition time is within the search time period range; and search for a matching first target picture from the second target picture according to the feature information of the target to be searched.

In an optional implementation manner, the target search request further carries a search channel number, and the target picture information further includes a channel number of the target picture; and the search unit 1220 is specifically configured to filter the target pictures in the picture information database according to the search channel number to obtain a third target picture whose channel number is consistent with the search channel number; and search for a matching first target picture from the third target picture according to the feature information of the target to be searched.
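The two optional pre-filters described above (search time period range and search channel number) can be sketched as a narrowing step that runs before feature matching. The sketch assumes an in-memory list in place of the picture information database. `PictureRecord`, `search_pictures`, and the `matches` predicate are hypothetical names, and the `feature` field is a stand-in for real attribute information.

```python
from dataclasses import dataclass


@dataclass
class PictureRecord:
    picture_id: str
    acquisition_time: float  # seconds on an arbitrary timeline (assumption)
    channel: int
    feature: str             # stand-in for the stored attribute information


def search_pictures(records, matches, time_range=None, channel=None):
    """Apply the optional filters, then keep only feature-matching records.

    `matches` is a caller-supplied predicate standing in for the feature
    comparison, which the application leaves open.
    """
    candidates = records
    if time_range is not None:
        start, end = time_range
        candidates = [r for r in candidates if start <= r.acquisition_time <= end]
    if channel is not None:
        candidates = [r for r in candidates if r.channel == channel]
    return [r for r in candidates if matches(r)]


# Hypothetical sample data, for illustration only.
records = [
    PictureRecord("p1", 10.0, 1, "red"),
    PictureRecord("p2", 20.0, 2, "red"),
    PictureRecord("p3", 30.0, 1, "blue"),
    PictureRecord("p4", 40.0, 1, "red"),
]


def is_red(record):
    return record.feature == "red"


by_time = search_pictures(records, is_red, time_range=(15.0, 45.0))
by_channel = search_pictures(records, is_red, channel=1)
both = search_pictures(records, is_red, time_range=(15.0, 45.0), channel=1)
print([r.picture_id for r in both])
```

Filtering on time and channel first reduces the number of records that need the comparatively expensive feature comparison.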

In an optional implementation manner, the target picture is a face picture, and the target search request is a face search request.

In an optional implementation manner, the target picture is a vehicle picture, and the target search request is a vehicle search request.

In an optional implementation manner, the processing unit 1230 is specifically configured to sort the first target pictures in order of acquisition time from earliest to latest; and generate the video summary according to the sorted first target pictures.

In an optional implementation manner, the processing unit 1230 is specifically configured to determine, for each first target picture, a target video clip corresponding to the first target picture, where the target video clip is the recording data between the n-th second before the acquisition time corresponding to the first target picture and the m-th second after the acquisition time corresponding to the first target picture; and generate a video summary according to each of the target video clips.
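As an illustration of how a clip window might be derived from each acquisition time, the following sketch sorts the matched pictures by acquisition time and computes the interval from n seconds before to m seconds after each one. It assumes times are plain numbers of seconds. `Clip` and `clips_for_pictures` are hypothetical names, and the values of n and m are arbitrary.

```python
from typing import NamedTuple


class Clip(NamedTuple):
    channel: int
    start: float  # acquisition time minus n seconds
    end: float    # acquisition time plus m seconds


def clips_for_pictures(pictures, n, m):
    """Build one clip window per picture, ordered by acquisition time.

    `pictures` is a list of (channel, acquisition_time) pairs; this layout is
    an assumption made for the sketch.
    """
    ordered = sorted(pictures, key=lambda p: p[1])
    return [Clip(channel, t - n, t + m) for channel, t in ordered]


# Hypothetical matched pictures: (channel, acquisition time in seconds).
pictures = [(2, 300.0), (1, 120.0), (1, 200.0)]
clips = clips_for_pictures(pictures, n=5, m=10)
print(clips)
```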

In an optional implementation manner, the processing unit 1230 is specifically configured to: when there are multiple first target pictures with the same acquisition time, for any first target picture in the multiple first target pictures, determine the start time point and the end time point of the video clip corresponding to the first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture, and the end time point is the m-th second after the acquisition time corresponding to the first target picture; search whether the I frame at the start time point and the I frame at the end time point exist in the recording data of the video data channel to which the first target picture belongs; and if both the I frame at the start time point and the I frame at the end time point exist, discard the remaining first target pictures in the multiple first target pictures, and determine the video clip corresponding to the first target picture as the target video clip.

In an optional implementation manner, the processing unit 1230 is further configured to: if the I frame at the start time point does not exist, increase the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeat the above search step until the I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time; if the I frame at the end time point does not exist, reduce the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeat the above search step until the I frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time; select the longest video clip among the video clips respectively corresponding to the multiple first target pictures as the target video clip; and discard the first target pictures, among the multiple first target pictures, that correspond to the remaining video clips.
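The I-frame handling in the two preceding paragraphs can be sketched as follows. Several simplifications are assumed: time points are whole seconds, the I frames of each channel are modelled as a set of the seconds at which an I frame exists, and the adjusted time point is clamped so that it never passes the acquisition time. The names `snap_clip`, `pick_clip`, and `i_frames_by_channel` are hypothetical.

```python
def snap_clip(t, n, m, x, i_frames):
    """Move the clip boundaries toward t in steps of x until an I frame is hit.

    The start moves forward and the end moves backward. Each stops when an
    I frame exists at that second or when it reaches the acquisition time t.
    """
    start, end = t - n, t + m
    while start not in i_frames and start < t:
        start = min(start + x, t)  # clamping at t is an assumption
    while end not in i_frames and end > t:
        end = max(end - x, t)
    return start, end


def pick_clip(t, channels, n, m, x, i_frames_by_channel):
    """Choose one clip among pictures that share the acquisition time t.

    A channel with I frames at both original boundaries is used immediately.
    Otherwise the longest adjusted clip is kept and the others are dropped.
    """
    best = None
    for channel in channels:
        frames = i_frames_by_channel[channel]
        if (t - n) in frames and (t + m) in frames:
            return channel, t - n, t + m
        start, end = snap_clip(t, n, m, x, frames)
        if best is None or (end - start) > (best[2] - best[1]):
            best = (channel, start, end)
    return best


# Hypothetical I-frame positions (seconds) per channel.
i_frames_by_channel = {1: {96, 104}, 2: {98, 105}, 3: {95, 105}}
adjusted = pick_clip(100, [1, 2], n=5, m=5, x=1, i_frames_by_channel=i_frames_by_channel)
exact = pick_clip(100, [1, 3], n=5, m=5, x=1, i_frames_by_channel=i_frames_by_channel)
print(adjusted, exact)
```

Starting and ending a clip on an I frame matters because an I frame can be decoded without reference to other frames, so playback of the clip can begin cleanly at that point.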

In an optional implementation manner, the processing unit 1230 is specifically configured to filter the target video clips according to the start time point and the end time point of each target video clip to remove time-repeated video data; and generate the video summary according to the filtered target video clips.

In an optional implementation manner, the processing unit 1230 is specifically configured to sort the target video clips according to the start time point of each of the target video clips; and for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merge the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip, and the end time point of the merged video clip is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, use the end time point of the first target video clip as the start time point of the second target video clip, or use the start time point of the second target video clip as the end time point of the first target video clip; wherein the start time point of the first target video clip is smaller than the start time point of the second target video clip.
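The overlap handling described above can be sketched as a single pass over the clips sorted by start time. Of the two alternatives given for clips from different channels, the sketch picks the first (the later clip is trimmed to start where the earlier one ends). Two further choices are assumptions that are not stated in the application: the merged end time is the later of the two end times, and a clip that is fully covered by the previous one is dropped. `Clip` and `remove_overlap` are hypothetical names.

```python
from typing import NamedTuple


class Clip(NamedTuple):
    channel: int
    start: float
    end: float


def remove_overlap(clips):
    """Merge or trim adjacent clips so no time span appears twice."""
    result = []
    for clip in sorted(clips, key=lambda c: c.start):
        if result and result[-1].end >= clip.start:
            prev = result[-1]
            if prev.channel == clip.channel:
                # Same channel: merge into one clip. Using max() guards against
                # a clip contained in the previous one (assumption).
                result[-1] = Clip(prev.channel, prev.start, max(prev.end, clip.end))
            else:
                # Different channels: start the later clip at the earlier end.
                trimmed = Clip(clip.channel, prev.end, clip.end)
                if trimmed.end > trimmed.start:
                    result.append(trimmed)
        else:
            result.append(clip)
    return result


# Hypothetical clips: (channel, start second, end second).
clips = [Clip(1, 8, 15), Clip(1, 0, 10), Clip(2, 12, 20), Clip(2, 30, 35)]
filtered = remove_overlap(clips)
print(filtered)
```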

Please refer to FIG. 14, which is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application. The electronic device may include a processor 1401, a communication interface 1402, a memory 1403, and a communication bus 1404. The processor 1401, the communication interface 1402, and the memory 1403 communicate with each other through the communication bus 1404. A computer program is stored in the memory 1403, and the processor 1401 can execute the program stored in the memory 1403 to perform the video summary generating method described above.

The memory 1403 mentioned herein may be any electronic, magnetic, optical, or other physical storage device, and may contain or store information such as executable instructions and data. For example, the memory 1403 may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (such as a hard disk drive), a solid state drive, any type of storage disk (such as an optical disc or DVD), or a similar storage medium, or a combination thereof.

An embodiment of the present application further provides a machine-readable storage medium storing a computer program, such as the memory 1403 in FIG. 14. The computer program may be executed by the processor 1401 in the electronic device shown in FIG. 14 to implement the video summary generating method described above.

It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "including", "comprising", or any other variation thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.

The above are only preferred embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the scope of protection of this application.

Claims

A video summary generating method, comprising:

Receiving a target search request, where the target search request carries characteristic information of a target to be searched;

Searching for a first target picture that matches the feature information of the target to be searched;

Generating a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
The method according to claim 1, before searching for the first target picture matching the feature information of the target to be searched, further comprising:

Acquiring target picture information in video data of a video source device, where the target picture information includes the target picture, a collection time of the target picture, and attribute information of the target picture;

Saving the target picture information to a picture information database.
The method according to claim 2, wherein the acquiring the target picture information in the video data of the video source device comprises:

Receiving the target picture information sent by the video source device.
The method according to claim 2, wherein the acquiring the target picture information in the video data of the video source device comprises:

Receiving the target picture and the acquisition time of the target picture sent by the video source device;

Modeling the target picture, and extracting attribute information of the target picture.
The method according to claim 2, wherein the acquiring the target picture information in the video data of the video source device comprises:

Receiving the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device;

Modeling the target picture, and extracting second attribute information of the target picture;

Determining the attribute information of the target picture according to the first attribute information of the target picture and the second attribute information of the target picture.
The method according to claim 2, wherein the acquiring the target picture information in the video data of the video source device comprises:

Performing target detection on the video data provided by the video source device to obtain the target picture and the acquisition time of the target picture in the video data;

Modeling the target picture, and extracting attribute information of the target picture.
The method according to claim 2, wherein the feature information of the target to be searched includes attribute information of the target to be searched;

Searching for the first target picture that matches the feature information of the target to be searched includes:

Searching for the matching first target picture in the picture information database according to the attribute information of the target to be searched.
The method according to claim 2, wherein the feature information of the target to be searched comprises a target picture to be searched;

Searching for the first target picture that matches the feature information of the target to be searched includes:

Modeling the target picture to be searched, and extracting attribute information of the target picture to be searched;

Searching for the matching first target picture in the picture information database according to the attribute information of the target picture to be searched.
The method according to claim 2, wherein the feature information of the target to be searched includes target picture to be searched and third attribute information of the target picture to be searched;

Searching for the first target picture that matches the feature information of the target to be searched includes:

Modeling the target picture to be searched, and extracting fourth attribute information of the target picture to be searched;

Determining attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information;

Searching for the matching first target picture in the picture information database according to the attribute information of the target picture to be searched.
The method according to claim 2, wherein the target search request further carries a search time period range;

Searching for the first target picture that matches the feature information of the target to be searched includes:

Filtering the target pictures in the picture information database according to the range of the search time period to obtain a second target picture whose acquisition time is within the range of the search time period;

Searching for a matching first target picture from the second target picture according to the feature information of the target to be searched.
The method according to claim 2, wherein the target search request further carries a search channel number, and the target picture information further includes a channel number of the target picture;

Searching for the first target picture that matches the feature information of the target to be searched includes:

Filtering the target picture in the picture information database according to the search channel number to obtain a third target picture whose channel number is consistent with the search channel number;

Searching for a matching first target picture from the third target picture according to the feature information of the target to be searched.
The method according to claim 2, wherein the target picture is a face picture, and the target search request is a face search request.
The method according to claim 2, wherein the target picture is a vehicle picture, and the target search request is a vehicle search request.
The method according to claim 1, wherein generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture comprises:

Sorting the first target pictures in order of acquisition time from earliest to latest;

Generating the video summary according to the sorted first target pictures.
The method according to claim 1 or 14, wherein generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture comprises:

For each first target picture, determining a target video clip corresponding to the first target picture, where the target video clip is the video data from the n-th second before the acquisition time corresponding to the first target picture to the m-th second after the acquisition time corresponding to the first target picture;

Generating a video summary according to each of the target video clips.
The method according to claim 15, wherein for each of the first target pictures, determining the target video clip corresponding to the first target picture comprises:

When there are multiple first target pictures with the same acquisition time, for any first target picture in the multiple first target pictures, determining the start time point and the end time point of the video clip corresponding to the first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture, and the end time point is the m-th second after the acquisition time corresponding to the first target picture;

Searching whether the I frame at the start time point and the I frame at the end time point exist in the recording data of the video data channel to which the first target picture belongs;

If both the I frame at the start time point and the I frame at the end time point exist, discarding the remaining first target pictures in the multiple first target pictures, and determining the video clip corresponding to the first target picture as the target video clip.
The method according to claim 16, further comprising:

If the I frame at the start time point does not exist, increasing the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeating the above search step until the I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time;

If the I frame at the end time point does not exist, reducing the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeating the above search step until the I frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time;

Selecting the longest video clip among the video clips respectively corresponding to the multiple first target pictures as the target video clip;

Discarding the first target pictures, among the multiple first target pictures, that correspond to the remaining video clips.
The method according to claim 15, wherein generating the video summary according to each of the target video clips comprises:

Filtering the target video clip according to the start time point and the end time point of each target video clip to remove time-repeated video data;

Generating the video summary according to the filtered target video clip.
The method according to claim 18, wherein filtering the target video clip according to the start time point and the end time point of each of the target video clips comprises:

Sorting each of the target video clips according to the start time point of each of the target video clips;

For the adjacent first target video clip and the second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip,

If the first target video clip and the second target video clip belong to the same video data channel, merging the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip, and the end time point of the merged video clip is the end time point of the second target video clip;

If the first target video clip and the second target video clip belong to different video data channels, using the end time point of the first target video clip as the start time point of the second target video clip, or using the start time point of the second target video clip as the end time point of the first target video clip;

Wherein, the start time point of the first target video clip is smaller than the start time point of the second target video clip.
A video summary generating apparatus, comprising:

A receiving unit, configured to receive a target search request, where the target search request carries characteristic information of a target to be searched;

A search unit, configured to search for a first target picture that matches feature information of the target to be searched;

A processing unit, configured to generate a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
The apparatus according to claim 20, further comprising:

An obtaining unit, configured to obtain target picture information in video data of a video source device, where the target picture information includes the target picture, a collection time of the target picture, and attribute information of the target picture;

A saving unit, configured to save the target picture information to a picture information database.
The device according to claim 21, wherein:

The obtaining unit is specifically configured to receive the target picture information sent by the video source device.
The apparatus according to claim 21, wherein the obtaining unit is specifically configured to:

Receiving the target picture and the acquisition time of the target picture sent by the video source device;

Modeling the target picture, and extracting attribute information of the target picture.
The apparatus according to claim 21, wherein the obtaining unit is specifically configured to:

Receiving the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device;

Modeling the target picture, and extracting second attribute information of the target picture;

Determining the attribute information of the target picture according to the first attribute information of the target picture and the second attribute information of the target picture.
The apparatus according to claim 21, wherein the obtaining unit is specifically configured to:

Performing target detection on the video data provided by the video source device to obtain the target picture and the acquisition time of the target picture in the video data;

Modeling the target picture, and extracting attribute information of the target picture.
The apparatus according to claim 21, wherein the feature information of the target to be searched includes attribute information of the target to be searched;

The search unit is specifically configured to search for a matching first target picture in the picture information database according to attribute information of the target to be searched.
The device according to claim 21, wherein the feature information of the target to be searched includes a target picture to be searched;

The search unit is specifically configured to:

Modeling the target picture to be searched, and extracting attribute information of the target picture to be searched;

Searching for the matching first target picture in the picture information database according to the attribute information of the target picture to be searched.
The apparatus according to claim 21, wherein the feature information of the target to be searched includes target picture to be searched and third attribute information of the target picture to be searched;

The search unit is specifically configured to:

Modeling the target picture to be searched, and extracting fourth attribute information of the target picture to be searched;

Determining attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information;

Searching for the matching first target picture in the picture information database according to the attribute information of the target picture to be searched.
The apparatus according to claim 21, wherein the target search request further carries a search time period range;

The search unit is specifically configured to:

Filtering the target pictures in the picture information database according to the search time period range to obtain a second target picture whose acquisition time is within the search time period range;

Searching for a matching first target picture from the second target picture according to the feature information of the target to be searched.
The device according to claim 21, wherein the target search request further carries a search channel number, and the target picture information further includes a channel number of the target picture;

The search unit is specifically configured to:

Filtering the target pictures in the picture information database according to the search channel number to obtain a third target picture with the same channel number as the search channel number;

Searching for a matching first target picture from the third target picture according to the feature information of the target to be searched.
The device according to claim 21, wherein the target picture is a face picture, and the target search request is a face search request.
The device according to claim 21, wherein the target picture is a vehicle picture, and the target search request is a vehicle search request.
The apparatus according to claim 20, wherein the processing unit is specifically configured to:

Sorting the first target pictures in order of acquisition time from earliest to latest;

Generating the video summary according to the sorted first target pictures.
The device according to claim 20 or 33, wherein the processing unit is specifically configured to:

For each first target picture, determining a target video clip corresponding to the first target picture, where the target video clip is the video data from the n-th second before the acquisition time corresponding to the first target picture to the m-th second after the acquisition time corresponding to the first target picture;

Generating a video summary according to each of the target video clips.
The apparatus according to claim 34, wherein the processing unit is specifically configured to:

When there are multiple first target pictures with the same acquisition time, for any first target picture in the multiple first target pictures, determining the start time point and the end time point of the video clip corresponding to the first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture, and the end time point is the m-th second after the acquisition time corresponding to the first target picture;

Searching whether the I frame at the start time point and the I frame at the end time point exist in the recording data of the video data channel to which the first target picture belongs;

If both the I frame at the start time point and the I frame at the end time point exist, discarding the remaining first target pictures in the multiple first target pictures, and determining the video clip corresponding to the first target picture as the target video clip.
The apparatus according to claim 35, wherein the processing unit is further configured to:

If there is no I frame at the start time point, increasing the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeating the above search step until an I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time;

If there is no I frame at the end time point, reducing the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeating the above search step until an I frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time;

Selecting, from among the video clips corresponding to the multiple first target pictures, the longest video clip as the target video clip;

Discarding the first target pictures corresponding to the remaining video clips.
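The I-frame alignment and longest-clip selection recited above might be sketched as below. This is a minimal sketch under stated assumptions, not the applicant's implementation: `has_iframe` is a hypothetical callback that reports whether the channel's recording data contains an I frame at a given time point, times are plain seconds, and the clamping to the acquisition time (for the case where n or m is not a multiple of x) is an added safeguard not spelled out in the claim.

```python
def snap_to_iframes(start, end, capture_time, has_iframe, x):
    """Shrink [start, end] toward capture_time in steps of x seconds.

    Each boundary stops moving once an I frame exists at it, or once it
    reaches capture_time. has_iframe(t) is a hypothetical lookup into the
    recording data of the picture's video data channel.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    while start < capture_time and not has_iframe(start):
        # Assumption: clamp so the start never moves past the capture time.
        start = min(start + x, capture_time)
    while end > capture_time and not has_iframe(end):
        # Assumption: clamp so the end never moves before the capture time.
        end = max(end - x, capture_time)
    return start, end


def pick_clip(candidates, n, m, x):
    """Choose the longest aligned clip among pictures sharing a capture time.

    candidates is a list of (channel, capture_time, has_iframe) tuples.
    Returns (channel, start, end); the other candidates are discarded.
    Ties keep the earlier candidate, which is an arbitrary choice here.
    """
    best = None
    for channel, t, has_iframe in candidates:
        start, end = snap_to_iframes(t - n, t + m, t, has_iframe, x)
        if best is None or (end - start) > (best[2] - best[1]):
            best = (channel, start, end)
    return best
```

A candidate whose original start and end time points both carry I frames keeps its full length, so it is never shorter than any other candidate in this sketch.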
The apparatus according to claim 34, wherein the processing unit is specifically configured to:

Filtering the target video clips according to the start time point and the end time point of each target video clip, to remove video data that is repeated in time;

Generating the video summary according to the filtered target video clips.
The apparatus according to claim 37, wherein the processing unit is specifically configured to:

Sorting each of the target video clips according to the start time point of each of the target video clips;

For an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is later than or equal to the start time point of the second target video clip:

If the first target video clip and the second target video clip belong to the same video data channel, merging the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip, and the end time point is the end time point of the second target video clip;

If the first target video clip and the second target video clip belong to different video data channels, using the end time point of the first target video clip as the start time point of the second target video clip, or using the start time point of the second target video clip as the end time point of the first target video clip;

Wherein the start time point of the first target video clip is earlier than the start time point of the second target video clip.
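The overlap handling recited above can be illustrated with the following sketch. The `Clip` record and the function name `remove_overlaps` are hypothetical. For clips on different channels the sketch takes the first of the two alternatives in the claim (the second clip starts where the first ends). Two details are assumptions beyond the claim text: the merged end uses `max()` in case the second clip lies entirely inside the first, and a second clip left with no duration after trimming is dropped.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Clip:
    """Hypothetical record for one target video clip."""
    channel: int
    start: float
    end: float


def remove_overlaps(clips):
    """Sort clips by start time and remove time-repeated video data."""
    result = []
    for clip in sorted(clips, key=lambda c: c.start):
        # No overlap: previous clip ends strictly before this one starts.
        if not result or result[-1].end < clip.start:
            result.append(clip)
            continue
        prev = result[-1]
        if prev.channel == clip.channel:
            # Same channel: merge. The claim uses the second clip's end;
            # max() is an assumption covering a nested second clip.
            result[-1] = replace(prev, end=max(prev.end, clip.end))
        else:
            # Different channels: start the second clip at the first's end.
            trimmed = replace(clip, start=prev.end)
            if trimmed.start < trimmed.end:  # assumption: drop empty clips
                result.append(trimmed)
    return result
```

Trimming rather than merging across channels keeps each channel's footage separate while ensuring that no instant appears twice in the summary.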
An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory complete communication with each other through the communication bus;

The memory is used to store a computer program;

The processor is configured to implement the method steps according to any one of claims 1 to 19 when executing the computer program stored in the memory.
A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the method steps according to any one of claims 1-19 are implemented.