WO2020048324A1 - Video abstract generation method and apparatus, and electronic device and readable storage medium

Info

Publication number
WO2020048324A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
picture
target picture
video
time point
Application number
PCT/CN2019/102073
Other languages
French (fr)
Chinese (zh)
Inventor
韩巧玲
Original Assignee
杭州海康威视数字技术股份有限公司
Priority claimed from CN201811026894.6A (CN110876090B)
Priority claimed from CN201811027515.5A (CN110876029B)
Priority claimed from CN201811025858.8A (CN110876092B)
Priority claimed from CN201811027494.7A (CN110929095A)
Application filed by 杭州海康威视数字技术股份有限公司
Publication of WO2020048324A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying

Definitions

  • The present application relates to video surveillance technology, and in particular to a method, an apparatus, an electronic device, and a readable storage medium for generating a video summary.
  • As an important technical means of social security management, video surveillance systems are increasingly deployed in the field of social security maintenance.
  • As the number of deployed surveillance devices increases and the scope of deployment expands, the amount of stored video recording data also increases. Finding out at which time and place a specific target (a person, a vehicle, etc.) appeared often requires manually searching a large amount of video recording data, which takes a long time and is prone to omissions. Video positioning and integrated display therefore suffer from an efficiency bottleneck and a risk of incomplete results.
  • The present application provides a method and an apparatus for generating a video summary.
  • A method for generating a video digest is provided, including: receiving a target search request, where the target search request carries feature information of a target to be searched; searching for a first target picture that matches the feature information of the target to be searched; and generating a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
  • Before searching for the first target picture that matches the feature information of the target to be searched, the method further includes: obtaining target picture information in video data of a video source device, where the target picture information includes the target picture, the acquisition time of the target picture, and attribute information of the target picture; and saving the target picture information to a picture information database.
  • Acquiring the target picture information in the video data of the video source device includes: receiving the target picture information sent by the video source device.
  • Acquiring the target picture information in the video data of the video source device includes: receiving the target picture and the acquisition time of the target picture sent by the video source device; and modeling the target picture and extracting attribute information of the target picture.
  • Acquiring the target picture information in the video data of the video source device includes: receiving the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device; modeling the target picture and extracting second attribute information of the target picture; and determining the attribute information of the target picture according to the first attribute information and the second attribute information of the target picture.
  • Acquiring the target picture information in the video data of the video source device includes: performing target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and modeling the target picture and extracting attribute information of the target picture.
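  • The picture information record and database described above can be sketched as follows. This is only an illustrative sketch: the names `TargetPictureInfo`, `PictureInfoDatabase`, and `merge_attributes` are hypothetical, the database is a plain in-memory list, and the rule for combining first and second attribute information (device-reported values take precedence, modeled values fill the gaps) is an assumption, since the text does not say how the two are combined.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class TargetPictureInfo:
    """One record of target picture information (hypothetical layout)."""
    picture: bytes                       # encoded target picture
    acquisition_time: datetime           # when the picture was captured
    attributes: dict = field(default_factory=dict)
    channel: Optional[int] = None        # channel number, if known


def merge_attributes(first: dict, second: dict) -> dict:
    """Combine device-reported (first) and modeled (second) attributes.

    Assumed rule: values from `first` win, `second` fills in missing keys.
    """
    merged = dict(second)
    merged.update(first)
    return merged


class PictureInfoDatabase:
    """Minimal in-memory stand-in for the picture information database."""

    def __init__(self) -> None:
        self._records: list = []

    def save(self, info: TargetPictureInfo) -> None:
        self._records.append(info)

    def all(self) -> list:
        return list(self._records)
```

A real deployment would presumably persist the records and index them by acquisition time and channel number so that the later search steps stay fast.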
  • The feature information of the target to be searched includes attribute information of the target to be searched; searching for the first target picture that matches the feature information of the target to be searched includes: searching the picture information database for the matching first target picture according to the attribute information of the target to be searched.
  • The feature information of the target to be searched includes a target picture to be searched; searching for the first target picture that matches the feature information of the target to be searched includes: modeling the target picture to be searched and extracting attribute information of the target picture to be searched; and searching the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
  • The feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; searching for the first target picture that matches the feature information of the target to be searched includes: modeling the target picture to be searched and extracting fourth attribute information of the target picture to be searched; determining the attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and searching the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
  • The target search request further carries a search time range; searching for the first target picture that matches the feature information of the target to be searched includes: filtering the target pictures in the picture information database according to the search time range to obtain second target pictures whose acquisition time is within the search time range; and searching the second target pictures for the first target picture that matches the feature information of the target to be searched.
  • The target search request further carries a search channel number, and the target picture information further includes a channel number of the target picture; searching for the first target picture that matches the feature information of the target to be searched includes: filtering the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is consistent with the search channel number; and searching the third target pictures for the first target picture that matches the feature information of the target to be searched.
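  • The filter-then-match search described above can be sketched as follows. This is an illustration under assumptions: records are plain dictionaries with hypothetical keys `time`, `channel`, and `attrs`, and "matching" is taken to mean exact equality on every requested attribute, which the text does not specify.

```python
from datetime import datetime


def search_pictures(records, wanted_attrs, time_range=None, channel=None):
    """Return the records that match the requested attributes.

    Optional filters are applied first: a (start, end) search time range
    and a search channel number. Attribute matching by equality is an
    assumption made for this sketch.
    """
    candidates = records
    if time_range is not None:
        start, end = time_range
        # "Second target pictures": acquisition time inside the range.
        candidates = [r for r in candidates if start <= r["time"] <= end]
    if channel is not None:
        # "Third target pictures": channel number equals the search channel.
        candidates = [r for r in candidates if r["channel"] == channel]
    return [
        r for r in candidates
        if all(r["attrs"].get(key) == value for key, value in wanted_attrs.items())
    ]
```

Filtering by time range and channel number before comparing attributes shrinks the candidate set, which is presumably why the request may carry these optional fields.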
  • The target picture is a face picture, and the target search request is a face search request.
  • The target picture is a vehicle picture, and the target search request is a vehicle search request.
  • Generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: sorting the first target pictures in order of acquisition time from earliest to latest; and generating the video summary from the sorted first target pictures.
  • Generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: for each first target picture, determining a target video clip corresponding to the first target picture, where the target video clip is the video data between the n-th second before and the m-th second after the acquisition time of the first target picture; and generating the video summary according to each target video clip.
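  • The clip boundaries follow directly from the acquisition time. A minimal sketch, assuming times are `datetime` values and n and m are non-negative numbers of seconds (the function name `clip_window` is hypothetical):

```python
from datetime import datetime, timedelta


def clip_window(acquisition_time: datetime, n: float, m: float):
    """Return (start, end) of the target video clip for one picture.

    start = acquisition time minus n seconds
    end   = acquisition time plus m seconds
    """
    if n < 0 or m < 0:
        raise ValueError("n and m must be non-negative")
    start = acquisition_time - timedelta(seconds=n)
    end = acquisition_time + timedelta(seconds=m)
    return start, end
```

For example, a picture captured at 12:00:10 with n = 5 and m = 10 yields a clip from 12:00:05 to 12:00:20.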
  • Determining the target video clip corresponding to the first target picture includes: when there are multiple first target pictures with the same acquisition time, for any first target picture among the multiple first target pictures, determining a start time point and an end time point of the video clip corresponding to the first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture and the end time point is the m-th second after that acquisition time; searching the recording data of the video data channel to which the first target picture belongs for an I frame at the start time point and an I frame at the end time point; and, if both the I frame at the start time point and the I frame at the end time point exist, discarding the remaining first target pictures among the multiple first target pictures and determining the video clip corresponding to the first target picture as the target video clip.
  • The method further includes: if there is no I frame at the start time point, increasing the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeating the above search step until an I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time; if there is no I frame at the end time point, decreasing the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeating the above search step until an I frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time; and, among the video clips corresponding to the multiple first target pictures, selecting the video clip with the longest duration as the target video clip and discarding the remaining video clips.
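  • The I-frame adjustment above can be sketched as follows. This is only a sketch under assumptions: time points are integer seconds, `iframe_times` is a hypothetical set of the seconds at which the channel's recording data has an I frame, and a step that would cross the acquisition time is clamped to it, which the text does not spell out.

```python
def align_to_iframes(acq, n, m, x, iframe_times):
    """Move clip boundaries toward the acquisition time until each
    boundary lands on an I frame or reaches the acquisition time."""
    if x <= 0:
        raise ValueError("x must be positive")
    start, end = acq - n, acq + m
    # Start point: step forward by x seconds while no I frame is found.
    while start < acq and start not in iframe_times:
        start = min(start + x, acq)
    # End point: step backward by x seconds while no I frame is found.
    while end > acq and end not in iframe_times:
        end = max(end - x, acq)
    return start, end


def longest_clip(clips):
    """Among clips of pictures sharing an acquisition time, keep the longest."""
    return max(clips, key=lambda clip: clip[1] - clip[0])
```

Starting a clip on an I frame matters because only an I frame can be decoded without reference to earlier frames, so playback of a clip that starts elsewhere would typically be corrupted until the next I frame.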
  • Generating the video summary according to each of the target video clips includes: filtering the target video clips according to the start time point and the end time point of each target video clip to remove time-repeated video data; and generating the video summary according to the filtered target video clips.
  • Filtering the target video clips according to the start time point and the end time point of each target video clip includes: sorting the target video clips according to the start time point of each target video clip; and, for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merging the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip and the end time point of the merged video clip is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, using the end time point of the first target video clip as the start time point of the second target video clip, or using the start time point of the second target video clip as the end time point of the first target video clip; where the start time point of the first target video clip is earlier than the start time point of the second target video clip.
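  • The overlap removal above can be sketched as follows. Assumptions for this sketch: each clip is a hypothetical `(start, end, channel)` tuple with numeric time points; for different channels the first of the two options is used (the later clip starts where the earlier one ends); a merged clip ends at the later of the two end points so that a fully contained clip cannot shorten it; and a clip left empty by trimming is dropped. The last two rules are additions not stated in the text.

```python
def remove_time_overlap(clips):
    """Sort clips by start time and remove time-repeated video data."""
    ordered = sorted(clips, key=lambda clip: clip[0])
    result = []
    for start, end, channel in ordered:
        if result and result[-1][1] >= start:
            prev_start, prev_end, prev_channel = result[-1]
            if prev_channel == channel:
                # Same channel: merge into one continuous clip.
                result[-1] = (prev_start, max(prev_end, end), channel)
                continue
            # Different channel: begin the later clip at the earlier clip's end.
            start = prev_end
            if start >= end:
                continue  # nothing left of the later clip
        result.append((start, end, channel))
    return result
```

With clips `(0, 10, 1)`, `(5, 15, 1)`, `(12, 20, 2)`, and `(30, 40, 1)`, the first two merge into `(0, 15, 1)`, the third is trimmed to `(15, 20, 2)`, and the last is kept unchanged.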
  • A video digest generating apparatus is provided, including: a receiving unit configured to receive a target search request, where the target search request carries feature information of a target to be searched; a search unit configured to search for a first target picture matching the feature information of the target to be searched; and a processing unit configured to generate a video digest according to the first target picture and the acquisition time corresponding to the first target picture.
  • The apparatus further includes: an obtaining unit configured to obtain target picture information in the video data of a video source device, where the target picture information includes the target picture, the acquisition time of the target picture, and attribute information of the target picture; and a unit configured to save the target picture information to a picture information database.
  • The obtaining unit is specifically configured to receive the target picture information sent by the video source device.
  • The obtaining unit is specifically configured to receive the target picture and the acquisition time of the target picture sent by the video source device, and to model the target picture and extract attribute information of the target picture.
  • The obtaining unit is specifically configured to receive the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device; to model the target picture and extract second attribute information of the target picture; and to determine the attribute information of the target picture according to the first attribute information and the second attribute information of the target picture.
  • The obtaining unit is specifically configured to perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture, and to model the target picture and extract attribute information of the target picture.
  • The feature information of the target to be searched includes attribute information of the target to be searched; the search unit is specifically configured to search the picture information database for the matching first target picture according to the attribute information of the target to be searched.
  • The feature information of the target to be searched includes a target picture to be searched; the search unit is specifically configured to model the target picture to be searched and extract attribute information of the target picture to be searched, and to search the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
  • The feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; the search unit is specifically configured to model the target picture to be searched and extract fourth attribute information of the target picture to be searched; to determine the attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and to search the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
  • The target search request also carries a search time range; the search unit is specifically configured to filter the target pictures in the picture information database according to the search time range to obtain second target pictures whose acquisition time is within the search time range, and to search the second target pictures for the matching first target picture according to the feature information of the target to be searched.
  • The target search request also carries a search channel number, and the target picture information further includes a channel number of the target picture; the search unit is specifically configured to filter the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is consistent with the search channel number, and to search the third target pictures for the matching first target picture according to the feature information of the target to be searched.
  • The target picture is a face picture, and the target search request is a face search request.
  • The target picture is a vehicle picture, and the target search request is a vehicle search request.
  • The processing unit is specifically configured to sort the first target pictures in order of acquisition time from earliest to latest, and to generate the video summary according to the sorted first target pictures.
  • The processing unit is specifically configured to determine, for each first target picture, a target video clip corresponding to the first target picture, where the target video clip is the recording data between the n-th second before and the m-th second after the acquisition time of the first target picture, and to generate a video summary according to each of the target video clips.
  • The processing unit is specifically configured to: when there are multiple first target pictures with the same acquisition time, for any first target picture among the multiple first target pictures, determine the start time point and the end time point of the video clip corresponding to the first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture and the end time point is the m-th second after that acquisition time; search the recording data of the video data channel to which the first target picture belongs for an I frame at the start time point and an I frame at the end time point; and, if both the I frame at the start time point and the I frame at the end time point exist, discard the remaining first target pictures among the multiple first target pictures and determine the video clip corresponding to the first target picture as the target video clip.
  • The processing unit is further configured to: if there is no I frame at the start time point, increase the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeat the above search step until an I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time; if there is no I frame at the end time point, decrease the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeat the above search step until an I frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time; and, among the video clips corresponding to the multiple first target pictures, select the video clip with the longest duration as the target video clip and discard the remaining video clips.
  • The processing unit is specifically configured to filter the target video clips according to the start time point and the end time point of each of the target video clips to remove time-repeated video data, and to generate the video summary according to the filtered target video clips.
  • The processing unit is specifically configured to sort the target video clips according to the start time point of each of the target video clips; and, for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merge the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip and the end time point of the merged video clip is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, use the end time point of the first target video clip as the start time point of the second target video clip, or use the start time point of the second target video clip as the end time point of the first target video clip; where the start time point of the first target video clip is earlier than the start time point of the second target video clip.
  • An electronic device is provided, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is configured to store a computer program; and the processor is configured to implement the steps of the above-mentioned video abstract generation method when executing the computer program stored in the memory.
  • A computer-readable storage medium is provided, in which a computer program is stored; when the computer program is executed by a processor, the steps of the above-mentioned video digest generation method are implemented.
  • The video abstract generating method in the embodiments of the present application receives a target search request, searches for first target pictures that match the feature information of the target to be searched carried in the target search request, and generates a video summary of the target to be searched according to each first target picture and its corresponding acquisition time. This improves the efficiency and accuracy of locating targets in video recordings. Video recordings that do not match the target to be searched are removed, while the continuity of target video tracking is preserved.
  • FIG. 1 is a schematic structural diagram of a video digest generating system according to an exemplary embodiment of the present application
  • FIG. 2 is a schematic flowchart of a video abstract generating method according to an exemplary embodiment of the present application
  • FIG. 3 is a schematic flowchart of generating a picture information database according to an exemplary embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a repeated picture filtering process according to an exemplary embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a video digest generating system according to another exemplary embodiment of the present application.
  • FIG. 6 is a schematic flowchart of extracting attributes of a face picture according to an exemplary embodiment of the present application.
  • FIG. 7 is a schematic flowchart of extracting attributes of a face picture according to another exemplary embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a video digest generating system according to still another exemplary embodiment of the present application.
  • FIG. 9 is a schematic flowchart of extracting a picture attribute of a vehicle according to an exemplary embodiment of the present application.
  • FIG. 10 is a schematic flowchart of extracting a picture attribute of a vehicle according to another exemplary embodiment of the present application.
  • FIG. 11 is a schematic flowchart illustrating a video digest generating system according to another exemplary embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a video digest generating apparatus according to an exemplary embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a video digest generating apparatus according to another exemplary embodiment of the present application.
  • FIG. 14 is a schematic diagram of a hardware structure of an electronic device according to an exemplary embodiment of the present application.
  • The video digest generating system may include a video source device 110 and a search device 120.
  • The video source device 110 may provide video data, and the video data may include real-time video data or video recording data (referred to herein as recording data).
  • The search device 120 may receive a target search request, search the video data of the video source device 110 for a target picture (referred to herein as the first target picture) that matches the feature information of the target to be searched carried in the target search request, and generate a video summary of the target to be searched according to the found first target picture.
  • The video source device 110 may be a front-end video capture device (such as an IPC (Internet Protocol Camera)) or a video recording storage device (such as an NVR (Network Video Recorder)).
  • The search device 120 may be an NVR (with a target search function) or a device deployed in the video surveillance system that is dedicated to target search.
  • When the video source device 110 is an NVR, the video source device 110 and the search device 120 may be the same device.
  • One video source device 110 may provide video data for multiple search devices 120, and one search device 120 may also obtain video data from multiple video source devices 110 (one-to-one is taken as an example in the figure).
  • When the video source device 110 is an IPC, one video source device 110 can correspond to one video data channel; when the video source device 110 is a video recording storage device such as an NVR, one video source device 110 can provide video data (recording data) of multiple video data channels.
  • FIG. 2 is a schematic flowchart of a video abstract generating method according to an embodiment of the present application.
  • The video abstract generating method may be applied to a search device (taking an NVR as an example).
  • The video summary generating method may include the following steps.
  • Step S200: Receive a target search request, where the target search request carries feature information of a target to be searched.
  • The target may include, but is not limited to, a human face, a vehicle, or a license plate.
  • When the target is a human face, the feature information of the target to be searched may include, but is not limited to, one or more of a face picture and structured information of the face (such as whether the person is smiling, whether the person wears glasses, gender, age range, etc.).
  • When the target is a vehicle, the feature information of the target to be searched may include, but is not limited to, one or more of a vehicle picture and characteristic information of the vehicle (such as color, type, logo, brand, etc.).
  • When the target is a license plate, the feature information of the target to be searched may include, but is not limited to, one or more of a license plate picture and characteristic information of the license plate (such as color, position, license plate number, etc.).
  • The target may further include a human body, an animal, and the like.
  • When the target is a human body, the feature information of the target to be searched may include, but is not limited to, one or more of a human body picture and human body characteristic information (such as height, weight, gender, skin color, clothing, etc.).
  • When the target is an animal, the feature information of the target to be searched may include, but is not limited to, one or more of an animal picture and animal characteristic information (such as type, hair color, size, etc.).
  • Step S210: Search for a first target picture that matches the feature information of the target to be searched.
  • When receiving a target search request, the search device may search the video data of the video source device for a first target picture that matches the feature information of the target to be searched.
  • The first target picture and the acquisition time of the first target picture may be used as the first target picture information that matches the feature information of the target to be searched.
  • When the search device searches for the first target picture that matches the feature information of the target to be searched, the search result may be one or more first target pictures, or may be empty.
  • When the search result is empty, that is, when no first target picture matching the feature information of the target to be searched is found, the search device may determine that the target search has failed and return a search failure response message.
  • The acquisition time of the target picture may be the time when the front-end video capture device collects (e.g., captures) the target picture, or the time when the target picture appears in the video image collected by the front-end video capture device.
  • The acquisition time of the target picture may be carried in the target picture (for example, displayed at a specific position in the target picture, such as the lower left corner or the lower right corner), or may be independent of the target picture; its specific implementation is not described herein.
  • Step S220: Generate a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
  • After the search device determines the first target picture that matches the feature information of the target to be searched, it can generate a video summary of the target to be searched according to the first target picture and its corresponding acquisition time.
  • Generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: sorting multiple first target pictures in order of acquisition time from earliest to latest; and generating the video summary from the sorted first target pictures.
  • The search device may sort the first target pictures in order of acquisition time from earliest to latest and generate the video summary directly from the sorted first target pictures.
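  • The outcome of steps S210 and S220 for a picture-based summary can be sketched as follows. This is illustrative only: the function name `build_picture_summary`, the dictionary key `time`, and the shape of the returned response (including the `"status"` field used to signal a search failure) are assumptions, not part of the described method.

```python
from datetime import datetime


def build_picture_summary(first_target_pictures):
    """Order matched pictures by acquisition time, earliest first.

    An empty search result is reported as a search failure; the format
    of the response dictionary is hypothetical.
    """
    if not first_target_pictures:
        return {"status": "search_failed", "summary": []}
    ordered = sorted(first_target_pictures, key=lambda picture: picture["time"])
    return {"status": "ok", "summary": ordered}
```

Because Python's `sorted` is stable, pictures that share an acquisition time keep their original relative order in the summary.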
  • generating a video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: for each first target picture, determining a target video corresponding to the first target picture A segment, where the target video segment is video data from the n-th second before the acquisition time corresponding to the first target picture to the m-th second after the acquisition time of the first target picture; generated according to each target video segment Video summary.
  • When the search device finds multiple first target pictures that match the feature information of the target to be searched, then for each first target picture it may determine the recording data from the n-th second before to the m-th second after the acquisition time of the first target picture as the target video clip corresponding to that first target picture, so as to obtain the video recordings that match the feature information of the target to be searched.
  • The acquisition time of the first target picture is the time at which the first target picture appears in the video data from which it was extracted (that is, the time at which the first target picture was acquired by the video acquisition device).
  • The start time point of the target video clip corresponding to the first target picture (the n-th second before the acquisition time of the first target picture) and the end time point (the m-th second after the acquisition time of the first target picture) can be pre-configured in the search device or carried in the target search request (set by the user according to actual needs, or left at default values).
  • n and m are non-negative numbers.
  • When the video data that the video source device provides to the search device includes video data of multiple video data channels, the search device, when determining the target video clip corresponding to the first target picture, may first determine the video data channel to which the first target picture belongs, and then determine the target video clip within the video data of that channel. That is, among the recording data of that video data channel, the recording data from the n-th second before to the m-th second after the acquisition time of the first target picture is determined as the target video clip corresponding to the first target picture.
  • the first target picture information may further include a channel number of the first target picture, that is, a channel number of a video data channel to which the first target picture belongs.
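  • A minimal sketch of deriving a per-channel clip window from a first target picture is given below. The type names `TargetPicture` and `Clip`, the use of seconds as the time unit, and the clamping of the start time at 0 are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetPicture:
    channel: int        # channel number of the video data channel
    acquired_at: float  # acquisition time in seconds (assumed unit)


@dataclass(frozen=True)
class Clip:
    channel: int
    start: float
    end: float


def clip_for(picture: TargetPicture, n: float, m: float) -> Clip:
    """Clip from n seconds before to m seconds after the acquisition time."""
    if n < 0 or m < 0:
        raise ValueError("n and m must be non-negative")
    # Clamping at 0 is an assumption: a recording cannot start before time 0.
    start = max(0.0, picture.acquired_at - n)
    return Clip(picture.channel, start, picture.acquired_at + m)


clip = clip_for(TargetPicture(channel=2, acquired_at=60.0), n=2, m=3)
```

  • The clip keeps the channel number of its picture, so later steps can look up the recording data in the correct video data channel.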
  • The search device may fuse the target video clips, that is, sort and splice them according to the start time and/or end time of each target video clip, to generate a video summary of the target to be searched.
  • A first target picture that matches the feature information of the target to be searched carried in the target search request is searched for, and the recording data from the n-th second before to the m-th second after the acquisition time corresponding to each first target picture is determined as the target video clip corresponding to that first target picture; the video summary of the target to be searched is then obtained by fusing the target video clips. This improves the efficiency and accuracy of locating targets in video recordings, and preserves the continuity of target tracking in the video while removing recordings that do not match the target to be searched.
  • The video summary can be generated from the target video clips and then decoded and displayed.
  • The method may further include: obtaining target picture information in the video data of the video source device, where the target picture information includes the target picture, the acquisition time of the target picture, and the attribute information of the target picture; and saving the obtained target picture information to the picture information database.
  • The search device may obtain the target picture information in the video data of the video source device in advance and save it to the picture information database. When a target search is later required, it can search the picture information database directly for a first target picture matching the feature information of the target to be searched, which improves target search efficiency.
  • the target picture information may include, but is not limited to, the target picture, feature information of the target picture, and acquisition time of the target picture.
  • the picture information library can be a specified storage space in the search device, or it can be a third-party database.
  • acquiring the target picture information in the video data of the video source device may include: receiving the target picture information sent by the video source device.
  • When the video source device has a target picture acquisition function (such as a target picture capture function or a target detection function) and a target picture analysis function, the video source device can directly obtain target picture information such as the target picture, the acquisition time of the target picture, and the attribute information of the target picture, and send the target picture information to the search device.
  • the search device can receive the target picture information sent by the video source device.
  • The video source device can capture the target picture (recording the capture time, that is, the acquisition time) and analyze the captured target picture to extract its attribute information. The video source device can then send target picture information such as the target picture, the acquisition time of the target picture, and the attribute information of the target picture to a search device, such as an NVR, and the search device stores the received target picture information.
  • Obtaining the target picture information in the video data of the video source device may include: receiving the target picture and the acquisition time of the target picture sent by the video source device; and modeling the target picture and extracting the attribute information of the target picture.
  • When the video source device has a target picture acquisition function (such as a target picture capture function or a target detection function), the video source device can obtain the target picture and the acquisition time of the target picture, and send both to the search device.
  • After the search device receives the target picture and its acquisition time from the video source device, it can model the target picture and extract the attribute information of the target picture. The search device thereby obtains the target picture, the acquisition time of the target picture, and the attribute information of the target picture in the video data of the video source device.
  • The video source device can capture the target picture (recording the capture time, that is, the acquisition time) and send the captured target picture and its acquisition time to a search device such as an NVR.
  • When the search device receives the target picture, it can model the target picture and extract the attribute information of the target picture.
  • the search device can then store the target picture, the acquisition time of the target picture, and the attribute information of the target picture.
  • Obtaining target picture information in the video data of the video source device may include: receiving the target picture, the acquisition time of the target picture, and the first attribute information of the target picture sent by the video source device; modeling the target picture and extracting the second attribute information of the target picture; and determining the attribute information of the target picture according to the first attribute information and the second attribute information of the target picture.
  • When the video source device has a target picture acquisition function (such as a target picture capture function or a target detection function) and a target picture analysis function, the video source device can directly obtain the target picture, the acquisition time of the target picture, and the attribute information of the target picture (referred to herein as the first attribute information of the target picture), and send them to the search device.
  • When the search device receives the target picture sent by the video source device, it can model the received target picture and extract the attribute information of the target picture (referred to herein as the second attribute information of the target picture).
  • The search device may compare the first attribute information of the target picture with the second attribute information of the target picture. Attribute information that exists in only one of the two (in the first attribute information but not the second, or in the second but not the first) is added to the attribute information of the target picture. For attribute information that exists in both, the value from the second attribute information is added to the attribute information of the target picture. The result is the attribute information of the target picture.
  • the search device may also directly use the second attribute information of the target picture as the attribute information of the target picture.
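  • The attribute-merging rule described above can be sketched with plain dictionaries. Representing attribute information as a `dict` of attribute name to value, and the attribute names used in the example, are assumptions made for illustration.

```python
def merge_attributes(first: dict, second: dict) -> dict:
    """Merge device-side (first) and search-side (second) attribute information.

    Attributes present in only one input are kept as they are; for attributes
    present in both, the value from the second attribute information wins.
    """
    merged = dict(first)
    merged.update(second)  # second overrides first on shared keys
    return merged


attrs = merge_attributes(
    {"gender": "male", "glasses": "yes"},
    {"glasses": "no", "age": "adult"},
)
```

  • Using only the second attribute information, as in the alternative above, corresponds to calling `merge_attributes({}, second)`.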
  • Obtaining the target picture, the acquisition time of the target picture, and the attribute information of the target picture in the video data of the video source device may include: performing target detection on the video data provided by the video source device to obtain the target picture and the acquisition time of the target picture in the video data; and modeling the target picture and extracting the attribute information of the target picture.
  • The search device may directly perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture. After obtaining the target picture, the search device can further model it and extract the attribute information of the target picture.
  • the video source device may send the acquired video data to a search device, such as an NVR.
  • When the search device receives the video data sent by the video source device, it can perform target detection on the received video data to obtain the target picture in the video data and the acquisition time of the target picture (the time at which the target picture appears in the video data), and model the target picture to extract its attribute information.
  • the search device can then store the target picture, the acquisition time of the target picture, and the attribute information of the target picture.
  • After the search device obtains the target picture, the acquisition time of the target picture, and the attribute information of the target picture in the video data of the video source device, it may store the obtained target picture, acquisition time, and attribute information.
  • Saving the target picture information to the picture information database may include: for any target picture information of any video data channel, judging whether the picture information database already stores other target picture information that contains the same target as the target picture information, where the other target picture information and the target picture information are from the same video data channel, and the difference between the acquisition time included in the other target picture information and the acquisition time included in the target picture information is less than a preset time threshold; and, when no such other target picture information exists in the picture information database, saving the target picture information to the picture information database.
  • Video data in a video data channel (such as video data obtained by an IPC) is usually video data of a fixed scene. Therefore, for any target picture information of any video data channel (whether obtained by the search device through target detection on the video data provided by the video source device, or received by the search device from the video source device), before saving the target picture information to the picture information database, the search device can determine whether the database already stores other target picture information that meets all of the following conditions: it contains the same target as the target picture; it comes from the same video data channel; and the difference between its acquisition time and the acquisition time included in the target picture information is less than a preset time threshold.
  • If no other target picture information meeting the above conditions exists, the search device may save the target picture information to the picture information database.
  • If the search device determines that other target picture information meeting the above conditions exists in the picture information database, it refuses to save the target picture information, for example by discarding it directly, to reduce redundant picture storage. This reduces the workload of searching for the first target picture in the picture information database and improves search efficiency.
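  • The save-time check can be sketched as follows. How two pictures are judged to show the same target is not specified here, so the example assumes a hypothetical `target_id` produced by an upstream matcher; the in-memory list standing in for the picture information database is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class PictureInfo:
    target_id: str      # hypothetical identity label from an upstream matcher
    channel: int        # video data channel number
    acquired_at: float  # acquisition time in seconds


def save_if_new(db: list, info: PictureInfo, threshold: float) -> bool:
    """Append info to db unless a near-duplicate is already stored.

    A near-duplicate shows the same target, comes from the same channel, and
    has an acquisition time differing by less than threshold seconds.
    Returns True if the information was saved.
    """
    for other in db:
        if (other.target_id == info.target_id
                and other.channel == info.channel
                and abs(other.acquired_at - info.acquired_at) < threshold):
            return False  # redundant picture: discard
    db.append(info)
    return True


db = []
save_if_new(db, PictureInfo("a", 1, 10.0), threshold=3.0)
```

  • A linear scan keeps the sketch short; a real picture information database would more likely index by channel and time.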
  • The feature information of the target to be searched may include a target picture to be searched and/or attribute information of the target picture to be searched.
  • The feature information of the target to be searched includes the target picture to be searched; correspondingly, searching for the first target picture that matches the feature information of the target to be searched includes: modeling the target picture to be searched and extracting the attribute information of the target picture to be searched; and searching the picture information database for a matching first target picture according to the attribute information of the target picture to be searched.
  • The search device may provide a target search function that, according to the target picture to be searched carried in the received target search request, searches for a matching target picture in a search-by-picture mode.
  • The search device may provide a target search request interface, which may include an area for inputting and/or selecting a target picture to be searched; the user inputs and/or selects the target picture to be searched in this interface and submits a target search request.
  • When the search device receives the target search request, it models the target picture to be searched and extracts its attribute information. The search device can then query the stored target picture information according to that attribute information, and determine the target picture information whose attribute information matches the attribute information of the target picture to be searched as the first target picture information.
  • When the search device receives the target search request, it can search for matching first target face pictures in the search-by-picture manner. That is, the face picture carried in the target search request is modeled to obtain a feature model of the face picture; face pictures in the video data of the video source device whose similarity to that feature model is greater than or equal to a preset similarity threshold are determined as first target face pictures; and each first target face picture together with its acquisition time is used as first target face picture information.
  • the similarity threshold may be configured in a search device in advance, or may be carried in a target search request (can be set by a user according to actual needs or a default value is used).
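  • The threshold comparison can be illustrated as below. The similarity measure is not specified in this description; cosine similarity over numeric feature vectors is used here purely as an assumption, as are the function names and the `(picture_id, acquired_at, feature_model)` record layout.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_by_picture(query_model, stored, threshold):
    """Return (picture_id, acquired_at) for every stored face picture whose
    similarity to the query feature model is >= threshold.

    stored: iterable of (picture_id, acquired_at, feature_model) records.
    """
    return [(pid, t) for pid, t, model in stored
            if cosine_similarity(query_model, model) >= threshold]


stored = [("p1", 5.0, [1.0, 0.0]), ("p2", 9.0, [0.0, 1.0]), ("p3", 12.0, [1.0, 0.1])]
hits = search_by_picture([1.0, 0.0], stored, threshold=0.9)
```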
  • The feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; correspondingly, searching for a first target picture that matches the feature information of the target to be searched includes: modeling the target picture to be searched and extracting the fourth attribute information of the target picture to be searched; determining the attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information of the target picture to be searched; and searching the picture information database for a matching first target picture according to the attribute information of the target picture to be searched.
  • The search device may further model the target picture to be searched and extract its attribute information (referred to herein as the fourth attribute information of the target picture to be searched).
  • After the search device obtains the fourth attribute information of the target picture to be searched, it may determine the attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information of the target picture to be searched.
  • The search device may compare the third attribute information of the target picture to be searched with the fourth attribute information of the target picture to be searched. Attribute information that exists in only one of the two (in the third attribute information but not the fourth, or in the fourth but not the third) is added to the attribute information of the target picture to be searched. For attribute information that exists in both, the value from the fourth attribute information is added to the attribute information of the target picture to be searched. The result is the attribute information of the target picture to be searched.
  • When the search device obtains the attribute information of the target picture to be searched, it can query the target picture information in the picture information database according to that attribute information, and determine the target picture information whose attribute information matches it as the first target picture information.
  • the feature information of the target to be searched includes attribute information of the target to be searched.
  • searching for a first target picture that matches the feature information of the target to be searched includes: searching for the matching first target picture in the picture information database according to the attribute information of the target to be searched.
  • the feature information of the target to be searched may also be attribute information of the target picture to be searched.
  • When the search device receives the target search request, it may directly search the picture information database for a matching first target picture according to the attribute information of the target picture to be searched carried in the request.
  • The target search request may also carry a specific filtering attribute, which instructs the search device to first filter the target picture information and then search for the first target picture among the filtered results.
  • the specific filtering attribute may include, but is not limited to, a search time range or / and a search channel number.
  • The target search request also carries a search time range; searching for the first target picture matching the feature information of the target to be searched may include: filtering the target pictures in the picture information database according to the search time range to obtain second target pictures whose acquisition time is within the search time range; and searching the second target pictures for a matching first target picture according to the feature information of the target to be searched.
  • The search device may first filter the target pictures in the picture information database according to the search time range carried in the target search request to obtain the second target pictures whose acquisition time is within the search time range. For example, assuming that the search time range is [t1, t2] (t2 > t1), a second target picture is a target picture whose acquisition time t satisfies t1 ≤ t ≤ t2.
  • When the search device obtains the second target pictures, it may search them for a matching first target picture according to the feature information of the target to be searched.
  • The target search request also carries a search channel number, and the target picture information further includes a channel number of the target picture (that is, the channel number of the video data channel to which the target picture belongs); searching for the first target picture that matches the feature information of the target to be searched includes: filtering the target picture information in the picture information database according to the search channel number to obtain third target pictures whose channel number is the same as the search channel number; and searching the third target pictures for a matching first target picture according to the feature information of the target to be searched.
  • The search device may first filter the target pictures in the picture information database according to the search channel number and the channel number of each target picture in the picture information database, to obtain the third target pictures whose channel number is the same as the search channel number, and then search for the first target picture among the third target pictures.
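  • The two pre-filters (search time range and search channel number) can be sketched together. The dictionary keys `acquired_at` and `channel` and the function name are assumptions made for the example; either filter is skipped when the request does not carry it.

```python
def filter_candidates(pictures, time_range=None, channel=None):
    """Narrow the stored target pictures before feature matching.

    pictures: list of dicts with 'acquired_at' and 'channel' keys (assumed).
    time_range: optional (t1, t2); keeps pictures with t1 <= acquired_at <= t2.
    channel: optional search channel number; keeps pictures from that channel.
    """
    result = pictures
    if time_range is not None:
        t1, t2 = time_range
        result = [p for p in result if t1 <= p["acquired_at"] <= t2]
    if channel is not None:
        result = [p for p in result if p["channel"] == channel]
    return result


pictures = [
    {"id": "p1", "acquired_at": 5.0, "channel": 1},
    {"id": "p2", "acquired_at": 15.0, "channel": 2},
    {"id": "p3", "acquired_at": 20.0, "channel": 1},
]
candidates = filter_candidates(pictures, time_range=(10.0, 20.0), channel=1)
```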
  • the first target picture may be deduplicated.
  • Determining a target video clip corresponding to the first target picture may include: when there are multiple first target pictures with the same acquisition time, for any one of the multiple first target pictures, determining a start time point and an end time point of the video clip corresponding to the first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture and the end time point is the m-th second after that acquisition time; searching the recording data of the video data channel to which the first target picture belongs for an I frame at the start time point and for an I frame at the end time point; and, if both I frames exist, discarding the remaining first target pictures among the multiple first target pictures and determining the video clip corresponding to the first target picture as the target video clip.
  • The search device may determine, according to a preset policy, the start time point and the end time point of the video clip corresponding to each first target picture among the multiple first target pictures. For any first target picture among them, the start time point of the corresponding video clip is the n-th second before the acquisition time corresponding to the first target picture, and the end time point is the m-th second after that acquisition time.
  • After the search device determines the start time point and the end time point, it can search the recording data of the video data channel to which the first target picture belongs for an I frame at the start time point (that is, a key frame at the start time point in the video data channel) and for an I frame at the end time point (that is, a key frame at the end time point in the video data channel). If both I frames exist, the search device may directly determine the video clip corresponding to the first target picture as the target video clip, and discard the remaining first target pictures among the multiple first target pictures.
  • If no I frame exists at the start time point, the search device may increase the start time point of the video clip corresponding to the first target picture by x seconds and search whether an I frame exists at the new start time point, repeating this operation until an I frame at the (updated) start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the start time point of the video clip is the same as the acquisition time of the first target picture.
  • x is a positive number.
  • If no I frame exists at the end time point, the search device may reduce the end time point of the video clip corresponding to the first target picture by x seconds and search whether an I frame exists at the new end time point. If it does not, the search device reduces the end time point by x seconds again and searches again, repeating this operation until an I frame at the (updated) end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the end time point of the video clip is the same as the acquisition time of the first target picture.
  • For example, the start time point may be 0 minutes 58 seconds and the end time point 1 minute 3 seconds. The recording data of the video data channel to which the first target picture belongs is searched for an I frame at 0 minutes 58 seconds. If there is none, the start time point is increased by 1 second to 0 minutes 59 seconds, and the search continues for an I frame at 0 minutes 59 seconds. If there is still none, the start time point is increased by 1 second again to 1 minute 0 seconds, and this time is used as the start time point.
  • Similarly, the recording data of the video data channel to which the first target picture belongs is searched for an I frame at 1 minute 3 seconds. If there is none, the end time point is reduced by 1 second to 1 minute 2 seconds, and the search continues for an I frame at 1 minute 2 seconds. If an I frame exists there, the recording data between 1 minute 0 seconds and 1 minute 2 seconds is used as the video clip corresponding to the first target picture.
  • If the I frame still does not exist, the end time point is reduced by a further 1 second and the check is repeated, until the new end time point reaches 1 minute 0 seconds. In that case, the recording data at 1 minute 0 seconds is used as the video clip corresponding to the first target picture.
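  • The boundary adjustment can be sketched as two loops that step toward the acquisition time. Representing time points as whole seconds and the I-frame index as a collection of second offsets are assumptions made so that membership tests are exact; the function name is hypothetical.

```python
def align_to_i_frames(acquired_at, n, m, i_frame_times, x=1):
    """Move clip boundaries inward until each lands on an I frame.

    acquired_at, n, m, x: whole seconds (assumed), with x > 0.
    i_frame_times: seconds at which the channel's recording has an I frame.
    The start moves forward and the end moves backward by x seconds at a
    time; each stops at an I frame or at the acquisition time.
    """
    i_frames = set(i_frame_times)
    start, end = acquired_at - n, acquired_at + m
    while start < acquired_at and start not in i_frames:
        start = min(start + x, acquired_at)
    while end > acquired_at and end not in i_frames:
        end = max(end - x, acquired_at)
    return start, end


# Acquisition at 60 s with n=2, m=3 and a single I frame at 62 s.
start, end = align_to_i_frames(60, n=2, m=3, i_frame_times=[62])
```

  • With these inputs the start steps 58, 59, 60 and stops at the acquisition time, while the end steps 63, 62 and stops on the I frame, giving the clip from 1 minute 0 seconds to 1 minute 2 seconds.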
  • After the search device determines the start time point and end time point of the video clips corresponding to the multiple first target pictures in the foregoing manner, it can identify the first target picture whose corresponding video clip has the longest duration (the largest difference between the end time point and the start time point), determine the video clip corresponding to that first target picture as the target video clip, and discard the remaining first target pictures among the multiple first target pictures.
  • When more than one of the multiple first target pictures has a corresponding video clip of the longest duration, one first target picture may be selected according to a preset strategy, and the video clip corresponding to the selected first target picture is determined as the target video clip. For example, the first target picture with the earliest start time point or the latest end time point may be selected, or one may be selected at random.
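  • Selecting one clip among pictures that share an acquisition time can be sketched as follows. The `(start, end)` tuple layout, integer seconds, and the `tie_break` parameter names are assumptions; random selection is omitted to keep the example deterministic.

```python
def pick_clip(clips, tie_break="earliest_start"):
    """Pick the longest clip; break ties by a preset strategy.

    clips: non-empty list of (start, end) tuples in whole seconds (assumed),
    one per first target picture sharing the same acquisition time.
    """
    longest = max(end - start for start, end in clips)
    candidates = [c for c in clips if c[1] - c[0] == longest]
    if tie_break == "earliest_start":
        return min(candidates, key=lambda c: c[0])
    if tie_break == "latest_end":
        return max(candidates, key=lambda c: c[1])
    raise ValueError(f"unknown tie_break: {tie_break}")


chosen = pick_clip([(58, 62), (59, 63), (60, 62)])
```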
  • deduplication processing of the multiple first target pictures may also be implemented manually.
  • The search device may display the plurality of first target pictures in a specified interface; the user selects the first target picture to be retained, and the remaining first target pictures with the same acquisition time are discarded.
  • Generating a video summary according to the target video clips may include: filtering the target video clips according to the start time point and the end time point of each target video clip to remove video data with duplicate time; and generating the video summary from the filtered target video clips.
  • The target video clips can be filtered according to the start time point and the end time point of each target video clip to remove video data with duplicate time.
  • Filtering the target video clips according to the start time point and the end time point of each target video clip includes: sorting the target video clips according to the start time point of each target video clip; and, for a first target video clip and a second target video clip among the sorted clips, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merging the first and second target video clips, where the start time point of the merged video clip is the start time point of the first target video clip and its end time point is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, using the end time point of the first target video clip as the start time point of the second target video clip, or using the start time point of the second target video clip as the end time point of the first target video clip.
  • the search device may sort each target video clip according to a start time point of each target video clip.
  • the search device may sort each target video clip by using a bubble sorting method according to the start time point of each target video clip.
  • For a first target video clip and a second target video clip among the sorted clips, where the start time point of the first target video clip is less than the start time point of the second target video clip and the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip, corresponding processing may be performed according to the video data channels to which the first and second target video clips belong.
  • If the two clips belong to the same video data channel, the search device may merge the first target video clip and the second target video clip; the start time point of the merged video clip is the start time point of the first target video clip, and its end time point is the end time point of the second target video clip.
  • If the two clips belong to different video data channels, the search device may use the end time point of the first target video clip as the start time point of the second target video clip, or use the start time point of the second target video clip as the end time point of the first target video clip; the specific implementation is described below in conjunction with a specific example.
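  • The filtering of overlapping clips can be sketched as a single pass over the clips sorted by start time point. The `(channel, start, end)` tuple layout is an assumption. Two further choices in the sketch go beyond the text and are labeled as such: the merged end is taken as the later of the two end time points, and a cross-channel clip that becomes empty after truncation is dropped.

```python
def fuse_clips(clips):
    """Remove duplicate time from target video clips.

    clips: list of (channel, start, end) tuples (assumed layout).
    Same-channel overlaps are merged; for different channels the later
    clip's start is moved to the earlier clip's end.
    """
    fused = []
    for channel, start, end in sorted(clips, key=lambda c: c[1]):
        if fused and fused[-1][2] >= start:
            prev_channel, prev_start, prev_end = fused[-1]
            if prev_channel == channel:
                # Assumption: keep the later end so no footage is lost.
                fused[-1] = (channel, prev_start, max(prev_end, end))
                continue
            start = prev_end  # different channel: trim the overlap
            if start >= end:
                continue      # assumption: drop clips left empty
        fused.append((channel, start, end))
    return fused


summary = fuse_clips([(1, 10, 20), (1, 15, 25), (2, 22, 30), (2, 40, 45)])
```

  • Python's built-in `sorted` is used instead of the bubble sort mentioned above; the resulting order by start time point is the same.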
  • the search device is an NVR
  • the video source device is an IPC
  • the target is a face
  • the target search is a face picture search.
  • the NVR can use structured video analysis technology on the GPU (Graphics Processing Unit) to perform face recognition on the real-time video stream transmitted by the IPC to obtain the face picture information in the real-time video stream.
  • the NVR can receive face picture information from the IPC end.
  • the face picture information may include, but is not limited to, a face picture, structured information of the face picture, the acquisition time of the face picture, the channel number of the face picture, the channel name of the face picture (for recording address information), and so on.
  • the NVR maintains a buffer for each video data channel (referred to as the channel) between the IPC and the NVR, which is used to store the face picture information received within 3 seconds.
  • Once face picture information has been held in the buffer for more than 3 seconds, the NVR deletes it from the buffer.
  • When the NVR obtains new face picture information from a channel (either received directly from the IPC or obtained through its own face detection), it compares that information with the face picture information in the buffer corresponding to the channel. If face picture information of the same face already exists in the buffer, the newly obtained face picture information is discarded; otherwise, it is added to the buffer and saved to the picture information database.
  • the NVR can also overwrite the earliest stored face picture information in the buffer with the newly obtained face picture information, and save the newly obtained face picture information to the picture information database.
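  • The per-channel buffering described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `ChannelBuffer`, the `face_id` key standing in for the "same face" comparison, the capacity of 4, and the rule that the earliest entry is overwritten when the buffer is full are all hypothetical; only the 3-second window comes from the text.

```python
from collections import deque

WINDOW_SECONDS = 3.0  # retention window stated in the text
CAPACITY = 4          # hypothetical buffer size; the text gives no figure


class ChannelBuffer:
    """Short-lived buffer for one video data channel."""

    def __init__(self, window=WINDOW_SECONDS, capacity=CAPACITY):
        self.window = window
        self.capacity = capacity
        self.entries = deque()  # (received_at, face_id), oldest first
        self.database = []      # stands in for the picture information database

    def offer(self, face_id, received_at):
        """Return True if the picture was saved, False if it was discarded."""
        # Drop entries that have been buffered longer than the window.
        while self.entries and received_at - self.entries[0][0] > self.window:
            self.entries.popleft()
        # The same face is already buffered: discard the new picture.
        if any(fid == face_id for _, fid in self.entries):
            return False
        # Assumed rule: when the buffer is full, overwrite the earliest entry.
        if len(self.entries) >= self.capacity:
            self.entries.popleft()
        self.entries.append((received_at, face_id))
        self.database.append((received_at, face_id))
        return True
```

  • A real device would compare face models rather than identifiers; the sketch only shows the discard, add, and overwrite paths.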
  • The flow of the face recognition and picture storage performed by the NVR can be as shown in FIG. 3.
  • the configurable recording fusion parameters may include, but are not limited to, the following parameters.
  • Channel selection: if a single channel is selected, face picture information search and video fusion are performed on that single channel; if multiple channels are checked, face picture information search is performed on the multiple channels and multi-channel video fusion is performed.
  • Search time range: the time range of the face pictures to be searched for.
  • Similarity threshold: for example, if the threshold is 90%, results with a similarity of 90% or above are returned.
  • Duplicate picture filtering mode: there are two modes, automatic and manual. The processing methods of the two modes are explained in point 3.
  • the NVR receives a face search request, and the face search request carries a target face picture and a video fusion parameter.
  • Image search: the NVR models the target face picture, compares it with the face picture information in the picture information database that matches the channel number and search time range carried in the face search request, and calculates the similarity.
  • The NVR lists the face pictures whose similarity with the target face picture is greater than or equal to the similarity threshold.
  • Each such face picture together with its acquisition time constitutes face picture information (hereinafter referred to as the first face picture information).
  • the first face picture information may be recorded in the first face picture information list.
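  • A minimal sketch of how the first face picture information list might be selected, assuming the similarity of each candidate has already been computed elsewhere. The `Candidate` type, its field names, and the function name are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    channel: int        # channel number the picture was captured on
    acquired_at: float  # acquisition time, in seconds
    similarity: float   # 0.0 to 1.0, assumed to be computed by the comparison step


def select_first_face_pictures(candidates, channels, time_range, threshold):
    """Keep candidates on the requested channels, inside the search time
    range, whose similarity is greater than or equal to the threshold."""
    start, end = time_range
    return [
        c for c in candidates
        if c.channel in channels
        and start <= c.acquired_at <= end
        and c.similarity >= threshold
    ]
```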
  • Duplicate face picture filtering: for any first face picture, n seconds are subtracted from the acquisition time of the first face picture to obtain the start time point x of the video clip corresponding to the first face picture, and m seconds are added to the acquisition time of the first face picture to obtain the end time point y of the video clip corresponding to the first face picture.
  • the NVR searches the video recordings of this channel respectively for the I frames at time points x and y.
  • Duplicate picture filtering mode includes manual filtering mode or automatic filtering mode.
  • the manual filtering mode is: the multiple first face pictures are output on the specified interface, the user checks the first face pictures to be retained, and the other first face pictures at the same acquisition time are discarded.
  • the automatic filtering mode is: among the multiple first face pictures, the first face picture corresponding to the longest video clip is retained, and the other first face pictures with the same acquisition time are discarded. When more than one first face picture corresponds to the longest video clip, one of them is retained and the rest are discarded.
  • the first face image information after filtering the repeated pictures is formed into a final first face image information list.
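  • The clip window and the automatic filtering mode can be sketched as follows. The dictionary keys (`acquired_at`, `start`, `end`) and function names are illustrative assumptions, and the sketch takes the clip boundaries as already aligned to I frames.

```python
def clip_window(acquired_at, n, m):
    """Start point x and end point y of the clip around one picture."""
    return acquired_at - n, acquired_at + m


def filter_duplicates_auto(pictures):
    """Among pictures sharing an acquisition time, keep the one whose clip
    is longest; on a tie the first one encountered is kept."""
    best = {}
    for pic in pictures:
        key = pic["acquired_at"]
        kept = best.get(key)
        length = pic["end"] - pic["start"]
        if kept is None or length > kept["end"] - kept["start"]:
            best[key] = pic
    # Return the survivors ordered by acquisition time.
    return sorted(best.values(), key=lambda p: p["acquired_at"])
```

  • The manual mode would replace `filter_duplicates_auto` with a user selection step and is not sketched here.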
  • the schematic flowchart of the NVR generating the final first face picture information list can be shown in FIG. 4.
  • each element of the video clip time period element set includes the following information: the channel number of the channel to which the first face picture belongs, the start time point of the video clip corresponding to the first face picture (a time point at which an I frame exists), and the end time point of the video clip corresponding to the first face picture (a time point at which an I frame exists); the elements are sorted by start time point from earliest to latest.
  • Duplicate video clip filtering: for two adjacent elements of the video clip time period element set (hereinafter referred to as the first element and the second element), assume that the start and end time points of the first element and the second element are [A, B] and [C, D] respectively, where A ≤ C ≤ B. If the first element and the second element include the same channel number, they are combined into one element whose start time point is A and whose end time point is D. If the channel numbers included in the first element and the second element are different, it is determined whether there is an I frame at time point B in the video recording of the channel to which the second element belongs.
  • If there is an I frame at time point B, the start time point of the second element is updated to B, that is, the start and end time points of the first element and the second element become [A, B] and [B, D]. If there is no I frame at time point B, the end time point of the first element is updated to C, that is, the start and end time points of the first element and the second element become [A, C] and [C, D].
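  • The duplicate video clip filtering above can be sketched as follows. The `Element` type and the `has_i_frame(channel, t)` callback are hypothetical stand-ins for the element set and for a lookup into the recording index. The sketch assumes, as in the text's example, that the second element does not end before the first.

```python
from dataclasses import dataclass


@dataclass
class Element:
    channel: int
    start: float
    end: float


def filter_overlaps(elements, has_i_frame):
    """Merge or trim overlapping adjacent elements.

    has_i_frame(channel, t) -> bool is a hypothetical lookup that reports
    whether the recording of `channel` has an I frame at time point t.
    """
    out = []
    for src in sorted(elements, key=lambda e: e.start):
        cur = Element(src.channel, src.start, src.end)  # work on a copy
        if not out or cur.start > out[-1].end:
            out.append(cur)  # no overlap with the previous element
            continue
        prev = out[-1]
        if prev.channel == cur.channel:
            # Same channel: merge into [A, D]; max() is a defensive choice.
            prev.end = max(prev.end, cur.end)
        else:
            if has_i_frame(cur.channel, prev.end):
                cur.start = prev.end   # result: [A, B] and [B, D]
            else:
                prev.end = cur.start   # result: [A, C] and [C, D]
            out.append(cur)
    return out
```

  • Cutting at a time point that has an I frame lets the second clip start decoding cleanly, which is presumably why the text prefers B when the second channel has an I frame there.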
  • the corresponding video data is obtained from the video recording of the corresponding channel, and a video summary is generated based on the obtained video data.
  • the video summary can be downloaded by the user, and the user can export a set of video clip elements used to generate the video summary.
  • by receiving a target search request, searching for the first target pictures that match the feature information of the target to be searched carried in the target search request, and generating a video summary of the target to be searched according to each first target picture and its corresponding collection time, the efficiency and accuracy of locating the target in the video recording are improved, and the consistency of target video tracking is guaranteed.
  • FIG. 5 is a schematic flowchart of a video abstract generating method according to another embodiment of the present application.
  • the video abstract generating method may be applied to a retrieval device.
  • the video digest generation method is directed to a face search request.
  • the video summary generating method may include the following steps.
  • Step S500 Acquire and store a face picture in the video data of the video source device, a collection time of the face picture, and attribute information of the face picture.
  • the retrieval device may acquire and store the face picture information in the video data of the video source device.
  • the face picture information may include, but is not limited to, a face picture, a collection time of the face picture, and attribute information of the face picture.
  • the attribute information of the face picture may include, but is not limited to, one or more of the following: facial expression (such as whether to smile), whether to wear glasses, gender, age range, and ethnicity.
  • the specific implementation method for obtaining the face picture, the collection time of the face picture, and the attribute information of the face picture in the video data of the video source device is similar to the foregoing method of obtaining the target picture information, with the target picture information replaced by the face picture information; details are not repeated here.
  • After the retrieval device obtains the face picture in the video data of the video source device, the collection time of the face picture, and the attribute information of the face picture, it can store the obtained face picture, the collection time of the face picture, and the attribute information of the face picture.
  • Storing the face picture in the video data of the video source device, the collection time of the face picture, and the attribute information of the face picture may include: storing the face picture; and recording the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture in the face picture information table.
  • After the retrieval device obtains the face picture, the collection time of the face picture, and the attribute information of the face picture, it can store the obtained face picture and record its storage location, collection time, and attribute information in the face picture information table, whose format can be as shown in Table 1:
    | Face image location information | Face image collection time | Face image attribute information |
    | Location information of face picture 1 | Acquisition time of face picture 1 | Attribute information of face picture 1 |
    | Location information of face picture 2 | Acquisition time of face picture 2 | Attribute information of face picture 2 |
    | ... | ... | ... |
  • the position information of the face picture may be a position offset and a length of the face picture in a storage space (such as a hard disk).
  • the above manner of storing the face picture, the collection time of the face picture, and the attribute information of the face picture in the video data of the video source device is only a specific example in this application and does not limit the protection scope of the present application; that is, in the embodiments of the present application, the face picture, its collection time, and its attribute information may also be stored in other ways.
  • the face picture, the acquisition time of the face picture, and the attribute information of the face picture can be stored in the same database (that is, the face picture is directly stored in the database in a binary form).
  • the face picture, the collection time of the face picture, and the attribute information of the face picture can be stored in the same data table. At this time, there is no need to additionally record the storage location of the face picture.
  • the face picture can still be stored first to obtain its storage location, but the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture are no longer stored in the form of a data table. Instead, they are stored in other forms, such as a tree structure or a file. The specific implementation is not described here.
  • Step S510 When a face retrieval request is received, a first target face picture matching the face retrieval filter condition is determined according to the face retrieval filter condition carried in the face retrieval request.
  • the retrieval device can provide a face retrieval function, which retrieves matching face pictures according to the face retrieval filter conditions carried in the received face retrieval request, and at the same time can obtain the collection time of the matching face pictures.
  • the retrieval device may provide a face retrieval request interface.
  • the face retrieval request interface may include a face retrieval filter condition input area and/or face retrieval filter condition options; the user enters and/or selects a face retrieval filter condition in the interface and submits a face retrieval request.
  • the face retrieval filter condition is attribute information of a face picture to be retrieved (referred to herein as the third attribute information of the face picture to be retrieved), which may include, but is not limited to, the facial expression of the face to be retrieved, whether glasses are worn, gender, and age.
  • the face retrieval filtering condition may include a face picture to be retrieved and third attribute information of the face picture to be retrieved.
  • When the retrieval device receives a face retrieval request, it can obtain the face retrieval filter conditions carried in the request, query the stored face pictures, collection times, and attribute information according to those conditions, and determine the face picture whose attribute information matches the face retrieval filter conditions as the matching face picture (referred to herein as the first target face picture).
  • the retrieval device may query the attribute information of the face pictures in the face picture information table according to the face retrieval filter conditions to obtain the face picture information entry that matches the conditions, and obtain from that entry the storage location of the face picture (that is, the storage location of the first target face picture) and the acquisition time of the first target face picture.
  • the retrieval device may obtain the first target face picture from the specified storage space according to the storage location of the first target face picture.
  • the comparison of the face retrieval filter condition with the attribute information of the face pictures recorded in the face picture information table may include: modeling the face picture to be retrieved and extracting the fourth attribute information of the face picture to be retrieved; determining the attribute information of the face picture to be retrieved based on the third attribute information and the fourth attribute information; and comparing the attribute information of the face picture to be retrieved with the attribute information of the face pictures recorded in the face picture information table.
  • the retrieval device can model the face picture to be retrieved and extract its attribute information (referred to herein as the fourth attribute information of the face picture to be retrieved).
  • the retrieval device After the retrieval device obtains the fourth attribute information, it can determine the attribute information of the face to be retrieved according to the third attribute information and the fourth attribute information.
  • the retrieval device may compare the third attribute information and the fourth attribute information of the face picture to be retrieved. Attribute information that exists in only one of the two (in the third attribute information but not the fourth, or in the fourth but not the third) is added to the attribute information of the face picture to be retrieved; for attribute information that exists in both, the value in the fourth attribute information is added. The result is the attribute information of the face picture to be retrieved.
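  • The combination of the third and fourth attribute information amounts to a dictionary merge in which the fourth attribute information takes precedence on conflict. A minimal sketch, with hypothetical attribute keys:

```python
def merge_attributes(third, fourth):
    """Combine user-supplied (third) and picture-derived (fourth) attributes.

    Keys present in only one input are carried over unchanged; for keys
    present in both, the value from `fourth` is used, as the text states.
    """
    merged = dict(third)
    merged.update(fourth)
    return merged
```

  • For example, `merge_attributes({"gender": "female", "glasses": True}, {"glasses": False, "age_range": "20-30"})` keeps `gender`, adds `age_range`, and takes `glasses` from the second argument.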
  • When the retrieval device obtains the attribute information of the face picture to be retrieved, it can query the stored face pictures, collection times, and attribute information accordingly, and determine the face picture whose attribute information matches that of the face picture to be retrieved as the first target face picture that matches the face retrieval filter condition.
  • the retrieval device may directly query the database, based on the attribute information of the face picture to be retrieved, for the entry where the attribute information of the matching face picture is located, and obtain the face picture information from the queried entry.
  • Step S520 Generate a video summary according to the first target face picture and the collection time of the first target face picture, and play back the video summary.
  • the search device may use the first target face picture and the collection time of the first target face picture to generate a video summary of the retrieved face, and play back that video summary.
  • the specific implementation of this step is similar to step S220 above, except that the first target picture and its corresponding acquisition time are replaced with the first target face picture and its acquisition time; details are not repeated here.
  • the video source device is IPC
  • the retrieval device is NVR.
  • the NVR is loaded with a smart chip with a smart analysis function.
  • the video digest generation scheme implementation process is as follows.
  • the IPC captures a face picture, and transmits the captured face picture and the acquisition time of the face picture to the NVR.
  • the IPC can capture a face picture, and transmit the captured face picture and the acquisition time of the face picture (that is, the capture time of the face picture) to the NVR.
  • the IPC will also transmit the real-time video stream to the NVR, and the NVR saves the video recording according to a preset policy.
  • the NVR extracts feature values in the face picture, models the face picture according to the feature values of the face picture, and extracts attribute information of the face picture.
  • When the NVR receives the face picture transmitted by the IPC, it can intelligently analyze the face picture through a smart chip.
  • the smart chip can use the algorithm library to extract the feature values of the face in the face picture, and use the algorithm library to model the face picture according to the extracted feature values and extract the attribute information of the face picture.
  • the flow chart of the NVR obtaining the face picture information can be shown in FIG. 6.
  • the NVR performs target detection on a video recording or a real-time video stream to obtain a face picture and a collection time of the face picture.
  • the NVR can perform target detection on the video recording or real-time video stream through the smart chip to obtain the face picture and the acquisition time of the face picture in the video recording or real-time video stream (that is, the face picture is in the video data) Time of occurrence).
  • the NVR extracts feature values in the face picture, models the face picture according to the feature values of the face picture, and extracts attribute information of the face picture.
  • When the NVR receives the face picture transmitted by the IPC, it can intelligently analyze the face picture through a smart chip.
  • the smart chip can use the algorithm library to extract the feature values of the face in the face picture, and use the algorithm library to model the face picture according to the extracted feature values and extract the attribute information of the face picture.
  • the flow chart of the NVR obtaining the face picture information can be shown in FIG. 7.
  • the NVR stores the face picture, the collection time of the face picture, and the attribute information of the face picture. For specific implementation, refer to the subsequent description.
  • the NVR stores the face picture to obtain the storage location of the face picture.
  • A database table, FaceTable, related to the face picture information is established. The main fields of the FaceTable table are: the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture.
  • FaceTable database table
  • the NVR can also store model data of face pictures, and its specific implementation is not described here.
  • the NVR records the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture in the FaceTable.
  • a face retrieval request is received, and the face retrieval request carries a face retrieval filtering condition.
  • the NVR can provide a face search interface.
  • the face search interface includes a face search filter condition input area and/or options. The user can fill in and/or select a face search filter condition through the face search interface and submit a face search request.
  • the NVR queries the storage location of the matching face pictures and the collection time of the face pictures from the FaceTable table according to the face retrieval filter conditions.
  • the NVR can query the FaceTable table and compare the face retrieval filter conditions with the attribute information of the face pictures recorded in it. For each FaceTable entry whose recorded attribute information matches the face retrieval filter conditions, the storage location and collection time recorded in that entry are determined as the storage location of the matching face picture (that is, the storage location of the first target face picture) and its collection time (that is, the acquisition time of the first target face picture).
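  • A minimal sketch of such a FaceTable lookup, using SQLite from the Python standard library as a stand-in for the NVR's database. The column names (`pos_offset`, `pos_length`, `collected_at`, `gender`, `glasses`, `smiling`) are illustrative assumptions; the text only names the three groups of fields.

```python
import sqlite3

# Hypothetical attribute columns that a filter condition may refer to.
ALLOWED_FILTERS = {"gender", "glasses", "smiling"}


def create_face_table(conn):
    conn.execute(
        "CREATE TABLE FaceTable ("
        " pos_offset INTEGER, pos_length INTEGER, collected_at REAL,"
        " gender TEXT, glasses INTEGER, smiling INTEGER)"
    )


def query_faces(conn, filters):
    """Return (pos_offset, pos_length, collected_at) rows whose attributes
    equal every value in `filters`, ordered by collection time."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filter columns: {sorted(unknown)}")
    # Column names come from the allow-list; values are bound as parameters.
    where = " AND ".join(f"{col} = ?" for col in filters) or "1"
    sql = (
        "SELECT pos_offset, pos_length, collected_at FROM FaceTable "
        f"WHERE {where} ORDER BY collected_at"
    )
    return conn.execute(sql, list(filters.values())).fetchall()
```

  • Each returned row carries the storage location (offset and length) and the collection time of one first target face picture.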
  • the NVR obtains the first target face picture according to the storage location of the first target face picture.
  • the NVR can read the first target face picture from the hard disk according to the storage position (position offset + length) of the first target face picture, so that the NVR can obtain the first target face picture and the first target face picture. Acquisition time.
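  • Reading a picture by position offset and length can be sketched as follows. A single append-only file stands in for the hard disk storage space; the function names and the file layout are assumptions made for illustration.

```python
import os
import tempfile


def append_picture(path, data):
    """Append picture bytes to the storage file; return (offset, length)."""
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)
        offset = f.tell()
        f.write(data)
    return offset, len(data)


def read_picture(path, offset, length):
    """Read back exactly `length` bytes starting at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def demo():
    """Store two pictures and read the second one back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "pictures.bin")
        loc1 = append_picture(path, b"face-1")
        loc2 = append_picture(path, b"face-two")
        return loc1, loc2, read_picture(path, *loc2)
```

  • The `(offset, length)` pair returned by `append_picture` is what would be recorded as the storage location in the face picture information table.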
  • the NVR can create a VideoTable1.
  • the main fields in the VideoTable1 table are: the storage location of the video (the hard disk position offset + length) and the start and end time (start time and end time) of the video data.
  • A new record (i.e., a new entry) is inserted into the VideoTable1 table to record the video storage location and the start and end time.
  • the NVR determines the time point 5 seconds before the acquisition time of the first target face picture as the start time of the target video clip, and the time point 5 seconds after the acquisition time of the first target face picture as the end time of the target video clip, and queries the VideoTable1 table according to the start time and end time of the target video clip to obtain the target video clip.
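  • A minimal sketch of deriving the target video clip and finding the recordings that cover it. `VideoRecord` mirrors the VideoTable1 fields named in the text; the field names, the in-memory list used in place of a database table, and the overlap test are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class VideoRecord:
    offset: int    # position offset of the video data on disk
    length: int    # length of the video data
    start: float   # start time of the video data, in seconds
    end: float     # end time of the video data, in seconds


def target_clip(acquired_at, before=5.0, after=5.0):
    """Start and end time of the clip around one acquisition time."""
    return acquired_at - before, acquired_at + after


def find_recordings(video_table, clip_start, clip_end):
    """Return the records whose time span overlaps [clip_start, clip_end]."""
    return [
        r for r in video_table
        if r.start < clip_end and r.end > clip_start
    ]
```

  • A clip that straddles two stored recordings yields both records, so the caller would need to concatenate the relevant portions.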
  • The first target face picture that matches the face retrieval filter conditions is determined according to those conditions, a video summary is then generated from the first target face picture and its acquisition time, and the video summary is played back. This avoids the need to extract matching face pictures from video data for each face retrieval, improves the efficiency and accuracy of face retrieval, and ensures the consistency of face video tracking.
  • FIG. 8 is a schematic flowchart of a video abstract generating method according to another embodiment of the present application.
  • the video abstract generating method may be applied to a retrieval device.
  • the video digest generation method is directed to a vehicle search request.
  • the video summary generating method may include the following steps.
  • Step S800 Acquire and store a vehicle picture, a collection time of the vehicle picture, and attribute information of the vehicle picture in the video data of the video source device.
  • the retrieval device can acquire and store the vehicle picture information in the video data of the video source device.
  • the vehicle picture information may include, but is not limited to, a vehicle picture, a collection time of the vehicle picture, and attribute information of the vehicle picture.
  • the attribute information of the vehicle picture may include, but is not limited to, one or more of the following: the location of the vehicle in the vehicle picture, the location of the license plate in the vehicle picture, the license plate number, the license plate color, the country type, the body color, the vehicle brand, the vehicle model (such as a large passenger car, truck, or van), whether the driver is wearing a seat belt, and whether the driver is making a phone call.
  • the specific implementation method for obtaining the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the video data of the video source device is similar to the foregoing method of obtaining the target picture information, and the target picture information can be replaced with the vehicle picture information. This will not be repeated here.
  • the retrieval device may store the acquired vehicle picture, the acquisition time of the vehicle picture, and the attribute information of the vehicle picture.
  • Storing the vehicle picture in the video data of the video source device, the collection time of the vehicle picture, and the attribute information of the vehicle picture may include: storing the vehicle picture; recording the storage location of the vehicle picture, the collection time of the vehicle picture in the vehicle picture information table, and Attribute information of the vehicle picture.
  • After the retrieval device obtains the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture, it can store the obtained vehicle picture and record its storage location, collection time, and attribute information in the vehicle picture information table, whose format can be as shown in Table 2:
  • the location information of the vehicle picture may be a position offset and a length of the vehicle picture in a storage space (such as a hard disk).
  • the foregoing manner of storing the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the video data of the video source device is only a specific example in this application and does not limit the protection scope of the present application; that is, in the embodiments of the present application, the vehicle picture, its collection time, and its attribute information may also be stored in other ways.
  • the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture may be stored in the same database (that is, the vehicle picture is directly stored in the database in a binary form).
  • the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture can be stored in the same data table. At this time, there is no need to additionally record the storage location of the vehicle picture.
  • the vehicle picture may still be stored first to obtain the storage location of the vehicle picture, but the storage location of the vehicle picture, the time when the vehicle picture was collected, and the attribute information of the vehicle picture are no longer stored in the form of a data table. Instead, it is stored in other forms, such as a tree structure or a file. The specific implementation is not described here.
  • Step S810 When a vehicle retrieval request is received, a first vehicle picture that matches the vehicle picture to be retrieved is determined according to the vehicle picture to be retrieved carried in the vehicle retrieval request.
  • the retrieval device may provide a vehicle retrieval function that, according to the picture of the vehicle to be retrieved carried in the received vehicle retrieval request, retrieves the matching vehicle pictures and their collection times by searching by image.
  • the retrieval device may provide a vehicle retrieval request interface.
  • the vehicle retrieval request interface may include an input and/or selection area for the picture of the vehicle to be retrieved; the user enters and/or selects a picture of the vehicle to be retrieved in the interface and submits a vehicle retrieval request.
  • When the retrieval device receives a vehicle retrieval request, it models the vehicle picture to be retrieved and extracts its attribute information. The retrieval device can then query the stored vehicle pictures, collection times, and attribute information based on the attribute information of the vehicle picture to be retrieved, and determine the vehicle picture whose attribute information matches that of the vehicle picture to be retrieved as the matching vehicle picture (referred to herein as the first vehicle picture).
  • the retrieval device may query the attribute information of the vehicle pictures in the vehicle picture information table according to the attribute information of the vehicle picture to be retrieved to obtain the vehicle picture information entry that matches it, and obtain from that entry the storage location of the vehicle picture (that is, the storage location of the first vehicle picture).
  • the retrieval device may acquire the first vehicle picture from the specified storage space according to the storage location of the first vehicle picture.
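  • The attribute matching against the vehicle picture information table can be sketched in memory as follows. The entry layout (`location`, `collected_at`, `attributes`) and the attribute keys are illustrative assumptions, and exact equality is used as the matching rule because the text does not define one.

```python
def find_vehicle_entries(table, wanted):
    """Return the entries whose attributes agree with every attribute
    extracted from the vehicle picture to be retrieved (`wanted`)."""
    return [
        entry for entry in table
        if all(entry["attributes"].get(k) == v for k, v in wanted.items())
    ]
```

  • Each matching entry supplies the storage location from which the first vehicle picture is then read.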
  • The foregoing manner of retrieving vehicle picture information is only a specific example for the case in which vehicle picture information is stored in the form of a vehicle picture information table, and does not limit the protection scope of the present application. That is, in the embodiments of the present application, vehicle picture information retrieval may also be implemented in other ways.
  • the retrieval device may directly query the database, according to the attribute information of the vehicle picture to be retrieved, for the entry where the attribute information of the matching vehicle picture is located, and obtain the vehicle picture information from the queried entry.
  • Step S820 Generate a video summary according to the first vehicle picture and the collection time of the first vehicle picture, and play back the video summary.
  • the retrieval device may generate a video summary of the searched vehicle from the first vehicle picture and the collection time of the first vehicle picture, and play back that video summary.
  • the specific implementation method of this step is similar to the above step S220, except that the acquisition time corresponding to the first target picture and the first target picture is replaced with the acquisition time of the first vehicle picture and the first vehicle picture, and details are not described herein again.
  • the video source device is an IPC, and the retrieval device is an NVR.
  • the NVR is loaded with a smart chip with a smart analysis function.
  • the video digest generation scheme implementation process is as follows.
  • the IPC captures the vehicle pictures and transmits the captured vehicle pictures and the collection time of the vehicle pictures to the NVR.
  • the IPC can also capture a vehicle picture, and transmit the captured vehicle picture together with its acquisition time to the NVR.
  • the IPC will also transmit the real-time video stream to the NVR, and the NVR saves the video recording according to a preset policy.
  • the NVR extracts feature values from the vehicle pictures, models the vehicle pictures according to the feature values of the vehicle pictures, and extracts attribute information of the vehicle pictures.
  • when the NVR receives the vehicle picture transmitted by the IPC, it can intelligently analyze the vehicle picture through a smart chip.
  • the smart chip can use an algorithm library to extract the feature values of the vehicle from the vehicle picture, and use the algorithm library to model the vehicle picture according to the extracted feature values, and extract the attribute information of the vehicle picture.
  • the schematic diagram of the NVR's process of obtaining vehicle picture information can be shown in FIG. 9.
  • the NVR performs object detection on the video recording or real-time video stream to obtain the vehicle picture and the acquisition time of the vehicle picture.
  • the NVR can perform target detection on the video recording or real-time video stream through the smart chip to obtain the vehicle pictures and the collection time of the vehicle pictures in the video recording or real-time video stream (that is, the time when the vehicle pictures appear in the video data ).
  • the NVR extracts feature values from the vehicle pictures, models the vehicle pictures according to the feature values of the vehicle pictures, and extracts attribute information of the vehicle pictures.
  • the NVR can intelligently analyze the vehicle picture through a smart chip.
  • the smart chip can use an algorithm library to extract the feature values of the vehicle from the vehicle picture, and use the algorithm library to model the vehicle picture according to the extracted feature values, and extract the attribute information of the vehicle picture.
  • the schematic diagram of the process for the NVR to obtain vehicle picture information can be shown in FIG. 10.
  • the NVR stores the vehicle picture, the time when the vehicle picture was collected, and the attribute information of the vehicle picture. For specific implementation, see the subsequent description.
  • the NVR stores the vehicle picture to obtain the storage location of the vehicle picture.
  • the NVR can create a vehicle table (VehicleTable).
  • the main fields in the VehicleTable table are: the storage location of the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture.
  • the NVR can also store model data of vehicle pictures, and its specific implementation is not described here.
  • the NVR records the storage location of the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in the VehicleTable.
  • a vehicle retrieval request is received, and the vehicle retrieval request carries a picture of a vehicle to be retrieved.
  • the NVR can provide a vehicle search interface that includes an area for inputting and/or selecting a picture of the vehicle to be searched.
  • the user can input and/or select a picture of the vehicle to be retrieved through the vehicle retrieval interface, and submit a vehicle retrieval request.
  • the NVR models the vehicle pictures to be retrieved and extracts the attribute information of the vehicle pictures to be retrieved.
  • the NVR queries the storage location of the matching vehicle picture and the collection time of the vehicle picture from the VehicleTable table according to the attribute information of the vehicle picture to be retrieved.
  • the NVR can query the VehicleTable table, compare the attribute information of the vehicle picture to be retrieved with the attribute information of the vehicle picture recorded in the VehicleTable table, and match the attribute information of the recorded vehicle picture with the attribute information of the vehicle picture to be retrieved.
  • the storage location of the vehicle picture and the collection time of the vehicle picture are determined as the matching storage location of the vehicle picture (that is, the storage location of the first vehicle picture) and the collection time of the vehicle picture (that is, the collection time of the first vehicle picture).
  • the NVR obtains the first vehicle picture according to the storage location of the first vehicle picture.
  • the NVR can read the first vehicle picture from the hard disk according to the storage position (position offset + length) of the first vehicle picture, so that the NVR can obtain the first vehicle picture and the collection time of the first vehicle picture.
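The lookup described above (match attributes in the VehicleTable, then read the picture from disk by position offset and length) can be sketched as follows. This is only an illustrative sketch: the `VehicleRecord` class, the `find_matches` and `read_picture` helpers, the exact-equality matching rule, and the in-memory `BytesIO` standing in for the hard disk are all assumptions, not details given in the application.

```python
import io
from dataclasses import dataclass, field


@dataclass
class VehicleRecord:
    """One hypothetical VehicleTable entry."""
    offset: int          # position offset of the picture on disk
    length: int          # picture size in bytes
    collected_at: float  # collection time, in seconds
    attributes: dict = field(default_factory=dict)


def find_matches(table, wanted):
    """Return records whose attributes contain every wanted key/value pair.

    Exact equality is an assumption; the application does not specify
    how attribute information is compared.
    """
    return [r for r in table
            if all(r.attributes.get(k) == v for k, v in wanted.items())]


def read_picture(disk, record):
    """Read a picture from storage using its (offset, length) location."""
    disk.seek(record.offset)
    return disk.read(record.length)


# A tiny in-memory "disk" holding three pictures back to back.
disk = io.BytesIO(b"AAAABBBBBBCC")
vehicle_table = [
    VehicleRecord(0, 4, 100.0, {"color": "red", "brand": "X"}),
    VehicleRecord(4, 6, 160.0, {"color": "blue", "brand": "X"}),
    VehicleRecord(10, 2, 220.0, {"color": "red", "brand": "Y"}),
]

hits = find_matches(vehicle_table, {"color": "red"})
pictures = [(read_picture(disk, r), r.collected_at) for r in hits]
```

Each element of `pictures` pairs a first vehicle picture with its collection time, which is the input the summary generation step needs.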
  • the NVR can create a VideoTable2.
  • the main fields in the VideoTable2 table are: the storage location of the video (the hard disk position offset + length) and the start and end time (start time and end time) of the video data. After a completed video is stored on the hard disk, a new record (ie, a new entry) is inserted into the VideoTable2 table, recording the video storage location and the start and end time.
  • the NVR determines the 5th second before the acquisition time of the first vehicle picture as the start time of the target video clip, and the 5th second after the acquisition time of the first vehicle picture as the end time of the target video clip. It then queries the VideoTable2 table according to this start time and end time to obtain the target video clip.
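As a rough illustration of the clip lookup just described, the sketch below derives the clip window from a picture's acquisition time and selects the VideoTable2 entries that overlap it. The `VideoRecord` class and the `clip_window` and `recordings_for` helpers are hypothetical names, and treating times as plain seconds is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass
class VideoRecord:
    """One hypothetical VideoTable2 entry."""
    offset: int   # hard disk position offset of the recording
    length: int   # recording size in bytes
    start: float  # start time of the recording, in seconds
    end: float    # end time of the recording, in seconds


def clip_window(collected_at, before=5.0, after=5.0):
    """Start and end time of the target video clip around a picture."""
    return collected_at - before, collected_at + after


def recordings_for(video_table, start, end):
    """Recordings whose time span overlaps the window [start, end]."""
    return [v for v in video_table if v.start <= end and v.end >= start]


video_table = [
    VideoRecord(0, 1000, 0.0, 60.0),
    VideoRecord(1000, 1000, 60.0, 120.0),
]

start, end = clip_window(58.0)  # picture collected at t = 58 s
overlapping = recordings_for(video_table, start, end)
```

In this example the window straddles the boundary between two stored recordings, so both entries are returned and the clip would have to be assembled from both.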
  • FIG. 11 is a schematic flowchart of a video abstract generating method according to another embodiment of the present application.
  • the video abstract generating method may be applied to a retrieval device.
  • the video digest generation method is directed to a vehicle search request.
  • the video digest generating method may include the following steps.
  • Step S1100 Acquire and store a vehicle picture, a collection time of the vehicle picture, and attribute information of the vehicle picture in the video data of the video source device.
  • For a specific implementation of this step, reference may be made to step S800, and details are not described herein again.
  • Step S1110 When a vehicle search request is received, a second vehicle picture matching the vehicle search filter condition is determined according to the vehicle search filter condition carried in the vehicle search request.
  • the retrieval device may provide a vehicle retrieval function, and retrieve a matching vehicle picture and a collection time of the vehicle picture according to a vehicle retrieval filter condition carried in the received vehicle retrieval request.
  • the retrieval device may provide a vehicle search request interface, which may include a vehicle search filter condition input area and/or vehicle search filter condition options; a user enters and/or selects vehicle search filter conditions in the vehicle search request interface and submits a vehicle search request.
  • the vehicle retrieval filter condition is attribute information of the picture of the vehicle to be retrieved (referred to herein as the third attribute information of the picture of the vehicle to be retrieved), which may include, but is not limited to, one or more of the license plate number, body color, model, and vehicle brand.
  • the vehicle retrieval filter condition may include the third attribute information of the image of the vehicle to be retrieved and the image of the vehicle to be retrieved.
  • When the retrieval device receives a vehicle retrieval request, it can obtain the vehicle retrieval filter condition carried in the request, query the stored vehicle pictures, their collection times, and their attribute information according to that condition, and determine each vehicle picture whose attribute information matches the vehicle search filter condition as a matching vehicle picture (referred to herein as the second vehicle picture).
  • the retrieval device can query the attribute information of the vehicle pictures in the vehicle picture information table according to the vehicle search filter condition to obtain the vehicle picture information entry that matches the condition, and obtain from that entry the storage location of the vehicle picture (that is, the storage location of the second vehicle picture) and the collection time of the second vehicle picture.
  • the retrieval device may acquire the second vehicle picture from the designated storage space according to the storage location of the second vehicle picture.
  • the above comparison of the vehicle retrieval filter condition with the attribute information of the vehicle pictures recorded in the vehicle picture information table may include: modeling the picture of the vehicle to be retrieved and extracting its fourth attribute information; determining the attribute information of the picture of the vehicle to be retrieved based on its third attribute information and its fourth attribute information; and comparing that attribute information with the attribute information of the vehicle pictures recorded in the vehicle picture information table.
  • the retrieval device may model the picture of the vehicle to be retrieved and extract attribute information of that picture (referred to herein as the fourth attribute information of the picture of the vehicle to be retrieved).
  • After the retrieval device obtains the fourth attribute information of the picture of the vehicle to be retrieved, it can determine the attribute information of the picture of the vehicle to be retrieved according to the third attribute information and the fourth attribute information.
  • the retrieval device may compare the third attribute information of the picture of the vehicle to be retrieved with the fourth attribute information. Attribute information that exists in the third attribute information but not in the fourth, or in the fourth but not in the third, is added to the attribute information of the picture of the vehicle to be retrieved; for attribute information that exists in both, the fourth attribute information is added. The result is the attribute information of the picture of the vehicle to be retrieved.
  • When the retrieval device obtains the attribute information of the vehicle picture to be retrieved, it can query the stored vehicle pictures, their collection times, and their attribute information according to that attribute information, and determine each vehicle picture whose attribute information matches as a second vehicle picture matching the vehicle search filter condition.
  • the retrieval device may, according to the attribute information of the vehicle picture to be retrieved, directly query the database for the entry containing the attribute information of the matching vehicle picture, and obtain the vehicle picture information from the queried entry.
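The rule for combining the third attribute information (entered by the user) with the fourth attribute information (extracted by modeling the picture) amounts to a dictionary merge in which the fourth attribute information wins on conflicts. A minimal sketch, assuming attributes are represented as key/value pairs; the `merge_attributes` name and the sample keys are hypothetical:

```python
def merge_attributes(third, fourth):
    """Combine user-supplied and model-extracted attribute information.

    Attributes present in only one source are kept as they are; for
    attributes present in both, the model-extracted (fourth) value is used.
    """
    merged = dict(third)   # start from the user-supplied attributes
    merged.update(fourth)  # extracted attributes override on conflict
    return merged


third = {"plate": "A123", "color": "red"}
fourth = {"color": "dark red", "brand": "X"}
query_attributes = merge_attributes(third, fourth)
```

The merged dictionary is what would then be compared against the attribute information recorded in the vehicle picture information table.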
  • Step S1120 Generate a video summary according to the second vehicle picture and the collection time of the second vehicle picture, and play back the video summary.
  • For a specific implementation of this step, reference may be made to step S820, and details are not described herein again.
  • by obtaining and storing the vehicle pictures in the video data of the video source device, their collection times, and their attribute information, matching vehicle pictures can be retrieved, when a vehicle retrieval request is received, according to the vehicle retrieval filter condition carried in the vehicle retrieval request.
  • Extracting matching vehicle pictures from video data improves the efficiency and accuracy of vehicle retrieval, and ensures the consistency of vehicle video tracking.
  • FIG. 12 is a schematic structural diagram of a video summary generating apparatus according to an embodiment of the present application.
  • the video summary generating apparatus may be applied to the search device in the foregoing embodiment.
  • the video summary generating apparatus may include the following units.
  • the receiving unit 1210 is configured to receive a target search request, where the target search request carries characteristic information of a target to be searched.
  • the search unit 1220 is configured to search for a first target picture that matches feature information of the target to be searched.
  • the processing unit 1230 is configured to generate a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
  • the device further includes the following units.
  • the obtaining unit 1240 is configured to obtain target picture information in the video data of the video source device, where the target picture information includes the target picture, a collection time of the target picture, and attribute information of the target picture.
  • the saving unit 1250 is configured to save the target picture information to a picture information database.
  • the obtaining unit 1240 is specifically configured to receive target picture information sent by the video source device.
  • the obtaining unit 1240 is specifically configured to receive the target picture and the collection time of the target picture sent by the video source device; model the target picture; and extract attribute information of the target picture.
  • the obtaining unit 1240 is specifically configured to receive the target picture, the collection time of the target picture, and first attribute information of the target picture sent by the video source device; model the target picture and extract second attribute information of the target picture; and determine the attribute information of the target picture according to the first attribute information and the second attribute information of the target picture.
  • the obtaining unit 1240 is specifically configured to perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; model the target picture; and extract attribute information of the target picture.
  • the feature information of the target to be searched includes attribute information of the target to be searched; and the search unit 1220 is specifically configured to search the picture information database for a matching first target picture according to the attribute information of the target to be searched.
  • the feature information of the target to be searched includes a target picture to be searched; the search unit 1220 is specifically configured to model the target picture to be searched and extract attribute information of the target picture to be searched; and search the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
  • the feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; and the search unit 1220 is specifically configured to model the target picture to be searched and extract fourth attribute information of the target picture to be searched; determine attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and search the picture information database for a matching first target picture according to the attribute information of the target picture to be searched.
  • the target search request further carries a search time period range; the search unit 1220 is specifically configured to filter the target pictures in the picture information database according to the search time period range to obtain second target pictures whose acquisition time is within the search time period range; and search for a matching first target picture among the second target pictures according to the feature information of the target to be searched.
  • the target search request further carries a search channel number, and the target picture information further includes a channel number of the target picture; the search unit 1220 is specifically configured to filter the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is consistent with the search channel number; and search for a matching first target picture among the third target pictures according to the feature information of the target to be searched.
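The two optional pre-filters described above (search time period range and search channel number) narrow the candidate set before feature matching. The sketch below shows one way this could look; the `TargetPicture` class, the `search` function, the inclusive time bounds, and the equality-based attribute match are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class TargetPicture:
    """Hypothetical entry of the picture information database."""
    picture_id: str
    collected_at: float  # acquisition time, in seconds
    channel: int         # video data channel number
    attributes: dict = field(default_factory=dict)


def search(library, wanted, time_range=None, channel=None):
    """Filter by time range and channel, then match attribute information."""
    candidates = library
    if time_range is not None:
        earliest, latest = time_range
        # "Second target pictures": acquisition time within the range.
        candidates = [p for p in candidates
                      if earliest <= p.collected_at <= latest]
    if channel is not None:
        # "Third target pictures": channel number equals the search channel.
        candidates = [p for p in candidates if p.channel == channel]
    return [p for p in candidates
            if all(p.attributes.get(k) == v for k, v in wanted.items())]


library = [
    TargetPicture("a", 100, 1, {"color": "red"}),
    TargetPicture("b", 200, 2, {"color": "red"}),
    TargetPicture("c", 300, 1, {"color": "red"}),
    TargetPicture("d", 150, 1, {"color": "blue"}),
]

hits = search(library, {"color": "red"}, time_range=(50, 250), channel=1)
```

Filtering first keeps the comparatively expensive feature comparison confined to pictures that can actually appear in the result.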
  • the target picture is a face picture, and the target search request is a face search request.
  • the target picture is a vehicle picture, and the target search request is a vehicle search request.
  • the processing unit 1230 is specifically configured to sort the first target pictures by acquisition time from earliest to latest; and generate the video summary according to the sorted first target pictures.
  • the processing unit 1230 is specifically configured to determine, for each first target picture, a target video clip corresponding to the first target picture, where the target video clip is the recording data between the n-th second before the acquisition time corresponding to the first target picture and the m-th second after that acquisition time; and generate a video summary according to each of the target video clips.
  • the processing unit 1230 is specifically configured to, when there are multiple first target pictures with the same acquisition time, for any first target picture among them, determine the start time point and end time point of the video clip corresponding to that first target picture, where the start time point is the n-th second before the acquisition time corresponding to the first target picture, and the end time point is the m-th second after that acquisition time; search the recording data of the video data channel to which the first target picture belongs for an I frame at the start time point and an I frame at the end time point; and, if both I frames exist, discard the remaining first target pictures among the multiple first target pictures and determine the video clip corresponding to this first target picture as the target video clip.
  • the processing unit 1230 is further configured to: if there is no I frame at the start time point, increase the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeat the above search step until an I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or the new start time point is the same as the acquisition time; if there is no I frame at the end time point, decrease the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeat the above search step until an I frame at the new end time point is found in the recording data of that video data channel, or the new end time point is the same as the acquisition time; select the longest of the video clips corresponding to the multiple first target pictures as the target video clip; and discard the first target pictures corresponding to the remaining video clips.
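The I-frame alignment and longest-clip selection described above can be sketched as follows. This is a simplified illustration under several assumptions: times are integer seconds, each channel's I-frame positions are available as a set of timestamps, and the names `align_clip` and `pick_target_clip` are hypothetical. Choosing the longest clip also covers the case where one picture already has I frames at both original time points, since that clip has the maximum possible length.

```python
def align_clip(collected_at, i_frame_times, n=5, m=5, x=1):
    """Shrink the clip window toward the acquisition time until both
    boundaries fall on I frames (or reach the acquisition time)."""
    start = collected_at - n
    end = collected_at + m
    # Move the start later in x-second steps until an I frame is found.
    while start < collected_at and start not in i_frame_times:
        start = min(start + x, collected_at)
    # Move the end earlier in x-second steps until an I frame is found.
    while end > collected_at and end not in i_frame_times:
        end = max(end - x, collected_at)
    return start, end


def pick_target_clip(collected_at, i_frames_by_channel, n=5, m=5, x=1):
    """Among pictures sharing one acquisition time (one per channel here),
    keep the channel whose aligned clip is longest."""
    best = None
    for channel, i_frames in i_frames_by_channel.items():
        start, end = align_clip(collected_at, i_frames, n, m, x)
        if best is None or (end - start) > (best[2] - best[1]):
            best = (channel, start, end)
    return best


i_frames_by_channel = {
    1: {96, 98, 100, 102, 104},  # no I frame at 95 or 105
    2: {95, 100, 105},           # I frames at both original boundaries
}
target = pick_target_clip(100, i_frames_by_channel)
```

Here channel 1 yields the clip 96 to 104 after alignment, while channel 2 keeps the full 95 to 105 window and is therefore selected.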
  • the processing unit 1230 is specifically configured to filter the target video clips according to the start time point and end time point of each target video clip to remove video data that is repeated in time; and generate the video summary according to the filtered target video clips.
  • the processing unit 1230 is specifically configured to sort the target video clips according to their start time points; for an adjacent first target video clip and second target video clip, where the start time point of the first target video clip is earlier than that of the second, when the end time point of the first target video clip is later than or equal to the start time point of the second target video clip: if the two clips belong to the same video data channel, merge them, with the start time point of the merged clip being the start time point of the first target video clip and its end time point being the end time point of the second target video clip; if the two clips belong to different video data channels, use the end time point of the first target video clip as the start time point of the second target video clip, or use the start time point of the second target video clip as the end time point of the first target video clip.
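A minimal sketch of the overlap removal described above, assuming integer-second time points and a hypothetical `Clip` record. For clips on different channels the text allows two alternatives; this sketch arbitrarily picks the second one (truncating the earlier clip at the later clip's start). When merging, it takes the later of the two end points, which equals the second clip's end point whenever all clips have the same nominal length.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Clip:
    channel: int  # video data channel the clip belongs to
    start: int    # start time point, in seconds
    end: int      # end time point, in seconds


def remove_overlaps(clips):
    """Sort clips by start time and remove time-repeated video data."""
    result = []
    for clip in sorted(clips, key=lambda c: c.start):
        if result and result[-1].end >= clip.start:
            prev = result[-1]
            if prev.channel == clip.channel:
                # Same channel: merge the two clips into one.
                result[-1] = replace(prev, end=max(prev.end, clip.end))
                continue
            # Different channels: end the earlier clip where the later starts.
            result[-1] = replace(prev, end=clip.start)
        result.append(clip)
    return result


clips = [Clip(1, 10, 20), Clip(1, 15, 25), Clip(2, 22, 30), Clip(2, 40, 50)]
summary_clips = remove_overlaps(clips)
```

The resulting list plays back without any instant being shown twice, while clips that are separated in time are left untouched.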
  • FIG. 14 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
  • the electronic device may include a processor 1401, a communication interface 1402, a memory 1403, and a communication bus 1404.
  • the processor 1401, the communication interface 1402, and the memory 1403 complete communication with each other through the communication bus 1404.
  • a computer program is stored in the memory 1403; the processor 1401 can execute the program stored in the memory 1403 to execute the video digest generating method described above.
  • the memory 1403 mentioned herein may be any electronic, magnetic, optical, or other physical storage device, and may contain or store information such as executable instructions, data, and so on.
  • the memory 1403 may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (such as a hard drive), a solid state drive, any type of storage disk (such as an optical disk or DVD), a similar storage medium, or a combination thereof.
  • An embodiment of the present application further provides a machine-readable storage medium storing a computer program, such as the memory 1403 in FIG. 14, and the computer program may be executed by the processor 1401 in the electronic device shown in FIG. 14 to implement the video summary generating method described above.

Abstract

Provided are a video abstract generation method and apparatus, and an electronic device and a readable storage medium. The method comprises: receiving a target search request, wherein the target search request carries feature information of a target to be searched; searching for a first target picture matching the feature information of the target to be searched; and generating a video abstract according to the first target picture and a collection time corresponding to the first target picture.

Description

Video abstract generating method, apparatus, electronic device and readable storage medium
Technical field
The present application relates to video surveillance technology, and in particular to a video summary generating method and apparatus, an electronic device, and a readable storage medium.
Background
As an important technical means of public security management, video surveillance systems are increasingly widely used and deployed in the field of public security maintenance. As the number of deployed surveillance devices increases and the scope of deployment expands, the amount of stored video recording data also grows. To find out at which times and in which places a specific target (a person, a vehicle, etc.) has appeared in the video recording data, it is often necessary to manually play back and search a large amount of video recording data, which takes a long time and is prone to omissions. Recording positioning and integrated display therefore suffer from efficiency bottlenecks and a risk of incompleteness.
Summary of the Invention
In view of this, the present application provides a video summary generating method and an apparatus thereof.
Specifically, the present application is implemented through the following technical solutions.
According to a first aspect of the embodiments of the present application, a video summary generating method is provided, including: receiving a target search request, where the target search request carries feature information of a target to be searched; searching for a first target picture that matches the feature information of the target to be searched; and generating a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
Optionally, before searching for the first target picture that matches the feature information of the target to be searched, the method further includes: obtaining target picture information in video data of a video source device, where the target picture information includes a target picture, the acquisition time of the target picture, and attribute information of the target picture; and saving the target picture information to a picture information database.
Optionally, obtaining the target picture information in the video data of the video source device includes: receiving the target picture information sent by the video source device.
Optionally, obtaining the target picture information in the video data of the video source device includes: receiving the target picture and the acquisition time of the target picture sent by the video source device; and modeling the target picture and extracting attribute information of the target picture.
Optionally, obtaining the target picture information in the video data of the video source device includes: receiving the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device; modeling the target picture and extracting second attribute information of the target picture; and determining the attribute information of the target picture according to the first attribute information and the second attribute information of the target picture.
Optionally, obtaining the target picture information in the video data of the video source device includes: performing target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and modeling the target picture and extracting attribute information of the target picture.
可选地,所述待搜索目标的特征信息包括待搜索目标的属性信息;搜索与所述待搜索目标的特征信息匹配的所述第一目标图片,包括:根据所述待搜索目标的属性信息在所述图片信息库中搜索匹配的所述第一目标图片。Optionally, the characteristic information of the target to be searched includes attribute information of the target to be searched; and searching for the first target picture that matches the characteristic information of the target to be searched includes: according to the attribute information of the target to be searched Searching the picture information database for the matching first target picture.
可选地,所述待搜索目标的特征信息包括待搜索目标图片;搜索与所述待搜索目标的特征信息匹配的所述第一目标图片,包括:对所述待搜索目标图片进行建模,并提取所述待搜索目标图片的属性信息;根据所述待搜索目标图片的属性信息在所述图片信息库中搜索匹配的所述第一目标图片。Optionally, the feature information of the target to be searched includes a target picture to be searched; searching for the first target picture that matches the characteristic information of the target to be searched includes: modeling the target picture to be searched, And extracting attribute information of the target picture to be searched for; and searching the picture information database for the matching first target picture according to the attribute information of the target picture to be searched for.
可选地,所述待搜索目标的特征信息包括待搜索目标图片和所述待搜索目标图片的第三属性信息;搜索与所述待搜索目标的特征信息匹配的所述第一目标图片,包括:对所述待搜索目标图片进行建模,并提取所述待搜索目标图片的第四属性信息;根据所述第三属性信息和所述第四属性信息,确定所述待搜索目标图片的属性信息;根据所述待搜索目标图片的属性信息在所述图片信息库中搜索匹配的所述第一目标图片。Optionally, feature information of the target to be searched includes target picture to be searched and third attribute information of the target picture to be searched; and searching for the first target picture that matches the characteristic information of the target to be searched includes: : Model the target picture to be searched, and extract fourth attribute information of the target picture to be searched; determine the attribute of the target picture to be searched according to the third attribute information and the fourth attribute information Information; searching for the matching first target picture in the picture information database according to the attribute information of the target picture to be searched.
Optionally, the target search request further carries a search time range, and searching for the first target picture that matches the feature information of the target to be searched includes: filtering the target pictures in the picture information database according to the search time range to obtain second target pictures whose acquisition time falls within the search time range; and searching the second target pictures for the matching first target picture according to the feature information of the target to be searched.
Optionally, the target search request further carries a search channel number, and the target picture information further includes a channel number of the target picture; searching for the first target picture that matches the feature information of the target to be searched includes: filtering the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is the same as the search channel number; and searching the third target pictures for the matching first target picture according to the feature information of the target to be searched.
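The optional time-range and channel-number pre-filters described above can be illustrated with a minimal Python sketch. It is not the implementation of the application: the `TargetPicture` record, the `search_pictures` function, the inclusive time range, and the exact-equality attribute match are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TargetPicture:
    """Hypothetical record for one entry of the picture information database."""
    picture_id: str
    acquisition_time: datetime
    channel: int
    attributes: dict = field(default_factory=dict)


def search_pictures(library, wanted_attributes, time_range=None, channel=None):
    """Return the entries of `library` that match `wanted_attributes`.

    time_range: optional (start, end) pair, treated as inclusive (assumption).
    channel:    optional search channel number.
    """
    candidates = list(library)
    if time_range is not None:
        start, end = time_range
        # "Second target pictures": acquisition time within the search time range.
        candidates = [p for p in candidates if start <= p.acquisition_time <= end]
    if channel is not None:
        # "Third target pictures": channel number equal to the search channel number.
        candidates = [p for p in candidates if p.channel == channel]
    # Attribute match by exact equality (assumption; a real system might compare
    # feature models with a similarity threshold instead).
    return [
        p for p in candidates
        if all(p.attributes.get(k) == v for k, v in wanted_attributes.items())
    ]
```

Applying the cheap time and channel filters first narrows the candidate set before the attribute comparison runs, which is the ordering the optional embodiments describe.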
Optionally, the target picture is a face picture, and the target search request is a face search request.
Optionally, the target picture is a vehicle picture, and the target search request is a vehicle search request.
Optionally, generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: sorting the first target pictures in order of acquisition time from earliest to latest; and generating the video summary from the sorted first target pictures.
Optionally, generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: for each first target picture, determining a target video clip corresponding to the first target picture, where the target video clip is the recording data between the n-th second before the acquisition time corresponding to the first target picture and the m-th second after that acquisition time; and generating the video summary according to the target video clips.
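As a small worked example of the clip window described above, the following Python sketch computes, for each acquisition time, the interval from n seconds before to m seconds after it, in chronological order. The function names `clip_window` and `clip_windows` are hypothetical.

```python
from datetime import datetime, timedelta


def clip_window(acquisition_time, n_seconds, m_seconds):
    """Return (start, end) of the clip around one acquisition time."""
    start = acquisition_time - timedelta(seconds=n_seconds)
    end = acquisition_time + timedelta(seconds=m_seconds)
    return start, end


def clip_windows(acquisition_times, n_seconds, m_seconds):
    """Return the clip windows ordered by acquisition time, earliest first."""
    return [clip_window(t, n_seconds, m_seconds) for t in sorted(acquisition_times)]
```

For instance, with n = 5 and m = 10, a picture acquired at 10:00:00 maps to the window from 09:59:55 to 10:00:10.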
Optionally, for each first target picture, determining the target video clip corresponding to the first target picture includes: when there are multiple first target pictures with the same acquisition time, for any one of the multiple first target pictures, determining a start time point and an end time point of the video clip corresponding to that first target picture, where the start time point is the n-th second before the acquisition time corresponding to that first target picture, and the end time point is the m-th second after that acquisition time; searching the recording data of the video data channel to which that first target picture belongs for an I frame at the start time point and for an I frame at the end time point; and if both the I frame at the start time point and the I frame at the end time point exist, discarding the remaining first target pictures among the multiple first target pictures, and determining the video clip corresponding to that first target picture as the target video clip.
Optionally, the method further includes: if no I frame exists at the start time point, increasing the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeating the above search step until an I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time; if no I frame exists at the end time point, decreasing the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeating the above search step until an I frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time; selecting, from the video clips respectively corresponding to the multiple first target pictures, the video clip with the longest duration as the target video clip; and discarding the first target pictures corresponding to the remaining video clips.
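The boundary adjustment described above can be sketched as follows, under stated assumptions: times are plain numbers of seconds, the I-frame positions of a channel are available as a set (`i_frame_times`), and the step is clamped so that a boundary never crosses the acquisition time when x does not divide n or m evenly. The names `snap_to_i_frames` and `pick_longest` are hypothetical and do not come from the application.

```python
def snap_to_i_frames(acq, n, m, x, i_frame_times):
    """Move the clip boundaries toward `acq` in steps of `x` seconds.

    Each boundary stops at the first time point that has an I frame, or at
    the acquisition time itself if no I frame is found on the way.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    start, end = acq - n, acq + m
    while start not in i_frame_times and start < acq:
        start = min(start + x, acq)  # clamp at the acquisition time (assumption)
    while end not in i_frame_times and end > acq:
        end = max(end - x, acq)      # clamp at the acquisition time (assumption)
    return start, end


def pick_longest(candidates):
    """Pick the (picture_id, start, end) tuple with the longest duration.

    Used when several first target pictures share one acquisition time;
    the pictures behind the other candidates would then be discarded.
    """
    return max(candidates, key=lambda c: c[2] - c[1])
```

With I frames at seconds 96, 100, 104, and 108, an acquisition time of 100, n = 5, m = 10, and x = 1, the window [95, 110] shrinks to [96, 108].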
Optionally, generating the video summary according to the target video clips includes: filtering the target video clips according to the start time point and the end time point of each target video clip to remove recording data that overlaps in time; and generating the video summary according to the filtered target video clips.
Optionally, filtering the target video clips according to the start time point and the end time point of each target video clip includes: sorting the target video clips by their start time points; for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merging the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip and the end time point of the merged video clip is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, using the end time point of the first target video clip as the start time point of the second target video clip, or using the start time point of the second target video clip as the end time point of the first target video clip; where the start time point of the first target video clip is less than the start time point of the second target video clip.
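The overlap filtering described above can be sketched in Python as follows. This is an illustration under assumptions, not the method of the application: the `Clip` record and `filter_clips` function are hypothetical, the merged end time is taken as the later of the two end times so that a clip fully contained in its predecessor is handled, cross-channel overlaps are resolved with the first of the two stated options (the second clip starts where the first ends), and a clip left with no duration after trimming is dropped.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """Hypothetical target video clip: channel number plus times in seconds."""
    channel: int
    start: float
    end: float


def filter_clips(clips):
    """Remove time-overlapping recording data from a list of clips."""
    result = []
    for original in sorted(clips, key=lambda c: c.start):
        clip = Clip(original.channel, original.start, original.end)  # do not mutate input
        if result and result[-1].end >= clip.start:
            prev = result[-1]
            if prev.channel == clip.channel:
                # Same channel: merge into the previous clip.
                prev.end = max(prev.end, clip.end)
                continue
            # Different channels: the second clip starts where the first ends.
            clip.start = prev.end
            if clip.start >= clip.end:
                continue  # nothing left of the second clip (assumption)
        result.append(clip)
    return result
```

Because the clips are sorted by start time first, each clip only needs to be compared with the last clip already kept.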
According to a second aspect of the embodiments of the present application, a video summary generating apparatus is provided, including: a receiving unit, configured to receive a target search request, where the target search request carries feature information of a target to be searched; a search unit, configured to search for a first target picture that matches the feature information of the target to be searched; and a processing unit, configured to generate a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
Optionally, the apparatus further includes: an obtaining unit, configured to obtain target picture information from the video data of a video source device, where the target picture information includes a target picture, the acquisition time of the target picture, and attribute information of the target picture; and a saving unit, configured to save the target picture information to a picture information database.
Optionally, the obtaining unit is specifically configured to receive the target picture information sent by the video source device.
Optionally, the obtaining unit is specifically configured to receive the target picture and the acquisition time of the target picture sent by the video source device; and to model the target picture and extract the attribute information of the target picture.
Optionally, the obtaining unit is specifically configured to receive the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device; to model the target picture and extract second attribute information of the target picture; and to determine the attribute information of the target picture according to the first attribute information and the second attribute information of the target picture.
Optionally, the obtaining unit is specifically configured to perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and to model the target picture and extract the attribute information of the target picture.
Optionally, the feature information of the target to be searched includes attribute information of the target to be searched; the search unit is specifically configured to search the picture information database for the matching first target picture according to the attribute information of the target to be searched.
Optionally, the feature information of the target to be searched includes a target picture to be searched; the search unit is specifically configured to model the target picture to be searched and extract attribute information of the target picture to be searched; and to search the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
Optionally, the feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; the search unit is specifically configured to model the target picture to be searched and extract fourth attribute information of the target picture to be searched; to determine the attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and to search the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
Optionally, the target search request further carries a search time range; the search unit is specifically configured to filter the target pictures in the picture information database according to the search time range to obtain second target pictures whose acquisition time falls within the search time range; and to search the second target pictures for the matching first target picture according to the feature information of the target to be searched.
Optionally, the target search request further carries a search channel number, and the target picture information further includes a channel number of the target picture; the search unit is specifically configured to filter the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is the same as the search channel number; and to search the third target pictures for the matching first target picture according to the feature information of the target to be searched.
Optionally, the target picture is a face picture, and the target search request is a face search request.
Optionally, the target picture is a vehicle picture, and the target search request is a vehicle search request.
Optionally, the processing unit is specifically configured to sort the first target pictures in order of acquisition time from earliest to latest; and to generate the video summary from the sorted first target pictures.
Optionally, the processing unit is specifically configured to determine, for each first target picture, a target video clip corresponding to the first target picture, where the target video clip is the recording data between the n-th second before the acquisition time corresponding to the first target picture and the m-th second after that acquisition time; and to generate the video summary according to the target video clips.
Optionally, the processing unit is specifically configured to: when there are multiple first target pictures with the same acquisition time, for any one of the multiple first target pictures, determine a start time point and an end time point of the video clip corresponding to that first target picture, where the start time point is the n-th second before the acquisition time corresponding to that first target picture, and the end time point is the m-th second after that acquisition time; search the recording data of the video data channel to which that first target picture belongs for an I frame at the start time point and for an I frame at the end time point; and if both the I frame at the start time point and the I frame at the end time point exist, discard the remaining first target pictures among the multiple first target pictures, and determine the video clip corresponding to that first target picture as the target video clip.
Optionally, the processing unit is further configured to: if no I frame exists at the start time point, increase the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeat the above search step until an I frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time; if no I frame exists at the end time point, decrease the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeat the above search step until an I frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time; select, from the video clips respectively corresponding to the multiple first target pictures, the video clip with the longest duration as the target video clip; and discard the first target pictures corresponding to the remaining video clips.
Optionally, the processing unit is specifically configured to filter the target video clips according to the start time point and the end time point of each target video clip to remove recording data that overlaps in time; and to generate the video summary according to the filtered target video clips.
Optionally, the processing unit is specifically configured to sort the target video clips by their start time points; and, for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merge the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip and the end time point of the merged video clip is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, use the end time point of the first target video clip as the start time point of the second target video clip, or use the start time point of the second target video clip as the end time point of the first target video clip; where the start time point of the first target video clip is less than the start time point of the second target video clip.
According to a third aspect of the embodiments of the present application, an electronic device is provided, characterized by including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus; the memory is configured to store a computer program; and the processor is configured to implement the steps of the above video summary generating method when executing the computer program stored in the memory.
According to a fourth aspect of the embodiments of the present application, a computer-readable storage medium is provided, characterized in that a computer program is stored in the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the above video summary generating method.
In the video summary generating method of the embodiments of the present application, a target search request is received, a first target picture that matches the feature information of the target to be searched carried in the target search request is searched for, and a video summary of the target to be searched is generated according to each first target picture and its corresponding acquisition time. This improves the efficiency and accuracy of locating a target in video recordings and, while removing video recordings that do not match the target to be searched, preserves the continuity of target video tracking.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic architecture diagram of a video summary generating system according to an exemplary embodiment of the present application;
FIG. 2 is a schematic flowchart of a video summary generating method according to an exemplary embodiment of the present application;
FIG. 3 is a schematic flowchart of generating a picture information database according to an exemplary embodiment of the present application;
FIG. 4 is a schematic flowchart of duplicate picture filtering according to an exemplary embodiment of the present application;
FIG. 5 is a schematic flowchart of a video summary generating system according to another exemplary embodiment of the present application;
FIG. 6 is a schematic flowchart of extracting face picture attributes according to an exemplary embodiment of the present application;
FIG. 7 is a schematic flowchart of extracting face picture attributes according to another exemplary embodiment of the present application;
FIG. 8 is a schematic flowchart of a video summary generating system according to still another exemplary embodiment of the present application;
FIG. 9 is a schematic flowchart of extracting vehicle picture attributes according to an exemplary embodiment of the present application;
FIG. 10 is a schematic flowchart of extracting vehicle picture attributes according to another exemplary embodiment of the present application;
FIG. 11 is a schematic flowchart of a video summary generating system according to yet another exemplary embodiment of the present application;
FIG. 12 is a schematic structural diagram of a video summary generating apparatus according to an exemplary embodiment of the present application;
FIG. 13 is a schematic structural diagram of a video summary generating apparatus according to another exemplary embodiment of the present application;
FIG. 14 is a schematic diagram of the hardware structure of an electronic device according to an exemplary embodiment of the present application.
DETAILED DESCRIPTION
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
The terms used in the present application are only for the purpose of describing particular embodiments and are not intended to limit the present application. The singular forms "a", "said", and "the" used in the present application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise.
To help those skilled in the art better understand the technical solutions provided by the embodiments of the present application, the system architecture to which the embodiments apply is briefly described first.
Refer to FIG. 1, which is a schematic architecture diagram of a video summary generating system provided by an embodiment of the present application. As shown in FIG. 1, the video summary generating system may include a video source device 110 and a search device 120. The video source device 110 may provide video data, and the video data may include real-time video data or video recording data (referred to herein as recording data). The search device 120 may receive a target search request; search, according to the feature information of the target to be searched carried in the target search request, the video data of the video source device 110 for target pictures that match the feature information of the target to be searched (referred to herein as first target pictures); and generate a video summary of the target to be searched according to the first target pictures found.
It should be noted that, in the embodiments of the present application, the video source device 110 may be a front-end video capture device (such as an IPC (Internet Protocol Camera)) or a video recording storage device (such as an NVR (Network Video Recorder)); the search device 120 may be an NVR with a target search function, or a device deployed in a video surveillance system and dedicated to target search. When the video source device 110 is an NVR, the video source device 110 and the search device 120 may be the same device.
In addition, one video source device 110 may provide video data to multiple search devices 120, and one search device 120 may obtain video data from multiple video source devices 110 (the figure takes a one-to-one arrangement as an example). When the video source device 110 is an IPC, one video source device 110 may correspond to one video data channel; when the video source device 110 is a video recording storage device such as an NVR, one video source device 110 may provide the video data (recording data) of multiple video data channels.
To make the above objects, features, and advantages of the embodiments of the present application easier to understand, the technical solutions in the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
Refer to FIG. 2, which is a schematic flowchart of a video summary generating method provided by an embodiment of the present application. The video summary generating method may be applied to a search device (an NVR is taken as an example). As shown in FIG. 2, the video summary generating method may include the following steps.
Step S200: Receive a target search request, where the target search request carries feature information of a target to be searched.
In the embodiments of the present application, the target may include, but is not limited to, a human face, a vehicle, or a license plate. When the target to be searched is a human face, the feature information of the target to be searched may include, but is not limited to, one or more of a face picture and structured information of the face (such as whether the person is smiling, whether the person wears glasses, gender, and age range). When the target to be searched is a vehicle, the feature information of the target to be searched may include, but is not limited to, one or more of a vehicle picture and feature information of the vehicle (such as color, type, logo, and brand). When the target to be searched is a license plate, the feature information of the target to be searched may include, but is not limited to, one or more of a license plate picture and feature information of the license plate (such as color, position, and license plate number).
In the embodiments of the present application, the target may also include a human body, an animal, and the like. When the target to be searched is a human body, the feature information of the target to be searched may include, but is not limited to, one or more of a human body picture and feature information of the human body (such as height, build, gender, skin color, and clothing). When the target to be searched is an animal, the feature information of the target to be searched may include, but is not limited to, one or more of an animal picture and feature information of the animal (such as species, fur color, and size).
Step S210: Search for a first target picture that matches the feature information of the target to be searched.
In the embodiments of the present application, when the search device receives the target search request, it may search, according to the feature information of the target to be searched, the video data of the video source device for a first target picture that matches the feature information of the target to be searched. The first target picture and the acquisition time of the first target picture may together be used as the first target picture information that matches the feature information of the target to be searched.
In the embodiments of the present application, when the search device searches for a first target picture that matches the feature information of the target to be searched, the search result may be one or more first target pictures, or the search result may be empty. When the search result is empty, that is, when no first target picture matching the feature information of the target to be searched is found, the search device may determine that the target search has failed and return a search failure response message.
In the embodiments of the present application, the acquisition time of a target picture may be the time at which the front-end video capture device acquired (for example, captured a snapshot of) the target picture, or the time at which the target picture appears in the video image acquired by the front-end video capture device.
It should be noted that, in the embodiments of the present application, the acquisition time of the target picture may be carried in the target picture (for example, displayed at a specific position in the target picture, such as the lower left corner or the lower right corner), or the acquisition time of the target picture may be stored separately from the target picture; the specific implementation is not described in detail here.
Step S220: Generate a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
In the embodiments of the present application, after the search device determines the first target picture that matches the feature information of the target to be searched, it may generate a video summary of the target to be searched according to the first target picture and its corresponding acquisition time.
In one embodiment of the present application, generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: sorting multiple first target pictures by acquisition time from earliest to latest, and generating the video summary from the sorted first target pictures. In this embodiment, the search device may directly sort the first target pictures by acquisition time from earliest to latest and generate the video summary.
For example, when the number of first target pictures found by the search device exceeds a preset number threshold (which may be set according to actual needs, such as 200 or 500 pictures), and, after sorting by acquisition time from earliest to latest, the difference between the acquisition times of any two adjacent first target pictures does not exceed a preset time threshold (which may be set according to actual needs, such as 1 second or 2 seconds), the search device may generate the video summary directly from the sorted first target pictures.
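The example condition above can be sketched as follows. This is a minimal illustration only: the names `TargetPicture`, `sort_by_acquisition_time`, and `can_summarize_directly`, as well as the default threshold values, are assumptions introduced here and are not defined by the application.

```python
from dataclasses import dataclass


@dataclass
class TargetPicture:
    picture_id: str          # hypothetical identifier of a first target picture
    acquisition_time: float  # acquisition time in seconds on a common time base


def sort_by_acquisition_time(pictures):
    """Return the pictures ordered from earliest to latest acquisition time."""
    return sorted(pictures, key=lambda p: p.acquisition_time)


def can_summarize_directly(pictures, count_threshold=200, gap_threshold=2.0):
    """Check the example condition: more pictures than the preset number
    threshold, and no gap between adjacent sorted pictures larger than
    the preset time threshold."""
    ordered = sort_by_acquisition_time(pictures)
    if len(ordered) <= count_threshold:
        return False
    return all(
        later.acquisition_time - earlier.acquisition_time <= gap_threshold
        for earlier, later in zip(ordered, ordered[1:])
    )
```

When the check passes, the sorted pictures themselves may serve as the frames of the summary; otherwise one of the clip-based embodiments may be more suitable.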
In another embodiment of the present application, generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture includes: for each first target picture, determining a target video clip corresponding to that first target picture, where the target video clip is the recorded video data from the n-th second before the acquisition time of the first target picture to the m-th second after that acquisition time; and generating the video summary from the target video clips.
In the embodiments of the present application, when the search device finds multiple first target pictures that match the feature information of the target to be searched, then for each first target picture the search device may determine the recorded video data from the n-th second before its acquisition time to the m-th second after its acquisition time as the target video clip corresponding to that first target picture, so as to obtain video recordings that match the feature information of the target to be searched. Here, the acquisition time of a first target picture is the time at which that picture appears in the video data from which it was extracted (that is, the time at which the video capture device acquired it).
In the embodiments of the present application, the start time point (the n-th second before the acquisition time of the first target picture) and the end time point (the m-th second after the acquisition time of the first target picture) of the target video clip corresponding to a first target picture may be preconfigured in the search device, or may be carried in the target search request (set by the user according to actual needs, or left at default values). Here, n and m are non-negative numbers.
It should be noted that when the video data provided to the search device by the video source device includes video data of multiple video data channels, the search device, when determining the target video clip corresponding to a first target picture, may first determine the video data channel to which the first target picture belongs and then determine the corresponding target video clip within the recorded video data of that channel. That is, within the recorded video data of that video data channel, the data from the n-th second before the acquisition time of the first target picture to the m-th second after it is determined as the target video clip corresponding to the first target picture.
Accordingly, in this case, the first target picture information may further include the channel number of the first target picture, that is, the channel number of the video data channel to which the first target picture belongs.
In this embodiment, after the search device determines the target video clips corresponding to the first target pictures that match the feature information of the target to be searched, it may fuse the target video clips, that is, sort and splice them according to their start times and/or end times, to generate the video summary of the target to be searched.
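A minimal sketch of the clip window and the fusion ordering described above is given below. The `Clip` type and the function names are hypothetical, and the handling of overlapping clips is left open because the text does not specify it; the sketch only computes each window and orders the clips for splicing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    channel: int  # channel number of the video data channel the picture belongs to
    start: float  # start time point, in seconds
    end: float    # end time point, in seconds


def clip_for_picture(channel, acquisition_time, n, m):
    """Window from n seconds before to m seconds after the acquisition time."""
    if n < 0 or m < 0:
        raise ValueError("n and m must be non-negative")
    return Clip(channel, acquisition_time - n, acquisition_time + m)


def order_clips_for_splicing(clips):
    """Sort clips by start time, then end time, as the splice order."""
    return sorted(clips, key=lambda c: (c.start, c.end))
```

For instance, a picture acquired at second 100 on channel 1 with n = 5 and m = 10 yields a clip covering seconds 95 to 110 of that channel's recording.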
It can be seen that, in this embodiment, first target pictures matching the feature information of the target to be searched carried in the target search request are searched for, and the recorded video data from the n-th second before to the m-th second after the acquisition time of each first target picture is determined as the target video clip corresponding to that first target picture; the target video clips are then fused to obtain the video summary of the target to be searched. Therefore, the efficiency and accuracy of locating a target in video recordings are improved, and the continuity of target video tracking is preserved while video recordings that do not match the target to be searched are removed.
After the search device generates the video summary of the target to be searched, the video summary may further be played back. That is, the video summary may be generated from the target video clips and then decoded and displayed.
Further, in the embodiments of the present application, if the matching first target pictures were extracted from the video data provided by the video source device every time a target search is performed, each search would take a long time and search efficiency would be low. Therefore, to improve the efficiency of target search, target picture information may be extracted in advance from the video data provided by the video source device and saved to a picture information database; when a target search is required, the matching first target pictures are searched for directly in the picture information database.
Accordingly, in one embodiment of the present application, before searching for the first target picture that matches the feature information of the target to be searched, the method may further include: obtaining target picture information from the video data of the video source device, where the target picture information includes the target picture, the acquisition time of the target picture, and the attribute information of the target picture; and saving the obtained target picture information to the picture information database.
In this embodiment, the search device may obtain the target picture information from the video data of the video source device in advance and save it to the picture information database. When a target search is required, it may then search the picture information database directly for the matching first target picture according to the feature information of the target to be searched, which improves target search efficiency. The target picture information may include, but is not limited to, the target picture, the feature information of the target picture, and the acquisition time of the target picture. The picture information database may be a designated storage space in the search device, or it may be a third-party database.
In one embodiment of the present application, obtaining the target picture information from the video data of the video source device may include: receiving the target picture information sent by the video source device.
In this embodiment, when the video source device has a target picture acquisition function (such as a target picture snapshot function or a target detection function) and a target picture analysis function, the video source device may directly obtain target picture information such as the target picture, the acquisition time of the target picture, and the attribute information of the target picture, and send the target picture information to the search device. The search device may receive the target picture information sent by the video source device.
For example, assuming that the video source device is an IPC with a target picture snapshot function and a target picture analysis function, the video source device may capture a target picture (and record the capture time, that is, the acquisition time) and analyze the captured target picture to extract its attribute information. The video source device may then send the target picture information, such as the target picture, the acquisition time of the target picture, and the attribute information of the target picture, to a search device such as an NVR, and the search device stores the received target picture information.
In another embodiment of the present application, obtaining the target picture information from the video data of the video source device may include: receiving the target picture and the acquisition time of the target picture sent by the video source device; and modeling the target picture and extracting the attribute information of the target picture.
In this embodiment, when the video source device has a target picture acquisition function (such as a target picture snapshot function or a target detection function), the video source device may obtain the target picture and its acquisition time and send them to the search device. On receiving the target picture and its acquisition time from the video source device, the search device may model the target picture and extract its attribute information. The search device thereby obtains the target picture, the acquisition time of the target picture, and the attribute information of the target picture from the video data of the video source device.
For example, assuming that the video source device is an IPC with a target picture snapshot function, the video source device may capture a target picture (and record the capture time, that is, the acquisition time) and send the captured target picture and its acquisition time to a search device such as an NVR. On receiving the target picture, the search device may model it and extract its attribute information. The search device may then store the target picture, the acquisition time of the target picture, and the attribute information of the target picture.
In yet another embodiment of the present application, obtaining the target picture information from the video data of the video source device may include: receiving the target picture information, the acquisition time of the target picture, and the first attribute information of the target picture sent by the video source device; modeling the target picture and extracting second attribute information of the target picture; and determining the attribute information of the target picture according to the first attribute information and the second attribute information of the target picture.
In this embodiment, when the video source device has a target picture acquisition function (such as a target picture snapshot function or a target detection function) and a target picture analysis function, the video source device may directly obtain the target picture, the acquisition time of the target picture, and the attribute information of the target picture (referred to herein as the first attribute information of the target picture), and send them to the search device. On receiving the target picture from the video source device, the search device may model the received target picture and extract its attribute information (referred to herein as the second attribute information of the target picture).
For any target picture, the search device may compare the first attribute information with the second attribute information of the target picture. Attribute information that exists in the first attribute information but not in the second, or that exists in the second attribute information but not in the first, is added to the attribute information of the target picture. For attribute information that exists in both, the value from the second attribute information is added to the attribute information of the target picture. The attribute information of the target picture is obtained in this way.
It should be noted that, in this embodiment, the search device may also directly use the second attribute information of the target picture as the attribute information of the target picture.
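The comparison rule for the first and second attribute information can be sketched as a dictionary merge. Representing attribute information as a Python `dict` and the name `merge_attributes` are assumptions made for illustration; the application does not prescribe a data structure.

```python
def merge_attributes(first, second):
    """Combine first (device-reported) and second (locally extracted) attributes.

    Attributes present in only one source are kept as they are; for
    attributes present in both, the value from the second source wins.
    """
    merged = dict(first)   # start from the first attribute information
    merged.update(second)  # second attribute information overrides on conflict
    return merged


# Example: "color" exists in both, so the locally extracted value is used.
device_attrs = {"color": "red", "type": "SUV"}
local_attrs = {"color": "dark red", "brand": "ExampleBrand"}
combined = merge_attributes(device_attrs, local_attrs)
```

The alternative noted above, using the second attribute information alone, corresponds to returning `dict(second)` without consulting `first`.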
In yet another embodiment of the present application, obtaining the target picture, the acquisition time of the target picture, and the attribute information of the target picture from the video data of the video source device may include: performing target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and modeling the target picture and extracting the attribute information of the target picture.
In this embodiment, when the video source device does not have a target picture acquisition function, or when the video source device and the search device are the same device (such as an NVR), the search device may directly perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture. After obtaining the target picture, the search device may further model the target picture and extract its attribute information.
For example, assuming that the video source device is an IPC without a target picture snapshot function, the video source device may send the acquired video data to a search device such as an NVR. On receiving the video data, the search device may perform target detection on it to obtain the target picture in the video data and the acquisition time of the target picture (the time at which the target picture appears in the video data), model the target picture, and extract its attribute information. The search device may then store the target picture, the acquisition time of the target picture, and the attribute information of the target picture.
In the embodiments of the present application, after the search device obtains the target picture, the acquisition time of the target picture, and the attribute information of the target picture from the video data of the video source device, it may store the obtained target picture, acquisition time, and attribute information.
Further, in the embodiments of the present application, to reduce the storage of redundant picture information, when multiple pieces of target picture information containing the same target are detected in the video data of the same scene and the acquisition times of their target pictures differ only slightly, only one of those pieces of target picture information may be saved to the picture information database.
Accordingly, in one embodiment of the present application, saving the target picture information to the picture information database may include: for any piece of target picture information of any video data channel, determining whether the picture information database already stores other target picture information that contains the same target as this target picture information, where the other target picture information comes from the same video data channel as this target picture information, and the difference between the acquisition time included in the other target picture information and the acquisition time included in this target picture information is less than a preset time threshold; and, when the picture information database stores no such other target picture information, saving this target picture information to the picture information database.
In this embodiment, it is considered that the video data in one video data channel (such as the video data acquired by one IPC) is usually video data of one fixed scene. Therefore, for any piece of target picture information of any video data channel (whether obtained by the search device through target detection on the video data provided by the video source device, or received by the search device from the video source device), the search device may, before saving it to the picture information database, determine whether the picture information database stores other target picture information that meets all of the following conditions: it contains the same target as this target picture; it comes from the same video data channel; and the difference between its acquisition time and the acquisition time included in this target picture information is less than the preset time threshold.
When the search device determines that no other target picture information meeting the above conditions exists in the picture information database, it may save the target picture information to the picture information database. When the search device determines that such other target picture information does exist, it declines to save the target picture information, for example by discarding it directly, to reduce redundant picture storage. This reduces the workload of searching for the first target picture in the picture information database and improves search efficiency.
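The redundancy check above may be sketched as follows. The in-memory `PictureInfoStore`, the `target_id` field used to decide whether two records contain the same target, and the linear scan are all illustrative assumptions; the application does not state how the same target is recognized or how the database is organized.

```python
from dataclasses import dataclass


@dataclass
class PictureInfo:
    target_id: str           # hypothetical label identifying the detected target
    channel: int             # channel number of the video data channel
    acquisition_time: float  # acquisition time in seconds


class PictureInfoStore:
    """Toy picture information database that rejects near-duplicate records."""

    def __init__(self, time_threshold):
        self.time_threshold = time_threshold  # preset time threshold, seconds
        self.records = []

    def save(self, info):
        """Save info unless a record with the same target and channel exists
        whose acquisition time differs by less than the threshold."""
        for other in self.records:
            same_target = other.target_id == info.target_id
            same_channel = other.channel == info.channel
            close_in_time = (
                abs(other.acquisition_time - info.acquisition_time)
                < self.time_threshold
            )
            if same_target and same_channel and close_in_time:
                return False  # redundant: discard instead of storing
        self.records.append(info)
        return True
```

A production database would typically index records by channel and time rather than scanning every record, but the acceptance rule would be the same.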
In the embodiments of the present application, the feature information of the target to be searched may include a picture of the target to be searched and/or attribute information of the picture of the target to be searched.
In one embodiment of the present application, the feature information of the target to be searched includes a picture of the target to be searched. Accordingly, searching for the first target picture that matches the feature information of the target to be searched includes: modeling the picture of the target to be searched and extracting its attribute information; and searching the picture information database for the matching first target picture according to that attribute information.
In this embodiment, the search device may provide a target search function that, according to the picture of the target to be searched carried in the received target search request, searches for matching target pictures in a search-by-image manner.
For example, the search device may provide a target search request interface that includes an area for entering and/or selecting the picture of the target to be searched. The user enters and/or selects the picture in this interface and submits the target search request.
On receiving the target search request, the search device models the picture of the target to be searched and extracts its attribute information. The search device may then query the stored target picture information according to that attribute information, and determine the target picture information whose attribute information matches it as the first target picture information.
For example, assuming that the target is a human face and the feature information of the target to be searched is a face picture, the search device, on receiving the target search request, may search for matching first target face pictures in a search-by-image manner. That is, it models the face picture carried in the target search request to obtain a feature model of that face picture, determines the face pictures in the video data of the video source device whose similarity to that feature model is greater than or equal to a preset similarity threshold as first target face pictures, and takes each first target face picture together with its acquisition time as the first target face picture information.
The similarity threshold may be preconfigured in the search device, or may be carried in the target search request (set by the user according to actual needs, or left at a default value).
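The threshold comparison in the face example can be illustrated with a small sketch. The application does not say how a feature model is represented or how similarity is computed; the use of numeric feature vectors and cosine similarity here is purely an assumption, as are the function names and the default threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search_by_picture(query_model, stored, threshold=0.8):
    """Return (picture_id, acquisition_time) for every stored picture whose
    feature model is at least `threshold` similar to the query model.

    `stored` is an iterable of (picture_id, acquisition_time, feature_model).
    """
    return [
        (picture_id, acquisition_time)
        for picture_id, acquisition_time, model in stored
        if cosine_similarity(query_model, model) >= threshold
    ]
```

An empty result corresponds to the case in which the search device returns a search failure response message.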
In one embodiment of the present application, the feature information of the target to be searched includes a picture of the target to be searched and third attribute information of that picture. Accordingly, searching for the first target picture that matches the feature information of the target to be searched includes: modeling the picture of the target to be searched and extracting fourth attribute information of that picture; determining the attribute information of the picture of the target to be searched according to its third attribute information and its fourth attribute information; and searching the picture information database for the matching first target picture according to the attribute information of the picture of the target to be searched.
In this embodiment, when the feature information of the target to be searched includes a picture of the target to be searched and third attribute information of that picture, the search device may further model the picture and extract its attribute information (referred to herein as the fourth attribute information of the picture of the target to be searched).
After obtaining the fourth attribute information, the search device may determine the attribute information of the picture of the target to be searched according to the third attribute information and the fourth attribute information.
For example, the search device may compare the third attribute information with the fourth attribute information of the picture of the target to be searched. Attribute information that exists in the third attribute information but not in the fourth, or that exists in the fourth attribute information but not in the third, is added to the attribute information of the picture of the target to be searched. For attribute information that exists in both, the value from the fourth attribute information is added. The attribute information of the picture of the target to be searched is obtained in this way.
After obtaining the attribute information of the picture of the target to be searched, the search device may query the target picture information in the picture information database according to that attribute information, and determine the target picture information whose attribute information matches it as the first target picture information.
在本申请其中一个实施例中,上述待搜索目标的特征信息包括待搜索目标的属性信息。相应地,搜索与待搜索目标的特征信息匹配的第一目标图片,包括:根据所述待搜索目标的属性信息在所述图片信息库中搜索匹配的所述第一目标图片。In one embodiment of the present application, the feature information of the target to be searched includes attribute information of the target to be searched. Correspondingly, searching for a first target picture that matches the feature information of the target to be searched includes: searching for the matching first target picture in the picture information database according to the attribute information of the target to be searched.
在本申请实施例中,待搜索目标的特征信息也可以为待搜索目标图片的属性信息,搜索设备接收到目标搜索请求时,可以直接根据其中携带的待搜索目标图片的属性信息在图片信息库中搜索匹配的第一目标图片。In the embodiment of the present application, the feature information of the target to be searched may also be attribute information of the target picture to be searched. When the search device receives the target search request, it may directly according to the attribute information of the target picture to be searched carried in the picture information database. Search for a matching first target picture.
Further, in the embodiment of the present application, to further improve the efficiency of searching for the first target picture, the target search request may also carry a specific filtering attribute. This attribute instructs the search device to first filter the target picture information in the picture information database according to the filtering attribute, and then search for the first target picture. The specific filtering attribute may include, but is not limited to, a search time range and/or a search channel number.
Correspondingly, in one embodiment of the present application, the target search request also carries a search time range. Searching for the first target picture that matches the feature information of the target to be searched may include: filtering the target pictures in the picture information database according to the search time range to obtain second target pictures whose acquisition time is within the search time range; and searching the second target pictures for the matching first target picture according to the feature information of the target to be searched.
In this embodiment, when the search device receives the target search request, it may first filter the target pictures in the picture information database according to the search time range carried in the request to obtain the second target pictures whose acquisition time is within the search time range. For example, assuming that the search time range is [t1, t2] (t2 > t1), a second target picture is a target picture whose acquisition time t satisfies t1 ≤ t ≤ t2.
In this embodiment, after obtaining the second target pictures, the search device may search them for the matching first target picture according to the feature information of the target to be searched.
In another embodiment of the present application, the target search request also carries a search channel number, and the target picture information further includes the channel number of the target picture (that is, the channel number of the video data channel to which the target picture belongs). Searching for the first target picture that matches the feature information of the target to be searched includes: filtering the target picture information in the picture information database according to the search channel number to obtain third target pictures whose channel number is the same as the search channel number; and searching the third target pictures for the matching first target picture according to the feature information of the target to be searched.
In this embodiment, before searching for the first target picture, the search device may first filter the target pictures in the picture information database according to the search channel number and the channel number of each target picture in the database to obtain the third target pictures whose channel number is the same as the search channel number, and then search the third target pictures for the first target picture.
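The two pre-filters described above (search time range and search channel number) can be sketched together. This is an illustrative sketch only: the record type `TargetPictureInfo`, its field names, and the function `prefilter` are assumptions made for the example, and the feature matching that follows the filtering is not shown.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class TargetPictureInfo:
    picture_id: str      # hypothetical identifier of the stored target picture
    capture_time: float  # acquisition time, in seconds
    channel: int         # channel number of the video data channel


def prefilter(
    records: Iterable[TargetPictureInfo],
    time_range: Optional[Tuple[float, float]] = None,
    channels: Optional[Set[int]] = None,
) -> List[TargetPictureInfo]:
    """Keep records with t1 <= capture_time <= t2 and a matching channel.

    A filter that is None is skipped, so either attribute may be
    carried in the search request on its own.
    """
    kept = []
    for record in records:
        if time_range is not None:
            t1, t2 = time_range
            if not (t1 <= record.capture_time <= t2):
                continue
        if channels is not None and record.channel not in channels:
            continue
        kept.append(record)
    return kept


library = [
    TargetPictureInfo("p1", 10.0, 1),
    TargetPictureInfo("p2", 25.0, 2),
    TargetPictureInfo("p3", 40.0, 1),
]
candidates = prefilter(library, time_range=(20.0, 50.0), channels={1})
```

With these sample records only `p3` remains, since `p1` falls outside the time range and `p2` belongs to another channel.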
Further, in the embodiment of the present application, when there is video data from multiple video data channels and the scenes covered by these channels overlap (for example, the monitoring fields of view of multiple IPCs share a common area), the same target may appear in the video data of several video data channels at the same time point. In this case, to avoid excessively similar video data in the video summary, deduplication may be performed on the first target pictures.
Correspondingly, in one embodiment of the present application, determining, for any first target picture, the target video clip corresponding to the first target picture may include: when there are multiple first target pictures with the same acquisition time, for any first target picture among them, determining a start time point and an end time point of the video clip corresponding to the first target picture, where the start time point is the n-th second before the acquisition time of the first target picture and the end time point is the m-th second after that acquisition time; searching the recording data of the video data channel to which the first target picture belongs for an I frame at the start time point and for an I frame at the end time point; and, if both I frames exist, discarding the remaining first target pictures among the multiple first target pictures and determining the video clip corresponding to this first target picture as the target video clip.
In this embodiment, when there are multiple first target pictures with the same acquisition time, the search device may determine, according to a preset policy, the start time point and the end time point of the video clip corresponding to each of these first target pictures. For any first target picture among them, the start time point of the corresponding video clip is the n-th second before the acquisition time of the first target picture, and the end time point is the m-th second after that acquisition time.
After determining the start time point and the end time point, the search device may search the recording data of the video data channel to which the first target picture belongs for an I frame at the start time point (that is, whether a key frame exists at the start time point in that video data channel) and for an I frame at the end time point (that is, whether a key frame exists at the end time point in that video data channel). If both I frames exist, the search device may directly determine the video clip corresponding to this first target picture as the target video clip and discard the remaining first target pictures among the multiple first target pictures.
Further, in this embodiment, suppose that none of the multiple first target pictures has both an I frame at its start time point (the n-th second before its acquisition time) and an I frame at its end time point (the m-th second after its acquisition time). Then, for any first target picture among them, if no I frame exists at the start time point in the recording data of the video data channel to which the first target picture belongs, the search device may increase the start time point of the corresponding video clip by x seconds and check whether an I frame exists at the new start time point. If not, it increases the start time point by x seconds again and repeats the check. This continues until an I frame at the (updated) start time point is found in the recording data of that video data channel, or until the start time point of the video clip is the same as the acquisition time of the first target picture. Here, x is a positive number.
Similarly, if no I frame exists at the end time point in the recording data of the video data channel to which the first target picture belongs, the search device may decrease the end time point of the corresponding video clip by x seconds and check whether an I frame exists at the new end time point. If not, it decreases the end time point by x seconds again and repeats the check. This continues until an I frame at the (updated) end time point is found in the recording data of that video data channel, or until the end time point of the video clip is the same as the acquisition time of the first target picture.
For example, when the acquisition time of the first target picture is 1 minute 0 seconds, the start time point may be 0 minutes 58 seconds and the end time point may be 1 minute 3 seconds. The search device checks whether the recording data of the video data channel to which the first target picture belongs contains an I frame at 0 minutes 58 seconds. If not, the start time point is increased by 1 second to 0 minutes 59 seconds, and the check is repeated for 0 minutes 59 seconds. If there is still no I frame, the start time point is increased by 1 second to 1 minute 0 seconds; because this start time point equals the acquisition time, no further check is made and the acquisition time is used as the start time point. Likewise, the search device checks whether the recording data contains an I frame at 1 minute 3 seconds. If not, the end time point is decreased by 1 second to 1 minute 2 seconds, and the check is repeated for 1 minute 2 seconds. If an I frame exists there, the recording data between 1 minute 0 seconds and 1 minute 2 seconds is used as the video clip corresponding to the first target picture. If there is still no I frame, the end time point is again decreased by 1 second and the check is repeated, until the new end time point is 1 minute 0 seconds. In that case, the one second of recording data at 1 minute 0 seconds is used as the video clip corresponding to the first target picture.
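The boundary adjustment in the example above can be sketched as a small function. The sketch assumes integer time points in seconds and a caller-supplied predicate `has_iframe(t)` that reports whether the channel's recording contains an I frame at second `t`; both the predicate and the function name `clip_bounds` are hypothetical. Clamping to the acquisition time when the step does not divide the offset evenly is also an assumption of this sketch.

```python
from typing import Callable, Tuple


def clip_bounds(
    capture: int,
    n: int,
    m: int,
    has_iframe: Callable[[int], bool],
    step: int = 1,
) -> Tuple[int, int]:
    """Return (start, end) of the clip around `capture`.

    The start begins n seconds before the acquisition time and moves
    forward by `step` until an I frame is found or it reaches the
    acquisition time. The end begins m seconds after the acquisition
    time and moves backward in the same way.
    """
    if step <= 0:
        raise ValueError("step must be positive")

    start = capture - n
    while start < capture and not has_iframe(start):
        start = min(start + step, capture)  # never move past the acquisition time

    end = capture + m
    while end > capture and not has_iframe(end):
        end = max(end - step, capture)      # never move before the acquisition time

    return start, end


# The example from the text: acquisition at 1 min 0 s (second 60), n = 2, m = 3,
# and the only I frame near the window is at 1 min 2 s (second 62).
iframes = {62}
bounds = clip_bounds(60, 2, 3, lambda t: t in iframes)  # (60, 62)
```

If no I frame is found on either side, both boundaries collapse to the acquisition time, which corresponds to the one-second clip described in the text.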
After determining, in the above manner, the start time point and the end time point of the video clip corresponding to each of the multiple first target pictures, the search device may identify the first target picture whose corresponding video clip has the longest duration (that is, the largest difference between the end time point and the start time point), determine the video clip corresponding to that first target picture as the target video clip, and discard the remaining first target pictures among the multiple first target pictures.
It should be noted that, in this embodiment, when more than one of the multiple first target pictures has a corresponding video clip of the longest duration, one of them may be selected according to a preset policy, and the video clip corresponding to the selected first target picture is determined as the target video clip. For example, the first target picture with the earliest start time point may be selected, the first target picture with the latest end time point may be selected, or the selection may be random.
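The selection rule can be sketched as below. The `Candidate` record and the function `pick_clip` are hypothetical names introduced for the example, and the tie-break shown (earliest start time point) is only one of the preset policies the text mentions.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    channel: int  # channel of the first target picture
    start: int    # adjusted start time point, in seconds
    end: int      # adjusted end time point, in seconds


def pick_clip(candidates: Iterable[Candidate]) -> Candidate:
    """Pick the candidate with the longest clip.

    Ties on duration are broken by the earliest start time point,
    which is an assumed policy for this sketch.
    """
    return max(candidates, key=lambda c: (c.end - c.start, -c.start))


same_time = [
    Candidate(channel=1, start=58, end=63),  # 5 s
    Candidate(channel=2, start=60, end=62),  # 2 s
    Candidate(channel=3, start=59, end=64),  # 5 s, but starts later
]
kept = pick_clip(same_time)
```

In this sample the clips on channels 1 and 3 tie at five seconds, and the clip on channel 1 is kept because it starts earlier; the other pictures would be discarded.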
In addition, in the embodiment of the present application, when there are multiple first target pictures with the same acquisition time, deduplication of these first target pictures may also be performed manually. For example, the search device may display the multiple first target pictures in a specified interface, let the user select the first target picture to be retained, and discard the remaining first target pictures with the same acquisition time.
Further, in the embodiment of the present application, when there are multiple first target pictures, the time periods of the target video clips corresponding to the first target pictures may overlap, which affects normal playback of the video summary. Therefore, to improve playback of the video summary, overlapping time may be removed from the target video clips.
Correspondingly, in one embodiment of the present application, generating a video summary according to the target video clips may include: filtering the target video clips according to the start time point and the end time point of each target video clip to remove recording data of overlapping time; and generating the video summary according to the filtered target video clips.
In this embodiment, after determining the target video clip corresponding to each first target picture, the search device may filter the target video clips according to the start time point and the end time point of each target video clip to remove recording data of overlapping time.
In one example, filtering the target video clips according to the start time point and the end time point of each target video clip includes: sorting the target video clips by start time point; and, for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the two clips belong to the same video data channel, merging them, where the start time point of the merged clip is the start time point of the first target video clip and the end time point of the merged clip is the end time point of the second target video clip; if the two clips belong to different video data channels, using the end time point of the first target video clip as the start time point of the second target video clip, or using the start time point of the second target video clip as the end time point of the first target video clip.
Specifically, in this example, the search device may sort the target video clips by start time point. For example, the search device may sort the target video clips by start time point using bubble sort.
For two adjacent target video clips (referred to herein as the first target video clip and the second target video clip, where the start time point of the first target video clip is earlier than that of the second target video clip), when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip, the clips may be processed according to the video data channels to which they belong.
If the first target video clip and the second target video clip belong to the same video data channel, the search device may merge them. The start time point of the merged clip is the start time point of the first target video clip, and the end time point of the merged clip is the end time point of the second target video clip.
If the first target video clip and the second target video clip belong to different video data channels, the search device may use the end time point of the first target video clip as the start time point of the second target video clip, or may use the start time point of the second target video clip as the end time point of the first target video clip. The specific implementation is described below with reference to a specific example.
To help those skilled in the art better understand the technical solutions provided by the embodiments of the present application, the technical solutions are described below with reference to specific examples and the accompanying drawings.
In this embodiment, the search device is an NVR, the video source device is an IPC, the target is a human face, and the target search is a face picture search. The video summary generation scheme is implemented as follows.
1. Face recognition and picture storage
If the NVR is connected to an ordinary IPC, the NVR may use structured video analysis on a GPU (Graphics Processing Unit) to perform face recognition on the real-time video stream transmitted by the IPC and obtain the face picture information in that stream. If the NVR is connected to a dedicated face-capture IPC, the NVR may receive the face picture information from the IPC. The face picture information may include, but is not limited to, the face picture, structured information of the face picture, the acquisition time of the face picture, the channel number of the face picture, and the channel name of the face picture (used to record address information).
The NVR maintains a buffer for each video data channel (hereinafter, channel) between an IPC and the NVR, which stores the face picture information received within the last 3 seconds. When face picture information has been stored in the buffer for 3 seconds, the NVR deletes it from the buffer.
For any channel, when the NVR obtains new face picture information from the channel (either received directly from the IPC or obtained through face detection), it compares the new face picture information with the face picture information in the buffer corresponding to that channel. If the buffer already contains face picture information of the same face, the newly obtained face picture information is discarded; otherwise, it may be added to the buffer and saved to the picture information database.
It should be noted that, if the NVR obtains new face picture information and determines that the buffer contains no face picture information of the same face, but the buffer has no free space, the NVR may overwrite the face picture information that was stored in the buffer earliest with the newly obtained face picture information, and save the newly obtained face picture information to the picture information database.
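The per-channel buffer described above can be sketched as follows. This is a simplified illustration: the class `ChannelBuffer`, the `capacity` limit, and the use of a `face_id` equality check in place of a real face comparison are all assumptions of the sketch. The return value tells the caller whether the new face picture information should be saved to the picture information database.

```python
from collections import deque


class ChannelBuffer:
    """Short-lived buffer of recently seen faces for one channel."""

    def __init__(self, window: float = 3.0, capacity: int = 16) -> None:
        self.window = window      # seconds an entry may stay in the buffer
        self.capacity = capacity  # assumed fixed buffer size
        self.entries = deque()    # (arrival_time, face_id), oldest first

    def offer(self, face_id: str, now: float) -> bool:
        """Return True if the face is new and should be saved to the library."""
        # Delete entries that have been buffered for the full window.
        while self.entries and now - self.entries[0][0] >= self.window:
            self.entries.popleft()

        # Same face already buffered: discard the new information.
        # Equality on face_id stands in for an actual face comparison.
        if any(fid == face_id for _, fid in self.entries):
            return False

        # No free space: overwrite the entry that was stored earliest.
        if len(self.entries) >= self.capacity:
            self.entries.popleft()

        self.entries.append((now, face_id))
        return True


buffer = ChannelBuffer(window=3.0, capacity=2)
first = buffer.offer("face-A", now=0.0)   # new face, saved
repeat = buffer.offer("face-A", now=1.0)  # same face within 3 s, discarded
later = buffer.offer("face-A", now=3.0)   # earlier entry expired, saved again
```

A separate `ChannelBuffer` instance would be kept per channel, so the same face seen on two channels is not suppressed at this stage.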
A schematic flowchart of face recognition and picture storage performed by the NVR may be as shown in FIG. 3.
2. Video fusion parameter configuration
The configurable video fusion parameters may include, but are not limited to, the following.
Channel selection: selecting a single channel means face picture information search and video fusion on that single channel; checking several specified channels means face picture information search across those channels and multi-channel video fusion.
Search time range: the time range within which face picture information is to be searched and matched.
Similarity threshold: for example, a threshold of 90% means that a face picture is considered a match only when the similarity result is 90% or higher.
Video fusion time period: the range parameters of n seconds before and m seconds after the acquisition time of the face picture. For example, configuring 2 seconds before (n = 2) and 3 seconds after (m = 3) means that the video clip from 2 seconds before to 3 seconds after the acquisition time of the face picture is to be extracted.
Duplicate picture filtering mode: either automatic or manual. How each mode is processed is described in point 3.
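For illustration, the parameters listed above could be grouped into one configuration object. The type names, field names, and default values below are assumptions made for the sketch (the defaults mirror the examples in the text); the application does not specify a data format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class DedupMode(Enum):
    AUTO = "auto"
    MANUAL = "manual"


@dataclass(frozen=True)
class FusionParams:
    channels: Tuple[int, ...]           # one channel, or several for multi-channel fusion
    search_range: Tuple[float, float]   # (t1, t2) search time range, in seconds
    similarity_threshold: float = 0.90  # match only at or above this similarity
    seconds_before: int = 2             # n
    seconds_after: int = 3              # m
    dedup_mode: DedupMode = DedupMode.AUTO

    def clip_window(self, capture_time: float) -> Tuple[float, float]:
        """Initial clip range around a face picture's acquisition time."""
        return capture_time - self.seconds_before, capture_time + self.seconds_after


params = FusionParams(channels=(1, 2), search_range=(0.0, 3600.0))
window = params.clip_window(60.0)  # (58.0, 63.0)
```

The window returned here is only the starting point; the I frame checks described in point 3 may still shrink it.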
3. Duplicate picture filtering
The NVR receives a face search request, which carries a target face picture and the video fusion parameters.
Search by image: the NVR builds a model of the target face picture, then searches and compares the face picture information in the picture information database that matches the channel number and the search time range carried in the face search request, and calculates the similarity. It lists the face pictures whose similarity to the target face picture is greater than or equal to the similarity threshold. Each such face picture and its acquisition time constitute face picture information (hereinafter, first face picture information), and the multiple pieces of first face picture information may be recorded in a first face picture information list.
The first face pictures in the first face picture information list are sorted by the acquisition time in each piece of first face picture information, from earliest to latest.
If multiple first face pictures have the same acquisition time (which may happen when the fields of view of multiple IPCs overlap), duplicate picture filtering is performed on them. For any first face picture among them, n seconds are subtracted from its acquisition time to obtain the start time point x of the corresponding video clip, and m seconds are added to its acquisition time to obtain the end time point y of the corresponding video clip. The NVR searches the video recording of the channel for I frames at time points x and y. If no I frame exists at time point x, x is set to x + 1; if no I frame exists at time point y, y is set to y - 1. The search for I frames at the updated time point or time points continues until I frames are found at both time point x and time point y, or until x = y. When x = y, the duration of the video clip corresponding to the first face picture is 1 second.
The duplicate picture filtering mode is either manual or automatic. In manual filtering mode, the multiple first face pictures are displayed in a specified interface, the user checks the first face picture to be retained, and the other first face pictures with the same acquisition time are discarded. In automatic filtering mode, the first face picture whose corresponding video clip is the longest among the multiple first face pictures is retained, and the other first face pictures with the same acquisition time are discarded. When several of the first face pictures share the longest corresponding video clip, one of them is retained and the rest are discarded.
The first face picture information remaining after duplicate picture filtering forms the final first face picture information list.
A schematic flowchart of the NVR generating the final first face picture information list may be as shown in FIG. 4.
4. Duplicate video clip filtering
Based on the final first face picture information list, a set of video clip time period elements is created. Each element includes the following information: the channel number of the channel to which the first face picture belongs, the start time point of the video clip corresponding to the first face picture (a start time point at which an I frame exists), and the end time point of that video clip (an end time point at which an I frame exists). The elements are sorted by start time point, from earliest to latest.
Duplicate video clip filtering: for two adjacent elements in the set of video clip time period elements (hereinafter, the first element and the second element), assume that their start and end time points are [A, B] and [C, D] respectively, where A < C ≤ B. If the first element and the second element have the same channel number, they are merged into one element whose start time point is A and whose end time point is D. If their channel numbers differ, it is determined whether an I frame exists at time point B in the video recording of the channel to which the second element belongs. If an I frame exists at time point B, the start time point of the second element is updated to B, so that the time ranges of the first element and the second element become [A, B] and [B, D]. If no I frame exists at time point B, the end time point of the first element is updated to C, so that the time ranges of the first element and the second element become [A, C] and [C, D].
The entire set of video clip time period elements is traversed in this way, finally forming the set of video clip elements used to generate the video summary.
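The traversal described in point 4 can be sketched as follows. The `Clip` record, the function `remove_overlaps`, and the predicate `has_iframe(channel, t)` are hypothetical names for the example. Two deviations from the text are assumptions of this sketch: the built-in sort replaces the sorting method used by the device, and the merged end point is taken as the later of B and D so that a clip fully contained in its predecessor does not shorten it. For elements on different channels the sketch assumes D ≥ B, as in the text.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Clip:
    channel: int
    start: int  # seconds
    end: int    # seconds


def remove_overlaps(
    clips: Iterable[Clip],
    has_iframe: Callable[[int, int], bool],
) -> List[Clip]:
    """Remove overlapping time between adjacent clips sorted by start."""
    result: List[Clip] = []
    for clip in sorted(clips, key=lambda c: c.start):
        if not result or result[-1].end < clip.start:
            result.append(clip)  # no overlap with the previous element
            continue
        prev = result[-1]        # [A, B] overlaps clip [C, D], with C <= B
        if prev.channel == clip.channel:
            # Same channel: merge into one element [A, D].
            result[-1] = replace(prev, end=max(prev.end, clip.end))
        elif has_iframe(clip.channel, prev.end):
            # I frame at B on the second channel: ranges become [A, B], [B, D].
            result.append(replace(clip, start=prev.end))
        else:
            # No I frame at B: ranges become [A, C], [C, D].
            result[-1] = replace(prev, end=clip.start)
            result.append(clip)
    return result


clips = [Clip(1, 10, 15), Clip(1, 13, 18), Clip(2, 16, 22), Clip(3, 30, 35)]
iframes = {(2, 18)}  # channel 2 has an I frame at second 18
summary = remove_overlaps(clips, lambda ch, t: (ch, t) in iframes)
```

For this input the two channel 1 clips merge into one clip covering seconds 10 to 18, the channel 2 clip is trimmed to start at second 18, and the channel 3 clip is left unchanged.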
5. Video summary generation
Based on the set of video clip elements used to generate the video summary, the corresponding recording data is obtained from the video recording of each corresponding channel, and the video summary is generated from the obtained recording data. The video summary may be downloaded by the user, and the user may export the set of video clip elements used to generate it.
In the embodiment of the present application, a target search request is received, first target pictures matching the feature information of the target to be searched carried in the request are searched for, and a video summary of the target to be searched is generated according to each first target picture and its acquisition time. This improves the efficiency and accuracy of locating the target in the video recording and, while removing video recordings that do not match the target to be searched, preserves the continuity of video tracking of the target.
Refer to FIG. 5, which is a schematic flowchart of a video summary generation method provided by another embodiment of the present application. The method may be applied to a retrieval device. In this example, the method handles a face search request. As shown in FIG. 5, the video summary generation method may include the following steps.
Step S500: acquire and store the face pictures in the video data of the video source device, the collection time of each face picture, and the attribute information of each face picture.
In this embodiment, to improve the efficiency of face retrieval, the retrieval device may acquire and store the face picture information in the video data of the video source device. The face picture information may include, but is not limited to, the face picture, the collection time of the face picture, and the attribute information of the face picture.
In the embodiment of the present application, the attribute information of a face picture may include, but is not limited to, one or more of the following: facial expression (such as whether the person is smiling), whether glasses are worn, gender, age group, and ethnicity.
The specific method for acquiring the face picture, its collection time, and its attribute information from the video data of the video source device is similar to the method for acquiring target picture information described above, with the target picture information replaced by face picture information, and is not repeated here.
In this example, after the retrieval device acquires the face picture, the collection time of the face picture, and the attribute information of the face picture from the video data of the video source device, it may store them.
Storing the face picture, the collection time of the face picture, and the attribute information of the face picture may include: storing the face picture; and recording the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture in a face picture information table.
After the retrieval device acquires the face picture, its collection time, and its attribute information, it may store the face picture and record the storage location, the collection time, and the attribute information in the face picture information table, whose format may be as shown in Table 1:
Table 1
Location information of face picture | Collection time of face picture | Attribute information of face picture
Location information of face picture 1 | Collection time of face picture 1 | Attribute information of face picture 1
Location information of face picture 2 | Collection time of face picture 2 | Attribute information of face picture 2
... | ... | ...
The location information of a face picture may be the position offset and length of the face picture in a storage space (such as a hard disk).
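The storage scheme of Table 1 can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: an in-memory `io.BytesIO` buffer stands in for the hard disk, an in-memory SQLite table named `face_picture_info` stands in for the face picture information table, and the attribute information is serialized as JSON.

```python
import io
import json
import sqlite3

# In-memory stand-ins: BytesIO plays the role of the hard disk, and an
# in-memory SQLite database holds the face picture information table.
disk = io.BytesIO()
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE face_picture_info ("
    " pos_offset INTEGER, length INTEGER,"
    " captured_at REAL, attributes TEXT)"
)


def store_face_picture(picture: bytes, captured_at: float, attributes: dict):
    """Append the picture to the disk and record where it was written."""
    disk.seek(0, io.SEEK_END)
    offset = disk.tell()
    disk.write(picture)
    db.execute(
        "INSERT INTO face_picture_info VALUES (?, ?, ?, ?)",
        (offset, len(picture), captured_at, json.dumps(attributes)),
    )
    return offset, len(picture)


def read_face_picture(offset: int, length: int) -> bytes:
    """Read a picture back from its recorded position offset and length."""
    disk.seek(offset)
    return disk.read(length)


loc1 = store_face_picture(b"JPEG-1", 1000.0, {"glasses": True})
loc2 = store_face_picture(b"JPEG-22", 1012.5, {"glasses": False})
row_count = db.execute("SELECT COUNT(*) FROM face_picture_info").fetchone()[0]
```

Keeping only the offset and length in the table keeps each row small, while the picture bytes stay in bulk storage.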
The above manner of storing the face picture, the collection time of the face picture, and the attribute information of the face picture is only one specific example and does not limit the protection scope of the present application. In the embodiments of the present application, this information may also be stored in other ways.
For example, in one case, the face picture, its collection time, and its attribute information may be stored in the same database (that is, the face picture is stored directly in the database in binary form). In this case they may be stored in the same data table, and the storage location of the face picture does not need to be recorded separately.
In another case, the face picture may still be stored first to obtain its storage location, but the storage location, the collection time, and the attribute information are then stored in a form other than a data table, such as a tree structure or a file. The specific implementation is not described here.
Step S510: when a face retrieval request is received, determine, according to the face retrieval filter condition carried in the request, the first target face picture that matches the face retrieval filter condition.
The retrieval device may provide a face retrieval function that retrieves matching face pictures according to the face retrieval filter condition carried in the received face retrieval request and, at the same time, obtains the collection time of each matching face picture.
For example, the retrieval device may provide a face retrieval request interface that includes an input area and/or selectable options for face retrieval filter conditions. The user enters and/or selects the filter conditions in this interface and submits the face retrieval request.
In one example, the face retrieval filter condition is attribute information of the face picture to be retrieved (referred to herein as the third attribute information of the face picture to be retrieved), which may include, but is not limited to, one or more of the facial expression, whether glasses are worn, the gender, and the age of the face to be retrieved.
In another example, the face retrieval filter condition may include the face picture to be retrieved and the third attribute information of the face picture to be retrieved.
When the retrieval device receives a face retrieval request, it may obtain the face retrieval filter condition carried in the request and use it to query the stored face pictures, collection times, and attribute information. A face picture whose attribute information matches the filter condition is determined to be a face picture matching the filter condition (referred to herein as a first target face picture).
For example, assume that the retrieval device stores the face picture, its collection time, and its attribute information in the form of a face picture information table (see the description of step S500). The retrieval device may then query the attribute information recorded in the face picture information table according to the filter condition, obtain the matching table entries, and read from each entry the storage location of the face picture (that is, the storage location of the first target face picture) and the collection time of the first target face picture.
The retrieval device may then obtain the first target face picture from the specified storage space according to its storage location.
When the face retrieval filter condition includes the face picture to be retrieved and its third attribute information, comparing the filter condition with the attribute information recorded in the face picture information table may include: modeling the face picture to be retrieved and extracting its fourth attribute information; determining the attribute information of the face to be retrieved according to the third attribute information and the fourth attribute information; and comparing the attribute information of the face picture to be retrieved with the attribute information recorded in the face picture information table.
In this case, the retrieval device may model the face picture to be retrieved and extract its attribute information (referred to herein as the fourth attribute information of the face picture to be retrieved).
After obtaining the fourth attribute information, the retrieval device may determine the attribute information of the face to be retrieved according to the third attribute information and the fourth attribute information.
For example, the retrieval device may compare the third attribute information with the fourth attribute information. Attribute information that exists in only one of the two is added to the attribute information of the face picture to be retrieved. For attribute information that exists in both, the value from the fourth attribute information is added. The result is the attribute information of the face picture to be retrieved.
Once the retrieval device has the attribute information of the face picture to be retrieved, it may use it to query the stored face pictures, collection times, and attribute information. A face picture whose attribute information matches that of the face picture to be retrieved is determined to be a first target face picture matching the face retrieval filter condition.
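The merge rule for the third and fourth attribute information can be sketched with plain dictionaries. The function name `merge_attributes` and the attribute keys are illustrative assumptions, not names from the application.

```python
def merge_attributes(third: dict, fourth: dict) -> dict:
    """Combine user-supplied (third) and extracted (fourth) attributes.

    Attributes present in only one source are kept as they are; for
    attributes present in both, the extracted (fourth) value is used.
    """
    merged = dict(third)    # start from the user-supplied attributes
    merged.update(fourth)   # extracted values are added and win on conflict
    return merged


third_attrs = {"gender": "male", "glasses": True}
fourth_attrs = {"glasses": False, "age_group": "adult"}
query_attrs = merge_attributes(third_attrs, fourth_attrs)
```

Here `glasses` appears in both inputs, so the value extracted from the picture is the one that ends up in the query attributes.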
The above retrieval of face picture information is only one specific example for the case in which the face picture information is stored in a face picture information table, and it does not limit the protection scope of the present application. In the embodiments of the present application, face picture information may also be retrieved in other ways.
For example, when the face picture, its collection time, and its attribute information are stored in the same data table of a database, the retrieval device may query the database directly with the attribute information of the face picture to be retrieved, find the entries whose attribute information matches, and obtain the face picture information from those entries.
Step S520: generate a video summary according to the first target face picture and the collection time of the first target face picture, and play back the video summary.
In the embodiment of the present application, after the retrieval device obtains the first target face picture matching the face retrieval filter condition and its collection time, it may generate a video summary of the face to be retrieved according to them and play back that video summary. The specific implementation of this step is similar to step S220 above, with the first target picture and its corresponding collection time replaced by the first target face picture and its collection time, and is not repeated here.
The following description again takes an IPC as the video source device and an NVR as the retrieval device. The NVR is equipped with a smart chip that provides intelligent analysis. In this example, the video summary generation scheme is implemented as follows.
1. Face picture information acquisition
When the IPC has a face picture capture function
The IPC captures face pictures and transmits each captured face picture and its collection time to the NVR. In this mode, while acquiring the real-time video stream, the IPC may capture face pictures and transmit each captured face picture and its collection time (that is, the capture time of the face picture) to the NVR. Note that in this mode the IPC also transmits the real-time video stream to the NVR, and the NVR saves the video recording according to a preset policy.
The NVR extracts feature values from the face picture, models the face picture according to those feature values, and extracts the attribute information of the face picture. In this mode, when the NVR receives a face picture transmitted by the IPC, it may analyze the picture with the smart chip. The smart chip may use an algorithm library to extract the feature values of the face in the picture, model the face picture according to the extracted feature values, and extract the attribute information of the face picture. The flow by which the NVR acquires the face picture information may be as shown in FIG. 6.
When the IPC does not have a picture capture function
The NVR performs target detection on the video recording or the real-time video stream to obtain face pictures and their collection times. In this mode, the NVR may use the smart chip to perform target detection on the video recording or real-time video stream and obtain each face picture and its collection time (that is, the time at which the face picture appears in the video data).
The NVR extracts feature values from the face picture, models the face picture according to those feature values, and extracts the attribute information of the face picture. In this mode, once the NVR has obtained a face picture, it may analyze the picture with the smart chip. The smart chip may use an algorithm library to extract the feature values of the face in the picture, model the face picture according to the extracted feature values, and extract the attribute information of the face picture. The flow by which the NVR acquires the face picture information may be as shown in FIG. 7.
The NVR stores the face picture, the collection time of the face picture, and the attribute information of the face picture. For the specific implementation, see the description below.
2. Face picture information storage
The NVR stores the face picture to obtain its storage location. For the face picture information, a database table FaceTable (face table) is established, whose main fields are the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture. Note that the NVR may also store the model data of the face picture; the specific implementation is not described here. After the NVR stores the face picture on the hard disk, it obtains the position offset and length of the face picture on the hard disk (that is, the storage location of the face picture).
The NVR records the storage location of the face picture, the collection time of the face picture, and the attribute information of the face picture in the FaceTable table.
3. Face retrieval
A face retrieval request carrying a face retrieval filter condition is received. The NVR may provide a face retrieval interface that includes an input area and/or options for face retrieval filter conditions. The user fills in and/or selects the filter conditions through this interface and submits the face retrieval request.
The NVR queries the FaceTable table according to the face retrieval filter condition for the storage location and collection time of each matching face picture. The NVR may compare the filter condition with the attribute information recorded in each FaceTable entry. For every entry whose recorded attribute information matches the filter condition, the recorded storage location and collection time are determined to be the storage location and the collection time of a matching face picture (that is, of a first target face picture).
The NVR obtains the first target face picture according to its storage location. The NVR may read the first target face picture from the hard disk according to its storage location (position offset + length), so that the NVR obtains both the first target face picture and its collection time.
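The FaceTable lookup can be illustrated with a small in-memory sketch. The application does not define what counts as a match, so the rule used here (every attribute named in the filter condition must equal the recorded value) is an assumption, as are the `FaceRow` type and the `search_faces` function.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FaceRow:
    offset: int            # position offset of the picture on disk
    length: int            # length of the picture in bytes
    captured_at: float     # collection time of the picture
    attributes: Dict[str, object]


def search_faces(
    face_table: List[FaceRow], filter_conditions: Dict[str, object]
) -> List[Tuple[int, int, float]]:
    """Return (offset, length, collection time) for every matching row.

    Assumed matching rule: each attribute in the filter must equal the
    attribute recorded for the picture.
    """
    return [
        (row.offset, row.length, row.captured_at)
        for row in face_table
        if all(row.attributes.get(k) == v for k, v in filter_conditions.items())
    ]


face_table = [
    FaceRow(0, 2048, 100.0, {"gender": "female", "glasses": True}),
    FaceRow(2048, 1024, 130.0, {"gender": "male", "glasses": True}),
    FaceRow(3072, 4096, 190.0, {"gender": "female", "glasses": False}),
]
hits = search_faces(face_table, {"gender": "female", "glasses": True})
```

Each hit carries the position offset and length needed to read the first target face picture from disk, together with its collection time.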
4. Video summary generation
For any first target face picture, the target video clip spanning from 5 seconds before the collection time of the first target face picture to 5 seconds after that collection time is obtained. The NVR may establish a table VideoTable1 (recording information table 1), whose main fields are the recording storage location (hard disk position offset + length) and the start and end times of the recording data. After a complete recording is stored on the hard disk, a new entry recording its storage location and its start and end times is inserted into the VideoTable1 table. For any first target face picture, the NVR takes the fifth second before the collection time of the picture as the start time of the target video clip and the fifth second after the collection time as the end time of the target video clip, and queries the VideoTable1 table with that start time and end time to obtain the target video clip.
A video summary is generated from the target video clips, then decoded and displayed.
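The clip window and the VideoTable1 lookup can be sketched as follows. The `Recording` type and both function names are hypothetical; the 5-second margins come from the text, while the overlap test used to select recordings is an assumption about how the table would be queried.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Recording:
    offset: int    # hard disk position offset of the recording
    length: int    # length of the recording in bytes
    start: float   # start time of the recording data (seconds)
    end: float     # end time of the recording data (seconds)


def clip_window(
    captured_at: float, before: float = 5.0, after: float = 5.0
) -> Tuple[float, float]:
    """Start and end time of the target clip around a collection time."""
    return captured_at - before, captured_at + after


def find_recordings(
    video_table: List[Recording], start: float, end: float
) -> List[Recording]:
    """Recordings whose time range overlaps the window [start, end]."""
    return [r for r in video_table if r.start <= end and r.end >= start]


video_table = [Recording(0, 100, 0.0, 60.0), Recording(100, 100, 60.0, 120.0)]
window = clip_window(58.0)                       # picture collected at t = 58 s
sources = find_recordings(video_table, *window)  # window spans two recordings
```

A window that crosses a recording boundary, as in this demo, returns more than one entry, so the target clip would have to be assembled from both recordings.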
In this example, the face pictures in the video data of the video source device, their collection times, and their attribute information are acquired and stored in advance. When a face retrieval request is received, the first target face picture matching the face retrieval filter condition carried in the request is determined, a video summary is generated according to the first target face picture and its collection time, and the video summary is played back. This avoids extracting matching face pictures from the video data on every face retrieval, improves the efficiency and accuracy of face retrieval, and preserves the continuity of face video tracking.
Refer to FIG. 8, which is a schematic flowchart of a video summary generation method provided by yet another embodiment of the present application. The method may be applied to a retrieval device. In this example, the method handles a vehicle search request. As shown in FIG. 8, the video summary generation method may include the following steps.
Step S800: acquire and store the vehicle pictures in the video data of the video source device, the collection time of each vehicle picture, and the attribute information of each vehicle picture.
In this example, to improve the efficiency of vehicle retrieval, the retrieval device may acquire and store the vehicle picture information in the video data of the video source device. The vehicle picture information may include, but is not limited to, the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture.
In the embodiment of the present application, the attribute information of a vehicle picture may include, but is not limited to, one or more of the following: the position of the vehicle in the picture, the position of the license plate in the picture, the license plate number, the license plate color, the country type, the body color, the vehicle brand, the vehicle type (such as a large passenger car, a truck, or a van), whether the driver is wearing a seat belt, and whether the driver is making a phone call.
The specific method for acquiring the vehicle picture, its collection time, and its attribute information from the video data of the video source device is similar to the method for acquiring target picture information described above, with the target picture information replaced by vehicle picture information, and is not repeated here.
In this example, after the retrieval device acquires the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture from the video data of the video source device, it may store them.
Storing the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture may include: storing the vehicle picture; and recording the storage location of the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture in a vehicle picture information table.
After the retrieval device acquires the vehicle picture, its collection time, and its attribute information, it may store the vehicle picture and record the storage location, the collection time, and the attribute information in the vehicle picture information table, whose format may be as shown in Table 2:
Table 2
Location information of vehicle picture | Collection time of vehicle picture | Attribute information of vehicle picture
Location information of vehicle picture 1 | Collection time of vehicle picture 1 | Attribute information of vehicle picture 1
Location information of vehicle picture 2 | Collection time of vehicle picture 2 | Attribute information of vehicle picture 2
... | ... | ...
The location information of a vehicle picture may be the position offset and length of the vehicle picture in a storage space (such as a hard disk).
The above manner of storing the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture is only one specific example and does not limit the protection scope of the present application. In the embodiments of the present application, this information may also be stored in other ways.
For example, in one case, the vehicle picture, its collection time, and its attribute information may be stored in the same database (that is, the vehicle picture is stored directly in the database in binary form). In this case they may be stored in the same data table, and the storage location of the vehicle picture does not need to be recorded separately.
In another case, the vehicle picture may still be stored first to obtain its storage location, but the storage location, the collection time, and the attribute information are then stored in a form other than a data table, such as a tree structure or a file. The specific implementation is not described here.
步骤S810、当接收到车辆检索请求时,根据车辆检索请求中携带的待检索车辆图片,确定与待检索车辆图片匹配的第一车辆图片。Step S810: When a vehicle retrieval request is received, a first vehicle picture that matches the vehicle picture to be retrieved is determined according to the vehicle picture to be retrieved carried in the vehicle retrieval request.
本例中,检索设备可以提供车辆检索功能,根据接收到的车辆检索请求中携带的待检索车辆图片,按照以图搜图的方式,检索匹配的车辆图片及车辆图片的采集时间。In this example, the retrieval device may provide a vehicle retrieval function, and according to the pictures of the vehicle to be retrieved carried in the received vehicle retrieval request, retrieve the matching vehicle pictures and the collection time of the vehicle pictures in the manner of map search.
例如,检索设备可以提供车辆检索请求界面,该车辆检索请求界面中可以包括待检索车辆图片输入或/和选择区域,由用户在该车辆检索请求界面中输入或/和选择待检索车辆图片,并提交车辆检索请求。For example, the retrieval device may provide a vehicle retrieval request interface, and the vehicle retrieval request interface may include an input or / and selection area of a picture of the vehicle to be retrieved, and a user enters or / and selects a picture of the vehicle to be retrieved in the vehicle retrieval request interface, and Submit a vehicle search request.
检索设备接收到车辆检索请求时,对待检索车辆图片进行建模,并提取待检索车辆图片的属性信息,进而,检索设备可以根据待检索车辆图片的属性信息查询所存储的车辆图片、车辆图片的采集时间以及车辆图片的属性信息,并将与待检索车辆图片的属性信息匹配的车辆图片的属性信息对应的车辆图片,确定为与该待检索车辆图片的属性信息匹配的车辆图片(本文中称为第一车辆图片)。When the retrieval device receives a vehicle retrieval request, it models the to-be-retrieved vehicle pictures and extracts attribute information of the to-be-retrieved vehicle pictures. Furthermore, the retrieval device can query the stored vehicle pictures and vehicle pictures based on the attribute information of the to-be-retrieved vehicle pictures. Collect the time and the attribute information of the vehicle picture, and determine the vehicle picture corresponding to the attribute information of the vehicle picture that matches the attribute information of the vehicle picture to be retrieved as the vehicle picture that matches the attribute information of the vehicle picture to be retrieved (referred to herein as For the first vehicle picture).
For example, assuming that the retrieval device stores the vehicle picture, its collection time, and its attribute information in the form of a vehicle picture information table (see the description of step S800), the retrieval device may query the attribute information recorded in the vehicle picture information table according to the attribute information of the to-be-retrieved vehicle picture, obtain the vehicle picture information entry whose attribute information matches, and read from that entry the storage location of the vehicle picture (that is, the storage location of the first vehicle picture).
The retrieval device may then obtain the first vehicle picture from the specified storage space according to the storage location of the first vehicle picture.
The above retrieval of vehicle picture information is only one specific example for the case where the vehicle picture information is stored as a vehicle picture information table, and does not limit the protection scope of the present application. That is, in the embodiments of the present application, vehicle picture information may also be retrieved in other ways.
For example, when the vehicle picture, its collection time, and its attribute information are stored in the same data table of a database, the retrieval device may directly query the database, according to the attribute information of the to-be-retrieved vehicle picture, for the entry containing the matching attribute information, and obtain the vehicle picture information from the queried entry.
Step S820: Generate a video summary according to the first vehicle picture and the collection time of the first vehicle picture, and play back the video summary.
In the embodiments of the present application, after the retrieval device obtains the first vehicle picture that matches the to-be-retrieved vehicle picture and the collection time of the first vehicle picture, it may generate a video summary of the to-be-retrieved vehicle according to the first vehicle picture and its collection time, and play back that video summary. The specific implementation of this step is similar to step S220 above, except that the first target picture and its collection time are replaced by the first vehicle picture and its collection time; details are not repeated here.
A detailed description follows, again taking as an example the case where the video source device is an IPC and the retrieval device is an NVR. The NVR carries a smart chip with an intelligent analysis function. In this example, the video summary generation scheme is implemented as follows.
1. Vehicle picture information acquisition
When the IPC has a vehicle picture capture function
The IPC captures vehicle pictures and transmits the captured vehicle pictures and their collection times to the NVR. In this mode, while acquiring the real-time video stream, the IPC can also capture vehicle pictures and transmit each captured picture, together with the time at which it was captured (that is, the collection time of the vehicle picture), to the NVR. Note that in this mode the IPC also transmits the real-time video stream to the NVR, and the NVR saves video recordings according to a preset policy.
The NVR extracts feature values from the vehicle picture, models the vehicle picture according to those feature values, and extracts the attribute information of the vehicle picture. In this mode, when the NVR receives a vehicle picture transmitted by the IPC, it can analyze the picture with the smart chip. The smart chip can use an algorithm library to extract the vehicle-related feature values from the picture, model the vehicle picture according to the extracted feature values, and extract its attribute information. A schematic flowchart of the NVR obtaining vehicle picture information may be as shown in FIG. 9.
When the IPC does not have a picture capture function
The NVR performs target detection on the video recording or the real-time video stream to obtain vehicle pictures and their collection times. In this mode, the NVR can use the smart chip to perform target detection on the video recording or the real-time video stream, obtaining the vehicle pictures in it and the collection time of each vehicle picture (that is, the time at which the vehicle picture appears in the video data).
The NVR extracts feature values from the vehicle picture, models the vehicle picture according to those feature values, and extracts the attribute information of the vehicle picture. In this mode, after the NVR obtains a vehicle picture by target detection, it can analyze the picture with the smart chip. The smart chip can use an algorithm library to extract the vehicle-related feature values from the picture, model the vehicle picture according to the extracted feature values, and extract its attribute information. A schematic flowchart of the NVR obtaining vehicle picture information may be as shown in FIG. 10.
The NVR stores the vehicle picture, its collection time, and its attribute information; see the description below for the specific implementation.
2. Vehicle picture information storage
The NVR stores the vehicle picture to obtain its storage location. For vehicle picture information, a database table VehicleTable (vehicle table) is created. The main fields of VehicleTable are: the storage location of the vehicle picture, the collection time of the vehicle picture, and the attribute information of the vehicle picture. Note that the NVR may also store the model data of the vehicle picture; the specific implementation is not described here. After the NVR stores a vehicle picture on the hard disk, it obtains the position offset and length of the picture on the hard disk (that is, the storage location of the vehicle picture).
The NVR records the storage location, collection time, and attribute information of the vehicle picture in VehicleTable.
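A minimal sketch of the VehicleTable bookkeeping described in this section follows. The table name VehicleTable and its three groups of fields come from the text; the use of SQLite, the column names, the attribute set, and the helper record_vehicle_picture are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE VehicleTable (
        id           INTEGER PRIMARY KEY,
        disk_offset  INTEGER NOT NULL,  -- position offset of the picture on the hard disk
        length       INTEGER NOT NULL,  -- length of the picture in bytes
        collected_at INTEGER NOT NULL,  -- collection time, Unix seconds (assumption)
        plate TEXT, color TEXT, model TEXT, brand TEXT  -- attribute information
    )
    """
)


def record_vehicle_picture(disk_offset, length, collected_at, attrs):
    """Add one VehicleTable entry for a picture that is already on disk."""
    cur = conn.execute(
        "INSERT INTO VehicleTable "
        "(disk_offset, length, collected_at, plate, color, model, brand) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (disk_offset, length, collected_at,
         attrs.get("plate"), attrs.get("color"),
         attrs.get("model"), attrs.get("brand")),
    )
    conn.commit()
    return cur.lastrowid


entry_id = record_vehicle_picture(
    4096, 20480, 1_700_000_000, {"plate": "A12345", "color": "white"}
)
stored = conn.execute(
    "SELECT disk_offset, length, collected_at, plate, brand "
    "FROM VehicleTable WHERE id = ?",
    (entry_id,),
).fetchone()
```

Attributes that the analysis did not produce (here, brand) are simply left NULL in the entry.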
3. Vehicle retrieval
Receive a vehicle retrieval request carrying the to-be-retrieved vehicle picture. The NVR may provide a vehicle retrieval interface that includes an area for inputting and/or selecting the to-be-retrieved vehicle picture. The user can input and/or select the to-be-retrieved vehicle picture through this interface and submit a vehicle retrieval request.
The NVR models the to-be-retrieved vehicle picture and extracts its attribute information.
The NVR queries VehicleTable, according to the attribute information of the to-be-retrieved vehicle picture, for the storage location and collection time of each matching vehicle picture. Specifically, the NVR may compare the attribute information of the to-be-retrieved vehicle picture with the attribute information recorded in VehicleTable; for each VehicleTable entry whose recorded attribute information matches, the storage location and collection time recorded in that entry are taken as the storage location of a matching vehicle picture (that is, the storage location of the first vehicle picture) and its collection time (that is, the collection time of the first vehicle picture).
The NVR obtains the first vehicle picture according to its storage location. The NVR can read the first vehicle picture from the hard disk according to its storage location (position offset + length), and thus obtains both the first vehicle picture and its collection time.
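Reading a picture back from its storage location (position offset + length) can be sketched as follows. The sketch assumes the picture store is an ordinary file addressed by byte offset; the function name read_picture and the demo file are hypothetical, and a real NVR may address the disk differently.

```python
import os
import tempfile


def read_picture(storage_path, offset, length):
    """Read `length` bytes starting at byte `offset` of the storage file."""
    with open(storage_path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    if len(data) != length:
        raise IOError("stored picture is shorter than the recorded length")
    return data


# Demo: a tiny stand-in for the picture store, with one picture at offset 4.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"AAAA" + b"PICTURE-1" + b"BBBB")
    demo_path = tmp.name

first_vehicle_picture = read_picture(demo_path, offset=4, length=9)
os.remove(demo_path)
```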
4. Video summary generation
For each first vehicle picture, obtain the target video clip running from 5 seconds before the collection time of the first vehicle picture to 5 seconds after that collection time. The NVR may create a table VideoTable2 (video information table 2). The main fields of VideoTable2 are: the recording storage location (hard disk position offset + length) and the start and end times of the recording data. Each time a completed recording is stored on the hard disk, a new record (that is, a new entry) is inserted into VideoTable2 to record the storage location and the start and end times of that recording. For each first vehicle picture, the NVR takes the fifth second before the collection time of the first vehicle picture as the start time of the target video clip and the fifth second after the collection time as its end time, and queries VideoTable2 with the start time and end time of the target video clip to obtain the target video clip.
Generate a video summary from the target video clips, then decode and display it.
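The clip lookup in this section can be sketched as below. The table name VideoTable2, its fields, and the 5-second margins come from the text; SQLite, the column names, integer-second timestamps, and the overlap query in the hypothetical helper target_clip are assumptions. The query returns every stored recording that overlaps the clip window, since a window may straddle two recordings.

```python
import sqlite3

PRE_SECONDS = 5   # seconds kept before the collection time
POST_SECONDS = 5  # seconds kept after the collection time

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE VideoTable2 (
        id          INTEGER PRIMARY KEY,
        disk_offset INTEGER NOT NULL,  -- recording storage location: offset
        length      INTEGER NOT NULL,  -- recording storage location: length
        start_time  INTEGER NOT NULL,  -- start of the recording (seconds)
        end_time    INTEGER NOT NULL   -- end of the recording (seconds)
    )
    """
)
# Two completed recordings, stored back to back.
conn.executemany(
    "INSERT INTO VideoTable2 (disk_offset, length, start_time, end_time) "
    "VALUES (?, ?, ?, ?)",
    [(0, 1000, 100, 160), (1000, 1000, 160, 220)],
)


def target_clip(collected_at):
    """Return the clip window and the recordings that overlap it."""
    start = collected_at - PRE_SECONDS
    end = collected_at + POST_SECONDS
    rows = conn.execute(
        "SELECT disk_offset, length, start_time, end_time FROM VideoTable2 "
        "WHERE start_time < ? AND end_time > ? ORDER BY start_time",
        (end, start),
    ).fetchall()
    return start, end, rows


start, end, recordings = target_clip(158)
```

For a collection time of 158 the window is 153 to 163, which spans both stored recordings; the caller would then cut the window out of the returned recordings.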
In this example, the vehicle pictures in the video data of the video source device, their collection times, and their attribute information are obtained and stored in advance. When a vehicle retrieval request is received, the first vehicle picture that matches the to-be-retrieved vehicle picture carried in the request is determined, a video summary is generated according to the first vehicle picture and its collection time, and the video summary is played back. This avoids extracting matching vehicle pictures from the video data for every retrieval, improves the efficiency and accuracy of vehicle retrieval, and ensures the continuity of vehicle video tracking.
Please refer to FIG. 11, which is a schematic flowchart of a video summary generation method provided by yet another embodiment of the present application. The video summary generation method may be applied to a retrieval device. In this example, the video summary generation method handles a vehicle search request. As shown in FIG. 11, the video summary generation method may include the following steps.
Step S1100: Obtain and store the vehicle pictures in the video data of the video source device, the collection times of the vehicle pictures, and the attribute information of the vehicle pictures.
For the specific implementation of this step, refer to step S800; details are not repeated here.
Step S1110: When a vehicle retrieval request is received, determine, according to the vehicle retrieval filter condition carried in the vehicle retrieval request, a second vehicle picture that matches the vehicle retrieval filter condition.
In this example, the retrieval device may provide a vehicle retrieval function and retrieve the matching vehicle pictures and their collection times according to the vehicle retrieval filter condition carried in the received vehicle retrieval request.
For example, the retrieval device may provide a vehicle retrieval request interface that includes an input area for the vehicle retrieval filter condition and/or vehicle retrieval filter options. The user inputs and/or selects the vehicle retrieval filter condition in this interface and submits the vehicle retrieval request.
In one example, the vehicle retrieval filter condition is attribute information of the to-be-retrieved vehicle picture (which may be referred to herein as the third attribute information of the to-be-retrieved vehicle picture). It may include, but is not limited to, one or more of the license plate number, body color, vehicle model, and vehicle brand of the vehicle to be retrieved.
In another example, the vehicle retrieval filter condition may include both the to-be-retrieved vehicle picture and the third attribute information of the to-be-retrieved vehicle picture.
When the retrieval device receives the vehicle retrieval request, it may obtain the vehicle retrieval filter condition carried in the request, query the stored vehicle pictures, collection times, and attribute information according to that condition, and determine each vehicle picture whose attribute information matches the vehicle retrieval filter condition as a matching vehicle picture (referred to herein as a second vehicle picture).
For example, assuming that the retrieval device stores the vehicle picture, its collection time, and its attribute information in the form of a vehicle picture information table, the retrieval device may query the attribute information recorded in the vehicle picture information table according to the vehicle retrieval filter condition, obtain the vehicle picture information entry that matches the condition, and read from that entry the storage location of the vehicle picture (that is, the storage location of the second vehicle picture) and the collection time of the second vehicle picture.
The retrieval device may then obtain the second vehicle picture from the specified storage space according to the storage location of the second vehicle picture.
In one embodiment of the present application, when the vehicle retrieval filter condition includes the to-be-retrieved vehicle picture and the third attribute information of the to-be-retrieved vehicle picture, the above comparison of the vehicle retrieval filter condition with the attribute information recorded in the vehicle picture information table may include: modeling the to-be-retrieved vehicle picture and extracting fourth attribute information of the to-be-retrieved vehicle picture; determining the attribute information of the to-be-retrieved vehicle picture according to the third attribute information and the fourth attribute information; and comparing the attribute information of the to-be-retrieved vehicle picture with the attribute information recorded in the vehicle picture information table.
In this embodiment, when the vehicle retrieval filter condition includes the to-be-retrieved vehicle picture and its third attribute information, the retrieval device may model the to-be-retrieved vehicle picture and extract its attribute information (referred to herein as the fourth attribute information of the to-be-retrieved vehicle picture).
After obtaining the fourth attribute information, the retrieval device may determine the attribute information of the to-be-retrieved vehicle picture according to the third attribute information and the fourth attribute information.
For example, the retrieval device may compare the third attribute information with the fourth attribute information. An attribute that is present in the third attribute information but absent from the fourth, or absent from the third but present in the fourth, is added to the attribute information of the to-be-retrieved vehicle picture. For an attribute that is present in both, the value from the fourth attribute information is added. The result is the attribute information of the to-be-retrieved vehicle picture.
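The merge rule just described can be sketched as a small function. The rule itself (keep attributes found in only one source; on conflict prefer the fourth attribute information) is taken from the text; representing attribute information as a dictionary, treating None as "absent", and the name merge_attributes are assumptions for illustration.

```python
def merge_attributes(third, fourth):
    """Combine user-supplied (third) and picture-derived (fourth) attributes.

    Attributes present in only one source are kept as they are; when both
    sources carry the same attribute, the fourth (picture-derived) value wins.
    A value of None is treated as "attribute not present" (an assumption).
    """
    merged = {k: v for k, v in third.items() if v is not None}
    merged.update({k: v for k, v in fourth.items() if v is not None})
    return merged


third_attrs = {"plate": "A12345", "color": "white"}    # entered by the user
fourth_attrs = {"color": "silver", "brand": "BrandX"}  # extracted from the picture
query_attrs = merge_attributes(third_attrs, fourth_attrs)
```

Here the plate comes from the user input, the brand from the picture analysis, and the conflicting color is resolved in favor of the picture-derived value.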
Once the retrieval device has the attribute information of the to-be-retrieved vehicle picture, it may query the stored vehicle pictures, collection times, and attribute information according to it, and determine each vehicle picture whose attribute information matches as a second vehicle picture that matches the vehicle retrieval filter condition.
It should be recognized that the above retrieval of vehicle picture information is only one specific example for the case where the vehicle picture information is stored as a vehicle picture information table, and does not limit the protection scope of the present application. That is, in the embodiments of the present application, vehicle picture information may also be retrieved in other ways.
For example, when the vehicle picture, its collection time, and its attribute information are stored in the same data table of a database, the retrieval device may directly query the database, according to the attribute information of the to-be-retrieved vehicle picture, for the entry containing the matching attribute information, and obtain the vehicle picture information from the queried entry.
Step S1120: Generate a video summary according to the second vehicle picture and the collection time of the second vehicle picture, and play back the video summary.
For the specific implementation of this step, refer to step S820; details are not repeated here.
In the embodiments of the present application, the vehicle pictures in the video data of the video source device, their collection times, and their attribute information are obtained and stored in advance. When a vehicle retrieval request is received, the second vehicle picture that matches the vehicle retrieval filter condition carried in the request is determined, a video summary is generated according to the second vehicle picture and its collection time, and the video summary is played back. This avoids extracting matching vehicle pictures from the video data for every retrieval, improves the efficiency and accuracy of vehicle retrieval, and ensures the continuity of vehicle video tracking.
The method provided by the present application has been described above. The apparatus provided by the present application is described below.
Please refer to FIG. 12, which is a schematic structural diagram of a video summary generation apparatus provided by an embodiment of the present application. The video summary generation apparatus may be applied to the search device in the foregoing embodiments. As shown in FIG. 12, the video summary generation apparatus may include the following units.
A receiving unit 1210, configured to receive a target search request, where the target search request carries feature information of a target to be searched.
A search unit 1220, configured to search for a first target picture that matches the feature information of the target to be searched.
A processing unit 1230, configured to generate a video summary according to the first target picture and the collection time corresponding to the first target picture.
In an optional implementation, as shown in FIG. 13, the apparatus further includes the following units.
An obtaining unit 1240, configured to obtain target picture information from the video data of the video source device, where the target picture information includes a target picture, the collection time of the target picture, and the attribute information of the target picture.
A saving unit 1250, configured to save the target picture information to a picture information library.
In an optional implementation, the obtaining unit 1240 is specifically configured to receive the target picture information sent by the video source device.
In an optional implementation, the obtaining unit 1240 is specifically configured to: receive the target picture and the collection time of the target picture sent by the video source device; and model the target picture and extract the attribute information of the target picture.
In an optional implementation, the obtaining unit 1240 is specifically configured to: receive the target picture, the collection time of the target picture, and first attribute information of the target picture sent by the video source device; model the target picture and extract second attribute information of the target picture; and determine the attribute information of the target picture according to the first attribute information and the second attribute information.
In an optional implementation, the obtaining unit 1240 is specifically configured to: perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the collection time of the target picture; and model the target picture and extract the attribute information of the target picture.
在一种可选的实施方式中,所述待搜索目标的特征信息包括待搜索目标的属性信息;所述搜索单元1220,具体用于根据所述待搜索目标的属性信息在所述图片信息库 中搜索匹配的所述第一目标图片。In an optional implementation manner, the feature information of the target to be searched includes attribute information of the target to be searched; and the searching unit 1220 is specifically configured to be stored in the picture information database according to the attribute information of the target to be searched Searching for a matching first target picture.
在一种可选的实施方式中,所述待搜索目标的特征信息包括待搜索目标图片;所述搜索单元1220,具体用于对所述待搜索目标图片进行建模,并提取所述待搜索目标图片的属性信息;根据所述待搜索目标图片的属性信息在所述图片信息库中搜索匹配的所述第一目标图片。In an optional implementation manner, the feature information of the target to be searched includes a target picture to be searched; the search unit 1220 is specifically configured to model the target picture to be searched and extract the target to be searched Attribute information of the target picture; and searching for the matching first target picture in the picture information database according to the attribute information of the target picture to be searched.
在一种可选的实施方式中,所述待搜索目标的特征信息包括待搜索目标图片和待搜索目标图片的第三属性信息;所述搜索单元1220,具体用于对所述待搜索目标图片进行建模,并提取所述待搜索目标图片的第四属性信息;根据所述第三属性信息和所述第四属性信息,确定所述待搜索目标图片的属性信息;根据所述待搜索目标图片的属性信息在所述图片信息库中搜索匹配的第一目标图片。In an optional implementation manner, the feature information of the target to be searched includes target picture to be searched and third attribute information of the target picture to be searched; and the searching unit 1220 is specifically configured to perform a search on the target picture to be searched Performing modeling, and extracting fourth attribute information of the target picture to be searched; determining attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and according to the target to be searched The attribute information of the picture searches for a matching first target picture in the picture information database.
In an optional implementation, the target search request further carries a search time period range. The search unit 1220 is specifically configured to: filter the target pictures in the picture information database according to the search time period range to obtain second target pictures whose acquisition time falls within the search time period range; and search the second target pictures for the matching first target picture according to the feature information of the target to be searched.
In an optional implementation, the target search request further carries a search channel number, and the target picture information further includes the channel number of the target picture. The search unit 1220 is specifically configured to: filter the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is consistent with the search channel number; and search the third target pictures for the matching first target picture according to the feature information of the target to be searched.
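The two pre-filtering variants above (by search time period range and by search channel number) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the `TargetPicture` record, the function `search_first_target_pictures`, and the `matches` predicate standing in for feature matching are hypothetical names introduced here, and acquisition times are assumed to be plain numbers in seconds.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class TargetPicture:
    """Hypothetical record for one entry of the picture information database."""
    picture_id: str
    capture_time: float  # acquisition time, in seconds (assumed representation)
    channel: int         # channel number of the target picture
    attributes: dict     # attribute information obtained by modeling


def search_first_target_pictures(
    library: Iterable[TargetPicture],
    matches: Callable[[TargetPicture], bool],
    time_range: Optional[Tuple[float, float]] = None,
    channel: Optional[int] = None,
) -> List[TargetPicture]:
    """Filter by time range and/or channel first, then apply feature matching."""
    candidates = list(library)
    if time_range is not None:
        start, end = time_range
        # "Second target pictures": acquisition time within the search range.
        candidates = [p for p in candidates if start <= p.capture_time <= end]
    if channel is not None:
        # "Third target pictures": channel number equal to the search channel.
        candidates = [p for p in candidates if p.channel == channel]
    # "First target pictures": candidates that match the searched features.
    return [p for p in candidates if matches(p)]
```

Filtering before matching keeps the comparatively expensive feature comparison off pictures that the request has already excluded.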
In an optional implementation, the target picture is a face picture, and the target search request is a face search request.
In an optional implementation, the target picture is a vehicle picture, and the target search request is a vehicle search request.
In an optional implementation, the processing unit 1230 is specifically configured to: sort the first target pictures by acquisition time from earliest to latest; and generate the video summary according to the sorted first target pictures.
In an optional implementation, the processing unit 1230 is specifically configured to: for each first target picture, determine the target video clip corresponding to that first target picture, where the target video clip is the recording data between the n-th second before the acquisition time corresponding to that first target picture and the m-th second after that acquisition time; and generate the video summary according to the target video clips.
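The clip window described above can be illustrated with a short Python sketch. The function names `clip_window` and `clips_in_time_order` are hypothetical, and acquisition times are assumed to be numbers in seconds; the sketch only computes the start and end time points and does not touch any recording data.

```python
from typing import Iterable, List, Tuple


def clip_window(capture_time: float, n: float, m: float) -> Tuple[float, float]:
    """Return (start, end): n seconds before and m seconds after the acquisition time."""
    return (capture_time - n, capture_time + m)


def clips_in_time_order(capture_times: Iterable[float], n: float, m: float) -> List[Tuple[float, float]]:
    """Sort acquisition times from earliest to latest and derive one window per picture."""
    return [clip_window(t, n, m) for t in sorted(capture_times)]
```

For example, with n = 5 and m = 10, a picture acquired at second 30 corresponds to the window from second 25 to second 40.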
In an optional implementation, the processing unit 1230 is specifically configured to: when there are multiple first target pictures with the same acquisition time, for any one of these first target pictures, determine the start time point and the end time point of the video clip corresponding to that first target picture, where the start time point is the n-th second before the acquisition time corresponding to that first target picture and the end time point is the m-th second after that acquisition time; search the recording data of the video data channel to which that first target picture belongs for an I-frame at the start time point and for an I-frame at the end time point; and, if both the I-frame at the start time point and the I-frame at the end time point exist, discard the remaining first target pictures among the multiple first target pictures and determine the video clip corresponding to that first target picture as the target video clip.
In an optional implementation, the processing unit 1230 is further configured to: if there is no I-frame at the start time point, increase the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeat the above search step until an I-frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip is the same as the acquisition time; if there is no I-frame at the end time point, decrease the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeat the above search step until an I-frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip is the same as the acquisition time; among the video clips respectively corresponding to the multiple first target pictures, select the longest video clip as the target video clip; and discard the first target pictures corresponding to the remaining video clips.
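The boundary adjustment for pictures sharing one acquisition time can be sketched in Python as below. Several points are assumptions made for illustration only: times are integer seconds, the I-frames of a channel are represented as a set of the seconds at which an I-frame exists, a step that would overshoot the acquisition time is clamped to it, and the names `snap_start`, `snap_end`, and `choose_clip` are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple


def snap_start(start: int, capture_time: int, iframe_times: Set[int], x: int) -> int:
    """Move the start later in steps of x until an I-frame exists there
    or the start reaches the acquisition time."""
    if x <= 0:
        raise ValueError("x must be positive")
    while start not in iframe_times and start < capture_time:
        start = min(start + x, capture_time)  # clamping is an assumption
    return start


def snap_end(end: int, capture_time: int, iframe_times: Set[int], x: int) -> int:
    """Move the end earlier in steps of x until an I-frame exists there
    or the end reaches the acquisition time."""
    if x <= 0:
        raise ValueError("x must be positive")
    while end not in iframe_times and end > capture_time:
        end = max(end - x, capture_time)  # clamping is an assumption
    return end


def choose_clip(
    pictures: Iterable[Tuple[str, int, Set[int]]], n: int, m: int, x: int
) -> Optional[Tuple[str, int, int]]:
    """pictures: (picture_id, acquisition_time, iframe_times_of_its_channel).
    Returns (picture_id, start, end) of the clip that is kept."""
    best = None
    for pic_id, t, iframes in pictures:
        start, end = t - n, t + m
        if start in iframes and end in iframes:
            # Full window is usable: keep this clip, discard the other pictures.
            return (pic_id, start, end)
        s = snap_start(start, t, iframes, x)
        e = snap_end(end, t, iframes, x)
        # Otherwise remember the longest adjusted clip seen so far.
        if best is None or (e - s) > (best[2] - best[1]):
            best = (pic_id, s, e)
    return best
```

Starting and ending a clip on an I-frame means the clip can be decoded without depending on frames outside it, which is presumably why the boundaries are moved rather than cut at arbitrary frames.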
In an optional implementation, the processing unit 1230 is specifically configured to: filter the target video clips according to the start time point and the end time point of each target video clip to remove recording data that overlaps in time; and generate the video summary according to the filtered target video clips.
In an optional implementation, the processing unit 1230 is specifically configured to: sort the target video clips according to the start time point of each target video clip; and, for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip: if the first target video clip and the second target video clip belong to the same video data channel, merge the first target video clip and the second target video clip, where the start time point of the merged video clip is the start time point of the first target video clip and its end time point is the end time point of the second target video clip; if the first target video clip and the second target video clip belong to different video data channels, use the end time point of the first target video clip as the start time point of the second target video clip, or use the start time point of the second target video clip as the end time point of the first target video clip. Here, the start time point of the first target video clip is smaller than the start time point of the second target video clip.
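The overlap filtering above can be sketched in Python as follows. The `Clip` record and the function `remove_overlap` are hypothetical names. Two choices in the sketch are assumptions rather than part of the description: the merged end time point is taken as the later of the two end time points (which equals the second clip's end time point whenever the second clip ends later), and a clip on a different channel that becomes empty after trimming is dropped. For the different-channel case the sketch uses the first of the two described alternatives, moving the second clip's start to the first clip's end.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Clip:
    """Hypothetical record for one target video clip."""
    channel: int
    start: float
    end: float


def remove_overlap(clips: Iterable[Clip]) -> List[Clip]:
    """Sort clips by start time and remove recording data that overlaps in time."""
    result: List[Clip] = []
    for original in sorted(clips, key=lambda c: c.start):
        clip = Clip(original.channel, original.start, original.end)  # work on a copy
        if result and result[-1].end >= clip.start:
            prev = result[-1]
            if prev.channel == clip.channel:
                # Same channel: merge into one clip from prev.start to the later end.
                prev.end = max(prev.end, clip.end)
                continue
            # Different channels: the second clip starts where the first one ends.
            clip.start = prev.end
            if clip.start >= clip.end:
                continue  # nothing left after trimming (assumption)
        result.append(clip)
    return result
```

With clips (channel 1, 10 to 20), (channel 1, 15 to 30), and (channel 2, 25 to 40), the first two merge into (channel 1, 10 to 30) and the third is trimmed to (channel 2, 30 to 40).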
Referring to FIG. 14, a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application is shown. The electronic device may include a processor 1401, a communication interface 1402, a memory 1403, and a communication bus 1404. The processor 1401, the communication interface 1402, and the memory 1403 communicate with one another through the communication bus 1404. A computer program is stored in the memory 1403, and the processor 1401 can perform the video summary generation method described above by executing the program stored in the memory 1403.
The memory 1403 mentioned herein may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, the memory 1403 may be a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (such as a hard disk drive), a solid-state drive, any type of storage disc (such as an optical disc or a DVD), a similar storage medium, or a combination thereof.
An embodiment of the present application further provides a machine-readable storage medium storing a computer program, such as the memory 1403 in FIG. 14. The computer program can be executed by the processor 1401 of the electronic device shown in FIG. 14 to implement the video summary generation method described above.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above are only preferred embodiments of the present application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims (40)

  1. A video summary generation method, comprising:
    receiving a target search request, wherein the target search request carries feature information of a target to be searched;
    searching for a first target picture that matches the feature information of the target to be searched; and
    generating a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
  2. The method according to claim 1, further comprising, before searching for the first target picture that matches the feature information of the target to be searched:
    acquiring target picture information from video data of a video source device, wherein the target picture information includes a target picture, the acquisition time of the target picture, and attribute information of the target picture; and
    saving the target picture information to a picture information database.
  3. The method according to claim 2, wherein acquiring the target picture information from the video data of the video source device comprises:
    receiving the target picture information sent by the video source device.
  4. The method according to claim 2, wherein acquiring the target picture information from the video data of the video source device comprises:
    receiving the target picture and the acquisition time of the target picture sent by the video source device; and
    modeling the target picture and extracting the attribute information of the target picture.
  5. The method according to claim 2, wherein acquiring the target picture information from the video data of the video source device comprises:
    receiving the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device;
    modeling the target picture and extracting second attribute information of the target picture; and
    determining the attribute information of the target picture according to the first attribute information of the target picture and the second attribute information of the target picture.
  6. The method according to claim 2, wherein acquiring the target picture information from the video data of the video source device comprises:
    performing target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and
    modeling the target picture and extracting the attribute information of the target picture.
  7. The method according to claim 2, wherein the feature information of the target to be searched includes attribute information of the target to be searched; and
    searching for the first target picture that matches the feature information of the target to be searched comprises:
    searching the picture information database for the matching first target picture according to the attribute information of the target to be searched.
  8. The method according to claim 2, wherein the feature information of the target to be searched includes a target picture to be searched; and
    searching for the first target picture that matches the feature information of the target to be searched comprises:
    modeling the target picture to be searched and extracting attribute information of the target picture to be searched; and
    searching the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
  9. The method according to claim 2, wherein the feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; and
    searching for the first target picture that matches the feature information of the target to be searched comprises:
    modeling the target picture to be searched and extracting fourth attribute information of the target picture to be searched;
    determining attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and
    searching the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
  10. The method according to claim 2, wherein the target search request further carries a search time period range; and
    searching for the first target picture that matches the feature information of the target to be searched comprises:
    filtering the target pictures in the picture information database according to the search time period range to obtain second target pictures whose acquisition time falls within the search time period range; and
    searching the second target pictures for the matching first target picture according to the feature information of the target to be searched.
  11. The method according to claim 2, wherein the target search request further carries a search channel number, and the target picture information further includes the channel number of the target picture; and
    searching for the first target picture that matches the feature information of the target to be searched comprises:
    filtering the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is consistent with the search channel number; and
    searching the third target pictures for the matching first target picture according to the feature information of the target to be searched.
  12. The method according to claim 2, wherein the target picture is a face picture, and the target search request is a face search request.
  13. The method according to claim 2, wherein the target picture is a vehicle picture, and the target search request is a vehicle search request.
  14. The method according to claim 1, wherein generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture comprises:
    sorting the first target pictures by acquisition time from earliest to latest; and
    generating the video summary according to the sorted first target pictures.
  15. The method according to claim 1 or 14, wherein generating the video summary according to the first target picture and the acquisition time corresponding to the first target picture comprises:
    for each first target picture, determining the target video clip corresponding to that first target picture, wherein the target video clip is the recording data between the n-th second before the acquisition time corresponding to that first target picture and the m-th second after that acquisition time; and
    generating the video summary according to the target video clips.
  16. The method according to claim 15, wherein, for each first target picture, determining the target video clip corresponding to that first target picture comprises:
    when there are multiple first target pictures with the same acquisition time, for any one of these first target pictures, determining the start time point and the end time point of the video clip corresponding to that first target picture, wherein the start time point is the n-th second before the acquisition time corresponding to that first target picture, and the end time point is the m-th second after that acquisition time;
    searching the recording data of the video data channel to which that first target picture belongs for an I-frame at the start time point and for an I-frame at the end time point; and
    if both the I-frame at the start time point and the I-frame at the end time point exist, discarding the remaining first target pictures among the multiple first target pictures, and determining the video clip corresponding to that first target picture as the target video clip.
  17. The method according to claim 16, further comprising:
    if there is no I-frame at the start time point, increasing the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeating the above search step until an I-frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point of the video clip corresponding to the first target picture is the same as the acquisition time;
    if there is no I-frame at the end time point, decreasing the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeating the above search step until an I-frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point of the video clip corresponding to the first target picture is the same as the acquisition time;
    among the video clips respectively corresponding to the multiple first target pictures, selecting the longest video clip as the target video clip; and
    discarding the first target pictures corresponding to the remaining video clips.
  18. The method according to claim 15, wherein generating the video summary according to the target video clips comprises:
    filtering the target video clips according to the start time point and the end time point of each target video clip to remove recording data that overlaps in time; and
    generating the video summary according to the filtered target video clips.
  19. The method according to claim 18, wherein filtering the target video clips according to the start time point and the end time point of each target video clip comprises:
    sorting the target video clips according to the start time point of each target video clip;
    for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is greater than or equal to the start time point of the second target video clip,
    if the first target video clip and the second target video clip belong to the same video data channel, merging the first target video clip and the second target video clip, wherein the start time point of the merged video clip is the start time point of the first target video clip, and the end time point of the merged video clip is the end time point of the second target video clip;
    if the first target video clip and the second target video clip belong to different video data channels, using the end time point of the first target video clip as the start time point of the second target video clip, or using the start time point of the second target video clip as the end time point of the first target video clip;
    wherein the start time point of the first target video clip is smaller than the start time point of the second target video clip.
  20. A video summary generation apparatus, comprising:
    a receiving unit, configured to receive a target search request, wherein the target search request carries feature information of a target to be searched;
    a search unit, configured to search for a first target picture that matches the feature information of the target to be searched; and
    a processing unit, configured to generate a video summary according to the first target picture and the acquisition time corresponding to the first target picture.
  21. The apparatus according to claim 20, further comprising:
    an acquiring unit, configured to acquire target picture information from video data of a video source device, wherein the target picture information includes a target picture, the acquisition time of the target picture, and attribute information of the target picture; and
    a saving unit, configured to save the target picture information to a picture information database.
  22. The apparatus according to claim 21, wherein
    the acquiring unit is specifically configured to receive the target picture information sent by the video source device.
  23. The apparatus according to claim 21, wherein the acquiring unit is specifically configured to:
    receive the target picture and the acquisition time of the target picture sent by the video source device; and
    model the target picture and extract the attribute information of the target picture.
  24. The apparatus according to claim 21, wherein the acquiring unit is specifically configured to:
    receive the target picture, the acquisition time of the target picture, and first attribute information of the target picture sent by the video source device;
    model the target picture and extract second attribute information of the target picture; and
    determine the attribute information of the target picture according to the first attribute information of the target picture and the second attribute information of the target picture.
  25. The apparatus according to claim 21, wherein the acquiring unit is specifically configured to:
    perform target detection on the video data provided by the video source device to obtain the target picture in the video data and the acquisition time of the target picture; and
    model the target picture and extract the attribute information of the target picture.
  26. The apparatus according to claim 21, wherein the feature information of the target to be searched includes attribute information of the target to be searched; and
    the search unit is specifically configured to search the picture information database for the matching first target picture according to the attribute information of the target to be searched.
  27. The apparatus according to claim 21, wherein the feature information of the target to be searched includes a target picture to be searched; and
    the search unit is specifically configured to:
    model the target picture to be searched and extract attribute information of the target picture to be searched; and
    search the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
  28. The apparatus according to claim 21, wherein the feature information of the target to be searched includes a target picture to be searched and third attribute information of the target picture to be searched; and
    the search unit is specifically configured to:
    model the target picture to be searched and extract fourth attribute information of the target picture to be searched;
    determine attribute information of the target picture to be searched according to the third attribute information and the fourth attribute information; and
    search the picture information database for the matching first target picture according to the attribute information of the target picture to be searched.
  29. The apparatus according to claim 21, wherein the target search request further carries a search time period range; and
    the search unit is specifically configured to:
    filter the target pictures in the picture information database according to the search time period range to obtain second target pictures whose acquisition time falls within the search time period range; and
    search the second target pictures for the matching first target picture according to the feature information of the target to be searched.
  30. The apparatus according to claim 21, wherein the target search request further carries a search channel number, and the target picture information further includes the channel number of the target picture; and
    the search unit is specifically configured to:
    filter the target pictures in the picture information database according to the search channel number to obtain third target pictures whose channel number is consistent with the search channel number; and
    search the third target pictures for the matching first target picture according to the feature information of the target to be searched.
  31. The apparatus according to claim 21, wherein the target picture is a face picture, and the target search request is a face search request.
  32. The apparatus according to claim 21, wherein the target picture is a vehicle picture, and the target search request is a vehicle search request.
  33. The apparatus according to claim 20, wherein the processing unit is specifically configured to:
    sort the first target pictures by acquisition time from earliest to latest; and
    generate the video summary according to the sorted first target pictures.
  34. The apparatus according to claim 20 or 33, wherein the processing unit is specifically configured to:
    for each first target picture, determine the target video clip corresponding to that first target picture, wherein the target video clip is the recording data from the n-th second before the acquisition time of that first target picture to the m-th second after that acquisition time;
    generate the video summary according to the target video clips.
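The sorting step and the clip window described above amount to simple arithmetic on the acquisition time. The sketch below assumes times are expressed in seconds and that each matched picture is represented by a hypothetical `(channel, acquired_at)` pair; neither representation is specified by the claims.

```python
from typing import List, Tuple


def build_clips(pictures: List[Tuple[int, float]],
                n: float, m: float) -> List[Tuple[int, float, float]]:
    """Return (channel, start, end) windows ordered by acquisition time.

    Each window runs from n seconds before the acquisition time to
    m seconds after it.
    """
    ordered = sorted(pictures, key=lambda p: p[1])  # earliest first
    return [(channel, t - n, t + m) for channel, t in ordered]
```

For example, with `n=5` and `m=3`, a picture acquired at second 30 on channel 2 yields the window from second 25 to second 33.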
  35. The apparatus according to claim 34, wherein the processing unit is specifically configured to:
    when there are multiple first target pictures with the same acquisition time, for any one first target picture among them, determine a start time point and an end time point of the video clip corresponding to that first target picture, wherein the start time point is the n-th second before the acquisition time of that first target picture, and the end time point is the m-th second after that acquisition time;
    search the recording data of the video data channel to which that first target picture belongs for an I-frame at the start time point and for an I-frame at the end time point;
    if both the I-frame at the start time point and the I-frame at the end time point exist, discard the remaining first target pictures among the multiple first target pictures, and determine the video clip corresponding to that first target picture as the target video clip.
  36. The apparatus according to claim 35, wherein the processing unit is further configured to:
    if no I-frame exists at the start time point, increase the start time point of the video clip corresponding to the first target picture by x seconds to obtain a new start time point, and repeat the above search step until an I-frame at the new start time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new start time point equals the acquisition time;
    if no I-frame exists at the end time point, decrease the end time point of the video clip corresponding to the first target picture by x seconds to obtain a new end time point, and repeat the above search step until an I-frame at the new end time point is found in the recording data of the video data channel to which the first target picture belongs, or until the new end time point equals the acquisition time;
    from the video clips respectively corresponding to the multiple first target pictures, select the longest video clip as the target video clip;
    discard the first target pictures, among the multiple first target pictures, that correspond to the remaining video clips.
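The I-frame search in the two preceding claims can be sketched as a pair of loops that move the clip boundaries toward the acquisition time in steps of x seconds. Several details here are assumptions made for illustration only: I-frame positions are modeled as a set of whole-second timestamps per channel (`iframes_by_channel` is a hypothetical stand-in for querying the recording data), x is positive, and the boundary is clamped to the acquisition time when a step would overshoot it, a case the claims do not address.

```python
from typing import Dict, List, Optional, Set, Tuple


def snap_clip(iframes: Set[int], acquired_at: int,
              n: int, m: int, x: int) -> Tuple[int, int]:
    """Shrink [acquired_at - n, acquired_at + m] until both ends hit an I-frame."""
    start, end = acquired_at - n, acquired_at + m
    # Move the start later in steps of x until an I-frame is found
    # or the start reaches the acquisition time.
    while start not in iframes and start < acquired_at:
        start = min(start + x, acquired_at)  # clamp: assumption, see lead-in
    # Move the end earlier in the same way.
    while end not in iframes and end > acquired_at:
        end = max(end - x, acquired_at)
    return start, end


def choose_clip(pictures: List[Tuple[int, int]],
                iframes_by_channel: Dict[int, Set[int]],
                n: int, m: int, x: int) -> Optional[Tuple[int, int, int]]:
    """Pick one (channel, start, end) for pictures sharing an acquisition time.

    pictures: (channel, acquired_at) pairs with identical acquired_at.
    """
    best = None
    for channel, t in pictures:
        start, end = snap_clip(iframes_by_channel.get(channel, set()), t, n, m, x)
        if (start, end) == (t - n, t + m):
            # I-frames exist at both original boundaries: keep this clip
            # and drop the other pictures.
            return channel, start, end
        if best is None or end - start > best[2] - best[1]:
            best = (channel, start, end)  # otherwise keep the longest clip
    return best
```

Aligning both boundaries to I-frames means the clip can be decoded from its first frame without reference to earlier data, which is presumably why the search is performed.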
  37. The apparatus according to claim 34, wherein the processing unit is specifically configured to:
    filter the target video clips according to the start time point and the end time point of each target video clip, to remove recording data that overlaps in time;
    generate the video summary according to the filtered target video clips.
  38. The apparatus according to claim 37, wherein the processing unit is specifically configured to:
    sort the target video clips by their start time points;
    for an adjacent first target video clip and second target video clip, when the end time point of the first target video clip is later than or equal to the start time point of the second target video clip:
    if the first target video clip and the second target video clip belong to the same video data channel, merge the first target video clip and the second target video clip, wherein the start time point of the merged video clip is the start time point of the first target video clip, and its end time point is the end time point of the second target video clip;
    if the first target video clip and the second target video clip belong to different video data channels, use the end time point of the first target video clip as the start time point of the second target video clip, or use the start time point of the second target video clip as the end time point of the first target video clip;
    wherein the start time point of the first target video clip is earlier than the start time point of the second target video clip.
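The overlap removal in the two preceding claims can be sketched as a single pass over the clips sorted by start time. The `Clip` type is hypothetical, and two choices go beyond the claim text and are assumptions: the merged end is taken as the later of the two end points (so a clip fully contained in its predecessor does not shorten it), and a cross-channel clip that becomes empty after trimming is dropped. For the cross-channel case the sketch uses the first of the two claimed alternatives, moving the second clip's start to the first clip's end.

```python
from typing import List, NamedTuple


class Clip(NamedTuple):
    channel: int
    start: float
    end: float


def remove_overlaps(clips: List[Clip]) -> List[Clip]:
    """Merge or trim clips so that no two clips cover the same time."""
    result: List[Clip] = []
    for clip in sorted(clips, key=lambda c: c.start):
        if result and result[-1].end >= clip.start:
            prev = result[-1]
            if prev.channel == clip.channel:
                # Same channel: merge into one clip spanning both.
                result[-1] = Clip(prev.channel, prev.start, max(prev.end, clip.end))
            elif clip.end > prev.end:
                # Different channels: start the later clip where the earlier ends.
                result.append(Clip(clip.channel, prev.end, clip.end))
            # else: later clip lies entirely inside the earlier one; drop it (assumption)
        else:
            result.append(clip)
    return result
```

Because each clip is compared with the last clip already kept, a chain of overlapping clips on one channel collapses into a single clip in one pass.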
  39. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
    the memory is configured to store a computer program;
    the processor is configured to implement the method steps according to any one of claims 1 to 19 when executing the computer program stored in the memory.
  40. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and the computer program, when executed by a processor, implements the method steps according to any one of claims 1 to 19.
PCT/CN2019/102073 2018-09-04 2019-08-22 Video abstract generation method and apparatus, and electronic device and readable storage medium WO2020048324A1 (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
CN201811027494.7 2018-09-04
CN201811027515.5 2018-09-04
CN201811026894.6 2018-09-04
CN201811026894.6A CN110876090B (en) 2018-09-04 2018-09-04 Video abstract playback method and device, electronic equipment and readable storage medium
CN201811027515.5A CN110876029B (en) 2018-09-04 2018-09-04 Video abstract playback method and device, electronic equipment and readable storage medium
CN201811025858.8A CN110876092B (en) 2018-09-04 2018-09-04 Video abstract generation method and device, electronic equipment and readable storage medium
CN201811025858.8 2018-09-04
CN201811027494.7A CN110929095A (en) 2018-09-04 2018-09-04 Video abstract playback method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
WO2020048324A1 (en) 2020-03-12

Family

ID=69722729

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/102073 WO2020048324A1 (en) 2018-09-04 2019-08-22 Video abstract generation method and apparatus, and electronic device and readable storage medium

Country Status (1)

Country Link
WO (1) WO2020048324A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103927364A (en) * 2014-04-18 2014-07-16 苏州科达科技股份有限公司 Storage method and system and display system for video abstract data
CN105335387A (en) * 2014-07-04 2016-02-17 杭州海康威视系统技术有限公司 Retrieval method for video cloud storage system
US20160342688A1 (en) * 2013-06-25 2016-11-24 Emc Corporation Large scale video analytics architecture
CN107436944A (en) * 2017-07-31 2017-12-05 福州瑞芯微电子股份有限公司 A kind of method and system of video search
CN108337482A (en) * 2018-02-08 2018-07-27 北京信息科技大学 The storage method and system of monitor video

Similar Documents

Publication Publication Date Title
EP3830714B1 (en) Systems and methods for generating metadata describing unstructured data objects at the storage edge
AU2017204338B2 (en) Industry first method that uses users search query to show highly relevant poster frames for stock videos thereby resulting in great experience and more sales among users of stock video service
US9560323B2 (en) Method and system for metadata extraction from master-slave cameras tracking system
EP3253042B1 (en) Intelligent processing method and system for video data
JP6446971B2 (en) Data processing apparatus, data processing method, and computer program
CN102483767B (en) Object association means, method of mapping, program and recording medium
US7243101B2 (en) Program, image managing apparatus and image managing method
JP5791364B2 (en) Face recognition device, face recognition method, face recognition program, and recording medium recording the program
US8457466B1 (en) Videore: method and system for storing videos from multiple cameras for behavior re-mining
US20080247610A1 (en) Apparatus, Method and Computer Program for Processing Information
US20210382933A1 (en) Method and device for archive application, and storage medium
WO2014106384A1 (en) Method, apparatus and video monitoring system for providing monitoring video information
GB2528330A (en) A method of video analysis
JP2011528150A (en) Method and system for automatic personal annotation of video content
JP2012509522A (en) Semantic classification for each event
US10037467B2 (en) Information processing system
CN110543584B (en) Method, device, processing server and storage medium for establishing face index
WO2022156234A1 (en) Target re-identification method and apparatus, and computer-readable storage medium
JP2014067333A (en) Image processing device, image processing method, and program
JPWO2018163398A1 (en) Similar image search system
CN111881320A (en) Video query method, device, equipment and readable storage medium
CN110876090B (en) Video abstract playback method and device, electronic equipment and readable storage medium
CN108540760A (en) Video monitoring recognition methods, device and system
CN109522799A (en) Information cuing method, device, computer equipment and storage medium
WO2021196551A1 (en) Image retrieval method and apparatus, computer device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19857715

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19857715

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 040222)
