CN110309362A - Video retrieval method and system

Info

Publication number: CN110309362A
Application number: CN201910606483.2A
Authority: CN
Inventors: 赵崇毅; 赵维中; 刘青青
Original assignee: Shenzhen Zhongke Yunhai Technology Co Ltd
Current assignee: Shenzhen Zhongke Yunhai Technology Co Ltd
Priority date: 2019-07-05
Filing date: 2019-07-05
Publication date: 2019-10-08

Abstract

A video retrieval method, comprising the following steps: while a video is being shot, synchronously extracting the frame pictures being shot; calling a trained neural network model to identify the frame pictures, and storing each identified object together with the shooting time of the corresponding frame picture in a label file; retrieving a first target object in the label file, and displaying the shooting time of the frame pictures in the label file that match the first target object. Because object identification is performed on the frame pictures by the neural network model while the video is being shot, and the identified objects and the shooting times of the frame pictures are saved in the label file, a specific target object can be found by searching the label file directly for a matching object. This yields the times at which the specific target object appears, and playing back the video at those times lets the user observe it, which greatly improves the efficiency of searching for a target. The application also provides a video retrieval system.

Description

Video retrieval method and system

Technical field

This application relates to the technical field of video retrieval, and in particular to a video retrieval method and system.

Background art

Video systems are now widely used in security, traffic management, shopping malls, production, machine vision, warehousing, and similar settings, and they generate massive amounts of video data. Analyzing and processing video information has become an important part of smart city construction.

Existing video retrieval relies mainly on manual processing: finding a target scene means playing the video back in real time, so reviewing a recording takes as long as the recording itself. This is inefficient and prone to omissions. Another video retrieval method uploads the video to a cloud server provided by a vendor and analyzes the video information using the server's powerful computing power and neural network algorithms; however, uploading the video also takes a long time, so this method is inefficient as well.

Summary of the invention

To overcome the above drawbacks of the prior art, the application provides an efficient video retrieval method and system.

According to a first aspect, an embodiment provides a video retrieval method, comprising the following steps: while a video is being shot, synchronously extracting the frame pictures being shot; calling a neural network model to identify the frame pictures, and storing each identified object and the shooting time of the corresponding frame picture in a label file; retrieving a first target object in the label file, and displaying the shooting time of the frame pictures in the label file that match the first target object.

Preferably, after the neural network model identifies the frame pictures, the identified frame pictures are also stored in a label folder; a second target object is input, a dedicated neural network model is called to perform secondary identification on the frame pictures that match the first target object, and the shooting time of the frame pictures that match the second target object in the secondary identification is displayed.

Preferably, the method further includes inputting a picture, calling the neural network model to identify the objects in the picture, and using the identified objects as the search condition to find the matching frame pictures and/or their shooting times in the label file.

Preferably, an alarm is issued when a preset object appears and is identified by the neural network model.

Preferably, the identified object and the shooting time of the corresponding frame picture are stored in the label file in the form of an electronic tag; when multiple frame pictures are identified and contain the same objects, only one or a few of those frame pictures and their corresponding electronic tags are saved.

Preferably, at a preset time point or when storage space is insufficient, unidentified frame pictures and/or the video corresponding to unidentified frame pictures are deleted.

According to a second aspect, an embodiment provides a video retrieval system, including a frame picture extraction unit, a frame picture recognition unit, a storage unit, an input unit, a retrieval unit, and a display unit. The frame picture extraction unit is used to synchronously extract the frame pictures being shot while a video is being shot; the frame picture recognition unit is used to call a neural network model to identify the frame pictures and to store each identified object and the shooting time of the corresponding frame picture in a label file; the storage unit is used to store the video and the label file; the input unit is used to input the object to be retrieved; the retrieval unit is used to find, according to the object to be retrieved, the matching frame pictures and/or their shooting times in the label file; the display unit is used to display the shooting time of a frame picture and/or the corresponding frame picture.

Preferably, the display unit and the input unit are integrated in a touch display screen.

With the video retrieval method and system of the above embodiments, object identification is performed on the frame pictures by the neural network model while the video is being shot, and the identified objects and the frame picture shooting times are saved in the label file. When a specific target object needs to be found, the label file is searched directly for a matching object, which yields the times at which the target appears; playing back the video at those times then lets the user observe it. Searching for targets this way requires no on-site analysis of the video, and storing an identified object with its shooting time takes only a few bytes, so the efficiency of searching for a target is greatly improved. Further, the application uses the frame pictures directly as the objects of identification and analysis; compared with earlier approaches that analyze the recorded video, this avoids the video encoding and decoding steps and also improves the efficiency of picture identification. Further, calling a dedicated neural network model to perform a specialized secondary identification on the frame pictures enables precise identification and improves retrieval efficiency again.

Brief description of the drawings

Fig. 1 is a flow chart of one embodiment of the application;

Fig. 2 is a flow chart of another embodiment of the application;

Fig. 3 is a system block diagram of one embodiment of the application.

Detailed description of the embodiments

The invention is described in further detail below through specific embodiments in conjunction with the drawings. Similar components in different embodiments use associated, similar reference numerals. In the following embodiments, many details are described so that the application can be better understood. However, those skilled in the art will readily recognize that some of these features can be omitted in different situations, or replaced by other elements, materials, or methods. In some cases, certain operations related to the application are not shown or described in the specification, to avoid burying the core of the application under excessive description. For those skilled in the art, a detailed description of these operations is not necessary; they can fully understand them from the description in the specification and the general technical knowledge in the field.

The ordinal numbers assigned to components herein, such as "first" and "second", serve only to distinguish the described objects and carry no sequential or technical meaning. Unless otherwise stated, "connection" and "coupling" in this application include both direct and indirect connection (coupling).

Embodiment one

Referring to Fig. 1, the video retrieval method comprises the following steps:

101. While a video is being shot, synchronously extract the frame pictures being shot;

102. Call a trained neural network model to identify the frame pictures, make an electronic tag from the name or code of each identified object and the shooting time of the corresponding frame picture, and store it in a label file.

Training means having the neural network model learn from a large number of sample pictures of an object and memorize its features. For example, given the object "truck", the trained algorithm model can pick out the frame pictures that contain a truck.

103. Input the name or code of a first target object to search the label file, and display the shooting time of the frame pictures that match the first target object.

Objects, frame picture shooting times, and the corresponding frame pictures can be stored in the form of electronic tags; an electronic tag can be a two-dimensional code, a bar code, or the like. For example, one electronic tag may contain the objects "automobile" and "cyclist", the corresponding frame picture, and the shooting time of that frame picture. A frame picture matches the first target object when the first target object has been identified in, and appears in, that frame picture. For example, if the first target object "automobile", or its code 01, is input at retrieval time, then any electronic tag containing "automobile" or code 01 corresponds to a frame picture that matches the first target object.
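The tag contents and the matching rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `ElectronicTag`, its field names, the helper `matches`, and the code "02" are hypothetical; only the code 01 for "automobile" comes from the text, and the encoding of a tag as a two-dimensional code or bar code is not shown.

```python
from dataclasses import dataclass

# Hypothetical code table; only "01" -> "automobile" is taken from the text.
OBJECT_CODES = {"01": "automobile", "02": "cyclist"}


@dataclass(frozen=True)
class ElectronicTag:
    shot_time: str          # shooting time of the frame picture (ISO 8601 assumed)
    objects: frozenset      # names of the objects identified in the frame picture
    frame_path: str = ""    # optional location of the stored frame picture


def matches(tag: ElectronicTag, query: str) -> bool:
    """Return True if the queried object name or code is recorded in the tag."""
    name = OBJECT_CODES.get(query, query)  # translate a code into its name
    return name in tag.objects


tag = ElectronicTag("2019-07-05T08:30:00", frozenset({"automobile", "cyclist"}))
print(matches(tag, "automobile"), matches(tag, "01"), matches(tag, "truck"))
```

Resolving a code to its name before the lookup lets the same tag be found by either form of input, as the example with "automobile" and 01 requires.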

After identification, the extracted frame pictures are stored in the label file and labeled, for example with the shooting time of the frame picture and the objects it contains. The objects can be classes such as truck, bus, car, pedestrian, cyclist, traffic sign, color block, texture, scene, or face. If a frame picture contains only a pedestrian, the label records the time and "pedestrian (or its code)"; if it contains a pedestrian and a car, the label records the time, "car (or its code)", and "pedestrian (or its code)"; and so on. The classes of objects to identify can be configured differently for different scenes.

Retrieval only requires searching the label file. For example, to find the video clips in which a car appears, input "car (or its code)"; the shooting times of all frame pictures containing the object "car" are output, preferably with the frame pictures themselves displayed at the same time. Playing back the video at a shooting time then shows the scene as it was.
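A minimal sketch of such a label file and its retrieval is given below. It assumes, purely for illustration, that each label is one JSON line holding a shooting time and a list of object names; the text does not specify a file format, and the function names `append_label` and `search` are hypothetical. An in-memory `io.StringIO` stands in for a file on disk.

```python
import io
import json


def append_label(label_file, shot_time, objects):
    """Append one label (shooting time plus identified objects) as a JSON line."""
    label_file.seek(0, io.SEEK_END)  # always write at the end of the file
    record = {"time": shot_time, "objects": sorted(objects)}
    label_file.write(json.dumps(record) + "\n")


def search(label_file, target):
    """Return the shooting times of all frame pictures that contain the target."""
    label_file.seek(0)
    times = []
    for line in label_file:
        record = json.loads(line)
        if target in record["objects"]:
            times.append(record["time"])
    return times


label_file = io.StringIO()  # stand-in for the label file in the storage unit
append_label(label_file, "2019-07-05T08:30:00", {"pedestrian"})
append_label(label_file, "2019-07-05T08:30:01", {"pedestrian", "car"})
append_label(label_file, "2019-07-05T08:30:02", {"car"})

car_times = search(label_file, "car")
print(car_times)
```

Because retrieval reads only these small text records and never decodes video, a search touches a few bytes per frame picture rather than the recording itself.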

One or more neural network models can be called. For example, the model called can be a single large model, or several small models for different classes connected in series.
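The series connection of several small models can be sketched as below. The two "models" are stubs that simply filter labels already attached to a stand-in frame; a real system would run neural network inference on pixel data. All names here (`vehicle_model`, `person_model`, `recognize`) are hypothetical.

```python
from typing import Callable, Iterable, Set

Frame = dict  # stand-in for image data; a real frame would be a pixel array
Model = Callable[[Frame], Set[str]]


def vehicle_model(frame: Frame) -> Set[str]:
    """Stub for a small model that only knows vehicle classes."""
    return set(frame.get("labels", [])) & {"truck", "bus", "car"}


def person_model(frame: Frame) -> Set[str]:
    """Stub for a small model that only knows person classes."""
    return set(frame.get("labels", [])) & {"pedestrian", "cyclist"}


def recognize(frame: Frame, models: Iterable[Model]) -> Set[str]:
    """Run the models one after another and merge the objects they report."""
    found: Set[str] = set()
    for model in models:
        found |= model(frame)
    return found


frame = {"labels": ["car", "pedestrian", "tree"]}
objects = recognize(frame, [vehicle_model, person_model])
print(sorted(objects))
```

A single large model fits the same interface: it is just a list with one entry.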

Embodiment two

Referring to Fig. 2, the video retrieval method comprises the following steps:

201. While a video is being shot, synchronously extract the frame pictures being shot;

202. Call a trained neural network model to identify the frame pictures, and store each identified object, the corresponding frame picture, and the shooting time of that frame picture in a label file.

203. Retrieve a first target object in the label file to obtain the frame pictures in the label file that match the first target object.

204. Input a second target object, call a dedicated neural network model to perform secondary identification on the frame pictures that match the first target object, and display the shooting time of the frame pictures that match the second target object in the secondary identification.

This embodiment adds a secondary retrieval to Embodiment one. When the first retrieval returns so many frame pictures that the target cannot be found quickly, a secondary, precise identification can be performed. For example, to find a car with a certain license plate number, first retrieve the frame pictures containing "car", then call a license plate recognition model on those frame pictures only, to find the frame pictures that match the target license plate or their shooting times. This greatly improves search efficiency.
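The two-stage retrieval can be sketched as follows. The license plate recognizer is a stub that reads a value stored with the stand-in frame, and the plate strings are made up for the example; `first_retrieval`, `secondary_retrieval`, and `read_plate` are hypothetical names.

```python
def read_plate(frame):
    """Stub for a dedicated license plate recognition model (hypothetical)."""
    return frame.get("plate")


def first_retrieval(labels, target):
    """Stage 1: search the labels only, without running any model."""
    return [label for label in labels if target in label["objects"]]


def secondary_retrieval(candidates, wanted_plate):
    """Stage 2: run the dedicated model on the stage-1 matches only."""
    return [
        label["time"]
        for label in candidates
        if read_plate(label["frame"]) == wanted_plate
    ]


labels = [
    {"time": "08:30:00", "objects": {"car"}, "frame": {"plate": "TEST-001"}},
    {"time": "08:30:05", "objects": {"pedestrian"}, "frame": {}},
    {"time": "08:31:10", "objects": {"car", "pedestrian"}, "frame": {"plate": "TEST-002"}},
]

cars = first_retrieval(labels, "car")
hits = secondary_retrieval(cars, "TEST-002")
print(hits)
```

The dedicated model runs only on the frame pictures that survived the first stage, which is where the efficiency gain in the text comes from.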

In a preferred embodiment, a picture can also be input; the neural network model is called to identify the objects in the picture, and the identified objects are used as the search condition to find the matching frame pictures in the label file. This is search by picture: for example, inputting a photo of a cook wearing a chef's hat finds the frame pictures in the surveillance video that contain both "chef's hat" and "person", which improves search accuracy.
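Search by picture can be sketched as below, assuming that a frame picture matches when it contains every object identified in the query picture (as in the "chef's hat" and "person" example). The recognizer is a stub and the names `recognize_picture` and `search_by_picture` are hypothetical.

```python
def recognize_picture(picture):
    """Stub for the neural network model applied to the query picture."""
    return set(picture.get("labels", []))


def search_by_picture(labels, picture):
    """Return the labels whose frame picture contains all objects of the query."""
    wanted = recognize_picture(picture)
    if not wanted:
        return []  # nothing identified in the query, so nothing to search for
    return [label for label in labels if wanted <= label["objects"]]


labels = [
    {"time": "12:00:00", "objects": {"person"}},
    {"time": "12:00:04", "objects": {"person", "chef's hat"}},
    {"time": "12:00:09", "objects": {"car"}},
]
query = {"labels": ["chef's hat", "person"]}

found = search_by_picture(labels, query)
print([label["time"] for label in found])
```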

In a preferred embodiment, an alarm is issued when a preset object appears and is identified by the neural network model. For example, an alarm is issued when the face of a particular criminal is identified; this is highly practical and can greatly improve the efficiency with which police solve cases. The function can also be applied in other settings. For example, installed in a dashboard camera, it starts an audible alarm to wake the driver when the camera captures the driver's "eyes closed" object continuously for three seconds.
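The dashboard camera example can be sketched as a check over a stream of per-frame identification results. This is a sketch under assumptions: the input is taken to be time-ordered pairs of a timestamp in seconds and the set of identified objects, and `persistent_object_alarms` is a hypothetical name. One alarm is raised per uninterrupted run of the object.

```python
def persistent_object_alarms(observations, target="eyes closed", hold_seconds=3.0):
    """Return the timestamps at which the target has been seen continuously
    for at least hold_seconds (one alarm per uninterrupted run)."""
    alarms = []
    run_start = None   # timestamp at which the current run of the target began
    fired = False      # whether this run has already raised its alarm
    for timestamp, objects in observations:
        if target in objects:
            if run_start is None:
                run_start, fired = timestamp, False
            if not fired and timestamp - run_start >= hold_seconds:
                alarms.append(timestamp)
                fired = True
        else:
            run_start = None  # the run is broken; start over
    return alarms


observations = [
    (0.0, {"eyes closed"}),
    (1.0, {"eyes closed"}),
    (2.0, set()),              # eyes open again: the run is reset
    (3.0, {"eyes closed"}),
    (4.0, {"eyes closed"}),
    (5.0, {"eyes closed"}),
    (6.0, {"eyes closed"}),    # three seconds after 3.0: alarm
    (7.0, {"eyes closed"}),
]
alarms = persistent_object_alarms(observations)
print(alarms)
```

For an object that should trigger immediately, such as a wanted face, the same function can be called with `hold_seconds=0`.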

Referring to Fig. 3, the application also provides a video retrieval system, including a frame picture extraction unit, a frame picture recognition unit 1, a storage unit 2, an input unit, a retrieval unit, a camera unit 3, and a display unit. The display unit and the input unit are integrated in a touch display screen 4; the input unit is used to input the object to be retrieved, and the display unit is used to display the shooting time of a frame picture and/or the corresponding frame picture. The camera unit 3 is used to shoot video and can be an independent camera device. The frame picture extraction unit is used to synchronously extract the frame pictures being shot while a video is being shot, and can be provided in the CPU 5. The frame picture recognition unit is used to call the neural network model to identify the frame pictures and to store the frame pictures, classified by identified object, in the respective label files together with their shooting times. The frame picture recognition unit 1 can use a 2801S, 2803S, or 5801 neural network processing (NPU) chip from GTI, a Silicon Valley company in the United States, or a chip from a series made by Shenzhen cloud data technologies Co., Ltd. After calling an algorithm model, the NPU chip can identify the objects in a picture. The storage unit is used to store the video, the label files, and the shooting times of the frame pictures; the storage unit 2 can be a storage device such as a PC hard disk or a hard disk video recorder. The retrieval unit is used to find the matching frame pictures in the label files according to the input object; it can likewise be provided in the CPU 5, which executes the retrieval tasks. In addition, the CPU 5 is connected to each of the other modules and directs them, issues instructions, and schedules their work.

The frame pictures shot by the camera unit are split synchronously into two paths under the scheduling of the CPU 5. One path is encoded into a video file and stored in the storage unit; the other is transmitted synchronously to the frame picture recognition unit. The number of frame pictures extracted per second is determined by the upper limit of the NPU chip's processing capacity and generally should not be less than 5 frames per second. Meanwhile, the frame picture recognition unit calls the loaded recognition model to identify objects in each frame picture synchronously; each identified frame picture is stored in the label file of the storage unit together with the objects it contains and its shooting time. Video retrieval then only requires searching the label file, outputting the frame pictures that match the target object and their shooting times, and playing back the video at those times.

The invention has been described above using specific examples, which are intended only to aid understanding of the invention and not to limit it. Those skilled in the art can make several simple deductions, variations, or substitutions based on the idea of the invention.
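How many frame pictures to pass to the recognition path each second can be sketched as below. The assumptions are that the camera frame rate and the NPU's maximum throughput are known as whole frames per second, and that the extracted frames should be spread evenly across each second; `frames_to_extract` is a hypothetical name, and only the lower bound of 5 frames per second comes from the text.

```python
MIN_FPS = 5  # lower bound on the extraction rate given in the description


def frames_to_extract(camera_fps: int, npu_max_fps: int) -> list:
    """Return the indices (within one second of video) of the frames sent to
    the recognition path, evenly spaced and capped by the NPU's capacity."""
    rate = min(camera_fps, npu_max_fps)  # cannot extract more than is shot
    if rate < MIN_FPS:
        raise ValueError("extraction rate would fall below 5 frames per second")
    return [i * camera_fps // rate for i in range(rate)]


# A 25 fps camera with an NPU that can process 10 frames per second.
print(frames_to_extract(25, 10))
```

Every frame still goes to the encoding path; only the frames at the returned indices are also handed to the frame picture recognition unit.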

Claims

1. A video retrieval method, characterized by comprising the following steps:

while a video is being shot, synchronously extracting the frame pictures being shot;

calling a neural network model to identify the frame pictures, and storing each identified object and the shooting time of the corresponding frame picture in a label file;

retrieving a first target object in the label file, and displaying the shooting time of the frame pictures in the label file that match the first target object.

2. The video retrieval method according to claim 1, characterized in that, after the neural network model identifies the frame pictures, the identified frame pictures are also stored in the label file; a second target object is input, a dedicated neural network model is called to perform secondary identification on the frame pictures that match the first target object, and the shooting time of the frame pictures that match the second target object in the secondary identification is displayed.

3. The video retrieval method according to claim 1, characterized by further comprising inputting a picture, calling the neural network model to identify the objects in the picture, and using the identified objects as the search condition to find the matching frame pictures and/or their shooting times in the label file.

4. The video retrieval method according to claim 1, characterized in that an alarm is issued when a preset object appears and is identified by the neural network model.

5. The video retrieval method according to claim 1, characterized in that the identified object and the shooting time of the corresponding frame picture are stored in the label file in the form of an electronic tag, and when multiple frame pictures are identified and contain the same objects, one or a few of those frame pictures and their corresponding electronic tags are saved.

6. The video retrieval method according to claim 1, characterized in that, at a preset time point or when storage space is insufficient, unidentified frame pictures and/or the video corresponding to unidentified frame pictures are deleted.

7. A video retrieval system, characterized by comprising a frame picture extraction unit, a frame picture recognition unit, a storage unit, an input unit, a retrieval unit, and a display unit;

the frame picture extraction unit is used to synchronously extract the frame pictures being shot while a video is being shot;

the frame picture recognition unit is used to call a neural network model to identify the frame pictures, and to store each identified object and the shooting time of the corresponding frame picture in a label file;

the storage unit is used to store the video and the label file;

the input unit is used to input the object to be retrieved;

the retrieval unit is used to find, according to the object to be retrieved, the matching frame pictures and/or their shooting times in the label file;

the display unit is used to display the shooting time of a frame picture and/or the corresponding frame picture.

8. The video retrieval system according to claim 7, characterized in that the display unit and the input unit are integrated in a touch display screen.