CN107749951A - Visual perception method and system for unmanned photography - Google Patents

Visual perception method and system for unmanned photography

Info

Publication number
CN107749951A
Authority
CN
China
Prior art keywords
target
shooting
information
module
visually
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711098855.2A
Other languages
Chinese (zh)
Inventor
刘博
张明
黄龙
李亮
王禹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rui Magic Intelligent Technology (Dongguan) Co., Ltd.
Original Assignee
Rui Magic Intelligent Technology (Dongguan) Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rui Magic Intelligent Technology (Dongguan) Co., Ltd.
Priority to CN201711098855.2A priority Critical patent/CN107749951A/en
Publication of CN107749951A publication Critical patent/CN107749951A/en
Pending legal-status Critical Current


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/61 Control of cameras or camera modules based on recognised objects
    • H04N23/611 Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A visual perception method and system for unmanned photography, comprising the following steps: S1, mounting a camera terminal on a support terminal, starting the camera terminal to shoot video of a target, and extracting the basic features of the target; S2, analyzing the target's current behavior, pose, and position according to the basic features, outputting guidance information for the next shot, providing the guidance information to control the behavior of the camera terminal, and adjusting the camera terminal; repeating this until video shooting ends. The invention completes shooting without human participation, reducing labor cost, saving time, and achieving a good shooting effect.

Description

Visual perception method and system for unmanned photography
Technical field
The invention belongs to the field of unmanned shooting, and specifically relates to a visual perception method and system for unmanned photography.
Background technology
Existing photography requires human participation. A user either holds the camera terminal directly, or controls it with the aid of a gimbal or similar device. Even so-called unmanned shooting tools, such as drones and underwater cameras, are essentially operated by people: the content, the start and stop times, the shooting range, the shooting angle, and the other shooting elements are all decided by a person. In other words, every shooting mode fundamentally requires direct or indirect human participation. Moreover, fixed unmanned shooting setups can only record simply; they cannot shoot perceptively.
The content of the invention
The technical problem to be solved by the present invention is to provide a visual perception method and system for unmanned photography that completes shooting without human participation, reducing labor cost, saving time, and achieving a good shooting effect.
To solve the above technical problem, the present invention adopts the following technical scheme:
A visual perception method for unmanned photography comprises the following steps:
S1, mounting a camera terminal on a support terminal, starting the camera terminal to shoot video of a target, and extracting the basic features of the target;
S2, analyzing the target's current behavior, pose, and position according to the basic features, outputting guidance information for the next shot, providing the guidance information to control the behavior of the camera terminal, and adjusting the camera terminal; repeating this until video shooting ends.
Step S2 specifically includes front-end visual perception, which comprises the following steps:
S201, target detection: determining the specific position of the target in the current frame;
S202, pose estimation: obtaining the pose features of the target and inferring the target's next move;
S203, target tracking: determining the position and pose of the target in subsequent frames according to the target's historical position and pose distribution information.
In step S201, during target detection, image frames of the current footage are detected at a set frame interval to find the target's position in the picture, providing auxiliary information for pose estimation and target tracking.
Step S2 also specifically includes back-end scene understanding, which comprises the following steps:
S211, behavior detection: detecting the target's actions and obtaining the target's action information;
S212, face recognition: recognizing faces in the picture to provide reference information for target identification and subject selection;
S213, gesture recognition: detecting whether a gesture representing an operation instruction is present and, if so, performing the corresponding operation;
S214, focus perception: estimating the focus of the current video from the action information obtained by behavior detection and the pose features obtained by pose estimation, automatically determining the shooting range, shooting angle, and shooting path, outputting a shooting strategy, and adjusting the camera terminal to the corresponding position to shoot.
The information obtained by the face recognition of step S212 and by the gesture recognition of step S213 is also provided to the target tracking of step S203, which analyzes and processes it.
The information obtained by the target detection of step S201 and by the pose estimation of step S202 is also provided to the behavior detection of step S211, which analyzes and processes it.
A visual perception system for unmanned photography includes a basic feature extraction module for obtaining the basic features of the target in the picture;
a front-end visual perception module for receiving the basic features, analyzing and processing them, and outputting guidance information for the next shot; and
a back-end scene understanding module for receiving the basic features, analyzing and processing them, and outputting guidance information for the next shot.
The front-end visual perception module includes:
a target detection module for determining the specific position of the target in the current frame;
a pose estimation module for obtaining the pose features of the target and inferring the target's next move; and
a target tracking module for determining the position and pose of the target in subsequent frames according to the target's historical position and pose distribution information.
The back-end scene understanding module includes: a behavior detection module for detecting the target's actions and obtaining the target's action information;
a face recognition module for recognizing faces in the picture to provide reference information for target identification and subject selection;
a gesture recognition module for detecting whether a gesture representing an operation instruction is present and, if so, performing the corresponding operation; and
a focus perception module for estimating the focus of the current video from the action information obtained by behavior detection and the pose features obtained by pose estimation, automatically determining the shooting range, shooting angle, and shooting path, outputting a shooting strategy, and adjusting the camera terminal to the corresponding position to shoot.
The present invention completes shooting without human participation, reducing labor cost; by extracting basic features once, it effectively reduces the computation and time consumed by shooting, and the shooting effect is good.
Brief description of the drawings
Figure 1 is a schematic diagram of the connection principle of the system of the present invention;
Figure 2 is a schematic diagram of the overall shooting flow of the present invention in combination with the camera terminal.
Embodiment
To further understand the features, technical means, specific objects, and functions achieved by the present invention, the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
A visual perception method for unmanned photography comprises the following steps:
S1, mounting a camera terminal on a support terminal, starting the camera terminal to shoot video of a target, and extracting the basic features of the target. The support terminal may be a gimbal or another movable supporting device, and the camera terminal is an electronic device with a camera function, such as a camera or a mobile phone. A variety of basic features of the target are obtained, such as the target's relation to its surroundings, its actions, and its pose.
S2, analyzing the target's current behavior, pose, and position according to the basic features, outputting guidance information for the next shot, providing the guidance information to control the behavior of the camera terminal, and adjusting the camera terminal; repeating this until video shooting ends.
Step S2 specifically includes front-end visual perception, which comprises the following steps:
S201, target detection: determining the specific position of the target in the current frame. Image frames of the current footage are detected at a set frame interval to find the target's position in the picture, so that a more accurate target position is acquired.
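The per-interval detection described above can be sketched as follows. This is a minimal illustration, not from the patent: the detector callback, the `stride` parameter, and the box-caching scheme are all assumptions.

```python
from typing import Callable, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


class StridedDetector:
    """Runs the full detector only every `stride` frames; in between,
    the most recent box is reused as auxiliary information for pose
    estimation and target tracking."""

    def __init__(self, detect: Callable[[object], Optional[Box]], stride: int = 5):
        self.detect = detect
        self.stride = stride
        self.last_box: Optional[Box] = None

    def update(self, frame_index: int, frame: object) -> Optional[Box]:
        if frame_index % self.stride == 0:  # a detection frame
            self.last_box = self.detect(frame)
        return self.last_box                # cached box on skipped frames
```

In practice the stride trades detection latency against computation; the tracker fills in the skipped frames.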
S202, pose estimation: obtaining the pose features of the target and, based on the inner link between pose and behavior, inferring the target's next move, yielding a speculated target tendency.
S203, target tracking: determining the position and pose of the target in subsequent frames according to the target's historical position and pose distribution information, and outputting that position and pose. Tracking mainly follows the target's actions; with the target's position and pose information, tracking can proceed more smoothly and quickly, and the tracking effect is effectively improved. Moreover, in complex scenes, the information obtained by target detection and pose estimation can also locate the human body in the video, assisting human tracking or helping target tracking recover from failure so that tracking can continue.
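The patent does not specify the tracker's prediction model; a minimal stand-in for "predict the next position from historical positions" is a constant-velocity extrapolation over the last two observations, sketched here purely for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def predict_next_position(history: List[Point]) -> Point:
    """Predicts the target's position in the next frame from its
    historical positions, assuming roughly constant velocity between
    the last two observations (a simple stand-in for the patent's
    history-based tracker)."""
    if len(history) < 2:
        return history[-1]  # not enough history: hold the last position
    (x0, y0), (x1, y1) = history[-2], history[-1]
    # Last point plus the last observed displacement
    return (2 * x1 - x0, 2 * y1 - y0)
```

A real tracker would weight more history and fuse pose cues, but the prediction step has this shape.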
Step S2 also specifically includes back-end scene understanding, which comprises the following steps:
S211, behavior detection: detecting the target's actions and obtaining the target's action information, which can also be used to estimate the behavioral focus of the current subject.
S212, face recognition: recognizing faces in the picture to provide reference information for target identification and subject selection.
S213, gesture recognition: detecting whether a gesture representing an operation instruction is present and, if so, performing the corresponding operation. For example, if the target is a person, the person can issue instructions to the camera by gesturing in front of it, such as start shooting or stop shooting; by recognizing these gestures, the corresponding operations are realized.
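The gesture-to-operation dispatch can be sketched as a simple lookup. The gesture labels and command names below are illustrative assumptions; the patent only gives start/stop shooting as examples.

```python
# Hypothetical gesture labels and commands; the patent does not name
# specific gestures, so this mapping is an illustrative assumption.
GESTURE_COMMANDS = {
    "palm_raised": "start_recording",
    "fist": "stop_recording",
    "thumbs_up": "take_snapshot",
}


def dispatch_gesture(gesture_label, issue_command):
    """If the recognized gesture maps to an operation instruction,
    perform the corresponding operation; otherwise do nothing."""
    command = GESTURE_COMMANDS.get(gesture_label)
    if command is not None:
        issue_command(command)  # hand the command to the camera control
        return command
    return None
```

Unrecognized gestures fall through harmlessly, so the recognizer can run on every frame.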
S214, focus perception: estimating the focus of the current video from the action information obtained by behavior detection and the pose features obtained by pose estimation, automatically determining the shooting range, shooting angle, and shooting path, outputting a shooting strategy, and adjusting the camera terminal to the corresponding position to shoot. According to the obtained shooting strategy, the camera terminal is adjusted, or the support terminal is adjusted so as to adjust the camera terminal, realizing a better next shot. Guidance data is exchanged with the control system of the camera terminal or support terminal to realize the specific control.
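One concrete piece of such a shooting strategy is converting the estimated focus point into gimbal angle offsets. The sketch below assumes a pinhole camera with illustrative fields of view; the function name and parameters are not from the patent.

```python
import math


def pan_tilt_adjustment(focus_xy, frame_size, hfov_deg=60.0, vfov_deg=40.0):
    """Converts an estimated focus point (pixels) into pan/tilt angle
    offsets (degrees) that would re-center it, assuming a pinhole
    camera with the given fields of view (values are illustrative)."""
    fx, fy = focus_xy
    w, h = frame_size
    # Pixel offset of the focus point from the image center
    dx_px = fx - w / 2
    dy_px = fy - h / 2
    # Focal lengths in pixels from the horizontal/vertical fields of view
    f_px_x = (w / 2) / math.tan(math.radians(hfov_deg / 2))
    f_px_y = (h / 2) / math.tan(math.radians(vfov_deg / 2))
    pan = math.degrees(math.atan2(dx_px, f_px_x))
    tilt = math.degrees(math.atan2(dy_px, f_px_y))
    return pan, tilt
```

The support terminal's control system would then slew by these offsets before the next shot.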
For better shooting and perception, the information obtained by face recognition and gesture recognition is also provided to the target tracking step, which is conducive to better tracking.
In addition, the information obtained by the target detection of step S201 and the pose estimation of step S202 is also supplied to the behavior detection of step S211 for analysis and processing. Front-end visual perception thus provides a large amount of basic information, such as pose features and positions, for back-end scene understanding. At the same time, back-end scene understanding can feed back to front-end visual perception; the next-shot guidance information output jointly by the two is combined to obtain a more comprehensive shooting strategy.
The present invention further teaches a visual perception system for unmanned photography. As shown in Figure 1, the system includes a basic feature extraction module for obtaining the basic features of the target in the picture. Using a single, unified basic feature extraction module saves the subsequent per-task feature extraction, saving several times the time and memory space. The deep neural network parameters of the basic feature extraction module first come from a classification model trained on ImageNet; then, using transfer learning, the different modules are trained in turn to obtain a shared basic feature extraction network.
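The saving claimed for the unified extraction module can be shown schematically, with plain Python functions standing in for the neural network: features are computed once per frame and shared by every task head, rather than re-extracted per task. The backbone and head functions below are placeholders, not the patent's networks.

```python
class SharedBackbonePerception:
    """Computes basic features once per frame and feeds the same
    features to every task head (detection, pose, behavior, ...),
    instead of re-extracting features for each task."""

    def __init__(self, extract_features, heads):
        self.extract_features = extract_features  # shared "backbone"
        self.heads = heads                        # task name -> head function
        self.extraction_count = 0                 # how often the backbone ran

    def perceive(self, frame):
        self.extraction_count += 1                # exactly one extraction per frame
        features = self.extract_features(frame)
        return {name: head(features) for name, head in self.heads.items()}
```

With N task heads, a per-task design would run feature extraction N times per frame; here it runs once, which is the multiple-fold saving the description refers to.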
The system further includes a front-end visual perception module for receiving the basic features, analyzing and processing them, and outputting guidance information for the next shot; and a back-end scene understanding module for receiving the basic features, analyzing and processing them, and outputting guidance information for the next shot.
The front-end visual perception module includes: a target detection module for determining the specific position of the target in the current frame; a pose estimation module for obtaining the pose features of the target and inferring the target's next move; and a target tracking module for determining and outputting the position and pose of the target in subsequent frames according to the target's historical position and pose distribution information.
The back-end scene understanding module includes: a behavior detection module for detecting the target's actions and obtaining the target's action information; a face recognition module for recognizing faces in the picture to provide reference information for target identification and subject selection; a gesture recognition module for detecting whether a gesture representing an operation instruction is present and, if so, performing the corresponding operation; and a focus perception module for estimating the focus of the current video from the action information obtained by behavior detection and the pose features obtained by pose estimation, automatically determining the shooting range, shooting angle, and shooting path, outputting a shooting strategy, and adjusting the camera terminal to the corresponding position to shoot.
Target detection can find the target's position and determine the skeleton position for pose estimation, providing auxiliary information. The target detection module also provides auxiliary information for the behavior detection module.
The pose estimation module provides auxiliary information for the target tracking module: it can obtain the positions of the target's body parts, providing more auxiliary information for resetting the tracker. The pose estimation module also provides auxiliary information for the behavior detection module: it can obtain the target's pose features, and since pose and behavior are linked, likely target behavior can be inferred from the pose features.
The pose estimation module provides auxiliary information for the focus perception module: the definition of focus depends to some extent on the target's pose.
The face recognition module provides auxiliary information for the target tracking module, similarly to the auxiliary information provided by the target detection module.
The behavior detection module provides auxiliary information for the focus perception module: the definition of focus depends to some extent on the target's behavior.
Finally, the output information of the front-end visual perception module and the back-end scene analysis module is combined to provide guidance information for the behavior of the support terminal and the camera terminal, thereby completing the shooting.
As shown in Figure 2, the target is set as a person and the support terminal is a gimbal. The current picture is shot and fed into the visual perception system of this method, which obtains information such as the person's position and pose, infers the person's next pose tendency, and tracks the person. The person's actions are also detected, and through the various focus perceptions a better shooting strategy is obtained. The gimbal's control system then controls the gimbal's turning direction and angle, realizing the steering and angle adjustment of the camera terminal. This repeats until shooting ends. Control can also be exercised through a person's gesture, a button press, or other signals.
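The repeated cycle in Figure 2 reduces to a perceive-plan-actuate loop. The sketch below is a skeleton under stated assumptions: `perceive`, `plan`, and the `gimbal` interface are placeholders for the modules described above, not the patent's actual components.

```python
def unmanned_shooting_loop(frames, perceive, plan, gimbal):
    """Skeleton of the repeated cycle in Figure 2: perceive the current
    frame, derive a shooting strategy from the perception result, and
    command the gimbal, until the video ends."""
    strategies = []
    for frame in frames:
        state = perceive(frame)   # position, pose, behavior, focus
        strategy = plan(state)    # e.g. desired pan/tilt adjustment
        gimbal(strategy)          # steer the camera terminal
        strategies.append(strategy)
    return strategies
```

The loop terminates when the frame source ends, matching "repeating this until video shooting ends."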
It should be noted that the above are only preferred embodiments of the present invention and are not intended to limit the invention. Although the present invention has been described in detail with reference to the embodiments, those skilled in the art may still modify the technical schemes described in the foregoing embodiments or make equivalent substitutions for some of their technical features; any modification, equivalent substitution, improvement, and the like made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.

Claims (8)

1. A visual perception method for unmanned photography, comprising the following steps:
S1, mounting a camera terminal on a support terminal, starting the camera terminal to shoot video of a target, and extracting the basic features of the target;
S2, analyzing the target's current behavior, pose, and position according to the basic features, outputting guidance information for the next shot, providing the guidance information to control the behavior of the camera terminal, and adjusting the camera terminal; repeating this until video shooting ends.
2. The visual perception method for unmanned photography according to claim 1, characterized in that step S2 specifically includes front-end visual perception, which comprises the following steps:
S201, target detection: determining the specific position of the target in the current frame;
S202, pose estimation: obtaining the pose features of the target and inferring the target's next move;
S203, target tracking: determining the position and pose of the target in subsequent frames according to the target's historical position and pose distribution information.
3. The visual perception method for unmanned photography according to claim 2, characterized in that in step S201, during target detection, image frames of the current footage are detected at a set frame interval to find the target's position in the picture, providing auxiliary information for pose estimation and target tracking.
4. The visual perception method for unmanned photography according to claim 3, characterized in that step S2 specifically includes back-end scene understanding, which comprises the following steps:
S211, behavior detection: detecting the target's actions and obtaining the target's action information;
S212, face recognition: recognizing faces in the picture to provide reference information for target identification and subject selection;
S213, gesture recognition: detecting whether a gesture representing an operation instruction is present and, if so, performing the corresponding operation;
S214, focus perception: estimating the focus of the current video from the action information obtained by behavior detection and the pose features obtained by pose estimation, automatically determining the shooting range, shooting angle, and shooting path, outputting a shooting strategy, and adjusting the camera terminal to the corresponding position to shoot.
5. The visual perception method for unmanned photography according to claim 4, characterized in that the information obtained by the face recognition of step S212 and by the gesture recognition of step S213 is also provided to the target tracking of step S203, which analyzes and processes it.
6. The visual perception method for unmanned photography according to claim 5, characterized in that the information obtained by the target detection of step S201 and by the pose estimation of step S202 is also provided to the behavior detection of step S211, which analyzes and processes it.
7. A visual perception system for unmanned photography, characterized in that the system includes a basic feature extraction module for obtaining the basic features of the target in the picture;
a front-end visual perception module for receiving the basic features, analyzing and processing them, and outputting guidance information for the next shot; and
a back-end scene understanding module for receiving the basic features, analyzing and processing them, and outputting guidance information for the next shot.
8. The visual perception system for unmanned photography according to claim 7, characterized in that the front-end visual perception module includes:
a target detection module for determining the specific position of the target in the current frame;
a pose estimation module for obtaining the pose features of the target and inferring the target's next move; and
a target tracking module for determining the position and pose of the target in subsequent frames according to the target's historical position and pose distribution information;
and the back-end scene understanding module includes: a behavior detection module for detecting the target's actions and obtaining the target's action information;
a face recognition module for recognizing faces in the picture to provide reference information for target identification and subject selection;
a gesture recognition module for detecting whether a gesture representing an operation instruction is present and, if so, performing the corresponding operation; and
a focus perception module for estimating the focus of the current video from the action information obtained by behavior detection and the pose features obtained by pose estimation, automatically determining the shooting range, shooting angle, and shooting path, outputting a shooting strategy, and adjusting the camera terminal to the corresponding position to shoot.
CN201711098855.2A 2017-11-09 2017-11-09 Visual perception method and system for unmanned photography Pending CN107749951A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711098855.2A CN107749951A (en) 2017-11-09 2017-11-09 Visual perception method and system for unmanned photography

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711098855.2A CN107749951A (en) 2017-11-09 2017-11-09 Visual perception method and system for unmanned photography

Publications (1)

Publication Number Publication Date
CN107749951A true CN107749951A (en) 2018-03-02

Family

ID=61250890

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711098855.2A Pending CN107749951A (en) 2017-11-09 2017-11-09 Visual perception method and system for unmanned photography

Country Status (1)

Country Link
CN (1) CN107749951A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573617A (en) * 2013-10-28 2015-04-29 季春宏 Video shooting control method
CN105718862A (en) * 2016-01-15 2016-06-29 北京市博汇科技股份有限公司 Method, device and recording-broadcasting system for automatically tracking teacher via single camera
CN105847681A (en) * 2016-03-30 2016-08-10 乐视控股(北京)有限公司 Shooting control method, device and system
CN105979147A (en) * 2016-06-22 2016-09-28 上海顺砾智能科技有限公司 Intelligent shooting method of unmanned aerial vehicle
CN106454108A (en) * 2016-11-04 2017-02-22 北京百度网讯科技有限公司 Tracking shooting method and apparatus based on artificial intelligence, and electronic device


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111801932A (en) * 2018-08-31 2020-10-20 华为技术有限公司 Image shooting method and device
CN112639652A (en) * 2020-05-07 2021-04-09 深圳市大疆创新科技有限公司 Target tracking method and device, movable platform and imaging platform
WO2021223171A1 (en) * 2020-05-07 2021-11-11 深圳市大疆创新科技有限公司 Target tracking method and apparatus, movable platform, and imaging platform


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180302