CN109862313A - A kind of video concentration method and device - Google Patents
- Publication number
- CN109862313A (CN201811518639.3A)
- Authority
- CN
- China
- Prior art keywords
- video
- moving target
- concentrated
- cluster
- target
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
This application discloses a video concentration method and device. The method comprises: obtaining a video to be concentrated that includes multiple moving targets; selecting a moving target combination mode for each condensed image, where the condensed images are the frame images obtained after the video to be concentrated is concentrated; and performing video concentration on the video to be concentrated according to the moving target combination mode in each condensed image. The application can thus select a reasonable moving target combination mode for each frame of condensed image, so that the moving targets in each condensed image occupy the image space to the greatest extent while the overlap between different moving targets in that image is kept as small as possible, which improves the concentration precision of the concentrated video.
Description
Technical field
This application relates to the technical field of video processing, and in particular to a video concentration method and device.
Background technique
With growing public awareness of security, the safety of homes, campuses, traffic and public facilities is receiving more and more attention, and a massive amount of surveillance video is produced as a result. Because of limits on storage capacity and the complexity of scenes, surveillance video is extremely difficult to store, retrieve and screen. How to obtain useful information quickly and effectively from massive video is a very important problem to be solved, and video concentration technology emerged in response.
Video concentration is a concise summary of the content of an original video. While retaining all moving targets of the original video (such as people and vehicles), it rearranges the moving target sequences in time and space and removes unnecessary background information to obtain a concentrated video, so that an originally long video (for example, 2 hours) can be shown in a short time (for example, within 10 minutes). This greatly improves the efficiency of analyzing massive surveillance video.
Existing video concentration methods mainly obtain moving target sequences through target detection, where a moving target sequence refers to all frame images in which the same moving target appears in the original video. A method based on trajectory fusion is then used to fuse the moving target sequences into a background picture to obtain the concentrated video. However, the concentration density of existing methods (the number of moving targets in each frame image of the concentrated video) is fixed, so the moving targets of the original video cannot be packed into the picture content to the greatest extent and in a reasonable way, and the concentration precision of the concentrated video is insufficient.
Summary of the invention
The main purpose of the embodiments of this application is to provide a video concentration method and device that can improve the concentration precision of the concentrated video.
An embodiment of this application provides a video concentration method, comprising:
obtaining a video to be concentrated, the video to be concentrated including multiple moving targets;
selecting a moving target combination mode for each condensed image, where the condensed images are the frame images obtained after the video to be concentrated is concentrated;
performing video concentration on the video to be concentrated according to the moving target combination mode in each condensed image.
Optionally, selecting a moving target combination mode for each condensed image comprises:
determining the occupied area and occupied position of each moving target in each frame image of the video to be concentrated;
selecting a moving target combination mode for each condensed image according to the occupied area and occupied position of each moving target.
Optionally, selecting a moving target combination mode for each condensed image according to the occupied area and occupied position of each moving target comprises:
obtaining initial groups by putting the moving targets that overlap in the video to be concentrated into one group, each initial group including at least one moving target;
adjusting the initial groups by performing event detection on the video to be concentrated, to obtain first target groups;
selecting a moving target combination mode for each condensed image according to the occupied area and occupied position of each moving target and the first target groups.
Optionally, obtaining initial groups by putting the moving targets that overlap in the video to be concentrated into one group comprises:
extracting the cluster feature corresponding to each moving target from the video to be concentrated;
putting the overlapping moving targets into one group according to the cluster feature corresponding to each moving target, to obtain the initial groups.
Optionally, the cluster feature includes at least one of the following items for the corresponding moving target in the video to be concentrated: moving direction information, moving speed information, position information at first appearance, and time information at first appearance.
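The four kinds of cluster feature listed above can be illustrated with a minimal sketch. It assumes each moving target is available as a list of per-frame centre points; the `Observation` type, the `cluster_feature` function, the use of net displacement for direction and speed, and the default frame rate are all hypothetical choices, not taken from the source.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    frame: int  # frame index in the video to be concentrated
    x: float    # centre x of the moving target in that frame
    y: float    # centre y of the moving target in that frame


def cluster_feature(track, fps=25.0):
    """Return (direction, speed, first_x, first_y, first_time) for one target.

    Direction and speed are estimated from the net displacement between the
    first and last observation (an assumption made for this sketch).
    """
    first, last = track[0], track[-1]
    dx, dy = last.x - first.x, last.y - first.y
    elapsed = (last.frame - first.frame) / fps      # seconds on screen
    direction = math.atan2(dy, dx)                  # radians
    speed = math.hypot(dx, dy) / elapsed if elapsed > 0 else 0.0
    return (direction, speed, first.x, first.y, first.frame / fps)
```

In practice the components would probably need to be normalized before clustering, since pixels, seconds and radians are on different scales.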
Optionally, putting the overlapping moving targets into one group according to the cluster feature corresponding to each moving target, to obtain the initial groups, comprises:
clustering the moving targets according to their cluster features once for each of M different preset cluster numbers, to obtain M clustering results;
for each of the M clustering results, calculating the error between the cluster feature of each moving target in the clustering result and the cluster centre feature of its cluster, and calculating the sum of the squares of these errors, to obtain the error sum of squares corresponding to the clustering result;
selecting the clustering result with the minimum value among the error sums of squares of the M clustering results, and taking each cluster in the selected clustering result as an initial group.
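The selection among M clustering results can be sketched as follows. The source does not name a clustering algorithm, so plain k-means with deterministic seeding is used here as a stand-in; `kmeans`, `initial_groups` and their parameters are hypothetical names.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, iterations=20):
    """Tiny k-means; returns (clusters of point indices, error sum of squares)."""
    centers = [points[i * len(points) // k] for i in range(k)]  # deterministic seeds
    dim = len(points[0])
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(idx)
        for c, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster became empty
                centers[c] = tuple(
                    sum(points[i][d] for i in members) / len(members)
                    for d in range(dim)
                )
    sse = sum(
        sq_dist(points[i], centers[c])
        for c, members in enumerate(clusters)
        for i in members
    )
    return clusters, sse


def initial_groups(points, cluster_numbers):
    """Cluster once per preset cluster number and keep the minimum-error result."""
    best_clusters, _ = min(
        (kmeans(points, k) for k in cluster_numbers), key=lambda r: r[1]
    )
    return [group for group in best_clusters if group]
```

The error sum of squares tends to fall as the cluster number grows, so the result depends heavily on which M cluster numbers are preset.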
Optionally, adjusting the initial groups by performing event detection on the video to be concentrated, to obtain first target groups, comprises:
dividing the video to be concentrated into multiple first video segments;
counting the number of moving targets in each first video segment;
performing event detection in each first video segment whose counted number reaches a preset number threshold;
if it is detected that moving targets from different initial groups participate in the same event in a first video segment, merging the moving targets of those different initial groups into the same group, and taking each group after merging and each initial group that was not merged as a first target group.
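The counting and thresholding steps above can be sketched as follows, assuming each moving target is described by the first and last frame in which it appears and that the first video segments have equal length. The function name, the dictionary layout and the equal-length assumption are hypothetical.

```python
def segments_for_event_detection(appearances, segment_length, threshold):
    """Return indices of first video segments whose target count reaches the threshold.

    appearances: dict mapping target id -> (first_frame, last_frame), inclusive.
    segment_length: number of frames per first video segment (assumed constant).
    """
    counts = {}
    for first, last in appearances.values():
        # A target is counted once in every segment that its lifetime touches.
        for seg in range(first // segment_length, last // segment_length + 1):
            counts[seg] = counts.get(seg, 0) + 1
    return sorted(seg for seg, n in counts.items() if n >= threshold)
```

Only the returned segments would then be handed to the event detector, which keeps the more expensive detection step away from sparsely populated parts of the video.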
Optionally, selecting a moving target combination mode for each condensed image according to the occupied area and occupied position of each moving target and the first target groups comprises:
segmenting the video to be concentrated to obtain second video segments;
determining the first target groups that appear in the second video segment, and combining those first target groups in different ways to obtain second target groups;
for each second target group, determining the occupied area and occupied position of each moving target of the second target group in each frame image of the second video segment;
selecting one grouping mode among the second target groups according to the occupied area and occupied position of each moving target of each second target group, and determining, according to the selected grouping mode, the moving target combination mode corresponding to each condensed image obtained by concentrating the second video segment.
Optionally, segmenting the video to be concentrated comprises:
segmenting the video to be concentrated according to the total number of frames of the video to be concentrated and the number of first target groups.
Optionally, selecting one grouping mode among the second target groups according to the occupied area and occupied position of each moving target of each second target group comprises:
generating a target function value for each second target group, and selecting the second target group with the minimum target function value;
where the target function value is generated from a scene utilization rate and an overlap loss rate; the scene utilization rate reflects the degree to which the occupied areas of the moving targets in the corresponding second target group occupy the background image, and the scene utilization rate is inversely proportional to the target function value; the overlap loss rate reflects the degree of overlap between the moving targets in the corresponding second target group caused by their occupied areas and occupied positions in the background image.
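A minimal sketch of such a selection for a single frame follows. The exact target function is not given here, so the weighted difference below is only an assumed form in which higher scene utilization lowers the value and more overlap raises it. Bounding boxes `(x, y, w, h)` stand in for the occupied area and position, and the names `objective`, `select_group` and `weight` are hypothetical.

```python
from itertools import combinations


def area(box):
    _, _, w, h = box
    return w * h


def intersection(a, b):
    """Overlapping area of two (x, y, w, h) boxes, 0 if they do not overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return w * h if w > 0 and h > 0 else 0


def objective(boxes, bg_w, bg_h, weight=1.0):
    """Assumed form: weight * overlap loss rate - scene utilization rate."""
    total = sum(area(b) for b in boxes)
    overlap = sum(intersection(a, b) for a, b in combinations(boxes, 2))
    # Pairwise approximation of the covered share of the background image.
    utilization = min(1.0, max(0.0, (total - overlap) / (bg_w * bg_h)))
    overlap_loss = overlap / total if total else 0.0
    return weight * overlap_loss - utilization


def select_group(candidates, bg_w, bg_h):
    """candidates: dict name -> list of boxes; returns the minimum-value name."""
    return min(candidates, key=lambda n: objective(candidates[n], bg_w, bg_h))
```

A fuller version would accumulate the value over every frame image of the second video segment rather than over one frame.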
An embodiment of this application also provides a video concentration device, comprising:
a to-be-concentrated video acquisition unit, for obtaining a video to be concentrated, the video to be concentrated including multiple moving targets;
a target combination selecting unit, for selecting a moving target combination mode for each condensed image, where the condensed images are the frame images obtained after the video to be concentrated is concentrated;
a video concentration unit, for performing video concentration on the video to be concentrated according to the moving target combination mode in each condensed image.
Optionally, the target combination selecting unit includes:
a target data obtaining subunit, for determining the occupied area and occupied position of each moving target in each frame image of the video to be concentrated;
a target combination selecting subunit, for selecting a moving target combination mode for each condensed image according to the occupied area and occupied position of each moving target.
Optionally, the target combination selecting subunit includes:
an initial grouping subunit, for obtaining initial groups by putting the moving targets that overlap in the video to be concentrated into one group, each initial group including at least one moving target;
an event detection subunit, for adjusting the initial groups by performing event detection on the video to be concentrated, to obtain first target groups;
a combination selecting subunit, for selecting a moving target combination mode for each condensed image according to the occupied area and occupied position of each moving target and the first target groups.
Optionally, the initial grouping subunit includes:
a cluster feature extracting subunit, for extracting the cluster feature corresponding to each moving target from the video to be concentrated;
an initial group forming subunit, for putting the overlapping moving targets into one group according to the cluster feature corresponding to each moving target, to obtain the initial groups.
Optionally, the cluster feature includes at least one of the following items for the corresponding moving target in the video to be concentrated: moving direction information, moving speed information, position information at first appearance, and time information at first appearance.
Optionally, the initial group forming subunit includes:
a target clustering subunit, for clustering the moving targets according to their cluster features once for each of M different preset cluster numbers, to obtain M clustering results;
a sum-of-squares calculating subunit, for calculating, for each of the M clustering results, the error between the cluster feature of each moving target in the clustering result and the cluster centre feature of its cluster, and calculating the sum of the squares of these errors, to obtain the error sum of squares corresponding to the clustering result;
a clustering result selecting subunit, for selecting the clustering result with the minimum value among the error sums of squares of the M clustering results, and taking each cluster in the selected clustering result as an initial group.
Optionally, the event detection subunit includes:
a first dividing subunit, for dividing the video to be concentrated into multiple first video segments;
a number counting subunit, for counting the number of moving targets in each first video segment;
an event detection subunit, for performing event detection in each first video segment whose counted number reaches a preset number threshold;
a target grouping subunit, for merging the moving targets of different initial groups into the same group if it is detected that moving targets from those different initial groups participate in the same event in a first video segment, and for taking each group after merging and each initial group that was not merged as a first target group.
Optionally, the combination selecting subunit includes:
a second dividing subunit, for segmenting the video to be concentrated to obtain second video segments;
a group combining subunit, for determining the first target groups that appear in the second video segment, and combining those first target groups in different ways to obtain second target groups;
an occupancy determining subunit, for determining, for each second target group, the occupied area and occupied position of each moving target of the second target group in each frame image of the second video segment;
a mode determining subunit, for selecting one grouping mode among the second target groups according to the occupied area and occupied position of each moving target of each second target group, and for determining, according to the selected grouping mode, the moving target combination mode corresponding to each condensed image obtained by concentrating the second video segment.
Optionally, the second dividing subunit is specifically used for segmenting the video to be concentrated according to the total number of frames of the video to be concentrated and the number of first target groups.
Optionally, the mode determining subunit is specifically used for generating a target function value for each second target group, and selecting the second target group with the minimum target function value;
where the target function value is generated from a scene utilization rate and an overlap loss rate; the scene utilization rate reflects the degree to which the occupied areas of the moving targets in the corresponding second target group occupy the background image, and the scene utilization rate is inversely proportional to the target function value; the overlap loss rate reflects the degree of overlap between the moving targets in the corresponding second target group caused by their occupied areas and occupied positions in the background image.
An embodiment of this application also provides a video concentration device, comprising: a processor, a memory and a system bus;
the processor and the memory are connected by the system bus;
the memory is used to store one or more programs, the one or more programs including instructions that, when executed by the processor, cause the processor to execute any implementation of the above video concentration method.
An embodiment of this application also provides a computer readable storage medium storing instructions that, when run on a terminal device, cause the terminal device to execute any implementation of the above video concentration method.
An embodiment of this application also provides a computer program product that, when run on a terminal device, causes the terminal device to execute any implementation of the above video concentration method.
The video concentration method and device provided by the embodiments of this application obtain a video to be concentrated that includes multiple moving targets, and then select a moving target combination mode for each condensed image, where the condensed images are the frame images obtained after the video to be concentrated is concentrated, so that video concentration is performed on the video to be concentrated according to the moving target combination mode in each condensed image. A reasonable moving target combination mode can thus be selected for each frame of condensed image, so that the moving targets in each condensed image occupy the image space to the greatest extent while the overlap between different moving targets in that image is as small as possible, which improves the concentration precision of the concentrated video.
Brief description of the drawings
To explain the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of this application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of the video concentration method provided by an embodiment of this application;
Fig. 2 is a flow diagram of determining the moving target combination mode provided by an embodiment of this application;
Fig. 3 is a flow diagram of cluster-based grouping provided by an embodiment of this application;
Fig. 4 is a schematic diagram of grouping adjustment based on event detection provided by an embodiment of this application;
Fig. 5 is a schematic diagram of moving target density statistics provided by an embodiment of this application;
Fig. 6 is a schematic diagram of the composition of the video concentration device provided by an embodiment of this application.
Specific embodiment
To make the purposes, technical solutions and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some rather than all of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
First embodiment
Referring to Fig. 1, which is a flow diagram of the video concentration method provided in this embodiment, the method comprises the following steps:
S101: obtain a video to be concentrated, where the video to be concentrated includes multiple moving targets.
In this embodiment, the original video on which video concentration is to be performed is defined as the video to be concentrated, for example the surveillance video of some scene. The video to be concentrated can be any video that includes moving objects such as people and vehicles, and each moving object in the video to be concentrated is defined here as a moving target.
In this embodiment, the background image of the video to be concentrated can be extracted by background modeling. The background image is a clean background image obtained by removing the moving targets from the video to be concentrated, and the background modeling algorithm updates it continually. Under normal circumstances the background of the video to be concentrated hardly changes, so only the cleanest background image needs to be extracted for use when the subsequent step S103 performs video concentration.
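The background modeling algorithm is not named in this text. As an illustration only, a per-pixel temporal median is one simple way to obtain a background image free of moving targets when the background hardly changes; the function name and the grey-value frame layout below are assumptions.

```python
from statistics import median


def median_background(frames):
    """Estimate a background image as the per-pixel temporal median.

    frames: list of 2-D lists of grey values, all with the same shape.
    A moving target covers any given pixel in only a minority of frames,
    so the median at that pixel tends to be the background value.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[r][c] for frame in frames) for c in range(width)]
        for r in range(height)
    ]
```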
S102: select a moving target combination mode for each condensed image, where the condensed images are the frame images obtained after the video to be concentrated is concentrated.
In this embodiment, a reasonable moving target combination mode can be selected for each frame of condensed image, so that the moving targets in each condensed image occupy the image space to the greatest extent while the overlap between different moving targets in that image is as small as possible, which improves the concentration precision of the concentrated video.
In one implementation of this embodiment, step S102 may include steps S1021-S1022:
S1021: determine the occupied area and occupied position of each moving target in each frame image of the video to be concentrated.
In this embodiment, the moving target sequences in the video to be concentrated can be extracted by a target detection and tracking algorithm. A moving target is an object capable of motion, such as a person or a vehicle, and a moving target sequence consists of all frame images in which the same moving target appears in the video to be concentrated; that is, one moving target corresponds to one moving target sequence. Since the video to be concentrated may contain multiple moving targets, multiple moving target sequences are obtained from it, and the length of each moving target sequence is the total number of frames in which the corresponding moving target appears in the video to be concentrated.
Note that because a target tracking algorithm is used to track the moving targets, a moving target that leaves the video to be concentrated and later appears in it again is still treated as the same moving target.
To perform video concentration on the video to be concentrated, for each moving target in the video, the occupied area and occupied position of the moving target in the background image can be determined for each frame image of its moving target sequence, that is, where in the background image the moving target is located and how much space it occupies at that position.
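As a minimal sketch of this step, suppose the detector yields a binary foreground mask for one moving target in one frame (an assumption; the source does not say how targets are represented). The area can then be taken as the pixel count and the position as the bounding box; `occupancy` is a hypothetical name.

```python
def occupancy(mask):
    """Return (area, bounding box) for one target in one frame.

    mask: 2-D list of 0/1 values, 1 where the moving target covers the pixel.
    The bounding box is (x, y, width, height) in pixel coordinates, or None
    when the target is absent from the frame.
    """
    pixels = [
        (c, r)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    if not pixels:
        return 0, None
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    box = (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
    return len(pixels), box
```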
S1022: select a moving target combination mode for each condensed image according to the occupied area and occupied position of each moving target, where the condensed images are the frame images obtained after the video to be concentrated is concentrated.
In this embodiment, for each frame image of the video to be concentrated, it is determined where in that frame each of its moving targets is located and how much space it occupies there, that is, the occupied area and occupied position of each moving target in each frame image of the video to be concentrated. The purpose is that during video concentration, when different moving targets from different frame images are fused into the same condensed image, the moving targets in the condensed image occupy as much of the background space as possible and overlap each other as little as possible. Here "overlap" means that two or more moving targets overlap in the same image, for example the bodies of two people overlapping in the picture.
In this embodiment, based on the occupied area and occupied position of each moving target in each frame image, and taking into account the period during which each moving target appears in the video to be concentrated, a reasonable moving target combination mode can be selected for each frame of condensed image so that the image space presented by the background image is used to the greatest extent and in a reasonable way. The moving targets in each condensed image then occupy the image space to the greatest extent while the overlap between different moving targets in that image is as small as possible; that is, the scene utilization rate of each condensed image is as high as possible and its overlap loss rate is as small as possible.
To this end, in one implementation of this embodiment, referring to the flow diagram of determining the moving target combination mode shown in Fig. 2, step S1022 may comprise the following steps S201-S203:
S201: obtain initial groups by putting the moving targets that overlap in the video to be concentrated into one group, where each initial group includes at least one moving target.
Here "overlap" means that two or more moving targets overlap in the same frame image of the video to be concentrated. For example, when two people walk together hand in hand or shoulder to shoulder, their bodies overlap in the video to be concentrated. Putting overlapping moving targets into one group speeds up the subsequent grouping. Moreover, concentrating overlapping moving targets together during video concentration helps improve the scene utilization rate, whereas concentrating them separately may reduce the concentration precision and may also cause the relative positions of two overlapping moving targets in the concentrated video to differ from their actual relative positions in the video to be concentrated.
In this embodiment, the cluster feature of each moving target can be extracted from the video to be concentrated, specifically from the moving target sequence of each moving target. A clustering algorithm then puts all the overlapping moving targets in the video to be concentrated into one group. Each group obtained is defined here as an initial group, and each initial group may include one or more moving targets.
Note that for the specific implementation of step S201, see the related description in the second embodiment.
S202: adjust the initial groups by performing event detection on the video to be concentrated, to obtain first target groups.
In this embodiment, the grouping result may be inaccurate if the initial groups are obtained by the clustering algorithm alone, because the clustering algorithm may fail to assign all the moving targets in an event such as a large, long-lasting crowd gathering (for example, many people at some activity) to the same initial group. To obtain a more accurate grouping result, the moving targets within and between the initial groups need to be screened once more. To this end, the video to be concentrated can be divided by time into multiple video segments, each defined here as a first video segment. The number of moving targets in each first video segment is counted; when the counted number exceeds a preset threshold, event detection is performed on that first video segment, and the initial groups are adjusted according to the detection result, for example by merging two initial groups into one. Each group after merging and each initial group that was not merged is defined here as a first target group.
Note that for the specific implementation of step S202, see the related description in the second embodiment.
Of course, step S202 may also be skipped, so that the initial groups are not adjusted and are used directly as the first target groups.
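The merging in step S202 can be sketched with a union-find structure, assuming the event detector reports each event as the set of moving targets that take part in it. That output format and the name `merge_groups` are assumptions. Passing an empty event list corresponds to skipping the adjustment and using the initial groups directly.

```python
def merge_groups(initial_groups, events):
    """Merge initial groups whose targets take part in the same event.

    initial_groups: list of sets of target ids.
    events: list of sets of target ids, one set per detected event.
    Returns the first target groups as a list of sets.
    """
    parent = list(range(len(initial_groups)))

    def find(i):
        # Follow parent links to the representative group, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    group_of = {t: g for g, members in enumerate(initial_groups) for t in members}
    for participants in events:
        roots = [find(group_of[t]) for t in participants if t in group_of]
        for r in roots[1:]:
            parent[find(r)] = find(roots[0])

    merged = {}
    for g, members in enumerate(initial_groups):
        merged.setdefault(find(g), set()).update(members)
    return list(merged.values())
```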
S203: select a moving target combination mode for each condensed image according to the occupied area and occupied position of each moving target and the first target groups.
In this implementation, to use the image space presented by the background image to the greatest extent and in a reasonable way, a reasonable moving target combination mode needs to be selected for each frame of condensed image, so that the moving targets in each condensed image occupy the image space to the greatest extent while the overlap between different moving targets in that image is as small as possible; that is, the scene utilization rate of each condensed image is as high as possible and its overlap loss rate is as small as possible.
In one concrete implementation, step S203 may comprise the following steps A1-A5:
Step A1: segment the video to be concentrated to obtain second video segments.
It can be understood that the longer the video to be concentrated, the larger the search space of moving targets. Considering the efficiency of concentrating the video, the video to be concentrated can be concentrated segment by segment, and the concentration results of the segments are finally spliced together to obtain the final concentrated video. In one implementation, the whole video to be concentrated can be segmented based on the number of first target groups obtained in step S202; each video segment obtained after segmentation is defined here as a second video segment.
Specifically, step A1 can segment the video to be concentrated according to its total number of frames and the number of first target groups, thereby obtaining the second video segments.
The specific segmentation formula is as follows:
[Formula (1): equation not preserved in this text]
where T is the number of segments of the video to be concentrated; α is an optimization parameter that needs to be determined according to the actual scene; Fn is the total number of frames of the video to be concentrated; and Nc is the number of first target groups, that is, the total count of first target groups.
If the value given by formula (1) is an integer, that integer can be used as the segment number T; if it is not an integer, either its integer part or its integer part plus 1 can be used as the segment number T.
Then the video to be concentrated is segmented according to the segment number T to obtain T second video segments.
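The rounding rule and the split into T segments can be sketched as follows. The value produced by formula (1) is taken as an input, since the formula itself is not reproduced here. Equal-length segments, the lower bound of one segment, and the names `segment_count` and `split_frames` are assumptions of this sketch.

```python
import math


def segment_count(value, round_up=False):
    """Turn the value given by formula (1) into an integer segment number T.

    An integer value is used as is; otherwise the integer part is used,
    or the integer part plus 1 when round_up is True.
    """
    if value == int(value):
        return max(1, int(value))
    return max(1, math.floor(value) + (1 if round_up else 0))


def split_frames(total_frames, t):
    """Split frame indices [0, total_frames) into t near-equal half-open ranges."""
    base, extra = divmod(total_frames, t)
    bounds, start = [], 0
    for i in range(t):
        length = base + (1 if i < extra else 0)  # spread the remainder evenly
        bounds.append((start, start + length))
        start += length
    return bounds
```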
Step A2: for each second video segment, determine the first target groups that appear in the second video segment, and combine those first target groups in different ways to obtain second target groups.
After it will be divided into each second video-frequency band to concentration video, can find out appear in it is each in the second video-frequency band
A first object grouping, it should be noted that if the different motion target in the grouping of some first object appears in two differences
The second video-frequency band in, for example the grouping of some first object includes moving target A and moving target B, but moving target A and movement
Target B is respectively appeared in the second different video-frequency band of adjacent two, then all movements being grouped such first object
Target belongs to one of them second video-frequency band.
After each first object grouping in the second video-frequency band has been determined, next, these first objects are grouped
Be combined according to preset different modes, for example, by any two first object packet assembling be one group, based on these first
The number of targeted packets may can choose all or part of combination, here, by each there are many combination
Combined result under combination is defined as the second targeted packets.
It specifically, can be that the second video-frequency band sets an initial concentration densities as d, the size of d indicates the second target
The group number for the first object grouping for including in grouping, is grouped based on concentration densities d;Later with concentration densities d+1 progress
Grouping, then be grouped with concentration densities d+2 ... ..., until concentration densities reach the first object in second video-frequency band point
Until the total number packets of group.
For example, suppose a second video segment contains three first target groupings AB, CD, and EFG, where A, B, C, D, E, F, and G each represent an independent moving target. These three first target groupings can be treated as three moving targets X, Y, and Z, so that all moving targets in one first target grouping are treated as a single moving target during video concentration, where X represents AB, Y represents CD, and Z represents EFG. If the initial concentration density is set to 2, the three moving targets X, Y, and Z are combined in pairs, giving XY, XZ, and YZ. The concentration density is then increased from 2 to 3, and X, Y, and Z are combined into one group, XYZ. Since the current concentration density 3 equals the total number of first target groupings, 3, the combination process ends, yielding the four second target groupings XY, XZ, YZ, and XYZ.
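The enumeration by increasing concentration density can be sketched with the standard library. The function name `second_target_groupings` is hypothetical, and the sketch enumerates every combination at each density, whereas the text allows choosing only part of them.

```python
from itertools import combinations


def second_target_groupings(first_groupings, initial_density=2):
    """Enumerate candidate second target groupings.

    Starts at the initial concentration density d and increases it by 1
    until it equals the number of first target groupings in the segment.
    """
    results = []
    for density in range(initial_density, len(first_groupings) + 1):
        results.extend(combinations(first_groupings, density))
    return results


# X, Y, Z stand for the first target groupings AB, CD, EFG of the example.
candidates = second_target_groupings(["X", "Y", "Z"])
```

For the example above, `candidates` holds the pairs XY, XZ, YZ followed by the triple XYZ.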
Step A3: for each second target grouping, determine the occupied area and occupied position of each moving target of the second target grouping in each frame image of the second video segment.
Step A4: according to the occupied area and occupied position of each moving target of each second target grouping, select one grouping mode from the second target groupings.
Based on the occupied areas and occupied positions corresponding to each second target grouping, a given objective function can be used to compute the objective function value of that second target grouping. The smaller the objective function value, the higher the scene utilization rate and the smaller the duplication loss rate of the moving targets in that second target grouping.
The scene utilization rate of a second target grouping reflects how much of the background image is occupied by the moving targets of that second target grouping. Specifically, the scene utilization rate may refer to the proportion of the union of the areas of all moving target images to the union of the scene areas of the condensed images, after the images of the different moving targets in the second target grouping have been pasted into the condensed images corresponding to the second video segment. The larger this proportion, the higher the scene utilization rate.
The duplication loss rate of a second target grouping reflects the degree of overlap between moving targets caused by the occupied areas and occupied positions of the moving targets of that grouping in the background image. Specifically, the duplication loss rate may refer to the proportion of the overlapping area of the different moving target images of the second target grouping in the second video segment to the total area of the moving target images. The larger this value, the higher the duplication loss rate.
Therefore, when a grouping mode is selected in step A4, that is, when one second target grouping is selected, the objective function value of each second target grouping can be generated, and the second target grouping with the minimum objective function value is selected. The objective function value of a second target grouping is generated from its scene utilization rate and duplication loss rate: the scene utilization rate is inversely proportional to the objective function value, and the duplication loss rate is directly proportional to it.
Specifically, the objective function value of a second target grouping can be computed according to the following formula:
where U/S denotes the scene utilization rate and IoU denotes the duplication loss rate.
In formula (2), the U term denotes, for the t-th second target grouping, the union of the areas of its moving target images within the frame images they belong to (frame images of the second video segment). For example, suppose the second target grouping includes first target grouping AB and first target grouping CD, where A, B, C, and D each represent a different moving target, and suppose AB appears in frames 1-5 of the second video segment and CD appears in frames 6-15. The union of the areas of the A image and the B image in frame 1 is taken, the union of the areas of the C image and the D image in frame 6 is taken, and the union of these two union areas gives U1. Likewise, the union of the A and B images in frame 2 and the union of the C and D images in frame 7 are combined to give U2, and so on, until the union of the A and B images in frame 5 and the union of the C and D images in frame 10 are combined to give U5. Each moving target can be enclosed by a rectangular box that frames the edge of the moving target image, and the area of the rectangular box is taken as the image area of that moving target. Finally, the union of the five unions U1, U2, ..., U5 gives the U term in formula (2).
In formula (2), the S term denotes the union of the areas of the frame images (frame images of the second video segment) to which the moving targets of the t-th second target grouping belong. Continuing the previous example, since all frame images of the second video segment have the same size, assume the area of each is s; the S term in formula (2) is then 5s.
In formula (2), the IoU term denotes the ratio of the area intersection to the area union of the moving targets in the t-th second target grouping. Continuing the previous example, compute the area union U_AB of the A image and the B image over frames 1-5, and the area union U_CD of the C image and the D image over frames 6-10. Taking the union of U_AB and U_CD gives Ub, and taking their intersection gives Uj. The ratio Uj/Ub of the intersection area to the union area is then used as the IoU term in formula (2).
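The two ingredients of the objective function can be illustrated on rectangular boxes. The sketch below is only one reading of the description: it assumes integer pixel boxes `(x, y, width, height)`, treats each first target grouping as one box per frame, and interprets the combination of U1, ..., U5 as a sum of per-frame union areas so that it is comparable with S = 5s. All function names are hypothetical, and formula (2) itself is not computed.

```python
def cells(box):
    """Set of unit pixels covered by an axis-aligned box (x, y, width, height)."""
    x, y, w, h = box
    return {(i, j) for i in range(x, x + w) for j in range(y, y + h)}


def union_cells(boxes):
    """Pixels covered by at least one of the boxes."""
    covered = set()
    for box in boxes:
        covered |= cells(box)
    return covered


def scene_utilization(boxes_per_frame, frame_area):
    """U/S: covered area summed over condensed frames, divided by total frame area.

    Summing the per-frame unions is an assumption of this sketch.
    """
    used = sum(len(union_cells(boxes)) for boxes in boxes_per_frame)
    return used / (frame_area * len(boxes_per_frame))


def footprint_iou(boxes_a, boxes_b):
    """Uj/Ub for two groupings: intersection over union of their footprints."""
    a, b = union_cells(boxes_a), union_cells(boxes_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two condensed frames of a 10x10 scene, one box per grouping per frame.
ab = [(0, 0, 4, 4), (1, 0, 4, 4)]
cd = [(3, 0, 4, 4), (6, 0, 4, 4)]
frames = [[ab[0], cd[0]], [ab[1], cd[1]]]
utilization = scene_utilization(frames, frame_area=100)
overlap = footprint_iou(ab, cd)
```

Rasterizing boxes into pixel sets keeps the union and intersection arithmetic obvious at the cost of speed; a production system would more likely work with masks or interval arithmetic.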
Therefore, the objective function value of each second target grouping of the second video segment can be computed according to formula (2), and the second target grouping with the minimum value can be assigned to the second video segment.
For example, continuing the example in step A2: first, with the initial concentration density set to 2, the three moving targets X, Y, and Z are combined in pairs, and the value of the objective function F is computed for each combination. Suppose F is 0.7 for combination XY, 0.5 for combination YZ, and 0.6 for combination XZ. The minimum of the three F values and its combination are recorded, i.e., the minimum F value is 0.5 and the corresponding combination is YZ. The concentration density is then increased from 2 to 3, X, Y, and Z are combined into one group, and F is computed for this combination. Suppose F is 0.35 for combination XYZ. This F value and its combination are recorded, i.e., the F value is 0.35 and the corresponding combination is XYZ. Since the total number of first target groupings is 3 and the current concentration density is also 3, the computation of F values ends.
Finally, the minimum values of F obtained at concentration densities 2 and 3 are compared, i.e., the F values of combinations YZ and XYZ, 0.5 and 0.35. The comparison shows that F is smallest for combination XYZ, so this combination can be assigned to the second video segment, and a queue to be concentrated is formed from it in order to concentrate the second video segment. That is, the queue to be concentrated consists of X+Y+Z = AB+CD+EFG, so its final length is 7, i.e., 7 independent moving targets. The moving target sequences of these 7 moving targets can then be placed in the queue and concentrated simultaneously, with the sequences aligned at their first images and concentrated frame by frame.
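The selection in this example can be sketched as a two-level minimum: the best combination per concentration density, then the best across densities. The objective function is passed in as a callable because formula (2) is not restated here; `select_grouping`, `example_f`, and `members` are hypothetical names, and the F values are the illustrative ones from the example.

```python
from itertools import combinations


def select_grouping(first_groupings, objective, initial_density=2):
    """Return (F value, combination) with the smallest objective value."""
    best_per_density = []
    for density in range(initial_density, len(first_groupings) + 1):
        best = min(combinations(first_groupings, density), key=objective)
        best_per_density.append((objective(best), best))
    return min(best_per_density, key=lambda item: item[0])


# Illustrative F values from the example; not computed from real footage.
example_f = {
    ("X", "Y"): 0.7,
    ("X", "Z"): 0.6,
    ("Y", "Z"): 0.5,
    ("X", "Y", "Z"): 0.35,
}
value, chosen = select_grouping(["X", "Y", "Z"], example_f.__getitem__)

# Expand the chosen combination into the queue of individual moving targets.
members = {"X": "AB", "Y": "CD", "Z": "EFG"}
queue = [target for name in chosen for target in members[name]]
```

With these values the chosen combination is XYZ and the queue holds the 7 moving targets A to G.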
Step A5: according to the selected grouping mode, determine the moving target combination corresponding to each condensed image obtained by concentrating the second video segment.
After one grouping mode has been selected from the second target groupings of the second video segment, the first target groupings contained in the selected second target grouping can be determined. The moving target sequence of each of these first target groupings can then be obtained (every frame image in such a sequence belongs to the second video segment), and the sequences are numbered in the same way, e.g., frame 1, frame 2, and so on. The moving targets in the image frames with the same frame number across the sequences are then taken as the moving targets of the same condensed image. Note that the moving target sequences differ in length. If one moving target sequence finishes before the others and the second video segment contains other first target groupings outside the selected combination, such a first target grouping continues to be concentrated together with the remaining moving target sequences, until the moving targets in all frame images of the second video segment have been concentrated.
For example, suppose the selected second target grouping includes all moving targets in the second video segment, as in the example in step A4, i.e., three first target groupings. Then the moving targets in frame 1 of the three corresponding moving target sequences all become moving targets of the 1st condensed image, the moving targets in frame 2 of the three sequences all become moving targets of the 2nd condensed image, and so on. It can be understood that, since the three moving target sequences may have different numbers of frames, the number of moving targets in later condensed images may decrease.
As another example, suppose the selected second target grouping includes only part of the moving targets in the second video segment, for instance first target grouping AB and first target grouping CD from the example in step A4. Then the moving targets in frame 1 of the two corresponding moving target sequences all become moving targets of the 1st condensed image, the moving targets in frame 2 of the two sequences all become moving targets of the 2nd condensed image, and so on. Note that the two sequences may have different numbers of frames. For example, if first target grouping AB has 5 frames and first target grouping CD has 10 frames, then after the moving targets in frame 5 of the sequences of AB and CD have been taken as the moving targets of the 5th condensed image, the moving targets in frame 1 of first target grouping EFG and the moving targets in frame 6 of the sequence of CD are together taken as the moving targets of the 6th condensed image. This continues until every moving target image of the second video segment has been assigned to a condensed image.
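The frame alignment in these two examples can be sketched as a small scheduler. This is an illustrative reading only: sequences are represented by their frame counts, `condensed_frame_plan` is a hypothetical name, and the rule that a waiting grouping starts as soon as any active sequence finishes is an assumption drawn from the AB/CD/EFG example.

```python
from collections import deque


def condensed_frame_plan(selected, remaining):
    """Plan which sequence frames go into each condensed image.

    selected / remaining: dicts mapping a first target grouping name to the
    number of frames (at least 1) in its moving target sequence.
    Returns one list per condensed image of (grouping, frame number) pairs.
    """
    lengths = {**selected, **remaining}
    active = {name: 0 for name in selected}  # grouping -> frames already used
    pending = deque(remaining)               # groupings waiting for a free slot
    plan = []
    while active:
        # Every active sequence contributes its next frame to this image.
        plan.append([(name, used + 1) for name, used in active.items()])
        for name in list(active):
            active[name] += 1
            if active[name] == lengths[name]:
                del active[name]
                # Assumption: a finished sequence lets a waiting grouping start.
                if pending:
                    active[pending.popleft()] = 0
    return plan


plan = condensed_frame_plan({"AB": 5, "CD": 10}, {"EFG": 3})
```

In this plan the 6th condensed image combines frame 6 of CD with frame 1 of EFG, as in the example above.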
S103: according to the moving target combination in each condensed image, perform video concentration on the video to be concentrated.
In this embodiment, the second video segments of the video to be concentrated can be concentrated in parallel. Moreover, when each second video segment is concentrated, the moving target combinations of the condensed images can be determined one after another, and each condensed image is formed as soon as its moving target combination has been determined, which effectively increases concentration speed.
To obtain each condensed frame, the image of each moving target in the corresponding moving target combination can be pasted into the corresponding position in the background image (the same position as in the second video segment). This yields a series of condensed images, which are spliced in the order of their generation to obtain the concentrated video segment of the second video segment. To fuse the moving target images into one condensed image, a fusion algorithm such as the Poisson fusion algorithm can be used.
Since each second video segment of the video to be concentrated corresponds to one concentrated video segment, the concentrated video segments are spliced in the playing order of the second video segments to obtain the final concentrated video.
In summary, the video concentration method provided in this embodiment obtains a video to be concentrated that includes multiple moving targets and selects a moving target combination for each condensed image, where the condensed images are the frame images obtained after the video to be concentrated is concentrated, so that video concentration is performed on the video to be concentrated according to the moving target combination in each condensed image. Thus, based on the occupied area and occupied position of each moving target in each frame image of the video to be concentrated, this embodiment can select a reasonable moving target combination for each condensed frame, so that the moving targets in each condensed frame occupy the image space to the greatest possible extent while overlapping each other as little as possible, which improves the concentration precision of the concentrated video.
Second embodiment
It should be noted that this embodiment introduces specific implementations of step S201 and step S202 of the first embodiment.
In this embodiment, referring to the flow diagram of clustering-based grouping shown in Fig. 3, step S201 of the first embodiment, "dividing the moving targets that overlap in the video to be concentrated into one group to obtain the initial groupings", may include the following steps S301-S302:
S301: extract the cluster feature corresponding to each moving target from the video to be concentrated.
In this embodiment, a clustering algorithm can be used to divide all overlapping moving targets in the video to be concentrated into one group, thereby obtaining the initial groupings, each of which includes one or more moving targets. To obtain the initial groupings by clustering, the cluster feature of each moving target must be extracted from the video to be concentrated, i.e., each moving target corresponds to one set of cluster features. The cluster feature may include one or more of the following for the corresponding moving target in the video to be concentrated: moving direction information, moving speed information, position information at first appearance, and time information at first appearance.
Specifically, after the moving target sequence of each moving target in the video to be concentrated has been obtained (that is, the frame images of the video to be concentrated that contain the corresponding moving target), a set of cluster features is extracted for each moving target from its moving target sequence.
One moving target is taken as an example, namely moving target 1 and its moving target sequence 1. The moving direction information and moving speed information of moving target sequence 1 are extracted, along with the position information and time information of the first appearance of moving target 1 in the video to be concentrated. The four extracted items, "moving direction information", "moving speed information", "position information", and "time information", form the cluster feature of moving target 1. The ways of obtaining these four items are introduced below:
(1) In each frame image of the video to be concentrated, each moving target can be enclosed in a rectangular box that frames the edge of the moving target image. For moving target sequence 1 of moving target 1, the offset of the center point coordinates of the rectangular box (the box containing moving target 1) between every two adjacent frame images of the sequence can then be computed. For example, if moving target sequence 1 has 5 frame images, 4 offsets are obtained. The offsets are averaged to obtain the mean offset (Δx, Δy), and a direction value derived from this mean offset is taken as the "moving direction information" of moving target 1 in the video to be concentrated.
(2) After the mean offset (Δx, Δy) has been computed in (1), a speed value derived from it is taken as the "moving speed information" of moving target 1 in the video to be concentrated.
(3) For moving target sequence 1 of moving target 1, the center point coordinates (x, y) of the rectangular box (the box containing moving target 1) in the first frame image of the sequence can be determined, and the coordinates (x, y) are taken as the "position information" of moving target 1 when it first appears in the video to be concentrated.
(4) The frame images of the video to be concentrated can be numbered consecutively in playing order in advance, so that each frame image has a frame number. For moving target sequence 1 of moving target 1, the frame number f of the first frame image of the sequence can then be determined and taken as the "time information" of moving target 1 when it first appears in the video to be concentrated.
This yields the cluster feature of moving target 1, which consists of the moving direction information, the moving speed information, the position (x, y), and the frame number f. Furthermore, the parameters in the cluster feature can also be normalized.
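A cluster feature of this kind could be computed from a sequence of bounding boxes roughly as follows. The exact expressions for direction and speed are not legible in this text, so the sketch assumes the angle `atan2(Δy, Δx)` for the direction and the Euclidean length of (Δx, Δy) for the speed; both choices, the `(x, y, width, height)` box layout, and the name `cluster_feature` are assumptions. Normalization is left out.

```python
import math


def cluster_feature(track):
    """Build (direction, speed, x, y, f) for one moving target.

    track: list of (frame_number, (x, y, width, height)) in playing order,
    one entry per frame image of the moving target sequence.
    """
    centers = [(x + w / 2, y + h / 2) for _, (x, y, w, h) in track]
    steps = list(zip(centers, centers[1:]))
    if steps:
        # Mean offset of the box center between adjacent frames.
        dx = sum(b[0] - a[0] for a, b in steps) / len(steps)
        dy = sum(b[1] - a[1] for a, b in steps) / len(steps)
    else:
        dx = dy = 0.0
    direction = math.atan2(dy, dx)  # assumed encoding of the direction
    speed = math.hypot(dx, dy)      # assumed encoding of the speed
    first_x, first_y = centers[0]   # center of the box at first appearance
    first_frame = track[0][0]       # frame number at first appearance
    return (direction, speed, first_x, first_y, first_frame)


# Five frames of a 2x2 box moving by (3, 4) pixels per frame from frame 12.
track = [(12 + i, (3 * i, 4 * i, 2, 2)) for i in range(5)]
feature = cluster_feature(track)
```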
Thus, the cluster feature of every moving target in the video to be concentrated can be obtained in the same way as for moving target 1.
S302: according to the cluster feature of each moving target, divide the overlapping moving targets into one group to obtain the initial groupings.
In this embodiment, based on the cluster features of the moving targets, a clustering algorithm can be used to divide all overlapping moving targets into one group, thereby obtaining the initial groupings. For example, the K-means clustering algorithm can be used for grouping.
In one implementation of this embodiment, step S302 may specifically include the following steps B1-B3:
Step B1: cluster the moving targets according to their cluster features under each of M different preset numbers of clusters, obtaining M clustering results.
Before clustering, the number of clusters must be preset, and clustering is then performed with that number. In this embodiment, several different numbers of clusters, say M of them, can be chosen, so that M clustering results are obtained, each with a different number of clusters. Note that this embodiment does not limit the value of M, as long as M is less than the total number of moving targets.
For example, suppose the total number of moving targets in the video to be concentrated is N (N ≥ 2) and the clustering algorithm is K-means. The initial number of clusters can first be set to N-1, i.e., the moving targets in the video to be concentrated are tentatively divided into N-1 groups. N-1 of the N cluster features are arbitrarily selected as cluster centers, and each remaining cluster feature is assigned, according to its similarity (distance) to these cluster centers, to the most similar cluster (the cluster represented by that cluster center). The cluster center of each new cluster is then computed (the mean of each dimension of all cluster features in the cluster). This process is repeated until clustering ends, yielding N-1 clusters. The number of clusters is then reset to N-2 and clustering is performed in the same way, yielding N-2 clusters, and so on. Finally, the number of clusters is set to N/2, yielding N/2 clusters. With these M cluster numbers N-1, N-2, ..., N/2, M clustering results are obtained.
Step B2: for each of the M clustering results, compute the error between the cluster feature of each moving target in the clustering result and the cluster center feature of its cluster, and compute the sum of the squares of these errors to obtain the error sum of squares of the clustering result.
After the M clustering results have been obtained in step B1, each clustering result contains multiple clusters, each with a cluster center. Within one clustering result, the cluster center feature of each cluster can be determined, the error between the cluster feature of each moving target in a cluster and the cluster center feature of that cluster is computed, and the sum of the squared errors over all clusters is taken as the error sum of squares of the clustering result. The smaller the error sum of squares, the better the clustering effect of the clustering result.
Step B3: select the clustering result with the minimum error sum of squares among the M clustering results, and take each cluster in the selected clustering result as an initial grouping.
With M clustering results there are M corresponding error sums of squares. Since a smaller error sum of squares indicates a better clustering effect, the smallest error sum of squares can be selected, i.e., the clustering result corresponding to it, and the moving targets in each cluster of that clustering result form one initial grouping.
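Steps B1-B3 can be sketched with a small, dependency-free K-means. This is an illustration under stated assumptions rather than the method as filed: the first k features seed the cluster centers (the text says they are chosen arbitrarily) so that the run is deterministic, N/2 is taken as integer division, and the names `kmeans`, `sq_dist`, and `initial_groupings` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain K-means; returns (labels, error sum of squares)."""
    centers = [list(p) for p in points[:k]]  # assumed deterministic seeding
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # New center: per-dimension mean of the cluster's features.
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, sse


def initial_groupings(points):
    """Try cluster counts N-1, N-2, ..., N/2 and keep the smallest SSE."""
    n = len(points)
    best = None
    for k in range(n - 1, max(1, n // 2) - 1, -1):
        labels, sse = kmeans(points, k)
        if best is None or sse < best[1]:
            best = (labels, sse, k)
    labels, _, k = best
    groups = [[i for i, lab in enumerate(labels) if lab == c] for c in range(k)]
    return [group for group in groups if group]


features = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 20), (20, 21)]
groups = initial_groupings(features)
```

Each returned group lists the indices of the moving targets in one initial grouping. Because the error sum of squares generally shrinks as the number of clusters grows, the range of candidate cluster counts strongly influences which result is selected.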
In this embodiment, referring to the schematic diagram of grouping adjustment based on event detection shown in Fig. 4, step S202 of the first embodiment, "adjusting the initial groupings by performing event detection on the video to be concentrated to obtain the first target groupings", may include the following steps S401-S403:
S401: divide the video to be concentrated into multiple first video segments, and count the number of moving targets in each first video segment.
In this embodiment, the video to be concentrated can be divided into multiple video segments by time, for example one video segment every 5 seconds. The duration of a video segment can be set based on experience or experimental results. Here, each video segment obtained by this division is defined as a first video segment. For each first video segment, the number of different moving targets appearing in it is counted.
It can be understood that counting the moving targets in each first video segment actually amounts to determining the density distribution of moving targets over the first video segments, which can be represented by a histogram or in other forms. For example, suppose the total duration of the video to be concentrated is 30 minutes and the duration of a first video segment is set to 5 s, so that 360 first video segments are obtained. The number of all different moving targets appearing in each first video segment is then counted. If a moving target appears in multiple frame images of one first video segment, it is counted only once. The statistical result is given here as a histogram, the moving target density statistics diagram shown in Fig. 5, where the abscissa indicates the index of each first video segment and the ordinate indicates the number of moving targets appearing in the first video segment with that index. For instance, "1" on the abscissa indicates the first of the first video segments, i.e., the video of the period 00:00:00 to 00:00:05 in the video to be concentrated, and the corresponding ordinate "1" indicates that 1 moving target appears in it. In the same way, a histogram of the number of moving targets in each first video segment can be drawn.
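The per-segment count behind such a histogram can be sketched as follows. The input layout (a mapping from a target identifier to the 0-based frame indices in which it appears), the frame rate parameter, and the function name `targets_per_segment` are assumptions made for illustration.

```python
def targets_per_segment(appearances, fps, segment_seconds=5):
    """Count distinct moving targets in each first video segment.

    appearances: dict mapping a target id to the 0-based frame indices in
    which that target appears. A target seen in several frames of the same
    segment is counted once for that segment.
    """
    frames_per_segment = int(fps * segment_seconds)
    seen = {}
    for target, frames in appearances.items():
        for frame in frames:
            seen.setdefault(frame // frames_per_segment, set()).add(target)
    if not seen:
        return []
    return [len(seen.get(index, ())) for index in range(max(seen) + 1)]


# At 2 frames per second, each 5 s segment spans 10 frames.
counts = targets_per_segment(
    {"a": range(0, 12), "b": [3, 4], "c": [25]}, fps=2
)
```

The resulting list gives the histogram heights in segment order, so segments whose count reaches a preset threshold can be picked out with a simple comparison.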
S402: perform event detection in each first video segment whose counted number reaches a preset number threshold.
The number of moving targets appearing in each first video segment can be compared with a preset number threshold (which can be set based on experience or experimental results). If the number of moving targets is greater than or equal to the threshold, event detection can be performed in the corresponding first video segment. The events to be detected may be preset event types such as crowd gathering, running, dropping a package, and fighting.
In this embodiment, the purpose of event detection is to assign all moving targets participating in the same event (such as a crowd gathering event) to the same group, so as to preserve the integrity of the moving targets in the same event. In addition, event detection can further increase the number of moving targets in each group, which saves more time in subsequent video concentration.
Specifically, when event detection is performed on a first video segment whose number of moving targets reaches the preset number threshold, optical flow information can be extracted from several consecutive frame images of that first video segment (the number of frame images can be set based on experience or experimental results). That is, a feature value is computed for each pixel of each of these frame images, pixels whose feature value exceeds a preset threshold are taken as feature points, and the displacement information of these feature points is taken as the optical flow information. The optical flow information is then input into a pre-built convolutional neural network (CNN), and the CNN model analyzes the motion of the moving targets in the first video segment based on the optical flow information to predict whether the first video segment contains a preset event such as crowd gathering.
S403: if detecting, the moving target in the first video-frequency band in different initial packets participates in same event, will
Moving target in different initial packets merges into same group, by each grouping after merging and without merging it is each just
Begin to be grouped, be grouped respectively as first object.
Event detection is carried out by step S402, the purpose is to the moving target for participating in same event is merged into one
Group, that is, for not in multiple moving targets of the same initial packet, if it is same to detect that they take part in by event detection
A event, the then each moving target for needing to belong to these different initial packets merge into one group, in this way, by some initial point
Each grouping that group merges that treated and each initial packet for not merging processing, respectively as one first
Targeted packets, in this way, just carried out grouping adjustment to the obtained each initial packet of cluster, so as to form one or more the
One targeted packets.
For example, as shown in Figure 5, it is assumed that preset number threshold value is 10, then it is assumed that be greater than 10 institute in moving target number
Interested event (the interested event can be any type of event of assembling a crowd) generation may be had by having in the first video-frequency band,
If the types such as having detected interested event, for example having fought, then it will belong to the different initial packets of the same event
Each moving target be merged into same group.
It should be noted that, after event detection is performed on a first video segment, if it is determined that moving targets in two or more initial groups participate in the same event, but some moving targets of some of these initial groups do not appear in that first video segment, these initial groups are still merged into one group. For example, assume that some moving targets of initial group 1 and initial group 2 appear in the first video segment and take part in an event, while other moving targets of initial group 1 and/or initial group 2 do not appear in that first video segment. All moving targets of initial group 1 and initial group 2 are nevertheless merged into one group, which becomes one first target group.
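The merging rule described above can be sketched with a small union-find structure. This is only an illustrative sketch: the function name `merge_groups_by_event`, the dictionary layout of the initial groups, and the representation of a detected event as a set of participating target IDs are all assumptions made for the example and are not specified by this application.

```python
def merge_groups_by_event(initial_groups, events):
    """Merge initial groups whose members take part in the same event.

    initial_groups: dict mapping a group id to a non-empty set of target ids
                    (hypothetical layout).
    events:         list of sets; each set holds the ids of the targets that
                    were detected as participating in one event.
    Returns the first target groups as a list of sets of target ids.
    """
    parent = {g: g for g in initial_groups}

    def find(g):
        # Follow parent links to the representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    target_to_group = {t: g for g, members in initial_groups.items() for t in members}

    for participants in events:
        groups = [target_to_group[t] for t in participants if t in target_to_group]
        for g in groups[1:]:
            root_a, root_b = find(groups[0]), find(g)
            if root_a != root_b:
                parent[root_b] = root_a

    # Whole initial groups are merged, including members that did not
    # appear in the segment where the event was detected.
    merged = {}
    for g, members in initial_groups.items():
        merged.setdefault(find(g), set()).update(members)
    return sorted(merged.values(), key=min)


groups = {1: {"a", "b"}, 2: {"c", "d"}, 3: {"e"}}
first_target_groups = merge_groups_by_event(groups, [{"a", "c"}])
# first_target_groups == [{"a", "b", "c", "d"}, {"e"}]
```

In the usage lines, only targets "a" and "c" are seen in the event, yet "b" and "d" end up in the merged group as well, which mirrors the example of initial group 1 and initial group 2.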
In summary, this embodiment first groups the moving targets preliminarily by means of a clustering algorithm to obtain the initial groups, and then, through event detection, merges the initial groups that participate in the same event into one group to obtain the first target groups. In subsequent video concentration, the moving targets of the same first target group can be concentrated into the video together. That is, the user can see all moving targets of the same first target group at the same time in the same condensed image of the concentrated video, which improves the concentration effect.
3rd embodiment
This embodiment introduces a video concentration device; for related content, refer to the method embodiments above.
Referring to Fig. 6, which is a schematic structural diagram of the video concentration device provided in this embodiment, the device includes:
To-be-concentrated video acquisition unit 601, configured to obtain a video to be concentrated, the video to be concentrated including multiple moving targets;
Target combination selection unit 602, configured to select a moving target combination for each condensed image, each condensed image being a frame image obtained after the video to be concentrated is concentrated;
Video concentration unit 603, configured to perform video concentration on the video to be concentrated according to the moving target combination of each condensed image.
In an implementation of this embodiment, the target combination selection unit 602 includes:
Target data acquisition subunit, configured to determine the occupied area and occupied position of each moving target in each frame image of the video to be concentrated;
Target combination selection subunit, configured to select a moving target combination for each condensed image according to the occupied area and occupied position of each moving target.
In an implementation of this embodiment, the target combination selection subunit includes:
Initial grouping subunit, configured to divide the overlapping moving targets in the video to be concentrated into one group to obtain initial groups, each initial group including at least one moving target;
Event detection subunit, configured to perform event detection on the video to be concentrated and adjust the grouping of the initial groups to obtain first target groups;
Combination selection subunit, configured to select a moving target combination for each condensed image according to the occupied area and occupied position of each moving target and the first target groups.
In an implementation of this embodiment, the initial grouping subunit includes:
Cluster feature extraction subunit, configured to extract the cluster feature corresponding to each moving target from the video to be concentrated;
Initial group forming subunit, configured to divide the overlapping moving targets into one group according to the cluster feature corresponding to each moving target, to obtain the initial groups.
In an implementation of this embodiment, the cluster feature includes at least one of the following for the corresponding moving target in the video to be concentrated: moving direction information, moving speed information, position information at first appearance, and time information at first appearance.
In an implementation of this embodiment, the initial group forming subunit includes:
Target clustering subunit, configured to cluster the moving targets according to their respective cluster features once for each of M different preset cluster numbers, to obtain M clustering results;
Sum-of-squares calculation subunit, configured to, for each of the M clustering results, calculate the error between the cluster feature of each moving target in the clustering result and the cluster center feature of the clustering result, and calculate the sum of the squares of these errors, to obtain the error sum of squares corresponding to the clustering result;
Clustering result selection subunit, configured to select, from the error sums of squares corresponding to the M clustering results, the clustering result corresponding to the minimum value, and to take each cluster in the selected clustering result as an initial group.
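The selection among M clustering results by error sum of squares can be sketched as follows. The application does not name a specific clustering algorithm here, so the plain k-means below, its deterministic initialization, the function names, and the tuple layout of the cluster features are all assumptions chosen to keep the example self-contained.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as starting centers (assumes k <= len(points))."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centers


def select_initial_groups(features, cluster_numbers):
    """Cluster once per preset cluster number and keep the result with the smallest SSE.

    features:        dict mapping a target id to its cluster feature vector
                     (hypothetical layout, e.g. direction, speed, first position).
    cluster_numbers: the M preset cluster numbers to try.
    Returns (chosen cluster number, its SSE, list of initial groups).
    """
    ids = list(features)
    points = [features[i] for i in ids]
    best = None
    for k in cluster_numbers:
        labels, centers = kmeans(points, k)
        # Error of each target is measured against the center of its own cluster.
        sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
        if best is None or sse < best[0]:
            best = (sse, k, labels)
    sse, k, labels = best
    groups = [[i for i, lab in zip(ids, labels) if lab == c] for c in range(k)]
    return k, sse, [g for g in groups if g]


features = {"a": (0, 0), "b": (0, 1), "c": (10, 10), "d": (10, 11)}
best_k, best_sse, initial_groups = select_initial_groups(features, [1, 2])
```

Because the error sum of squares generally shrinks as the cluster number grows, the set of preset cluster numbers passed in effectively bounds how finely the targets can be split.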
In an implementation of this embodiment, the event detection subunit includes:
First division subunit, configured to divide the video to be concentrated into multiple first video segments;
Number counting subunit, configured to count the number of moving targets in each of the divided first video segments;
Event detection subunit, configured to perform event detection in each first video segment whose counted number reaches the preset number threshold;
Target grouping subunit, configured to, if it is detected that moving targets belonging to different initial groups in a first video segment participate in the same event, merge the moving targets of those initial groups into one group, and take each merged group and each initial group that was not merged as a first target group.
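The division and counting steps that decide where event detection runs can be sketched as below. The fixed segment length, the representation of each moving target by its first and last frame, and the reading of "reaches the threshold" as "greater than or equal to" are assumptions made for illustration only.

```python
def segments_needing_detection(appearances, total_frames, segment_len, threshold):
    """Return the first video segments whose target count reaches the threshold.

    appearances: dict mapping a target id to (first_frame, last_frame), inclusive
                 (hypothetical layout).
    Each result entry is (start_frame, end_frame, target_count).
    """
    selected = []
    for start in range(0, total_frames, segment_len):
        end = min(start + segment_len, total_frames) - 1
        # A target is counted if its lifetime intersects the segment.
        count = sum(
            1 for first, last in appearances.values()
            if first <= end and last >= start
        )
        if count >= threshold:
            selected.append((start, end, count))
    return selected


appearances = {"a": (0, 50), "b": (10, 20), "c": (120, 180)}
busy = segments_needing_detection(appearances, total_frames=200, segment_len=100, threshold=2)
# busy == [(0, 99, 2)]
```

Only the segments returned here would be passed to the event detection model, which keeps the more expensive analysis away from sparsely populated parts of the video.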
In an implementation of this embodiment, the combination selection subunit includes:
Second division subunit, configured to segment the video to be concentrated to obtain second video segments;
Group combination subunit, configured to determine the first target groups that appear in a second video segment, and to combine the appearing first target groups in different ways to obtain second target groups;
Occupation determination subunit, configured to determine, for each second target group, the occupied area and occupied position of each moving target of the second target group in each frame image of the second video segment;
Mode determination subunit, configured to select one grouping mode among the second target groups according to the occupied area and occupied position of each moving target of each second target group, and to determine, according to the selected grouping mode, the moving target combination corresponding to each condensed image obtained by concentrating the second video segment.
In an implementation of this embodiment, the second division subunit is specifically configured to segment the video to be concentrated according to the total number of frames of the video to be concentrated and the number of first target groups.
In an implementation of this embodiment, the mode determination subunit is specifically configured to generate the objective function value corresponding to each second target group, and to select the second target group corresponding to the minimum objective function value;
Wherein the objective function value is generated according to a scene utilization rate and an overlap loss rate; the scene utilization rate reflects the degree to which the occupied areas of the moving targets in the corresponding second target group occupy the background image, and the scene utilization rate is inversely proportional to the objective function value; the overlap loss rate reflects the degree of overlap between the moving targets in the corresponding second target group that is caused by their occupied areas and occupied positions in the background image.
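One way to picture the selection by minimum objective function value is the single-frame sketch below. The exact form of the objective function is not reproduced here; the additive combination, the `overlap_weight` parameter, the use of axis-aligned boxes `(x, y, width, height)` lying inside the frame, and all function names are assumptions. The sketch only preserves the two stated tendencies: higher scene utilization lowers the value, and more overlap raises it.

```python
def covered_cells(boxes):
    """Set of pixel cells covered by at least one box (x, y, width, height)."""
    cells = set()
    for x, y, w, h in boxes:
        cells.update((cx, cy) for cx in range(x, x + w) for cy in range(y, y + h))
    return cells


def objective(boxes, frame_w, frame_h, overlap_weight=1.0):
    """Illustrative objective: lower is better (assumed form, not the patented formula)."""
    total_area = sum(w * h for _, _, w, h in boxes)
    union_area = len(covered_cells(boxes))
    # Scene utilization: share of the background image occupied by targets.
    utilization = union_area / (frame_w * frame_h)
    # Overlap loss: share of target area lost because targets cover each other.
    overlap_loss = (total_area - union_area) / total_area if total_area else 0.0
    return overlap_weight * overlap_loss + (1.0 - utilization)


def pick_grouping(candidates, frame_w, frame_h):
    """Return the name of the candidate grouping with the smallest objective value."""
    return min(candidates, key=lambda name: objective(candidates[name], frame_w, frame_h))


candidates = {
    "overlapping": [(0, 0, 4, 4), (2, 2, 4, 4)],
    "separated": [(0, 0, 4, 4), (5, 5, 4, 4)],
}
chosen = pick_grouping(candidates, frame_w=10, frame_h=10)
# chosen == "separated"
```

A fuller version would aggregate the value over every frame image of the second video segment rather than a single frame, since the occupied areas and positions are determined per frame.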
Further, an embodiment of this application also provides a video concentration device, including: a processor, a memory, and a system bus;
The processor and the memory are connected through the system bus;
The memory is configured to store one or more programs, the one or more programs including instructions which, when executed by the processor, cause the processor to execute any implementation of the above video concentration method.
Further, an embodiment of this application also provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a terminal device, they cause the terminal device to execute any implementation of the above video concentration method.
Further, an embodiment of this application also provides a computer program product which, when run on a terminal device, causes the terminal device to execute any implementation of the above video concentration method.
From the description of the above embodiments, those skilled in the art can clearly understand that all or part of the steps of the methods in the above embodiments can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network communication device such as a media gateway) to execute the methods described in the embodiments of this application or in certain parts of the embodiments.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may refer to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
It should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (21)
1. A video concentration method, characterized by comprising:
obtaining a video to be concentrated, the video to be concentrated including multiple moving targets;
selecting a moving target combination for each condensed image, each condensed image being a frame image obtained after the video to be concentrated is concentrated;
performing video concentration on the video to be concentrated according to the moving target combination of each condensed image.
2. The method according to claim 1, characterized in that selecting a moving target combination for each condensed image comprises:
determining the occupied area and occupied position of each moving target in each frame image of the video to be concentrated;
selecting a moving target combination for each condensed image according to the occupied area and occupied position of each moving target.
3. The method according to claim 2, characterized in that selecting a moving target combination for each condensed image according to the occupied area and occupied position of each moving target comprises:
dividing the overlapping moving targets in the video to be concentrated into one group to obtain initial groups, each initial group including at least one moving target;
performing event detection on the video to be concentrated and adjusting the grouping of the initial groups to obtain first target groups;
selecting a moving target combination for each condensed image according to the occupied area and occupied position of each moving target and the first target groups.
4. The method according to claim 3, characterized in that dividing the overlapping moving targets in the video to be concentrated into one group to obtain initial groups comprises:
extracting the cluster feature corresponding to each moving target from the video to be concentrated;
dividing the overlapping moving targets into one group according to the cluster feature corresponding to each moving target, to obtain the initial groups.
5. The method according to claim 4, characterized in that the cluster feature includes at least one of the following for the corresponding moving target in the video to be concentrated: moving direction information, moving speed information, position information at first appearance, and time information at first appearance.
6. The method according to claim 4, characterized in that dividing the overlapping moving targets into one group according to the cluster feature corresponding to each moving target, to obtain the initial groups, comprises:
clustering the moving targets according to their respective cluster features once for each of M different preset cluster numbers, to obtain M clustering results;
for each of the M clustering results, calculating the error between the cluster feature of each moving target in the clustering result and the cluster center feature of the clustering result, and calculating the sum of the squares of these errors, to obtain the error sum of squares corresponding to the clustering result;
selecting, from the error sums of squares corresponding to the M clustering results, the clustering result corresponding to the minimum value, and taking each cluster in the selected clustering result as an initial group.
7. The method according to claim 3, characterized in that performing event detection on the video to be concentrated and adjusting the grouping of the initial groups to obtain first target groups comprises:
dividing the video to be concentrated into multiple first video segments;
counting the number of moving targets in each of the divided first video segments;
performing event detection in each first video segment whose counted number reaches the preset number threshold;
if it is detected that moving targets belonging to different initial groups in a first video segment participate in the same event, merging the moving targets of those initial groups into one group, and taking each merged group and each initial group that was not merged as a first target group.
8. The method according to any one of claims 3 to 7, characterized in that selecting a moving target combination for each condensed image according to the occupied area and occupied position of each moving target and the first target groups comprises:
segmenting the video to be concentrated to obtain second video segments;
determining the first target groups that appear in a second video segment, and combining the appearing first target groups in different ways to obtain second target groups;
for each second target group, determining the occupied area and occupied position of each moving target of the second target group in each frame image of the second video segment;
selecting one grouping mode among the second target groups according to the occupied area and occupied position of each moving target of each second target group, and determining, according to the selected grouping mode, the moving target combination corresponding to each condensed image obtained by concentrating the second video segment.
9. The method according to claim 8, characterized in that segmenting the video to be concentrated comprises:
segmenting the video to be concentrated according to the total number of frames of the video to be concentrated and the number of first target groups.
10. The method according to claim 8, characterized in that selecting one grouping mode among the second target groups according to the occupied area and occupied position of each moving target of each second target group comprises:
generating the objective function value corresponding to each second target group, and selecting the second target group corresponding to the minimum objective function value;
wherein the objective function value is generated according to a scene utilization rate and an overlap loss rate; the scene utilization rate reflects the degree to which the occupied areas of the moving targets in the corresponding second target group occupy the background image, and the scene utilization rate is inversely proportional to the objective function value; the overlap loss rate reflects the degree of overlap between the moving targets in the corresponding second target group that is caused by their occupied areas and occupied positions in the background image.
11. A video concentration device, characterized by comprising:
a to-be-concentrated video acquisition unit, configured to obtain a video to be concentrated, the video to be concentrated including multiple moving targets;
a target combination selection unit, configured to select a moving target combination for each condensed image, each condensed image being a frame image obtained after the video to be concentrated is concentrated;
a video concentration unit, configured to perform video concentration on the video to be concentrated according to the moving target combination of each condensed image.
12. The device according to claim 11, characterized in that the target combination selection unit includes:
a target data acquisition subunit, configured to determine the occupied area and occupied position of each moving target in each frame image of the video to be concentrated;
a target combination selection subunit, configured to select a moving target combination for each condensed image according to the occupied area and occupied position of each moving target.
13. The device according to claim 12, characterized in that the target combination selection subunit includes:
an initial grouping subunit, configured to divide the overlapping moving targets in the video to be concentrated into one group to obtain initial groups, each initial group including at least one moving target;
an event detection subunit, configured to perform event detection on the video to be concentrated and adjust the grouping of the initial groups to obtain first target groups;
a combination selection subunit, configured to select a moving target combination for each condensed image according to the occupied area and occupied position of each moving target and the first target groups.
14. The device according to claim 13, characterized in that the initial grouping subunit includes:
a cluster feature extraction subunit, configured to extract the cluster feature corresponding to each moving target from the video to be concentrated;
an initial group forming subunit, configured to divide the overlapping moving targets into one group according to the cluster feature corresponding to each moving target, to obtain the initial groups.
15. The device according to claim 14, characterized in that the initial group forming subunit includes:
a target clustering subunit, configured to cluster the moving targets according to their respective cluster features once for each of M different preset cluster numbers, to obtain M clustering results;
a sum-of-squares calculation subunit, configured to, for each of the M clustering results, calculate the error between the cluster feature of each moving target in the clustering result and the cluster center feature of the clustering result, and calculate the sum of the squares of these errors, to obtain the error sum of squares corresponding to the clustering result;
a clustering result selection subunit, configured to select, from the error sums of squares corresponding to the M clustering results, the clustering result corresponding to the minimum value, and to take each cluster in the selected clustering result as an initial group.
16. The device according to claim 13, characterized in that the event detection subunit includes:
a first division subunit, configured to divide the video to be concentrated into multiple first video segments;
a number counting subunit, configured to count the number of moving targets in each of the divided first video segments;
an event detection subunit, configured to perform event detection in each first video segment whose counted number reaches the preset number threshold;
a target grouping subunit, configured to, if it is detected that moving targets belonging to different initial groups in a first video segment participate in the same event, merge the moving targets of those initial groups into one group, and take each merged group and each initial group that was not merged as a first target group.
17. The device according to any one of claims 13 to 16, characterized in that the combination selection subunit includes:
a second division subunit, configured to segment the video to be concentrated to obtain second video segments;
a group combination subunit, configured to determine the first target groups that appear in a second video segment, and to combine the appearing first target groups in different ways to obtain second target groups;
an occupation determination subunit, configured to determine, for each second target group, the occupied area and occupied position of each moving target of the second target group in each frame image of the second video segment;
a mode determination subunit, configured to select one grouping mode among the second target groups according to the occupied area and occupied position of each moving target of each second target group, and to determine, according to the selected grouping mode, the moving target combination corresponding to each condensed image obtained by concentrating the second video segment.
18. The device according to claim 17, characterized in that the mode determination subunit is specifically configured to generate the objective function value corresponding to each second target group, and to select the second target group corresponding to the minimum objective function value;
wherein the objective function value is generated according to a scene utilization rate and an overlap loss rate; the scene utilization rate reflects the degree to which the occupied areas of the moving targets in the corresponding second target group occupy the background image, and the scene utilization rate is inversely proportional to the objective function value; the overlap loss rate reflects the degree of overlap between the moving targets in the corresponding second target group that is caused by their occupied areas and occupied positions in the background image.
19. A video concentration device, characterized by comprising: a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs including instructions which, when executed by the processor, cause the processor to perform the method according to any one of claims 1 to 10.
20. A computer-readable storage medium, characterized in that instructions are stored in the computer-readable storage medium, and when the instructions are run on a terminal device, they cause the terminal device to perform the method according to any one of claims 1 to 10.
21. A computer program product, characterized in that, when the computer program product is run on a terminal device, it causes the terminal device to perform the method according to any one of claims 1 to 10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811518639.3A CN109862313B (en) | 2018-12-12 | 2018-12-12 | Video concentration method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109862313A true CN109862313A (en) | 2019-06-07 |
CN109862313B CN109862313B (en) | 2022-01-14 |
Family
ID=66891091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811518639.3A Active CN109862313B (en) | 2018-12-12 | 2018-12-12 | Video concentration method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109862313B (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103079117A (en) * | 2012-12-30 | 2013-05-01 | 信帧电子技术(北京)有限公司 | Video abstract generation method and video abstract generation device |
US20160029031A1 (en) * | 2014-01-24 | 2016-01-28 | National Taiwan University Of Science And Technology | Method for compressing a video and a system thereof |
CN103957472A (en) * | 2014-04-10 | 2014-07-30 | 华中科技大学 | Timing-sequence-keeping video summary generation method and system based on optimal reconstruction of events |
CN105469425A (en) * | 2015-11-24 | 2016-04-06 | 上海君是信息科技有限公司 | Video condensation method |
CN106856577A (en) * | 2015-12-07 | 2017-06-16 | 北京航天长峰科技工业集团有限公司 | The video abstraction generating method of multiple target collision and occlusion issue can be solved |
CN106937120A (en) * | 2015-12-29 | 2017-07-07 | 北京大唐高鸿数据网络技术有限公司 | Object-based monitor video method for concentration |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110191324A (en) * | 2019-06-28 | 2019-08-30 | Oppo广东移动通信有限公司 | Image processing method, device, server and storage medium |
CN110796664A (en) * | 2019-10-14 | 2020-02-14 | 北京字节跳动网络技术有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
CN110796664B (en) * | 2019-10-14 | 2023-05-23 | 北京字节跳动网络技术有限公司 | Image processing method, device, electronic equipment and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109862313B (en) | 2022-01-14 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | TR01 | Transfer of patent right | Effective date of registration: 20220414. Address after: 230088 8th-10th floor, iFLYTEK building, 666 Wangjiang West Road, hi tech Zone, Hefei City, Anhui Province. Patentee after: IFLYTEK INTELLIGENT SYSTEM Co.,Ltd. Address before: NO.666, Wangjiang West Road, hi tech Zone, Hefei City, Anhui Province. Patentee before: IFLYTEK Co.,Ltd. |