CN107993217A - Video data real-time processing method and device, computing device - Google Patents


Info

Publication number
CN107993217A
CN107993217A (application CN201711405700.9A; granted as CN107993217B)
Authority
CN
China
Prior art keywords
layer
current frame
frame image
neural network
video data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711405700.9A
Other languages
Chinese (zh)
Other versions
CN107993217B (en)
Inventor
董健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201711405700.9A priority Critical patent/CN107993217B/en
Publication of CN107993217A publication Critical patent/CN107993217A/en
Application granted granted Critical
Publication of CN107993217B publication Critical patent/CN107993217B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/50 - Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20212 - Image combination
    • G06T2207/20221 - Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a video data real-time processing method and device, and a computing device. The method processes the frame images of video data in groups, and includes: acquiring the current frame image in the video in real time; judging whether the current frame image is the 1st frame image of any group; if so, inputting the current frame image into a neural network to obtain the processed current frame image; if not, inputting the current frame image into the neural network, and after computing up to the i-th convolutional layer of the neural network to obtain the operation result of the i-th convolutional layer, acquiring the operation result of the j-th deconvolutional layer obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, and directly performing image fusion on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolutional layer to obtain the processed current frame image, wherein i and j are natural numbers; outputting the processed current frame image; and repeating the above steps until the processing of all frame images in the video data is completed.

Description

Video data real-time processing method and device, computing device
Technical field
The present invention relates to the field of image processing, and in particular to a video data real-time processing method and device, and a computing device.
Background art
With the development of science and technology, image capture devices keep improving. Video recorded with an image capture device is clearer, and its resolution and display effect are also greatly improved. The recorded video can further be processed according to the various personalized demands of users.
When processing video, the prior art often treats each frame image in the video as a single frame image, without considering the continuity between adjacent frames in the video. Such processing requires every frame to be processed in full, so the processing speed is relatively slow and more time is needed.
Therefore, a video data real-time processing method is needed to improve the speed of real-time video processing.
Summary of the invention
In view of the above problems, the present invention is proposed in order to provide a video data real-time processing method and device, and a computing device, which overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a video data real-time processing method is provided. The method processes the frame images contained in video data in groups, and includes:
acquiring in real time the current frame image in the video being shot and/or recorded by an image capture device; or, acquiring in real time the current frame image in the video currently being played;
judging whether the current frame image is the 1st frame image of any group;
if so, inputting the current frame image into a trained neural network, and obtaining the processed current frame image after the operations of all convolutional layers and deconvolutional layers of the neural network;
if not, inputting the current frame image into the trained neural network; after computing up to the i-th convolutional layer of the neural network to obtain the operation result of the i-th convolutional layer, acquiring the operation result of the j-th deconvolutional layer obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, and directly performing image fusion on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolutional layer to obtain the processed current frame image; wherein i and j are natural numbers;
outputting the processed current frame image;
repeating the above steps until the processing of all frame images in the video data is completed.
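The grouped processing steps above can be sketched as a simple control loop. This is an illustrative sketch only; `run_full_network` and `run_partial_network` are hypothetical stand-ins (here trivial arithmetic) for the neural network passes described in the method, and fixed groups of `group_size` frames are assumed:

```python
def process_video(frames, group_size=3):
    """Process frames in groups: a full network pass for the 1st frame
    of each group, a partial pass plus fusion reuse for the rest."""
    cached_deconv = None  # j-th deconvolution result of the group's 1st frame
    outputs = []
    for idx, frame in enumerate(frames):
        if idx % group_size == 0:          # 1st frame image of a group
            processed, cached_deconv = run_full_network(frame)
        else:                              # reuse the cached deconv result
            processed = run_partial_network(frame, cached_deconv)
        outputs.append(processed)          # output the processed current frame
    return outputs

# Hypothetical stand-ins so the sketch runs end to end.
def run_full_network(frame):
    return frame + 1, frame * 10   # (processed frame, cached j-th deconv result)

def run_partial_network(frame, cached):
    return frame + cached          # "image fusion" placeholder
```

The loop mirrors the claimed steps: acquire, judge whether the frame opens a group, take the full or the partial path, output, and repeat until the video ends.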
Optionally, after it is judged that the current frame image is not the 1st frame image of any group, the method further includes:
calculating the frame distance between the current frame image and the 1st frame image of the group to which it belongs;
determining the values of i and j according to the frame distance; wherein the layer distance between the i-th convolutional layer and the last convolutional layer is inversely proportional to the frame distance, and the layer distance between the j-th deconvolutional layer and the output layer is directly proportional to the frame distance.
Optionally, the method further includes: presetting the correspondence between the frame distance and the values of i and j.
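One way to realize such a preset correspondence is a simple monotone mapping. The concrete formula below is an illustrative assumption, not taken from the patent; it only preserves the stated relations: a larger frame distance shrinks the layer distance between the i-th and the last convolutional layers, and grows the layer distance between the j-th deconvolutional layer and the output layer (4 convolutional and 3 deconvolutional layers are assumed, matching the example used later in the description):

```python
def choose_i_j(frame_distance, num_conv=4, num_deconv=3):
    """Pick (i, j) from the frame distance (illustrative correspondence).

    Layer distance (num_conv - i) shrinks as frame distance grows;
    layer distance from the j-th deconv layer to the output grows with it.
    """
    d = max(1, frame_distance)
    i = min(num_conv, d)               # larger distance -> deeper conv layer
    j = max(1, num_deconv - (d - 1))   # larger distance -> earlier deconv layer
    return i, j
```

With this table, a frame right after the group's 1st frame (distance 1) uses i=1 and j=3, the pairing used as the example in the detailed description.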
Optionally, after directly performing image fusion on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolutional layer, the method further includes:
if the j-th deconvolutional layer is the last deconvolutional layer of the neural network, inputting the image fusion result into the output layer to obtain the processed current frame image;
if the j-th deconvolutional layer is not the last deconvolutional layer of the neural network, inputting the image fusion result into the (j+1)-th deconvolutional layer, and obtaining the processed current frame image after the operations of the subsequent deconvolutional layers and the output layer.
Optionally, inputting the current frame image into the trained neural network and obtaining the processed current frame image after the operations of all convolutional layers and deconvolutional layers of the neural network further includes: performing down-sampling on the operation result of each convolutional layer before the last convolutional layer of the neural network.
Optionally, before computing up to the i-th convolutional layer of the neural network to obtain the operation result of the i-th convolutional layer, the method further includes: performing down-sampling on the operation result of each convolutional layer before the i-th convolutional layer of the neural network.
Optionally, each group of the video data includes n frame images; wherein n is a fixed preset value.
Optionally, the method further includes:
displaying the processed video data in real time.
Optionally, the method further includes:
uploading the processed video data to a cloud server.
Optionally, uploading the processed video data to a cloud server further includes:
uploading the processed video data to a cloud video platform server, so that the cloud video platform server displays the video data on the cloud video platform.
Optionally, uploading the processed video data to a cloud server further includes:
uploading the processed video data to a cloud live-streaming server, so that the cloud live-streaming server pushes the video data to watching user clients in real time.
Optionally, uploading the processed video data to a cloud server further includes:
uploading the processed video data to a cloud public account server, so that the cloud public account server pushes the video data to clients following the public account.
According to another aspect of the present invention, a video data real-time processing device is provided. The device processes the frame images contained in video data in groups, and includes:
an acquisition module, adapted to acquire in real time the current frame image in the video being shot and/or recorded by an image capture device; or, to acquire in real time the current frame image in the video currently being played;
a judgment module, adapted to judge whether the current frame image is the 1st frame image of any group; if so, the first processing module is executed; otherwise, the second processing module is executed;
a first processing module, adapted to input the current frame image into a trained neural network, and to obtain the processed current frame image after the operations of all convolutional layers and deconvolutional layers of the neural network;
a second processing module, adapted to input the current frame image into the trained neural network; after computing up to the i-th convolutional layer of the neural network to obtain the operation result of the i-th convolutional layer, to acquire the operation result of the j-th deconvolutional layer obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, and to directly perform image fusion on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolutional layer to obtain the processed current frame image; wherein i and j are natural numbers;
an output module, adapted to output the processed current frame image;
a loop module, adapted to repeatedly run the above acquisition module, judgment module, first processing module, second processing module and/or output module until the processing of all frame images in the video data is completed.
Optionally, the device further includes:
a frame distance computing module, adapted to calculate the frame distance between the current frame image and the 1st frame image of the group to which it belongs;
a determining module, adapted to determine the values of i and j according to the frame distance; wherein the layer distance between the i-th convolutional layer and the last convolutional layer is inversely proportional to the frame distance, and the layer distance between the j-th deconvolutional layer and the output layer is directly proportional to the frame distance.
Optionally, the device further includes:
a presetting module, adapted to preset the correspondence between the frame distance and the values of i and j.
Optionally, the second processing module is further adapted to:
if the j-th deconvolutional layer is the last deconvolutional layer of the neural network, input the image fusion result into the output layer to obtain the processed current frame image;
if the j-th deconvolutional layer is not the last deconvolutional layer of the neural network, input the image fusion result into the (j+1)-th deconvolutional layer, and obtain the processed current frame image after the operations of the subsequent deconvolutional layers and the output layer.
Optionally, the first processing module is further adapted to:
perform down-sampling on the operation result of each convolutional layer before the last convolutional layer of the neural network.
Optionally, the second processing module is further adapted to:
perform down-sampling on the operation result of each convolutional layer before the i-th convolutional layer of the neural network.
Optionally, each group of the video data includes n frame images; wherein n is a fixed preset value.
Optionally, the device further includes:
a display module, adapted to display the processed video data in real time.
Optionally, the device further includes:
an uploading module, adapted to upload the processed video data to a cloud server.
Optionally, the uploading module is further adapted to:
upload the processed video data to a cloud video platform server, so that the cloud video platform server displays the video data on the cloud video platform.
Optionally, the uploading module is further adapted to:
upload the processed video data to a cloud live-streaming server, so that the cloud live-streaming server pushes the video data to watching user clients in real time.
Optionally, the uploading module is further adapted to:
upload the processed video data to a cloud public account server, so that the cloud public account server pushes the video data to clients following the public account.
According to still another aspect of the present invention, a computing device is provided, including: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with each other through the communication bus;
the memory is adapted to store at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the above video data real-time processing method.
According to yet another aspect of the present invention, a computer storage medium is provided. At least one executable instruction is stored in the storage medium, and the executable instruction causes the processor to perform the operations corresponding to the above video data real-time processing method.
According to the video data real-time processing method and device and the computing device provided by the present invention, the current frame image in the video being shot and/or recorded by an image capture device, or in the video currently being played, is acquired in real time; whether the current frame image is the 1st frame image of any group is judged; if so, the current frame image is input into a trained neural network, and the processed current frame image is obtained after the operations of all convolutional layers and deconvolutional layers of the neural network; if not, the current frame image is input into the trained neural network, and after computing up to the i-th convolutional layer to obtain its operation result, the operation result of the j-th deconvolutional layer obtained when the 1st frame image of the group was input into the neural network is acquired, and image fusion is directly performed on the two operation results to obtain the processed current frame image, wherein i and j are natural numbers; the processed current frame image is output; and the above steps are repeated until the processing of all frame images in the video data is completed. The present invention makes full use of the continuity and correlation between frame images in video data: when processing video data in real time, the video data is processed in groups; for the 1st frame image of each group, the neural network completes the operations of all convolutional layers and deconvolutional layers, while the other frame images are only computed up to the i-th convolutional layer and then fused with the operation result of the j-th deconvolutional layer already obtained for the 1st frame image. This greatly reduces the computation of the neural network and improves the speed of real-time video data processing.
The above description is only an overview of the technical solutions of the present invention. In order that the technical means of the present invention may be understood more clearly and implemented according to the contents of the specification, and that the above and other objects, features and advantages of the present invention may become more apparent, specific embodiments of the present invention are set forth below.
Brief description of the drawings
By reading the following detailed description of the preferred embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings are only for the purpose of showing the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same components are denoted by the same reference numerals. In the drawings:
Fig. 1 shows a flow chart of a video data real-time processing method according to an embodiment of the present invention;
Fig. 2 shows a flow chart of a video data real-time processing method according to another embodiment of the present invention;
Fig. 3 shows a functional block diagram of a video data real-time processing device according to an embodiment of the present invention;
Fig. 4 shows a functional block diagram of a video data real-time processing device according to another embodiment of the present invention;
Fig. 5 shows a schematic structural diagram of a computing device according to an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Fig. 1 shows a flow chart of a video data real-time processing method according to an embodiment of the present invention. As shown in Fig. 1, the video data real-time processing method specifically includes the following steps:
Step S101: acquiring in real time the current frame image in the video being shot and/or recorded by an image capture device; or, acquiring in real time the current frame image in the video currently being played.
In this embodiment, the image capture device is described by taking a mobile terminal as an example. The current frame image in the video being recorded by the camera of the mobile terminal, or the current frame image while the camera is shooting video, is acquired in real time. Besides acquiring the current frame image of the video being shot and/or recorded by the image capture device, the current frame image in the video currently being played can also be acquired in real time.
This embodiment makes use of the continuity and correlation between frame images in video data: when processing each frame image of the video data, the frame images are grouped for processing. When grouping the frame images, the association between them needs to be considered, and closely associated frame images are put into the same group. The number of frame images contained in different groups may be the same or different; assuming each group contains n frame images, n may be a fixed value or a variable value, set according to the implementation. When the current frame image is acquired in real time, it is grouped immediately, determining whether it is a frame image in the current group or the 1st frame image of a new group. Specifically, the grouping needs to be performed according to the association between the current frame image and the previous frame image or previous several frame images. For example, a tracking algorithm may be used: if the tracking algorithm obtains a valid tracking result for the current frame image, the current frame image is determined to be a frame image in the current group; if the tracking result is invalid, the current frame image is determined to be the 1st frame image of a new group. Alternatively, adjacent two or three frame images may be put into one group according to their order. Taking three frame images per group as an example, the 1st frame image is the 1st frame image of the first group, the 2nd frame image is the 2nd frame image of the first group, the 3rd frame image is the 3rd frame image of the first group, the 4th frame image is the 1st frame image of the second group, the 5th frame image is the 2nd frame image of the second group, the 6th frame image is the 3rd frame image of the second group, and so on. The specific grouping manner is determined by the implementation and is not limited here.
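For the fixed-order grouping just described (n frame images per group in playback order), deciding a frame's group and whether it is a group's 1st frame image reduces to index arithmetic. A minimal sketch with hypothetical helper names:

```python
def group_of(frame_index, n=3):
    """Return (group number, position in group), both 1-based,
    for fixed groups of n frames in playback order (frame_index is 0-based)."""
    group = frame_index // n + 1
    position = frame_index % n + 1   # position 1 marks the group's 1st frame
    return group, position

def is_first_in_group(frame_index, n=3):
    """True when the frame opens a new group and needs a full network pass."""
    return frame_index % n == 0
```

The tracking-based grouping variant would replace `is_first_in_group` with a check on whether the tracker returned a valid result for the current frame.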
Step S102: judging whether the current frame image is the 1st frame image of any group.
If the current frame image is the 1st frame image of some group, step S103 is executed; otherwise, step S104 is executed. The specific judgment manner depends on the grouping manner used.
Step S103: inputting the current frame image into the trained neural network, and obtaining the processed current frame image after the operations of all convolutional layers and deconvolutional layers of the neural network.
The current frame image is the 1st frame image of a group. It is input into the trained neural network, which performs the operations of all convolutional layers and all deconvolutional layers on it in turn, finally obtaining the processed current frame image. For example, if the neural network contains 4 convolutional layers and 3 deconvolutional layers, the current frame image input into the neural network goes through all the operations of the 4 convolutional layers and the 3 deconvolutional layers. These operations further include performing image fusion on the operation results of the convolutional layers with the operation results of the corresponding deconvolutional layers, finally obtaining the processed current frame image.
Step S104: inputting the current frame image into the trained neural network; after computing up to the i-th convolutional layer of the neural network to obtain the operation result of the i-th convolutional layer, acquiring the operation result of the j-th deconvolutional layer obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, and directly performing image fusion on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolutional layer to obtain the processed current frame image.
The current frame image is not the 1st frame image of any group. It is input into the trained neural network, but the neural network does not need to perform all the convolutional and deconvolutional operations on it: after computing only up to the i-th convolutional layer to obtain its operation result, the operation result of the j-th deconvolutional layer obtained when the 1st frame image of the group was input into the neural network is directly acquired, and image fusion is performed on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolutional layer, which yields the processed current frame image. There is a correspondence between the i-th convolutional layer and the j-th deconvolutional layer, namely that the operation result of the i-th convolutional layer and the operation result of the j-th deconvolutional layer have identical output dimensions. i and j are natural numbers; the value of i does not exceed the number of the last convolutional layer of the neural network, and the value of j does not exceed the number of the last deconvolutional layer of the neural network. For example, the current frame image is input into the neural network and computed up to the 1st convolutional layer to obtain the operation result of the 1st convolutional layer; the operation result of the 3rd deconvolutional layer obtained when the 1st frame image of the group was input into the neural network is directly acquired, and the operation result of the 1st convolutional layer is fused with the operation result of the 3rd deconvolutional layer of the 1st frame image. Here, the operation result of the 1st convolutional layer and the operation result of the 3rd deconvolutional layer have identical output dimensions.
By reusing the operation result of the j-th deconvolutional layer obtained from the 1st frame image of the group, the computation performed by the neural network on the current frame image is reduced, which greatly speeds up the processing of the neural network and thus improves its computational efficiency.
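This reuse can be illustrated numerically with toy stand-ins for the layers. The elementwise "layers" and the addition-based fusion below are illustrative assumptions; the patent only requires that the i-th convolutional result and the cached j-th deconvolutional result have identical output dimensions so they can be fused:

```python
import numpy as np

# Toy stand-ins: each "layer" is an elementwise transform that keeps shape,
# so a conv result and a cached deconv result can be fused by addition.
conv_layers   = [lambda x, k=k: x * 0.5 + k for k in range(1, 5)]   # 4 conv layers
deconv_layers = [lambda x, k=k: x * 2.0 - k for k in range(1, 4)]   # 3 deconv layers

def full_pass(frame):
    """1st frame of a group: all conv layers, then all deconv layers.
    Also returns the deconv results so later frames can reuse them."""
    x = frame
    for conv in conv_layers:
        x = conv(x)
    deconv_results = []
    for deconv in deconv_layers:
        x = deconv(x)
        deconv_results.append(x)
    return x, deconv_results

def partial_pass(frame, deconv_results, i=1, j=3):
    """Later frame: conv layers 1..i only, then fusion with the cached
    j-th deconv result of the group's 1st frame (here: addition),
    then any deconv layers after j, if present."""
    x = frame
    for conv in conv_layers[:i]:
        x = conv(x)
    fused = x + deconv_results[j - 1]   # image fusion placeholder
    for deconv in deconv_layers[j:]:    # j == last deconv layer -> nothing left
        fused = deconv(fused)
    return fused

key = np.zeros((2, 2))
out_key, cache = full_pass(key)                          # 1st frame of the group
out_next = partial_pass(np.ones((2, 2)), cache, i=1, j=3)  # reuse cached result
```

With i=1 and j=3 (the last deconvolutional layer), the fused result needs no further deconvolution and would go straight to the output layer, matching the example in the description; a smaller j would instead feed the fusion result into the (j+1)-th deconvolutional layer.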
Step S105: outputting the processed current frame image.
When outputting, the processed current frame image may be output directly, or the processed current frame image may be used to directly overwrite the original current frame image. The overwriting is generally completed within 1/24 second. For the user, since the overwriting time is relatively short, the human eye does not perceive it noticeably; that is, the human eye does not perceive the process in which the original current frame image in the video data is overwritten. Thus, while the video data is being shot and/or recorded and/or played, the processed current frame image of the video data is output to the user in real time, and the user does not notice the overwriting of the current frame image in the video data.
Step S106: judging whether the processing of all frame images in the video data is completed.
If the current frame image is the last frame image of the video data, it is judged that the processing of all frame images in the video data has been completed, and the execution ends. If frame images in the video data are still being acquired after the current frame image is processed, it is judged that the processing of all frame images in the video data has not been completed, and step S101 is executed to continue acquiring and processing the frame images in the video data.
According to the video data real-time processing method provided by the present invention, the current frame image in the video being shot and/or recorded by an image capture device, or in the video currently being played, is acquired in real time; whether the current frame image is the 1st frame image of any group is judged; if so, the current frame image is input into a trained neural network, and the processed current frame image is obtained after the operations of all convolutional layers and deconvolutional layers of the neural network; if not, the current frame image is input into the trained neural network, and after computing up to the i-th convolutional layer to obtain its operation result, the operation result of the j-th deconvolutional layer obtained when the 1st frame image of the group was input into the neural network is acquired, and image fusion is directly performed on the two operation results to obtain the processed current frame image, wherein i and j are natural numbers; the processed current frame image is output; and the above steps are repeated until the processing of all frame images in the video data is completed. The present invention makes full use of the continuity and correlation between frame images in video data: when processing video data in real time, the video data is processed in groups; for the 1st frame image of each group, the neural network completes the operations of all convolutional layers and deconvolutional layers, while the other frame images are only computed up to the i-th convolutional layer and then fused with the operation result of the j-th deconvolutional layer already obtained for the 1st frame image. This greatly reduces the computation of the neural network and improves the speed of real-time video data processing.
Fig. 2 shows the flow chart of video data real-time processing method in accordance with another embodiment of the present invention.Such as Fig. 2 institutes Show, video data real-time processing method specifically comprises the following steps:
Step S201: acquire in real time the current frame image in the video being captured and/or recorded by an image capture device; alternatively, acquire in real time the current frame image in a video currently being played.
This step refers to step S101 in the embodiment of Fig. 1 and is not described again here.
Step S202: judge whether the current frame image is the 1st frame image of any group.
If the current frame image is the 1st frame image of any group, step S203 is performed; otherwise step S204 is performed.
Step S203: input the current frame image into the trained neural network and, after the computation of all convolutional layers and deconvolution layers of the neural network, obtain the processed current frame image.
When the current frame image is the 1st frame image of a group, it is input into the trained neural network, which successively performs the computation of all convolutional layers and all deconvolution layers on it, finally obtaining the processed current frame image.
To further improve the computation speed of the neural network, after the computation of each convolutional layer before the last convolutional layer of the neural network, downsampling is applied to that layer's computation result. That is, after the current frame image is input into the neural network and passes through the 1st convolutional layer, the computation result is downsampled to reduce its resolution; the downsampled result is then fed into the 2nd convolutional layer, whose computation result is likewise downsampled, and so on, up to the last convolutional layer of the neural network (i.e. the bottleneck layer of the convolutional layers). Taking the 4th convolutional layer as the last one as an example, no downsampling is performed after the computation result of the 4th convolutional layer. Downsampling the computation result of each convolutional layer before the last one reduces the resolution of the frame image input to each subsequent convolutional layer and thus improves the computation speed of the neural network. It should be noted that at the first convolutional layer of the neural network the input is the current frame image acquired in real time, which is not downsampled, so that good detail of the current frame image is preserved. Downsampling only the output computation results therefore does not harm the detail of the current frame image while still improving the computation speed of the neural network.
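The downsampling arrangement above can be sketched as a minimal encoder loop. The "convolutional layers" here are toy callables and `avg_pool_2x2` is an assumed stand-in for the downsampling operation; only the placement of the downsampling steps follows the text.

```python
import numpy as np

def encoder_forward(frame, conv_layers, downsample):
    """Apply the convolutional layers in order; every layer's result except
    the last (bottleneck) layer's is downsampled before the next layer.
    The raw input frame itself is never downsampled, preserving detail."""
    x = frame
    results = []
    for k, conv in enumerate(conv_layers):
        x = conv(x)
        results.append(x)
        if k < len(conv_layers) - 1:   # no downsampling after the bottleneck
            x = downsample(x)
    return results

def avg_pool_2x2(x):
    """2x2 average pooling as a simple stand-in for the downsampling step."""
    h, w = (x.shape[0] // 2) * 2, (x.shape[1] // 2) * 2
    x = x[:h, :w]
    return (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2]) / 4.0
```

With four layers and a 16x16 input, the layers see resolutions 16, 8, 4, and 2, matching the scheme where only computation results, never the raw frame, are downsampled.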
Step S204: calculate the frame distance between the current frame image and the 1st frame image of the group to which it belongs.
As a concrete example of calculating this frame distance: if the current frame image is the 3rd frame image of a group, its frame distance to the 1st frame image of that group is calculated to be 2.
Step S205: determine the values of i and j according to the frame distance.
The value of i for the i-th convolutional layer of the neural network and the value of j for the j-th deconvolution layer of the 1st frame image are determined according to the obtained frame distance. When determining i and j, the layer distance between the i-th convolutional layer and the last convolutional layer (the bottleneck layer of the convolutional layers) is taken to be inversely proportional to the frame distance, while the layer distance between the j-th deconvolution layer and the output layer is proportional to the frame distance. The larger the frame distance, the smaller the layer distance between the i-th convolutional layer and the last convolutional layer, i.e. the larger i is and the more convolutional layers need to be computed; and the larger the layer distance between the j-th deconvolution layer and the output layer, the smaller j is, i.e. the computation result of a deconvolution layer with a smaller layer number is acquired.
Take as an example a neural network containing convolutional layers 1-4, where the 4th convolutional layer is the last one, and further containing deconvolution layers 1-3 and an output layer. When the frame distance is 1, the layer distance between the i-th convolutional layer and the last convolutional layer is determined to be 3, so i is 1, i.e. computation proceeds up to the 1st convolutional layer; the layer distance between the j-th deconvolution layer and the output layer is determined to be 1, so j is 3, and the computation result of the 3rd deconvolution layer is acquired. When the frame distance is 2, the layer distance between the i-th convolutional layer and the last convolutional layer is determined to be 2, so i is 2, i.e. computation proceeds up to the 2nd convolutional layer; the layer distance between the j-th deconvolution layer and the output layer is determined to be 2, so j is 2, and the computation result of the 2nd deconvolution layer is acquired. The specific layer distances depend on the numbers of convolutional and deconvolution layers the neural network contains and on the actual effect to be achieved; the above is merely illustrative.
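One mapping consistent with the worked example can be written down explicitly. This formula is an extrapolation from the two example frame distances above, not a rule stated by the patent, and the clamping behaviour for larger distances is an assumption.

```python
def choose_layers(frame_distance, n_conv=4, n_deconv=3):
    """One possible frame-distance -> (i, j) mapping consistent with the
    worked example: a larger frame distance gives a larger i (smaller layer
    distance to the bottleneck) and a smaller j (larger layer distance to
    the output layer)."""
    i = min(frame_distance, n_conv - 1)        # stop short of the bottleneck
    j = max(n_deconv + 1 - frame_distance, 1)  # earlier deconv layer for larger distance
    return i, j
```

For frame distance 1 this yields (i, j) = (1, 3) and for frame distance 2 it yields (2, 2), reproducing the example values.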
Alternatively, when determining the value of i for the i-th convolutional layer of the neural network and the value of j for the j-th deconvolution layer of the 1st frame image according to the obtained frame distance, a correspondence between frame distances and the values of i and j may be preset, so that i and j are obtained directly from the frame distance. Specifically, different values of i and j may be preset for different frame distances: for example, for a frame distance of 1, i is set to 1 and j to 3; for a frame distance of 2, i is set to 2 and j to 2. Alternatively, the same values of i and j may be set for different frame distances, e.g. i is set to 2 and j to 2 regardless of the frame distance; or the same values of i and j may be set for a subset of frame distances, e.g. for frame distances of 1 and 2, i is set to 1 and j to 3, while for frame distances of 3 and 4, i is set to 2 and j to 2. This is configured according to the required performance and is not limited here.
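The preset-correspondence alternative amounts to a lookup table. The table values below are taken from the passage's last example (frame distances 1-2 map to (1, 3), distances 3-4 to (2, 2)); the fallback pair for unlisted distances is an assumption.

```python
# Preset frame-distance -> (i, j) correspondence, using the example values
# from the passage; real values would be tuned to the network and the
# required performance.
PRESET_I_J = {1: (1, 3), 2: (1, 3), 3: (2, 2), 4: (2, 2)}

def lookup_i_j(frame_distance, default=(2, 2)):
    """Look up (i, j) for a frame distance, falling back to a fixed pair
    (the constant-(i, j) variant the passage also describes)."""
    return PRESET_I_J.get(frame_distance, default)
```

A table avoids recomputing the layer-distance relations per frame and makes the configuration explicit and tunable.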
Step S206: input the current frame image into the trained neural network; after computing up to the i-th convolutional layer of the neural network to obtain the computation result of the i-th convolutional layer, acquire the computation result of the j-th deconvolution layer obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, and directly fuse the computation result of the i-th convolutional layer with the computation result of the j-th deconvolution layer into an image, obtaining the processed current frame image.
When the current frame image is not the 1st frame image of any group, after the values of i and j have been determined, the current frame image is input into the trained neural network and computed only up to the i-th convolutional layer. Once the computation result of the i-th convolutional layer is obtained, the computation result of the j-th deconvolution layer, obtained when the 1st frame image of the group was input into the neural network, is acquired directly, and the computation result of the i-th convolutional layer is fused with the computation result of the j-th deconvolution layer into an image, yielding the processed current frame image. Since the computation result of the j-th deconvolution layer for the group's 1st frame image can be acquired directly, the 1st frame image does not need to be input into the neural network again; this greatly reduces the number of neural network computations and speeds up the neural network.
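The partial pass with reuse can be sketched as follows. The layer callables and the element-wise `fuse` function are illustrative stand-ins; the shape check reflects the dimension correspondence between the i-th convolutional layer and the j-th deconvolution layer described in this document.

```python
import numpy as np

def partial_forward(frame, conv_layers, i, cached_deconv_result, fuse):
    """Compute only convolutional layers 1..i on the current frame, then
    fuse the i-th layer's result with the cached j-th deconvolution result
    of the group's 1st frame (the two must be dimension-matched)."""
    x = frame
    for conv in conv_layers[:i]:
        x = conv(x)
    # correspondence: output dimensions of conv layer i and deconv layer j match
    assert x.shape == cached_deconv_result.shape
    return fuse(x, cached_deconv_result)
```

Because the loop stops at layer i, the cost per non-first frame is only i convolutional layers plus one fusion, instead of the full encoder-decoder pass.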
Further, after the computation of each convolutional layer before the i-th convolutional layer of the neural network, downsampling is applied to that layer's computation result. After the current frame image is input into the neural network and passes through the 1st convolutional layer, the computation result is downsampled to reduce its resolution; the downsampled result is then fed into the 2nd convolutional layer, whose computation result is likewise downsampled, and so on, up to the i-th convolutional layer. This reduces the resolution of the frame image input to each convolutional layer and improves the computation speed of the neural network. It should be noted that at the first convolutional layer of the neural network the input is the current frame image acquired in real time, which is not downsampled, so that good detail of the current frame image is preserved. Downsampling only the output computation results therefore does not harm the detail of the current frame image while still improving the computation speed of the neural network.
Further, if the j-th deconvolution layer is the last deconvolution layer of the neural network, the image fusion result is input to the output layer to obtain the processed current frame image. If the j-th deconvolution layer is not the last deconvolution layer of the neural network, the image fusion result is input to the (j+1)-th deconvolution layer and passes through each subsequent deconvolution layer and the output layer to obtain the processed current frame image.
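Both branches of the continuation step collapse into one loop over the remaining deconvolution layers, which is empty when j is already the last one. The layer callables are stand-ins; only the control flow follows the text.

```python
def decode_from(fusion_result, j, deconv_layers, output_layer):
    """Continue from the fusion result: pass through deconvolution layers
    j+1 onward (none when j is already the last deconv layer), then through
    the output layer, covering both branches described above."""
    x = fusion_result
    for deconv in deconv_layers[j:]:   # 1-indexed j: slice gives layers j+1..last
        x = deconv(x)
    return output_layer(x)
```

Writing it this way avoids a special case: the "fusion result goes straight to the output layer" branch is just the loop running zero times.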
Step S207: output the processed current frame image.
Step S208: display the processed video data in real time.
When outputting, the processed current frame image may be output directly, or the processed current frame image may directly overwrite the original current frame image; the overwriting is generally completed within 1/24 second. In this way the processed video data can be displayed in real time: the user directly sees the display effect of the processed current frame image of the video data without perceiving that the frame images in the video data have been processed.
Step S209: judge whether the processing of all frame images in the video data has been completed.
If the current frame image is the last frame image of the video data, it is judged that the processing of all frame images in the video data has been completed, processing finishes, and the subsequent step S210 can be performed. If frame images in the video data continue to be acquired after the current frame image is processed, it is judged that the processing of all frame images in the video data has not been completed, and step S201 is performed to continue acquiring and processing frame images in the video data.
Step S210: upload the processed video data to a cloud server.
The processed video data can be uploaded directly to a cloud server. Specifically, the processed video data can be uploaded to one or more cloud video platform servers, such as the cloud video platform servers of iQIYI, Youku, or Kuai Video, so that the cloud video platform server displays the video data on the cloud video platform. Alternatively, the processed video data can be uploaded to a cloud live-streaming server, so that when a user with a live viewing client connects to the cloud live-streaming server, the cloud live-streaming server pushes the video data in real time to the viewing client. Alternatively, the processed video data can be uploaded to a cloud public-account server, so that when a user follows the public account, the cloud public-account server pushes the video data to the follower's client; further, the cloud public-account server can push video data matching the viewing habits of the users following the public account to their clients.
According to the video data real-time processing method provided by the present invention, after the current frame image is acquired it is judged: if the current frame image is the 1st frame image of any group, it is input into the trained neural network and, after the computation of all convolutional layers and deconvolution layers of the neural network, the processed current frame image is obtained; if the current frame image is not the 1st frame image of any group, the frame distance between it and the 1st frame image of the group to which it belongs is calculated. According to the frame distance, the value of i for the i-th convolutional layer of the neural network is determined and the computation result of the i-th convolutional layer is obtained; at the same time the value of j for the j-th deconvolution layer of the neural network is determined, so that the computation result of the j-th deconvolution layer, obtained when the 1st frame image of the group was input into the neural network, can be acquired directly. By reusing the computation result of the j-th deconvolution layer, the computation result of the i-th convolutional layer is fused with the computation result of the j-th deconvolution layer into an image, obtaining the processed current frame image; this reduces the number of neural network computations and improves computational efficiency. Further, after the computation of each convolutional layer before the i-th convolutional layer or the last convolutional layer of the neural network, downsampling may be applied to that layer's computation result, reducing the resolution of the frame image input to each convolutional layer and improving the computation speed of the neural network. The present invention can directly obtain the processed video data and can also upload it directly to a cloud server, so the user does not need to process the video data further, saving the user's time; the processed video data can be displayed to the user in real time, making it convenient for the user to check the display effect.
Fig. 3 shows a functional block diagram of a video data real-time processing device according to an embodiment of the present invention. As shown in Fig. 3, the video data real-time processing device comprises the following modules:
The acquisition module 301 is adapted to acquire in real time the current frame image in the video being captured and/or recorded by an image capture device; alternatively, to acquire in real time the current frame image in a video currently being played.
In this embodiment the image capture device is illustrated by taking a mobile terminal as an example. The acquisition module 301 acquires in real time the current frame image of the mobile terminal's camera while it is recording or shooting video. Besides acquiring in real time from the video being captured and/or recorded by the image capture device, the acquisition module 301 can also acquire in real time the current frame image in a video currently being played.
This embodiment makes use of the continuity and correlation between frame images in video data: when processing the frame images of the video data, the frame images are processed in groups. When grouping the frame images in the video data, the association between them needs to be considered, and closely associated frame images are put into the same group. The number of frame images contained in different groups may be the same or different; suppose each group contains n frame images, where n may be a fixed or non-fixed value set according to the required performance. When acquiring the current frame image in real time, the acquisition module 301 assigns it to a group, determining whether it is a frame image of the current group or the 1st frame image of a new group. Specifically, grouping is performed according to the association between the current frame image and the previous frame image or previous several frame images. For example, using a tracking algorithm: if the tracking algorithm obtains a valid tracking result for the current frame image, the current frame image is determined to be a frame image of the current group; if the tracking algorithm obtains an invalid tracking result, the current frame image is determined to be the 1st frame image of a new group. Alternatively, every two or three adjacent frame images are put into one group in frame order. Taking three frame images per group as an example, the 1st frame image of the video data is the 1st frame image of the first group, the 2nd frame image is the 2nd frame image of the first group, the 3rd frame image is the 3rd frame image of the first group, the 4th frame image is the 1st frame image of the second group, the 5th frame image is the 2nd frame image of the second group, the 6th frame image is the 3rd frame image of the second group, and so on. The specific grouping mode is determined according to performance in implementation and is not limited here.
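The fixed-size grouping variant described above (the tracking-based variant depends on an external tracker and is not sketched here) can be expressed in a few lines; the 0-based frame index is an implementation convenience, not something the patent specifies.

```python
def group_frames(num_frames, n=3):
    """Fixed-size grouping in frame order, matching the three-per-group
    example: frame k (0-based) belongs to group k // n and is its group's
    1st frame exactly when k % n == 0."""
    return [(k // n, k % n == 0) for k in range(num_frames)]
```

The boolean flag is exactly the judgment the next module needs: whether to run the full network or the partial pass with reuse.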
The judgment module 302 is adapted to judge whether the current frame image is the 1st frame image of any group; if so, the first processing module 303 is executed; otherwise, the second processing module 304 is executed.
The judgment module 302 judges whether the current frame image is the 1st frame image of any group; if so, the first processing module 303 is executed, otherwise the second processing module 304 is executed. The specific judgment mode of the judgment module 302 depends on the grouping mode used.
The first processing module 303 is adapted to input the current frame image into the trained neural network and, after the computation of all convolutional layers and deconvolution layers of the neural network, obtain the processed current frame image.
When the judgment module 302 judges that the current frame image is the 1st frame image of a group, the first processing module 303 inputs the current frame image into the trained neural network, which successively performs the computation of all convolutional layers and all deconvolution layers on it, finally obtaining the processed current frame image. Specifically, if the neural network comprises the computation of 4 convolutional layers and 3 deconvolution layers, the first processing module 303 inputs the current frame image into the neural network, which performs the computation of all 4 convolutional layers and all 3 deconvolution layers. This computation further includes fusing the computation result of each convolutional layer with the computation result of the corresponding deconvolution layer, finally obtaining the processed current frame image.
Further, to improve the computation speed of the neural network, after the computation of each convolutional layer before the last convolutional layer of the neural network, the first processing module 303 downsamples that layer's computation result. That is, after the current frame image is input into the neural network and passes through the 1st convolutional layer, the first processing module 303 downsamples the computation result to reduce its resolution, then feeds the downsampled result into the 2nd convolutional layer and likewise downsamples the computation result of the 2nd convolutional layer, and so on, up to the last convolutional layer of the neural network (i.e. the bottleneck layer of the convolutional layers). Taking the 4th convolutional layer as the last one as an example, the first processing module 303 performs no downsampling after the computation result of the 4th convolutional layer. By downsampling the computation result of each convolutional layer before the last one, the first processing module 303 reduces the resolution of the frame image input to each convolutional layer and improves the computation speed of the neural network. It should be noted that at the first convolutional layer of the neural network, the input of the first processing module 303 is the current frame image acquired in real time, which is not downsampled, so that good detail of the current frame image is preserved. When the first processing module 303 subsequently downsamples the output computation results, the detail of the current frame image is not harmed while the computation speed of the neural network is improved.
The second processing module 304 is adapted to input the current frame image into the trained neural network; after computing up to the i-th convolutional layer of the neural network to obtain the computation result of the i-th convolutional layer, to acquire the computation result of the j-th deconvolution layer obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, and to directly fuse the computation result of the i-th convolutional layer with the computation result of the j-th deconvolution layer into an image, obtaining the processed current frame image.
When the judgment module 302 judges that the current frame image is not the 1st frame image of any group, the second processing module 304 inputs the current frame image into the trained neural network. In this case the neural network does not need to perform the computation of all convolutional layers and all deconvolution layers on it: after computing only up to the i-th convolutional layer to obtain the computation result of the i-th convolutional layer, the second processing module 304 directly acquires the computation result of the j-th deconvolution layer, obtained when the 1st frame image of the group was input into the neural network, and fuses the computation result of the i-th convolutional layer with the computation result of the j-th deconvolution layer into an image, obtaining the processed current frame image. There is a correspondence between the i-th convolutional layer and the j-th deconvolution layer, namely that the output dimension of the computation result of the i-th convolutional layer is the same as that of the computation result of the j-th deconvolution layer. i and j are natural numbers, the value of i not exceeding the layer number of the last convolutional layer of the neural network and the value of j not exceeding the layer number of the last deconvolution layer of the neural network. Specifically, for example, the second processing module 304 inputs the current frame image into the neural network and computes up to the 1st convolutional layer to obtain its computation result; the second processing module 304 then directly acquires the computation result of the 3rd deconvolution layer obtained when the group's 1st frame image was input into the neural network, and fuses the computation result of the 1st convolutional layer with the computation result of the 3rd deconvolution layer of the 1st frame image. Here the output dimensions of the computation result of the 1st convolutional layer and the computation result of the 3rd deconvolution layer are the same.
By reusing the computation result of the j-th deconvolution layer obtained from the computation on the 1st frame image of the group, the second processing module 304 reduces the neural network's computation on the current frame image and greatly speeds up the processing of the neural network, thereby improving its computational efficiency.
Further, after the computation of each convolutional layer before the i-th convolutional layer of the neural network, the second processing module 304 downsamples that layer's computation result. After the current frame image is input into the neural network and passes through the 1st convolutional layer, the second processing module 304 downsamples the computation result to reduce its resolution, then feeds the downsampled result into the 2nd convolutional layer and likewise downsamples the computation result of the 2nd convolutional layer, and so on, up to the i-th convolutional layer. This reduces the resolution of the frame image input to each convolutional layer and improves the computation speed of the neural network. It should be noted that at the first convolutional layer of the neural network, the input of the second processing module 304 is the current frame image acquired in real time, which is not downsampled, so that good detail of the current frame image is preserved. When the second processing module 304 subsequently downsamples the output computation results, the detail of the current frame image is not harmed while the computation speed of the neural network is improved.
Further, if the j-th deconvolution layer is the last deconvolution layer of the neural network, the second processing module 304 inputs the image fusion result to the output layer to obtain the processed current frame image. If the j-th deconvolution layer is not the last deconvolution layer of the neural network, the second processing module 304 inputs the image fusion result to the (j+1)-th deconvolution layer, which passes through each subsequent deconvolution layer and the output layer to obtain the processed current frame image.
The output module 305 is adapted to output the processed current frame image.
When outputting, the output module 305 may output the processed current frame image directly, or may directly overwrite the original current frame image with the processed one; the overwriting is generally completed within 1/24 second. For the user, since the overwriting time is relatively short, it is not noticeable to the human eye, i.e. the human eye does not perceive the process of the original current frame image in the video data being overwritten. While shooting and/or recording and/or playing the video data, the current frame image of the processed video data is output to the user in real time, and the user does not perceive that the frame images in the video data have been overwritten.
The loop module 306 is adapted to repeat the execution of the acquisition module 301, the judgment module 302, the first processing module 303, the second processing module 304 and/or the output module 305 until the processing of all frame images in the video data is completed.
If the current frame image is the last frame image of the video data, it is judged that the processing of all frame images in the video data has been completed and execution ends. If frame images in the video data continue to be acquired after the current frame image is processed, it is judged that the processing of all frame images in the video data has not been completed, and the loop module 306 executes the acquisition module 301, the judgment module 302, the first processing module 303, the second processing module 304 and the output module 305 until the processing of all frame images in the video data is completed.
According to the video data real-time processing device provided by the present invention, the current frame image in the video being captured and/or recorded by an image capture device is acquired in real time; alternatively, the current frame image in a video currently being played is acquired in real time. Whether the current frame image is the 1st frame image of any group is judged. If so, the current frame image is input into the trained neural network and, after the computation of all convolutional layers and deconvolution layers of the neural network, the processed current frame image is obtained. If not, the current frame image is input into the trained neural network; after computing up to the i-th convolutional layer of the neural network to obtain the computation result of the i-th convolutional layer, the computation result of the j-th deconvolution layer, obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, is acquired, and the computation result of the i-th convolutional layer is directly fused with the computation result of the j-th deconvolution layer into an image, obtaining the processed current frame image; wherein i and j are natural numbers. The processed current frame image is output. The above is repeatedly executed until the processing of all frame images in the video data is completed. The present invention makes full use of the continuity and correlation between frame images in video data: when processing video data in real time, the video data is processed in groups; for the 1st frame image of each group the neural network completes the computation of all convolutional layers and deconvolution layers, while the other frame images are only computed up to the i-th convolutional layer and the already-obtained computation result of the j-th deconvolution layer of the 1st frame image is reused for image fusion, which greatly reduces the computation load of the neural network and improves the speed at which the video data is processed in real time. Further, after the computation of each convolutional layer before the i-th convolutional layer or the last convolutional layer of the neural network, downsampling may be applied to that layer's computation result, reducing the resolution of the frame image input to each convolutional layer and improving the computation speed of the neural network.
Fig. 4 shows a functional block diagram of a video data real-time processing device according to another embodiment of the present invention. As shown in Fig. 4, the difference from Fig. 3 is that the video data real-time processing device further comprises:
The frame distance calculation module 307 is adapted to calculate the frame distance between the current frame image and the 1st frame image of the group to which it belongs.
As a concrete example of the frame distance calculation module 307 calculating this frame distance: if the current frame image is the 3rd frame image of a group, the frame distance calculation module 307 calculates its frame distance to the 1st frame image of that group to be 2.
The determining module 308 is adapted to determine the values of i and j according to the frame distance.
Determining module 308 is according to obtained frame pitch, to determine the value of the i of i-th layer of convolutional layer in neutral net, and The value of the j of 1st two field picture jth layer warp lamination.Determining module 308 is in definite i and j, it is believed that i-th layer of convolutional layer with Layer between last layer of convolutional layer (the bottleneck layer of convolutional layer) away from frame pitch inversely, jth layer warp lamination with it is defeated Go out the layer between layer away from proportional with frame pitch.When frame pitch is bigger, i-th layer of convolutional layer and last layer of convolutional layer it Between layer away from smaller, i values are bigger, and Second processing module 304 more needs to run more convolutional layer;Jth layer warp lamination with it is defeated Go out layer between layer away from bigger, j values are smaller, and Second processing module 304 need to obtain the operation result of the warp lamination of the smaller number of plies. 
Exemplified by including 1-4 layers of convolutional layer in neutral net, wherein, the 4th layer of convolutional layer is last layer of convolutional layer;In neutral net 1-3 layers of warp lamination and output layer are further comprises, when the calculating frame pitch of frame pitch computing module 307 is 1, determining module 308 Layer between i-th layer of convolutional layer and last layer of convolutional layer is determined away from for 3, it is 1 to determine i, i.e., 304 computing of Second processing module is extremely Level 1 volume lamination, determining module 308 determines layer between jth layer warp lamination and output layer away from for 1, determining that j is 3, at second Manage the operation result that module 304 obtains the 3rd layer of warp lamination;When the calculating frame pitch of frame pitch computing module 307 is 2, mould is determined Block 308 determines layer between i-th layer of convolutional layer and last layer of convolutional layer away from for 2, and it is 2, i.e. Second processing module 304 to determine i To level 2 volume lamination, determining module 308 determines layer between jth layer warp lamination and output layer away from for 2, j 2 for computing, second Processing module 304 obtains the operation result of the 2nd layer of warp lamination.Specific layer away from the convolutional layer that is included with neutral net of size It is related to each number of plies and actual implementation effect to be reached of warp lamination, it is to illustrate above.
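The rule of this embodiment can be sketched as follows. The function name and the clamping behaviour for frame distances outside the two worked examples are assumptions, since the description only fixes the cases of frame distance 1 and 2.

```python
def determine_i_j(frame_distance, num_conv_layers=4, num_deconv_layers=3):
    """Determine (i, j) from the frame distance: the layer distance from
    the i-th conv layer to the last conv layer shrinks as the frame
    distance grows, and the layer distance from the j-th deconv layer to
    the output layer grows with it. Clamping is an illustrative choice."""
    # Layer distance between the i-th and the last convolutional layer.
    conv_gap = max(num_conv_layers - frame_distance, 1)
    i = num_conv_layers - conv_gap            # frame distance 1 -> i = 1
    # Layer distance between the j-th deconv layer and the output layer.
    deconv_gap = min(frame_distance, num_deconv_layers)
    j = num_deconv_layers + 1 - deconv_gap    # frame distance 1 -> j = 3
    return i, j
```

With the 4-conv / 3-deconv network of the example, this reproduces both worked cases: a frame distance of 1 yields (i, j) = (1, 3) and a frame distance of 2 yields (2, 2).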
a presetting module 309, adapted to preset the correspondence between the frame distance and the values of i and j.
When the value of i for the i-th convolutional layer in the neural network and the value of j for the j-th deconvolution layer applied to the 1st frame image are determined according to the obtained frame distance, the correspondence between the frame distance and the values of i and j preset by the presetting module 309 may be used directly. Specifically, the presetting module 309 may preset different values of i and j for different frame distances; for example, when the frame distance calculated by the frame distance calculating module 307 is 1, the presetting module 309 sets the value of i to 1 and the value of j to 3, and when the frame distance calculated by the frame distance calculating module 307 is 2, the presetting module 309 sets the value of i to 2 and the value of j to 2. Alternatively, the same values of i and j may be set for different frame distances; for example, regardless of the frame distance, the presetting module 309 sets the corresponding value of i to 2 and the value of j to 2. Alternatively, the same values of i and j may be set for a subset of the different frame distances; for example, when the frame distance calculated by the frame distance calculating module 307 is 1 or 2, the presetting module 309 sets the corresponding value of i to 1 and the value of j to 3, and when the frame distance calculated by the frame distance calculating module 307 is 3 or 4, the presetting module 309 sets the corresponding value of i to 2 and the value of j to 2. The correspondence is configured according to actual performance and is not limited here.
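A preset correspondence of this kind is naturally a lookup table. The concrete numbers below follow the examples in the description; the table name, the default pair, and the fallback behaviour for unlisted distances are illustrative assumptions.

```python
# Hypothetical preset table mapping frame distance -> (i, j), following the
# worked examples in the description (distances 3 and 4 share one pair).
FRAME_DISTANCE_TO_IJ = {1: (1, 3), 2: (1, 3), 3: (2, 2), 4: (2, 2)}

def lookup_i_j(frame_distance, default=(2, 2)):
    """Return the preset (i, j) for a frame distance, falling back to a
    default pair for distances that were not explicitly configured."""
    return FRAME_DISTANCE_TO_IJ.get(frame_distance, default)
```

Because the table is fixed ahead of time, this variant avoids recomputing the layer-distance relation for every frame and can be tuned per device according to measured performance.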
a display module 310, adapted to display the processed video data in real time.
After the display module 310 obtains the processed current frame image of the video data, it can display it in real time, so that the user can directly see the display effect of the processed current frame image of the video data.
an uploading module 311, adapted to upload the processed video data to a cloud server.
The uploading module 311 can directly upload the processed video data to a cloud server. Specifically, the uploading module 311 can upload the processed video data to one or more cloud video platform servers, such as the cloud video platform servers of iQIYI, Youku or Kuai Video, so that the cloud video platform server displays the video data on the cloud video platform. Alternatively, the uploading module 311 can upload the processed video data to a cloud live-streaming server; when a live viewing user enters the cloud live-streaming server to watch, the cloud live-streaming server can push the video data in real time to the viewing user's client. Alternatively, the uploading module 311 can upload the processed video data to a cloud public account server; when a user follows the public account, the cloud public account server pushes the video data to the client following the public account. Further, the cloud public account server can push video data matching the viewing habits of the users following the public account to their clients.
According to the video data real-time processing apparatus provided by the present invention, after the current frame image is acquired, it is judged. If the current frame image is the 1st frame image in any group, the current frame image is input into the neural network obtained by training, and the processed current frame image is obtained after the operations of all convolutional layers and deconvolution layers of the neural network. If the current frame image is not the 1st frame image in any group, the frame distance between the current frame image and the 1st frame image of the group to which it belongs is calculated. According to the frame distance, the value of i for the i-th convolutional layer of the neural network is determined, and the operation result of the i-th convolutional layer is obtained. At the same time, the value of j for the j-th deconvolution layer of the neural network is determined, so that the operation result of the j-th deconvolution layer, obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, can be acquired directly, reusing the operation result of the j-th deconvolution layer. Image fusion is performed on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolution layer to obtain the processed current frame image, which reduces the frequency of neural network operations and improves operation efficiency. With the present invention, the processed video data can be obtained directly, and the processed video data can also be uploaded directly to a cloud server, without the user having to process the video data further, saving the user's time; and the processed video data can be displayed to the user in real time, making it convenient for the user to check the display effect.
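The grouped processing flow summarized above can be sketched as the following loop. The network interface names (`full_forward`, `conv_up_to`, `determine_i_j`, `fuse`) are hypothetical stand-ins for the convolution, deconvolution and fusion stages, not names fixed by the patent.

```python
def process_video(frames, group_size, net):
    """Process a sequence of frames in groups: the 1st frame of each
    group runs the full network and its deconvolution results are
    cached; later frames run only the first i conv layers and fuse
    with the cached j-th deconvolution result."""
    cached_deconv = None  # deconv results of the current group's 1st frame
    outputs = []
    for idx, frame in enumerate(frames):
        if idx % group_size == 0:
            # 1st frame of a group: full forward pass, cache deconv results.
            out, cached_deconv = net.full_forward(frame)
        else:
            frame_distance = idx % group_size
            i, j = net.determine_i_j(frame_distance)
            conv_result = net.conv_up_to(frame, i)
            # Reuse the cached j-th deconv result instead of recomputing it.
            out = net.fuse(conv_result, cached_deconv[j])
        outputs.append(out)
    return outputs
```

The saving comes from the `else` branch: for every frame that is not a group leader, the network runs only i convolutional layers and a fusion step rather than the full convolution-deconvolution stack.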
The present invention further provides a non-volatile computer storage medium. The computer storage medium stores at least one executable instruction, and the computer executable instruction can perform the video data real-time processing method in any of the above method embodiments.
Fig. 5 shows a schematic structural diagram of a computing device according to an embodiment of the present invention. The specific embodiments of the present invention do not limit the specific implementation of the computing device.
As shown in Fig. 5, the computing device may include: a processor 502, a communications interface 504, a memory 506 and a communication bus 508.
Wherein:
The processor 502, the communication interface 504 and the memory 506 communicate with each other through the communication bus 508.
The communication interface 504 is used for communicating with network elements of other devices, such as clients or other servers.
The processor 502 is used for executing a program 510, and can specifically perform the relevant steps in the above embodiments of the video data real-time processing method.
Specifically, the program 510 can include program code, and the program code includes computer operation instructions.
The processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The one or more processors included in the computing device may be processors of the same type, such as one or more CPUs, or may be processors of different types, such as one or more CPUs and one or more ASICs.
The memory 506 is used for storing the program 510. The memory 506 may include a high-speed RAM memory, and may further include a non-volatile memory, for example at least one disk memory.
The program 510 can specifically be used to cause the processor 502 to perform the video data real-time processing method in any of the above method embodiments. For the specific implementation of each step in the program 510, reference may be made to the corresponding description of the corresponding steps and units in the above video data real-time processing embodiments, which is not repeated here. Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the devices and modules described above, reference may be made to the corresponding process descriptions in the foregoing method embodiments, and details are not described herein again.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teachings herein. From the above description, the structure required to construct such a system is obvious. In addition, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein can be implemented using various programming languages, and the above description of a specific language is intended to disclose the best mode of carrying out the invention.
In the specification provided here, numerous specific details are set forth. However, it should be understood that the embodiments of the present invention can be practised without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the present invention, the features of the present invention are sometimes grouped together into a single embodiment, figure or description thereof. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art can understand that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of an embodiment can be combined into one module or unit or component, and can moreover be divided into a plurality of submodules or subunits or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
Furthermore, those skilled in the art can understand that, although some embodiments described herein include some features included in other embodiments rather than other features, combinations of features of different embodiments are meant to be within the scope of the present invention and to form different embodiments. For example, in the following claims, any one of the claimed embodiments can be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the video data real-time processing device according to the embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, or provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present invention, and those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claims. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices can be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order. These words can be interpreted as names.

Claims (10)

1. A video data real-time processing method, the method performing grouped processing on the frame images of the video data, comprising:
acquiring, in real time, a current frame image in a video being captured and/or recorded by an image capture device; or acquiring, in real time, a current frame image in a currently played video;
judging whether the current frame image is the 1st frame image of any group;
if so, inputting the current frame image into a neural network obtained by training, and obtaining the processed current frame image after the operations of all convolutional layers and deconvolution layers of the neural network;
if not, inputting the current frame image into the neural network obtained by training; after operating up to the i-th convolutional layer of the neural network and obtaining the operation result of the i-th convolutional layer, acquiring the operation result of the j-th deconvolution layer obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, and directly performing image fusion on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolution layer to obtain the processed current frame image; wherein i and j are natural numbers;
outputting the processed current frame image;
repeating the above steps until the processing of all frame images in the video data is completed.
2. The method according to claim 1, wherein, after judging that the current frame image is not the 1st frame image of any group, the method further comprises:
calculating the frame distance between the current frame image and the 1st frame image of the group to which it belongs;
determining the values of i and j according to the frame distance; wherein the layer distance between the i-th convolutional layer and the last convolutional layer is inversely proportional to the frame distance, and the layer distance between the j-th deconvolution layer and the output layer is proportional to the frame distance.
3. The method according to claim 1 or 2, wherein the method further comprises: presetting the correspondence between the frame distance and the values of i and j.
4. The method according to any one of claims 1-3, wherein, after directly performing image fusion on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolution layer, the method further comprises:
if the j-th deconvolution layer is the last deconvolution layer of the neural network, inputting the image fusion result to the output layer to obtain the processed current frame image;
if the j-th deconvolution layer is not the last deconvolution layer of the neural network, inputting the image fusion result to the (j+1)-th deconvolution layer, and obtaining the processed current frame image through the operations of the subsequent deconvolution layers and the output layer.
5. The method according to any one of claims 1-4, wherein inputting the current frame image into the neural network obtained by training and obtaining the processed current frame image after the operations of all convolutional layers and deconvolution layers of the neural network further comprises: after the operation of each convolutional layer preceding the last convolutional layer of the neural network, performing down-sampling on the operation result of that convolutional layer.
6. The method according to any one of claims 1-4, wherein, before operating up to the i-th convolutional layer of the neural network to obtain the operation result of the i-th convolutional layer, the method further comprises: after the operation of each convolutional layer preceding the i-th convolutional layer of the neural network, performing down-sampling on the operation result of that convolutional layer.
7. The method according to any one of claims 1-6, wherein each group of the video data comprises n frame images, where n is a fixed preset value.
8. A video data real-time processing apparatus, the apparatus performing grouped processing on the frame images of the video data, comprising:
an acquisition module, adapted to acquire, in real time, a current frame image in a video being captured and/or recorded by an image capture device, or to acquire, in real time, a current frame image in a currently played video;
a judgment module, adapted to judge whether the current frame image is the 1st frame image of any group, and if so, to execute a first processing module, otherwise to execute a second processing module;
the first processing module, adapted to input the current frame image into a neural network obtained by training, and to obtain the processed current frame image after the operations of all convolutional layers and deconvolution layers of the neural network;
the second processing module, adapted to input the current frame image into the neural network obtained by training; after operating up to the i-th convolutional layer of the neural network and obtaining the operation result of the i-th convolutional layer, to acquire the operation result of the j-th deconvolution layer obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, and to directly perform image fusion on the operation result of the i-th convolutional layer and the operation result of the j-th deconvolution layer to obtain the processed current frame image; wherein i and j are natural numbers;
an output module, adapted to output the processed current frame image;
a loop module, adapted to repeatedly run the above acquisition module, judgment module, first processing module, second processing module and/or output module until the processing of all frame images in the video data is completed.
9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, the processor, the memory and the communication interface communicating with each other through the communication bus;
the memory being used to store at least one executable instruction, the executable instruction causing the processor to perform operations corresponding to the video data real-time processing method according to any one of claims 1-7.
10. A computer storage medium, the storage medium storing at least one executable instruction, the executable instruction causing a processor to perform operations corresponding to the video data real-time processing method according to any one of claims 1-7.
CN201711405700.9A 2017-12-22 2017-12-22 Video data real-time processing method and device and computing equipment Active CN107993217B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711405700.9A CN107993217B (en) 2017-12-22 2017-12-22 Video data real-time processing method and device and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711405700.9A CN107993217B (en) 2017-12-22 2017-12-22 Video data real-time processing method and device and computing equipment

Publications (2)

Publication Number Publication Date
CN107993217A true CN107993217A (en) 2018-05-04
CN107993217B CN107993217B (en) 2021-04-09

Family

ID=62042310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711405700.9A Active CN107993217B (en) 2017-12-22 2017-12-22 Video data real-time processing method and device and computing equipment

Country Status (1)

Country Link
CN (1) CN107993217B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020134703A1 (en) * 2018-12-29 2020-07-02 北京灵汐科技有限公司 Neural network system-based image processing method and neural network system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866900A (en) * 2015-01-29 2015-08-26 北京工业大学 Deconvolution neural network training method
CN105550701A (en) * 2015-12-09 2016-05-04 福州华鹰重工机械有限公司 Real-time image extraction and recognition method and device
CN106372390A (en) * 2016-08-25 2017-02-01 姹ゅ钩 Deep convolutional neural network-based lung cancer preventing self-service health cloud service system
CN106934397A (en) * 2017-03-13 2017-07-07 北京市商汤科技开发有限公司 Image processing method, device and electronic equipment
US20170243053A1 (en) * 2016-02-18 2017-08-24 Pinscreen, Inc. Real-time facial segmentation and performance capture from rgb input
CN107122796A (en) * 2017-04-01 2017-09-01 中国科学院空间应用工程与技术中心 A kind of remote sensing image sorting technique based on multiple-limb network integration model
CN107239728A (en) * 2017-01-04 2017-10-10 北京深鉴智能科技有限公司 Unmanned plane interactive device and method based on deep learning Attitude estimation
CN107492068A (en) * 2017-09-28 2017-12-19 北京奇虎科技有限公司 Object video conversion real-time processing method, device and computing device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GUO ZIHAO: "Discussion on the Application of Neural Network Technology in Online Video Processing", PERSONAL COMPUTER *


Also Published As

Publication number Publication date
CN107993217B (en) 2021-04-09

Similar Documents

Publication Publication Date Title
JP6803899B2 (en) Image processing methods, image processing equipment and electronic devices
CN110008817B (en) Model training method, image processing method, device, electronic equipment and computer readable storage medium
CN108604369B (en) Method, device and equipment for removing image noise and convolutional neural network
CN108875486A (en) Recongnition of objects method, apparatus, system and computer-readable medium
CN107507155A (en) Video segmentation result edge optimization real-time processing method, device and computing device
CN107820027A Video character dress-up method and apparatus, computing device and computer storage medium
CN107277615A (en) Live stylized processing method, device, computing device and storage medium
WO2007071884A2 (en) Method for processing an object on a platform having one or more processors and memories, and platform using same
CN108040265A (en) A kind of method and apparatus handled video
CN112991231B (en) Single-image super-image and perception image enhancement joint task learning system
CN112906609B (en) Video important area prediction method and device based on two-way cross attention network
Dutta Depth-aware blending of smoothed images for bokeh effect generation
CN108133718A (en) A kind of method and apparatus handled video
CN107644423A (en) Video data real-time processing method, device and computing device based on scene cut
CN108564546B (en) Model training method and device and photographing terminal
CN107766803A Video character dress-up method and apparatus based on scene segmentation, and computing device
CN107993217A (en) Video data real-time processing method and device, computing device
Ou et al. Real-time tone mapping: A state of the art report
CN114581355A (en) Method, terminal and electronic device for reconstructing HDR image
EP0778544B1 (en) Data processing method in arrays in a motion estimation system
CN107493504A (en) Video data real-time processing method, device and computing device based on layering
CN113298740A (en) Image enhancement method and device, terminal equipment and storage medium
CN107563962A (en) Video data real-time processing method and device, computing device
CN108734712A (en) The method, apparatus and computer storage media of background segment
CN107578369A (en) Video data handling procedure and device, computing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant