CN108848384A - An efficient parallel transcoding method for multi-core platforms - Google Patents


Info

Publication number
CN108848384A
Authority
CN
China
Prior art keywords
thread
decoding
coding
stage
gop
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810628187.8A
Other languages
Chinese (zh)
Inventor
张为华
李弋
鲁云萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN201810628187.8A
Publication of CN108848384A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/40 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/177 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention belongs to the field of computer technology and specifically concerns an efficient parallel transcoding method for multi-core platforms. In the invention, video transcoding comprises two stages, decoding and encoding; function-level parallelism covers the two modules of decoding and encoding, and data-level parallelism covers GOP-level and frame-level parallelism. The system keeps a buffer that stores images in display order; an encoding thread takes a contiguous run of images (a coding unit) from it, encodes the run independently, and generates an intermediate temporary file. Finally, the temporary files are merged into the target video. After a video is input, threads are woken up and execute the transcoding task. During transcoding, a thread passes through four stages (slicing, decoding, encoding, and merging), and threads in different stages run in parallel as a pipeline. The result produced by each stage is supplied to the next stage and is managed by a dedicated data structure. While guaranteeing video quality, the invention makes full use of the computing resources of the underlying multi-core hardware to improve transcoding efficiency.

Description

An efficient parallel transcoding method for multi-core platforms
Technical field
The invention belongs to the field of computer technology and specifically relates to an efficient parallel transcoding method for multi-core platforms, which makes full use of the computing resources of the underlying multi-core hardware to improve transcoding efficiency while guaranteeing video quality.
Background art
With the rapid development of the Internet and multimedia, data volumes have begun to grow explosively, and digital video accounts for the largest share of the massive amount of data carried over the Internet every day. According to the network traffic report that Cisco published in 2017, total network traffic in 2016 was 1.15 ZB (1 ZB = 1024^3 TB), of which video accounted for 72%; by 2021, total traffic is forecast to reach 3.33 ZB, with the share of video rising to 82%.
The spread of digital video has enriched people's lives: people can watch video online on mobile phones, computers, and other devices. Playback, however, has to take into account compatibility issues such as resolution, bit rate, and coding format. For example, to play on devices with different screen sizes, a video must be scaled accordingly; to play over a poor network connection, the bit rate must be reduced; to play in a particular player, the coding format must be converted, for example to H.264 or MPEG-4. Video transcoding technology was developed to solve these problems.
To let users watch video in different environments, a service provider first converts its local video into videos of particular specifications and then delivers them to users over the network. Netflix, for example, transcodes one video into 120 target video files under different parameters before delivering them to users. A video generally has to be transcoded into multiple target video files for different resolutions, bit rates, and coding formats. Transcoding must also keep latency low; for example, 25 frames per second or more is needed to guarantee a good user experience. Because transcoding is itself a compute-intensive application, all of this poses a huge challenge to video service providers, and accelerating video transcoding is very necessary.
Transcoding improves the compatibility of video across scenarios: depending on the parameters, the same video can be converted into target videos in many formats. For ordinary users, transcoding an input video into a single target format is the common case, namely single-source, single-target transcoding. Video service providers need to convert a high-definition video into multiple target videos under different transcoding parameters (resolution, bit rate, coding format), namely single-source, multi-target transcoding. In either scenario, transcoding generally requires low latency to give a good user experience, for example at least 25 frames per second, and it usually also has to keep the quality of the target video constant.
Transcoding is computationally expensive: the transcoding frame rate of a single-core processor is usually below ten frames per second, which cannot meet users' needs. The emergence of multi-core technology offers an opportunity to accelerate transcoding, and parallel techniques have already been applied to it; they fall mainly into GOP-level (Group of Pictures) and frame-level parallelism. GOP-level parallelism scales well but is still immature and suffers from a decline in target video quality. Frame-level parallelism remains the mainstream scheme and is used by mainstream codecs such as FFmpeg and x264, but it scales poorly and cannot make full use of the computing resources of a multi-core platform.
The present invention analyzes the parallelism of GOP-based transcoding and designs an efficient parallel transcoding method that solves the problem of low computing-resource utilization.
Summary of the invention
The purpose of the present invention is to provide an efficient parallel transcoding method for multi-core platforms with high computing-resource utilization.
The efficient parallel transcoding method for multi-core platforms provided by the invention exploits the independence of encoding and decoding video GOPs to uncover the parallelism in video transcoding and, while guaranteeing video image quality, makes full use of the computing resources of the underlying multi-core hardware to accelerate transcoding.
Video transcoding comprises two stages, decoding and encoding. Function-level parallelism mainly covers the coarse-grained decoding and encoding modules; data-level parallelism mainly covers GOP-level and frame-level parallelism. GOP-level parallel transcoding requires the video to be cut into closed GOPs in advance; decoding threads obtain different closed GOPs and decode them into raw image sequences. The system keeps a buffer that stores images in display order; an encoding thread takes a contiguous run of images (a coding unit) from it, encodes the run independently, and generates an intermediate temporary file. Finally, the temporary files are merged into the target video.
Closed GOPs have no data dependence on one another, so this form of parallelism scales well. The present invention builds an efficient parallel transcoding system on closed GOPs.
The framework of the efficient parallel transcoding method for multi-core platforms provided by the invention is shown in Fig. 1. The invention manages computing resources with a thread pool: when there is no transcoding task, the threads sleep; after a video is input, the threads are woken up and execute the transcoding task. During transcoding, a thread passes through four stages (slicing, decoding, encoding, and merging), and threads in different stages run in parallel as a pipeline. Adjacent stages stand in a producer-consumer relationship: the result produced by one stage is supplied to the next stage and is managed by a dedicated data structure; for example, the closed-GOP queue stores the video section information produced by slicing.
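As an illustrative, non-limiting sketch, the producer-consumer relationship between the slicing stage and the decoding stage could look as follows in Python. The names slicer, decoder, run_pipeline, and SENTINEL are hypothetical, sections are modelled as (start, end) frame ranges, and real decoding is replaced by a stand-in; none of this is taken from the patent's implementation.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-slicing marker, one per consumer


def slicer(gop_ranges, gop_queue, n_decoders):
    """Producer: publish one (start, end) section per closed GOP."""
    for section in gop_ranges:
        gop_queue.put(section)
    for _ in range(n_decoders):
        gop_queue.put(SENTINEL)


def decoder(gop_queue, decoded, lock):
    """Consumer: take sections from the queue until the marker arrives."""
    while True:
        section = gop_queue.get()
        if section is SENTINEL:
            break
        start, end = section
        frames = list(range(start, end))  # stand-in for real decoding
        with lock:
            decoded[start] = frames


def run_pipeline(gop_ranges, n_decoders=3):
    gop_queue = queue.Queue()
    decoded, lock = {}, threading.Lock()
    workers = [
        threading.Thread(target=decoder, args=(gop_queue, decoded, lock))
        for _ in range(n_decoders)
    ]
    for w in workers:
        w.start()
    # Slicing runs while the decoders are already consuming sections.
    slicer(gop_ranges, gop_queue, n_decoders)
    for w in workers:
        w.join()
    # Reassemble frames in display order, keyed by section start.
    return [f for start in sorted(decoded) for f in decoded[start]]
```

The queue decouples the two stages, so decoding can begin as soon as the first section has been published rather than after slicing has finished.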
After the transcoding threads are woken up, they first enter the slicing stage. The thread in the slicing stage cuts the video, in units of closed GOPs, into independently decodable sections; the other threads enter the decoding stage immediately. The system uses a slicing-status flag to ensure that only one thread performs slicing; the implementation is described in detail in the part on inter-stage parallelism. The slicing thread puts the section information of each closed GOP into the closed-GOP queue, and the decoding threads take section information from that queue and decode it; the two can execute in parallel.
During decoding, a thread puts the raw images produced by decoding into a coding unit. As in existing parallel transcoding systems, a coding unit is a data structure that stores a contiguous run of raw images and is encoded as a whole once it is full. Because the intermediate data produced by decoding occupies a large amount of memory, the efficient parallel transcoding system manages the coding units uniformly with a ring queue. Decoding threads and encoding threads are scheduled dynamically according to the state of the ring queue, so that computing-resource utilization stays high throughout decoding and encoding.
After encoding the raw image frames in a coding unit, the encoding thread outputs that section as a temporary file. If a contiguous run of temporary files has been produced, the encoding thread merges those temporary files in advance, so that the final merge does not take too long. The invention uses a reorder table to record which temporary files have been generated, which helps the encoding threads merge in advance. After all encoding tasks are complete, the final merge is performed. Once the target file has been generated, the transcoding task ends; the threads are reclaimed by the thread pool and enter the sleeping state to wait for the next transcoding task.
In the present invention, parallel transcoding primarily means that the four stages of transcoding run in parallel as a pipeline: a stage does not have to finish completely before the next one starts; instead, adjacent stages can execute at the same time. Pipelined parallelism eliminates the waste of computing resources caused by serial slicing, by decoding threads sleeping frequently, and by serial merging.
First, the slicing stage can execute in parallel with decoding. In the efficient parallel transcoding system, all threads execute the transcoding task through the same entry point. However, slicing is not suited to parallel execution, so to ensure that only one thread slices the video, the system maintains a slicing-status flag. The flag has three states, not yet sliced, slicing in progress, and slicing complete, denoted c0, c1, and c2 respectively, as shown in Fig. 5.
The flag is initialized to c0 when the transcoding task starts. If a thread reads the flag and finds it to be c0, the thread sets it to c1 and performs the slicing task. When other threads read the flag, it has already been set to c1 or c2, so they enter the decoding stage directly, as shown in Fig. 4. Reads and modifications of the slicing-status flag must be protected by a lock to guarantee that the read-and-update is atomic. After each thread has determined its task, the slicing thread scans the video and puts the section information of each closed GOP into the closed-GOP queue, while the decoding threads take closed-GOP information from the queue and decode the video section it describes. The slicing thread can therefore execute in parallel with the decoding threads. When the slicing thread finishes, it sets the slicing status to c2 and enters the decoding stage.
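By way of example only, the lock-protected slicing-status flag could be sketched as below. The state names CLIP_NOT_YET, CLIPPING, and CLIP_FIN are those given later in the embodiment for c0, c1, and c2; the ClipStatus class and the launch and run helpers are hypothetical and stand in for the shared thread entry point.

```python
import threading

CLIP_NOT_YET, CLIPPING, CLIP_FIN = 0, 1, 2  # c0, c1, c2


class ClipStatus:
    def __init__(self):
        self._state = CLIP_NOT_YET
        self._lock = threading.Lock()

    def try_claim(self):
        """Return True for exactly one caller, which becomes the slicer."""
        with self._lock:  # read and update happen under one lock
            if self._state == CLIP_NOT_YET:
                self._state = CLIPPING
                return True
            return False

    def finish(self):
        with self._lock:
            self._state = CLIP_FIN

    def state(self):
        with self._lock:
            return self._state


def launch(status, roles, roles_lock):
    """Shared entry point: slice if the flag is still c0, else decode."""
    role = "slice" if status.try_claim() else "decode"
    if role == "slice":
        status.finish()  # a real slicer would scan the video first
    with roles_lock:
        roles.append(role)


def run(n_threads=8):
    status, roles, roles_lock = ClipStatus(), [], threading.Lock()
    threads = [
        threading.Thread(target=launch, args=(status, roles, roles_lock))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return roles, status.state()
```

Holding the lock across both the comparison and the assignment is what prevents two threads from observing c0 at the same time.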
Decoding and encoding are scheduled dynamically according to the state of the ring queue of coding units. Because the raw image data produced by decoding occupies a large amount of memory, raw images are stored uniformly in the ring queue of coding units. A coding unit has two states, idle and saturated, as shown in Fig. 6. The idle state means that the coding unit stores no raw images; the saturated state means that the coding unit has been filled with raw images. The state of a coding unit switches dynamically during decoding and encoding: after a decoding thread fills a coding unit, the unit is set to saturated; after the encoding task for a coding unit finishes, the unit is set back to idle.
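A minimal sketch of a ring of coding units with the idle and saturated states is given below for illustration. The class CodingUnitRing and its methods are hypothetical; the sketch is single-threaded, and a real system would need to guard the ring with a lock.

```python
IDLE, SATURATED = "idle", "saturated"


class CodingUnitRing:
    def __init__(self, n_units):
        self.units = [None] * n_units
        self.states = [IDLE] * n_units
        self.head = 0  # next unit to encode
        self.tail = 0  # next unit to fill

    def try_fill(self, frames):
        """Decoder side: store a full run of frames; False if the ring is full."""
        if self.states[self.tail] != IDLE:
            return False
        self.units[self.tail] = list(frames)
        self.states[self.tail] = SATURATED
        self.tail = (self.tail + 1) % len(self.units)
        return True

    def try_drain(self):
        """Encoder side: take the oldest saturated unit; None if the ring is empty."""
        if self.states[self.head] != SATURATED:
            return None
        frames = self.units[self.head]
        self.units[self.head] = None
        self.states[self.head] = IDLE
        self.head = (self.head + 1) % len(self.units)
        return frames
```

A failed try_fill corresponds to decoding running ahead of encoding, and a failed try_drain to the ring being empty; these are the two conditions on which the dynamic scheduling decisions are based.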
In the present invention, parallel transcoding also includes parallelism within each stage of transcoding. A video consists of a series of video frames and can be divided into multiple closed GOPs. Closed GOPs decode independently of one another and can therefore be decoded in parallel. The decoded images are divided, in display order, into different image sequences, which can be encoded in parallel. Finally, adjacent encoded sequences produced by encoding can be merged in parallel using a merge procedure.
Slicing stage: If the video has no frame index, slicing must scan the video sequentially and divide it into closed GOPs. If the video file contains a frame index, the frame index data is read, the video is split into several sections of similar duration according to the desired number of cuts, and the cut positions are sent to the next stage, decoding.
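For illustration, one possible way to choose cut positions from a frame index is sketched below. It assumes that the index provides a sorted list of keyframe timestamps and that cuts must fall on keyframes; both the function pick_cut_points and the nearest-keyframe rule are assumptions made for this sketch, not details stated in the specification.

```python
import bisect


def pick_cut_points(keyframe_pts, duration, n_segments):
    """Choose keyframe timestamps closest to equal-duration boundaries."""
    cuts = [keyframe_pts[0]]
    for k in range(1, n_segments):
        target = duration * k / n_segments
        i = bisect.bisect_left(keyframe_pts, target)
        # Compare the keyframe just before and just after the target.
        candidates = keyframe_pts[max(i - 1, 0):i + 1]
        best = min(candidates, key=lambda t: abs(t - target))
        if best > cuts[-1]:  # skip duplicates when keyframes are sparse
            cuts.append(best)
    return cuts
```

With keyframes every two seconds in a sixteen-second video and four requested sections, the cut positions land on 0, 4, 8, and 12 seconds.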
Decoding stage: To prevent data dependence between decoding threads, the invention uses the closed GOP as the unit of cutting, and adjacent closed GOPs overlap by one I frame: the last frame of one closed GOP and the first frame of the next closed GOP are the same frame, as shown in Fig. 2. After a closed GOP has been decoded completely, its final I frame is discarded, so that each overlapping I frame is kept only by the later decoded GOP. The I frame at the end of the last decoded GOP does not overlap any other decoded GOP, so that frame must be kept.
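The handling of the overlapping I frame can be sketched as follows, with frames modelled as plain labels. The function name drop_overlapping_i_frames is hypothetical.

```python
def drop_overlapping_i_frames(gops):
    """gops: list of decoded frame lists; each GOP except the last ends
    with the I frame that also starts the next GOP."""
    out = []
    for i, frames in enumerate(gops):
        is_last = i == len(gops) - 1
        # Keep the trailing I frame only for the final GOP.
        out.extend(frames if is_last else frames[:-1])
    return out
```

Each shared I frame then appears exactly once in the output, contributed by the later of the two GOPs that contain it.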
Merging stage: Parallel encoding generates many temporary files, which the invention merges using a merge procedure. To reduce the number of disk reads and writes, the temporary files are merged in two levels, as shown in Fig. 3. A level-1 temporary file is a temporary file generated from a coding unit; a level-2 temporary file is the file obtained after one merge.
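As a simplified illustration of two-level merging, the sketch below models temporary files as in-memory byte strings and merging as plain concatenation. The function two_level_merge and the fixed group_size are assumptions; merging real encoded video would go through a container muxer rather than byte concatenation.

```python
def two_level_merge(level1, group_size):
    """Merge level-1 chunks into level-2 chunks, then into the target."""
    level2 = [
        b"".join(level1[i:i + group_size])
        for i in range(0, len(level1), group_size)
    ]
    return level2, b"".join(level2)
```

The final pass then touches only the level-2 chunks, which are fewer in number than the level-1 chunks.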
The present invention further includes parallel bit-rate allocation, which assists bit-rate allocation in parallel encoding according to the bit-rate distribution of the input video. Specifically, it uses an algorithm that allocates bit rate according to SATD (Sum of Absolute Transformed Differences, the sum of the absolute values of the Hadamard-transformed 4x4 prediction residual).
Because SATD operates on half-precision residual data, computing SATD during encoding would first require the complex prediction process to be completed, which would incur a large performance cost. Since decoding already produces the residual after inverse quantization and inverse transformation, the invention places the SATD computation in the decoding stage, where only a simple Hadamard transform and a sum of absolute values need to be computed and the complex prediction process is avoided. The bit-rate allocation process based on SATD is shown in Fig. 7: the invention first decodes the input video; it then computes the SATD value of every frame and uses it as the measure of complexity; next, it allocates a corresponding bit rate to each coding unit according to the encoding requirements; finally, it re-encodes the video frames at the allocated bit rate.
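For reference, the Hadamard transform and absolute-value sum for one 4x4 residual block can be sketched as below. The helper names are hypothetical, and the result is left unnormalized; practical encoders often apply a scaling factor, which is omitted here.

```python
# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def satd4x4(residual):
    """Sum of absolute values of H4 * R * H4 for one 4x4 residual block."""
    transformed = matmul4(matmul4(H4, residual), H4)
    return sum(abs(v) for row in transformed for v in row)
```

A per-frame SATD value is then the sum of satd4x4 over all 4x4 residual blocks of the frame.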
With the bit-rate allocation method proposed by the invention, SATD is computed during decoding. Because decoded images occupy a large amount of memory, the video cannot be decoded completely before re-encoding, so the SATD distribution of the entire video cannot be computed before encoding starts. Because encoding starts once the ring queue has been filled, the invention initially sets the average bit rate over the section held in the ring queue to the target bit rate and then allocates bit rate according to the SATD distribution. As encoding proceeds, the number of encoded coding units grows and the SATD distribution curve becomes more complete; the bit-rate allocation algorithm only needs to ensure that the average bit rate at the time of encoding meets the target.
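One simple allocation rule consistent with this description is sketched below: the bit-rate budget for the coding units currently in the ring queue is split in proportion to each unit's SATD, so that the average over those units equals the target. The linear proportional rule, the assumption that all units have equal duration, and the name allocate_bitrate are illustrative choices; the specification does not fix the exact formula.

```python
def allocate_bitrate(unit_satd, target_avg_kbps):
    """Split the budget in proportion to SATD; the mean equals the target."""
    total = sum(unit_satd)
    if total == 0:
        # No complexity information: give every unit the target rate.
        return [target_avg_kbps] * len(unit_satd)
    budget = target_avg_kbps * len(unit_satd)
    return [budget * s / total for s in unit_satd]
```

For two units with SATD sums of 100 and 300 and a target of 1000 kbit/s, the more complex unit receives three times the rate of the simpler one while the average stays at the target.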
Brief description of the drawings
Fig. 1 shows the efficient parallel transcoding framework.
Fig. 2 shows closed GOPs.
Fig. 3 shows video merging.
Fig. 4 is a schematic of slicing in parallel with decoding.
Fig. 5 is a table mapping slicing states to their symbols.
Fig. 6 is the state transition diagram of a coding unit.
Fig. 7 is a schematic of parallel bit-rate allocation based on SATD.
Fig. 8 shows the parallel implementation of slicing and decoding.
Fig. 9 shows the dynamic scheduling flow of decoding and encoding.
Fig. 10 shows the flow of merging encoded files.
Fig. 11 shows the flow of allocation according to the SATD distribution.
Detailed description of embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, preferred implementations of the invention are described in detail below with reference to the accompanying drawings and embodiments. It should first be noted that the terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings; they should be construed with the meanings and concepts that accord with the technical idea of the present invention, on the principle that an inventor may suitably define the concepts of terms in order to describe the invention in the best way. Accordingly, the structures shown in the embodiments and drawings described in this specification are only one preferred embodiment and do not fully represent the technical idea of the invention; it should therefore be understood that various equivalents and variations capable of replacing them may exist.
FFmpeg is a powerful open-source multimedia processing framework that can convert and edit many kinds of audio and video files and is very widely used. Here we describe how to implement an efficient parallel transcoding system based on the FFmpeg codec framework and the POSIX multithreaded programming model.
1. Parallel transcoding
Within the FFmpeg framework, we implement four functions, clip_video, decode_closed_gop, scale_and_encode, and concatenate, which correspond to the four stages of parallel transcoding: slicing, decoding, encoding, and merging. In addition, we implement the launch_transcoding function as the entry point that a thread executes after being woken up; each of the four functions can be called by setting the trans_ctx parameter.
To run slicing and decoding in parallel, a variable clip_status is defined in trans_ctx to record the slicing status; the variable has three states, CLIP_NOT_YET, CLIPPING, and CLIP_FIN. Because reads and writes of this variable must be atomic, it is protected by a mutex. The execution flow after a thread enters launch_transcoding is shown in Fig. 8: if, after acquiring the lock, the thread reads clip_status as CLIP_NOT_YET, it calls the clip_video function to perform slicing and, once slicing is complete, calls decode_closed_gop to decode. Threads that do not need to perform slicing proceed directly to decoding. In the implementation of clip_video, we use the av_seek_frame function provided by FFmpeg to locate I frames, record the pts (Presentation Time Stamp) obtained after decoding each I frame, and set clip_status to CLIP_FIN once slicing is complete.
After a thread enters the decoding stage, it traverses the closed-GOP queue and decodes the sections, as shown in Fig. 9. If a closed GOP is obtained successfully and there are enough idle coding units, the thread traverses the closed GOP and calls the avcodec_decode_video2 function to decode it. If no closed GOP can be obtained and slicing has already finished, no new closed GOPs will be produced, so the thread enters the encoding stage; this is the first entry point through which decoding is scheduled to encoding. If a closed GOP is obtained successfully but there are not enough idle coding units, decoding is running too fast, so the thread enters the encoding stage; this is the second entry point through which decoding is scheduled to encoding.
After a thread enters the encoding stage, it traverses the saturated coding units and calls avcodec_encode_video2 to encode them. In the multi-target case, an additional inner loop traverses the multiple encoding contexts so that each frame is encoded under every set of output parameters. If no saturated coding unit can be obtained, there are two cases: either decoding has finished and no new saturated coding units will be produced, in which case the thread enters the merging stage; or decoding has not finished and the ring queue is merely empty, in which case the encoding thread is rescheduled into the decoding stage.
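The scheduling rules of the two preceding paragraphs can be summarized, purely as a sketch, by two decision functions. The function names and the string labels are hypothetical. The "wait" branch covers the case in which no closed GOP is available while slicing is still running, which the description does not spell out; retrying the queue is an assumption.

```python
def next_step_in_decode(got_gop, slicing_done, has_idle_unit):
    """Decide what a thread in the decoding stage does next."""
    if got_gop and has_idle_unit:
        return "decode"
    if got_gop:
        return "encode"  # second entry: decoding is ahead of encoding
    if slicing_done:
        return "encode"  # first entry: no more closed GOPs will arrive
    return "wait"  # assumption: retry the closed-GOP queue later


def next_step_in_encode(got_saturated_unit, decoding_done):
    """Decide what a thread in the encoding stage does next."""
    if got_saturated_unit:
        return "encode"
    # The ring queue is empty: merge if decoding is over, else decode.
    return "merge" if decoding_done else "decode"
```

Because every thread evaluates these rules itself, no dedicated scheduler thread is needed to balance decoding against encoding.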
After a coding unit has been written out as a temporary file, the thread does not fetch the next saturated coding unit immediately; instead, it checks whether a reordered run of temporary files can be merged in advance. As shown in Fig. 10, after encoding produces a temporary file, the thread enters the merging stage and marks the corresponding entry in the reorder table as completed; if every entry in the reorder sub-section that contains it is marked, those temporary files are merged in advance into a level-2 temporary file. Finally, the level-2 temporary files are merged into the target video.
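A reorder table of this kind might be sketched as follows. The class ReorderTable, the fixed sub-section length sub_len, and the return convention are assumptions made for illustration; the sketch is not thread-safe, and in a multithreaded system mark_done would run under a lock.

```python
class ReorderTable:
    def __init__(self, n_units, sub_len):
        self.done = [False] * n_units  # one flag per level-1 temporary file
        self.sub_len = sub_len         # temporary files per sub-section
        self.merged = set()            # sub-sections already merged

    def mark_done(self, unit_index):
        """Mark one temporary file as completed and return the indices of
        its sub-section if the whole sub-section is now ready to merge."""
        self.done[unit_index] = True
        sub = unit_index // self.sub_len
        lo = sub * self.sub_len
        hi = min(lo + self.sub_len, len(self.done))
        if sub not in self.merged and all(self.done[lo:hi]):
            self.merged.add(sub)
            return list(range(lo, hi))
        return None
```

The thread that completes the last missing entry of a sub-section is the one that performs the early merge, so coding units may finish in any order.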
2. Implementation of SATD
The SATD of each frame is computed in the decoding stage; the SATD values are then summed per coding unit, and each unit's share of the video's total SATD is analyzed to allocate bit rate. Although the total SATD of the input video cannot be obtained in advance, a thread starts encoding only after the ring queue has been filled, so sufficient distribution information is still available. Moreover, as transcoding proceeds, more and more SATD information is obtained, so bit rate can be allocated to the coding units reasonably.
The flow of allocation according to the SATD distribution is shown in Fig. 11. In H.264 decoding, SATD is computed by applying the Hadamard transform to the prediction residual and then summing the absolute values. FFmpeg calls the codec library libavcodec, in which the function that performs the inverse transform to generate macroblock residuals stores the residual in the residual array of the H264Context structure. Therefore, implementing SATD only requires applying the Hadamard transform to the residual array of the H264Context structure.

Claims (5)

1. An efficient parallel transcoding method for multi-core platforms, characterized in that video transcoding comprises two stages, decoding and encoding; function-level parallelism comprises the two modules of decoding and encoding, and data-level parallelism comprises GOP-level and frame-level parallelism; GOP-level parallel transcoding requires the video to be cut into closed GOPs in advance, and decoding threads obtain different closed GOPs and decode them into raw image sequences; the system is provided with a buffer that stores images in display order, from which an encoding thread takes a contiguous run of images, namely a coding unit, encodes it independently, and generates an intermediate temporary file; finally, the temporary files are merged into the target video.
2. The efficient parallel transcoding method for multi-core platforms according to claim 1, characterized in that, after a video is input, threads are woken up to execute the transcoding task; during transcoding, a thread passes through four stages: slicing, decoding, encoding, and merging; threads in different stages run in parallel in a pipelined fashion; the result produced by one stage is supplied to the next stage and is managed by a dedicated data structure;
After the transcoding threads are woken up, they first enter the slicing stage; the thread in the slicing stage cuts the video, in units of closed GOPs, into independently decodable segments, while the other threads immediately enter the decoding stage; the system uses a slicing-state flag to ensure that only one thread performs slicing; the slicing thread puts the segment information, expressed as closed GOPs, into a closed-GOP queue, and the decoding threads fetch segment information from that queue and decode it, the two running in parallel;
During decoding, a thread puts the raw images produced by decoding into a coding unit; a coding unit is a data structure that stores a contiguous run of raw images and, once filled, is encoded as a whole; the system manages the coding units uniformly with a circular queue; decoding threads and encoding threads are scheduled dynamically according to the state of the circular queue, so that high utilization of computing resources is maintained throughout decoding and encoding;
After encoding the raw image frames in a coding unit, the encoding thread writes that section out as a temporary file; if a contiguous run of temporary files has been generated, the encoding thread merges them in advance; a reorder table records which temporary files have been generated, to help the encoding threads pre-merge; after all encoding tasks are complete, a final merge is performed; once the target file has been generated, the transcoding task ends, and the threads are reclaimed by the thread pool and enter a dormant state to wait for the next transcoding task.
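By way of non-limiting illustration, the slicing-state flag and the closed-GOP queue of this claim can be sketched as follows. The three state names (not sliced, slicing, sliced), the classes `SliceState` and `SliceFlag`, and the functions `transcode_worker` and `run` are hypothetical names introduced for this sketch; appending a segment to a list stands in for actually decoding it, and polling with a short sleep is a simplification.

```python
import queue
import threading
import time
from enum import Enum


class SliceState(Enum):
    NOT_SLICED = 0   # no thread has started slicing
    SLICING = 1      # one thread is slicing
    SLICED = 2       # slicing has finished


class SliceFlag:
    """Lock-protected flag that lets exactly one thread become the slicer."""

    def __init__(self):
        self._state = SliceState.NOT_SLICED
        self._lock = threading.Lock()

    def try_claim(self):
        """Return True for exactly one caller; every other caller gets False."""
        with self._lock:
            if self._state is SliceState.NOT_SLICED:
                self._state = SliceState.SLICING
                return True
            return False

    def mark_sliced(self):
        with self._lock:
            self._state = SliceState.SLICED

    def is_sliced(self):
        with self._lock:
            return self._state is SliceState.SLICED


def transcode_worker(flag, segments, gop_queue, decoded):
    """Common entry point: one thread slices, all threads decode."""
    if flag.try_claim():
        for seg in segments:           # stands in for scanning the video
            gop_queue.put(seg)
        flag.mark_sliced()
    while True:                        # decoding stage
        try:
            seg = gop_queue.get_nowait()
        except queue.Empty:
            if flag.is_sliced():
                break                  # nothing left and no more will arrive
            time.sleep(0.001)          # slicer still running: wait for segments
            continue
        decoded.append(seg)            # stands in for decoding the segment


def run(segments, n_threads=4):
    flag, gop_queue, decoded = SliceFlag(), queue.Queue(), []
    threads = [threading.Thread(target=transcode_worker,
                                args=(flag, segments, gop_queue, decoded))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(decoded)
```

Because the slicer enters the decoding loop only after marking the flag as sliced, it drains whatever the other threads left in the queue, so every segment is decoded exactly once in this sketch.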
3. The efficient parallel transcoding method for multi-core platforms according to claim 2, characterized in that the four stages of transcoding run in parallel as a pipeline, i.e. adjacent stages execute at the same time;
First, the slicing stage can execute in parallel with decoding; in the system, all threads execute the transcoding task through the same entry point; however, because the slicing stage is not suited to parallel execution, and in order to ensure that only one thread slices the video, the system maintains a slicing-state flag with three states, not sliced, slicing in progress, and slicing complete, denoted c_0, c_1, and c_2 respectively;
When the transcoding task starts, the flag is initialized to c_0; when a thread reads the flag and finds it to be c_0, that thread sets it to c_1 and executes the slicing task; when other threads read the flag, it has already been set to c_1 or c_2, so they enter the decoding stage directly; threads protect reads and modifications of the slicing-state flag with a lock; once each thread has determined its own task, the slicing thread scans the video and puts the segment information of each closed GOP into the closed-GOP queue, while the decoding threads fetch closed-GOP information from that queue and decode the video segments it describes; that is, the slicing thread and the decoding threads execute in parallel; when the slicing thread finishes, it sets the slicing state to c_2 and enters the decoding stage;
The decoding stage and the encoding stage are scheduled dynamically according to the state of the circular coding-unit queue; because the raw image data produced by decoding occupies a large amount of memory, the raw images are stored uniformly in the circular coding-unit queue; a coding unit has two states, idle and saturated; the idle state means that the coding unit stores no raw images; the saturated state means that the coding unit has been filled with raw images; the state of a coding unit switches dynamically during decoding and encoding: after a decoding thread fills a coding unit, the unit is set to the saturated state; after the encoding task for the coding unit ends, the unit is set to the idle state.
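By way of non-limiting illustration, the circular coding-unit queue and its idle/saturated states can be sketched as follows. The classes `CodingUnit` and `CodingUnitRing`, the extra `claimed` marker, and the scheduling rule in `next_role` (encode whenever a saturated unit is waiting, otherwise decode) are assumptions of this sketch; the claim states only that scheduling depends on the queue state. The sketch is single-threaded, and a multi-threaded version would guard every method with a lock.

```python
class CodingUnit:
    """A run of consecutive raw frames that is encoded as a whole."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = []
        self.saturated = False  # False: idle or still being filled; True: full
        self.claimed = False    # assumption: an encoder has already taken this unit


class CodingUnitRing:
    """Fixed number of coding units reused in circular order."""

    def __init__(self, n_units, capacity):
        self.units = [CodingUnit(capacity) for _ in range(n_units)]
        self.fill_pos = 0     # unit that decoders are currently filling
        self.encode_pos = 0   # next unit to hand to an encoder

    def put_frame(self, frame):
        """Store one decoded frame; return False if the ring has no free unit."""
        unit = self.units[self.fill_pos]
        if unit.saturated:
            return False
        unit.frames.append(frame)
        if len(unit.frames) == unit.capacity:
            unit.saturated = True
            self.fill_pos = (self.fill_pos + 1) % len(self.units)
        return True

    def take_saturated(self):
        """Hand the next saturated unit to an encoder, or None if none is ready."""
        unit = self.units[self.encode_pos]
        if not unit.saturated or unit.claimed:
            return None
        unit.claimed = True
        self.encode_pos = (self.encode_pos + 1) % len(self.units)
        return unit

    def release(self, unit):
        """Encoding finished: the unit becomes idle and can be refilled."""
        unit.frames = []
        unit.saturated = False
        unit.claimed = False

    def next_role(self):
        """Assumed rule: encode if a saturated unit is waiting, else decode."""
        unit = self.units[self.encode_pos]
        return "encode" if unit.saturated and not unit.claimed else "decode"
```

Bounding the number of units bounds the memory held by raw frames: when every unit is saturated, `put_frame` refuses new frames until an encoder releases one.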
4. The efficient parallel transcoding method for multi-core platforms according to claim 3, characterized in that it further comprises parallelism within each stage of transcoding;
Slicing stage: if the video has no frame index, slicing must scan through the video and divide it into closed GOPs; if the video file contains a frame index, the frame index data of the video is read, the video is split, according to the required number of segments, into several segments of similar duration, and the cut positions are sent to the next stage, decoding;
Decoding stage: closed GOPs are used as the cutting unit, and adjacent closed GOPs overlap by one I frame, i.e. the last frame of one closed GOP and the first frame of the next closed GOP are the same frame; after a closed GOP has been fully decoded, its last I frame is discarded, so that the overlapping I frame is kept only by the following decoded GOP; the I frame at the end of the last decoded GOP does not overlap with any other decoded GOP, so that frame must be kept;
Merging stage: parallel encoding generates many temporary files, which are combined by merging; to reduce the number of disk reads and writes, the temporary files are merged in two levels; a level-1 temporary file is a temporary file generated from a coding unit; a level-2 temporary file is a file that has been merged once.
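By way of non-limiting illustration, the two-level merge of the merging stage, driven by the reorder table mentioned in claim 2, can be sketched as follows. The class `TwoLevelMerger`, the `batch_size` parameter (how many level-1 files form one level-2 file), and the use of in-memory byte strings in place of temporary files are assumptions of this sketch; real container files generally need to be remuxed rather than concatenated byte for byte.

```python
class TwoLevelMerger:
    """Pre-merge contiguous level-1 outputs into level-2 chunks, then join them."""

    def __init__(self, batch_size):
        self.batch_size = batch_size  # assumed: level-1 files per level-2 file
        self.reorder = {}             # reorder table: unit index -> level-1 data
        self.next_index = 0           # first unit index not yet pre-merged
        self.level2 = []              # level-2 chunks, already in display order

    def add_level1(self, index, data):
        """Record a finished coding unit and pre-merge any complete contiguous batch."""
        self.reorder[index] = data
        while all(self.next_index + k in self.reorder
                  for k in range(self.batch_size)):
            batch = [self.reorder.pop(self.next_index + k)
                     for k in range(self.batch_size)]
            self.level2.append(b"".join(batch))
            self.next_index += self.batch_size

    def finish(self):
        """Final merge: level-2 chunks followed by any leftover level-1 data."""
        tail = [self.reorder.pop(i) for i in sorted(self.reorder)]
        return b"".join(self.level2 + tail)
```

For example, with `batch_size=2`, units finishing in the order 1, 0, 3, 4, 2 yield the level-2 chunks for units (0, 1) and (2, 3) before the final merge appends unit 4. Coding units may finish out of order, so the reorder table is what lets a pre-merge wait until a run is contiguous.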
5. The efficient parallel transcoding method for multi-core platforms according to claim 4, characterized in that it further comprises parallel bitrate allocation, i.e. the bitrate distribution of the input video is used to assist bitrate allocation for parallel encoding, using an SATD-based bitrate allocation algorithm, with the following specific operations:
Before the data is encoded, the SATD value of each video frame is computed first; SATD is then used as the measure of complexity, and each coding unit is allocated a corresponding bitrate; the SATD computation is placed in the decoding stage; because encoding can only start after the circular queue has been filled, the average bitrate over the section held in the circular queue can be set as the target bitrate in the initial state, and bitrate is then allocated according to the SATD distribution; as encoding proceeds, the number of encoded coding units grows and the SATD distribution curve becomes increasingly complete, so the bitrate allocation algorithm only needs to ensure that the average bitrate at the time of encoding meets the target.
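By way of non-limiting illustration, SATD-based bitrate allocation over the coding units currently held in the circular queue can be sketched as follows. The proportional rule (a unit's bitrate scales with its SATD relative to the mean SATD) and the function name `allocate_bitrates` are assumptions of this sketch; the claim requires only that allocation follow the SATD distribution and that the average bitrate meet the target.

```python
def allocate_bitrates(unit_satd, target_avg_bitrate):
    """Give each coding unit a bitrate proportional to its SATD.

    unit_satd: SATD totals, one per coding unit in the current window.
    target_avg_bitrate: average bitrate the window as a whole should meet.
    """
    if not unit_satd:
        return []
    mean_satd = sum(unit_satd) / len(unit_satd)
    if mean_satd == 0:
        # All units have zero residual energy: fall back to a flat allocation.
        return [float(target_avg_bitrate)] * len(unit_satd)
    # Units above the mean complexity get more than the target, units below get less;
    # the unweighted average of the result equals the target.
    return [target_avg_bitrate * s / mean_satd for s in unit_satd]
```

Re-running the function as more coding units are decoded lets the allocation track the increasingly complete SATD distribution while keeping the window average on target.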
CN201810628187.8A 2018-06-19 2018-06-19 A kind of efficient parallel code-transferring method towards multi-core platform Pending CN108848384A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810628187.8A CN108848384A (en) 2018-06-19 2018-06-19 A kind of efficient parallel code-transferring method towards multi-core platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810628187.8A CN108848384A (en) 2018-06-19 2018-06-19 A kind of efficient parallel code-transferring method towards multi-core platform

Publications (1)

Publication Number Publication Date
CN108848384A true CN108848384A (en) 2018-11-20

Family

ID=64202221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810628187.8A Pending CN108848384A (en) 2018-06-19 2018-06-19 A kind of efficient parallel code-transferring method towards multi-core platform

Country Status (1)

Country Link
CN (1) CN108848384A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110602122A (en) * 2019-09-20 2019-12-20 北京达佳互联信息技术有限公司 Video processing method and device, electronic equipment and storage medium
CN110996172A (en) * 2019-12-17 2020-04-10 杭州当虹科技股份有限公司 Method for quickly synthesizing 4K MXF file
CN111343503A (en) * 2020-03-31 2020-06-26 北京金山云网络技术有限公司 Video transcoding method and device, electronic equipment and storage medium
CN112637634A (en) * 2020-12-24 2021-04-09 北京睿芯高通量科技有限公司 High-concurrency video processing method and system for multi-process shared data
CN112822494A (en) * 2020-12-30 2021-05-18 稿定(厦门)科技有限公司 Double-buffer coding system and control method thereof
CN114245143A (en) * 2020-09-09 2022-03-25 阿里巴巴集团控股有限公司 Encoding method, device, system, electronic device and storage medium
CN114697675A (en) * 2020-12-25 2022-07-01 扬智科技股份有限公司 Decoding display system and memory access method thereof
CN115297328A (en) * 2022-10-10 2022-11-04 湖南马栏山视频先进技术研究院有限公司 Multi-node parallel video transcoding method facing distributed cluster

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101098483A (en) * 2007-07-19 2008-01-02 上海交通大学 Video cluster transcoding system using image group structure as parallel processing element
WO2013165088A1 (en) * 2012-05-02 2013-11-07 Samsung Electronics Co., Ltd. Distributed transcoding apparatus and method using multiple servers
CN104469370A (en) * 2013-09-17 2015-03-25 中国普天信息产业股份有限公司 Video transcode method and device
CN105451031A (en) * 2015-11-18 2016-03-30 腾讯科技(深圳)有限公司 Video transcoding method and system thereof
CN106254867A (en) * 2016-08-08 2016-12-21 暴风集团股份有限公司 Based on picture group, video file is carried out the method and system of transcoding

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110602122A (en) * 2019-09-20 2019-12-20 北京达佳互联信息技术有限公司 Video processing method and device, electronic equipment and storage medium
CN110996172A (en) * 2019-12-17 2020-04-10 杭州当虹科技股份有限公司 Method for quickly synthesizing 4K MXF file
CN110996172B (en) * 2019-12-17 2022-01-11 杭州当虹科技股份有限公司 Method for quickly synthesizing 4K MXF file
CN111343503A (en) * 2020-03-31 2020-06-26 北京金山云网络技术有限公司 Video transcoding method and device, electronic equipment and storage medium
CN111343503B (en) * 2020-03-31 2022-03-04 北京金山云网络技术有限公司 Video transcoding method and device, electronic equipment and storage medium
CN114245143A (en) * 2020-09-09 2022-03-25 阿里巴巴集团控股有限公司 Encoding method, device, system, electronic device and storage medium
CN112637634A (en) * 2020-12-24 2021-04-09 北京睿芯高通量科技有限公司 High-concurrency video processing method and system for multi-process shared data
CN114697675A (en) * 2020-12-25 2022-07-01 扬智科技股份有限公司 Decoding display system and memory access method thereof
CN114697675B (en) * 2020-12-25 2024-04-05 扬智科技股份有限公司 Decoding display system and memory access method thereof
CN112822494A (en) * 2020-12-30 2021-05-18 稿定(厦门)科技有限公司 Double-buffer coding system and control method thereof
CN115297328A (en) * 2022-10-10 2022-11-04 湖南马栏山视频先进技术研究院有限公司 Multi-node parallel video transcoding method facing distributed cluster
CN115297328B (en) * 2022-10-10 2023-01-20 湖南马栏山视频先进技术研究院有限公司 Multi-node parallel video transcoding method facing distributed cluster

Similar Documents

Publication Publication Date Title
CN108848384A (en) A kind of efficient parallel code-transferring method towards multi-core platform
CN103621085B (en) Reduce method and the computing system of the delay in video decode
CN1278550C (en) Method and apparatus for regenerating image and image recording device
CN102150425B (en) System and method for decoding using parallel processing
US8170120B2 (en) Information processing apparatus and information processing method
CN101052127B (en) Information-processing apparatus, information-processing method
CN101895765B (en) Transcoder, recorder, and transcoding method
WO2017107442A1 (en) Video transcoding method and device
CN103297807A (en) Hadoop-platform-based method for improving video transcoding efficiency
CN102301710A (en) Multiple bit rate video encoding using variable bit rate and dynamic resolution for adaptive video streaming
CN104205834A (en) Method and apparatus for video encoding for each spatial sub-area, and method and apparatus for video decoding for each spatial sub-area
CN102447906A (en) Low-latency video decoding
CN102741830A (en) Systems and methods for a client-side remote presentation of a multimedia stream
CN104469370A (en) Video transcode method and device
CN102984465A (en) Program synthesis system and method applicable to networked intelligent digital media
Heikkinen et al. Distributed multimedia content analysis with MapReduce
CN100556140C (en) Moving picture re-encoding apparatus, moving picture editing apparatus and method thereof
CN107079159A (en) The method and apparatus of parallel video decoding based on multiple nucleus system
CN113271467A (en) Ultra-high-definition video layered coding and decoding method supporting efficient editing
CN107197296A (en) A kind of HEVC parallel encoding method and systems based on COStream
CN108886638A (en) Transcriber and reproducting method and file creating apparatus and document generating method
EP4354868A1 (en) Media data processing method and related device
US20240046562A1 (en) Information processing device and method
CN107105264B (en) Dynamic image prediction decoding method, dynamic image prediction decoding device
CN101094368A (en) Reproduction apparatus and reproduction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
Application publication date: 20181120