CN107172426B - OpenCL parallel frame rate up-conversion method based on dual MICs - Google Patents

Info

Publication number
CN107172426B
Authority
CN
China
Prior art keywords
memory
sub thread
frame
thread
mic2
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710490906.XA
Other languages
Chinese (zh)
Other versions
CN107172426A (en)
Inventor
朱虎明
王朵
焦李成
鹿乐
田小林
张小华
侯彪
关云辉
焦文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Electronic Science and Technology
Original Assignee
Xian University of Electronic Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-06-23
Filing date
2017-06-23
Publication date
2019-10-11
Application filed by Xian University of Electronic Science and Technology
Priority to CN201710490906.XA
Publication of CN107172426A
Application granted
Publication of CN107172426B
Current legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0135 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes
    • H04N7/014 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes involving the use of motion vectors

Abstract

The invention proposes an OpenCL parallel frame rate up-conversion method based on dual MICs which, while preserving picture quality, effectively shortens the running time of frame rate up-conversion and improves its operating efficiency. The implementation steps are as follows: the main thread initializes the OpenCL devices MIC1 and MIC2; the main thread numbers the frames of the video that has been read in; the main thread defines and initializes the semaphores; the main thread allocates memory on the host and creates sub-thread 1 and sub-thread 2; sub-thread 1 controls MIC1 to execute the motion estimation algorithm while sub-thread 2 controls MIC2 to execute the motion compensation algorithm, achieving up-conversion of the video frame rate; the main thread closes sub-thread 1 and sub-thread 2. The invention effectively improves the operating efficiency of the algorithm and can be used in the field of video frame rate up-conversion.

Description

OpenCL parallel frame rate up-conversion method based on dual MICs
Technical field
The invention belongs to the technical field of video processing and relates to an OpenCL parallel frame rate up-conversion method, in particular to an OpenCL parallel frame rate up-conversion method based on dual MICs, which is suitable for fields such as video frame rate up-conversion.
Background art
In recent years, new technologies in the video field have kept emerging. Technical means such as increasing video resolution and raising video frame rate have brought viewers a clearer and more striking visual experience, for example the progression from the original standard-definition (SD) video to today's high-definition (HD) video and even ultra-high-definition 4K video. 4K video sources are increasingly common and have gradually entered public view, which indicates that viewers demand ever higher picture clarity. In November 2016, director Ang Lee's new film "Billy Lynn's Long Halftime Walk" used "120 frames/4K/3D" technology (120 frames per second, 4K resolution, 3D) for the first time. Its high-frame-rate playback at 120 frames per second broke new ground in film technology and attracted wide attention in the industry.
Video frame rate up-conversion (FRUC) is a video post-processing technique that converts low-frame-rate video into high-frame-rate video by inserting intermediate frames between the original video frames. Frame rate conversion appeared in industry as early as the 1980s, when linear frame interpolation, including frame repetition and frame averaging, was used to perform frame rate up-conversion, and the technique has matured steadily since. In the mid-1990s, motion-compensated video frame rate up-conversion was proposed. This technique first applies a motion estimation algorithm to moving objects to obtain a vector field as close as possible to the true motion. Motion is estimated per block or per pixel, and different blocks may yield different vectors, so the individual parts of a moving object are all covered and the resulting vectors are more accurate. The motion compensation module then uses the original video frames and the motion vectors from the previous step to obtain the frame to be interpolated by interpolation.
As video moves from HD to ultra HD and high-frame-rate video appears, both the image size and the number of frames to be processed have grown greatly. The larger image scale and higher algorithm complexity greatly increase the processing time of the original algorithms, which can no longer meet requirements for fast or even real-time processing. How to accelerate frame rate up-conversion algorithms has therefore become a pressing problem.
Intel released the Intel Xeon Phi coprocessor in 2012, a product based on the Many Integrated Core (MIC) architecture. The coprocessor integrates 50 or more computing cores and has a 512-bit vector processing unit (VPU). The hardware of the Intel MIC many-core architecture retains the multi-stage pipeline of a CPU while providing numerous computing cores, each of which can execute 4 threads concurrently, which gives the MIC an advantage in handling multiple tasks at the same time. In 2015, Jiao Wen (焦文) proposed, in a master's thesis, parallel motion estimation and parallel motion compensation methods on the MIC architecture. The method uses the fork-join model of OpenMP: the motion vector computation for each block in motion estimation is assigned to a MIC thread, and each pixel of the motion compensation module is assigned to a MIC thread. Although the method achieves a certain speedup, it does not study how to implement video frame rate up-conversion on dual MICs.
Summary of the invention
The object of the invention is to address the shortcomings of the prior art described above by proposing an OpenCL parallel frame rate up-conversion method based on dual MICs which, while preserving picture quality, effectively shortens the running time of frame rate up-conversion and improves its operating efficiency.
To achieve this object, the technical solution adopted by the invention comprises the following steps:
(1) The main thread initializes the OpenCL devices MIC1 and MIC2, giving the host side control of the MIC devices;
(2) The main thread numbers the frames of the video that has been read in: the main thread reads in N frames of video, denotes the number of the current frame in the motion estimation algorithm by i and initializes i = 1, and denotes the number of the current frame in the motion compensation algorithm by j and initializes j = 1, where both i and j take values in [1, N];
(3) The main thread defines and initializes the semaphores: the main thread defines semaphore 1 and semaphore 2, initializes the value of semaphore 1 to 1 and initializes the value of semaphore 2 to 0;
(4) The main thread allocates memory on the host and creates the sub-threads: the main thread allocates host memory cpu_mem1, host memory cpu_mem2 and host memory cpu_mem3 on the host, and creates sub-thread 1 and sub-thread 2;
(5) Sub-thread 1 controls MIC1 to execute the motion estimation algorithm:
(5a) Sub-thread 1 allocates memory mic1_mem1 and memory mic1_mem2 on MIC1;
(5b) Sub-thread 1 transfers the image data of frame i and frame i+1 to memory mic1_mem1;
(5c) MIC1 computes the motion vectors MVi of the image data of frame i in the motion estimation algorithm and stores MVi in memory mic1_mem2;
(5d) Sub-thread 1 transfers MVi from memory mic1_mem2 to host memory cpu_mem1;
(5e) Sub-thread 1 checks whether the value of semaphore 1 is greater than 0; if so, it subtracts 1 from the value of semaphore 1, writes MVi from host memory cpu_mem1 into host memory cpu_mem2, adds 1 to the value of semaphore 2, and executes step (5g); otherwise, it executes step (5f);
(5f) Sub-thread 1 waits until sub-thread 2 has modified the value of semaphore 1, and then executes step (5e);
(5g) Let i = i + 1; sub-thread 1 checks whether i ≤ N holds; if so, it executes step (5b); otherwise, sub-thread 1 suspends;
(6) Sub-thread 2 controls MIC2 to execute the motion compensation algorithm, achieving up-conversion of the video frame rate:
(6a) Sub-thread 2 allocates memory mic2_mem1, memory mic2_mem2 and memory mic2_mem3 on MIC2;
(6b) Sub-thread 2 transfers the image data of frame j and frame j+1 to memory mic2_mem1;
(6c) Sub-thread 2 checks whether the value of semaphore 2 is greater than 0; if so, it subtracts 1 from the value of semaphore 2, reads MVi from host memory cpu_mem2 into memory mic2_mem2, and executes step (6e); otherwise, it executes step (6d);
(6d) Sub-thread 2 waits until sub-thread 1 has modified the value of semaphore 2, and then executes step (6c);
(6e) MIC2 computes the motion-compensated interpolation for each pixel of the frame to be interpolated and stores the interpolation result of that frame in memory mic2_mem3;
(6f) Sub-thread 2 transfers the interpolation result from memory mic2_mem3 to host memory cpu_mem3, writes the interpolation result in host memory cpu_mem3 to a file on the hard disk, and adds 1 to the value of semaphore 1;
(6g) Let j = j + 1; sub-thread 2 checks whether j ≤ N holds; if so, it executes step (6b); otherwise, sub-thread 2 suspends;
(7) The main thread closes sub-thread 1 and sub-thread 2.
Compared with the prior art, the invention has the following advantages:
1. The invention creates sub-thread 1 and sub-thread 2 on the host side with Pthreads. Sub-thread 1 controls MIC1 to compute the motion vectors of the current frame while sub-thread 2 controls MIC2 to compute the motion-compensated interpolation for the frame preceding the current frame. This avoids the time-consuming prior-art approach in which a single computing device computes the motion vectors and the motion-compensated interpolation serially, and effectively improves the operating efficiency of frame rate up-conversion.
2. Both sub-thread 1 and sub-thread 2 need to access the host memory that stores the motion vectors. Sub-thread 1 uses the semaphores to control its writes to this host memory, and sub-thread 2 uses the semaphores to control its reads from it. This avoids read/write conflicts between the two threads on this memory and ensures the correctness of frame rate up-conversion.
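The semaphore handshake of steps (5e) to (5g) and (6c) to (6g) can be modeled with a minimal sketch. It is written in Python with `threading.Semaphore` rather than the Pthread and OpenCL calls the patent uses, and the MIC computations are replaced by string placeholders. The function name `run_pipeline` and the list `consumed` are hypothetical names introduced only for illustration.

```python
import threading

def run_pipeline(num_frames):
    """Model the two-semaphore handshake between sub-thread 1 and sub-thread 2."""
    sem1 = threading.Semaphore(1)  # semaphore 1 = 1: cpu_mem2 may be written
    sem2 = threading.Semaphore(0)  # semaphore 2 = 0: no unread motion vectors yet
    cpu_mem2 = [None]              # shared host buffer holding one motion-vector field
    consumed = []                  # what sub-thread 2 read, in order (illustration only)

    def sub_thread_1():
        for i in range(1, num_frames + 1):
            mv = f"MV{i}"          # placeholder for motion estimation on MIC1
            sem1.acquire()         # steps (5e)/(5f): wait for value > 0, then subtract 1
            cpu_mem2[0] = mv       # copy the motion vectors into cpu_mem2
            sem2.release()         # add 1 to semaphore 2

    def sub_thread_2():
        for j in range(1, num_frames + 1):
            sem2.acquire()         # steps (6c)/(6d): wait for value > 0, then subtract 1
            mv = cpu_mem2[0]       # read the motion vectors out of cpu_mem2
            consumed.append(mv)    # placeholder for motion compensation on MIC2
            sem1.release()         # step (6f): add 1 to semaphore 1

    t1 = threading.Thread(target=sub_thread_1)
    t2 = threading.Thread(target=sub_thread_2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return consumed
```

Because semaphore 1 starts at 1 and semaphore 2 at 0, the writer always goes first and the two threads then alternate strictly on `cpu_mem2`, so the reader can never see a half-written or stale buffer in this model.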
Brief description of the drawings
Fig. 1 is the implementation flowchart of the invention;
Fig. 2 is the implementation flowchart of the initialization of the OpenCL devices in the invention;
Fig. 3 is the implementation flowchart of sub-thread 1 controlling MIC1 to execute the motion estimation algorithm;
Fig. 4 is the implementation flowchart of sub-thread 2 controlling MIC2 to execute the motion compensation algorithm;
Fig. 5 shows the single-frame test images of different resolutions used as input to the simulation experiment;
Fig. 6 shows the simulation results used to judge the correctness of the invention.
Detailed description of the embodiments
The invention is further described below with reference to the drawings and specific embodiments.
Referring to Fig. 1, the OpenCL parallel frame rate up-conversion method based on dual MICs includes the following steps:
Step 1) The main thread initializes the OpenCL devices MIC1 and MIC2, giving the host side control of the MIC devices: after the program starts, execution enters the main thread. The flowchart for initializing the OpenCL devices is shown in Fig. 2. In the stage where it obtains the OpenCL devices, the main thread obtains the device information of MIC1 through device[0] and that of MIC2 through device[1]. It then creates command queue commandqueue1 for the MIC1 device and command queue commandqueue2 for the MIC2 device. In the motion estimation algorithm, commandqueue1 controls the data reads and writes to MIC1 and the execution of the motion estimation kernel. In the motion compensation algorithm, commandqueue2 controls the data reads and writes to MIC2 and the execution of the motion compensation kernel.
Step 2) The main thread numbers the frames of the video that has been read in: the main thread reads in 30 frames of video. Because sub-thread 1, which executes the motion estimation algorithm, and sub-thread 2, which executes the motion compensation algorithm, do not process the same frame in parallel, the number of the current frame in the motion estimation algorithm is denoted by i and initialized to i = 1, and the number of the current frame in the motion compensation algorithm is denoted by j and initialized to j = 1, where both i and j take values in [1, 30].
Step 3) The main thread defines and initializes the semaphores: to prevent the two threads from conflicting in their access to the same memory, the invention uses semaphores to control the communication between the two threads. The main thread defines semaphore 1 and semaphore 2, initializes the value of semaphore 1 to 1 and initializes the value of semaphore 2 to 0.
Step 4) The main thread allocates memory on the host and creates the sub-threads: the main thread allocates host memory cpu_mem1, host memory cpu_mem2 and host memory cpu_mem3 on the host, and creates sub-thread 1 and sub-thread 2 with the Pthread function pthread_create.
Step 5) Sub-thread 1 controls MIC1 to execute the motion estimation algorithm; the implementation flowchart is shown in Fig. 3:
Step 5a) Sub-thread 1 uses the OpenCL function clCreateBuffer to allocate memory mic1_mem1 and memory mic1_mem2 on MIC1; mic1_mem1 stores the input image data and mic1_mem2 stores the computed motion vectors.
Step 5b) Sub-thread 1 uses the OpenCL function clEnqueueWriteBuffer to transfer the image data of frame i and frame i+1 to memory mic1_mem1.
Step 5c) MIC1 computes the motion vectors MVi of the image data of frame i in the motion estimation algorithm and stores MVi in memory mic1_mem2: the image of frame i is divided into 240 × 135 macroblocks, and the set of candidate motion vectors of each macroblock is computed. The SAD value of each candidate motion vector in the set is obtained from the SAD formula, the candidate motion vector with the smallest SAD value is selected as the motion vector of the macroblock, and the motion vectors of all macroblocks of frame i are computed in turn.
\[ \mathrm{SAD} = \sum_{m=1}^{X} \sum_{n=1}^{Y} \left| x_{mn} - y_{mn} \right| \]
where X is the width of the current macroblock in frame i, Y is the height of the current macroblock in frame i, x_{mn} is the pixel value at position (m, n) in the current block of frame i, and y_{mn} is the pixel value at position (m, n) in the matching block of frame i+1.
Step 5d) Sub-thread 1 uses the OpenCL function clEnqueueReadBuffer to transfer MVi from memory mic1_mem2 to host memory cpu_mem1.
Step 5e) Sub-thread 1 checks whether the value of semaphore 1 is greater than 0. If so, sub-thread 2 has finished computing the motion compensation of frame j, and data can be copied into the memory that stores the input data of sub-thread 2: sub-thread 1 subtracts 1 from the value of semaphore 1, writes MVi from host memory cpu_mem1 into host memory cpu_mem2, adds 1 to the value of semaphore 2, and executes step 5g). Otherwise, it executes step 5f).
Step 5f) Sub-thread 1 waits until sub-thread 2 has modified the value of semaphore 1, and then executes step 5e).
Step 5g) Let i = i + 1; sub-thread 1 checks whether i ≤ 30 holds. If so, it executes step 5b) to compute the motion vectors of the next frame. Otherwise, the motion vectors of all frames of the video have been computed and sub-thread 1 suspends.
Step 6) Sub-thread 2 controls MIC2 to execute the motion compensation algorithm; the implementation flowchart is shown in Fig. 4:
Step 6a) Sub-thread 2 uses the OpenCL function clCreateBuffer to allocate memory mic2_mem1, memory mic2_mem2 and memory mic2_mem3 on MIC2.
Step 6b) Sub-thread 2 uses the OpenCL function clEnqueueWriteBuffer to transfer the image data of frame j and frame j+1 to memory mic2_mem1.
Step 6c) Sub-thread 2 checks whether the value of semaphore 2 is greater than 0. If so, sub-thread 1 has written MVi from host memory cpu_mem1 into host memory cpu_mem2: sub-thread 2 subtracts 1 from the value of semaphore 2, reads MVi from host memory cpu_mem2 into memory mic2_mem2, and executes step 6e). Otherwise, it executes step 6d).
Step 6d) Sub-thread 2 waits until sub-thread 1 has modified the value of semaphore 2, and then executes step 6c).
Step 6e) MIC2 computes the interpolation result for each pixel of the frame to be interpolated and stores the interpolation result of that frame in memory mic2_mem3.
Step 6f) Sub-thread 2 uses the OpenCL function clEnqueueReadBuffer to transfer the interpolation result from memory mic2_mem3 to host memory cpu_mem3, writes the interpolation result in host memory cpu_mem3 to a file on the hard disk, and adds 1 to the value of semaphore 1, indicating that the motion-compensated interpolation of frame j is complete.
Step 6g) Let j = j + 1; sub-thread 2 checks whether j ≤ N holds (N = 30 in this embodiment). If so, it executes step 6b) to compute the motion-compensated interpolation of the next frame. Otherwise, all frames to be interpolated have been computed and sub-thread 2 suspends.
Step 7) The main thread closes sub-thread 1 and sub-thread 2.
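The block matching of step 5c) can be illustrated with a small sketch that computes the SAD of each candidate motion vector for one macroblock and keeps the candidate with the smallest value. It is plain Python on nested lists, not the OpenCL kernel of the patent. The helper names (`sad`, `get_block`, `best_motion_vector`), the (dy, dx) vector convention and the rule of skipping candidates that leave the frame are assumptions made for illustration; the source does not say how the candidate set is built.

```python
def sad(cur_block, ref_block):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(x - y)
        for cur_row, ref_row in zip(cur_block, ref_block)
        for x, y in zip(cur_row, ref_row)
    )

def get_block(frame, top, left, height, width):
    """Extract a height x width block whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in frame[top:top + height]]

def best_motion_vector(frame_i, frame_next, top, left, height, width, candidates):
    """Return ((dy, dx), sad) for the candidate with the smallest SAD, or None."""
    cur = get_block(frame_i, top, left, height, width)
    rows, cols = len(frame_next), len(frame_next[0])
    best = None
    for dy, dx in candidates:
        t, l = top + dy, left + dx
        # Assumption: candidates whose matching block leaves the frame are skipped.
        if t < 0 or l < 0 or t + height > rows or l + width > cols:
            continue
        cost = sad(cur, get_block(frame_next, t, l, height, width))
        if best is None or cost < best[1]:
            best = ((dy, dx), cost)
    return best

# A 2 x 2 bright patch that moves one pixel to the right between two frames.
frame_i = [[9, 9, 0, 0], [9, 9, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
frame_next = [[0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
result = best_motion_vector(frame_i, frame_next, 0, 0, 2, 2, [(0, 0), (0, 1), (1, 0)])
```

In this toy case the candidate (0, 1) matches the shifted patch exactly, so it is selected with a SAD of 0. In the patent the per-macroblock searches are independent of one another, which is what makes the step suitable for execution on MIC1.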
The technical effect of the invention is further illustrated below with a simulation experiment:
1) Simulation conditions:
The single-frame test images of different resolutions used as input to the simulation experiment are shown in Fig. 5: Fig. 5(a) is a single frame of the 2K-resolution test video ParkScene_1920×1080, and Fig. 5(b) is a single frame of the 4K-resolution test video Sunset_3840×2160.
The simulation was run on the cluster of the High Performance Computing Center of Xidian University; the test platform parameters are listed in Table 1.
Table 1
2) Simulation content and analysis of results:
For the OpenCL parallel frame rate up-conversion method based on dual MICs, Table 2 compares the PSNR values of the serial algorithm and the parallel algorithm of the invention, Table 3 gives the test times of the motion estimation algorithm and the motion compensation algorithm, and Table 4 gives the serial test time and the test time of the invention.
Table 2
Video sequence    Serial algorithm    Parallel algorithm of the invention
2K video          36.39               36.37
4K video          38.43               38.40
Table 3
Test sequence     Motion estimation algorithm (ms)    Motion compensation algorithm (ms)
2K video          240.52                              187.92
4K video          249.37                              684.38
Table 4
Test sequence     Serial algorithm (ms)    Parallel method of the invention (ms)
2K video          428.44                   261.30
4K video          933.75                   689.61
Fig. 6 shows the simulation results used to judge correctness: Fig. 6(a) is a single-frame simulation result for the 2K-resolution video and Fig. 6(b) is a single-frame simulation result for the 4K-resolution video. To judge the correctness of the parallel scheme of the invention more accurately, the peak signal-to-noise ratio (PSNR), an objective evaluation criterion, is used to evaluate the quality of the algorithm. The PSNR values of the simulation results are given in Table 2. The PSNR obtained by the proposed parallel frame rate up-conversion algorithm is close to that of the serial algorithm, so the picture quality of the parallel algorithm is consistent with that of the serial algorithm, which verifies the correctness of the parallel frame rate up-conversion algorithm.
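For reference, the PSNR comparison can be sketched as follows. The source does not give its PSNR formula, so this uses the common definition 10·log10(peak² / MSE) and assumes 8-bit samples (peak value 255); the function name `psnr` and the list-of-rows frame layout are illustrative choices.

```python
import math

def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized frames given as lists of rows."""
    total = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    mse = total / count            # mean squared error over all pixels
    if mse == 0:
        return math.inf            # identical frames
    return 10.0 * math.log10(peak * peak / mse)
```

A higher value means the interpolated frame is closer to the reference frame, so near-equal PSNR for the serial and parallel runs suggests the parallelization did not change the interpolated output in any significant way.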
Table 3 gives the serial test times of the motion estimation algorithm and the motion compensation algorithm within the frame rate up-conversion algorithm.
Table 4 gives the serial test time of the frame rate up-conversion algorithm and the parallel test time of the invention. The serial test time is the sum of the motion estimation time and the motion compensation time in Table 3. The test time of the invention is approximately the larger of the motion estimation test time and the motion compensation test time in Table 3, plus the data copy time and the semaphore wait time. Table 4 shows that the invention effectively speeds up the frame rate up-conversion algorithm.
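The relationship between Tables 3 and 4 can be checked with a few lines of arithmetic. The sketch below copies the measured times from the tables and derives three figures that the source does not report directly: the serial sum, the ratio of serial to parallel time, and the part of the parallel time not explained by the slower stage. The function name `summarize` and the dictionary keys are illustrative.

```python
# Times in milliseconds, copied from Table 3 (me, mc) and Table 4 (parallel).
measurements = {
    "2K": {"me": 240.52, "mc": 187.92, "parallel": 261.30},
    "4K": {"me": 249.37, "mc": 684.38, "parallel": 689.61},
}

def summarize(me, mc, parallel):
    serial = me + mc               # Table 4 serial time = ME time + MC time
    bound = max(me, mc)            # the slower stage bounds the two-device pipeline
    return {
        "serial": round(serial, 2),
        "speedup": round(serial / parallel, 2),
        "overhead": round(parallel - bound, 2),  # data copies + semaphore waits
    }

for name, m in measurements.items():
    print(name, summarize(**m))
```

Under this reading the serial sums reproduce Table 4 (428.44 ms and 933.75 ms), the speedup is about 1.64 for the 2K video and about 1.35 for the 4K video, and the overhead is about 20.78 ms and 5.23 ms respectively. The smaller gain at 4K is consistent with motion compensation dominating the 4K run, so MIC1 spends most of its time waiting.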

Claims (6)

1. conversion method in a kind of parallel frame per second of OpenCL based on double MIC, includes the following steps:
(1) main thread initializes the MIC1 and MIC2 of OpenCL equipment, realizes control of the host side to MIC equipment;
(2) video of reading is numbered in main thread: main thread reads in N frame video, and works as to video in motion estimation algorithm The picture number of previous frame is i, initializes i=1, while being j to the picture number of video present frame in movement compensating algorithm, initially Change j=1, wherein the value range of i is [1, N], and the value range of j is [1, N];
(3) it main thread definition signal amount and initializes: main thread definition signal amount 1 and semaphore 2, and will be at the beginning of the value of semaphore 1 Beginning turns to 1, and the value of semaphore 2 is initialized as 0;
(4) main thread opens up memory on host and creates sub thread: main thread opened up on host host memory cpu_mem1, Host memory cpu_mem2 and host memory cpu_mem3, while creating sub thread 1 and sub thread 2;
(5) sub thread 1 controls MIC1, executes motion estimation algorithm:
(5a) sub thread 1 opens up memory mic1_mem1 and memory mic1_mem2 on MIC1;
The image data of i-th frame and i+1 frame is transferred to memory mic1_mem1 by (5b) sub thread 1;
(5c) MIC1 calculates the motion vector MVi of the i-th frame image data in motion estimation algorithm, and MVi is stored in memory mic1_ In mem2;
MVi is passed to host memory cpu_mem1 from memory mic1_mem2 by (5d) sub thread 1;
(5e) sub thread 1 judges whether the value of semaphore 1 is greater than 0, if so, subtract 1 for the value of semaphore 1, while by host memory In MVi write-in host memory cpu_mem2 in cpu_mem1, the value of semaphore 2 is added 1, and execute step (5g) and otherwise hold Row step (5f);
(5f) sub thread 1 waits sub thread 2 to modify the value of semaphore 1, until modification completion, and executes step (5e);
(5g) enables i=i+1, and sub thread 1 judges whether i≤N is true, if so, executing step (5b), otherwise, sub thread 1 is hung up;
(6) sub thread 2 controls MIC2, executes movement compensating algorithm, realizes the upper conversion of video frame rate:
(6a) sub thread 2 opens up memory mic2_mem1, memory mic2_mem2 and memory mic2_mem3 on MIC2;
The image data of+1 frame of jth frame and jth is passed to memory mic2_mem1 by (6b) sub thread 2;
(6c) sub thread 2 judges whether the value of semaphore 2 is greater than 0, if so, subtract 1 for the value of semaphore 2, while by host memory MVi in cpu_mem2 reads memory mic2_mem2, and executes step (6e), otherwise, executes step (6d);
(6d) sub thread 2 waits sub thread 1 to modify the value of semaphore 2, until modification completion, and executes step (6c);
(6e) MIC2 calculates motion compensated interpolation to pixel each in interleave, and in the interpolation result to interleave is stored in It deposits in mic2_mem3;
Interpolation result is passed to host memory cpu_mem3 from memory mic2_mem3 by (6f) sub thread 2, and by host memory cpu_ Interpolation result written document in mem3 adds 1 into hard disk, while by the value of semaphore 1;
(6g) enables j=j+1, and sub thread 2 judges whether j≤N is true, if so, executing step (6b), otherwise, sub thread 2 is hung up;
(7) main thread closes sub thread 1 and sub thread 2.
2. conversion method in the parallel frame per second of the OpenCL according to claim 1 based on double MIC, which is characterized in that step (4) creation sub thread 1 and sub thread 2 described in, using Pthread function pthread_create.
3. conversion method in the parallel frame per second of the OpenCL according to claim 1 based on double MIC, which is characterized in that step Sub thread 1 described in (5a) opens up memory mic1_mem1 and memory mic1_mem2 on MIC1, with institute in step (6a) The sub thread 2 stated opens up memory mic2_mem1, memory mic2_mem2 and memory mic2_mem3 on MIC2, is all made of OpenCL function clCreateBuffer.
4. conversion method in the parallel frame per second of the OpenCL according to claim 1 based on double MIC, which is characterized in that step The image data of i-th frame and i+1 frame is transferred to memory mic1_mem1 by sub thread 1 described in (5b), with step (6b) The image data of+1 frame of jth frame and jth is passed to memory mic2_mem1 by the sub thread 2, is all made of OpenCL function clEnqueueWriteBuffer。
5. The OpenCL parallel frame rate up-conversion method based on dual MIC according to claim 1, characterized in that MIC1 in step (5c) computes the motion vector MVi of the i-th frame image data using a motion estimation algorithm, implemented as follows:
(5c1) Divide the image of the i-th frame into M × N2 macroblocks, where M ≥ 1 and N2 ≥ 1;
(5c2) Compute the set of candidate motion vectors for each macroblock;
(5c3) Obtain the SAD value of each candidate motion vector in the vector set according to the SAD calculation formula:
$$SAD = \sum_{m=1}^{X} \sum_{n=1}^{Y} \left| x_{mn} - y_{mn} \right|$$
where X is the width of the current macroblock in the i-th frame, Y is the height of the current macroblock in the i-th frame, $x_{mn}$ is the pixel value at position (m, n) in the current block of the i-th frame, and $y_{mn}$ is the pixel value at position (m, n) in the matching block of the (i+1)-th frame;
(5c4) Select the candidate motion vector with the smallest SAD value as the motion vector of the macroblock;
(5c5) Compute the motion vectors of all macroblocks of the i-th frame image in turn.
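Steps (5c2) to (5c4) can be illustrated for a single macroblock with a plain C sketch. The claim does not say how the candidate set is built, so the full search over a square window of ±`range` pixels used here is an assumption. The names `motion_vector`, `block_sad` and `best_motion_vector` are hypothetical, and the sketch runs on the host rather than as an OpenCL kernel on MIC1.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

typedef struct { int dx, dy; } motion_vector;

/* Sum of absolute differences between the block at (bx, by) in frame i
   and the block displaced by (dx, dy) in frame i+1. */
static long block_sad(const unsigned char *cur, const unsigned char *next,
                      int stride, int bx, int by, int dx, int dy,
                      int block_w, int block_h)
{
    long sad = 0;
    for (int n = 0; n < block_h; ++n) {
        for (int m = 0; m < block_w; ++m) {
            int x = cur[(by + n) * stride + (bx + m)];
            int y = next[(by + dy + n) * stride + (bx + dx + m)];
            sad += abs(x - y);
        }
    }
    return sad;
}

/* Assumed candidate set: every displacement within +/- range pixels
   whose matching block lies fully inside the frame. Returns the
   candidate with the smallest SAD; on ties the first one found wins. */
motion_vector best_motion_vector(const unsigned char *cur,
                                 const unsigned char *next,
                                 int width, int height,
                                 int bx, int by,
                                 int block_w, int block_h, int range)
{
    motion_vector best = {0, 0};
    long best_sad = LONG_MAX;
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            if (bx + dx < 0 || by + dy < 0 ||
                bx + dx + block_w > width || by + dy + block_h > height) {
                continue; /* matching block would leave the frame */
            }
            long sad = block_sad(cur, next, width, bx, by, dx, dy,
                                 block_w, block_h);
            if (sad < best_sad) {
                best_sad = sad;
                best.dx = dx;
                best.dy = dy;
            }
        }
    }
    return best;
}
```

Calling `best_motion_vector` once per macroblock corresponds to step (5c5). Because each macroblock is processed independently, the per-block search is a natural unit of work to distribute across the cores of a MIC card.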
6. The OpenCL parallel frame rate up-conversion method based on dual MIC according to claim 1, characterized in that the transfer of MVi from memory mic1_mem2 to host memory cpu_mem1 by sub thread 1 in step (5d), and the transfer of the interpolation result from memory mic2_mem3 to host memory cpu_mem3 by sub thread 2 in step (6f), both use the OpenCL function clEnqueueReadBuffer.
CN201710490906.XA 2017-06-23 2017-06-23 Conversion method in the parallel frame per second of OpenCL based on double MIC Active CN107172426B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710490906.XA CN107172426B (en) 2017-06-23 2017-06-23 Conversion method in the parallel frame per second of OpenCL based on double MIC

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710490906.XA CN107172426B (en) 2017-06-23 2017-06-23 Conversion method in the parallel frame per second of OpenCL based on double MIC

Publications (2)

Publication Number Publication Date
CN107172426A CN107172426A (en) 2017-09-15
CN107172426B CN107172426B (en) 2019-10-11

Family

ID=59819251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710490906.XA Active CN107172426B (en) 2017-06-23 2017-06-23 Conversion method in the parallel frame per second of OpenCL based on double MIC

Country Status (1)

Country Link
CN (1) CN107172426B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8169541B2 (en) * 2008-12-09 2012-05-01 Himax Media Solutions, Inc. Method of converting frame rate of video signal
CN102497550A (en) * 2011-12-05 2012-06-13 南京大学 Parallel acceleration method and device for motion compensation interpolation in H.264 encoding
CN102685438A (en) * 2012-05-08 2012-09-19 清华大学 Up-conversion method of video frame rate based on time-domain evolution
CN103220488A (en) * 2013-04-18 2013-07-24 北京大学 Up-conversion device and method of video frame rate
CN103402098A (en) * 2013-08-19 2013-11-20 武汉大学 Video frame interpolation method based on image interpolation
CN104219533A (en) * 2014-09-24 2014-12-17 苏州科达科技股份有限公司 Bidirectional motion estimating method and video frame rate up-converting method and system
CN106210767A (en) * 2016-08-11 2016-12-07 上海交通大学 A kind of video frame rate upconversion method and system of Intelligent lifting fluidity of motion

Also Published As

Publication number Publication date
CN107172426A (en) 2017-09-15

Similar Documents

Publication Publication Date Title
CN108230359B (en) Object detection method and apparatus, training method, electronic device, program, and medium
Chu et al. Temporally coherent gans for video super-resolution (tecogan)
CN106530258B (en) Iteratively faster MR image reconstruction method based on the full variational regularization of high-order
US7558428B2 (en) Accelerated video encoding using a graphics processing unit
CN109963048A (en) Noise-reduction method, denoising device and Dolby circuit system
CN101620730A (en) Computing higher resolution images from multiple lower resolution images
CN106062824B (en) edge detecting device and edge detection method
JP2012506647A (en) High resolution video acquisition apparatus and method
CN109670398A (en) Pig image analysis method and pig image analysis equipment
CN112055249B (en) Video frame interpolation method and device
Qin et al. Joint motion estimation and segmentation from undersampled cardiac MR image
CN106464865A (en) Block-based static region detection for video processing
CN109416836A (en) Information processing equipment, information processing method and information processing system
CN106296689B (en) Flaw detection method, system and device
CN105405152B (en) Adaptive scale method for tracking target based on structuring support vector machines
CN107172426B (en) Conversion method in the parallel frame per second of OpenCL based on double MIC
CN107622476B (en) Image Super-resolution processing method based on generative probabilistic model
Ouyang et al. Research on DENOISINg of cryo-em images based on deep learning
CN103310424B (en) A kind of image de-noising method based on structural similarity Yu total variation hybrid model
CN113284081B (en) Depth map super-resolution optimization method and device, processing equipment and storage medium
Adie et al. Parallel computing accelerated image inpainting using gpu cuda, theano, and tensorflow
CN112381845B (en) Rock core image generation method, model training method and device
CN104243887B (en) A kind of film mode detection method and device based on irregular sampling
WO2022043834A1 (en) Full skeletal 3d pose recovery from monocular camera
CN110505485A (en) Motion compensation process, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant