CN105872447A

CN105872447A - Video image processing device and method

Info

Publication number: CN105872447A
Application number: CN201610365216.7A
Authority: CN
Inventors: 华伟锋
Original assignee: Nubia Technology Co Ltd
Current assignee: Nubia Technology Co Ltd
Priority date: 2016-05-26
Filing date: 2016-05-26
Publication date: 2016-08-17

Abstract

The invention discloses a video image processing device and method. The device comprises an acquisition module, a beautification module, a format conversion module and a processing module. The acquisition module acquires each video frame of the preview video of the current terminal; the beautification module applies beautification processing to the acquired video frames; the format conversion module performs format conversion on the beautified preview video, transforming it into the peer video; and the processing module displays the beautified preview video on the display interface of the current terminal and sends the peer video to the peer terminal of the call. The scheme improves the display effect of the video images and the user experience. Because the preview video is beautified only once, processing efficiency is improved and the real-time performance of the call video is preserved.

Description

Video image processing device and method

Technical field

The present invention relates to the field of terminal applications, and in particular to a video image processing device and method.

Background art

With the rapid development of terminal application technology, video calling has gradually become a common form of everyday communication. In current video calling technology, however, the video images displayed on the terminals of both parties tend to look dim because of factors such as lighting or camera angle, and the visual experience is poor. Faces in particular appear dull and lusterless, which seriously affects the user's appearance. This is unacceptable to users who pay particular attention to their personal image, such as female users. An effective solution is therefore urgently needed to improve the display effect of video images and the user experience.

Summary of the invention

The main object of the present invention is to provide a video image processing device and method that improve the display effect of video images and the user experience, and that apply beautification processing to the preview video only once, which improves processing efficiency and preserves the real-time performance of the call video.

To achieve the above object, the present invention provides a video image processing device comprising an acquisition module, a beautification module, a format conversion module and a processing module.

The acquisition module is configured to acquire each video frame of the preview video of the current terminal, where the preview video is the video that, during a video call, is displayed for preview on the display interface of the current terminal.

The beautification module is configured to apply beautification processing to each acquired video frame.

The format conversion module is configured to perform format conversion on the beautified preview video, transforming the preview video into the peer video, where the peer video is the video to be displayed on the display interface of the peer terminal.

The processing module is configured to display the beautified preview video on the display interface of the current terminal and to send the peer video to the peer terminal.
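
The cooperation of the four modules can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the class name VideoCallPipeline, the Frame alias and the four callables are hypothetical names introduced here, and a frame is modeled simply as a byte string.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

Frame = bytes  # hypothetical stand-in for one raw preview frame


@dataclass
class VideoCallPipeline:
    """Wires the four modules together; every callable is a hypothetical placeholder."""

    beautify: Callable[[Frame], Frame]        # beautification module
    convert_format: Callable[[Frame], Frame]  # format conversion module
    show_local: Callable[[Frame], None]       # processing module: local display
    send_to_peer: Callable[[Frame], None]     # processing module: transmission

    def run(self, preview_frames: Iterable[Frame]) -> int:
        """Process each acquired preview frame and return the number handled."""
        handled = 0
        for frame in preview_frames:           # frames from the acquisition module
            beautified = self.beautify(frame)  # beautify exactly once per frame
            self.show_local(beautified)        # beautified preview on this terminal
            # The peer video is derived from the already beautified frame,
            # so no second beautification pass is needed.
            self.send_to_peer(self.convert_format(beautified))
            handled += 1
        return handled
```

The point of the design is visible in the loop body: beautify is called once per frame, and both the local preview and the peer video reuse its result.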

Optionally, the device further comprises a face recognition module, a judging module and a determining module.

The face recognition module is configured to perform face recognition on each video frame according to a preset face recognition algorithm.

The judging module is configured to judge, according to the recognition result of the face recognition module, whether a face image exists in the currently recognized video frame.

The determining module is configured to: when the judging module judges that a face image exists in the currently recognized video frame, instruct the beautification module to apply beautification processing to that frame; and when the judging module judges that no face image exists in the currently recognized video frame, ignore that frame.
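
The gating performed by the judging and determining modules can be sketched as below. The function name beautify_if_face and the detector and beautifier callables are hypothetical. The text only says that a frame without a face is ignored; this sketch assumes that means the frame skips beautification and passes through unchanged.

```python
from typing import Callable, List, Tuple

Frame = bytes
FaceBox = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) of a face


def beautify_if_face(
    frame: Frame,
    detect_faces: Callable[[Frame], List[FaceBox]],
    beautify: Callable[[Frame, List[FaceBox]], Frame],
) -> Tuple[Frame, bool]:
    """Return (frame, was_beautified) for one video frame."""
    faces = detect_faces(frame)  # face recognition module
    if not faces:                # judging module: no face image in this frame
        # Assumption: "ignore" means skip beautification, not drop the frame.
        return frame, False
    return beautify(frame, faces), True  # determining module triggers beautification
```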

Optionally, the beautification processing applied by the beautification module to each acquired video frame includes:

Identifying each preset facial part in the recognized face image.

Retrieving a preset beautification package, which contains one or more beautification tools.

Applying the corresponding beautification processing to the facial parts using the one or more beautification tools.

Optionally,

The facial parts include the face, the eyes and the lips.

The beautification tools include a whitening tool, a face-slimming tool, a dark-circle removal tool and a lip-plumping tool.

The beautification module applying the corresponding beautification processing to the facial parts using the one or more beautification tools includes:

Whitening the face with the whitening tool.

Slimming the face with the face-slimming tool.

Removing dark circles around the eyes with the dark-circle removal tool.

Plumping the lips with the lip-plumping tool.
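
The mapping from facial parts to tools listed above can be written as a lookup table. The sketch below only plans which tool runs on which part; the dictionary BEAUTY_PACKAGE, the tool names and the function plan_beautification are hypothetical labels, and the image operations themselves are not shown because the text does not specify them.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical contents of a preset beautification package:
# each facial part maps to the tools that apply to it.
BEAUTY_PACKAGE: Dict[str, List[str]] = {
    "face": ["whitening", "face_slimming"],
    "eyes": ["dark_circle_removal"],
    "lips": ["lip_plumping"],
}


def plan_beautification(
    detected_parts: Iterable[str],
    package: Dict[str, List[str]] = BEAUTY_PACKAGE,
) -> List[Tuple[str, str]]:
    """List (part, tool) pairs in detection order; unknown parts get no tool."""
    plan: List[Tuple[str, str]] = []
    for part in detected_parts:
        for tool in package.get(part, []):
            plan.append((part, tool))
    return plan
```

Keeping the package as data rather than code means tools can be added or removed without touching the dispatch logic.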

Optionally, the format conversion performed by the format conversion module on the beautified preview video includes: applying a YUV420 counterclockwise rotation to the video images of the beautified preview video.
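
The text does not state the rotation angle or the memory layout, so the following sketch makes two assumptions: the frame is planar YUV 4:2:0 in I420 order (full-resolution Y plane, then quarter-resolution U and V planes), and the rotation is 90 degrees counterclockwise. The function names are hypothetical.

```python
from typing import Tuple


def rotate_plane_ccw(plane: bytes, width: int, height: int) -> bytes:
    """Rotate one 8-bit plane 90 degrees counterclockwise."""
    out = bytearray(width * height)
    # The rotated plane has `width` rows and `height` columns.
    for dst_row in range(width):
        src_col = width - 1 - dst_row  # rightmost source column becomes the top row
        for dst_col in range(height):
            src_row = dst_col
            out[dst_row * height + dst_col] = plane[src_row * width + src_col]
    return bytes(out)


def rotate_i420_ccw(frame: bytes, width: int, height: int) -> Tuple[bytes, int, int]:
    """Rotate an I420 frame; returns (rotated_frame, new_width, new_height)."""
    if width % 2 or height % 2:
        raise ValueError("I420 needs even width and height")
    y_size = width * height
    c_w, c_h = width // 2, height // 2  # chroma planes are subsampled 2x2
    c_size = c_w * c_h
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("frame length does not match the given dimensions")
    y = frame[:y_size]
    u = frame[y_size:y_size + c_size]
    v = frame[y_size + c_size:]
    rotated = (
        rotate_plane_ccw(y, width, height)
        + rotate_plane_ccw(u, c_w, c_h)
        + rotate_plane_ccw(v, c_w, c_h)
    )
    return rotated, height, width  # dimensions swap after a 90 degree turn
```

Each plane is rotated independently with its own dimensions; rotating only the Y plane would leave the colors misaligned with the image.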

Optionally, the processing module sending the peer video to the peer terminal includes:

Compressing the peer video according to a preset video compression algorithm.

Converting the compressed peer video into an electrical signal in analog form, converting the analog signal into a digital signal, and performing signal processing on the digital signal.

Sending the processed digital signal to the peer terminal.
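
Only the compression and packaging part of these steps lends itself to a software sketch; the analog and digital signal conversion happens in the radio hardware and is not modeled here. The example below uses zlib purely as a stand-in for the unspecified preset video compression algorithm (a real call would use a video codec), and the 4-byte length prefix and function names are assumptions made for illustration.

```python
import struct
import zlib


def pack_peer_frame(frame: bytes, level: int = 6) -> bytes:
    """Compress one peer-video frame and prefix it with its payload length."""
    payload = zlib.compress(frame, level)  # stand-in for the preset video codec
    return struct.pack("!I", len(payload)) + payload  # 4-byte big-endian length


def unpack_peer_frame(packet: bytes) -> bytes:
    """Inverse of pack_peer_frame, as the peer terminal would apply it."""
    if len(packet) < 4:
        raise ValueError("packet too short for length prefix")
    (length,) = struct.unpack("!I", packet[:4])
    payload = packet[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return zlib.decompress(payload)
```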

To achieve the above object, the present invention also provides a video image processing method, comprising:

Acquiring each video frame of the preview video of the current terminal, where the preview video is the video that, during a video call, is displayed for preview on the display interface of the current terminal.

Applying beautification processing to each acquired video frame.

Performing format conversion on the beautified preview video to transform the preview video into the peer video, where the peer video is the video to be displayed on the display interface of the peer terminal.

Displaying the beautified preview video on the display interface of the current terminal, and sending the peer video to the peer terminal.

Optionally, the method further includes:

Performing face recognition on each video frame according to a preset face recognition algorithm.

Judging, according to the recognition result, whether a face image exists in the currently recognized video frame.

When it is judged that a face image exists in the currently recognized video frame, applying beautification processing to that frame; when it is judged that no face image exists in the currently recognized video frame, ignoring that frame.

Optionally, applying beautification processing to each acquired video frame includes:

Identifying each preset facial part in the recognized face image.

Optionally,

The facial parts include the face, the eyes and the lips.

Applying the corresponding beautification processing to the facial parts using one or more beautification tools includes:

Whitening the face with the whitening tool.

Slimming the face with the face-slimming tool.

Removing dark circles around the eyes with the dark-circle removal tool.

Plumping the lips with the lip-plumping tool.

Optionally, performing format conversion on the beautified preview video includes: applying a YUV420 counterclockwise rotation to the video images of the beautified preview video.

Optionally, sending the peer video to the peer terminal includes:

Converting the compressed peer video into an electrical signal in analog form, converting the analog signal into a digital signal, and performing signal processing on the digital signal.

Sending the processed digital signal to the peer terminal.

The present invention provides a video image processing device and method. The device includes: an acquisition module configured to acquire each video frame of the preview video of the current terminal, where the preview video is the video that, during a video call, is displayed for preview on the display interface of the current terminal; a beautification module configured to apply beautification processing to each acquired video frame; a format conversion module configured to perform format conversion on the beautified preview video, transforming the preview video into the peer video, where the peer video is the video to be displayed on the display interface of the peer terminal; and a processing module configured to display the beautified preview video on the display interface of the current terminal and to send the peer video to the peer terminal. The scheme of the embodiments of the present invention improves the display effect of video images and the user experience, and because the preview video is beautified only once, it improves processing efficiency and preserves the real-time performance of the call video.

Brief description of the drawings

Fig. 1 is a schematic diagram of the hardware structure of an optional mobile terminal for implementing the embodiments of the present invention;

Fig. 2 is a schematic diagram of a wireless communication system for the mobile terminal shown in Fig. 1;

Fig. 3 is a block diagram of the video image processing device of an embodiment of the present invention;

Fig. 4 is a flow chart of the video image processing method of an embodiment of the present invention;

Fig. 5 is a schematic diagram of the video image processing method of an embodiment of the present invention.

The realization of the object, the functional features and the advantages of the present invention will be further described with reference to the drawings in conjunction with the embodiments.

Detailed description of the invention

It should be understood that the specific embodiments described herein are intended only to explain the present invention and not to limit it.

An optional mobile terminal for implementing the embodiments of the present invention will now be described with reference to the drawings. In the following description, suffixes such as "module", "component" or "unit" used to denote elements serve only to facilitate the description of the invention and have no specific meaning in themselves. Therefore, "module" and "component" may be used interchangeably.

A mobile terminal may be implemented in various forms. For example, the terminals described in the present invention may include mobile terminals such as mobile phones, smartphones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable media players) and navigation devices, as well as fixed terminals such as digital TVs and desktop computers. In the following, the terminal is assumed to be a mobile terminal. However, those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the configuration according to the embodiments of the present invention can also be applied to fixed terminals.

Fig. 1 is a schematic diagram of the hardware structure of a mobile terminal for implementing the embodiments of the present invention.

The mobile terminal 100 may include a wireless communication unit 110, an A/V (audio/video) input unit 120, a user input unit 130, a sensing unit 140, an output unit 150, a memory 160, an interface unit 170, a controller 180, a power supply unit 190 and so on. Fig. 1 shows a mobile terminal with various components, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead. The elements of the mobile terminal are described in detail below.

The wireless communication unit 110 typically includes one or more components that allow radio communication between the mobile terminal 100 and a wireless communication system or network. For example, the wireless communication unit may include at least one of a broadcast receiving module 111, a mobile communication module 112, a wireless Internet module 113, a short-range communication module 114 and a location information module 115.

The broadcast receiving module 111 receives broadcast signals and/or broadcast-related information from an external broadcast management server via a broadcast channel. The broadcast channel may include a satellite channel and/or a terrestrial channel. The broadcast management server may be a server that generates and transmits broadcast signals and/or broadcast-related information, or a server that receives previously generated broadcast signals and/or broadcast-related information and transmits them to the terminal. Broadcast signals may include TV broadcast signals, radio broadcast signals, data broadcast signals and the like, and may further include broadcast signals combined with TV or radio broadcast signals. Broadcast-related information may also be provided via a mobile communication network, in which case it may be received by the mobile communication module 112. The broadcast signal may exist in various forms, for example as an electronic program guide (EPG) of digital multimedia broadcasting (DMB) or an electronic service guide (ESG) of digital video broadcasting-handheld (DVB-H). The broadcast receiving module 111 may receive broadcast signals using various types of broadcast systems. In particular, it may receive digital broadcasts using digital broadcast systems such as digital multimedia broadcasting-terrestrial (DMB-T), digital multimedia broadcasting-satellite (DMB-S), digital video broadcasting-handheld (DVB-H), the MediaFLO® (media forward link only) data broadcasting system and integrated services digital broadcasting-terrestrial (ISDB-T). The broadcast receiving module 111 may be configured to suit the various broadcast systems that provide broadcast signals as well as the above digital broadcast systems. Broadcast signals and/or broadcast-related information received via the broadcast receiving module 111 may be stored in the memory 160 (or another type of storage medium).

The mobile communication module 112 transmits radio signals to and/or receives radio signals from at least one of a base station (for example, an access point or Node B), an external terminal and a server. Such radio signals may include voice call signals, video call signals, or various types of data sent and/or received in text and/or multimedia messages.

The wireless Internet module 113 supports wireless Internet access for the mobile terminal. The module may be coupled to the terminal internally or externally. The wireless Internet access technologies involved may include WLAN (wireless LAN, Wi-Fi), WiBro (wireless broadband), WiMAX (Worldwide Interoperability for Microwave Access), HSDPA (High Speed Downlink Packet Access) and so on.

The short-range communication module 114 is a module for supporting short-range communication. Examples of short-range communication technologies include Bluetooth™, radio frequency identification (RFID), Infrared Data Association (IrDA), ultra-wideband (UWB), ZigBee™ and so on.

The location information module 115 is a module for checking or acquiring the location information of the mobile terminal. A typical example is GPS (Global Positioning System). With current technology, the GPS module 115 calculates distance information from three or more satellites together with accurate time information, and applies triangulation to the calculated information to accurately compute three-dimensional current location information in terms of longitude, latitude and altitude. At present, the method for calculating location and time information uses three satellites and corrects the error of the calculated location and time information using one additional satellite. In addition, the GPS module 115 can calculate speed information by continuously computing the current location in real time.

The A/V input unit 120 is used to receive audio or video signals. The A/V input unit 120 may include a camera 121 and a microphone 122. The camera 121 processes image data of still pictures or video obtained by an image capture device in video capture mode or image capture mode. The processed image frames may be displayed on the display unit 151. Image frames processed by the camera 121 may be stored in the memory 160 (or another storage medium) or transmitted via the wireless communication unit 110; two or more cameras 121 may be provided depending on the configuration of the mobile terminal. The microphone 122 can receive sound (audio data) in operating modes such as phone call mode, recording mode and voice recognition mode, and can process such sound into audio data. In phone call mode, the processed audio (voice) data may be converted into a format that can be transmitted to a mobile communication base station via the mobile communication module 112 and output. The microphone 122 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated while receiving and transmitting audio signals.

The user input unit 130 may generate key input data according to commands entered by the user to control various operations of the mobile terminal. It allows the user to input various types of information and may include a keyboard, a metal dome, a touch pad (for example, a touch-sensitive component that detects changes in resistance, pressure, capacitance and the like caused by contact), a scroll wheel, a rocker and so on. In particular, when the touch pad is superimposed on the display unit 151 as a layer, a touch screen can be formed.

The sensing unit 140 detects the current state of the mobile terminal 100 (for example, whether the mobile terminal 100 is open or closed), the position of the mobile terminal 100, the presence or absence of user contact with the mobile terminal 100 (that is, touch input), the orientation of the mobile terminal 100, the acceleration or deceleration and direction of movement of the mobile terminal 100, and so on, and generates commands or signals for controlling the operation of the mobile terminal 100. For example, when the mobile terminal 100 is implemented as a slide-type mobile phone, the sensing unit 140 can sense whether the slide-type phone is open or closed. In addition, the sensing unit 140 can detect whether the power supply unit 190 is supplying power and whether the interface unit 170 is coupled to an external device. The sensing unit 140 may include a proximity sensor 141, which is described below in connection with the touch screen.

The interface unit 170 serves as an interface through which at least one external device can connect to the mobile terminal 100. For example, external devices may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port and so on. The identification module may store various information for verifying the user's authority to use the mobile terminal 100 and may include a user identity module (UIM), a subscriber identity module (SIM), a universal subscriber identity module (USIM) and the like. In addition, a device having an identification module (hereinafter an "identifying device") may take the form of a smart card; the identifying device can therefore be connected to the mobile terminal 100 via a port or other connecting means. The interface unit 170 may be used to receive input (for example, data or power) from an external device and pass the received input to one or more elements within the mobile terminal 100, or to transfer data between the mobile terminal and the external device.

In addition, when the mobile terminal 100 is connected to an external cradle, the interface unit 170 may serve as a path through which power is supplied from the cradle to the mobile terminal 100, or as a path through which various command signals input from the cradle are transmitted to the mobile terminal. The various command signals or the power input from the cradle may serve as signals for recognizing whether the mobile terminal is accurately mounted on the cradle. The output unit 150 is configured to provide output signals (for example, audio signals, video signals, alarm signals and vibration signals) in a visual, audible and/or tactile manner. The output unit 150 may include a display unit 151, an audio output module 152, an alarm unit 153 and so on.

The display unit 151 can display information processed in the mobile terminal 100. For example, when the mobile terminal 100 is in phone call mode, the display unit 151 can display a user interface (UI) or graphical user interface (GUI) related to the call or other communication (for example, text messaging or multimedia file download). When the mobile terminal 100 is in video call mode or image capture mode, the display unit 151 can display captured and/or received images, a UI or GUI showing the video or images and related functions, and so on.

Meanwhile, when the display unit 151 and the touch pad are superimposed on each other as layers to form a touch screen, the display unit 151 can serve as both an input device and an output device. The display unit 151 may include at least one of a liquid crystal display (LCD), a thin film transistor LCD (TFT-LCD), an organic light-emitting diode (OLED) display, a flexible display, a three-dimensional (3D) display and the like. Some of these displays may be constructed to be transparent so that the user can see through them from the outside; these may be called transparent displays, a typical example being a TOLED (transparent organic light-emitting diode) display. Depending on the desired implementation, the mobile terminal 100 may include two or more display units (or other display devices); for example, it may include an external display unit (not shown) and an internal display unit (not shown). The touch screen can be used to detect touch input pressure as well as touch input position and touch input area.

The audio output module 152 can, when the mobile terminal is in a mode such as call signal reception mode, call mode, recording mode, voice recognition mode or broadcast reception mode, convert audio data received by the wireless communication unit 110 or stored in the memory 160 into an audio signal and output it as sound. The audio output module 152 can also provide audio output related to a specific function performed by the mobile terminal 100 (for example, a call signal reception sound or a message reception sound). The audio output module 152 may include a speaker, a buzzer and so on.

The alarm unit 153 can provide output to notify the mobile terminal 100 of the occurrence of an event. Typical events include call reception, message reception, key signal input, touch input and so on. In addition to audio or video output, the alarm unit 153 can provide output in other ways to notify of the occurrence of an event. For example, the alarm unit 153 can provide output in the form of vibration: when a call, a message or some other incoming communication is received, the alarm unit 153 can provide tactile output (that is, vibration) to notify the user. With such tactile output, the user can recognize the occurrence of various events even when the user's mobile phone is in the user's pocket. The alarm unit 153 can also provide output notifying of the occurrence of an event via the display unit 151 or the audio output module 152.

The memory 160 can store software programs for the processing and control operations performed by the controller 180, and can temporarily store data that has been output or is to be output (for example, a phone book, messages, still images and videos). The memory 160 can also store data about the various patterns of vibration and audio signals output when a touch is applied to the touch screen.

The memory 160 may include at least one type of storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc and so on. The mobile terminal 100 may also cooperate with a network storage device that performs the storage function of the memory 160 over a network connection.

The controller 180 generally controls the overall operation of the mobile terminal. For example, the controller 180 performs the control and processing related to voice calls, data communication, video calls and so on. In addition, the controller 180 may include a multimedia module 181 for reproducing (or playing back) multimedia data; the multimedia module 181 may be built into the controller 180 or may be configured separately from it. The controller 180 can perform pattern recognition processing to recognize handwriting input or drawing input performed on the touch screen as characters or images.

The power supply unit 190 receives external or internal power under the control of the controller 180 and supplies the appropriate power required to operate each element and component.

The various embodiments described herein may be implemented in a computer-readable medium using, for example, computer software, hardware or any combination thereof. For a hardware implementation, the embodiments described herein may be implemented using at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor and an electronic unit designed to perform the functions described herein; in some cases, such embodiments may be implemented in the controller 180. For a software implementation, embodiments such as procedures or functions may be implemented with separate software modules, each of which performs at least one function or operation. The software code may be implemented as a software application (or program) written in any suitable programming language, stored in the memory 160 and executed by the controller 180.

So far, the mobile terminal has been described in terms of its functions. Below, for brevity, a slide-type mobile terminal is taken as an example among the various types of mobile terminal, such as folder-type, bar-type, swing-type and slide-type terminals. The present invention can nevertheless be applied to any type of mobile terminal and is not limited to slide-type mobile terminals.

The mobile terminal 100 shown in Fig. 1 may be configured to operate with wired and wireless communication systems, as well as satellite-based communication systems, that transmit data via frames or packets.

A communication system in which the mobile terminal according to the present invention can operate will now be described with reference to Fig. 2.

Such communication systems may use different air interfaces and/or physical layers. For example, the air interfaces used by communication systems include frequency division multiple access (FDMA), time division multiple access (TDMA), code division multiple access (CDMA), the Universal Mobile Telecommunications System (UMTS) (in particular, Long Term Evolution (LTE)), the Global System for Mobile Communications (GSM) and so on. As a non-limiting example, the following description relates to a CDMA communication system, but the teachings apply equally to other types of system.

Referring to Fig. 2, a CDMA wireless communication system may include multiple mobile terminals 100, multiple base stations (BS) 270, base station controllers (BSC) 275 and a mobile switching center (MSC) 280. The MSC 280 is configured to interface with a public switched telephone network (PSTN) 290. The MSC 280 is also configured to interface with the BSCs 275, which may be coupled to the base stations 270 via backhaul lines. The backhaul lines may be constructed according to any of several known interfaces, including, for example, E1/T1, ATM, IP, PPP, frame relay, HDSL, ADSL or xDSL. It will be understood that the system shown in Fig. 2 may include multiple BSCs 275.

Each BS 270 may serve one or more sectors (or regions), each sector being covered by an omnidirectional antenna or by an antenna pointing in a specific direction radially away from the BS 270. Alternatively, each sector may be covered by two or more antennas for diversity reception. Each BS 270 may be configured to support multiple frequency allocations, each with a specific spectrum (for example, 1.25 MHz or 5 MHz).

The intersection of a sector and a frequency allocation may be called a CDMA channel. The BS 270 may also be called a base transceiver subsystem (BTS) or another equivalent term. In that case, the term "base station" may be used broadly to denote a single BSC 275 and at least one BS 270. A base station may also be called a "cell site". Alternatively, the individual sectors of a specific BS 270 may be called multiple cell sites.

As shown in Fig. 2, a broadcast transmitter (BT) 295 transmits broadcast signals to the mobile terminals 100 operating in the system. The broadcast receiving module 111 shown in Fig. 1 is provided at the mobile terminal 100 to receive the broadcast signals transmitted by the BT 295. Fig. 2 also shows several Global Positioning System (GPS) satellites 300, which help locate at least one of the multiple mobile terminals 100.

Although multiple satellites 300 are depicted in Fig. 2, it will be understood that useful positioning information can be obtained with any number of satellites. The GPS module 115 shown in Fig. 1 is typically configured to cooperate with the satellites 300 to obtain the desired positioning information. Instead of or in addition to GPS tracking technology, other technologies capable of tracking the position of the mobile terminal may be used. In addition, at least one GPS satellite 300 may alternatively or additionally handle satellite DMB transmission.

In a typical operation of the wireless communication system, the BS 270 receives reverse-link signals from various mobile terminals 100. The mobile terminals 100 typically engage in calls, messaging and other types of communication. Each reverse-link signal received by a specific base station 270 is processed within that BS 270, and the resulting data is forwarded to the associated BSC 275. The BSC provides call resource allocation and mobility management functions, including the coordination of soft handover procedures between BSs 270. The BSC 275 also routes the received data to the MSC 280, which provides additional routing services for interfacing with the PSTN 290. Similarly, the PSTN 290 interfaces with the MSC 280, the MSC interfaces with the BSCs 275, and the BSCs 275 in turn control the BSs 270 to send forward-link signals to the mobile terminals 100.

Based on the above optional mobile terminal hardware structure and communication system, the embodiments of the method of the present invention are proposed.

With the rapid development of terminal application technology, video calling has gradually become a common form of everyday communication. In current video call technology, however, the video images displayed on the terminals of both parties tend to look dim because of factors such as lighting or angle, and the visual experience is poor. In face images in particular, the skin appears dull and lusterless, which seriously affects the user's image. This is unacceptable to users who pay particular attention to their personal image, such as female users. To address this problem, the present invention proposes an effective solution: while the two users are on a video call, beautification processing is applied to the video images. This scheme can effectively improve the display effect of the video images and improve the user experience. In addition, if during a video call beautification were applied simultaneously to the preview data (the preview video in the scheme of the embodiments of the present invention) and to the video data (the opposite-end video in the scheme of the embodiments), the load on the graphics processing unit (GPU) and the central processing unit (CPU) would be very high. Moreover, the longer the call, the longer the processing of the preview data and the video data lasts, which further increases GPU and CPU consumption and, in severe cases, causes obvious stuttering during video transmission. Therefore, in the scheme of the embodiments of the present invention, beautification is applied only to the preview data, and the preview data is then converted according to a certain format into the form of the video data, replacing the video data delivered by the camera. This avoids beautifying the preview data and the video data at the same time and reduces the workload by at least 50%. Because beautification is performed only once, processing speed improves considerably, which in turn improves the user experience. The detailed scheme is described below.

As shown in Figure 3, the first embodiment of the present invention proposes a video image processing device 01, which includes an acquisition module 02, a beautification module 03, a format conversion module 04 and a processing module 05. The video image processing device 01 of the embodiment of the present invention can be applied in any terminal with a video function and can process any form of video, including call video; for example, it can be applied to video processing during a VoLTE video call.

The acquisition module 02 is configured to acquire each frame of video image in the preview video of the current terminal, where the preview video is the video that, during a video call, is displayed for preview on the display interface of the current terminal.

To improve the display effect of video images, the embodiments of the present invention propose a processing scheme that beautifies the video images. The camera captures video images of the user through its capture function and sends them to the video image processing device 01 of the embodiment. A VoLTE video call involves both a preview video, displayed for preview on the display interface of the current terminal, and an opposite-end video, displayed on the display interface of the opposite call terminal. The simplest way to achieve a good beautification effect and visual experience would therefore be to beautify both the preview video and the opposite-end video. Considering the excessive GPU and CPU consumption this would cause, however, the scheme of the embodiments beautifies only the preview video and then applies format conversion to the beautified preview video to obtain the opposite-end video. Beautification is thus performed only once, which reduces the workload and memory consumption. Following this design, after receiving the video images sent by the camera, the video image processing device 01 beautifies only the preview video, displays the beautified preview video on the display interface of the current terminal, and sends the opposite-end video obtained by conversion to the opposite call terminal. Before this can happen, the acquisition module 02 must first acquire each frame of video image in the preview video of the current terminal. Every video consists of many frames, each of which is a picture. To obtain a good display effect throughout the video, every frame must be beautified. The acquisition module 02 therefore needs to keep pace with the camera's video transmission rate, acquiring every frame in the video stream in playback order.

The beautification module 03 is configured to perform beautification processing on each acquired frame of video image.

In the embodiments of the present invention, after the acquisition module 02 has acquired each frame of video image in the preview video, the beautification module 03 can perform the corresponding beautification processing on each frame obtained.

In addition, because the embodiments of the present invention apply beautification mainly to face images, face recognition must first be performed on each acquired frame using a preset face recognition algorithm before the beautification module 03 processes it. Only frames in which a face image is recognized are beautified; frames in which no face image is recognized can simply be ignored or skipped without beautification. Specifically, the face recognition of each frame can be carried out by the face recognition module 06, the judging module 07 and the determining module 08 described below.

Optionally, the device further includes a face recognition module 06, a judging module 07 and a determining module 08.

The face recognition module 06 is configured to perform face recognition on each frame of video image according to a preset face recognition algorithm.

In the embodiments of the present invention, to prevent the order of the frames from being mixed up during face recognition, each frame is first obtained in playback order before recognition, and after recognition the recognized frames are buffered in the same playback order.

In the embodiments of the present invention, the preset face recognition algorithm may be any existing practicable face recognition algorithm; the specific algorithm is not limited. Optionally, the face recognition algorithm may be the principal component analysis (PCA) face recognition algorithm.

The PCA face recognition algorithm is also known as the "eigenface" technique. Its basic idea is to find the basic elements of the distribution of face images (eyes, cheeks, jaw, lips and so on), that is, the eigenvectors of the covariance matrix of the face image sample set (these eigenvectors are called eigenfaces), and to use them to represent face images approximately. The eigenvectors of the covariance matrices of the eye, cheek and jaw sample sets are called "eigen sub-faces". The subspace that the eigen sub-faces span in the corresponding image space is called the "sub-face space". The projection distance of a test image window onto the sub-face space is calculated, and if the window satisfies a preset threshold comparison condition, it is judged to be a face.
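As a minimal sketch of the projection-distance test described above, the following Python code builds a face subspace from a training set with PCA and treats a window as a face when its distance to that subspace is at most a threshold. The function names, the number of components, the synthetic data and the threshold are illustrative assumptions and are not taken from the embodiment.

```python
import numpy as np

def fit_face_subspace(samples, n_components):
    """Return (mean, eigenfaces) for flattened training windows of shape (n, d)."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # The right singular vectors of the centered data are the eigenvectors
    # of the sample covariance matrix, i.e. the eigenfaces.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def distance_to_subspace(window, mean, eigenfaces):
    """Euclidean distance between a window and its projection onto the subspace."""
    centered = window - mean
    coefficients = eigenfaces @ centered
    reconstruction = eigenfaces.T @ coefficients
    return float(np.linalg.norm(centered - reconstruction))

def is_face(window, mean, eigenfaces, threshold):
    """Assumed threshold condition: small projection distance means 'face'."""
    return distance_to_subspace(window, mean, eigenfaces) <= threshold

# Synthetic stand-in data: "faces" vary along two fixed directions plus small noise.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 64))
faces = rng.normal(size=(50, 2)) @ basis + 0.01 * rng.normal(size=(50, 64))
mean, eigenfaces = fit_face_subspace(faces, n_components=2)

face_like = rng.normal(size=2) @ basis   # lies in the face subspace
non_face = 5.0 * rng.normal(size=64)     # unstructured window
```

In practice the threshold would have to be tuned on real face and non-face windows; the value used in a test of this sketch has no meaning outside the synthetic data.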

The judging module 07 is configured to judge, according to the recognition result of the face recognition module, whether a face image exists in the frame currently being recognized.

In the embodiments of the present invention, when the face recognition module 06 recognizes a face in the current frame, the judging module 07 determines that a face image exists in that frame; when the face recognition module 06 does not recognize a face in the current frame, the judging module 07 determines that no face image exists in that frame.

The determining module 08 is configured to instruct the beautification module to beautify the current frame when the judging module determines that a face image exists in it, and to ignore the current frame when the judging module determines that no face image exists in it.

In the embodiments of the present invention, once the judging module 07 has given its result, the corresponding processing can be carried out. When the judging module 07 determines that a face image exists in the current frame, the determining module 08 can activate the beautification module 03 and instruct it to beautify that frame. With this scheme the beautification module 03 does not need to stay in the working state all the time: to save terminal resources, when no beautification is needed the beautification module 03 can be placed in a standby state or a preset low-power state, which further reduces terminal resource consumption. When the judging module 07 determines that no face image exists in the current frame, the determining module 08 ignores that frame and proceeds to the next frame.
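The decision flow of the face recognition, judging and determining modules can be sketched as a simple per-frame loop. This is only an illustration: `detect_face` and `beautify` are hypothetical callables, and passing frames without a face through unchanged (so that playback order is preserved) is an assumption about what "ignoring" a frame means.

```python
def process_frames(frames, detect_face, beautify):
    """Beautify only frames in which a face is detected, preserving frame order."""
    output = []
    for frame in frames:                      # frames arrive in playback order
        if detect_face(frame):                # judging module: is a face present?
            output.append(beautify(frame))    # determining module triggers beautification
        else:
            output.append(frame)              # assumption: frame passes through untouched
    return output

# Hypothetical stand-ins for real frames and modules.
frames = [{"id": 0, "face": True}, {"id": 1, "face": False}, {"id": 2, "face": True}]
result = process_frames(
    frames,
    detect_face=lambda f: f["face"],
    beautify=lambda f: {**f, "beautified": True},
)
```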

In the embodiments of the present invention, after the face recognition module 06, the judging module 07 and the determining module 08 have carried out this series of recognition tasks, it can be determined whether the frame currently being processed needs beautification. For frames that do, the beautification scheme of the embodiments can be implemented as follows.

Optionally, the beautification of the video images by the beautification module 03 using one or more beautification tools includes steps S101-S103:

S101: identify each preset facial organ in the recognized face image.

In the embodiments of the present invention, a face comprises the facial area and features such as the cheeks, eyebrows, eyes, nose and mouth, and beautification is the process of embellishing and refining the face and its features. Therefore, after a face image has been recognized by the face recognition algorithm, the facial organs must be further located within it.

Which facial organs need to be identified can be set by each user according to individual needs or the application scenario, and is not specifically limited here.

Optionally, the facial organs include the face, the eyes and the lips. For example, a user who often has dark circles can set the eyes as a preset facial organ, so that the dark circles around the eyes are treated during beautification, while a user with a darker complexion can set the face as a preset facial organ, so that the facial skin is whitened during beautification.

The preset facial organs can be identified in the recognized face image with the PCA face recognition algorithm described above, or by the following method:

Acquire the feature regions of the facial parts in the recognized face image and extract the feature region data of each region; compare the extracted feature region data with the basic feature data of the different preset facial organs; and take as the identified facial organ the organ whose basic feature data differs from the feature region data by no more than a preset difference threshold.
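The threshold comparison above can be sketched as follows. The feature vectors, the sum-of-absolute-differences measure and the rule of choosing the closest organ when several fall within the threshold are assumptions made for illustration; the embodiment does not specify how the difference value is computed.

```python
def match_organ(region_features, reference_features, max_difference):
    """Return the organ whose reference data is closest to the region, or None.

    An organ qualifies only if its difference value is at most max_difference.
    """
    best_organ, best_diff = None, None
    for organ, reference in reference_features.items():
        # Assumed difference measure: sum of absolute differences.
        diff = sum(abs(a - b) for a, b in zip(region_features, reference))
        if diff <= max_difference and (best_diff is None or diff < best_diff):
            best_organ, best_diff = organ, diff
    return best_organ

# Hypothetical basic feature data for the preset organs.
references = {
    "face": [0.8, 0.2, 0.1],
    "eyes": [0.1, 0.9, 0.3],
    "lips": [0.2, 0.1, 0.9],
}

eye_region = match_organ([0.15, 0.85, 0.3], references, max_difference=0.3)
unknown_region = match_organ([0.5, 0.5, 0.5], references, max_difference=0.3)
```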

S102: retrieve one or more beautification tools contained in the beautification processing package.

In the embodiments of the present invention, after one or more preset facial organs have been identified for the user in step S101, the corresponding beautification can be applied to each identified organ. The specific processing is carried out by the various beautification tools in the preset beautification processing package.

Optionally, the beautification tools include a whitening tool, a face-slimming tool, a dark-circle removal tool and a lip-plumping tool.

The preset beautification processing package can contain multiple beautification tools with different functions. Besides the whitening, face-slimming, dark-circle removal and lip-plumping tools mentioned above, it can also include an acne removal tool, a wrinkle removal tool, a freckle removal tool, an eyebrow reshaping tool, an eye enlargement tool and many others, which are not listed one by one here. In use, the user can set which tools are active and which are inactive according to personal needs or the application scenario. This prevents all the tools in the preset package from entering the working state whenever the video image processing device of the embodiments processes video images. Different users and different application scenarios may call for different tools; if all of them entered the working state, the resulting processing might not be what the user wants and would waste terminal resources unnecessarily. The scheme of the embodiments avoids this waste and improves the user experience.

S103: apply the corresponding beautification to the facial organs using the one or more beautification tools.

In the embodiments of the present invention, after the facial organs have been identified in step S101 and the corresponding tools retrieved in step S102, the retrieved tools can be used to beautify the corresponding facial organs.

Optionally, applying the corresponding beautification to the facial organs with the one or more retrieved tools includes:

1. Whitening the face with the whitening tool.

2. Slimming the face with the face-slimming tool.

3. Removing dark circles from the eyes with the dark-circle removal tool.

4. Plumping the lips with the lip-plumping tool.
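Steps S102 and S103 can be sketched as a tool package that records which tools the user has activated and applies only those tools to their target organs. The class name `BeautyToolPackage`, its methods and the placeholder tools are hypothetical; real tools would modify pixel data rather than append tags.

```python
class BeautyToolPackage:
    """Hypothetical container for beautification tools with per-tool activation."""

    def __init__(self):
        self._tools = {}      # tool name -> (target organ, function)
        self._active = set()  # names of tools the user has switched on

    def register(self, name, organ, func):
        self._tools[name] = (organ, func)

    def activate(self, name):
        if name not in self._tools:
            raise KeyError(name)
        self._active.add(name)

    def active_tools(self):
        """S102: retrieve only the tools that are currently active."""
        return [(n, *self._tools[n]) for n in self._tools if n in self._active]


def beautify_organs(organs, package):
    """S103: apply each active tool to its target organ, if that organ was found."""
    result = dict(organs)
    for _name, organ, func in package.active_tools():
        if organ in result:
            result[organ] = func(result[organ])
    return result


# Placeholder tools that only tag their input.
package = BeautyToolPackage()
package.register("whitening", "face", lambda region: region + ["whitened"])
package.register("face_slimming", "face", lambda region: region + ["slimmed"])
package.register("dark_circle_removal", "eyes", lambda region: region + ["dark circles removed"])
package.register("lip_plumping", "lips", lambda region: region + ["plumped"])
package.activate("whitening")
package.activate("dark_circle_removal")

organs = {"face": [], "eyes": [], "lips": []}
processed = beautify_organs(organs, package)
```

Keeping inactive tools out of `active_tools()` mirrors the point made in the text: tools the user has not selected never run and therefore consume no processing time.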

In other embodiments of the present invention, other beautification tools may be used to apply different beautification to different organs; they are not listed one by one here. Note that each of the beautification tools above can be implemented as a separate piece of single-function software or by an integrated multi-function software package.

Optionally, the whitening tool and the dark-circle removal tool can be implemented by skin smoothing.

Skin smoothing means using layers, masks, channels, tools and filters in the image editor PS (Photoshop), or other software, to remove spots, blemishes, mottled colors and the like from the skin of a person in a picture. Smoothing a person's face in Photoshop makes the face look finer and smoother and the contours clearer.

Optionally, the preset skin smoothing algorithms include a single-channel skin smoothing algorithm and a three-channel skin smoothing algorithm based on an edge-preserving filter.

In the embodiments of the present invention, the channel skin smoothing algorithm includes the following steps S201-S206:

S201: open the image and go to the Channels panel. Duplicate the blue channel.

S202: apply Filter > Other > High Pass to the copy of the blue channel.

S203: sample a nearby color with the Eyedropper tool, then paint over the parts to be protected with the Brush tool, including the shadow details of the eyes, nose, eyebrows, mouth and hair.

S204: use the Calculations command on the Image menu to generate the Alpha1 channel, and set the parameters on this channel.

S205: load the selection by a preset operation (Ctrl-click the Alpha1 channel) or a preset instruction, and invert the selection by a preset operation (for example Shift+Ctrl+I). Return to the Layers panel and click the background layer to activate it. Then create a Curves adjustment layer and adjust the curve while watching the image change. At this point there is no need to remove the blemishes completely; they only need to be weakened noticeably, because the preceding operations will be repeated once more below.

S206: stamp the visible layers by a preset operation (the Shift+Ctrl+Alt+E key combination) or a preset instruction, and repeat the operations above on the stamped layer. The parameters for these later operations are set by the operator's own observation. The guiding principle is to make only slight adjustments, so as to keep the tones of the image balanced while removing blemishes more effectively. For example, if yellowish blotches are found in dark areas, including facial hair, take the Sponge tool from the toolbox with the mode set to Desaturate, set a small value and carefully wipe over the blotches. Then use the Brush tool to sample a nearby color and paint (Brush in Color mode).
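The manual steps S201-S206 have no exact programmatic equivalent, but their core idea (high-pass the blue channel, select the dark blemishes it reveals, lift only those pixels, then repeat) can be approximated. The sketch below is a loose analogue under several assumptions: a box blur stands in for the High Pass filter's smoothing, blending toward the local mean stands in for the Curves adjustment, the protection of eyes and hair in S203 is not modeled, and `radius`, `threshold` and `strength` are invented parameters.

```python
import numpy as np

def box_blur(channel, radius):
    """Mean filter with edge padding; channel is a 2-D float array."""
    padded = np.pad(channel, radius, mode="edge")
    size = 2 * radius + 1
    height, width = channel.shape
    total = np.zeros((height, width), dtype=float)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / (size * size)

def weaken_blemishes(rgb, radius=2, threshold=0.05, strength=0.5):
    """One pass over a float image of shape (H, W, 3) with values in [0, 1]."""
    blue = rgb[..., 2]
    high_pass = blue - box_blur(blue, radius)   # analogue of S202
    mask = high_pass < -threshold               # dark spots, analogue of S204-S205
    local_mean = np.stack(
        [box_blur(rgb[..., c], radius) for c in range(3)], axis=-1
    )
    result = rgb.copy()
    # Lift only the selected pixels toward their surroundings (stand-in for Curves).
    result[mask] = (1.0 - strength) * rgb[mask] + strength * local_mean[mask]
    return result

def smooth_skin(rgb, passes=2, **kwargs):
    """Repeat the pass, as S206 repeats the earlier operations."""
    for _ in range(passes):
        rgb = weaken_blemishes(rgb, **kwargs)
    return rgb

# Uniform "skin" with one dark spot in the middle.
image = np.full((9, 9, 3), 0.6)
image[4, 4] = 0.3
smoothed = smooth_skin(image)
```

Each pass only weakens the spot rather than removing it, which matches the advice in S205 to make gradual adjustments.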

The format conversion module 04 is configured to perform format conversion on the beautified preview video, converting the preview video into the opposite-end video, where the opposite-end video is the video to be displayed on the display interface of the opposite call terminal.

In the embodiments of the present invention, after the beautification module 03 has beautified the preview video, the format conversion module 04 can perform format conversion on the beautified preview video. Because the preview video and the opposite-end video differ in orientation, obtaining the opposite-end video from the preview video requires an angular transformation of the preview data, so that the resulting opposite-end video has a suitable playback angle. The specific transformation can be set according to the application scenario; one transformation embodiment of the present invention is described below.

Optionally, the format conversion performed by the format conversion module 04 on the beautified preview video includes: rotating the YUV420 video images of the beautified preview video counterclockwise.

The Y in YUV stands for "gray level" or "lightness"; the English terms are luminance and luma, where luminance is written Y and luma is written Y'. The relation between Y and RGB is: Y = 0.2126R + 0.7152G + 0.0722B, and Y' = 0.2126R' + 0.7152G' + 0.0722B', where the prime symbol (') indicates that gamma compression has been applied. In the YUV representation, color is expressed as chrominance (color difference), and U and V are the two color-difference components: U = B' - Y' (blue minus luma), also written Cb, and V = R' - Y' (red minus luma), also written Cr. Y'UV is therefore also written Y'CbCr.
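The formulas above translate directly into code. The sketch below computes luma and the two unscaled color differences for normalized inputs; the function names are illustrative, and practical Cb/Cr encodings additionally scale and offset these differences, which is omitted here.

```python
def luma(r, g, b):
    """Y (or Y' when the inputs are gamma-compressed) from R, G, B in [0, 1]."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def color_differences(r, g, b):
    """Unscaled color differences U = B - Y and V = R - Y."""
    y = luma(r, g, b)
    return b - y, r - y

y_white = luma(1.0, 1.0, 1.0)                        # the three weights sum to 1
u_gray, v_gray = color_differences(0.5, 0.5, 0.5)    # gray carries no color
u_blue, v_blue = color_differences(0.0, 0.0, 1.0)
```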

The YUV420 picture format in general use is actually Y'UV, and "420" refers to the sampling rates of Y, U and V. In the YUV420 format, the Y' value of every pixel is stored first, followed by the U values, sampled once per 2×2 block, and finally the V values, also sampled once per 2×2 block.

In YUV420, each pixel has its own Y value, and each 2×2 block of pixels shares one U value and one V value. The Y values are arranged identically in all YUV420 images, and an image containing only Y is simply a grayscale image. The YUV420sp and YUV420p formats differ fundamentally in how the UV data is arranged. In 420p, all the U values are stored first and then all the V values, so that U and V each form a contiguous block. In 420sp, the U and V values are stored alternately: UV, UV, and so on. From this, the size of a YUV420 image in memory can be calculated exactly: the number of Y samples is width × height, U = Y/4 and V = Y/4, so the whole image holds width × height × 3/2 samples.
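The size calculation and the two chroma layouts can be illustrated as follows. This is a sketch assuming 8-bit samples, even width and height, and U-before-V ordering in the interleaved case as stated in the text; the function names are illustrative.

```python
def yuv420_plane_sizes(width, height):
    """Sample counts of the Y, U and V planes (width and height assumed even)."""
    y_size = width * height
    return y_size, y_size // 4, y_size // 4

def yuv420_buffer_size(width, height):
    """Total samples in one frame: Y + Y/4 + Y/4."""
    return sum(yuv420_plane_sizes(width, height))

def split_yuv420p(buf, width, height):
    """Planar layout: all Y, then all U, then all V."""
    y_size, u_size, v_size = yuv420_plane_sizes(width, height)
    y = buf[:y_size]
    u = buf[y_size:y_size + u_size]
    v = buf[y_size + u_size:y_size + u_size + v_size]
    return y, u, v

def split_yuv420sp(buf, width, height):
    """Semi-planar layout: all Y, then U and V interleaved as UVUV..."""
    y_size, u_size, v_size = yuv420_plane_sizes(width, height)
    y = buf[:y_size]
    uv = buf[y_size:y_size + u_size + v_size]
    return y, uv[0::2], uv[1::2]

# The same 4x2 frame in both layouts: 8 Y samples, 2 U samples, 2 V samples.
planar = bytes([0, 1, 2, 3, 4, 5, 6, 7, 100, 101, 200, 201])
semi_planar = bytes([0, 1, 2, 3, 4, 5, 6, 7, 100, 200, 101, 201])
vga_size = yuv420_buffer_size(640, 480)
```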

In the embodiments of the present invention, based on the theory above, the opposite-end video is obtained by rotating the YUV420 video images of the preview video counterclockwise by a certain angle, which can be determined according to the application scenario. This scheme avoids a separate beautification pass over the opposite-end video received from the camera and significantly reduces GPU and CPU consumption.
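As an illustration of such a rotation, the sketch below rotates a planar YUV420 (420p) frame by 90 degrees counterclockwise, rotating the Y plane and the half-resolution U and V planes separately. The 90-degree angle, the planar layout and the function names are assumptions; the embodiment leaves the angle to the application scenario.

```python
def rotate_plane_ccw(plane, width, height):
    """Rotate a row-major plane 90 degrees counterclockwise.

    The result is `height` samples wide and `width` samples tall.
    """
    out = bytearray(width * height)
    for r in range(width):          # rows of the rotated plane
        for c in range(height):     # columns of the rotated plane
            out[r * height + c] = plane[c * width + (width - 1 - r)]
    return bytes(out)

def rotate_yuv420p_ccw(frame, width, height):
    """Rotate a planar YUV420 frame; the output frame is height x width."""
    y_size = width * height
    c_size = y_size // 4
    y = frame[:y_size]
    u = frame[y_size:y_size + c_size]
    v = frame[y_size + c_size:y_size + 2 * c_size]
    cw, ch = width // 2, height // 2   # chroma planes are half size in each direction
    return (rotate_plane_ccw(y, width, height)
            + rotate_plane_ccw(u, cw, ch)
            + rotate_plane_ccw(v, cw, ch))

# 4x2 frame: Y rows are [0, 1, 2, 3] and [4, 5, 6, 7]; U = [100, 101]; V = [200, 201].
frame = bytes([0, 1, 2, 3, 4, 5, 6, 7, 100, 101, 200, 201])
rotated = rotate_yuv420p_ccw(frame, 4, 2)
```

The rotated frame has swapped dimensions, so a receiver must interpret it as 2 samples wide and 4 samples tall.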

The processing module 05 is configured to display the beautified preview video on the display interface of the current terminal and to send the opposite-end video to the opposite call terminal.

In the embodiments of the present invention, after the beautification module 03 has beautified each acquired frame, preview video images with a better display effect are obtained. The processing module 05 can then display the beautified preview video images on the display interface of the current terminal and send the opposite-end video, converted from the beautified preview video, to the opposite call terminal for display.

Optionally, sending the opposite-end video to the opposite call terminal by the processing module 05 includes steps S301-S303:

S301: compress the opposite-end video according to a preset video compression algorithm.

In the embodiments of the present invention, if the captured video images are to be sent over the Internet and displayed on a remote computer, they must be compressed; common compression formats include H.261, JPEG and MPEG. Otherwise the bandwidth required for transmission would be very large. For example, when a film is played, the player shows a transmission rate such as 250 kbps, 400 kbps or 1000 kbps; the higher the picture quality, the higher this rate. Camera video transmission follows the same principle. If the camera resolution is set to 640 × 480, each captured picture is about 50 kb in size, and at 30 frames per second the rate required for the camera video is 50 × 30/s = 1500 kbps = 1.5 Mbps. In practice, the resolution people usually use for Internet video chat is 320 × 240 or even lower, and the frame rate is 24 frames per second. In that case the video transmission rate is below 300 kbps, and a reasonably smooth video chat is possible. With a more efficient video compression method such as MPEG-1, the transmission rate can be reduced to below 200 kbps. This is the network transmission rate a camera needs for ordinary video chat.
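The bandwidth arithmetic in this paragraph can be reproduced in a few lines. The sketch takes the figures from the text at face value and reads "kb" as kilobits so that the units agree with kbps; the per-frame budget for the 24 fps case is derived from the stated 300 kbps ceiling and is not a figure from the text.

```python
def stream_rate_kbps(frame_size_kb, frames_per_second):
    """Required rate in kbps for frames of the given size (in kilobits)."""
    return frame_size_kb * frames_per_second

# 640 x 480 at about 50 kb per frame and 30 frames per second.
vga_rate_kbps = stream_rate_kbps(50, 30)
vga_rate_mbps = vga_rate_kbps / 1000

# Largest average frame size that keeps a 24 fps stream at 300 kbps.
chat_frame_budget_kb = 300 / 24
```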

Video compression is the core of video processing. Depending on whether it is performed in real time, it can be divided into non-real-time compression and real-time compression. Video transmission (for example QQ instant video chat) requires real-time compression.

Video compression is lossy and, generally speaking, achieves very high compression ratios. Such high ratios are possible because video images contain a great deal of temporal and spatial redundancy. Temporal redundancy means that the pixel values at the same position in two adjacent frames are similar and highly correlated. For still scenes, two frames may even be identical; for moving images, a high correlation remains after a certain operation (motion estimation). Spatial correlation means that, within one frame, adjacent pixels are also correlated. These correlations are the founding assumptions of video compression algorithms. In other words, if the two conditions are not met (for example, pure white-noise images or frequent scene cuts), video compression performs very poorly. The key algorithm for removing temporal correlation is motion estimation, which finds the position in the previous frame that best matches a macroblock of the current image. In many cases only this relative coordinate needs to be recorded, which saves a large number of code words and raises the compression ratio. Motion estimation is always the most critical, core part of a video compression algorithm. Spatial correlation is removed by the discrete cosine transform (DCT), which maps the data from the spatial domain to the frequency domain; the DCT coefficients are then quantized. Essentially all lossy compression involves quantization, which is what raises the compression ratio most noticeably.
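Motion estimation as described above can be sketched with an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). This is a generic textbook form under assumed parameters (block size, search range, frames as lists of rows), not the search strategy of any particular codec.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block from a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y)
               for row_a, row_b in zip(block_a, block_b)
               for x, y in zip(row_a, row_b))

def estimate_motion(prev, cur, top, left, size, search):
    """Find the offset (dy, dx) into prev that best matches the block of cur."""
    target = get_block(cur, top, left, size)
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate block would fall outside the previous frame
            cost = sad(target, get_block(prev, t, l, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2], best[0]

# Previous frame with unique values; current frame is the same content
# shifted down by 2 rows and right by 1 column.
prev = [[r * 8 + c for c in range(8)] for r in range(8)]
cur = [[prev[r - 2][c - 1] if r >= 2 and c >= 1 else 0 for c in range(8)]
       for r in range(8)]
dy, dx, cost = estimate_motion(prev, cur, top=4, left=4, size=2, search=2)
```

An encoder that finds a zero-cost match, as here, only needs to store the offset for that block instead of its pixel values, which is the saving in code words the text refers to.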

The original image file is large and must be compressed before it can be transmitted quickly and played smoothly. The compression ratio is the parameter that measures how much an image is compressed. In general, the compression ratio of a camera is about 5:1. That is, if 30 seconds of uncompressed images occupy 30 MB, then after compression at the camera's 5:1 ratio they occupy 6 MB.

Optionally, the preset video compression algorithms include: Motion JPEG (still-image, or frame-by-frame, compression) M-JPEG, Moving Picture Experts Group MPEG, H.264, Wavelet (wavelet compression), Joint Photographic Experts Group JPEG 2000, and the digital audio and video coding standard AVS.

S302: convert the compressed opposite-end video into an electrical signal in analog form, convert the analog signal into a digital signal, and perform signal processing on the digital signal.

In the embodiments of the present invention, the conversion of the video image to an electrical signal, the conversion of the analog signal to a digital signal, and the post-processing of the digital signal can be carried out by any currently feasible method. Post-processing of the digital signal mainly means optimizing the digital signal parameters of the image through a series of complex mathematical operations, and is mainly performed by a digital signal processing (DSP) chip.

S303: send the processed digital signal to the opposite call terminal.

After the video images have been processed by the steps above, the beautified video images can be sent to the video display ends, which here include the display terminals of both parties to the video call, for example their mobile phones, computers or iPads. For the local display terminal, the uncompressed video images are converted directly into a digital signal and sent to the display interface for display. For the other party's display terminal, the digital signal of the compressed video images must first be sent to that terminal, which can display the video only after receiving the digital signal and decompressing it.

Note that the digital signal of the video images can be sent to the other party's display terminal by any wired or wireless transmission method, for example broadband, 3G or 4G; this is not specifically limited here.

All the basic features of the scheme of the present invention have now been explained. Note that the content above describes only specific embodiments of the present invention and should not be taken as its final scheme. Other implementations may be used in other embodiments; every implementation identical or similar to the embodiments of the present invention, and any combination of the basic features of the scheme, falls within the protection scope of the present invention.

To achieve the above object, the present invention also proposes a video image processing method. As shown in Figures 4 and 5, the method includes steps S401-S404:

S401: acquire each frame of video image in the preview video of the current terminal, where the preview video is the video that, during a video call, is displayed for preview on the display interface of the current terminal.

Optionally, the method further includes:

S402: perform beautification processing on each acquired frame of video image.

Identify each preset facial organ in the recognized face image.

Optionally,

the facial organs include the face, the eyes and the lips.

Whiten the face with the whitening tool.

Slim the face with the face-slimming tool.

Remove dark circles from the eyes with the dark-circle removal tool.

Plump the lips with the lip-plumping tool.

S403: perform format conversion on the beautified preview video, converting the preview video into the opposite-end video, where the opposite-end video is the video to be displayed on the display interface of the opposite call terminal.

S404: display the beautified preview video on the display interface of the current terminal, and send the opposite-end video to the opposite call terminal.

Optionally, sending the opposite-end video to the opposite call terminal includes:

Sending the processed digital signal to the opposite call terminal.
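Steps S401-S404 can be summarized as a per-frame pipeline in which beautification runs exactly once and the opposite-end frame is derived from the already beautified preview frame. The sketch below uses hypothetical callables for each stage and strings in place of real frames; it only illustrates the data flow.

```python
def run_call_pipeline(camera_frames, beautify, convert_format, display, send):
    """Process frames in order: beautify once, then display locally and send."""
    for frame in camera_frames:              # S401: acquire each preview frame
        preview = beautify(frame)            # S402: single beautification pass
        remote = convert_format(preview)     # S403: derive the opposite-end frame
        display(preview)                     # S404: show on the current terminal
        send(remote)                         # S404: send to the opposite call terminal

# Stand-in stages that record what happens.
calls = {"beautify": 0}
shown, sent = [], []

def beautify(frame):
    calls["beautify"] += 1
    return frame + "+beauty"

run_call_pipeline(
    ["f0", "f1", "f2"],
    beautify,
    convert_format=lambda preview: preview + "+rotated",
    display=shown.append,
    send=sent.append,
)
```

Because the opposite-end frame is produced from the beautified preview frame, the beautification function is invoked once per frame rather than twice.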

The present invention proposes a kind of video image processing device, and this device includes: acquisition module, is used for adopting Each frame video image in the preview video of collection present terminal；Wherein, this preview video refers to: logical In words video, for the video of preview on currently displayed terminal demonstration interface.U.S. face module, for right The each frame video image gathered carries out U.S. face and processes.Format conversion module, for processing through U.S. face Preview video carry out format conversion, preview video is transformed to opposite end video, wherein, this opposite end video Refer to, for display video on partner terminal demonstration interface.Processing module, for passing through On the display interface of the currently displayed terminal of preview video that U.S. face processes, and it is sent to lead to by opposite end video Words distant terminal.By the scheme of the embodiment of the present invention, it is possible to improve the display effect of video image, carry The experience sense of high user, and only preview video is done once U.S. face process, improve treatment effeciency, protect Demonstrate,prove the real-time of call video.

It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims

1. A video image processing device, characterized in that the device comprises: an acquisition module, a beautification module, a format conversion module, and a processing module;

the acquisition module is configured to capture each video frame in the preview video of the current terminal, where the preview video is the video, within the call video, that is previewed on the display interface of the current terminal;

the beautification module is configured to beautify each captured video frame;

the format conversion module is configured to perform format conversion on the beautified preview video and transform the preview video into an opposite-end video, where the opposite-end video is the video displayed on the display interface of the opposite call terminal;

the processing module is configured to display the beautified preview video on the display interface of the current terminal and send the opposite-end video to the opposite call terminal.

2. The video image processing device according to claim 1, characterized in that the device further comprises: a face recognition module, a judging module, and a determining module;

the face recognition module is configured to perform face recognition on each video frame according to a preset face recognition algorithm;

the judging module is configured to judge, according to the recognition result of the face recognition module, whether a face image exists in the currently recognized video frame;

the determining module is configured to: when the judging module judges that a face image exists in the currently recognized video frame, cause the beautification module to beautify the currently recognized video frame; and when the judging module judges that no face image exists in the currently recognized video frame, ignore the currently recognized video frame.

3. The video image processing device according to claim 2, characterized in that the beautification module beautifying each captured video frame comprises:

identifying preset facial parts respectively from the recognized face image;

retrieving a preset beautification processing data package, the beautification processing data package containing one or more beautification tools;

applying the corresponding beautification processing to the facial parts using the one or more beautification tools.

4. The video image processing device according to claim 3, characterized in that:

the facial parts comprise: the face, the eyes, and the lips;

the beautification tools comprise: a whitening tool, a face-slimming tool, a dark-circle removal tool, and a lip-plumping tool;

the beautification module applying the corresponding beautification processing to the facial parts using the one or more beautification tools comprises:

whitening the face using the whitening tool;

slimming the face using the face-slimming tool;

removing dark circles from the eyes using the dark-circle removal tool;

plumping the lips using the lip-plumping tool.

5. The video image processing device according to claim 1, characterized in that the format conversion module performing format conversion on the beautified preview video comprises: performing a YUV420 counter-clockwise rotation on the video images of the beautified preview video.

6. A video image processing method, characterized in that the method comprises:

capturing each video frame in the preview video of the current terminal, where the preview video is the video, within the call video, that is previewed on the display interface of the current terminal;

beautifying each captured video frame;

performing format conversion on the beautified preview video to transform the preview video into an opposite-end video, where the opposite-end video is the video displayed on the display interface of the opposite call terminal;

displaying the beautified preview video on the display interface of the current terminal, and sending the opposite-end video to the opposite call terminal.

7. The video image processing method according to claim 6, characterized in that the method further comprises:

performing face recognition on each video frame according to a preset face recognition algorithm;

judging, according to the recognition result, whether a face image exists in the currently recognized video frame;

8. The video image processing method according to claim 7, characterized in that the beautifying of each captured video frame comprises:

9. The video image processing method according to claim 8, characterized in that:

the facial parts comprise: the face, the eyes, and the lips;

the applying of the corresponding beautification processing to the facial parts using the one or more beautification tools comprises:

10. The video image processing method according to claim 6, characterized in that the performing of format conversion on the beautified preview video comprises: performing a YUV420 counter-clockwise rotation on the video images of the beautified preview video.