CN110049324A - Video coding method, system, device and computer-readable storage medium - Google Patents

Info

Publication number: CN110049324A
Application number: CN201910297964.XA
Authority: CN (China)
Prior art keywords: encoded, video, interest region, video frame, region
Legal status: Granted; currently active (as listed by Google Patents; the status is an assumption and not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN110049324B (en)
Inventor: 齐燕
Current assignee: OneConnect Smart Technology Co Ltd (as listed by Google Patents; the listed assignees may be inaccurate)
Original assignee: OneConnect Smart Technology Co Ltd
Application filed by: OneConnect Smart Technology Co Ltd
Related priority entries: CN201910297964.XA (CN110049324B); PCT/CN2019/120899 (WO2020207030A1)
Publications: CN110049324A; CN110049324B (granted)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: as H04N19/00, using adaptive coding
    • H04N19/134: as H04N19/10, characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties
    • H04N19/167: Position within a video image, e.g. region of interest [ROI]
    • H04N19/169: as H04N19/10, characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: as H04N19/169, the unit being an image region, e.g. an object
    • H04N19/176: as H04N19/17, the region being a block, e.g. a macroblock
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/15: Conference systems

Abstract

The present invention provides a video coding method, system, device and computer-readable storage medium based on face detection technology. The method comprises: obtaining a video frame to be encoded; performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and taking the area of the video frame to be encoded outside the region of interest as the non-interest region; and obtaining the coding bit rates respectively corresponding to the region of interest and the non-interest region, and encoding the two regions separately at their respective bit rates. The invention preserves user experience while reducing the video transmission bit rate.

Description

Video coding method, system, device and computer-readable storage medium
Technical field
The present invention relates to the technical field of video coding, and in particular to a video coding method, system, device and computer-readable storage medium.
Background
The development of video services is inevitably constrained by limited bandwidth resources. Compressing video at a low bit rate often degrades video quality and therefore user experience, and the poorer user experience in turn limits the development of video services. A video coding method that balances user experience against the video transmission bit rate is therefore urgently needed.
Summary of the invention
The main purpose of the present invention is to provide a video coding method that solves the technical problem that existing video coding methods cannot balance user experience against the video transmission bit rate.
To achieve the above object, the present invention provides a video coding method comprising the following steps:
Obtaining a video frame to be encoded;
Performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and taking the area of the video frame to be encoded outside the region of interest as the non-interest region;
Obtaining the coding bit rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region separately at their respective bit rates.
Optionally, the step of determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result includes:
Determining, according to the face detection result, whether a face is present in the video frame to be encoded;
If no face is present in the video frame to be encoded, obtaining a preset central area and taking the central area as the region of interest of the video frame to be encoded.
Optionally, before the step of obtaining the video frame to be encoded, the method includes:
Obtaining the video to be encoded and its video information, and obtaining the video type of the video to be encoded from the video information;
When the video to be encoded is a film or television video, obtaining the facial features of the main characters from the video information;
After the step of determining, according to the face detection result, whether a face is present in the video frame to be encoded, the method includes:
If a face is present in the video frame to be encoded, judging, according to the facial features of the main characters and the face detection result, whether the video frame to be encoded contains a target face matching the facial features of a main character;
If the video frame to be encoded contains a target face matching the facial features of a main character, taking the region corresponding to the target face as the region of interest of the video frame to be encoded.
Optionally, after the step of judging, according to the facial features of the main characters and the face detection result, whether the video frame to be encoded contains a target face matching the facial features of a main character, the method includes:
If the video frame to be encoded contains no target face matching the facial features of a main character, taking the region where the faces in the video frame to be encoded are located as the region of interest of the video frame to be encoded.
Optionally, the step of obtaining the coding bit rates respectively corresponding to the region of interest and the non-interest region, and encoding the two regions separately at their respective bit rates, includes:
Determining the macroblocks belonging to the region of interest and to the non-interest region respectively;
Obtaining the macroblock distance between each macroblock of the non-interest region and the region of interest, and determining, based on the macroblock distance, the first bit rate corresponding to each macroblock of the non-interest region, wherein the macroblock distance and the first bit rate are negatively correlated;
Obtaining the second bit rate corresponding to the region of interest, and encoding the non-interest region and the region of interest at the first bit rate and the second bit rate respectively.
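The macroblock distance step can be sketched as follows. This is only an illustration under stated assumptions: the text does not define the distance metric or the macroblock size, so the 16-pixel macroblock, the rectangular region of interest and the Chebyshev distance in macroblock units are all choices made here, and `macroblock_distances` is a hypothetical helper name.

```python
from typing import Dict, Tuple

MB = 16  # assumed macroblock size in pixels (not specified in the text)


def macroblock_distances(
    width: int, height: int, roi: Tuple[int, int, int, int]
) -> Dict[Tuple[int, int], int]:
    """Map each macroblock (col, row) to its distance from the region of interest.

    roi is (x0, y0, x1, y1) in pixels with exclusive upper bounds.
    Macroblocks overlapping the region of interest get distance 0; the others
    get the Chebyshev distance, in macroblocks, to the nearest ROI macroblock.
    """
    x0, y0, x1, y1 = roi
    c0, r0 = x0 // MB, y0 // MB              # first ROI macroblock column/row
    c1, r1 = (x1 - 1) // MB, (y1 - 1) // MB  # last ROI macroblock column/row
    cols = (width + MB - 1) // MB
    rows = (height + MB - 1) // MB
    distances = {}
    for r in range(rows):
        for c in range(cols):
            dx = max(c0 - c, 0, c - c1)
            dy = max(r0 - r, 0, r - r1)
            distances[(c, r)] = max(dx, dy)
    return distances
```

A macroblock with distance 0 would be encoded at the second bit rate; every other macroblock would receive a first bit rate that does not increase as its distance grows.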
Optionally, the step of determining, based on the macroblock distance, the first bit rate corresponding to each macroblock of the non-interest region, wherein the macroblock distance and the first bit rate are negatively correlated, includes:
Comparing the macroblock distance of each macroblock of the non-interest region with preset distance intervals to determine the distance interval in which each macroblock's distance falls;
Obtaining the preset correspondence between distance intervals and bit rates, obtaining the target bit rate corresponding to the distance interval in which each macroblock's distance falls, and taking that target bit rate as the first bit rate of the macroblock.
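The interval lookup can be sketched with a sorted list of interval bounds. The bounds and rate values below are hypothetical placeholders, since the text only requires that a larger distance never maps to a higher bit rate; `first_rate_for_distance` is likewise an illustrative name.

```python
import bisect

# Hypothetical preset intervals: distances 1-2, 3-5, and 6 or more macroblocks.
DISTANCE_UPPER_BOUNDS = [2, 5]
# Hypothetical rates (arbitrary units), one per interval, non-increasing.
INTERVAL_RATES = [600, 300, 150]


def first_rate_for_distance(distance: int) -> int:
    """Return the first bit rate for a non-interest macroblock at this distance."""
    if distance < 1:
        raise ValueError("non-interest macroblocks have distance >= 1")
    # bisect_left finds the first interval whose upper bound is >= distance.
    index = bisect.bisect_left(DISTANCE_UPPER_BOUNDS, distance)
    return INTERVAL_RATES[index]
```

Keeping the rates in a table rather than a formula matches the preset correspondence described above and makes the negative correlation easy to check by inspection.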
Optionally, the video coding method further includes:
Receiving a no-viewer notification sent by a user terminal, wherein the no-viewer notification is sent by the user terminal when it detects that no gaze is directed at its screen;
Reducing the coding bit rate of the current video frame to be encoded.
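A minimal sketch of the rate reduction on a no-viewer notification is given below. The text does not say by how much the bit rate is reduced, so the scaling factor and the lower bound are assumptions, and `rate_after_no_viewer` is a hypothetical name.

```python
def rate_after_no_viewer(
    current_kbps: int, factor: float = 0.5, floor_kbps: int = 64
) -> int:
    """Return the reduced bit rate to use once a no-viewer notification arrives.

    factor and floor_kbps are illustrative assumptions, not values from the text.
    """
    if not 0 < factor < 1:
        raise ValueError("factor must be between 0 and 1")
    # Never raise the rate, and never drop below the floor unless already below it.
    return min(current_kbps, max(floor_kbps, int(current_kbps * factor)))
```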
In addition, to achieve the above object, the present invention also provides a video coding system, which includes:
A video frame obtaining module, for obtaining a video frame to be encoded;
An interest determination module, for performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and taking the area of the video frame to be encoded outside the region of interest as the non-interest region;
An encoding execution module, for obtaining the coding bit rates respectively corresponding to the region of interest and the non-interest region, and encoding the two regions separately at their respective bit rates.
In addition, to achieve the above object, the present invention also provides a video coding device, which includes a processor, a memory, and a video coding program stored in the memory and executable by the processor, wherein the video coding program, when executed by the processor, implements the steps of the video coding method described above.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium on which a video coding program is stored, wherein the video coding program, when executed by a processor, implements the steps of the video coding method described above.
The embodiments of the present invention obtain a video frame to be encoded; perform face detection on it based on a preset rule to obtain a face detection result; determine the region of interest of the frame according to the preset rule and the face detection result, taking the area outside the region of interest as the non-interest region; obtain the coding bit rates respectively corresponding to the two regions; and encode the two regions separately at their respective bit rates. In other words, the region the user is interested in is identified from the face detection result and the preset rule, and the identified region of interest and non-interest region are encoded differently, so that the video size is reduced while the video quality of the region the user is interested in is maintained.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the video coding device in the hardware operating environment involved in the embodiments of the present invention;
Fig. 2 is a flow diagram of an embodiment of the video coding method of the present invention;
Fig. 3 is a functional block diagram of an embodiment of the video coding system of the present invention.
The realization of the object, the functional features and the advantages of the present invention are further described below with reference to the accompanying drawings and in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
Referring to Fig. 1, Fig. 1 is a schematic diagram of the hardware structure of the video coding device provided by the present invention.
The video coding device may be a PC, or another device with a display function such as a smartphone, tablet computer, portable computer or desktop computer. Optionally, the video coding device may also be a server device, such as a remote video server that exchanges video data with user terminals.
The video coding device may include components such as a processor 101 and a memory 201. In the video coding device, the processor 101 is connected to the memory 201, a video coding program is stored in the memory 201, and the processor 101 can call the video coding program stored in the memory 201 to implement the steps of each embodiment of the video coding method described below.
The memory 201 can store software programs and various data. It may mainly include a program storage area and a data storage area: the program storage area can store the operating system and the application programs required for at least one function (such as the video coding program), and the data storage area may include a database and the like. The processor 101 is the control center of the video coding device. It connects the parts of the entire device through various interfaces and lines, and performs the device's functions and processes data by running or executing the software programs and/or modules stored in the memory 201 and calling the data stored in the memory 201, thereby monitoring the device as a whole.
Those skilled in the art will understand that the structure shown in Fig. 1 does not limit the video coding device, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
The embodiments of the method of the present invention are proposed on the basis of the above hardware structure.
The present invention provides a video coding method.
Referring to Fig. 2, Fig. 2 is a flow diagram of the first embodiment of the video coding method of the present invention.
In this embodiment, the video coding method includes the following steps:
Step S10: obtaining a video frame to be encoded;
The video coding device may obtain the video to be encoded from a preset video database on a local or remote server. The video to be encoded may be video captured in real time, such as video captured by a terminal camera or conference video captured in real time in a conference system, or it may be pre-stored video, such as film and television video. A video frame is the basic unit of a video and the basic object of video coding; therefore, in this embodiment, the video frame to be encoded is obtained as the coding object before the encoding operation is actually performed.
The video coding method of the present invention can be applied to many scenarios, such as video conferencing or film and television entertainment. In a video conferencing scenario, the video coding device captures live video of the location of each conference member through a terminal camera, then encodes each member's live video and transmits it to the terminals of the other members; here the live conference video is the video to be encoded. In a film and television entertainment scenario, when the video coding device receives a target video request sent by a user terminal, it determines the video to be encoded from the request, then encodes that video and transmits it to the user terminal.
A video consists of multiple video frames. A single encoding pass usually cannot encode all frames of a video, so several passes are needed to encode the whole video. Therefore, when encoding the video to be encoded, the preset number of frames needed for a single pass must be obtained repeatedly, and the corresponding video encoding operations, i.e. the steps of the embodiments of the present invention, performed on them.
Optionally, when or before the video frames of the video to be encoded are obtained for the first time, video encoding setting information is obtained and the coding rules are read from it. The coding rules may include a rule for determining the region of interest, a rule for determining the bit rates of the region of interest and the non-interest region, and so on. Once the coding rules are obtained, the video frames to be encoded can be encoded according to them. Optionally, updates to the coding rules can be monitored in real time; when a change is detected, the latest coding rules are obtained and the remaining unencoded frames of the video are encoded according to them.
Step S20: performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and taking the area of the video frame to be encoded outside the region of interest as the non-interest region;
Whether in conference video, film and television video or other video, the region where a face is located is where the user's attention concentrates. To balance quality and compression efficiency, the embodiment of the present invention encodes regions differently according to the face-related attributes of the video frame to be encoded (such as area and pixel or coordinate position). Because face-related attributes such as whether a face is present and where it is located vary from frame to frame, face detection must be performed on the video frame to be encoded to determine them, so that the allocation of coding bit rates in the subsequent encoding steps can be based on these attributes.
The preset rule here is the rule for determining the region of interest. The specific content to be detected during face detection depends on the preset rule, which can be obtained when or before the video frames to be encoded are first obtained, or before face detection is performed.
The preset rule may be to take the region where a face is located in the video frame to be encoded as the region of interest; or to take face regions whose area exceeds a preset value as the region of interest; or to take face regions whose area exceeds a preset value, together with their surrounding areas, as the region of interest. In addition to any of these, the preset rule may specify that when the video frame to be encoded contains no face, a preset area of the frame (such as the central area) is taken as the region of interest. These are only a few optional examples; other face-based rules for determining the region of interest are also possible. The video coding device may also hold several preset rules at once, with the user of the device free to switch between them.
The specific content to be detected in the face detection is determined by the preset rule, which in turn determines the corresponding face detection result. Different preset rules give different detection content and results, including but not limited to the following examples. When the preset rule takes the face region in the video frame to be encoded as the region of interest, detection only checks whether a face is present and where it is; the face detection result is either that a face is present, together with its position, or that no face is present. When the preset rule takes face regions whose area exceeds a preset value as the region of interest, detection checks whether a face is present and the area of each face; the result is either that faces with an area above the preset value are present, together with their positions, or that no such face is present. When the preset rule also specifies that a preset area (such as the central area) is the region of interest if the frame contains no face, the detection content additionally includes the position of that preset area.
From the above, once the preset rule and the face detection result are determined, the region of interest of the video frame to be encoded can be determined. The region of interest may be represented as a set of pixels, and the pixels of the video frame to be encoded outside the region of interest are taken as the non-interest region.
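The area-threshold and surrounding-area rules can be sketched as a filter over detected face boxes. This is an illustration only: the box format `(x, y, w, h)`, the parameter names and the use of a fixed pixel margin for the surrounding area are assumptions, and the face boxes are taken as given by some face detector.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), exclusive upper bounds


def roi_boxes(
    faces: List[Tuple[int, int, int, int]],  # detected faces as (x, y, w, h)
    frame_w: int,
    frame_h: int,
    min_area: int = 0,   # hypothetical preset area threshold
    margin: int = 0,     # hypothetical surrounding-area margin in pixels
) -> List[Box]:
    """Keep faces whose area exceeds min_area and pad each by margin pixels."""
    boxes = []
    for x, y, w, h in faces:
        if w * h <= min_area:
            continue  # face too small to count as a region of interest
        boxes.append((
            max(0, x - margin),
            max(0, y - margin),
            min(frame_w, x + w + margin),
            min(frame_h, y + h + margin),
        ))
    return boxes
```

With `min_area=0` and `margin=0` this reduces to the simplest rule, where every detected face region is a region of interest; every pixel outside the returned boxes belongs to the non-interest region.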
Step S30: obtaining the coding bit rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region separately at their respective bit rates.
The coding bit rates for the region of interest and the non-interest region are preset, with the bit rate for the region of interest higher than that for the non-interest region. Once the two regions are determined, their corresponding bit rates are obtained directly. The non-interest region may be encoded at a single uniform bit rate, or at different bit rates depending on image complexity or distance from the region of interest.
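The effect of the two preset bit rates on the size of a frame can be illustrated with simple arithmetic. The sketch assumes the rates are expressed as bits per macroblock and that the non-interest region uses one uniform rate; the text fixes neither, and the figures in the usage note are made up for illustration.

```python
def frame_bits(
    roi_macroblocks: int,
    non_roi_macroblocks: int,
    roi_bits_per_mb: int,
    non_roi_bits_per_mb: int,
) -> int:
    """Estimate the bits spent on one frame under region-based rates."""
    if roi_bits_per_mb <= non_roi_bits_per_mb:
        raise ValueError("the region of interest must get the higher rate")
    return (roi_macroblocks * roi_bits_per_mb
            + non_roi_macroblocks * non_roi_bits_per_mb)
```

For example, a frame with 100 region-of-interest macroblocks at 400 bits each and 300 other macroblocks at 100 bits each costs 70,000 bits under this model, against 160,000 bits if all 400 macroblocks were encoded at the higher rate.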
Optionally, for any video, after the above steps (identifying the region of interest and encoding the region of interest and the non-interest region at different bit rates) have been carried out, the bit rates for the different regions of each video frame can be stored. When the same video is encoded again later, the bit-rate allocation for each of its regions can be looked up and the video encoded directly according to that allocation.
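Storing and reusing the per-frame allocation can be sketched as an in-memory cache. The key (a video identifier plus frame index) and the allocation format (a list of region boxes with their rates) are assumptions made for illustration; a real deployment might persist this in a database instead.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]
Allocation = List[Tuple[Box, int]]  # (region box, bit rate) pairs


class RateAllocationCache:
    """Hypothetical store of per-frame bit-rate allocations, keyed by video."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, int], Allocation] = {}

    def put(self, video_id: str, frame_index: int, allocation: Allocation) -> None:
        # Copy the list so later changes by the caller do not alter the cache.
        self._store[(video_id, frame_index)] = list(allocation)

    def get(self, video_id: str, frame_index: int) -> Optional[Allocation]:
        # None means the frame has not been analysed yet.
        return self._store.get((video_id, frame_index))
```

On a cache hit the encoder could skip face detection for that frame and apply the stored rates directly; on a miss it would run the full region-of-interest steps and call `put`.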
This embodiment obtains a video frame to be encoded; performs face detection on it based on a preset rule to obtain a face detection result; determines the region of interest of the frame according to the preset rule and the face detection result, taking the area outside the region of interest as the non-interest region; obtains the coding bit rates respectively corresponding to the two regions; and encodes the two regions separately at their respective bit rates. In other words, the region the user is interested in is identified from the face detection result and the preset rule, and the identified region of interest and non-interest region are encoded differently, so that the video size is reduced while the video quality of the region the user is interested in is maintained.
Further, a second embodiment of the video coding method of the present invention is proposed on the basis of the above embodiment.
In the second embodiment of the video coding method of the present invention, the following steps precede step S10:
Step S01: obtaining the video to be encoded and its video information, and obtaining the video type of the video to be encoded from the video information;
The video coding device on which the video coding program corresponding to the video coding method of the present invention is deployed can be applied to many different video coding scenarios, typically film and television video and conference video. The video to be encoded may be video captured in real time, such as conference video transmitted in real time in a digital conference system, or video pre-stored in a database, such as film and television video on a video website's server.
The video information of the video to be encoded contains the video type and may also contain main-character information, which includes the facial features of the main characters. When the video type is film and television video, each work has one or more fixed main characters who appear on screen, including leads, supporting roles and other supporting actors. These are what the user is interested in, so the video information includes these main characters.
Step S02: when the video to be encoded is a film or television video, obtaining the facial features of the main characters from the video information;
The facial features of the main characters can be obtained directly from the video information. When the video information does not contain them, a preset number of video frames of the video to be encoded can be analyzed to determine the main characters, for example using appearance rate or screen time as the criterion: if a person's face appears in all of the preset number of frames, that person is taken as one of the main characters. Once the main characters are determined, their facial features are extracted from the video to be encoded and stored in the video information, so that they can be obtained directly from the video information when the video is encoded.
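Choosing main characters by appearance rate can be sketched as follows. The sketch assumes that an upstream face recognizer has already assigned an identity label to each face in each sampled frame; those labels, the `min_rate` parameter and the function name are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set


def main_characters(frames_faces: List[Iterable[str]], min_rate: float = 1.0) -> List[str]:
    """Return identities whose appearance rate over the sampled frames is >= min_rate.

    frames_faces holds, for each sampled frame, the identity labels of the
    faces found in it. min_rate=1.0 reproduces the example in the text, where
    a person must appear in every sampled frame.
    """
    total = len(frames_faces)
    if total == 0:
        return []
    counts: Counter = Counter()
    for faces in frames_faces:
        unique: Set[str] = set(faces)  # count each person once per frame
        counts.update(unique)
    return sorted(name for name, n in counts.items() if n / total >= min_rate)
```

Lowering `min_rate` admits characters who appear often but not in every sampled frame, which may suit works with several leads.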
In step S20, the step of determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result includes:
Step S21: determining, according to the face detection result, whether a face is present in the video frame to be encoded;
In this embodiment, the face detection result states whether a face is present in the video frame to be encoded, so this can be determined directly from the face detection result.
Step S22: if no face is present in the video frame to be encoded, obtaining a preset central area and taking the central area as the region of interest of the video frame to be encoded.
When the face detection result shows that no face is present in the video frame to be encoded, the viewer's gaze generally rests near the center of the picture, so the preset central area is taken as the region of interest of the video frame to be encoded.
The preset central area may be a fixed central area, i.e. the geometric center area of the video frame to be encoded. It may be a rectangular or circular (including elliptical) area at the center of the frame, and its position in the frame (pixel or coordinate position) can be calculated from the intended area of the central region and the area of the video frame to be encoded.
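Computing a rectangular central area from an intended share of the frame area can be sketched as below. The choice of a rectangle with the frame's aspect ratio and the `area_ratio` parameter are assumptions; the text also allows circular or elliptical areas.

```python
import math
from typing import Tuple


def center_region(frame_w: int, frame_h: int, area_ratio: float = 0.25) -> Tuple[int, int, int, int]:
    """Return (x0, y0, x1, y1) of a centered rectangle covering area_ratio of the frame.

    The rectangle keeps the frame's aspect ratio, so each side is scaled by
    sqrt(area_ratio). area_ratio is a hypothetical preset value.
    """
    if not 0 < area_ratio <= 1:
        raise ValueError("area_ratio must be in (0, 1]")
    scale = math.sqrt(area_ratio)
    w = round(frame_w * scale)
    h = round(frame_h * scale)
    x0 = (frame_w - w) // 2
    y0 = (frame_h - h) // 2
    return (x0, y0, x0 + w, y0 + h)
```

For a 1920x1080 frame and a central area covering one quarter of the picture, this gives the rectangle from (480, 270) to (1440, 810).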
After the step of determining, according to the face detection result, whether a face is present in the video frame to be encoded, the method includes:
Step S23: if a face is present in the video frame to be encoded, judging, according to the facial features of the main characters and the face detection result, whether the video frame to be encoded contains a target face matching the facial features of a main character;
If a face is present in the video frame to be encoded, the method goes on to judge whether the frame contains a target face matching the facial features of a main character. In this embodiment, the face detection additionally extracts the facial features of each detected face, so the face detection result also includes those features. Whether a target face is present is judged by comparing the detected facial features with the main characters' facial features and checking for a match.
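The comparison of detected features against main-character features can be sketched with cosine similarity over feature vectors. The text does not name a feature representation or a matching criterion, so the vector form, the similarity measure and the 0.8 threshold are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_target_faces(
    detected: List[Sequence[float]],
    main_features: List[Sequence[float]],
    threshold: float = 0.8,  # hypothetical match threshold
) -> List[int]:
    """Return indices of detected faces that match any main character."""
    return [
        i for i, feature in enumerate(detected)
        if any(cosine(feature, m) >= threshold for m in main_features)
    ]
```

An empty result means the frame contains faces but none of them belongs to a main character.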
Step S24, if in the video frame to be encoded exist with the matched target face of high priest's facial characteristics, Then using target face corresponding region as the interest region of the video frame to be encoded.
Target face corresponding region, it may include target face region can also include that target face corresponds to personage simultaneously Region.
Wherein, target face region, can directly where obtaining target face in Face datection result position (as Plain position or coordinate position) it is used as target face region.Target face corresponds to personage region, refers to the associated body of face Body portion pixel region can treat target face neighboring area in encoded video frame and carry out human body contour outline identification, will know others The region of body contoured is as the associated body part pixel region of face.
High priest refers to including the interested personage of the users such as leading role, supporting role and the actor playing a supporting role, with high priest's facial characteristics Matched target face, the i.e. face of high priest are detecting video frame to be encoded so that high priest is hero and heroine as an example Middle when there is the face of hero and heroine, the face of hero and heroine is the interest region of video frame to be encoded, if at this point, to be encoded The face of other non-hero and heroine is had also appeared in video frame, the face of other non-hero and heroine is non-interest region.
Optionally, after step S23 further include:
If in the video frame to be encoded there is no with the matched target face of high priest's facial characteristics, by institute State interest region of the face region as the video frame to be encoded in video frame to be encoded.
If the face region that directly will test is as video to be encoded without target face in video frame to be encoded The interest region of frame.
Above-mentioned example is connect, by taking high priest is hero and heroine as an example, if without target face in video frame to be encoded, i.e., wait compile Face without hero and heroine in code video frame, but have the face of other non-hero and heroine (such as passerby), then using the face of passerby as wait compile The interest region of code video frame.
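The selection logic of step S24 and its fallback can be sketched as follows. This is a minimal illustration rather than the implementation disclosed in the patent: it assumes that facial features are plain numeric vectors compared by cosine similarity, and the function names and the 0.8 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_interest_regions(detected_faces, main_features, threshold=0.8):
    """Pick the interest regions of one frame.

    detected_faces: list of (box, feature_vector) pairs from face detection.
    main_features:  feature vectors of the main characters.
    threshold:      hypothetical similarity cut-off for a match.
    """
    # Faces that match at least one main character are the target faces.
    targets = [
        box
        for box, feature in detected_faces
        if any(cosine_similarity(feature, m) >= threshold for m in main_features)
    ]
    if targets:
        return targets
    # Fallback: no target face, so every detected face becomes an interest region.
    return [box for box, _ in detected_faces]
```

An empty result means that the frame contains no face at all; that case is handled separately (the preset central region described in claim 2).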
In this embodiment, the video type of the video to be encoded is obtained from the video information; when the video to be encoded is a film and television video, the facial features of the main character are obtained from the video information; when a face is present in the video frame to be encoded, whether a target face matching the main character's facial features exists in the frame is judged according to the main character's facial features and the face detection result; and if such a target face exists, the region corresponding to the target face is used as the interest region of the video frame to be encoded. In film and television videos, the attention of viewers (users) is generally concentrated on the main characters. By identifying the main characters and using their regions as the interest region, the interest region can subsequently be encoded at a high bit rate and the area outside it at a lower bit rate. Encoding the areas the user attends to at a high bit rate provides good video quality, while encoding the areas the user does not attend to at a lower bit rate reduces the video transmission bit rate.
Further, in the third embodiment of the video encoding method of the present invention, step S30 includes:
Step S31: determine the macroblocks to which the interest region and the non-interest region respectively belong;
The video encoding operation in the video encoding method of the present invention takes the macroblock as its unit: macroblocks are encoded one by one and organized into a continuous video bitstream, where a macroblock consists of one luma pixel block and two additional chroma pixel blocks.
The interest region and the non-interest region each belong to one or more macroblocks. After the interest region and the non-interest region are determined, the one or more macroblocks to which each belongs are determined according to the pixel positions of the interest region and the non-interest region.
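Step S31 can be illustrated with a short sketch that maps pixel rectangles onto a macroblock grid. The 16x16 macroblock size, the (x, y, width, height) box format and the function names are assumptions made for this example; the text above does not fix any of them.

```python
MB_SIZE = 16  # assumed macroblock size in luma pixels


def macroblocks_for_region(x, y, w, h, mb_size=MB_SIZE):
    """Return the set of (row, col) macroblocks touched by a pixel rectangle."""
    first_col, last_col = x // mb_size, (x + w - 1) // mb_size
    first_row, last_row = y // mb_size, (y + h - 1) // mb_size
    return {
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    }


def split_macroblocks(frame_w, frame_h, roi_boxes, mb_size=MB_SIZE):
    """Split the frame's macroblocks into interest and non-interest sets."""
    rows = (frame_h + mb_size - 1) // mb_size  # round up for partial blocks
    cols = (frame_w + mb_size - 1) // mb_size
    all_mbs = {(row, col) for row in range(rows) for col in range(cols)}
    roi_mbs = set()
    for box in roi_boxes:
        roi_mbs |= macroblocks_for_region(*box, mb_size=mb_size)
    roi_mbs &= all_mbs  # ignore anything that falls outside the frame
    return roi_mbs, all_mbs - roi_mbs
```

A macroblock that is only partly covered by the interest region is counted as an interest macroblock here, which errs on the side of quality.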
Step S32: obtain the macroblock distance between each macroblock of the non-interest region and the interest region, and determine, based on the macroblock distance, the first bit rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first bit rate are negatively correlated;
The farther a region is from the center of the eye's focus, the more easily the eye overlooks it. Based on this characteristic of human vision, different bit rates can be used within the non-interest region. The macroblock distance between each macroblock of the non-interest region and the interest region is calculated; the smaller the macroblock distance, the higher the bit rate. In other words, the bit rate decreases as the distance from the interest region increases, so that the user can hardly perceive quality differences within a video frame, and the encoded video stream and the bandwidth requirement are reduced without the user noticing.
Here the macroblock distance can refer to the number of macroblocks separating a macroblock from the macroblocks on the boundary of the interest region. The macroblock distance is negatively correlated with the first bit rate: the macroblocks adjacent to the boundary macroblocks of the interest region have the smallest macroblock distance and therefore the largest first bit rate, and the macroblocks separated from the boundary macroblocks by the most macroblocks have the smallest first bit rate. The first bit rate does not refer to one particular value; it refers collectively to the encoding bit rates corresponding to the one or more macroblocks of the non-interest region.
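One possible way to compute such a distance is sketched below. It assumes macroblocks are identified by (row, col) grid coordinates and uses the Chebyshev distance to the nearest interest macroblock, so that a macroblock touching the interest region (including diagonally) has distance 1. Both the metric and the function name are assumptions of this example.

```python
def macroblock_distance(mb, roi_mbs):
    """Chebyshev distance, in macroblocks, from `mb` to the nearest interest macroblock.

    mb:      (row, col) of a non-interest macroblock.
    roi_mbs: non-empty set of (row, col) interest macroblocks.
    """
    row, col = mb
    return min(max(abs(row - r), abs(col - c)) for r, c in roi_mbs)
```

The nearest interest macroblock always lies on the boundary of the interest region, so scanning the whole set gives the same result as scanning only the boundary macroblocks.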
Optionally, the negative correlation between the macroblock distance and the first bit rate can be calculated by the following formula:
y = -kx + b, where k is a positive number, y is the first bit rate, and x is the macroblock distance.
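As a worked illustration of the linear relation, the sketch below evaluates y = -kx + b for a given macroblock distance. The lower bound `floor` is an addition of this example (the formula alone becomes negative for large x), and the numbers in the usage note are hypothetical values with no unit implied by the source.

```python
def first_bitrate(distance, k, b, floor=0.0):
    """First bit rate y = -k * x + b for macroblock distance x, never below `floor`."""
    if k <= 0:
        raise ValueError("k must be positive so that the bit rate falls with distance")
    return max(floor, -k * distance + b)
```

With k = 100 and b = 800, for example, a macroblock at distance 1 gets 700 and one at distance 3 gets 500.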
Optionally, the step in step S32 of determining, based on the macroblock distance, the first bit rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first bit rate are negatively correlated, includes:
Comparing the macroblock distance of each macroblock of the non-interest region with preset distance intervals to determine the distance interval in which each macroblock's distance falls; and obtaining the preset correspondence between distance intervals and bit rates, obtaining the target bit rate corresponding to the distance interval in which each macroblock's distance falls, and using that target bit rate as the first bit rate of the macroblock.
The correspondence between macroblock distance and first bit rate can be preset and stored. When the first bit rate of a macroblock needs to be determined, the macroblock distance of that macroblock is obtained directly, the correspondence between macroblock distance and first bit rate is retrieved, and the first bit rate for that distance is determined according to the correspondence.
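The interval lookup can be sketched with a sorted list of interval bounds. The bounds and bit rate values below are hypothetical placeholders; the source only states that such a correspondence is preset and stored.

```python
import bisect

# Hypothetical preset table: distances 1-2, 3-5, 6-10 and above 10.
INTERVAL_UPPER_BOUNDS = [2, 5, 10]
# One bit rate per interval, decreasing with distance (arbitrary example values).
INTERVAL_BITRATES = [800, 600, 400, 200]


def bitrate_for_distance(distance):
    """Return the first bit rate of the distance interval that contains `distance`."""
    # bisect_left finds the first interval whose upper bound is >= distance.
    index = bisect.bisect_left(INTERVAL_UPPER_BOUNDS, distance)
    return INTERVAL_BITRATES[index]
```

All distances inside one interval map to the same bit rate, so the table stays small no matter how large the frame is.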
Step S33: obtain the second bit rate corresponding to the interest region, and encode the non-interest region and the interest region according to the first bit rate and the second bit rate respectively.
In this embodiment, the encoding bit rate of the interest region, that is, the second bit rate, is pre-stored in a database; after the interest region is determined, the second bit rate can be obtained directly from the database. Each macroblock of the non-interest region is encoded at its corresponding first bit rate, and the interest region is encoded at the second bit rate.
In this embodiment, the correspondence between macroblock distance and first bit rate is a correspondence between distance intervals and bit rates: all macroblock distances falling within the same distance interval correspond to the same bit rate.
This embodiment can reduce the encoded video stream and the bandwidth requirement without the user noticing.
Optionally, in the fourth embodiment of the video encoding method of the present invention, the video encoding method further includes: receiving no-viewer prompt information sent by a user terminal, where the no-viewer prompt information is sent when the user terminal detects that no gaze rests on the screen of the user terminal; and reducing the encoding bit rate of the current video frame to be encoded.
The encoding bit rate of the video frame to be encoded can also be set according to the user terminal's detection of the user's state. Specifically, the camera of the user terminal can detect whether a gaze rests on the screen of the user terminal within a preset period. If no gaze is detected within the preset period, no-viewer prompt information is sent to the video encoding device, which reduces the encoding bit rate of the current video frame to be encoded after receiving it. When a gaze is again detected on the screen of the user terminal, viewer-present prompt information is sent to the video encoding device, which restores the encoding bit rate of the current video frame to be encoded to its normal level on receiving it.
Because different regions of the video frame to be encoded are encoded at different bit rates, when the encoding bit rate of the current video frame to be encoded is reduced, either the bit rates of all regions of the frame are reduced by the same amount, or the encoding bit rates of all regions of the frame are reduced to the same bit rate value.
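The two reduction strategies can be sketched as follows. The region labels, the dictionary representation and the function names are hypothetical; the sketch only shows how per-region bit rates might be lowered once no-viewer prompt information has been received.

```python
def reduce_uniformly(region_bitrates, delta, floor=0):
    """Lower every region's bit rate by the same amount, never below `floor`."""
    return {region: max(floor, rate - delta) for region, rate in region_bitrates.items()}


def reduce_to_common(region_bitrates, common_rate):
    """Set every region to one shared bit rate that is no higher than any current rate."""
    if region_bitrates and common_rate > min(region_bitrates.values()):
        raise ValueError("common_rate would raise the bit rate of at least one region")
    return {region: common_rate for region in region_bitrates}
```

Keeping the original dictionary unchanged makes it simple to restore the normal bit rates when viewer-present prompt information arrives.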
Optionally, the no-viewer prompt information can also be determined by the user terminal from the programs currently running: if it detects that the user is operating another program, for example temporarily leaving the current video interface to work on another page, or detects that the video window has been minimized, it can send no-viewer prompt information to the video encoding device.
In this embodiment, no-viewer prompt information sent by the user terminal is received, where the no-viewer prompt information is sent when the user terminal detects that no gaze rests on its screen, and the encoding bit rate of the current video frame to be encoded is reduced. The user terminal can thus detect whether the user is actually paying attention to the video, and the encoding bit rate of the current video frame to be encoded is adjusted according to that detection result, which reduces transmission bandwidth and saves transmission resources.
In addition, the present invention also provides a video encoding system corresponding to the steps of the above video encoding method.
Referring to Fig. 3, Fig. 3 is a functional block diagram of the first embodiment of the video encoding system of the present invention.
In this embodiment, the video encoding system of the present invention includes:
A video frame acquisition module 10, configured to obtain a video frame to be encoded;
An interest determination module 20, configured to perform face detection on the video frame to be encoded based on preset rules to obtain a face detection result, determine the interest region of the video frame to be encoded according to the preset rules and the face detection result, and use the area of the video frame to be encoded outside the interest region as the non-interest region;
An encoding execution module 30, configured to obtain the encoding bit rates corresponding to the interest region and the non-interest region respectively, and encode the interest region and the non-interest region respectively based on the corresponding encoding bit rates.
Further, the interest determination module 20 is also configured to determine, according to the face detection result, whether a face is present in the video frame to be encoded; and, if no face is present in the video frame to be encoded, obtain a preset central region and use the central region as the interest region of the video frame to be encoded.
Further, the video encoding system of the present invention also includes:
A video information acquisition module, configured to obtain a video to be encoded and the video information of the video to be encoded, and obtain the video type of the video to be encoded from the video information; and, when the video to be encoded is a film and television video, obtain the facial features of the main character from the video information;
The interest determination module 20 is also configured to, if a face is present in the video frame to be encoded, judge, according to the facial features of the main character and the face detection result, whether a target face matching the main character's facial features exists in the video frame to be encoded; and, if such a target face exists, use the region corresponding to the target face as the interest region of the video frame to be encoded.
Further, the interest determination module 20 is also configured to, if no target face matching the main character's facial features exists in the video frame to be encoded, use the region where the face in the video frame to be encoded is located as the interest region of the video frame to be encoded.
Further, the encoding execution module 30 is also configured to determine the macroblocks to which the interest region and the non-interest region respectively belong; obtain the macroblock distance between each macroblock of the non-interest region and the interest region, and determine, based on the macroblock distance, the first bit rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first bit rate are negatively correlated; and obtain the second bit rate corresponding to the interest region, and encode the non-interest region and the interest region according to the first bit rate and the second bit rate respectively.
Further, the encoding execution module 30 is also configured to compare the macroblock distance of each macroblock of the non-interest region with preset distance intervals to determine the distance interval in which each macroblock's distance falls; and obtain the preset correspondence between distance intervals and bit rates, obtain the target bit rate corresponding to the distance interval in which each macroblock's distance falls, and use that target bit rate as the first bit rate of the macroblock.
Further, the video encoding system of the present invention also includes:
A bit rate adjustment module, configured to receive no-viewer prompt information sent by a user terminal, where the no-viewer prompt information is sent when the user terminal detects that no gaze rests on the screen of the user terminal; and reduce the encoding bit rate of the current video frame to be encoded.
The present invention also proposes a computer-readable storage medium on which a computer program is stored. The computer-readable storage medium may be the memory 201 in the video encoding device of Fig. 1, or at least one of ROM (Read-Only Memory)/RAM (Random Access Memory), a magnetic disk, and an optical disc. The computer-readable storage medium includes instructions that cause a device having a processor (which may be a mobile phone, a computer, a server, a network device, or the video encoding device in the embodiments of the present invention) to execute the methods described in the embodiments of the present invention.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or server that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or server. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or server that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
The above are only preferred embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A video encoding method, characterized in that the video encoding method comprises the following steps:
obtaining a video frame to be encoded;
performing face detection on the video frame to be encoded based on preset rules to obtain a face detection result, determining the interest region of the video frame to be encoded according to the preset rules and the face detection result, and using the area of the video frame to be encoded outside the interest region as the non-interest region;
obtaining the encoding bit rates corresponding to the interest region and the non-interest region respectively, and encoding the interest region and the non-interest region respectively based on the corresponding encoding bit rates.
2. The video encoding method according to claim 1, characterized in that the step of determining the interest region of the video frame to be encoded according to the preset rules and the face detection result comprises:
determining, according to the face detection result, whether a face is present in the video frame to be encoded;
if no face is present in the video frame to be encoded, obtaining a preset central region and using the central region as the interest region of the video frame to be encoded.
3. The video encoding method according to claim 2, characterized in that, before the step of obtaining a video frame to be encoded, the method comprises:
obtaining a video to be encoded and the video information of the video to be encoded, and obtaining the video type of the video to be encoded from the video information;
when the video to be encoded is a film and television video, obtaining the facial features of the main character from the video information;
and, after the step of determining, according to the face detection result, whether a face is present in the video frame to be encoded, the method comprises:
if a face is present in the video frame to be encoded, judging, according to the facial features of the main character and the face detection result, whether a target face matching the main character's facial features exists in the video frame to be encoded;
if a target face matching the main character's facial features exists in the video frame to be encoded, using the region corresponding to the target face as the interest region of the video frame to be encoded.
4. The video encoding method according to claim 3, characterized in that, after the step of judging, according to the facial features of the main character and the face detection result, whether a target face matching the main character's facial features exists in the video frame to be encoded, the method comprises:
if no target face matching the main character's facial features exists in the video frame to be encoded, using the region where the face in the video frame to be encoded is located as the interest region of the video frame to be encoded.
5. The video encoding method according to claim 1, characterized in that the step of obtaining the encoding bit rates corresponding to the interest region and the non-interest region respectively, and encoding the interest region and the non-interest region respectively based on the corresponding encoding bit rates, comprises:
determining the macroblocks to which the interest region and the non-interest region respectively belong;
obtaining the macroblock distance between each macroblock of the non-interest region and the interest region, and determining, based on the macroblock distance, the first bit rate corresponding to each macroblock of the non-interest region, wherein the macroblock distance and the first bit rate are negatively correlated;
obtaining the second bit rate corresponding to the interest region, and encoding the non-interest region and the interest region according to the first bit rate and the second bit rate respectively.
6. The video encoding method according to claim 5, characterized in that the step of determining, based on the macroblock distance, the first bit rate corresponding to each macroblock of the non-interest region, wherein the macroblock distance and the first bit rate are negatively correlated, comprises:
comparing the macroblock distance of each macroblock of the non-interest region with preset distance intervals to determine the distance interval in which each macroblock's distance falls;
obtaining the preset correspondence between distance intervals and bit rates, obtaining the target bit rate corresponding to the distance interval in which each macroblock's distance falls, and using that target bit rate as the first bit rate of the macroblock.
7. The video encoding method according to claim 1, characterized in that the video encoding method further comprises:
receiving no-viewer prompt information sent by a user terminal, wherein the no-viewer prompt information is sent when the user terminal detects that no gaze rests on the screen of the user terminal;
reducing the encoding bit rate of the current video frame to be encoded.
8. A video encoding system, characterized in that the video encoding system comprises:
a video frame acquisition module, configured to obtain a video frame to be encoded;
an interest determination module, configured to perform face detection on the video frame to be encoded based on preset rules to obtain a face detection result, determine the interest region of the video frame to be encoded according to the preset rules and the face detection result, and use the area of the video frame to be encoded outside the interest region as the non-interest region;
an encoding execution module, configured to obtain the encoding bit rates corresponding to the interest region and the non-interest region respectively, and encode the interest region and the non-interest region respectively based on the corresponding encoding bit rates.
9. A video encoding device, characterized in that the video encoding device comprises a processor, a memory, and a video encoding program stored in the memory and executable by the processor, wherein the video encoding program, when executed by the processor, implements the steps of the video encoding method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a video encoding program is stored on the computer-readable storage medium, wherein the video encoding program, when executed by a processor, implements the steps of the video encoding method according to any one of claims 1 to 7.
CN201910297964.XA 2019-04-12 2019-04-12 Video encoding method, system, device, and computer-readable storage medium Active CN110049324B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910297964.XA CN110049324B (en) 2019-04-12 2019-04-12 Video encoding method, system, device, and computer-readable storage medium
PCT/CN2019/120899 WO2020207030A1 (en) 2019-04-12 2019-11-26 Video encoding method, system and device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910297964.XA CN110049324B (en) 2019-04-12 2019-04-12 Video encoding method, system, device, and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN110049324A true CN110049324A (en) 2019-07-23
CN110049324B CN110049324B (en) 2022-10-14

Family

ID=67276985

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910297964.XA Active CN110049324B (en) 2019-04-12 2019-04-12 Video encoding method, system, device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN110049324B (en)
WO (1) WO2020207030A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110620924A (en) * 2019-09-23 2019-12-27 广州虎牙科技有限公司 Method and device for processing coded data, computer equipment and storage medium
CN110769252A (en) * 2019-11-01 2020-02-07 西安交通大学 Method for improving coding quality by AI face detection
CN111050190A (en) * 2019-12-31 2020-04-21 广州酷狗计算机科技有限公司 Encoding method, device and equipment of live video stream and storage medium
WO2020207030A1 (en) * 2019-04-12 2020-10-15 深圳壹账通智能科技有限公司 Video encoding method, system and device, and computer-readable storage medium
CN111885332A (en) * 2020-07-31 2020-11-03 歌尔科技有限公司 Video storage method and device, camera and readable storage medium
CN112183227A (en) * 2020-09-08 2021-01-05 瑞芯微电子股份有限公司 Intelligent pan-face region coding method and equipment
CN112733650A (en) * 2020-12-29 2021-04-30 深圳云天励飞技术股份有限公司 Target face detection method and device, terminal equipment and storage medium
CN112995713A (en) * 2021-03-02 2021-06-18 广州酷狗计算机科技有限公司 Video processing method, video processing device, computer equipment and storage medium
CN113011210A (en) * 2019-12-19 2021-06-22 北京百度网讯科技有限公司 Video processing method and device
CN114286136A (en) * 2021-12-28 2022-04-05 咪咕文化科技有限公司 Video playing and encoding method, device, equipment and computer readable storage medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114531615B (en) * 2020-11-03 2023-10-27 腾讯科技(深圳)有限公司 Video data processing method, device, computer equipment and storage medium
CN113068034B (en) * 2021-03-25 2022-12-30 Oppo广东移动通信有限公司 Video encoding method and device, encoder, equipment and storage medium
CN114885167A (en) * 2022-04-29 2022-08-09 上海哔哩哔哩科技有限公司 Video coding method and device
CN116800976B (en) * 2023-07-17 2024-03-12 武汉星巡智能科技有限公司 Audio and video compression and restoration method, device and equipment for infant with sleep

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102547293A (en) * 2012-02-16 2012-07-04 西南交通大学 Method for coding session video by combining time domain dependence of face region and global rate distortion optimization
CN103974071A (en) * 2013-01-29 2014-08-06 富士通株式会社 Video coding method and equipment on basis of regions of interest
WO2016202285A1 (en) * 2015-06-19 2016-12-22 美国掌赢信息科技有限公司 Real-time video transmission method and electronic apparatus
CN106550240A (en) * 2016-12-09 2017-03-29 武汉斗鱼网络科技有限公司 A kind of bandwidth conservation method and system
CN106658011A (en) * 2016-12-09 2017-05-10 深圳市云宙多媒体技术有限公司 Panoramic video coding and decoding methods and devices

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104427337B (en) * 2013-08-21 2018-03-27 杭州海康威视数字技术股份有限公司 Interested area video coding method and its device based on target detection
TWI616102B (en) * 2016-06-24 2018-02-21 和碩聯合科技股份有限公司 Video image generation system and video image generating method thereof
CN110049324B (en) * 2019-04-12 2022-10-14 深圳壹账通智能科技有限公司 Video encoding method, system, device, and computer-readable storage medium

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020207030A1 (en) * 2019-04-12 2020-10-15 深圳壹账通智能科技有限公司 Video encoding method, system and device, and computer-readable storage medium
CN110620924B (en) * 2019-09-23 2022-05-20 广州虎牙科技有限公司 Method and device for processing coded data, computer equipment and storage medium
CN110620924A (en) * 2019-09-23 2019-12-27 广州虎牙科技有限公司 Method and device for processing coded data, computer equipment and storage medium
CN110769252A (en) * 2019-11-01 2020-02-07 西安交通大学 Method for improving coding quality by AI face detection
CN113011210B (en) * 2019-12-19 2022-09-16 北京百度网讯科技有限公司 Video processing method and device
CN113011210A (en) * 2019-12-19 2021-06-22 北京百度网讯科技有限公司 Video processing method and device
US11659181B2 (en) 2019-12-19 2023-05-23 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for determining region of interest
CN111050190B (en) * 2019-12-31 2022-02-18 广州酷狗计算机科技有限公司 Encoding method, device and equipment of live video stream and storage medium
CN111050190A (en) * 2019-12-31 2020-04-21 广州酷狗计算机科技有限公司 Encoding method, device and equipment of live video stream and storage medium
CN111885332A (en) * 2020-07-31 2020-11-03 歌尔科技有限公司 Video storage method and device, camera and readable storage medium
CN112183227A (en) * 2020-09-08 2021-01-05 瑞芯微电子股份有限公司 Intelligent pan-face region coding method and equipment
CN112183227B (en) * 2020-09-08 2023-12-22 瑞芯微电子股份有限公司 Intelligent face region coding method and device
CN112733650A (en) * 2020-12-29 2021-04-30 深圳云天励飞技术股份有限公司 Target face detection method and device, terminal equipment and storage medium
CN112733650B (en) * 2020-12-29 2024-05-07 深圳云天励飞技术股份有限公司 Target face detection method and device, terminal equipment and storage medium
CN112995713A (en) * 2021-03-02 2021-06-18 广州酷狗计算机科技有限公司 Video processing method, video processing device, computer equipment and storage medium
CN114286136A (en) * 2021-12-28 2022-04-05 咪咕文化科技有限公司 Video playing and encoding method, device, equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN110049324B (en) 2022-10-14
WO2020207030A1 (en) 2020-10-15

Similar Documents

Publication Publication Date Title
CN110049324A (en) Method for video coding, system, equipment and computer readable storage medium
Guan et al. Pano: Optimizing 360 video streaming with a better understanding of quality perception
CN102572217B (en) Visual-attention-based multimedia processing method and device
US5825917A (en) Region-based image processing method, image processing apparatus and image communication apparatus
KR100669837B1 (en) Extraction of foreground information for stereoscopic video coding
Shen et al. Just noticeable distortion profile inference: A patch-level structural visibility learning approach
TWI505695B (en) Video encoder and related management and coding methods, video decoder and related video decoding method
CN108933935A (en) Detection method, device, storage medium and the computer equipment of video communication system
CN102572502B (en) Selecting method of keyframe for video quality evaluation
CN112087625A (en) Image processing method, image processing apparatus, server, and storage medium
CN113747160B (en) Video coding configuration method, device, equipment and computer readable storage medium
CN115396705A (en) Screen projection operation verification method, platform and system
CN110740316A (en) Data coding method and device
Yuan et al. Object shape approximation and contour adaptive depth image coding for virtual view synthesis
CN110545430A (en) video transmission method and device
CN110460855B (en) Image processing method and system
KR20210145512A (en) Attendance check and concentration analysis system using online education-based facial recognition, and method thereof
CN114827617B (en) Video coding and decoding method and system based on perception model
CN116980604A (en) Video encoding method, video decoding method and related equipment
CN111031325A (en) Data processing method and system
Tran et al. Deeplight: Robust & unobtrusive real-time screen-camera communication for real-world displays
CN116827921A (en) Audio and video processing method, device and equipment for streaming media
KR20030062043A (en) Face detection and tracking of video communication system
Jang et al. Mobile video communication based on augmented reality
CN112511834A (en) Encoding method, apparatus and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant