WO2020207030A1 - Video encoding method, system, device, and computer-readable storage medium - Google Patents

Video encoding method, system, device, and computer-readable storage medium

Info

Publication number
WO2020207030A1
WO2020207030A1 (PCT/CN2019/120899)
Authority
WO
WIPO (PCT)
Prior art keywords
encoded
video
interest
area
video frame
Prior art date
Application number
PCT/CN2019/120899
Other languages
English (en)
French (fr)
Inventor
齐燕
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2020207030A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167 Position within a video image, e.g. region of interest [ROI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Definitions

  • This application relates to the field of video coding technology, and in particular to a video coding method, system, device and computer-readable storage medium.
  • The main purpose of this application is to provide a video encoding method, system, device, and computer-readable storage medium, aiming to solve the technical problem that existing video encoding methods cannot balance user experience and video transmission bit rate.
  • To achieve the above objective, the present application provides a video encoding method. The video encoding method includes the following steps: obtaining a video frame to be encoded; performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and using the area other than the region of interest in the video frame to be encoded as the non-interest region; and obtaining the coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the two regions based on their respective coding rates.
  • Optionally, the step of determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result includes: determining, according to the face detection result, whether there is a human face in the video frame to be encoded; and, if there is no human face in the video frame to be encoded, acquiring a preset central area and using the central area as the region of interest of the video frame to be encoded.
  • Optionally, before the step of obtaining the video frame to be encoded, the method includes: obtaining the video to be encoded and the video information of the video to be encoded, and obtaining the video type of the video to be encoded from the video information; and, when the video to be encoded is a film or television video, obtaining the facial features of the main characters from the video information.
  • Optionally, after the step of determining whether there is a human face in the video frame to be encoded according to the face detection result, the method includes: if there is a human face in the video frame to be encoded, judging, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters; and, if there is such a target face, using the region corresponding to the target face as the region of interest of the video frame to be encoded.
  • Optionally, after the step of judging, according to the facial features of the main characters and the face detection result, whether there is a target face matching the facial features of the main characters in the video frame to be encoded, the method includes: if there is no such target face, using the area where the faces in the video frame to be encoded are located as the region of interest of the video frame to be encoded.
  • Optionally, the step of obtaining the coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region based on the respective coding rates, includes: determining the macroblocks to which the region of interest and the non-interest region respectively belong; obtaining the macroblock distance between each macroblock of the non-interest region and the region of interest, and determining, based on the macroblock distance, the first code rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first code rate are negatively correlated; and acquiring the second code rate corresponding to the region of interest, and encoding the non-interest region and the region of interest respectively according to the first code rate and the second code rate.
  • Optionally, the step of determining, based on the macroblock distance, the first code rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first code rate are negatively correlated, includes: comparing the macroblock distance of each macroblock of the non-interest region with preset distance intervals to determine the interval in which each macroblock distance falls; and obtaining the preset correspondence between distance intervals and code rates, obtaining the target code rate corresponding to the interval in which each macroblock distance falls, and using the target code rate as the first code rate of the corresponding macroblock.
  • Optionally, the video encoding method further includes: receiving no-viewer prompt information sent by a user terminal, where the no-viewer prompt information is sent when the user terminal detects that no line of sight is on the user terminal screen; and reducing the coding rate of the current video frame to be encoded.
  • In addition, this application also provides a video encoding system, the video encoding system including:
  • a video frame acquisition module for acquiring the video frame to be encoded;
  • an interest determination module configured to perform face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determine the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and use the area other than the region of interest in the video frame to be encoded as the non-interest region;
  • an encoding execution module configured to obtain the coding rates respectively corresponding to the region of interest and the non-interest region, and encode the region of interest and the non-interest region based on the respective coding rates.
  • In addition, the present application also provides a video encoding device, the video encoding device including a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, where the computer-readable instructions, when executed by the processor, implement the steps of the video encoding method described above.
  • In addition, the present application also provides a computer-readable storage medium having computer-readable instructions stored on it, where the computer-readable instructions, when executed by a processor, implement the steps of the video encoding method described above.
  • The embodiments of this application obtain the video frame to be encoded; perform face detection on it based on a preset rule to obtain a face detection result; determine the region of interest of the video frame to be encoded according to the preset rule and the face detection result, using the area outside the region of interest as the non-interest region; obtain the coding rates respectively corresponding to the region of interest and the non-interest region; and encode the two regions based on their respective coding rates. That is, the user's region of interest is identified in the video frame to be encoded based on the face detection result and the preset rule, and the identified region of interest and non-interest region are encoded differentially, thereby reducing the video size while ensuring the video quality of the user's region of interest.
  • FIG. 1 is a schematic structural diagram of a video encoding device in a hardware operating environment involved in a solution of an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of an embodiment of a video encoding method according to this application.
  • FIG. 3 is a schematic diagram of functional modules of an embodiment of a video encoding system according to this application.
  • FIG. 1 is a schematic diagram of the hardware structure of the video encoding device provided by this application.
  • The video encoding device can be a PC, or a device with a display function such as a smartphone, tablet computer, portable computer, or desktop computer.
  • Optionally, the video encoding device can also be a server device, such as a remote video server that exchanges video data with a user terminal.
  • The video encoding device may include components such as a processor 101 and a memory 201.
  • The processor 101 is connected to the memory 201, and computer-readable instructions are stored on the memory 201.
  • The processor 101 can call the computer-readable instructions stored in the memory 201 and implement the steps of the following embodiments of the video encoding method.
  • The memory 201 can be used to store software programs and various data.
  • The memory 201 can mainly include a storage program area and a storage data area.
  • The storage program area can store the operating system and the application programs required for at least one function (for example, computer-readable instructions for video encoding), etc.; the storage data area may include a database, etc.
  • The processor 101 is the control center of the video encoding device. It connects the various parts of the entire device through various interfaces and lines, and performs the various functions of the device and processes data by running or executing the software programs and/or modules stored in the memory 201 and calling the data stored in the memory 201, thereby monitoring the device as a whole.
  • Those skilled in the art can understand that the structure shown in FIG. 1 does not constitute a limitation on the video encoding device, which may include more or fewer components than shown, or combine certain components, or use a different component arrangement.
  • This application provides a video encoding method.
  • FIG. 2 is a schematic flowchart of a first embodiment of a video encoding method according to this application.
  • In this embodiment, the video encoding method includes the following steps:
  • Step S10: Obtain a video frame to be encoded.
  • The video encoding device can obtain the video to be encoded from a video database preset locally or on a remote server, where the video to be encoded can be a video collected in real time, for example through a terminal camera, such as a conference video collected in real time in a conference system; it can also be a pre-stored video, such as a film or television video.
  • A video frame is the basic unit of a video and the basic object of video encoding. Therefore, in this embodiment, before actually performing the encoding operation, a video frame to be encoded is acquired as the encoding object.
  • The video encoding method of this application can be applied to multiple scenarios, such as video conferencing or film and television entertainment.
  • In a video conferencing scenario, the video encoding device collects the on-site video of each conference member through the terminal camera, and encodes and transmits each member's on-site video to the other members' terminals; here, the on-site conference video is the video to be encoded.
  • In a film and television entertainment scenario, when the video encoding device receives a target video acquisition request sent by a user terminal, it determines the video to be encoded according to the request, then encodes it and transmits it to the user terminal.
  • A video is composed of multiple video frames, and a single encoding pass usually cannot encode all the video frames of a video; multiple passes are needed to encode a video completely. Therefore, when encoding the video to be encoded, the preset number of video frames required for a single pass needs to be obtained multiple times, and the corresponding video encoding operation, that is, the steps in the embodiments of the present application, is performed each time.
  • Optionally, when or before the video frames of the video to be encoded are obtained for the first time, the video encoding setting information is obtained, and the encoding rules are obtained from it.
  • The encoding rules may include the rule for determining the region of interest and the rules for determining the code rates of the region of interest and the non-interest region, etc.
  • After the encoding rules are obtained, the video frames to be encoded can be encoded according to the encoding rules.
  • Optionally, the encoding rules can be monitored for updates in real time; when a change in the encoding rules is detected, the latest encoding rules are obtained, and the remaining unencoded video frames of the video to be encoded are encoded according to the latest rules.
  • Step S20: Perform face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determine the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and use the area other than the region of interest in the video frame to be encoded as the non-interest region.
  • Whether in a conference video, a film or television video, or any other video, the area where faces are located is the area where the user's attention is concentrated.
  • To balance quality and compression efficiency, the embodiments of this application encode the video frame to be encoded differentially according to its face-related attributes (such as area attributes and (pixel/coordinate) position attributes). Since such attributes are uncertain across different video frames, face detection is performed on the video frame to be encoded to determine them, so that the allocation of coding rates can be based on them in the subsequent encoding steps.
  • The preset rule here is the rule for determining the region of interest.
  • When performing face detection, the specific detection content needs to be determined according to the preset rule.
  • The preset rule can be obtained when or before the first video frame to be encoded is obtained, or before face detection is performed.
  • The preset rule can be: use the area where faces are located in the video frame to be encoded as the region of interest; or: use the area of faces whose area exceeds a preset value as the region of interest; or: use the area of faces whose area exceeds a preset value, together with the surrounding area, as the region of interest. On the basis of the above, the preset rule may also include: when there is no face in the video frame to be encoded, use a preset area of the frame (such as the central area) as the region of interest.
  • The foregoing preset rules are only a few optional examples of rules for determining the region of interest; other face-based rules may also be used.
  • In addition, multiple preset rules can be configured in the video encoding device at the same time, and the user of the device can switch between them autonomously.
  • The specific content of the aforementioned face detection is determined according to the preset rule, and the face detection result corresponding to that content is then determined.
  • Different preset rules lead to different detection content and face detection results, including but not limited to the following examples. When the preset rule uses the area where faces are located as the region of interest, the detection content is only whether a face exists and the position of any detected face.
  • The corresponding face detection result is either that the video frame to be encoded contains a face together with its position, or that it contains no face.
  • When the preset rule uses the area of faces whose area exceeds a preset value as the region of interest, the detection content is whether a face exists and the face area, and the corresponding result is either that the frame contains faces, their positions, and the faces whose area exceeds the preset value, or that it contains no face.
  • When the preset rule additionally specifies that, if there is no face in the video frame to be encoded, a preset area of the frame (such as the central area) is used as the region of interest, the detection content also includes the position of the preset area.
  • Based on the above, once the preset rule and the face detection result are determined, the region of interest of the video frame to be encoded can be determined.
  • The region of interest can be expressed in the form of pixels, and the pixels of the video frame to be encoded other than those of the region of interest are regarded as the non-interest region.
  • Step S30: Obtain the coding rates respectively corresponding to the region of interest and the non-interest region, and encode the region of interest and the non-interest region based on the respective coding rates.
  • The coding rates corresponding to the region of interest and the non-interest region are preset, with the rate of the region of interest higher than that of the non-interest region. After the two regions are determined, the corresponding coding rates are obtained directly.
  • The non-interest region can be encoded at a uniform rate, or at different rates according to image complexity or distance from the region of interest.
  • Optionally, the rates corresponding to the different regions of each video frame can be stored; when the same video is encoded again later, the stored rate distribution of each region can be queried directly and the video encoded according to it.
  • This embodiment obtains the video frame to be encoded, performs face detection on it based on a preset rule to obtain a face detection result, determines the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and uses the area outside the region of interest as the non-interest region; it then obtains the coding rates respectively corresponding to the two regions and encodes them accordingly. That is, the user's region of interest is identified in the video frame to be encoded based on the face detection result and the preset rule, and the identified region of interest and non-interest region are encoded differentially, reducing the video size while ensuring the video quality of the user's region of interest.
  • Optionally, before the step of obtaining the video frame to be encoded, the method includes:
  • Step S01: Obtain the video to be encoded and the video information of the video to be encoded, and obtain the video type of the video to be encoded from the video information.
  • A video encoding device configured with the computer-readable instructions corresponding to the video encoding method of this application can be applied to a variety of video encoding scenarios, typically film and television videos and conference videos.
  • The video to be encoded may be a video collected in real time, such as a conference video transmitted in real time in a digital conference system, or a video pre-stored in a database, such as a film or television video on a video website server.
  • The video information of the video to be encoded includes the video type, and may also include main character information, which includes the facial features of the main characters.
  • Step S02: When the video to be encoded is a film or television video, obtain the facial features of the main characters from the video information.
  • The facial features of the main characters can be obtained directly from the video information.
  • When the video information does not contain them, the main characters can be determined by analyzing a preset number of video frames of the video to be encoded (for example, using appearance rate or appearance time as the basis for judging the main characters): if a person's face appears in the preset number of video frames, that person is taken as one of the main characters. After the main characters are determined, their facial features are extracted from the video to be encoded and stored in the video information; when video encoding is performed, the facial features of the main characters are obtained directly from the video information.
  • In step S20, the step of determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result includes:
  • Step S21: Determine, according to the face detection result, whether there is a face in the video frame to be encoded.
  • The face detection result states whether the video frame to be encoded contains a face, so this can be determined directly from the result.
  • Step S22: If there is no human face in the video frame to be encoded, obtain a preset central area, and use the central area as the region of interest of the video frame to be encoded.
  • When the face detection result shows that there is no human face in the video frame to be encoded, the user's gaze generally rests at the center of the video, so the preset central area is taken as the region of interest of the video frame to be encoded.
  • The preset central area can be a fixed central area.
  • The central area refers to the central area of the video frame to be encoded in the geometric sense. It can be a rectangular area or a circular (including elliptical) area at the center of the frame.
  • The position (pixel position/coordinate position) of the central area on the video frame to be encoded can be computed from the expected area of the central region and the area of the frame.
  • After the step of determining whether there is a human face in the video frame to be encoded according to the face detection result, the method includes:
  • Step S23: If there is a human face in the video frame to be encoded, judge, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters.
  • If the video frame to be encoded contains a face, the method continues to judge whether it contains a target face matching the facial features of the main characters. In this embodiment, the detection content of face detection further includes: when a face is detected, continuing the detection to obtain its facial features, so the corresponding face detection result also includes the detected facial features. The detected facial features can be compared and matched with those of the main characters to judge whether a matching target face exists.
  • Step S24: If there is a target face matching the facial features of the main characters in the video frame to be encoded, use the region corresponding to the target face as the region of interest of the video frame to be encoded.
  • The region corresponding to the target face may include the area where the target face is located, and may also include the area where the person with the target face is located.
  • The area where the target face is located can be taken directly from the face detection result as its position (pixel position or coordinate position).
  • The area where the person with the target face is located refers to the pixel area of the body associated with the face.
  • Body contour recognition can be performed on the area surrounding the target face in the video frame to be encoded, and the area bounded by the recognized body contour is taken as the body pixel area associated with the face.
  • The main characters are the characters that users are interested in, including leads, supporting characters, and bit players.
  • A target face matching the facial features of the main characters is the face of a main character. Taking the leading actor and actress as the main characters for example, when their faces are detected in the video frame to be encoded, those faces form the region of interest of the frame.
  • Optionally, after step S23 the method further includes:
  • If there is no target face matching the facial features of the main characters in the video frame to be encoded, using the area where the faces in the video frame to be encoded are located as the region of interest of the video frame to be encoded.
  • If there is no target face in the video frame to be encoded, the area where the detected faces are located is used directly as the region of interest of the frame.
  • Continuing the example with the leading actor and actress as the main characters: if the video frame to be encoded contains no target face, that is, no face of the leads, but does contain faces of other, non-leading characters (such as passers-by), then the passers-by's faces are used as the region of interest of the video frame to be encoded.
  • This embodiment obtains the video type of the video to be encoded from the video information.
  • When the video to be encoded is a film or television video, the facial features of the main characters are obtained from the video information.
  • When the video frame to be encoded contains a face, the method judges, according to the facial features of the main characters and the face detection result, whether the frame contains a matching target face; if so, the region corresponding to the target face is used as the region of interest of the frame.
  • In film and television videos, the audience (users) generally focus on the main characters.
  • By identifying the main characters and using the regions corresponding to them as the region of interest, the region of interest is subsequently encoded at a higher bit rate and the area outside it at a lower bit rate. That is, the places the user watches are encoded at a higher rate, giving the user a good video effect, while the places outside the user's attention are encoded at a lower rate, reducing the video transmission bit rate.
  • Further, in a third embodiment of the video encoding method of this application, step S30 includes:
  • Step S31: Determine the macroblocks to which the region of interest and the non-interest region respectively belong.
  • The video encoding operation in the video encoding method of this application takes macroblocks as its unit: macroblocks are encoded one by one and organized into a continuous video stream.
  • A macroblock is composed of one luminance pixel block and two additional chrominance pixel blocks.
  • Both the region of interest and the non-interest region belong to one or more macroblocks. After the two regions are determined, the macroblocks to which each belongs can be determined from the regions' pixel positions.
  • Step S32: Obtain the macroblock distance between each macroblock of the non-interest region and the region of interest, and determine, based on the macroblock distance, the first code rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first code rate are negatively correlated.
  • The farther an area is from the focal center of the human eye, the more easily the eye overlooks it; based on this characteristic of human vision, the non-interest region can be encoded at different rates. The macroblock distance between each macroblock of the non-interest region and the region of interest is computed; the smaller the distance, the higher the rate, so the rate falls as the distance from the region of interest grows, making it hard for users to perceive the quality difference within a video frame and reducing the coded video stream and the bandwidth requirement while the user remains unaware.
  • The macroblock distance here can refer to the number of macroblocks separating a macroblock from the macroblocks on the boundary of the region of interest.
  • The negative correlation between the macroblock distance and the first code rate means that a macroblock adjacent to the boundary macroblocks of the region of interest has the smallest distance and thus the largest first code rate, while the macroblock separated from the boundary by the most macroblocks has the smallest first code rate.
  • The first code rate here does not refer to one specific value; it denotes the coding rates of all the macroblocks of the non-interest region.
  • Optionally, the negative correlation between the macroblock distance and the first code rate can be computed by the following formula: y = -kx + b, where k is a positive number, y is the first code rate, and x is the macroblock distance.
  • Optionally, in step S32, the step of determining, based on the macroblock distance, the first code rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first code rate are negatively correlated, includes:
  • comparing the macroblock distance corresponding to each macroblock of the non-interest region with preset distance intervals to determine the distance interval in which each macroblock distance falls; and obtaining the preset correspondence between distance intervals and code rates, obtaining the target code rate corresponding to the interval in which each macroblock distance falls, and using the target code rate as the first code rate of the corresponding macroblock.
  • The correspondence between the macroblock distance and the first code rate can be preset and stored; when the first code rate of a macroblock is needed, the macroblock's distance is obtained directly, the stored correspondence between macroblock distance and first code rate is retrieved, and the first code rate corresponding to that distance is determined from the correspondence.
  • Step S33: Obtain the second code rate corresponding to the region of interest, and encode the non-interest region and the region of interest respectively according to the first code rate and the second code rate.
  • The coding rate of the region of interest is pre-stored in a database; after the region of interest is determined, its second code rate can be obtained directly from the database.
  • Each macroblock of the non-interest region is encoded at the first code rate corresponding to it, and the region of interest is encoded at the second code rate.
  • The correspondence between the macroblock distance and the first code rate is the correspondence between distance intervals and code rates: macroblock distances falling in the same interval correspond to the same rate.
  • This embodiment reduces the coded video stream and the bandwidth requirement while the user remains unaware.
  • Optionally, the video encoding method further includes: receiving no-viewer prompt information sent by a user terminal, where the no-viewer prompt information is sent when the user terminal detects that no line of sight is on the user terminal screen; and reducing the coding rate of the current video frame to be encoded.
  • The rate setting for the video frame to be encoded can also be determined according to the user terminal's detection of the user's state.
  • The user terminal camera can detect whether any line of sight rests on the terminal screen during a preset period. If no line of sight is detected during that period, a no-viewer prompt is sent to the video encoding device, which lowers the coding rate of the current video frame to be encoded on receipt; when a line of sight is detected on the screen again, a viewer-present prompt is sent to the video encoding device, which restores the coding rate of the current video frame to its normal level on receipt.
  • Optionally, the no-viewer prompt can also be based on the program the user terminal is currently running: if the user is detected operating another program, for example temporarily leaving the current video interface for another page, or the video window is detected to be minimized, a no-viewer prompt can be sent to the video encoding device.
  • By receiving the no-viewer prompt sent when the user terminal detects that no line of sight is on its screen and lowering the coding rate of the current video frame to be encoded, the user terminal can detect whether the user is actually paying attention to the video, and the coding rate of the current frame is adjusted according to the detection result, which reduces transmission bandwidth and saves transmission resources.
  • In addition, this application also provides a video encoding system corresponding to the steps of the above video encoding method.
  • FIG. 3 is a schematic diagram of the functional modules of a first embodiment of the video encoding system of this application.
  • In this embodiment, the video encoding system of this application includes:
  • a video frame acquisition module 10 for acquiring the video frame to be encoded;
  • an interest determination module 20 configured to perform face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determine the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and use the area other than the region of interest in the video frame to be encoded as the non-interest region;
  • an encoding execution module 30 configured to obtain the coding rates respectively corresponding to the region of interest and the non-interest region, and encode the region of interest and the non-interest region based on the respective coding rates.
  • Further, the interest determination module 20 is also configured to determine, according to the face detection result, whether there is a face in the video frame to be encoded; and, if there is no face, to obtain a preset central area and use it as the region of interest of the video frame to be encoded.
  • Further, the video encoding system of this application also includes:
  • a video information acquisition module for acquiring the video to be encoded and the video information of the video to be encoded, obtaining the video type of the video to be encoded from the video information, and, when the video to be encoded is a film or television video, obtaining the facial features of the main characters from the video information.
  • Further, the interest determination module 20 is also configured to, if there is a face in the video frame to be encoded, judge, according to the facial features of the main characters and the face detection result, whether there is a target face in the frame matching those features; and, if so, to use the region corresponding to the target face as the region of interest of the frame.
  • Further, the interest determination module 20 is also configured to, if there is no target face matching the facial features of the main characters in the video frame to be encoded, use the area where the faces in the frame are located as the region of interest of the video frame to be encoded.
  • Further, the encoding execution module 30 is also configured to determine the macroblocks to which the region of interest and the non-interest region respectively belong; obtain the macroblock distance between each macroblock of the non-interest region and the region of interest, and determine, based on the macroblock distance, the first code rate corresponding to each such macroblock, where the macroblock distance and the first code rate are negatively correlated; and obtain the second code rate corresponding to the region of interest, and encode the non-interest region and the region of interest respectively according to the first and second code rates.
  • Further, the encoding execution module 30 is also configured to compare the macroblock distance corresponding to each macroblock of the non-interest region with preset distance intervals, determine the interval in which each macroblock distance falls, obtain the preset correspondence between distance intervals and code rates, obtain the target code rate corresponding to the interval in which each distance falls, and use the target code rate as the first code rate of the corresponding macroblock.
  • Further, the video encoding system of this application also includes:
  • a rate adjustment module for receiving the no-viewer prompt information sent by the user terminal, where the prompt is sent when the terminal detects that no line of sight is on its screen, and for lowering the coding rate of the current video frame to be encoded.
  • The computer-readable storage medium may be a non-volatile readable storage medium on which a computer program is stored.
  • The computer-readable storage medium may be the memory 201 in the video encoding device of FIG. 1, or at least one of a ROM (Read-Only Memory)/RAM (Random Access Memory), a magnetic disk, and an optical disc.
  • The computer-readable storage medium includes several instructions to enable a device with a processor (which can be a mobile phone, a computer, a server, a network device, or the video encoding device in the embodiments of the present application, etc.) to execute the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This application provides a video encoding method, system, device, and computer-readable storage medium based on face detection technology. The method includes: obtaining a video frame to be encoded; performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and using the area other than the region of interest in the video frame to be encoded as the non-interest region; and obtaining the coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region based on their respective coding rates.

Description

Video encoding method, system, device, and computer-readable storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on April 12, 2019, with application number 201910297964.X and invention title "Video encoding method, system, device, and computer-readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of video coding technology, and in particular to a video encoding method, system, device, and computer-readable storage medium.
Background
The development of video services is inevitably constrained by limited bandwidth resources, while compressing video at a low bit rate often degrades video quality and thus the user experience, and the degraded user experience in turn limits the development of video services. A video encoding method that balances user experience and video transmission bit rate is therefore urgently needed.
Summary
The main purpose of this application is to provide a video encoding method, system, device, and computer-readable storage medium, aiming to solve the technical problem that existing video encoding methods cannot balance user experience and video transmission bit rate.
To achieve the above objective, this application provides a video encoding method, which includes the following steps:
obtaining a video frame to be encoded;
performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and using the area other than the region of interest in the video frame to be encoded as the non-interest region;
obtaining the coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region based on their respective coding rates.
Optionally, the step of determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result includes:
determining, according to the face detection result, whether there is a human face in the video frame to be encoded;
if there is no human face in the video frame to be encoded, acquiring a preset central area and using the central area as the region of interest of the video frame to be encoded.
Optionally, before the step of obtaining the video frame to be encoded, the method includes:
obtaining the video to be encoded and the video information of the video to be encoded, and obtaining the video type of the video to be encoded from the video information;
when the video to be encoded is a film or television video, obtaining the facial features of the main characters from the video information;
and after the step of determining whether there is a human face in the video frame to be encoded according to the face detection result, the method includes:
if there is a human face in the video frame to be encoded, judging, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters;
if there is a target face matching the facial features of the main characters in the video frame to be encoded, using the region corresponding to the target face as the region of interest of the video frame to be encoded.
Optionally, after the step of judging, according to the facial features of the main characters and the face detection result, whether there is a target face matching the facial features of the main characters in the video frame to be encoded, the method includes:
if there is no target face matching the facial features of the main characters in the video frame to be encoded, using the area where the faces in the video frame to be encoded are located as the region of interest of the video frame to be encoded.
Optionally, the step of obtaining the coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region based on their respective coding rates, includes:
determining the macroblocks to which the region of interest and the non-interest region respectively belong;
obtaining the macroblock distance between each macroblock of the non-interest region and the region of interest, and determining, based on the macroblock distance, the first code rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first code rate are negatively correlated;
acquiring the second code rate corresponding to the region of interest, and encoding the non-interest region and the region of interest respectively according to the first code rate and the second code rate.
Optionally, the step of determining, based on the macroblock distance, the first code rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first code rate are negatively correlated, includes:
comparing the macroblock distance of each macroblock of the non-interest region with preset distance intervals to determine the distance interval in which each macroblock distance falls;
obtaining the preset correspondence between distance intervals and code rates, obtaining the target code rate corresponding to the interval in which each macroblock distance falls, and using the target code rate as the first code rate of the corresponding macroblock.
Optionally, the video encoding method further includes:
receiving no-viewer prompt information sent by a user terminal, where the no-viewer prompt information is sent when the user terminal detects that no line of sight is on the user terminal screen;
reducing the coding rate of the current video frame to be encoded.
In addition, to achieve the above objective, this application also provides a video encoding system, which includes:
a video frame acquisition module for acquiring the video frame to be encoded;
an interest determination module configured to perform face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determine the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and use the area other than the region of interest in the video frame to be encoded as the non-interest region;
an encoding execution module configured to obtain the coding rates respectively corresponding to the region of interest and the non-interest region, and encode the region of interest and the non-interest region based on their respective coding rates.
In addition, to achieve the above objective, this application also provides a video encoding device, the video encoding device including a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, where the computer-readable instructions, when executed by the processor, implement the steps of the video encoding method described above.
In addition, to achieve the above objective, this application also provides a computer-readable storage medium having computer-readable instructions stored on it, where the computer-readable instructions, when executed by a processor, implement the steps of the video encoding method described above.
The embodiments of this application obtain the video frame to be encoded; perform face detection on it based on a preset rule to obtain a face detection result; determine the region of interest of the video frame to be encoded according to the preset rule and the face detection result, using the area outside the region of interest as the non-interest region; obtain the coding rates respectively corresponding to the region of interest and the non-interest region; and encode the two regions based on their respective coding rates. That is, the user's region of interest is identified in the video frame to be encoded based on the face detection result and the preset rule, and the identified region of interest and non-interest region are encoded differentially, thereby reducing the video size while ensuring the video quality of the user's region of interest.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the video encoding device in the hardware operating environment involved in the solutions of the embodiments of this application;
FIG. 2 is a schematic flowchart of an embodiment of the video encoding method of this application;
FIG. 3 is a schematic diagram of the functional modules of an embodiment of the video encoding system of this application.
The realization of the objectives, functional characteristics, and advantages of this application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description of the Embodiments
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
Please refer to FIG. 1, which is a schematic diagram of the hardware structure of the video encoding device provided by this application.
The video encoding device can be a PC, or a device with a display function such as a smartphone, tablet computer, portable computer, or desktop computer; optionally, it can also be a server device, such as a remote video server that exchanges video data with a user terminal.
The video encoding device may include components such as a processor 101 and a memory 201. In the video encoding device, the processor 101 is connected to the memory 201, computer-readable instructions are stored on the memory 201, and the processor 101 can call the computer-readable instructions stored in the memory 201 to implement the steps of the following embodiments of the video encoding method.
The memory 201 can be used to store software programs and various data. It may mainly include a storage program area and a storage data area, where the storage program area can store the operating system and the application programs required for at least one function (such as computer-readable instructions for video encoding), and the storage data area may include a database. The processor 101 is the control center of the video encoding device; it connects the various parts of the entire device through various interfaces and lines, and performs the various functions of the device and processes data by running or executing the software programs and/or modules stored in the memory 201 and calling the data stored in the memory 201, thereby monitoring the device as a whole.
Those skilled in the art can understand that the structure of the video encoding device shown in FIG. 1 does not constitute a limitation on the device, which may include more or fewer components than shown, or combine certain components, or use a different component arrangement.
Based on the above hardware structure, the embodiments of the method of this application are proposed.
This application provides a video encoding method.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a first embodiment of the video encoding method of this application.
In this embodiment, the video encoding method includes the following steps:
Step S10: Obtain a video frame to be encoded.
The video encoding device can obtain the video to be encoded from a video database preset locally or on a remote server, where the video to be encoded can be a video collected in real time, for example through a terminal camera, such as a conference video collected in real time in a conference system; it can also be a pre-stored video, such as a film or television video. A video frame is the basic unit of a video and the basic object of video encoding; therefore, in this embodiment, before actually performing the encoding operation, the video frame to be encoded is obtained as the encoding object.
The video encoding method of this application can be applied to multiple scenarios, such as video conferencing or film and television entertainment. In a video conferencing scenario, the video encoding device collects the on-site video of each conference member through the terminal camera, and encodes and transmits each member's on-site video to the other members' terminals; here, the on-site conference video is the video to be encoded. In a film and television entertainment scenario, when the video encoding device receives a target video acquisition request sent by a user terminal, it determines the video to be encoded according to the request, then encodes it and transmits it to the user terminal.
A video is composed of multiple video frames, and a single encoding pass usually cannot encode all of them; several passes are needed to encode a video completely. Therefore, when encoding the video to be encoded, the preset number of video frames required for a single pass must be obtained multiple times, and the corresponding video encoding operation, that is, the steps in the embodiments of this application, performed each time.
Optionally, when or before the video frames of the video to be encoded (that is, the video frames to be encoded) are obtained for the first time, the video encoding setting information is obtained and the encoding rules are obtained from it. The encoding rules may include the rule for determining the region of interest and the rules for determining the code rates of the region of interest and the non-interest region; once the encoding rules are obtained, the video frames to be encoded can be encoded according to them. Optionally, the encoding rules can be monitored for updates in real time: when a change is detected, the latest encoding rules are obtained and the remaining unencoded video frames of the video are encoded according to them.
Step S20: Perform face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determine the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and use the area other than the region of interest in the video frame to be encoded as the non-interest region.
Whether in a conference video, a film or television video, or any other video, the area where faces are located is where the user's attention is concentrated. To balance quality and compression efficiency, the embodiments of this application encode the video frame differentially according to its face-related attributes (such as area attributes and (pixel/coordinate) position attributes). Since face-related attributes, such as whether a face exists and where it is located, are uncertain across different video frames, face detection must be performed on the video frame to be encoded to determine these attributes, so that the specific allocation of coding rates can be determined based on them in the subsequent encoding steps.
The preset rule here is the rule for determining the region of interest. When performing face detection, the specific detection content must be determined according to the preset rule; the preset rule can be obtained when or before the first video frame to be encoded is obtained, or before face detection is performed.
The preset rule can be: use the area where faces are located in the video frame to be encoded as the region of interest; or: use the area of faces whose area exceeds a preset value as the region of interest; or: use the area of faces whose area exceeds a preset value together with their surrounding area as the region of interest. On the basis of the above, the preset rule may also include: when there is no face in the video frame to be encoded, use a preset area of the frame (such as the central area) as the region of interest. The foregoing are only a few optional examples of rules for determining the region of interest; other face-based rules may also be used. In addition, multiple preset rules can be configured in the video encoding device at the same time, and the user of the device can switch between them autonomously.
The specific content of the aforementioned face detection is determined according to the preset rule, and the face detection result corresponds to that content. Different preset rules lead to different detection content and results, including but not limited to the following examples. When the preset rule uses the area where faces are located as the region of interest, the detection content is only whether a face exists and the position of any detected face, and the corresponding result is either that the frame contains a face together with its position, or that it contains no face. When the preset rule uses faces whose area exceeds a preset value as the region of interest, the detection content is whether a face exists and the face area, and the corresponding result is either that the frame contains faces, their positions, and the faces whose area exceeds the preset value, or that it contains no face. When the preset rule additionally specifies that a preset area (such as the central area) is used as the region of interest if the frame contains no face, the detection content also includes the position of that preset area.
Based on the above, once the preset rule and the face detection result are determined, the region of interest of the video frame to be encoded can be determined. The region of interest can be expressed in the form of pixels, and the pixels of the video frame to be encoded other than those of the region of interest are treated as the non-interest region.
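By way of illustration (not part of the patent text), the pixel-level split described above can be represented as a boolean mask over the frame, with the non-interest region as the complement of the region of interest; the frame size and face rectangle below are hypothetical values:

```python
import numpy as np

def split_regions(frame_h, frame_w, face_boxes):
    """Mark region-of-interest pixels from (x, y, w, h) face rectangles;
    every remaining pixel belongs to the non-interest region."""
    roi = np.zeros((frame_h, frame_w), dtype=bool)
    for x, y, w, h in face_boxes:
        roi[y:y + h, x:x + w] = True
    return roi, ~roi  # region of interest, non-interest region

roi_mask, non_roi_mask = split_regions(720, 1280, [(500, 200, 160, 200)])
```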
Step S30: Obtain the coding rates respectively corresponding to the region of interest and the non-interest region, and encode the region of interest and the non-interest region based on their respective coding rates.
The coding rates corresponding to the region of interest and the non-interest region are preset, with the rate of the region of interest higher than that of the non-interest region. Once the two regions are determined, the corresponding coding rates are obtained directly. The non-interest region can be encoded at a uniform rate, or at different rates according to image complexity or distance from the region of interest.
Optionally, for any video, after the above steps of identifying the region of interest and encoding the two regions at different rates, the rates corresponding to the different regions of each video frame can be stored; when the same video is encoded again later, the stored rate distribution of each region can be queried directly and the video encoded according to it.
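A minimal sketch of this optional caching; the cache layout and the function names here are assumptions for illustration, not the patent's implementation:

```python
# Hypothetical per-video cache: video_id -> {frame_index: {(mb_col, mb_row): rate_bps}}
rate_cache = {}

def rates_for(video_id, frame_index, compute_distribution):
    """Return the stored per-region rate distribution for a frame, computing
    and storing it on the first encode so later re-encodes reuse it."""
    per_frame = rate_cache.setdefault(video_id, {})
    if frame_index not in per_frame:
        per_frame[frame_index] = compute_distribution(frame_index)
    return per_frame[frame_index]
```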
This embodiment obtains the video frame to be encoded, performs face detection on it based on a preset rule to obtain a face detection result, determines the region of interest according to the preset rule and the face detection result, uses the area outside the region of interest as the non-interest region, obtains the coding rates corresponding to the two regions, and encodes them based on their respective rates. That is, the user's region of interest is identified in the video frame to be encoded based on the face detection result and the preset rule, and the identified region of interest and non-interest region are encoded differentially, thereby reducing the video size while ensuring the video quality of the user's region of interest.
Further, a second embodiment of the video encoding method of this application is proposed based on the above embodiment.
In the second embodiment of the video encoding method of this application, before step S10 the method includes:
Step S01: Obtain the video to be encoded and the video information of the video to be encoded, and obtain the video type of the video to be encoded from the video information.
A video encoding device configured with the computer-readable instructions corresponding to the video encoding method of this application can be applied to a variety of video encoding scenarios, typically film and television videos and conference videos. The video to be encoded may be a video collected in real time, such as a conference video transmitted in real time in a digital conference system, or a video pre-stored in a database, such as a film or television video on a video website server.
The video information of the video to be encoded contains the video type, and may also contain main character information, which includes the facial features of the main characters. When the video type is film or television, each work has one or more fixed on-screen main characters, including leads, supporting characters, and bit players, all of which are of interest to users; the video information therefore includes these main characters.
Step S02: When the video to be encoded is a film or television video, obtain the facial features of the main characters from the video information.
The facial features of the main characters can be obtained directly from the video information. When the video information does not contain them, the main characters can be determined by analyzing a preset number of video frames of the video to be encoded (for example, using appearance rate or appearance time as the basis for judging the main characters): if a person's face appears in the preset number of video frames, that person is taken as one of the main characters. After the main characters are determined, their facial features are extracted from the video to be encoded and stored in the video information; when video encoding is performed, the facial features of the main characters are obtained directly from the video information.
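A sketch of the appearance-rate analysis described above; the face detector and the feature-matching predicate are assumed, pluggable components rather than anything specified by the patent:

```python
from collections import defaultdict

def find_main_characters(sampled_frames, detect_faces, same_person, min_appearances):
    """Count how often each distinct face appears across the sampled frames and
    keep the faces seen at least `min_appearances` times."""
    exemplars = []                            # one stored feature vector per person
    counts = defaultdict(int)
    for frame in sampled_frames:
        for feature in detect_faces(frame):   # one feature vector per detected face
            for idx, known in enumerate(exemplars):
                if same_person(feature, known):
                    counts[idx] += 1
                    break
            else:                             # a face not seen before
                exemplars.append(feature)
                counts[len(exemplars) - 1] += 1
    return [exemplars[i] for i, n in counts.items() if n >= min_appearances]
```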
In step S20, the step of determining the region of interest of the video frame to be encoded according to the preset rule and the face detection result includes:
Step S21: Determine, according to the face detection result, whether there is a human face in the video frame to be encoded.
In this embodiment, the face detection result states whether the video frame to be encoded contains a face, so this can be determined directly from the result.
Step S22: If there is no human face in the video frame to be encoded, acquire a preset central area, and use the central area as the region of interest of the video frame to be encoded.
When the face detection result shows no face in the video frame to be encoded, the user's gaze generally rests at the center of the video, so the preset central area is taken as the region of interest of the video frame to be encoded.
The preset central area can be a fixed central area, meaning the central area of the video frame in the geometric sense: a rectangular area or a circular (including elliptical) area at the center of the frame. Specifically, the position (pixel position/coordinate position) of the central area on the video frame to be encoded can be computed from the expected area of the central region and the area of the frame.
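For instance, a centered rectangle can be computed from the expected area ratio as follows; the 25% ratio below is a hypothetical choice, not a value from the patent:

```python
def central_region(frame_w, frame_h, area_ratio=0.25):
    """Return (x, y, w, h) of a centered rectangle covering `area_ratio`
    of the frame while keeping the frame's aspect ratio."""
    scale = area_ratio ** 0.5     # scale each side so the area scales by area_ratio
    w, h = int(frame_w * scale), int(frame_h * scale)
    return (frame_w - w) // 2, (frame_h - h) // 2, w, h

print(central_region(1280, 720))  # (320, 180, 640, 360)
```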
After the step of determining whether there is a human face in the video frame to be encoded according to the face detection result, the method includes:
Step S23: If there is a human face in the video frame to be encoded, judge, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters.
If the video frame to be encoded contains a face, the method continues to judge whether it contains a target face matching the facial features of the main characters. In this embodiment, the detection content of face detection also includes: when a face is detected, continuing the detection to obtain its facial features, so the face detection result also includes the detected facial features. The detected facial features can be compared and matched with those of the main characters to judge whether a matching target face exists.
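One common way to realize such matching is a similarity threshold on feature vectors; the cosine-similarity metric and the 0.6 threshold below are assumptions for illustration:

```python
import numpy as np

def is_target_face(face_feature, main_features, threshold=0.6):
    """Return True if the detected feature vector matches any stored
    main-character feature under cosine similarity."""
    f = np.asarray(face_feature, dtype=float)
    f = f / np.linalg.norm(f)
    for m in main_features:
        m = np.asarray(m, dtype=float)
        m = m / np.linalg.norm(m)
        if float(f @ m) >= threshold:
            return True
    return False
```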
Step S24: If there is a target face matching the facial features of the main characters in the video frame to be encoded, use the region corresponding to the target face as the region of interest of the video frame to be encoded.
The region corresponding to the target face may include the area where the target face is located, and may also include the area where the person with the target face is located.
The area where the target face is located can be taken directly from the face detection result as its position (pixel position or coordinate position). The area where the person with the target face is located refers to the pixel area of the body associated with the face: body contour recognition can be performed on the area surrounding the target face in the video frame to be encoded, and the area bounded by the recognized body contour is taken as the body pixel area associated with the face.
The main characters are the characters users are interested in, including leads, supporting characters, and bit players; a target face matching the facial features of the main characters is the face of a main character. Taking the leading actor and actress as the main characters for example, when their faces are detected in the video frame to be encoded, those faces form the region of interest; if faces of other, non-leading characters also appear in the frame, those faces belong to the non-interest region.
Optionally, after step S23 the method further includes:
If there is no target face matching the facial features of the main characters in the video frame to be encoded, using the area where the faces in the video frame to be encoded are located as the region of interest of the video frame to be encoded.
If there is no target face in the video frame to be encoded, the area where the detected faces are located is used directly as the region of interest of the frame.
Continuing the example above with the leading actor and actress as the main characters: if the frame contains no target face, that is, no face of the leads, but does contain faces of other characters (such as passers-by), then the passers-by's faces are used as the region of interest of the video frame to be encoded.
This embodiment obtains the video type of the video to be encoded from the video information; when the video is a film or television video, obtains the facial features of the main characters from the video information; when the video frame to be encoded contains a face, judges according to those features and the face detection result whether the frame contains a matching target face; and, if so, uses the region corresponding to the target face as the region of interest of the frame. Since viewers of film and television videos generally focus on the main characters, identifying the main characters and using the regions corresponding to them as the region of interest allows the region of interest to be encoded at a higher bit rate and the rest at a lower bit rate: the places the user watches are encoded at a higher rate, giving the user a good video effect, while the places outside the user's attention are encoded at a lower rate, reducing the video transmission bit rate.
Further, in a third embodiment of the video encoding method of this application, step S30 includes:
Step S31: Determine the macroblocks to which the region of interest and the non-interest region respectively belong.
The video encoding operation in the method of this application takes macroblocks as its unit: macroblocks are encoded one by one and organized into a continuous video stream, where a macroblock is composed of one luminance pixel block and two additional chrominance pixel blocks.
Both the region of interest and the non-interest region belong to one or more macroblocks. Once the two regions are determined, the macroblocks to which each belongs can be determined from the regions' pixel positions.
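A sketch of mapping pixel regions to macroblock indices; the 16x16 macroblock size is the common H.264-style choice and is assumed here rather than stated by the patent:

```python
import math

MB = 16  # assumed 16x16 luma macroblocks

def macroblocks_of(region_boxes, frame_w, frame_h):
    """Map pixel rectangles (x, y, w, h) to the set of (col, row) indices
    of the macroblocks they overlap."""
    cols, rows = math.ceil(frame_w / MB), math.ceil(frame_h / MB)
    blocks = set()
    for x, y, w, h in region_boxes:
        for row in range(y // MB, min(math.ceil((y + h) / MB), rows)):
            for col in range(x // MB, min(math.ceil((x + w) / MB), cols)):
                blocks.add((col, row))
    return blocks
```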
Step S32: Obtain the macroblock distance between each macroblock of the non-interest region and the region of interest, and determine, based on the macroblock distance, the first code rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first code rate are negatively correlated.
The farther an area is from the focal center of the human eye, the more easily the eye overlooks it; based on this characteristic of human vision, the non-interest region can be encoded at different rates. The macroblock distance between each macroblock of the non-interest region and the region of interest is computed; the smaller the distance, the higher the rate, so the rate falls as the distance from the region of interest grows, making it hard for users to perceive the quality difference within a video frame and reducing the coded video stream and the bandwidth requirement while the user remains unaware.
The macroblock distance here can be the number of macroblocks separating a macroblock from the macroblocks on the boundary of the region of interest. The negative correlation means that a macroblock adjacent to the boundary macroblocks of the region of interest has the smallest distance and the largest first code rate, while the macroblock separated from the boundary by the most macroblocks has the smallest first code rate. The first code rate does not refer to one specific value; it denotes the coding rates of all the macroblocks of the non-interest region.
Optionally, the negative correlation between the macroblock distance and the first code rate can be computed by the following formula:
y = -kx + b, where k is a positive number, y is the first code rate, and x is the macroblock distance.
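A sketch of this linear rule with a floor so the computed rate never goes negative; the values of k, b, and the floor are hypothetical:

```python
def first_code_rate(mb_distance, k=40_000, b=800_000, floor=100_000):
    """y = -k*x + b: code rate (bits/s) falls linearly with macroblock distance."""
    return max(b - k * mb_distance, floor)

# distance 0 -> 800 kbps, 5 -> 600 kbps, 18 or more -> clamped to the 100 kbps floor
```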
Optionally, in step S32, the step of determining, based on the macroblock distance, the first code rate corresponding to each macroblock of the non-interest region, where the macroblock distance and the first code rate are negatively correlated, includes:
comparing the macroblock distance of each macroblock of the non-interest region with preset distance intervals to determine the interval in which each macroblock distance falls; obtaining the preset correspondence between distance intervals and code rates, obtaining the target code rate corresponding to the interval in which each macroblock distance falls, and using the target code rate as the first code rate of the corresponding macroblock.
The correspondence between macroblock distance and first code rate can be preset and stored; when the first code rate of a macroblock is needed, the macroblock's distance is obtained directly, the stored correspondence is retrieved, and the first code rate corresponding to that distance is determined from the correspondence.
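The interval lookup can be sketched with a sorted list of interval bounds; the intervals and rates below are hypothetical presets:

```python
import bisect

UPPER_BOUNDS = [2, 5, 10]                              # intervals [0,2], (2,5], (5,10], (10,inf)
INTERVAL_RATES = [700_000, 500_000, 300_000, 150_000]  # target rate (bits/s) per interval

def first_code_rate_by_interval(mb_distance):
    """Return the preset target rate of the interval the distance falls in."""
    return INTERVAL_RATES[bisect.bisect_left(UPPER_BOUNDS, mb_distance)]
```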
Step S33: Acquire the second code rate corresponding to the region of interest, and encode the non-interest region and the region of interest respectively according to the first code rate and the second code rate.
In this embodiment, the coding rate of the region of interest, that is, the second code rate, is pre-stored in a database; once the region of interest is determined, its second code rate can be obtained directly from the database. Each macroblock of the non-interest region is encoded at its first code rate, and the region of interest at the second code rate.
In this embodiment, the correspondence between macroblock distance and first code rate is the correspondence between distance intervals and code rates: macroblock distances falling in the same interval correspond to the same rate.
This embodiment reduces the coded video stream and the bandwidth requirement while the user remains unaware.
Optionally, in a fourth embodiment of the video encoding method of this application, the method further includes: receiving no-viewer prompt information sent by a user terminal, where the no-viewer prompt information is sent when the user terminal detects that no line of sight is on the user terminal screen; and reducing the coding rate of the current video frame to be encoded.
The rate setting for the video frame to be encoded can also be determined by the user terminal's detection of the user's state. Specifically, the user terminal camera can detect whether any line of sight rests on the terminal screen during a preset period. If no gaze is detected during that period, a no-viewer prompt is sent to the video encoding device, which lowers the coding rate of the current video frame to be encoded on receipt; when gaze is detected on the screen again, a viewer-present prompt is sent, and the device restores the coding rate of the current video frame to its normal level on receipt.
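A sketch of how the encoder side might react to those prompts; the message names and rate values here are illustrative assumptions:

```python
def coding_rate_for_prompt(prompt, normal_rate=800_000, reduced_rate=200_000):
    """Pick the coding rate for the current frame from the terminal's prompt."""
    if prompt == "no_viewer":   # no gaze detected on the terminal screen
        return reduced_rate
    return normal_rate          # "viewer_present": restore the normal level

rate = coding_rate_for_prompt("no_viewer")
```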
Because different regions of the video frame to be encoded are encoded differentially, when lowering the coding rate of the current frame, either all its regions are lowered by the same amount, or the coding rates of all its regions are lowered to the same value.
Optionally, the no-viewer prompt can also be based on the program the user terminal is currently running: if the user is detected operating another program, for example temporarily leaving the current video interface for another page, or the video window is detected to be minimized, a no-viewer prompt can be sent to the video encoding device.
In this embodiment, by receiving the no-viewer prompt sent when the user terminal detects that no line of sight rests on its screen and lowering the coding rate of the current video frame to be encoded, the user terminal detects whether the user is actually paying attention to the video, and the coding rate of the current frame is adjusted according to the detection result, which reduces transmission bandwidth and saves transmission resources.
In addition, this application also provides a video encoding system corresponding to the steps of the above video encoding method.
Referring to FIG. 3, FIG. 3 is a schematic diagram of the functional modules of a first embodiment of the video encoding system of this application.
In this embodiment, the video encoding system of this application includes:
a video frame acquisition module 10 for acquiring the video frame to be encoded;
an interest determination module 20 configured to perform face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, determine the region of interest of the video frame to be encoded according to the preset rule and the face detection result, and use the area other than the region of interest in the video frame to be encoded as the non-interest region;
an encoding execution module 30 configured to obtain the coding rates respectively corresponding to the region of interest and the non-interest region, and encode the region of interest and the non-interest region based on their respective coding rates.
Further, the interest determination module 20 is also configured to determine, according to the face detection result, whether there is a face in the video frame to be encoded; and, if there is no face, to acquire a preset central area and use it as the region of interest of the video frame to be encoded.
Further, the video encoding system of this application also includes:
a video information acquisition module for acquiring the video to be encoded and its video information, obtaining the video type of the video to be encoded from the video information, and, when the video to be encoded is a film or television video, obtaining the facial features of the main characters from the video information.
Further, the interest determination module 20 is also configured to, if there is a face in the video frame to be encoded, judge according to the facial features of the main characters and the face detection result whether there is a target face in the frame matching those features; and, if so, to use the region corresponding to the target face as the region of interest of the frame.
Further, the interest determination module 20 is also configured to, if there is no target face matching the facial features of the main characters in the video frame to be encoded, use the area where the faces in the frame are located as the region of interest of the frame.
Further, the encoding execution module 30 is also configured to determine the macroblocks to which the region of interest and the non-interest region respectively belong; obtain the macroblock distance between each macroblock of the non-interest region and the region of interest, and determine, based on the macroblock distance, the first code rate of each such macroblock, where the macroblock distance and the first code rate are negatively correlated; and acquire the second code rate corresponding to the region of interest, and encode the non-interest region and the region of interest according to the first and second code rates respectively.
Further, the encoding execution module 30 is also configured to compare the macroblock distance of each macroblock of the non-interest region with preset distance intervals, determine the interval in which each distance falls, obtain the preset correspondence between distance intervals and code rates, obtain the target code rate corresponding to the interval in which each macroblock distance falls, and use the target code rate as the first code rate of the corresponding macroblock.
Further, the video encoding system of this application also includes:
a rate adjustment module for receiving the no-viewer prompt information sent by the user terminal, where the prompt is sent when the terminal detects that no line of sight rests on its screen, and for lowering the coding rate of the current video frame to be encoded.
This application also provides a computer-readable storage medium, which can be a non-volatile readable storage medium storing a computer program. The computer-readable storage medium may be the memory 201 in the video encoding device of FIG. 1, or at least one of a ROM (Read-Only Memory)/RAM (Random Access Memory), a magnetic disk, and an optical disc; the computer-readable storage medium includes several instructions to make a device with a processor (which may be a mobile phone, a computer, a server, a network device, or the video encoding device in the embodiments of this application) execute the methods described in the embodiments of this application.
It should be noted that, in this document, the terms "comprise", "include", and any of their variants are intended to cover non-exclusive inclusion, so that a process, method, article, or server that includes a list of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or server. Without further limitation, an element qualified by the phrase "comprising a ..." does not exclude the existence of other identical elements in the process, method, article, or server that includes it.
The serial numbers of the above embodiments of this application are for description only and do not represent the relative merits of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation.
The above are only preferred embodiments of this application and do not limit its patent scope; any equivalent structural or process transformation made using the contents of the specification and drawings of this application, applied directly or indirectly in other related technical fields, is likewise included in the patent protection scope of this application.

Claims (20)

  1. A video encoding method, wherein the video encoding method includes the following steps:
    obtaining a video to be encoded and video information of the video to be encoded, and obtaining a video type of the video to be encoded from the video information;
    when the video to be encoded is a film or television video, obtaining facial features of main characters from the video information;
    obtaining a video frame to be encoded;
    performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, and determining, according to the face detection result, whether there is a human face in the video frame to be encoded;
    if there is no human face in the video frame to be encoded, acquiring a preset central area, and using the central area as a region of interest of the video frame to be encoded;
    if there is a human face in the video frame to be encoded, judging, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters;
    if there is a target face matching the facial features of the main characters in the video frame to be encoded, using a region corresponding to the target face as the region of interest of the video frame to be encoded;
    using an area other than the region of interest in the video frame to be encoded as a non-interest region;
    obtaining coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region respectively based on the respective corresponding coding rates.
  2. The video encoding method according to claim 1, wherein after the step of judging, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters, the method includes:
    if there is no target face matching the facial features of the main characters in the video frame to be encoded, using an area where the faces in the video frame to be encoded are located as the region of interest of the video frame to be encoded.
  3. The video encoding method according to claim 1, wherein the step of obtaining the coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region respectively based on the respective corresponding coding rates, includes:
    determining macroblocks to which the region of interest and the non-interest region respectively belong;
    obtaining a macroblock distance between each macroblock to which the non-interest region belongs and the region of interest, and determining, based on the macroblock distance, a first code rate corresponding to each macroblock to which the non-interest region belongs, wherein the macroblock distance and the first code rate are negatively correlated;
    acquiring a second code rate corresponding to the region of interest, and encoding the non-interest region and the region of interest respectively according to the first code rate and the second code rate.
  4. The video encoding method according to claim 3, wherein the step of determining, based on the macroblock distance, the first code rate corresponding to each macroblock to which the non-interest region belongs, wherein the macroblock distance and the first code rate are negatively correlated, includes:
    comparing the macroblock distance corresponding to each macroblock to which the non-interest region belongs with preset distance intervals, and determining the distance interval in which the macroblock distance corresponding to each macroblock falls;
    obtaining a preset correspondence between distance intervals and code rates, obtaining a target code rate corresponding to the distance interval in which the macroblock distance corresponding to each macroblock falls, and using the target code rate as the first code rate corresponding to each macroblock.
  5. The video encoding method according to claim 1, wherein the video encoding method further includes:
    receiving no-viewer prompt information sent by a user terminal, wherein the no-viewer prompt information is sent when the user terminal detects that no line of sight is on the user terminal screen;
    reducing the coding rate of the current video frame to be encoded.
  6. A video encoding system, wherein the video encoding system includes:
    a video information acquisition module for obtaining a video to be encoded and video information of the video to be encoded, obtaining a video type of the video to be encoded from the video information, and, when the video to be encoded is a film or television video, obtaining facial features of main characters from the video information;
    a video frame acquisition module for obtaining a video frame to be encoded;
    an interest determination module for performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, and determining, according to the face detection result, whether there is a human face in the video frame to be encoded; if there is no human face in the video frame to be encoded, acquiring a preset central area and using the central area as a region of interest of the video frame to be encoded; if there is a human face in the video frame to be encoded, judging, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters; if there is a target face matching the facial features of the main characters in the video frame to be encoded, using a region corresponding to the target face as the region of interest of the video frame to be encoded; and using an area other than the region of interest in the video frame to be encoded as a non-interest region;
    an encoding execution module for obtaining coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region respectively based on the respective corresponding coding rates.
  7. The video encoding system according to claim 6, wherein the interest determination module is further configured to, if there is no target face matching the facial features of the main characters in the video frame to be encoded, use an area where the faces in the video frame to be encoded are located as the region of interest of the video frame to be encoded.
  8. The video encoding system according to claim 6, wherein the encoding execution module is further configured to determine macroblocks to which the region of interest and the non-interest region respectively belong; obtain a macroblock distance between each macroblock to which the non-interest region belongs and the region of interest, and determine, based on the macroblock distance, a first code rate corresponding to each macroblock to which the non-interest region belongs, wherein the macroblock distance and the first code rate are negatively correlated; and acquire a second code rate corresponding to the region of interest, and encode the non-interest region and the region of interest respectively according to the first code rate and the second code rate.
  9. The video encoding system according to claim 8, wherein the encoding execution module is further configured to compare the macroblock distance corresponding to each macroblock to which the non-interest region belongs with preset distance intervals, and determine the distance interval in which each macroblock distance falls; and obtain a preset correspondence between distance intervals and code rates, obtain a target code rate corresponding to the distance interval in which each macroblock distance falls, and use the target code rate as the first code rate corresponding to each macroblock.
  10. The video encoding system according to claim 6, wherein the video encoding system further includes:
    a rate adjustment module for receiving no-viewer prompt information sent by a user terminal, wherein the no-viewer prompt information is sent when the user terminal detects that no line of sight is on the user terminal screen, and for reducing the coding rate of the current video frame to be encoded.
  11. A video encoding device, wherein the video encoding device includes a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    obtaining a video to be encoded and video information of the video to be encoded, and obtaining a video type of the video to be encoded from the video information;
    when the video to be encoded is a film or television video, obtaining facial features of main characters from the video information;
    obtaining a video frame to be encoded;
    performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, and determining, according to the face detection result, whether there is a human face in the video frame to be encoded;
    if there is no human face in the video frame to be encoded, acquiring a preset central area, and using the central area as a region of interest of the video frame to be encoded;
    if there is a human face in the video frame to be encoded, judging, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters;
    if there is a target face matching the facial features of the main characters in the video frame to be encoded, using a region corresponding to the target face as the region of interest of the video frame to be encoded;
    using an area other than the region of interest in the video frame to be encoded as a non-interest region;
    obtaining coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region respectively based on the respective corresponding coding rates.
  12. The video encoding device according to claim 11, wherein after the step of judging, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters, the following is included:
    if there is no target face matching the facial features of the main characters in the video frame to be encoded, using an area where the faces in the video frame to be encoded are located as the region of interest of the video frame to be encoded.
  13. The video encoding device according to claim 11, wherein the step of obtaining the coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region respectively based on the respective corresponding coding rates, includes:
    determining macroblocks to which the region of interest and the non-interest region respectively belong;
    obtaining a macroblock distance between each macroblock to which the non-interest region belongs and the region of interest, and determining, based on the macroblock distance, a first code rate corresponding to each macroblock to which the non-interest region belongs, wherein the macroblock distance and the first code rate are negatively correlated;
    acquiring a second code rate corresponding to the region of interest, and encoding the non-interest region and the region of interest respectively according to the first code rate and the second code rate.
  14. The video encoding device according to claim 13, wherein the step of determining, based on the macroblock distance, the first code rate corresponding to each macroblock to which the non-interest region belongs, wherein the macroblock distance and the first code rate are negatively correlated, includes:
    comparing the macroblock distance corresponding to each macroblock to which the non-interest region belongs with preset distance intervals, and determining the distance interval in which the macroblock distance corresponding to each macroblock falls;
    obtaining a preset correspondence between distance intervals and code rates, obtaining a target code rate corresponding to the distance interval in which the macroblock distance corresponding to each macroblock falls, and using the target code rate as the first code rate corresponding to each macroblock.
  15. The video encoding device according to claim 11, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    receiving no-viewer prompt information sent by a user terminal, wherein the no-viewer prompt information is sent when the user terminal detects that no line of sight is on the user terminal screen;
    reducing the coding rate of the current video frame to be encoded.
  16. A computer-readable storage medium, wherein computer-readable instructions are stored on the computer-readable storage medium, and wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    obtaining a video to be encoded and video information of the video to be encoded, and obtaining a video type of the video to be encoded from the video information;
    when the video to be encoded is a film or television video, obtaining facial features of main characters from the video information;
    obtaining a video frame to be encoded;
    performing face detection on the video frame to be encoded based on a preset rule to obtain a face detection result, and determining, according to the face detection result, whether there is a human face in the video frame to be encoded;
    if there is no human face in the video frame to be encoded, acquiring a preset central area, and using the central area as a region of interest of the video frame to be encoded;
    if there is a human face in the video frame to be encoded, judging, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters;
    if there is a target face matching the facial features of the main characters in the video frame to be encoded, using a region corresponding to the target face as the region of interest of the video frame to be encoded;
    using an area other than the region of interest in the video frame to be encoded as a non-interest region;
    obtaining coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region respectively based on the respective corresponding coding rates.
  17. The computer-readable storage medium according to claim 16, wherein after the step of judging, according to the facial features of the main characters and the face detection result, whether there is a target face in the video frame to be encoded that matches the facial features of the main characters, the following is included:
    if there is no target face matching the facial features of the main characters in the video frame to be encoded, using an area where the faces in the video frame to be encoded are located as the region of interest of the video frame to be encoded.
  18. The computer-readable storage medium according to claim 16, wherein the step of obtaining the coding rates respectively corresponding to the region of interest and the non-interest region, and encoding the region of interest and the non-interest region respectively based on the respective corresponding coding rates, includes:
    determining macroblocks to which the region of interest and the non-interest region respectively belong;
    obtaining a macroblock distance between each macroblock to which the non-interest region belongs and the region of interest, and determining, based on the macroblock distance, a first code rate corresponding to each macroblock to which the non-interest region belongs, wherein the macroblock distance and the first code rate are negatively correlated;
    acquiring a second code rate corresponding to the region of interest, and encoding the non-interest region and the region of interest respectively according to the first code rate and the second code rate.
  19. The computer-readable storage medium according to claim 18, wherein the step of determining, based on the macroblock distance, the first code rate corresponding to each macroblock to which the non-interest region belongs, wherein the macroblock distance and the first code rate are negatively correlated, includes:
    comparing the macroblock distance corresponding to each macroblock to which the non-interest region belongs with preset distance intervals, and determining the distance interval in which the macroblock distance corresponding to each macroblock falls;
    obtaining a preset correspondence between distance intervals and code rates, obtaining a target code rate corresponding to the distance interval in which the macroblock distance corresponding to each macroblock falls, and using the target code rate as the first code rate corresponding to each macroblock.
  20. The computer-readable storage medium according to claim 16, wherein the computer-readable instructions, when executed by a processor, further implement the following steps:
    receiving no-viewer prompt information sent by a user terminal, wherein the no-viewer prompt information is sent when the user terminal detects that no line of sight is on the user terminal screen;
    reducing the coding rate of the current video frame to be encoded.
PCT/CN2019/120899 2019-04-12 2019-11-26 Video encoding method, system, device, and computer-readable storage medium WO2020207030A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910297964.X 2019-04-12
CN201910297964.XA CN110049324B (zh) 2019-04-12 Video encoding method, system, device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2020207030A1 (zh)

Family

ID=67276985

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/120899 WO2020207030A1 (zh) Video encoding method, system, device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN110049324B (zh)
WO (1) WO2020207030A1 (zh)


Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110049324B (zh) * 2019-04-12 2022-10-14 深圳壹账通智能科技有限公司 Video encoding method, system, device, and computer-readable storage medium
CN110620924B (zh) * 2019-09-23 2022-05-20 广州虎牙科技有限公司 Method and apparatus for processing encoded data, computer device, and storage medium
CN110769252A (zh) * 2019-11-01 2020-02-07 西安交通大学 Method for improving encoding quality using AI face detection
CN113011210B (zh) 2019-12-19 2022-09-16 北京百度网讯科技有限公司 Video processing method and apparatus
CN111050190B (zh) * 2019-12-31 2022-02-18 广州酷狗计算机科技有限公司 Encoding method, apparatus, device, and storage medium for live video streams
CN111885332A (zh) * 2020-07-31 2020-11-03 歌尔科技有限公司 Video storage method and apparatus, camera, and readable storage medium
CN112183227B (zh) * 2020-09-08 2023-12-22 瑞芯微电子股份有限公司 Intelligent encoding method and device for generalized face regions
CN112733650B (zh) * 2020-12-29 2024-05-07 深圳云天励飞技术股份有限公司 Target face detection method and apparatus, terminal device, and storage medium
CN112995713A (zh) * 2021-03-02 2021-06-18 广州酷狗计算机科技有限公司 Video processing method and apparatus, computer device, and storage medium


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102547293B (zh) * 2012-02-16 2015-01-28 西南交通大学 Conversational video encoding method combining temporal dependency of face regions with global rate-distortion optimization
WO2016202285A1 (zh) * 2015-06-19 2016-12-22 美国掌赢信息科技有限公司 Instant video transmission method and electronic device
CN106658011A (zh) * 2016-12-09 2017-05-10 深圳市云宙多媒体技术有限公司 Encoding and decoding method and apparatus for panoramic video

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103974071A (zh) * 2013-01-29 2014-08-06 富士通株式会社 Region-of-interest-based video encoding method and device
CN104427337A (zh) * 2013-08-21 2015-03-18 杭州海康威视数字技术股份有限公司 Region-of-interest video encoding method and apparatus based on object detection
US20170374319A1 (en) * 2016-06-24 2017-12-28 Pegatron Corporation Video image generation system and video image generating method thereof
CN106550240A (zh) * 2016-12-09 2017-03-29 武汉斗鱼网络科技有限公司 Bandwidth saving method and system
CN110049324A (zh) * 2019-04-12 2019-07-23 深圳壹账通智能科技有限公司 Video encoding method, system, device, and computer-readable storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114531615A (zh) * 2020-11-03 2022-05-24 腾讯科技(深圳)有限公司 Video data processing method and apparatus, computer device, and storage medium
CN114531615B (zh) * 2020-11-03 2023-10-27 腾讯科技(深圳)有限公司 Video data processing method and apparatus, computer device, and storage medium
CN113068034A (zh) * 2021-03-25 2021-07-02 Oppo广东移动通信有限公司 Video encoding method and apparatus, encoder, device, and storage medium
WO2023207205A1 (zh) * 2022-04-29 2023-11-02 上海哔哩哔哩科技有限公司 Video encoding method and apparatus
CN116800976A (zh) * 2023-07-17 2023-09-22 武汉星巡智能科技有限公司 Method, apparatus, and device for audio/video compression and restoration while accompanying a sleeping infant
CN116800976B (zh) * 2023-07-17 2024-03-12 武汉星巡智能科技有限公司 Method, apparatus, and device for audio/video compression and restoration while accompanying a sleeping infant

Also Published As

Publication number Publication date
CN110049324B (zh) 2022-10-14
CN110049324A (zh) 2019-07-23

Similar Documents

Publication Publication Date Title
WO2020207030A1 (zh) Video encoding method, system, device, and computer-readable storage medium
WO2017206456A1 (zh) Method and apparatus for displaying video images in a video call
WO2018128472A1 (en) Virtual reality experience sharing
WO2011062339A1 (en) Method for user authentication, and video communication apparatus and display apparatus thereof
EP3523960A1 (en) Device and method of displaying images
WO2016171363A1 (ko) Server, user terminal device, and control method thereof
EP3304942A1 (en) Method and apparatus for sharing application
WO2017107611A1 (zh) Smart home control method, apparatus, and system
WO2016200018A1 (en) Method and apparatus for sharing application
WO2017080402A1 (zh) Method for monitoring the status of smart devices on a shared screen, projection device, and user terminal
WO2019203528A1 (en) Electronic apparatus and method for controlling thereof
WO2015182893A1 (en) Apparatus and method for providing information
WO2018225949A1 (en) Method and apparatus for determining a motion vector
WO2023171981A1 (ko) Surveillance camera management apparatus
WO2019233190A1 (zh) Display-terminal-based text-to-speech method, display terminal, and storage medium
WO2020207038A1 (zh) People counting method, apparatus, device, and storage medium based on face recognition
WO2019114587A1 (zh) Information processing method and apparatus for a virtual reality terminal, and readable storage medium
WO2014189289A1 (ko) Method and apparatus for providing an associated service
WO2019160275A1 (en) Electronic device and method for generating summary image of electronic device
WO2014073939A1 (en) Method and apparatus for capturing and displaying an image
WO2018101533A1 (ko) Image processing apparatus and method
WO2015037894A1 (ko) Image processing apparatus using monitoring of video memory
WO2020241973A1 (en) Display apparatus and control method thereof
WO2020134003A1 (zh) Input method for smart television, smart television, mobile terminal, and storage medium
WO2017101456A1 (zh) Point focusing method and system for image capture, and mobile terminal

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19924057

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.02.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19924057

Country of ref document: EP

Kind code of ref document: A1