CN112651350A - Video processing method and device - Google Patents


Info

Publication number
CN112651350A
CN112651350A (application CN202011602544.7A)
Authority
CN
China
Prior art keywords: image, component, frequency component, low frequency
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011602544.7A
Other languages
Chinese (zh)
Inventor
张传金
刘治国
万海峰
陶维俊
马金星
姚莉莉
邵磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ANHUI CREARO TECHNOLOGY CO LTD
Original Assignee
ANHUI CREARO TECHNOLOGY CO LTD
Application filed by ANHUI CREARO TECHNOLOGY CO LTD filed Critical ANHUI CREARO TECHNOLOGY CO LTD
Priority claimed from application CN202011602544.7A
Publication of CN112651350A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/14 Systems for two-way working
    • H04N 7/15 Conference systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a video processing method for a video conference system. The method detects whether a real face image exists, processes the face image, and outputs the processed face image; a video conference picture including the face image is then displayed according to the image signal. The three color channels of the face RGB image are separated to obtain an R component image, a G component image and a B component image, high-frequency and low-frequency optimization is performed, and the optimized face image is obtained and displayed. According to the invention, the low-frequency component, which mainly corresponds to the background, is suppressed, and the high-frequency component image is enhanced based on an illumination parameter, so that a more accurate pre-processed image is obtained, facilitating high-accuracy image display and processing in dark environments with uneven illumination.

Description

Video processing method and device
Technical Field
The present invention relates to the field of video image processing technologies, and in particular, to a video processing method and apparatus.
Background
Video conference systems, whether software- or hardware-based, connect individuals or groups in two or more different places. They distribute various data, such as static and dynamic images, voice, text, and pictures, to each user's computer over existing telecommunication transmission media, so that geographically dispersed users can be gathered together as if in one place and can exchange information through graphics, sound, and other modes, improving both parties' understanding of the content. At present, video conferencing is developing toward multi-network cooperation and high definition.
Video conferencing is among the most advanced communication technologies currently available. Relying only on the internet, it enables efficient, high-definition remote meetings and remote office work; it offers unique advantages in continuously improving users' communication efficiency, reducing enterprise travel costs, and improving management effectiveness, has already partially replaced business travel, and has become the newest mode of remote work. In recent years the application range of video conferencing has expanded rapidly: it can be seen everywhere, from government, public security, the military, courts, science and technology, energy, medical care, and education, covering all aspects of social life. Recent industry data show that the domestic video conference market in 2009 was about 4.36 billion yuan.
High-definition display of face images in a video conference is a technical challenge: differences in illumination and occlusion across conference scenes leave the face image unclear. The traditional face image processing pipeline mainly consists of light compensation, gray level transformation, histogram equalization, normalization, geometric correction, filtering, and sharpening of the face image. Face image recognition in scenes with weak illumination remains an open technical problem; the color clarity of the face and its surroundings should be preserved as far as possible, so as to facilitate subsequent video display, processing, and analysis.
Disclosure of Invention
In view of the above, the present invention provides a video processing method and apparatus for a video conference system, which use frequency band optimization to address the technical problems of uneven illumination and the difficulty of displaying and processing high-definition face images in dark environments.
The technical scheme of the invention is as follows:
a video processing method for a video conference system including a video conference device and a display device, the method comprising:
detecting sound generated by a sound source of the conference space and outputting a positioning signal;
judging whether a real face image exists in the sub-image block of the conference image corresponding to the sound source according to the positioning signal so as to process the face image and output the processed face image;
and displaying a video conference picture comprising a human face image according to the image signal.
Correspondingly, the step of processing the face image and outputting the processed image signal includes:
separating three different color channels of a human face RGB image to obtain an R component image, a G component image and a B component image, and obtaining an R low-frequency component image, an R high-frequency component image, a G low-frequency component image, a G high-frequency component image, a B low-frequency component image and a B high-frequency component image which respectively correspond to the R component image, the G component image and the B component image;
and respectively carrying out optimization processing on the low-frequency component image and the high-frequency component image.
Correspondingly, the step of performing optimization processing on the low-frequency component image and the high-frequency component image respectively includes:
respectively executing low-frequency suppression on the R low-frequency component image, the G low-frequency component image and the B low-frequency component image to obtain an R component low-frequency suppression image, a G component low-frequency suppression image and a B component low-frequency suppression image which respectively correspond to the R low-frequency component image, the G low-frequency component image and the B low-frequency component image;
enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
correspondingly, the step of performing optimization processing on the low-frequency component image and the high-frequency component image respectively includes:
acquiring incident light parameters, and respectively enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
the incident light parameter r0 is set to a constant K divided by the average gray value of the image;
and obtaining the corresponding R high-frequency component enhanced image, G high-frequency component enhanced image and B high-frequency component enhanced image based on the product of the incident light parameter r0 and the R high-frequency component image, the G high-frequency component image and the B high-frequency component image respectively.
Correspondingly, the generating step of synthesizing the processed face image based on the R component low-frequency suppression image, the G component low-frequency suppression image, and the B component low-frequency suppression image includes:
generating an R component optimized image based on the R component low-frequency suppression image and the R high-frequency component enhanced image;
generating a G component optimized image based on the G component low-frequency suppression image and the G high-frequency component enhanced image;
generating a B component optimized image based on the B component low-frequency suppression image and the B high-frequency component enhanced image;
and synthesizing and generating a processed face image based on the R component optimized image, the G component optimized image and the B component optimized image.
In addition, the present invention also provides a video processing apparatus for a video conference system, where the video conference system includes a video conference device and a display device, the apparatus includes:
the detection module is used for detecting the sound generated by the sound source of the conference space and outputting a positioning signal;
the processing module is used for judging whether a real face image exists in the sub-image block of the conference image corresponding to the sound source according to the positioning signal, so as to process the face image and output the processed face image;
and the display module displays a video conference picture comprising a human face image according to the image signal.
Correspondingly, the step of processing the face image and outputting the processed face image includes:
separating three different color channels of a human face RGB image to obtain an R component image, a G component image and a B component image, and obtaining an R low-frequency component image, an R high-frequency component image, a G low-frequency component image, a G high-frequency component image, a B low-frequency component image and a B high-frequency component image which respectively correspond to the R component image, the G component image and the B component image;
and respectively carrying out optimization processing on the low-frequency component image and the high-frequency component image.
Correspondingly, the step of performing optimization processing on the low-frequency component image and the high-frequency component image respectively includes:
respectively executing low-frequency suppression on the R low-frequency component image, the G low-frequency component image and the B low-frequency component image to obtain an R component low-frequency suppression image, a G component low-frequency suppression image and a B component low-frequency suppression image which respectively correspond to the R low-frequency component image, the G low-frequency component image and the B low-frequency component image;
enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
correspondingly, the step of performing optimization processing on the low-frequency component image and the high-frequency component image respectively includes:
acquiring incident light parameters, and respectively enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
the incident light parameter r0 is set to a constant K divided by the average gray value of the image;
and obtaining the corresponding R high-frequency component enhanced image, G high-frequency component enhanced image and B high-frequency component enhanced image based on the product of the incident light parameter r0 and the R high-frequency component image, the G high-frequency component image and the B high-frequency component image respectively.
Correspondingly, the generating step of synthesizing the processed face image based on the R component low-frequency suppression image, the G component low-frequency suppression image, and the B component low-frequency suppression image includes:
generating an R component optimized image based on the R component low-frequency suppression image and the R high-frequency component enhanced image;
generating a G component optimized image based on the G component low-frequency suppression image and the G high-frequency component enhanced image;
generating a B component optimized image based on the B component low-frequency suppression image and the B high-frequency component enhanced image;
and synthesizing and generating a processed face image based on the R component optimized image, the G component optimized image and the B component optimized image.
In the scheme of the embodiment of the invention, the video processing method is used for a video conference system, the video conference system comprises video conference equipment and display equipment, and the method comprises the following steps: detecting sound generated by a sound source of the conference space and outputting a positioning signal; judging whether a real face image exists in the sub-image block of the conference image corresponding to the sound source according to the positioning signal so as to process the face image and output the processed face image; and displaying a video conference picture comprising a human face image according to the image signal. The method comprises the steps of separating three different color channels of a human face RGB image to obtain an R component image, a G component image and a B component image, and obtaining an R low-frequency component image, an R high-frequency component image, a G low-frequency component image, a G high-frequency component image, a B low-frequency component image and a B high-frequency component image which respectively correspond to the R component image, the G component image and the B component image; a low-frequency optimization step, namely performing low-frequency suppression on the R component image, the G component image and the B component image to obtain an R component low-frequency suppression image, a G component low-frequency suppression image and a B component low-frequency suppression image which respectively correspond to the R component, the G component and the B component; a high-frequency optimization step of enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image; and a generating step of synthesizing the processed face image based on the R component low-frequency suppression image, the G component low-frequency suppression image and the B component low-frequency suppression image. 
According to the invention, the low-frequency component, which mainly corresponds to the background, is suppressed, and the high-frequency component image is enhanced based on the illumination parameter, so that a more accurate pre-processed image is obtained, solving the technical problems of uneven illumination and the difficulty of high-accuracy image display and processing in dark environments.
Drawings
FIG. 1 is a flowchart of a method according to a first embodiment of the present invention;
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
This embodiment of the invention provides a video processing method for a video conference system, where the video conference system includes a video conference device and a display device. The method includes:
detecting sound generated by a sound source of the conference space and outputting a positioning signal;
judging whether a real face image exists in the sub-image block of the conference image corresponding to the sound source according to the positioning signal so as to process the face image and output the processed face image;
specifically, when a video conference is performed, the processor acquires a conference image of the conference space via the image detection device. When the sound source detection device detects the sound source of the conference space, the sound source detection device outputs a positioning signal. In the embodiment, the sound source detecting device determines whether the sound intensity of the sound source exceeds a threshold (Thresholds) of the sound intensity and the sound duration, for example, to determine whether to output the localization signal to the processor. The processor receives the positioning signal to judge whether a real human face image exists in the sub-image block of the conference image corresponding to the sound source according to the positioning signal so as to output an image signal. The display device displays a close-up conference picture including a real face image and a speaker's voice by dialing in accordance with the image signal. Therefore, the video conference method of the embodiment can enable the video conference device to actively track the sound of the conference member, and can operate the display device to synchronously display the face image of the conference member emitting the sound, so as to provide a good video conference effect.
And displaying a video conference picture comprising a human face image according to the image signal.
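As a sketch of the gating logic described in this embodiment, the check on sound intensity and duration can be written as a simple predicate in Python; the function name and the threshold values are assumptions introduced for illustration, not taken from the patent:

```python
def should_output_positioning_signal(intensity_db, duration_s,
                                     intensity_threshold_db=45.0,
                                     duration_threshold_s=0.5):
    """Return True when the detected sound source qualifies as a speaking
    event, i.e. both the intensity and the duration exceed their thresholds.
    Threshold values are illustrative, not specified by the patent."""
    return (intensity_db > intensity_threshold_db
            and duration_s > duration_threshold_s)
```

Only when this predicate holds would the positioning signal be sent to the processor, which then looks for a real face image in the corresponding sub-image block.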
Correspondingly, the step of processing the face image and outputting the processed image signal includes:
separating three different color channels of a human face RGB image to obtain an R component image, a G component image and a B component image, and obtaining an R low-frequency component image, an R high-frequency component image, a G low-frequency component image, a G high-frequency component image, a B low-frequency component image and a B high-frequency component image which respectively correspond to the R component image, the G component image and the B component image;
and respectively carrying out optimization processing on the low-frequency component image and the high-frequency component image.
Specifically, acquiring the R component image, G component image and B component image of an image is well known in the prior art and is not described further in this embodiment.
Specifically, to acquire the R low-frequency component image, R high-frequency component image, G low-frequency component image, G high-frequency component image, B low-frequency component image and B high-frequency component image corresponding to the R component image, G component image and B component image, a DCT transform, wavelet transform or Fourier transform may be performed on the R component image, G component image and B component image respectively. Taking the Fourier transform as an example: physically, the Fourier transform converts the image from the spatial domain to the frequency domain, transforming the gray level distribution function of the image into its frequency distribution function, while the inverse Fourier transform converts the frequency distribution function back into the gray level distribution function. The low-frequency component (low-frequency signal) represents the regions where brightness or gray value changes slowly, i.e. the large flat areas of the image; it describes the main body of the image and is a comprehensive measure of the intensity of the whole image. The high-frequency component (high-frequency signal) corresponds to the regions of the image that change sharply, namely the edges (contours), noise and fine detail; it mainly measures the edges and contours of the image, and the human eye is more sensitive to the high-frequency component. Noise is counted as high-frequency because image noise is high-frequency in most cases.
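Taking the Fourier-transform branch above, the split of one component image into low- and high-frequency component images can be sketched with NumPy; the ideal circular cutoff and the `cutoff_ratio` value are assumptions, since the patent does not fix the filter shape:

```python
import numpy as np

def split_low_high(channel, cutoff_ratio=0.1):
    """Split one color component image into low- and high-frequency
    component images using a centered 2-D FFT and an ideal circular mask."""
    f = np.fft.fftshift(np.fft.fft2(channel.astype(float)))
    h, w = channel.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    low_mask = radius <= cutoff_ratio * min(h, w)
    # The two masks partition the spectrum, so low + high == channel.
    low = np.fft.ifft2(np.fft.ifftshift(f * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(f * ~low_mask)).real
    return low, high
```

Because the two masks partition the spectrum, the low- and high-frequency component images sum back to the original component image, which makes the later recombination step straightforward.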
A low-frequency optimization step, in which low-frequency suppression is respectively performed on the R low-frequency component image, the G low-frequency component image and the B low-frequency component image, and an R component low-frequency suppression image, a G component low-frequency suppression image and a B component low-frequency suppression image which respectively correspond to the R low-frequency component image, the G low-frequency component image and the B low-frequency component image are obtained;
specifically, the R low-frequency component image, the G low-frequency component image, and the B low-frequency component image are respectively input to a high-pass filter, so that low frequency is further suppressed, a high-frequency portion is enhanced, and edges or lines of the face image are clearer.
A high-frequency optimization step of enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
preferably, the high-frequency optimizing step of enhancing the R high-frequency component image, the G high-frequency component image, and the B high-frequency component image includes:
acquiring incident light parameters, and respectively enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
the incident light parameter r0 is set to a constant K/average gray value of the image;
here, the average gray value of the image should be the average gray corresponding to the current R, G, and B high-frequency component images; where a constant K is used to control the overall brightness of the enhanced image.
Preferably, enhancing the R high-frequency component image, the G high-frequency component image, and the B high-frequency component image includes:
and obtaining the corresponding R high-frequency component enhanced image, G high-frequency component enhanced image and B high-frequency component enhanced image based on the product of the incident light parameter r0 and the R high-frequency component image, the G high-frequency component image and the B high-frequency component image respectively.
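The enhancement by the incident light parameter r0 = K / average gray value can be sketched as follows; using the mean absolute value as the "average gray" of a high-frequency component image (whose signed mean is near zero) and the default value of K are both assumptions:

```python
import numpy as np

def enhance_high_frequency(high_img, k=128.0):
    """Scale a high-frequency component image by the incident light
    parameter r0 = K / average gray value. The abs-mean gray estimate
    and the default K are assumptions, not fixed by the patent."""
    mean_gray = float(np.mean(np.abs(high_img))) + 1e-6  # avoid division by zero
    r0 = k / mean_gray
    return r0 * high_img
```

A larger K brightens the enhanced image overall, matching the role of the constant K described above.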
And a generating step of synthesizing the preprocessed face image based on the R component low-frequency suppression image, the G component low-frequency suppression image and the B component low-frequency suppression image.
Preferably, the generating step of synthesizing the preprocessed face image based on the R component low-frequency suppression image, the G component low-frequency suppression image, and the B component low-frequency suppression image includes:
generating an R component optimized image based on the R component low-frequency suppression image and the R high-frequency component enhanced image;
generating a G component optimized image based on the G component low-frequency suppression image and the G high-frequency component enhanced image;
generating a B component optimized image based on the B component low-frequency suppression image and the B high-frequency component enhanced image;
and synthesizing and generating a preprocessed face image based on the R component optimized image, the G component optimized image and the B component optimized image.
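Putting the steps of this embodiment together, an end-to-end sketch might look as follows; the ideal-filter cutoff, the low-band attenuation factor `alpha`, the brightness constant `K`, and the abs-mean gray estimate are all assumptions introduced for illustration, not choices fixed by the patent:

```python
import numpy as np

def process_face_image(rgb, cutoff_ratio=0.1, alpha=0.5, k=128.0):
    """Per-channel frequency split, low-band suppression (factor alpha),
    high-band enhancement by r0 = K / average gray, then recombination
    of the R, G and B optimized component images."""
    h, w = rgb.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    low_mask = np.hypot(yy - h / 2, xx - w / 2) <= cutoff_ratio * min(h, w)
    out = np.empty(rgb.shape, dtype=float)
    for c in range(3):  # process R, G and B component images independently
        f = np.fft.fftshift(np.fft.fft2(rgb[..., c].astype(float)))
        low = np.fft.ifft2(np.fft.ifftshift(f * low_mask)).real
        high = np.fft.ifft2(np.fft.ifftshift(f * ~low_mask)).real
        r0 = k / (float(np.mean(np.abs(high))) + 1e-6)  # incident light parameter
        out[..., c] = alpha * low + r0 * high           # component optimized image
    return np.clip(out, 0.0, 255.0).astype(np.uint8)
```

The three component optimized images are synthesized back into the processed RGB face image, as in the generating step above.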
Example two
In addition, the present invention also provides a video processing apparatus for a video conference system, where the video conference system includes a video conference device and a display device, the apparatus includes:
the detection module is used for detecting the sound generated by the sound source of the conference space and outputting a positioning signal;
the processing module is used for judging whether a real face image exists in the sub-image block of the conference image corresponding to the sound source according to the positioning signal, so as to process the face image and output the processed face image;
and the display module displays a video conference picture comprising a human face image according to the image signal.
Correspondingly, the step of processing the face image and outputting the processed face image includes:
separating three different color channels of a human face RGB image to obtain an R component image, a G component image and a B component image, and obtaining an R low-frequency component image, an R high-frequency component image, a G low-frequency component image, a G high-frequency component image, a B low-frequency component image and a B high-frequency component image which respectively correspond to the R component image, the G component image and the B component image;
and respectively carrying out optimization processing on the low-frequency component image and the high-frequency component image.
Correspondingly, the step of performing optimization processing on the low-frequency component image and the high-frequency component image respectively includes:
respectively executing low-frequency suppression on the R low-frequency component image, the G low-frequency component image and the B low-frequency component image to obtain an R component low-frequency suppression image, a G component low-frequency suppression image and a B component low-frequency suppression image which respectively correspond to the R low-frequency component image, the G low-frequency component image and the B low-frequency component image;
enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
correspondingly, the step of performing optimization processing on the low-frequency component image and the high-frequency component image respectively includes:
acquiring incident light parameters, and respectively enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
the incident light parameter r0 is set to a constant K divided by the average gray value of the image;
and obtaining the corresponding R high-frequency component enhanced image, G high-frequency component enhanced image and B high-frequency component enhanced image based on the product of the incident light parameter r0 and the R high-frequency component image, the G high-frequency component image and the B high-frequency component image respectively.
Correspondingly, the generating step of synthesizing the processed face image based on the R component low-frequency suppression image, the G component low-frequency suppression image, and the B component low-frequency suppression image includes:
generating an R component optimized image based on the R component low-frequency suppression image and the R high-frequency component enhanced image;
generating a G component optimized image based on the G component low-frequency suppression image and the G high-frequency component enhanced image;
generating a B component optimized image based on the B component low-frequency suppression image and the B high-frequency component enhanced image;
and synthesizing and generating a processed face image based on the R component optimized image, the G component optimized image and the B component optimized image.
In the scheme of the embodiment of the invention, the video processing method is used for a video conference system, the video conference system comprises video conference equipment and display equipment, and the method comprises the following steps: detecting sound generated by a sound source of the conference space and outputting a positioning signal; judging whether a real face image exists in the sub-image block of the conference image corresponding to the sound source according to the positioning signal so as to process the face image and output the processed face image; and displaying a video conference picture comprising a human face image according to the image signal. The method comprises the steps of separating three different color channels of a human face RGB image to obtain an R component image, a G component image and a B component image, and obtaining an R low-frequency component image, an R high-frequency component image, a G low-frequency component image, a G high-frequency component image, a B low-frequency component image and a B high-frequency component image which respectively correspond to the R component image, the G component image and the B component image; a low-frequency optimization step, namely performing low-frequency suppression on the R component image, the G component image and the B component image to obtain an R component low-frequency suppression image, a G component low-frequency suppression image and a B component low-frequency suppression image which respectively correspond to the R component, the G component and the B component; a high-frequency optimization step of enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image; and a generating step of synthesizing the processed face image based on the R component low-frequency suppression image, the G component low-frequency suppression image and the B component low-frequency suppression image. 
By suppressing the low-frequency component, which mainly carries the background, and enhancing the high-frequency component images based on the illumination parameter, the invention obtains a more accurate pre-processed image and solves the technical problem that uneven illumination and dark environments make high-accuracy image display and processing difficult.
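A minimal sketch of the suppression, enhancement, and synthesis steps follows. The suppression factor `alpha` and the additive recombination are assumptions; the text specifies only that the incident-light parameter r0 = K / average gray value scales the high-frequency part and that the low-frequency part is suppressed.

```python
import numpy as np

def optimize_channel(low, high, k=128.0, alpha=0.5):
    """Suppress the low-frequency part and enhance the high-frequency
    part of one colour channel (alpha and the recombination are assumed)."""
    r0 = k / max(low.mean(), 1e-6)   # incident-light parameter r0 = K / average gray
    suppressed = alpha * low          # low-frequency suppression
    enhanced = r0 * high              # high-frequency enhancement
    return np.clip(suppressed + enhanced, 0.0, 255.0)

def synthesize(components, **kwargs):
    """Recombine the optimized R, G and B channels into the processed face image."""
    channels = [optimize_channel(*components[name], **kwargs) for name in "RGB"]
    return np.stack(channels, axis=-1)
```

Note that r0 grows as the average gray value falls, so darker images receive stronger high-frequency (detail) amplification, which matches the stated goal of handling dark, unevenly lit scenes.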
It should be noted that the division of the above apparatus into modules is only a logical division; in an actual implementation the modules may be wholly or partially integrated into one physical entity, or may be physically separate. The modules may all be implemented as software invoked by a processing element, entirely in hardware, or partly as software invoked by a processing element and partly in hardware. For example, the determining module 310 may be a separately arranged processing element, may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code that a processing element of the apparatus calls to execute the function of the determining module 310. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal-processing capability. In implementation, each step of the above method, or each of the above modules, may be completed by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor that can call program code. As another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).
The bus 130 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus 130 may be divided into an address bus, a data bus, a control bus, and so on. Although the bus is drawn as a single line in the figures of the present application for ease of illustration, this does not mean that there is only one bus or one type of bus.
In addition, an embodiment of the invention further provides a readable storage medium storing computer-executable instructions which, when executed by a processor, implement the video processing method described above.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is intended to be presented by way of example only and is not limiting. Various modifications, improvements, and amendments may occur to those skilled in the art, though not expressly stated herein. Such modifications, improvements, and amendments are suggested by this specification and still fall within the spirit and scope of its exemplary embodiments.
Also, this specification uses specific words to describe its embodiments. Terms such as "one possible implementation," "one possible example," and/or "exemplary" mean that a particular feature, structure, or characteristic described in connection with at least one embodiment of the specification is included. Therefore, it is emphasized that two or more references to "one possible implementation," "one possible example," and/or "exemplary" in various places throughout this specification are not necessarily referring to the same embodiment. Furthermore, particular features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of this specification may be illustrated and described in terms of several patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of this specification may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in a combination of hardware and software, which may generally be referred to herein as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of this specification may take the form of a computer product, including computer-readable program code, embodied in one or more computer-readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of this specification may be written in any one or more programming languages, including object-oriented languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and Python; conventional procedural languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, and ABAP; dynamic languages such as Python, Ruby, and Groovy; or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or a big-data platform. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN); the connection may also be made to an external computer (for example, through the Internet), in a cloud computing environment, or as a service such as software as a service (SaaS).
Additionally, the order in which elements and sequences are processed, the use of alphanumeric characters, or the use of other designations in this specification is not intended to limit the order of the processes and methods of this specification, unless otherwise specified in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented through interactive services, they may also be implemented through software-only solutions, such as installing the described system on an existing big-data platform or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of this specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in each claim. Indeed, claimed embodiments may lie in less than all features of a single embodiment disclosed above.
It is to be understood that the descriptions, definitions and/or uses of terms in the accompanying materials of this specification shall control if they are inconsistent or contrary to the descriptions and/or uses of terms in this specification.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (10)

1. A video processing method for use in a video conferencing system including a video conferencing device and a display device, the method comprising:
detecting sound generated by a sound source of the conference space and outputting a positioning signal;
judging whether a real face image exists in the sub-image block of the conference image corresponding to the sound source according to the positioning signal so as to process the face image and output the processed face image;
and displaying a video conference picture comprising a human face image according to the image signal.
2. The video processing method according to claim 1, wherein processing the face image and outputting the processed image signal comprises:
separating three different color channels of a human face RGB image to obtain an R component image, a G component image and a B component image, and obtaining an R low-frequency component image, an R high-frequency component image, a G low-frequency component image, a G high-frequency component image, a B low-frequency component image and a B high-frequency component image which respectively correspond to the R component image, the G component image and the B component image;
and respectively carrying out optimization processing on the low-frequency component image and the high-frequency component image.
3. The video processing method according to claim 2, wherein the performing optimization processing on the low-frequency component image and the high-frequency component image respectively comprises:
respectively executing low-frequency suppression on the R low-frequency component image, the G low-frequency component image and the B low-frequency component image to obtain an R component low-frequency suppression image, a G component low-frequency suppression image and a B component low-frequency suppression image which respectively correspond to the R low-frequency component image, the G low-frequency component image and the B low-frequency component image;
and enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image.
4. The video processing method according to claim 3, wherein the performing optimization processing on the low-frequency component image and the high-frequency component image respectively comprises:
acquiring incident light parameters, and respectively enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
the incident light parameter r0Set to a constant K/average gray value of the image;
and multiplying the R high-frequency component image, the G high-frequency component image and the B high-frequency component image respectively by the incident light parameter r0 to obtain a corresponding R high-frequency component enhanced image, G high-frequency component enhanced image and B high-frequency component enhanced image.
5. The video processing method according to claim 4, wherein the generating step of synthesizing the processed face image based on the R component low-frequency suppression image, the G component low-frequency suppression image, and the B component low-frequency suppression image comprises:
generating an R component optimized image based on the R component low-frequency suppression image and the R high-frequency component enhanced image;
generating a G component optimized image based on the G component low-frequency suppression image and the G high-frequency component enhanced image;
generating a B component optimized image based on the B component low-frequency suppression image and the B high-frequency component enhanced image;
and synthesizing and generating a processed face image based on the R component optimized image, the G component optimized image and the B component optimized image.
6. A video processing apparatus for use in a video conference system including a video conference device and a display device, the apparatus comprising:
the detection module is used for detecting the sound generated by the sound source of the conference space and outputting a positioning signal;
the processing module is used for judging, according to the positioning signal, whether a real face image exists in the sub-image block of the conference image corresponding to the sound source, so as to process the face image and output the processed face image;
and the display module displays a video conference picture comprising a human face image according to the image signal.
7. The video processing apparatus according to claim 6, wherein said processing the face image and outputting the processed face image comprises:
separating three different color channels of a human face RGB image to obtain an R component image, a G component image and a B component image, and obtaining an R low-frequency component image, an R high-frequency component image, a G low-frequency component image, a G high-frequency component image, a B low-frequency component image and a B high-frequency component image which respectively correspond to the R component image, the G component image and the B component image;
and respectively carrying out optimization processing on the low-frequency component image and the high-frequency component image.
8. The video processing apparatus according to claim 7, wherein the performing optimization processing on each of the low-frequency component image and the high-frequency component image comprises:
respectively executing low-frequency suppression on the R low-frequency component image, the G low-frequency component image and the B low-frequency component image to obtain an R component low-frequency suppression image, a G component low-frequency suppression image and a B component low-frequency suppression image which respectively correspond to the R low-frequency component image, the G low-frequency component image and the B low-frequency component image;
and enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image.
9. The video processing apparatus according to claim 8, wherein the performing optimization processing on each of the low-frequency component image and the high-frequency component image includes:
acquiring incident light parameters, and respectively enhancing the R high-frequency component image, the G high-frequency component image and the B high-frequency component image;
the incident light parameter r0Set to a constant K/average gray value of the image;
and multiplying the R high-frequency component image, the G high-frequency component image and the B high-frequency component image respectively by the incident light parameter r0 to obtain a corresponding R high-frequency component enhanced image, G high-frequency component enhanced image and B high-frequency component enhanced image.
10. The video processing apparatus according to claim 9, wherein the generating step of synthesizing the processed face image based on the R component low-frequency suppression image, the G component low-frequency suppression image, and the B component low-frequency suppression image comprises:
generating an R component optimized image based on the R component low-frequency suppression image and the R high-frequency component enhanced image;
generating a G component optimized image based on the G component low-frequency suppression image and the G high-frequency component enhanced image;
generating a B component optimized image based on the B component low-frequency suppression image and the B high-frequency component enhanced image;
and synthesizing and generating a processed face image based on the R component optimized image, the G component optimized image and the B component optimized image.
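The incident-light parameter of claims 4 and 9 can be illustrated with made-up numbers; the constant K, the average gray value, and the 2x2 high-frequency component below are all hypothetical.

```python
import numpy as np

K = 128.0                              # hypothetical constant K
average_gray = 64.0                    # hypothetical average gray value of the image
r0 = K / average_gray                  # incident-light parameter r0, here 2.0

high_r = np.array([[3.0, -1.0],
                   [0.5,  2.0]])       # a tiny R high-frequency component image
enhanced_r = r0 * high_r               # element-wise multiplication per claims 4 and 9
# enhanced_r is now [[6.0, -2.0], [1.0, 4.0]]
```

The same multiplication is applied to the G and B high-frequency component images with the same r0, since r0 depends only on the image's average gray value.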
CN202011602544.7A 2020-12-29 2020-12-29 Video processing method and device Pending CN112651350A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011602544.7A CN112651350A (en) 2020-12-29 2020-12-29 Video processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011602544.7A CN112651350A (en) 2020-12-29 2020-12-29 Video processing method and device

Publications (1)

Publication Number Publication Date
CN112651350A true CN112651350A (en) 2021-04-13

Family

ID=75364091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011602544.7A Pending CN112651350A (en) 2020-12-29 2020-12-29 Video processing method and device

Country Status (1)

Country Link
CN (1) CN112651350A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1713717A (en) * 2004-06-25 2005-12-28 北京中星微电子有限公司 Digital sound control orienting method for camera site of camera
CN1917623A (en) * 2005-08-17 2007-02-21 索尼株式会社 Camera controller and teleconferencing system
CN101080000A (en) * 2007-07-17 2007-11-28 华为技术有限公司 Method, system, server and terminal for displaying speaker in video conference
CN102368816A (en) * 2011-12-01 2012-03-07 中科芯集成电路股份有限公司 Intelligent front end system of video conference
CN102750682A (en) * 2012-07-17 2012-10-24 中国矿业大学(北京) Image preprocessing method for processing nonuniform illumination of miner face image and coal surface
CN102903081A (en) * 2012-09-07 2013-01-30 西安电子科技大学 Low-light image enhancement method based on red green blue (RGB) color model
CN109118444A (en) * 2018-07-26 2019-01-01 东南大学 A kind of regularization facial image complex illumination minimizing technology based on character separation
CN111263106A (en) * 2020-02-25 2020-06-09 厦门亿联网络技术股份有限公司 Picture tracking method and device for video conference

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Shangguan Wei et al.: "Improving the Quality of Ultrasound Medical Images by Comprehensive Use of Gray-Scale Transformation Methods" *
Zhang Zhilong et al.: "A New Histogram Equalization Algorithm That Preserves Image Details" *
Yang Tong et al.: "Optimization of Brachial Plexus Ultrasound Images Based on Deep Learning and Adaptive Contrast Enhancement" *

Similar Documents

Publication Publication Date Title
JP6615917B2 (en) Real-time video enhancement method, terminal, and non-transitory computer-readable storage medium
KR101621614B1 (en) Method and apparatus for enhancing digital image, and apparatus for image processing using the same
WO2014169579A1 (en) Color enhancement method and device
Jung et al. Optimized perceptual tone mapping for contrast enhancement of images
WO2022088976A1 (en) Image processing method and device
CN111583103B (en) Face image processing method and device, electronic equipment and computer storage medium
CN111899197A (en) Image brightening and denoising method and device, mobile terminal and storage medium
US20240046537A1 (en) Image processing method and apparatus, device and readable storage medium
CN111738950B (en) Image processing method and device
CN117218039A (en) Image processing method, device, computer equipment and storage medium
CN113538304A (en) Training method and device of image enhancement model, and image enhancement method and device
CN112651350A (en) Video processing method and device
CN112215237B (en) Image processing method and device, electronic equipment and computer readable storage medium
Wang et al. Nighttime image dehazing using color cast removal and dual path multi-scale fusion strategy
CN111784726A (en) Image matting method and device
CN110555799A (en) Method and apparatus for processing video
CN117474827A (en) Image definition detection method and device
Tao et al. An effective and robust underwater image enhancement method based on color correction and artificial multi-exposure fusion
CN112801997B (en) Image enhancement quality evaluation method, device, electronic equipment and storage medium
CN112613458A (en) Image preprocessing method and device for face recognition
CN114266803A (en) Image processing method, image processing device, electronic equipment and storage medium
CN106651815B (en) Method and device for processing Bayer format video image
Zhang et al. An adaptive tone mapping algorithm for high dynamic range images
CN111756954B (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN113379631B (en) Image defogging method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination