CN112541402A - Data processing method and device and electronic equipment

Data processing method and device and electronic equipment

Info

Publication number
CN112541402A
Authority
CN
China
Prior art keywords
video stream
stream data
portrait
portrait image
data
Prior art date
Legal status
Pending
Application number
CN202011312594.1A
Other languages
Chinese (zh)
Inventor
牛红霞
路呈璋
张爽
Current Assignee
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN202011312594.1A
Publication of CN112541402A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems


Abstract

An embodiment of the invention provides a data processing method and apparatus and an electronic device, wherein the method comprises the following steps: acquiring first video stream data of a video conference collected by a recording device; performing portrait recognition on the first video stream data to obtain a portrait image; and displaying the portrait image when the first video stream data is displayed. According to the embodiment of the invention, the portrait image is displayed at the same time as the first video stream data on the recording device, so that personalized content is provided, the content of the video conference is enriched, and the communication experience of the video conference is improved.

Description

Data processing method and device and electronic equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, and an electronic device.
Background
In recent years, recording devices have developed rapidly and, from being products for professional fields, have entered everyday public use. Journalists, students, teachers and other groups generally need recording devices, and the recording of television programs, movies, music and the like also requires them.
In many scenarios, besides being used for recording, a recording device can also be used to hold a video conference through third-party video conferencing software. However, during a current video conference only the video stream data collected by the camera is displayed, so the content shown on the recording device lacks personalized content, resulting in a poor communication experience for the video conference.
Disclosure of Invention
The embodiment of the invention provides a data processing method, which is used for identifying and displaying a portrait image from video stream data during a video conference, so that personalized content is provided, the content of the video conference is enriched, and the communication experience of the video conference is improved.
Correspondingly, the embodiment of the invention also provides a data processing device and electronic equipment, which are used for ensuring the realization and application of the method.
In order to solve the above problem, an embodiment of the present invention discloses a data processing method, which specifically includes:
acquiring first video stream data of a video conference acquired by the recording equipment;
carrying out portrait recognition on the first video stream data to obtain a portrait image;
and displaying the portrait image when the first video stream data is displayed.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
receiving a starting instruction of portrait recognition;
and starting portrait recognition according to the starting instruction.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
identifying lip feature data from the portrait image;
judging whether the lip feature data meet preset conditions or not;
when a preset condition is met, determining the portrait image as a target portrait image;
and adding a visual identifier on the target portrait image.
Optionally, the identifying lip feature data from the portrait image includes:
extracting identity data from the first video stream data;
when the identity data changes, lip feature data are identified from the portrait image.
Optionally, after the performing portrait recognition on the first video stream data to obtain a portrait image, the method further includes:
transmitting the portrait image and the first video stream data to an electronic device;
and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
Optionally, the displaying the portrait image while displaying the first video stream data includes:
before the electronic equipment enters a video conference, displaying the portrait image when displaying the first video stream data.
Optionally, the displaying the portrait image while displaying the first video stream data further includes:
after the electronic equipment enters a video conference, acquiring second video stream data sent by the electronic equipment;
combining the first video stream data and the portrait images into a window video;
and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
Optionally, the identity data is a voiceprint.
The embodiment of the invention also discloses a data processing method which is applied to the electronic equipment and comprises the following steps:
receiving first video stream data and a portrait image of a video conference sent by the recording equipment; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data;
and displaying the portrait image when the first video stream data is displayed.
The embodiment of the invention also discloses a data processing device, which is applied to the recording equipment, and the device comprises:
the video stream data acquisition module is used for acquiring first video stream data of the video conference acquired by the recording equipment;
the portrait recognition module is used for carrying out portrait recognition on the first video stream data to obtain a portrait image;
and the video stream data display module is used for displaying the portrait image when the first video stream data is displayed.
Optionally, the apparatus further comprises: the portrait recognition starting module is used for receiving a portrait recognition starting instruction; and starting portrait recognition according to the starting instruction.
Optionally, the apparatus further comprises: the visual identifier adding module is used for identifying lip feature data from the portrait image; judging whether the lip feature data meet a preset condition; when the preset condition is met, determining the portrait image as a target portrait image; and adding a visual identifier on the target portrait image.
Optionally, the visual identifier adding module is configured to extract identity data from the first video stream data; when the identity data changes, lip feature data are identified from the portrait image.
Optionally, the apparatus further comprises: the video stream data transmission module is used for transmitting the portrait image and the first video stream data to the electronic equipment; and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
Optionally, the video stream data display module is configured to display the portrait image when the first video stream data is displayed before the electronic device enters the video conference.
Optionally, the video stream data display module is configured to obtain second video stream data sent by the electronic device after the electronic device enters a video conference; combining the first video stream data and the portrait images into a window video; and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
Optionally, the identity data is a voiceprint.
The embodiment of the invention also discloses a data processing device, which is applied to electronic equipment, and the device comprises:
the video stream data receiving module is used for receiving first video stream data and a portrait image of the video conference sent by the recording equipment; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data;
and the video stream data display module is used for displaying the portrait image when the first video stream data is displayed.
The embodiment of the invention also discloses a readable storage medium, and when the instructions in the storage medium are executed by a processor of the electronic equipment, the electronic equipment can execute the data processing method according to any one of the embodiments of the invention.
The embodiment of the invention also discloses a sound recording device, which comprises a memory and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs are configured to be executed by one or more processors and comprise instructions for: acquiring first video stream data of a video conference acquired by the recording equipment; carrying out portrait recognition on the first video stream data to obtain a portrait image; and displaying the portrait image when the first video stream data is displayed.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
receiving a portrait recognition starting instruction;
and starting portrait recognition according to the starting instruction.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
identifying lip feature data from the portrait image;
judging whether the lip feature data meet preset conditions or not;
when the preset condition is met, determining the portrait image as a target portrait image;
and adding a visual identifier on the target portrait image.
Optionally, the identifying lip feature data from the portrait image includes:
extracting identity data from the first video stream data;
when the identity data changes, lip feature data are identified from the portrait image.
Optionally, after the performing portrait recognition on the first video stream data to obtain a portrait image, the method further includes:
transmitting the portrait image and the first video stream data to an electronic device;
and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
Optionally, the displaying the portrait image while displaying the first video stream data includes:
before the electronic equipment enters a video conference, displaying the portrait image when displaying the first video stream data.
Optionally, the displaying the portrait image while displaying the first video stream data further includes:
after the electronic equipment enters a video conference, acquiring second video stream data sent by the electronic equipment;
combining the first video stream data and the portrait images into a window video;
and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
Optionally, the identity data is a voiceprint.
An embodiment of the present invention also discloses an electronic device, including a memory, and one or more programs, where the one or more programs are stored in the memory, and configured to be executed by one or more processors, and the one or more programs include instructions for: receiving first video stream data and a portrait image of a video conference sent by the recording equipment; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data; and displaying the portrait image when the first video stream data is displayed.
The embodiment of the invention has the following advantages:
in the embodiment of the invention, in the video conference process, the portrait identification is carried out on the first video streaming data of the video conference collected by the recording equipment to obtain the portrait image, and then the portrait image can be displayed simultaneously when the first video streaming data is displayed on the recording equipment, so that the personalized content is provided, the content of the video conference is enriched, and the communication experience of the video conference is improved.
Furthermore, in the embodiment of the invention, lip recognition can be performed on each obtained portrait image; when the lip feature data meet the preset condition, the portrait image is determined as the target portrait image and a visual identifier is added to distinguish it from the other portrait images, so that the participant who is currently speaking can be noticed quickly and the content of the video conference can be better absorbed.
Drawings
FIG. 1 is a flow chart of the steps of one data processing method embodiment of the present invention;
FIG. 2 is a schematic diagram of video conferencing software on a recording device of the present invention;
FIG. 3 is a flow chart of the processing steps of a visual marker of the present invention;
FIG. 4a is a schematic illustration of a display of the present invention after activation of a portrait recognition function;
FIG. 4b is a schematic illustration of a display of the present invention with the portrait recognition function turned off;
FIG. 5a is a schematic view of a display of an electronic device of the present invention before entering a video conference;
FIG. 5b is a schematic view of a display of an electronic device of the present invention after entering a video conference;
FIG. 6 is a flow chart of the steps of yet another alternative embodiment of a data processing method of the present invention;
FIG. 7 is a block diagram of an embodiment of a data processing apparatus of the present invention;
FIG. 8 is a block diagram of an alternate embodiment of a data processing apparatus of the present invention;
FIG. 9 illustrates a block diagram of an electronic device for data processing in accordance with an exemplary embodiment.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
The embodiment of the invention provides a data processing method, which is applied to recording equipment, wherein the recording equipment can be equipment with a recording function, such as a recording pen, translation equipment such as a translation pen, a translator and the like; the embodiments of the present invention are not limited in this regard.
The recording device is provided with an image acquisition module. Therefore, when the recording device holds a video conference, portrait recognition can be performed on the first video stream data collected by the image acquisition module to obtain a portrait image, and the portrait image can be displayed at the same time as the first video stream data on the recording device, thereby enriching the content of the video conference and improving its communication experience.
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention is shown, which may specifically include the following steps:
and 102, acquiring first video stream data of the video conference, which is acquired by the recording equipment.
In the embodiment of the invention, when a user needs to carry out a video conference, the video conference function of the recording equipment can be started so as to carry out the video conference based on the recording equipment. Specifically, the recording device may be installed with third-party video conference software, and the user may start the video conference software on the recording device, so that a video conference may be performed based on the video conference software.
Referring to fig. 2, one or more pieces of video conference software, such as video conference software 1, video conference software 2, and video conference software 3 of fig. 2, may be installed on the recording apparatus, and a user may select one of the video conference software to initiate or enter a video conference in preparation for starting the video conference.
In a specific example, the image acquisition module may be a camera, and after video conference software is started on the recording device, the recording device may call the camera in real time to shoot in the process of a video conference, so as to obtain first video stream data.
And step 104, performing portrait recognition on the first video stream data to obtain a portrait image.
In a specific implementation, a plurality of participants usually take part in a video conference, so portrait recognition can be performed on the first video stream data to obtain the portrait images corresponding to the participants. For example, assuming that the participants include participant A, participant B and participant C, after portrait recognition is performed on the first video stream data, portrait images corresponding to participant A, participant B and participant C can be obtained.
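As a rough illustration of the portrait recognition in step 104, the sketch below crops one portrait image per detected face from a single frame of the first video stream. It is an illustrative Python sketch only: OpenCV's bundled Haar-cascade face detector, the extract_portrait_images helper and the camera index 0 are stand-ins and assumptions, not the portrait-recognition model of the embodiment.

    import cv2

    # Stock Haar cascade, used purely as a stand-in for the embodiment's
    # (unspecified) portrait-recognition model.
    face_detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def extract_portrait_images(frame):
        """Return one cropped portrait image per face found in a video frame."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        # Each (x, y, w, h) box becomes one portrait crop, e.g. participants A, B, C.
        return [frame[y:y + h, x:x + w] for (x, y, w, h) in faces]

    capture = cv2.VideoCapture(0)   # camera supplying the first video stream data
    ok, frame = capture.read()
    portraits = extract_portrait_images(frame) if ok else []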
And 106, displaying the portrait image when the first video stream data is displayed.
In the embodiment of the invention, after the recording device collects the first video stream data and identifies the portrait image from it, the first video stream data and the portrait image can be displayed on the recording device, so that the participants of the parties in the video conference can be identified on the recording device. Specifically, the first video stream data may be displayed full screen on the display of the recording device, and the portrait image may then be displayed at a certain position on the display, such as the middle-lower position.
It should be noted that, in the embodiment of the present invention, portrait recognition may be performed throughout the whole video conference, or only under a specified condition. For example, portrait recognition may be performed on the first video stream data only within 1 minute after the video conference starts and not afterwards, thereby reducing unnecessary system overhead; alternatively, portrait recognition may be stopped once the portraits of all participants have been recognized. Of course, the above specified conditions are only examples; in practical implementation they may be adjusted according to the actual situation, and the embodiment of the present invention is not limited to these conditions.
As a specific example of the present invention, since the picture corresponding to the first video stream data changes constantly, the captured portrait images also change constantly, so the portrait image displayed on the recording device could change constantly as well. Considering that the displayed first video data stream is already a dynamically changing picture, if the portrait images also changed continuously the participants' attention would be scattered. Therefore, in the embodiment of the present invention, one portrait image can be selected for each participant and displayed, and it is not changed during the video conference. The portrait image may be selected, for example, by choosing a high-definition portrait image according to a preset algorithm, or it may be chosen by the participants themselves through the recording device; the embodiment of the present invention does not limit this.
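One possible reading of such a "preset algorithm" is a highest-definition-wins rule, sketched below. The Laplacian-variance sharpness score and the participant_id key are assumptions introduced only for illustration; the embodiment does not prescribe how definition is measured or how participants are labelled.

    import cv2

    best_portraits = {}   # participant_id -> (sharpness score, portrait crop)

    def sharpness(portrait):
        """Crude sharpness score: variance of the Laplacian of the grayscale crop."""
        gray = cv2.cvtColor(portrait, cv2.COLOR_BGR2GRAY)
        return cv2.Laplacian(gray, cv2.CV_64F).var()

    def update_portrait(participant_id, portrait):
        """Keep the new crop only if it is sharper than the one already stored,
        so the displayed portrait stays fixed once a good image has been found."""
        score = sharpness(portrait)
        stored = best_portraits.get(participant_id)
        if stored is None or score > stored[0]:
            best_portraits[participant_id] = (score, portrait)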
According to the data processing method, in the video conference process, the portrait identification is carried out on the first video streaming data of the video conference collected by the recording equipment to obtain the portrait image, and then the portrait image can be displayed simultaneously when the first video streaming data is displayed on the recording equipment, so that the content of the video conference is enriched, and the communication experience of the video conference is improved.
In an optional embodiment of the present invention, the step 104 of performing portrait recognition on the first video stream data to obtain a portrait image further includes:
receiving a portrait recognition starting instruction;
and starting portrait recognition according to the starting instruction.
In the embodiment of the invention, the recording equipment is provided with a portrait recognition function, and after receiving a portrait recognition starting instruction, the portrait recognition can be started according to the starting instruction, so that after the recording equipment collects first video stream data, the portrait recognition can be carried out on the first video stream data, and further a portrait image can be obtained.
In the above optional embodiment, when a portrait recognition start instruction is received, the corresponding portrait recognition function is started, so that the user can selectively enable or disable the portrait recognition function according to actual requirements, improving the user experience and reducing unnecessary system overhead.
In an optional embodiment of the present invention, referring to fig. 3, after performing portrait recognition on the first video stream data to obtain a portrait image, the method further includes:
step 302, identifying lip feature data from the portrait image;
step 304, judging whether the lip feature data meet preset conditions;
step 306, when a preset condition is met, determining the portrait image as a target portrait image;
and 308, adding a visual identifier on the target portrait image.
During a video conference, participants usually speak to express their opinions, and speaking requires producing sound by opening and closing the lips. Therefore, in the embodiment of the invention, feature extraction is further performed on the portrait image to identify lip feature data, and the participant who is currently speaking in the video conference is thereby determined.
In the embodiment of the invention, the portrait image is input into a pre-trained portrait recognition model, and the model can output the corresponding lip feature data. Specifically, the preset condition may be that the lip feature data match preset lip feature data, where the preset lip feature data are lip feature data generated in advance to characterize a speaking participant; therefore, when the lip feature data match the preset lip feature data, it can be determined that the participant corresponding to the portrait image is speaking, and the portrait image can be determined as the target portrait image.
The visual identifier may be a special effect such as highlighting, framing, or magnification.
As a specific example, assuming that the portrait images include portrait image A, portrait image B and portrait image C, when it is determined that the lip feature data of portrait image A satisfy the preset condition, portrait image A is taken as the target portrait image, and a highlighting effect may then be added to portrait image A.
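The embodiment does not specify how the lip feature data are matched against the preset lip feature data. The sketch below assumes both are numeric feature vectors compared by cosine similarity against an assumed threshold of 0.8, which is just one plausible form the preset condition could take.

    import numpy as np

    SPEAKING_THRESHOLD = 0.8   # assumed threshold, not taken from the embodiment

    def satisfies_preset_condition(lip_features, preset_lip_features):
        """Treat the lip features as 'speaking' when they are close enough to the
        preset lip feature data that characterize a speaking participant."""
        a = np.asarray(lip_features, dtype=float)
        b = np.asarray(preset_lip_features, dtype=float)
        similarity = float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return similarity >= SPEAKING_THRESHOLD

    def pick_target_portraits(portraits_with_lip_features, preset_lip_features):
        """Return the portrait images to which a visual identifier should be added."""
        return [portrait for portrait, features in portraits_with_lip_features
                if satisfies_preset_condition(features, preset_lip_features)]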
In the above optional embodiment, the lip feature data of the portrait images are recognized to determine the target portrait image of the participant who is currently speaking, a visual identifier is then added to the target portrait image, and the portrait image carrying the visual identifier is displayed on the recording device, so that the participant who is currently speaking can be noticed quickly and the content of the video conference can be better absorbed.
In an optional embodiment of the present invention, the step 302 of identifying lip feature data from the portrait image includes:
extracting identity data from the first video stream data;
when the identity data changes, lip feature data are identified from the portrait image.
The identity data is used for uniquely identifying the corresponding participant. Specifically, the identity data may be a voiceprint, and since the voiceprint has specificity and stability, different speaking participants in the video conference can be identified through the voiceprint.
In the embodiment of the invention, the identity data is extracted from the first video stream data in real time, and if the identity data changes, which indicates that the participant who is speaking at present changes, the lip feature data can be further identified from the portrait image.
As a specific example, a voiceprint is extracted from the first video stream data in real time, and if a participant currently speaking in the video conference changes from participant a to participant B, it may be detected that the voiceprint changes, that is, the voiceprint of participant a is switched to the voiceprint of participant B, and lip feature data may be further identified from the portrait image.
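A minimal sketch of that trigger logic follows. The extract_voiceprint_id and identify_lip_features callables, and the chunk objects carrying audio and portrait_images fields, are hypothetical stand-ins for the voiceprint extraction and lip recognition described above.

    def monitor_speaker_changes(stream_chunks, extract_voiceprint_id, identify_lip_features):
        """Run lip-feature identification only when the identity data (voiceprint) changes,
        instead of on every chunk of the first video stream."""
        last_id = None
        for chunk in stream_chunks:                 # audio and video pieces of stream 1
            current_id = extract_voiceprint_id(chunk.audio)
            if current_id is not None and current_id != last_id:
                # e.g. the speaker changed from participant A to participant B
                identify_lip_features(chunk.portrait_images)
                last_id = current_id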
In the above optional embodiment, lip feature data are identified from the portrait images only when the identity data are detected to have changed, that is, when the speaking participant has changed, so that the portrait image corresponding to the new speaker can be found and given the visual identifier. Lip feature data therefore do not need to be identified throughout the entire video conference, which reduces unnecessary system overhead.
In an optional embodiment of the present invention, after performing portrait recognition on the first video stream data to obtain a portrait image in step 104, the method further includes:
transmitting the portrait image and the first video stream data to an electronic device;
and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
The electronic device may be a device that performs a video conference with a recording device, such as other recording devices, a translation device, a smart phone, a tablet device, or a computer, and the like.
In the embodiment of the present invention, after obtaining the portrait image from the first video stream data, the recording device may package the first video stream data and the portrait image together and send them to the other electronic devices participating in the video conference, so that after receiving them each electronic device displays the portrait image while displaying the first video stream data. Specifically, the first video stream data may be displayed full screen on the display of the electronic device, and the portrait image may then be displayed at a certain position on the display, such as the middle-lower position.
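The embodiment leaves the packing format open; the sketch below shows one assumed way to bundle a frame of the first video stream with its portrait crops (JPEG payloads behind a small length-prefixed JSON header) before sending them to the other electronic devices. The format is illustrative only, not the method actually used.

    import json
    import cv2

    def pack_frame_with_portraits(frame, portraits):
        """Serialize one frame of the first video stream together with its portrait images."""
        frame_jpg = cv2.imencode(".jpg", frame)[1]
        portrait_jpgs = [cv2.imencode(".jpg", p)[1] for p in portraits]
        header = json.dumps({
            "frame_bytes": int(frame_jpg.size),
            "portrait_bytes": [int(p.size) for p in portrait_jpgs],
        }).encode("utf-8")
        # 4-byte header length, then the header, then the raw JPEG payloads in order
        return b"".join([len(header).to_bytes(4, "big"), header,
                         frame_jpg.tobytes(), *[p.tobytes() for p in portrait_jpgs]])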
In the data processing method, in the video conference process, the portrait identification is carried out on the first video stream data of the video conference collected by the recording equipment to obtain the portrait image, and then the portrait image and the first video stream data are packed and transmitted to the electronic equipment together, so that the portrait image is displayed when the first video stream data is displayed on the electronic equipment, the content of the video conference is enriched, and the communication experience of the video conference is improved.
In an optional embodiment of the present invention, the displaying the portrait image while displaying the first video stream data includes: before the electronic device enters a video conference, displaying the portrait image when displaying the first video stream data.
Specifically, before the electronic device enters the video conference, if the portrait recognition function has been started, the portrait image can be recognized from the first video stream data collected by the recording device, and the first video stream data and the portrait image can then be displayed on the recording device.
In the above-described alternative embodiment, the display effect may be previewed on the sound recording apparatus in advance before transmitting the first video stream data and the portrait image to the electronic apparatus.
Referring to fig. 4a, after the portrait recognition function is started, the first video stream data and the portrait images are displayed on the recording device. The portrait images are displayed at the middle-lower position and comprise portrait image A, portrait image B and portrait image C; if the image corresponding to the speaking participant is portrait image B, portrait image B can be enlarged to indicate that the participant corresponding to portrait image B is speaking. After the portrait recognition function is turned off, the recording device no longer performs portrait recognition and the portrait images are no longer displayed on the recording device, as shown in fig. 4b.
It should be noted that, when the portrait images are displayed on the recording device in the embodiment of the present invention, they may be displayed in landscape or portrait screen orientation, which is not limited here. Optionally, the position of the portrait images on the display screen may be adjusted according to whether the recording device is in landscape or portrait orientation: for example, when the recording device is in portrait orientation the portrait images are placed horizontally in sequence, and when it is in landscape orientation they are placed vertically in sequence; alternatively, the portrait images may be placed horizontally in sequence regardless of orientation. This is likewise not limited by the embodiment of the present invention.
In an optional embodiment of the present invention, the displaying the portrait image while displaying the first video stream data further includes:
after the electronic equipment enters a video conference, acquiring second video stream data sent by the electronic equipment;
combining the first video stream data and the portrait images into a window video;
and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
The electronic device may be a device that performs a video conference with a recording device, such as other recording devices, a translation device, a smart phone, a tablet device, or a computer, and the like, which is not limited in this embodiment of the present invention.
Specifically, after the electronic device enters the video conference, the second video stream data sent by the electronic device can be received; the portrait image can be displayed at the same time as the second video stream data on the recording device, and the first video stream data and the portrait image of the recording device can be combined into a small window video, so that the second video stream data, the portrait image and the window video are displayed simultaneously on the recording device.
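A minimal compositing sketch of that display is given below. The picture-in-picture scale, the upper-right placement of the window video and the thumbnail size of the portrait strip are assumptions that only loosely follow fig. 5a.

    import cv2

    def compose_conference_frame(second_frame, first_frame, portraits,
                                 pip_scale=0.25, thumb_size=(96, 96)):
        """Remote (second) stream fills the screen; the local first stream shrinks into a
        small window in the upper right; the portrait strip sits along the lower middle."""
        canvas = second_frame.copy()
        h, w = canvas.shape[:2]

        # window video: reduced copy of the local first video stream
        pip = cv2.resize(first_frame, (int(w * pip_scale), int(h * pip_scale)))
        canvas[10:10 + pip.shape[0], w - 10 - pip.shape[1]:w - 10] = pip

        # portrait strip: fixed-size thumbnails centred near the bottom of the display
        thumbs = [cv2.resize(p, thumb_size) for p in portraits]
        x = (w - len(thumbs) * thumb_size[0]) // 2
        y = h - thumb_size[1] - 10
        for thumb in thumbs:
            canvas[y:y + thumb_size[1], x:x + thumb_size[0]] = thumb
            x += thumb_size[0]
        return canvas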
Referring to fig. 5a, after the portrait recognition function is started and the second video stream data of the electronic device is received, the screen switches to showing the second video stream data, and the portrait images including portrait image A, portrait image B and portrait image C are displayed. In addition, the first video stream data and the portrait image of the recording device may continue to be displayed on the recording device after being reduced in size, as shown in the upper right corner of fig. 5a. After the portrait recognition function is turned off, the recording device no longer performs portrait recognition, the portrait images are no longer displayed on the recording device, and the window video no longer includes the portrait image, as shown in fig. 5b.
In the above-described alternative embodiment, when the second video stream data sent by the electronic device is displayed on the recording device, the display effect of the portrait image can still be viewed, and the display effect can also be viewed in the video window.
Referring to fig. 6, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention is shown, which may specifically include the following steps:
step 602, receiving first video stream data and a portrait image of a video conference sent by the sound recording device; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data.
And step 604, displaying the portrait image when the first video stream data is displayed.
In the data processing method, during the video conference the electronic device can receive the first video stream data of the video conference collected by the recording device together with the portrait image that the recording device recognized from the first video stream data, and then display the portrait image when the first video stream data is displayed on the electronic device, so that the content of the video conference is enriched and the communication experience of the video conference is improved.
Of course, if the participant of the electronic device does not want to display the portrait image, the participant may also send a prompt message to the recording device to prompt the recording device to turn off the portrait recognition function, and further not transmit the portrait image while continuing to transmit the first video stream data.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 7, a block diagram of a data processing apparatus according to an embodiment of the present invention is shown, and is applied to a recording device, and may specifically include the following modules:
a video stream data obtaining module 702, configured to obtain first video stream data of a video conference collected by the sound recording device;
a portrait identification module 704, configured to perform portrait identification on the first video stream data to obtain a portrait image;
a video stream data display module 706, configured to display the portrait image when the first video stream data is displayed.
In an optional embodiment of the invention, the apparatus further comprises: the portrait recognition starting module is used for receiving a portrait recognition starting instruction; and starting portrait recognition according to the starting instruction.
In an optional embodiment of the invention, the apparatus further comprises: the visual identification adding module is used for identifying lip feature data from the portrait image; judging whether the lip feature data meet preset conditions or not; when the preset condition is met, determining the portrait image as a target portrait image; and adding a visual identifier on the target portrait image to distinguish the target portrait image from other portrait images.
In an optional embodiment of the present invention, the visual identifier adding module is configured to extract identity data from the first video stream data; when the identity data changes, lip feature data are identified from the portrait image.
In an optional embodiment of the invention, the apparatus further comprises: the video stream data transmission module is used for transmitting the portrait image and the first video stream data to the electronic equipment; and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
In an optional embodiment of the present invention, the video stream data display module 706 is configured to display the portrait image when the first video stream data is displayed before the electronic device enters the video conference.
In an optional embodiment of the present invention, the video stream data display module 706 is configured to obtain second video stream data sent by the electronic device after the electronic device enters a video conference; combining the first video stream data and the portrait images into a window video; and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
In an optional embodiment of the invention, the identity data is a voiceprint.
In the embodiment of the invention, in the video conference process, the portrait identification is carried out on the first video streaming data of the video conference collected by the recording equipment to obtain the portrait image, and then the portrait image can be displayed simultaneously when the first video streaming data is displayed on the recording equipment, so that the personalized content is provided, the content of the video conference is enriched, and the communication experience of the video conference is improved.
Referring to fig. 8, a block diagram of a data processing apparatus according to an embodiment of the present invention is shown, and is applied to an electronic device, and specifically includes the following modules:
a video stream data receiving module 802, configured to receive first video stream data and a portrait image of a video conference sent by the sound recording device; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data;
a video stream data display module 804, configured to display the portrait image when the first video stream data is displayed.
In the embodiment of the invention, during the video conference the electronic device can receive the first video stream data of the video conference collected by the recording device together with the portrait image recognized by the recording device from the first video stream data, and display the portrait image while the first video stream data is displayed on the electronic device, so that the content of the video conference is enriched and the communication experience of the video conference is improved.
Furthermore, in the embodiment of the invention, lip recognition can be performed on each obtained portrait image; when the lip feature data meet the preset condition, the portrait image is determined as the target portrait image and a visual identifier is added to distinguish it from the other portrait images, so that the participant who is currently speaking can be noticed quickly and the content of the video conference can be better absorbed.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
Fig. 9 is a block diagram illustrating an architecture of an electronic device 900 for data processing in accordance with an example embodiment. For example, the electronic device 900 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 9, electronic device 900 may include one or more of the following components: a processing component 902, a memory 904, a power component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
The processing component 902 generally controls overall operation of the electronic device 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 902 may include one or more processors 920 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 902 can include one or more modules that facilitate interaction between the processing component 902 and other components. For example, the processing component 902 can include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support operation at the device 900. Examples of such data include instructions for any application or method operating on the electronic device 900, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 904 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 906 provides power to the various components of the electronic device 900. Power components 906 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for electronic device 900.
The multimedia components 908 include a screen that provides an output interface between the electronic device 900 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 908 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 900 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 900 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 904 or transmitted via the communication component 916. In some embodiments, audio component 910 also includes a speaker for outputting audio signals.
I/O interface 912 provides an interface between processing component 902 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 914 includes one or more sensors for providing status evaluations of various aspects of the electronic device 900. For example, the sensor component 914 may detect an open/closed state of the device 900 and the relative positioning of components, such as the display and keypad of the electronic device 900; it may also detect a change in the position of the electronic device 900 or a component of the electronic device 900, the presence or absence of user contact with the electronic device 900, the orientation or acceleration/deceleration of the electronic device 900, and a change in the temperature of the electronic device 900. The sensor component 914 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor component 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate wired or wireless communication between the electronic device 900 and other devices. The electronic device 900 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 916 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 904 comprising instructions, executable by the processor 920 of the electronic device 900 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium in which instructions, when executed by a processor of an electronic device, enable the electronic device to perform a data processing method, the method comprising: receiving first video stream data and a portrait image of a video conference sent by the recording equipment; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data; and displaying the portrait image when the first video stream data is displayed.
In an alternative embodiment of the present invention, the electronic device 900 may be a recording device, and the recording device may be a recording pen, a translating pen, a translator, or the like. A non-transitory computer readable storage medium having instructions therein which, when executed by a processor of an audio recording device, enable the audio recording device to perform a data processing method, the method comprising: acquiring first video stream data of a video conference acquired by the recording equipment; carrying out portrait recognition on the first video stream data to obtain a portrait image; and displaying the portrait image when the first video stream data is displayed.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
receiving a portrait recognition starting instruction;
and starting portrait recognition according to the starting instruction.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
identifying lip feature data from the portrait image;
judging whether the lip feature data meet preset conditions or not;
when the preset condition is met, determining the portrait image as a target portrait image;
and adding a visual identifier on the target portrait image.
Optionally, the identifying lip feature data from the portrait image includes:
extracting identity data from the first video stream data;
when the identity data changes, lip feature data are identified from the portrait image.
Optionally, after the performing portrait recognition on the first video stream data to obtain a portrait image, the method further includes:
transmitting the portrait image and the first video stream data to an electronic device;
and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
Optionally, the displaying the portrait image while displaying the first video stream data includes:
before the electronic equipment enters a video conference, displaying the portrait image when displaying the first video stream data.
Optionally, the displaying the portrait image while displaying the first video stream data further includes:
after the electronic equipment enters a video conference, acquiring second video stream data sent by the electronic equipment;
combining the first video stream data and the portrait images into a window video;
and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
Optionally, the identity data is a voiceprint.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The data processing method, the data processing apparatus and the electronic device provided by the present invention are described in detail above, and specific examples are applied herein to illustrate the principles and embodiments of the present invention, and the descriptions of the above embodiments are only used to help understand the method and the core ideas of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A data processing method applied to a sound recording device, the method comprising:
acquiring first video stream data of a video conference collected by the sound recording device;
performing portrait recognition on the first video stream data to obtain a portrait image; and
displaying the portrait image while the first video stream data is displayed.
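The following is a minimal sketch of how the flow in claim 1 could look in practice, assuming OpenCV with its bundled Haar cascade as a stand-in face detector; the function name, window titles, and parameters are illustrative assumptions and are not taken from the patent.

```python
# Illustrative claim-1 flow: capture video stream data on the recording
# device, run portrait (face) detection on each frame, and display the
# detected portrait alongside the stream itself.
import cv2

def run_portrait_overlay(camera_index: int = 0) -> None:
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    capture = cv2.VideoCapture(camera_index)  # first video stream data
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for (x, y, w, h) in faces:
            portrait = frame[y:y + h, x:x + w]      # cropped portrait image
            cv2.imshow("portrait", portrait)        # shown alongside the stream
        cv2.imshow("first video stream", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    capture.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    run_portrait_overlay()
```

Any detector that maps a frame to portrait regions could replace the Haar cascade; the cascade is used here only because it ships with OpenCV and keeps the sketch self-contained.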
2. The method of claim 1, wherein performing portrait recognition on the first video stream data to obtain a portrait image further comprises:
receiving a portrait recognition start instruction; and
starting portrait recognition according to the start instruction.
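Claim 2 simply gates recognition behind an explicit start instruction. A minimal sketch of that gating follows; the class name and the instruction string are hypothetical and not part of the claim.

```python
# Illustrative gating of portrait recognition behind a start instruction.
# The instruction value "START_PORTRAIT_RECOGNITION" is an assumption.
class RecognitionController:
    def __init__(self) -> None:
        self.enabled = False

    def handle_instruction(self, instruction: str) -> None:
        # Enable recognition only once the start instruction is received.
        if instruction == "START_PORTRAIT_RECOGNITION":
            self.enabled = True

    def maybe_recognize(self, frame, recognize):
        # 'recognize' is any callable mapping a frame to portrait images.
        return recognize(frame) if self.enabled else []
```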
3. The method of claim 1, wherein performing portrait recognition on the first video stream data to obtain a portrait image further comprises:
identifying lip feature data from the portrait image;
judging whether the lip feature data meets a preset condition;
when the preset condition is met, determining the portrait image to be a target portrait image; and
adding a visual identifier to the target portrait image.
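One way to read claim 3 is as simple speaker detection: measure how open the mouth is over a short window of frames and, when the averaged movement exceeds a threshold (the "preset condition"), mark that portrait as the target. The sketch below follows that reading only as an assumption; the landmark layout, threshold, and window size are invented for illustration, and the lip landmarks would come from whatever face-landmark model the implementation uses.

```python
# Sketch of claim 3 read as lip-movement-based speaker detection (assumed
# interpretation). Lip landmarks are expected as (x, y) pixel coordinates.
from collections import deque
import cv2
import numpy as np

MOUTH_OPEN_THRESHOLD = 0.35   # assumed "preset condition"
WINDOW = 10                   # frames of lip history to average

def mouth_aspect_ratio(upper_lip, lower_lip, left_corner, right_corner):
    """Vertical lip gap relative to mouth width: larger means more open."""
    gap = np.linalg.norm(np.array(upper_lip) - np.array(lower_lip))
    width = np.linalg.norm(np.array(left_corner) - np.array(right_corner))
    return gap / max(width, 1e-6)

class SpeakerMarker:
    def __init__(self):
        self.history = deque(maxlen=WINDOW)

    def is_target(self, lip_points):
        # lip_points: dict with "upper", "lower", "left", "right" landmarks.
        ratio = mouth_aspect_ratio(lip_points["upper"], lip_points["lower"],
                                   lip_points["left"], lip_points["right"])
        self.history.append(ratio)
        return sum(self.history) / len(self.history) > MOUTH_OPEN_THRESHOLD

    def mark(self, frame, box):
        # Visual identifier on the target portrait: a highlighted bounding box.
        x, y, w, h = box
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, "speaking", (x, y - 8),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        return frame
```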
4. The method of claim 3, wherein identifying lip feature data from the portrait image comprises:
extracting identity data from the first video stream data; and
identifying the lip feature data from the portrait image when the identity data changes.
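Claim 4 triggers lip analysis only when the identity data extracted from the stream changes (for example, a different set of participants appears in the frame). A minimal state tracker for that trigger is sketched below; representing the identity data as a set of IDs, and the `extract_lip_features` helper, are assumptions made for illustration.

```python
# Illustrative trigger for claim 4: run lip-feature analysis only when the
# identity data extracted from the stream differs from the previous frame.
class IdentityChangeTrigger:
    def __init__(self) -> None:
        self.last_identities = None

    def should_analyze(self, identities: frozenset) -> bool:
        changed = identities != self.last_identities
        self.last_identities = identities
        return changed

# Usage sketch:
# trigger = IdentityChangeTrigger()
# if trigger.should_analyze(frozenset(current_ids)):
#     lip_features = extract_lip_features(portrait_image)  # hypothetical helper
```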
5. A data processing method applied to an electronic device, the method comprising:
receiving first video stream data of a video conference and a portrait image sent by a sound recording device, the portrait image being obtained by the sound recording device through portrait recognition on the first video stream data; and
displaying the portrait image while the first video stream data is displayed.
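On the receiving side, claim 5 amounts to accepting a frame and a portrait image from the recording device and rendering both. The transport-agnostic sketch below assumes JPEG-encoded payloads; the message format and function name are assumptions, since the claim does not specify a wire format.

```python
# Illustrative receiver for claim 5: decode a JPEG frame and a JPEG portrait
# sent by the recording device, then display them side by side.
import cv2
import numpy as np

def show_received(frame_jpeg: bytes, portrait_jpeg: bytes) -> None:
    frame = cv2.imdecode(np.frombuffer(frame_jpeg, np.uint8), cv2.IMREAD_COLOR)
    portrait = cv2.imdecode(np.frombuffer(portrait_jpeg, np.uint8), cv2.IMREAD_COLOR)
    if frame is None or portrait is None:
        return  # drop malformed messages
    # Resize the portrait to the frame height so the two can sit side by side.
    scale = frame.shape[0] / portrait.shape[0]
    portrait = cv2.resize(portrait, (int(portrait.shape[1] * scale), frame.shape[0]))
    cv2.imshow("video conference", np.hstack([frame, portrait]))
    cv2.waitKey(1)
```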
6. A data processing apparatus for use in a sound recording device, the apparatus comprising:
a video stream data acquisition module, configured to acquire first video stream data of a video conference collected by the sound recording device;
a portrait recognition module, configured to perform portrait recognition on the first video stream data to obtain a portrait image; and
a video stream data display module, configured to display the portrait image while the first video stream data is displayed.
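Claim 6 recasts the method of claim 1 as three cooperating modules. The skeleton below mirrors that decomposition; the module names follow the claim, while the class layout and the internals (left as placeholders) are illustrative assumptions.

```python
# Skeleton of the claim-6 apparatus: three modules wired in sequence.
class VideoStreamAcquisitionModule:
    def acquire(self):
        raise NotImplementedError  # e.g. read a frame from the device camera

class PortraitRecognitionModule:
    def recognize(self, frame):
        raise NotImplementedError  # e.g. detect faces and crop portrait images

class VideoStreamDisplayModule:
    def display(self, frame, portraits):
        raise NotImplementedError  # e.g. render the frame and portraits together

class DataProcessingApparatus:
    def __init__(self, acquisition, recognition, display):
        self.acquisition = acquisition
        self.recognition = recognition
        self.display = display

    def step(self):
        frame = self.acquisition.acquire()
        portraits = self.recognition.recognize(frame)
        self.display.display(frame, portraits)
```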
7. A data processing apparatus for use in an electronic device, the apparatus comprising:
a video stream data receiving module, configured to receive first video stream data of a video conference and a portrait image sent by a sound recording device, the portrait image being obtained by the sound recording device through portrait recognition on the first video stream data; and
a video stream data display module, configured to display the portrait image while the first video stream data is displayed.
8. An audio recording apparatus comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
acquiring first video stream data of a video conference collected by the audio recording apparatus;
performing portrait recognition on the first video stream data to obtain a portrait image; and
displaying the portrait image while the first video stream data is displayed.
9. An electronic device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
receiving first video stream data of a video conference and a portrait image sent by a sound recording device, the portrait image being obtained by the sound recording device through portrait recognition on the first video stream data; and
displaying the portrait image while the first video stream data is displayed.
10. A readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data processing method according to any one of claims 1 to 5.
CN202011312594.1A 2020-11-20 2020-11-20 Data processing method and device and electronic equipment Pending CN112541402A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011312594.1A CN112541402A (en) 2020-11-20 2020-11-20 Data processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011312594.1A CN112541402A (en) 2020-11-20 2020-11-20 Data processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112541402A true CN112541402A (en) 2021-03-23

Family

ID=75014508

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011312594.1A Pending CN112541402A (en) 2020-11-20 2020-11-20 Data processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112541402A (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104754285A (en) * 2013-12-26 2015-07-01 广达电脑股份有限公司 Video conference system
CN105005430A (en) * 2015-07-17 2015-10-28 深圳市金立通信设备有限公司 Window display method and terminal
CN108702483A (en) * 2016-02-19 2018-10-23 微软技术许可有限责任公司 Communication event
CN105893948A (en) * 2016-03-29 2016-08-24 乐视控股(北京)有限公司 Method and apparatus for face identification in video conference
CN105915798A (en) * 2016-06-02 2016-08-31 北京小米移动软件有限公司 Camera control method in video conference and control device thereof
CN108932519A (en) * 2017-05-23 2018-12-04 中兴通讯股份有限公司 A kind of meeting-place data processing, display methods and device and intelligent glasses
CN110505399A (en) * 2019-08-13 2019-11-26 聚好看科技股份有限公司 Control method, device and the acquisition terminal of Image Acquisition
CN111770299A (en) * 2020-04-20 2020-10-13 厦门亿联网络技术股份有限公司 Method and system for real-time face abstract service of intelligent video conference terminal
CN111651632A (en) * 2020-04-23 2020-09-11 深圳英飞拓智能技术有限公司 Method and device for outputting voice and video of speaker in video conference

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王忆勤 (Wang Yiqin): "Traditional Chinese Medicine Facial Diagnosis and Computer-Aided Diagnosis" (中医面诊与计算机辅助诊断), 30 November 2010, 上海科学技术出版社 (Shanghai Scientific and Technical Publishers), page 79 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113473061A (en) * 2021-06-10 2021-10-01 荣耀终端有限公司 Video call method and electronic equipment

Similar Documents

Publication Publication Date Title
CN106791893B (en) Video live broadcasting method and device
CN107105314B (en) Video playing method and device
US20170304735A1 (en) Method and Apparatus for Performing Live Broadcast on Game
WO2017181551A1 (en) Video processing method and device
US20170034430A1 (en) Video recording method and device
CN106210757A (en) Live broadcasting method, live broadcast device and live broadcast system
US10230891B2 (en) Method, device and medium of photography prompts
CN107743244B (en) Video live broadcasting method and device
CN110677734B (en) Video synthesis method and device, electronic equipment and storage medium
CN112114765A (en) Screen projection method and device and storage medium
CN106775403B (en) Method and device for acquiring stuck information
KR20170005783A (en) Method and apparatus for displaying conversation interface
CN106254939B (en) Information prompting method and device
CN112532931A (en) Video processing method and device and electronic equipment
CN107885016B (en) Holographic projection method and device
CN106791563B (en) Information transmission method, local terminal equipment, opposite terminal equipment and system
CN107247794B (en) Topic guiding method in live broadcast, live broadcast device and terminal equipment
CN107105311B (en) Live broadcasting method and device
CN112463274B (en) Interface adjustment method and device and electronic equipment
CN111355973B (en) Data playing method and device, electronic equipment and storage medium
EP3767624A1 (en) Method and apparatus for obtaining audio-visual information
CN112087653A (en) Data processing method and device and electronic equipment
CN112541402A (en) Data processing method and device and electronic equipment
CN112685599A (en) Video recommendation method and device
CN116758896A (en) Conference audio language adjustment method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination