CN112541402A - Data processing method and device and electronic equipment

Data processing method and device and electronic equipment

Info

Publication number
CN112541402A
Authority
CN
China
Prior art keywords
video stream
stream data
portrait
portrait image
data
Prior art date
Legal status
Pending
Application number
CN202011312594.1A
Other languages
Chinese (zh)
Inventor
牛红霞
路呈璋
张爽
Current Assignee
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN202011312594.1A
Publication of CN112541402A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems


Abstract

An embodiment of the invention provides a data processing method and apparatus and an electronic device, wherein the method comprises the following steps: acquiring first video stream data of a video conference collected by a recording device; performing portrait recognition on the first video stream data to obtain a portrait image; and displaying the portrait image when the first video stream data is displayed. According to the embodiment of the invention, the portrait image is displayed at the same time as the first video stream data on the recording device, so that personalized content is provided, the content of the video conference is enriched, and the communication experience of the video conference is improved.

Description

Data processing method and device and electronic equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, and an electronic device.
Background
In recent years, recording devices have developed rapidly and, from being products for professional fields, have entered everyday public use. Journalists, students, teachers and other groups generally need recording devices, and the recording of television programs, movies, music and the like also requires them.
In many scenarios, besides being used for recording, a recording device can also be used to hold a video conference through third-party video conferencing software. However, during a current video conference only the video stream data collected by the camera is displayed, so the content shown on the recording device lacks personalized content, resulting in a poor communication experience for the video conference.
Disclosure of Invention
The embodiment of the invention provides a data processing method, which is used for identifying and displaying a portrait image from video stream data during a video conference, so that personalized content is provided, the content of the video conference is enriched, and the communication experience of the video conference is improved.
Correspondingly, the embodiment of the invention also provides a data processing device and electronic equipment, which are used for ensuring the realization and application of the method.
In order to solve the above problem, an embodiment of the present invention discloses a data processing method, which specifically includes:
acquiring first video stream data of a video conference acquired by the recording equipment;
carrying out portrait recognition on the first video stream data to obtain a portrait image;
and displaying the portrait image when the first video stream data is displayed.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
receiving a starting instruction of portrait recognition;
and starting portrait recognition according to the starting instruction.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
identifying lip feature data from the portrait image;
judging whether the lip feature data meet preset conditions or not;
when a preset condition is met, determining the portrait image as a target portrait image;
and adding a visual identifier on the target portrait image.
Optionally, the identifying lip feature data from the portrait image includes:
extracting identity data from the first video stream data;
when the identity data changes, lip feature data are identified from the portrait image.
Optionally, after the performing portrait recognition on the first video stream data to obtain a portrait image, the method further includes:
transmitting the portrait image and the first video stream data to an electronic device;
and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
Optionally, the displaying the portrait image while displaying the first video stream data includes:
before the electronic equipment enters a video conference, displaying the portrait image when displaying the first video stream data.
Optionally, the displaying the portrait image while displaying the first video stream data further includes:
after the electronic equipment enters a video conference, acquiring second video stream data sent by the electronic equipment;
combining the first video stream data and the portrait images into a window video;
and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
Optionally, the identity data is a voiceprint.
The embodiment of the invention also discloses a data processing method which is applied to the electronic equipment and comprises the following steps:
receiving first video stream data and a portrait image of a video conference sent by the recording equipment; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data;
and displaying the portrait image when the first video stream data is displayed.
The embodiment of the invention also discloses a data processing device, which is applied to the recording equipment, and the device comprises:
the video stream data acquisition module is used for acquiring first video stream data of the video conference acquired by the recording equipment;
the portrait recognition module is used for carrying out portrait recognition on the first video stream data to obtain a portrait image;
and the video stream data display module is used for displaying the portrait image when the first video stream data is displayed.
Optionally, the apparatus further comprises: the portrait recognition starting module is used for receiving a portrait recognition starting instruction; and starting portrait recognition according to the starting instruction.
Optionally, the apparatus further comprises: the visual identifier adding module is used for identifying lip feature data from the portrait image; judging whether the lip feature data meet a preset condition; when the preset condition is met, determining the portrait image as a target portrait image; and adding a visual identifier on the target portrait image.
Optionally, the visual identifier adding module is configured to extract identity data from the first video stream data; when the identity data changes, lip feature data are identified from the portrait image.
Optionally, the apparatus further comprises: the video stream data transmission module is used for transmitting the portrait image and the first video stream data to the electronic equipment; and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
Optionally, the video stream data display module is configured to display the portrait image when the first video stream data is displayed before the electronic device enters the video conference.
Optionally, the video stream data display module is configured to obtain second video stream data sent by the electronic device after the electronic device enters a video conference; combining the first video stream data and the portrait images into a window video; and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
Optionally, the identity data is a voiceprint.
The embodiment of the invention also discloses a data processing device, which is applied to electronic equipment, and the device comprises:
the video stream data receiving module is used for receiving first video stream data and a portrait image of the video conference sent by the recording equipment; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data;
and the video stream data display module is used for displaying the portrait image when the first video stream data is displayed.
The embodiment of the invention also discloses a readable storage medium, and when the instructions in the storage medium are executed by a processor of the electronic equipment, the electronic equipment can execute the data processing method according to any one of the embodiments of the invention.
The embodiment of the invention also discloses a sound recording device, which comprises a memory and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs are configured to be executed by one or more processors and comprise instructions for: acquiring first video stream data of a video conference acquired by the recording equipment; carrying out portrait recognition on the first video stream data to obtain a portrait image; and displaying the portrait image when the first video stream data is displayed.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
receiving a portrait recognition starting instruction;
and starting portrait recognition according to the starting instruction.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
identifying lip feature data from the portrait image;
judging whether the lip feature data meet preset conditions or not;
when the preset condition is met, determining the portrait image as a target portrait image;
and adding a visual identifier on the target portrait image.
Optionally, the identifying lip feature data from the portrait image includes:
extracting identity data from the first video stream data;
when the identity data changes, lip feature data are identified from the portrait image.
Optionally, after the performing portrait recognition on the first video stream data to obtain a portrait image, the method further includes:
transmitting the portrait image and the first video stream data to an electronic device;
and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
Optionally, the displaying the portrait image while displaying the first video stream data includes:
before the electronic equipment enters a video conference, displaying the portrait image when displaying the first video stream data.
Optionally, the displaying the portrait image while displaying the first video stream data further includes:
after the electronic equipment enters a video conference, acquiring second video stream data sent by the electronic equipment;
combining the first video stream data and the portrait images into a window video;
and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
Optionally, the identity data is a voiceprint.
An embodiment of the present invention also discloses an electronic device, including a memory, and one or more programs, where the one or more programs are stored in the memory, and configured to be executed by one or more processors, and the one or more programs include instructions for: receiving first video stream data and a portrait image of a video conference sent by the recording equipment; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data; and displaying the portrait image when the first video stream data is displayed.
The embodiment of the invention has the following advantages:
in the embodiment of the invention, in the video conference process, the portrait identification is carried out on the first video streaming data of the video conference collected by the recording equipment to obtain the portrait image, and then the portrait image can be displayed simultaneously when the first video streaming data is displayed on the recording equipment, so that the personalized content is provided, the content of the video conference is enriched, and the communication experience of the video conference is improved.
Furthermore, in the embodiment of the invention, lip recognition can be performed on each obtained portrait image; when the lip feature data meet the preset condition, the portrait image is determined as the target portrait image and a visual identifier is added to distinguish it from the other portrait images, so that the participant who is currently speaking can be noticed quickly and the content of the video conference can be better absorbed.
Drawings
FIG. 1 is a flow chart of the steps of one data processing method embodiment of the present invention;
FIG. 2 is a schematic diagram of video conferencing software on a recording device of the present invention;
FIG. 3 is a flow chart of the processing steps of a visual marker of the present invention;
FIG. 4a is a schematic illustration of a display of the present invention after activation of a portrait recognition function;
FIG. 4b is a schematic illustration of a display of the present invention with the portrait recognition function turned off;
FIG. 5a is a schematic view of a display of an electronic device of the present invention before entering a video conference;
FIG. 5b is a schematic view of a display of an electronic device of the present invention after entering a video conference;
FIG. 6 is a flow chart of the steps of yet another alternative embodiment of a data processing method of the present invention;
FIG. 7 is a block diagram of an embodiment of a data processing apparatus of the present invention;
FIG. 8 is a block diagram of an alternate embodiment of a data processing apparatus of the present invention;
FIG. 9 illustrates a block diagram of an electronic device for data processing in accordance with an exemplary embodiment.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
The embodiment of the invention provides a data processing method, which is applied to recording equipment, wherein the recording equipment can be equipment with a recording function, such as a recording pen, translation equipment such as a translation pen, a translator and the like; the embodiments of the present invention are not limited in this regard.
The recording device is provided with an image acquisition module. Therefore, when the recording device holds a video conference, portrait recognition can be performed on the first video stream data collected by the image acquisition module to obtain a portrait image, and the portrait image can be displayed at the same time as the first video stream data on the recording device, thereby enriching the content of the video conference and improving its communication experience.
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention is shown, which may specifically include the following steps:
and 102, acquiring first video stream data of the video conference, which is acquired by the recording equipment.
In the embodiment of the invention, when a user needs to carry out a video conference, the video conference function of the recording equipment can be started so as to carry out the video conference based on the recording equipment. Specifically, the recording device may be installed with third-party video conference software, and the user may start the video conference software on the recording device, so that a video conference may be performed based on the video conference software.
Referring to fig. 2, one or more pieces of video conference software, such as video conference software 1, video conference software 2, and video conference software 3 of fig. 2, may be installed on the recording apparatus, and a user may select one of the video conference software to initiate or enter a video conference in preparation for starting the video conference.
In a specific example, the image acquisition module may be a camera, and after video conference software is started on the recording device, the recording device may call the camera in real time to shoot in the process of a video conference, so as to obtain first video stream data.
And step 104, performing portrait recognition on the first video stream data to obtain a portrait image.
In a specific implementation, a plurality of participants usually take part in a video conference, so portrait recognition can be performed on the first video stream data to obtain the portrait images corresponding to the participants. For example, assuming that the participants include participant A, participant B and participant C, after portrait recognition is performed on the first video stream data, portrait images corresponding to participant A, participant B and participant C can be obtained.
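As a rough illustration of the portrait recognition in step 104, the sketch below crops one portrait image per detected face from a single frame of the first video stream. It is an illustrative Python sketch only: OpenCV's bundled Haar-cascade face detector, the extract_portrait_images helper and the camera index 0 are stand-ins and assumptions, not the portrait-recognition model of the embodiment.

    import cv2

    # Stock Haar cascade, used purely as a stand-in for the embodiment's
    # (unspecified) portrait-recognition model.
    face_detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def extract_portrait_images(frame):
        """Return one cropped portrait image per face found in a video frame."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        # Each (x, y, w, h) box becomes one portrait crop, e.g. participants A, B, C.
        return [frame[y:y + h, x:x + w] for (x, y, w, h) in faces]

    capture = cv2.VideoCapture(0)   # camera supplying the first video stream data
    ok, frame = capture.read()
    portraits = extract_portrait_images(frame) if ok else []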
And 106, displaying the portrait image when the first video stream data is displayed.
In the embodiment of the invention, after the recording device collects the first video stream data and identifies the portrait image from it, the first video stream data and the portrait image can be displayed on the recording device, so that the participants of the parties in the video conference can be identified on the recording device. Specifically, the first video stream data may be displayed full screen on the display of the recording device, and the portrait image may then be displayed at a certain position on the display, such as the middle-lower position.
It should be noted that, in the embodiment of the present invention, portrait recognition may be performed throughout the whole video conference, or only under a specified condition. For example, portrait recognition may be performed on the first video stream data only within 1 minute after the video conference starts and not afterwards, thereby reducing unnecessary system overhead; alternatively, portrait recognition may be stopped once the portraits of all participants have been recognized. Of course, the above specified conditions are only examples; in practical implementation they may be adjusted according to the actual situation, and the embodiment of the present invention is not limited to these conditions.
As a specific example of the present invention, since the picture corresponding to the first video stream data changes constantly, the captured portrait images also change constantly, so the portrait image displayed on the recording device could change constantly as well. Considering that the displayed first video data stream is already a dynamically changing picture, if the portrait images also changed continuously the participants' attention would be scattered. Therefore, in the embodiment of the present invention, one portrait image can be selected for each participant and displayed, and it is not changed during the video conference. The portrait image may be selected, for example, by choosing a high-definition portrait image according to a preset algorithm, or it may be chosen by the participants themselves through the recording device; the embodiment of the present invention does not limit this.
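One possible reading of such a "preset algorithm" is a highest-definition-wins rule, sketched below. The Laplacian-variance sharpness score and the participant_id key are assumptions introduced only for illustration; the embodiment does not prescribe how definition is measured or how participants are labelled.

    import cv2

    best_portraits = {}   # participant_id -> (sharpness score, portrait crop)

    def sharpness(portrait):
        """Crude sharpness score: variance of the Laplacian of the grayscale crop."""
        gray = cv2.cvtColor(portrait, cv2.COLOR_BGR2GRAY)
        return cv2.Laplacian(gray, cv2.CV_64F).var()

    def update_portrait(participant_id, portrait):
        """Keep the new crop only if it is sharper than the one already stored,
        so the displayed portrait stays fixed once a good image has been found."""
        score = sharpness(portrait)
        stored = best_portraits.get(participant_id)
        if stored is None or score > stored[0]:
            best_portraits[participant_id] = (score, portrait)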
According to the data processing method, in the video conference process, the portrait identification is carried out on the first video streaming data of the video conference collected by the recording equipment to obtain the portrait image, and then the portrait image can be displayed simultaneously when the first video streaming data is displayed on the recording equipment, so that the content of the video conference is enriched, and the communication experience of the video conference is improved.
In an optional embodiment of the present invention, the step 104 of performing portrait recognition on the first video stream data to obtain a portrait image further includes:
receiving a portrait recognition starting instruction;
and starting portrait recognition according to the starting instruction.
In the embodiment of the invention, the recording equipment is provided with a portrait recognition function, and after receiving a portrait recognition starting instruction, the portrait recognition can be started according to the starting instruction, so that after the recording equipment collects first video stream data, the portrait recognition can be carried out on the first video stream data, and further a portrait image can be obtained.
In the above optional embodiment, when a portrait recognition start instruction is received, the corresponding portrait recognition function is started, so that the user can selectively enable or disable the portrait recognition function according to actual requirements, improving the user experience and reducing unnecessary system overhead.
In an optional embodiment of the present invention, referring to fig. 3, after performing portrait recognition on the first video stream data to obtain a portrait image, the method further includes:
step 302, identifying lip feature data from the portrait image;
step 304, judging whether the lip feature data meet preset conditions;
step 306, when a preset condition is met, determining the portrait image as a target portrait image;
and 308, adding a visual identifier on the target portrait image.
During a video conference, participants usually speak to express their opinions, and speaking requires producing sound by opening and closing the lips. Therefore, in the embodiment of the invention, feature extraction is further performed on the portrait image to identify lip feature data, and the participant who is currently speaking in the video conference is thereby determined.
In the embodiment of the invention, the portrait image is input into a pre-trained portrait recognition model, and the model can output the corresponding lip feature data. Specifically, the preset condition may be that the lip feature data match preset lip feature data, where the preset lip feature data are lip feature data generated in advance to characterize a speaking participant; therefore, when the lip feature data match the preset lip feature data, it can be determined that the participant corresponding to the portrait image is speaking, and the portrait image can be determined as the target portrait image.
The visual identifier may be a special effect such as highlighting, framing, or magnification.
As a specific example, assuming that the portrait images include portrait image A, portrait image B and portrait image C, when it is determined that the lip feature data of portrait image A satisfy the preset condition, portrait image A is taken as the target portrait image, and a highlighting effect may then be added to portrait image A.
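The embodiment does not specify how the lip feature data are matched against the preset lip feature data. The sketch below assumes both are numeric feature vectors compared by cosine similarity against an assumed threshold of 0.8, which is just one plausible form the preset condition could take.

    import numpy as np

    SPEAKING_THRESHOLD = 0.8   # assumed threshold, not taken from the embodiment

    def satisfies_preset_condition(lip_features, preset_lip_features):
        """Treat the lip features as 'speaking' when they are close enough to the
        preset lip feature data that characterize a speaking participant."""
        a = np.asarray(lip_features, dtype=float)
        b = np.asarray(preset_lip_features, dtype=float)
        similarity = float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return similarity >= SPEAKING_THRESHOLD

    def pick_target_portraits(portraits_with_lip_features, preset_lip_features):
        """Return the portrait images to which a visual identifier should be added."""
        return [portrait for portrait, features in portraits_with_lip_features
                if satisfies_preset_condition(features, preset_lip_features)]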
In the above optional embodiment, the lip feature data of the portrait images are recognized to determine the target portrait image of the participant who is currently speaking, a visual identifier is then added to the target portrait image, and the portrait image carrying the visual identifier is displayed on the recording device, so that the participant who is currently speaking can be noticed quickly and the content of the video conference can be better absorbed.
In an optional embodiment of the present invention, the step 302 of identifying lip feature data from the portrait image includes:
extracting identity data from the first video stream data;
when the identity data changes, lip feature data are identified from the portrait image.
The identity data is used for uniquely identifying the corresponding participant. Specifically, the identity data may be a voiceprint, and since the voiceprint has specificity and stability, different speaking participants in the video conference can be identified through the voiceprint.
In the embodiment of the invention, the identity data is extracted from the first video stream data in real time, and if the identity data changes, which indicates that the participant who is speaking at present changes, the lip feature data can be further identified from the portrait image.
As a specific example, a voiceprint is extracted from the first video stream data in real time, and if a participant currently speaking in the video conference changes from participant a to participant B, it may be detected that the voiceprint changes, that is, the voiceprint of participant a is switched to the voiceprint of participant B, and lip feature data may be further identified from the portrait image.
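A minimal sketch of that trigger logic follows. The extract_voiceprint_id and identify_lip_features callables, and the chunk objects carrying audio and portrait_images fields, are hypothetical stand-ins for the voiceprint extraction and lip recognition described above.

    def monitor_speaker_changes(stream_chunks, extract_voiceprint_id, identify_lip_features):
        """Run lip-feature identification only when the identity data (voiceprint) changes,
        instead of on every chunk of the first video stream."""
        last_id = None
        for chunk in stream_chunks:                 # audio and video pieces of stream 1
            current_id = extract_voiceprint_id(chunk.audio)
            if current_id is not None and current_id != last_id:
                # e.g. the speaker changed from participant A to participant B
                identify_lip_features(chunk.portrait_images)
                last_id = current_id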
In the above optional embodiment, lip feature data are identified from the portrait images only when the identity data are detected to have changed, that is, when the speaking participant has changed, so that the portrait image corresponding to the new speaker can be found and given the visual identifier. Lip feature data therefore do not need to be identified throughout the entire video conference, which reduces unnecessary system overhead.
In an optional embodiment of the present invention, after performing portrait recognition on the first video stream data to obtain a portrait image in step 104, the method further includes:
transmitting the portrait image and the first video stream data to an electronic device;
and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
The electronic device may be a device that performs a video conference with a recording device, such as other recording devices, a translation device, a smart phone, a tablet device, or a computer, and the like.
In the embodiment of the present invention, after obtaining the portrait image from the first video stream data, the recording device may package the first video stream data and the portrait image together and send them to the other electronic devices participating in the video conference, so that after receiving them each electronic device displays the portrait image while displaying the first video stream data. Specifically, the first video stream data may be displayed full screen on the display of the electronic device, and the portrait image may then be displayed at a certain position on the display, such as the middle-lower position.
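The embodiment leaves the packing format open; the sketch below shows one assumed way to bundle a frame of the first video stream with its portrait crops (JPEG payloads behind a small length-prefixed JSON header) before sending them to the other electronic devices. The format is illustrative only, not the method actually used.

    import json
    import cv2

    def pack_frame_with_portraits(frame, portraits):
        """Serialize one frame of the first video stream together with its portrait images."""
        frame_jpg = cv2.imencode(".jpg", frame)[1]
        portrait_jpgs = [cv2.imencode(".jpg", p)[1] for p in portraits]
        header = json.dumps({
            "frame_bytes": int(frame_jpg.size),
            "portrait_bytes": [int(p.size) for p in portrait_jpgs],
        }).encode("utf-8")
        # 4-byte header length, then the header, then the raw JPEG payloads in order
        return b"".join([len(header).to_bytes(4, "big"), header,
                         frame_jpg.tobytes(), *[p.tobytes() for p in portrait_jpgs]])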
In the data processing method, in the video conference process, the portrait identification is carried out on the first video stream data of the video conference collected by the recording equipment to obtain the portrait image, and then the portrait image and the first video stream data are packed and transmitted to the electronic equipment together, so that the portrait image is displayed when the first video stream data is displayed on the electronic equipment, the content of the video conference is enriched, and the communication experience of the video conference is improved.
In an optional embodiment of the present invention, the displaying the portrait image while displaying the first video stream data includes: before the electronic device enters a video conference, displaying the portrait image when displaying the first video stream data.
Specifically, before the electronic device enters the video conference, if the portrait recognition function has been started, the portrait image can be recognized from the first video stream data collected by the recording device, and the first video stream data and the portrait image can then be displayed on the recording device.
In the above-described alternative embodiment, the display effect may be previewed on the sound recording apparatus in advance before transmitting the first video stream data and the portrait image to the electronic apparatus.
Referring to fig. 4a, after the portrait recognition function is started, the first video stream data and the portrait images are displayed on the recording device. The portrait images are displayed at the middle-lower position and comprise portrait image A, portrait image B and portrait image C; if the image corresponding to the speaking participant is portrait image B, portrait image B can be enlarged to indicate that the participant corresponding to portrait image B is speaking. After the portrait recognition function is turned off, the recording device no longer performs portrait recognition and the portrait images are no longer displayed on the recording device, as shown in fig. 4b.
It should be noted that, when the portrait images are displayed on the recording device in the embodiment of the present invention, they may be displayed in landscape or portrait screen orientation, which is not limited here. Optionally, the position of the portrait images on the display screen may be adjusted according to whether the recording device is in landscape or portrait orientation: for example, when the recording device is in portrait orientation the portrait images are placed horizontally in sequence, and when it is in landscape orientation they are placed vertically in sequence; alternatively, the portrait images may be placed horizontally in sequence regardless of orientation. This is likewise not limited by the embodiment of the present invention.
In an optional embodiment of the present invention, the displaying the portrait image while displaying the first video stream data further includes:
after the electronic equipment enters a video conference, acquiring second video stream data sent by the electronic equipment;
combining the first video stream data and the portrait images into a window video;
and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
The electronic device may be a device that performs a video conference with a recording device, such as other recording devices, a translation device, a smart phone, a tablet device, or a computer, and the like, which is not limited in this embodiment of the present invention.
Specifically, after the electronic device enters the video conference, the second video stream data sent by the electronic device can be received; the portrait image can be displayed at the same time as the second video stream data on the recording device, and the first video stream data and the portrait image of the recording device can be combined into a small window video, so that the second video stream data, the portrait image and the window video are displayed simultaneously on the recording device.
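A minimal compositing sketch of that display is given below. The picture-in-picture scale, the upper-right placement of the window video and the thumbnail size of the portrait strip are assumptions that only loosely follow fig. 5a.

    import cv2

    def compose_conference_frame(second_frame, first_frame, portraits,
                                 pip_scale=0.25, thumb_size=(96, 96)):
        """Remote (second) stream fills the screen; the local first stream shrinks into a
        small window in the upper right; the portrait strip sits along the lower middle."""
        canvas = second_frame.copy()
        h, w = canvas.shape[:2]

        # window video: reduced copy of the local first video stream
        pip = cv2.resize(first_frame, (int(w * pip_scale), int(h * pip_scale)))
        canvas[10:10 + pip.shape[0], w - 10 - pip.shape[1]:w - 10] = pip

        # portrait strip: fixed-size thumbnails centred near the bottom of the display
        thumbs = [cv2.resize(p, thumb_size) for p in portraits]
        x = (w - len(thumbs) * thumb_size[0]) // 2
        y = h - thumb_size[1] - 10
        for thumb in thumbs:
            canvas[y:y + thumb_size[1], x:x + thumb_size[0]] = thumb
            x += thumb_size[0]
        return canvas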
Referring to fig. 5a, after the portrait recognition function is started and the second video stream data of the electronic device is received, the screen switches to showing the second video stream data, and the portrait images including portrait image A, portrait image B and portrait image C are displayed. In addition, the first video stream data and the portrait image of the recording device may continue to be displayed on the recording device after being reduced in size, as shown in the upper right corner of fig. 5a. After the portrait recognition function is turned off, the recording device no longer performs portrait recognition, the portrait images are no longer displayed on the recording device, and the window video no longer includes the portrait image, as shown in fig. 5b.
In the above-described alternative embodiment, when the second video stream data sent by the electronic device is displayed on the recording device, the display effect of the portrait image can still be viewed, and the display effect can also be viewed in the video window.
Referring to fig. 6, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention is shown, which may specifically include the following steps:
step 602, receiving first video stream data and a portrait image of a video conference sent by the sound recording device; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data.
And step 604, displaying the portrait image when the first video stream data is displayed.
In the data processing method, during the video conference the electronic device can receive the first video stream data of the video conference collected by the recording device together with the portrait image that the recording device recognized from the first video stream data, and then display the portrait image when the first video stream data is displayed on the electronic device, so that the content of the video conference is enriched and the communication experience of the video conference is improved.
Of course, if the participant of the electronic device does not want to display the portrait image, the participant may also send a prompt message to the recording device to prompt the recording device to turn off the portrait recognition function, and further not transmit the portrait image while continuing to transmit the first video stream data.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 7, a block diagram of a data processing apparatus according to an embodiment of the present invention is shown, and is applied to a recording device, and may specifically include the following modules:
a video stream data obtaining module 702, configured to obtain first video stream data of a video conference collected by the sound recording device;
a portrait identification module 704, configured to perform portrait identification on the first video stream data to obtain a portrait image;
a video stream data display module 706, configured to display the portrait image when the first video stream data is displayed.
In an optional embodiment of the invention, the apparatus further comprises: the portrait recognition starting module is used for receiving a portrait recognition starting instruction; and starting portrait recognition according to the starting instruction.
In an optional embodiment of the invention, the apparatus further comprises: the visual identification adding module is used for identifying lip feature data from the portrait image; judging whether the lip feature data meet preset conditions or not; when the preset condition is met, determining the portrait image as a target portrait image; and adding a visual identifier on the target portrait image to distinguish the target portrait image from other portrait images.
In an optional embodiment of the present invention, the visual identifier adding module is configured to extract identity data from the first video stream data; when the identity data changes, lip feature data are identified from the portrait image.
In an optional embodiment of the invention, the apparatus further comprises: the video stream data transmission module is used for transmitting the portrait image and the first video stream data to the electronic equipment; and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
In an optional embodiment of the present invention, the video stream data display module 706 is configured to display the portrait image when the first video stream data is displayed before the electronic device enters the video conference.
In an optional embodiment of the present invention, the video stream data display module 706 is configured to obtain second video stream data sent by the electronic device after the electronic device enters a video conference; combining the first video stream data and the portrait images into a window video; and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
In an optional embodiment of the invention, the identity data is a voiceprint.
In the embodiment of the invention, in the video conference process, the portrait identification is carried out on the first video streaming data of the video conference collected by the recording equipment to obtain the portrait image, and then the portrait image can be displayed simultaneously when the first video streaming data is displayed on the recording equipment, so that the personalized content is provided, the content of the video conference is enriched, and the communication experience of the video conference is improved.
Referring to fig. 8, a block diagram of a data processing apparatus according to an embodiment of the present invention is shown, and is applied to an electronic device, and specifically includes the following modules:
a video stream data receiving module 802, configured to receive first video stream data and a portrait image of a video conference sent by the sound recording device; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data;
a video stream data display module 804, configured to display the portrait image when the first video stream data is displayed.
In the embodiment of the invention, during the video conference the electronic device can receive the first video stream data of the video conference collected by the recording device together with the portrait image recognized by the recording device from the first video stream data, and display the portrait image while the first video stream data is displayed on the electronic device, so that the content of the video conference is enriched and the communication experience of the video conference is improved.
Furthermore, in the embodiment of the invention, lip recognition can be performed on each obtained portrait image; when the lip feature data meet the preset condition, the portrait image is determined as the target portrait image and a visual identifier is added to distinguish it from the other portrait images, so that the participant who is currently speaking can be noticed quickly and the content of the video conference can be better absorbed.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
Fig. 9 is a block diagram illustrating an architecture of an electronic device 900 for data processing in accordance with an example embodiment. For example, the electronic device 900 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 9, electronic device 900 may include one or more of the following components: a processing component 902, a memory 904, a power component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
The processing component 902 generally controls overall operation of the electronic device 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 902 may include one or more processors 920 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 902 can include one or more modules that facilitate interaction between the processing component 902 and other components. For example, the processing component 902 can include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support operation at the device 900. Examples of such data include instructions for any application or method operating on the electronic device 900, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 904 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 906 provides power to the various components of the electronic device 900. Power components 906 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for electronic device 900.
The multimedia components 908 include a screen that provides an output interface between the electronic device 900 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 908 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 900 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 900 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 904 or transmitted via the communication component 916. In some embodiments, audio component 910 also includes a speaker for outputting audio signals.
I/O interface 912 provides an interface between processing component 902 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 914 includes one or more sensors for providing status evaluations of various aspects of the electronic device 900. For example, the sensor component 914 may detect an open/closed state of the device 900 and the relative positioning of components, such as the display and keypad of the electronic device 900; it may also detect a change in the position of the electronic device 900 or a component of the electronic device 900, the presence or absence of user contact with the electronic device 900, the orientation or acceleration/deceleration of the electronic device 900, and a change in the temperature of the electronic device 900. The sensor component 914 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor component 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate wired or wireless communication between the electronic device 900 and other devices. The electronic device 900 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 916 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 904 comprising instructions, executable by the processor 920 of the electronic device 900 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium in which instructions, when executed by a processor of an electronic device, enable the electronic device to perform a data processing method, the method comprising: receiving first video stream data and a portrait image of a video conference sent by the recording equipment; the portrait image is obtained by the recording equipment through portrait recognition according to the first video stream data; and displaying the portrait image when the first video stream data is displayed.
In an alternative embodiment of the present invention, the electronic device 900 may be a recording device, and the recording device may be a recording pen, a translating pen, a translator, or the like. A non-transitory computer readable storage medium having instructions therein which, when executed by a processor of an audio recording device, enable the audio recording device to perform a data processing method, the method comprising: acquiring first video stream data of a video conference acquired by the recording equipment; carrying out portrait recognition on the first video stream data to obtain a portrait image; and displaying the portrait image when the first video stream data is displayed.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
receiving a portrait recognition starting instruction;
and starting portrait recognition according to the starting instruction.
Optionally, the performing portrait recognition on the first video stream data to obtain a portrait image further includes:
identifying lip feature data from the portrait image;
judging whether the lip feature data meet preset conditions or not;
when the preset condition is met, determining the portrait image as a target portrait image;
and adding a visual identifier on the target portrait image.
Optionally, the identifying lip feature data from the portrait image includes:
extracting identity data from the first video stream data;
when the identity data changes, lip feature data are identified from the portrait image.
Optionally, after the performing portrait recognition on the first video stream data to obtain a portrait image, the method further includes:
transmitting the portrait image and the first video stream data to an electronic device;
and displaying the portrait image when the first video stream data is displayed on the electronic equipment.
Optionally, the displaying the portrait image while displaying the first video stream data includes:
before the electronic equipment enters a video conference, displaying the portrait image when displaying the first video stream data.
Optionally, the displaying the portrait image while displaying the first video stream data further includes:
after the electronic equipment enters a video conference, acquiring second video stream data sent by the electronic equipment;
combining the first video stream data and the portrait images into a window video;
and displaying the portrait image and the window video when the second video stream data is displayed on the recording equipment.
Optionally, the identity data is a voiceprint.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The data processing method, the data processing apparatus and the electronic device provided by the present invention are described in detail above, and specific examples are applied herein to illustrate the principles and embodiments of the present invention, and the descriptions of the above embodiments are only used to help understand the method and the core ideas of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A data processing method applied to a sound recording device, the method comprising:
acquiring first video stream data of a video conference collected by the sound recording device;
performing portrait recognition on the first video stream data to obtain a portrait image; and
displaying the portrait image while the first video stream data is displayed.
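The following is a minimal sketch of how the flow in claim 1 could look in practice, assuming OpenCV with its bundled Haar cascade as a stand-in face detector; the function name, window titles, and parameters are illustrative assumptions and are not taken from the patent.

```python
# Illustrative claim-1 flow: capture video stream data on the recording
# device, run portrait (face) detection on each frame, and display the
# detected portrait alongside the stream itself.
import cv2

def run_portrait_overlay(camera_index: int = 0) -> None:
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    capture = cv2.VideoCapture(camera_index)  # first video stream data
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for (x, y, w, h) in faces:
            portrait = frame[y:y + h, x:x + w]      # cropped portrait image
            cv2.imshow("portrait", portrait)        # shown alongside the stream
        cv2.imshow("first video stream", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    capture.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    run_portrait_overlay()
```

Any detector that maps a frame to portrait regions could replace the Haar cascade; the cascade is used here only because it ships with OpenCV and keeps the sketch self-contained.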
2. The method of claim 1, wherein performing portrait recognition on the first video stream data to obtain a portrait image further comprises:
receiving a portrait recognition start instruction; and
starting portrait recognition according to the start instruction.
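Claim 2 simply gates recognition behind an explicit start instruction. A minimal sketch of that gating follows; the class name and the instruction string are hypothetical and not part of the claim.

```python
# Illustrative gating of portrait recognition behind a start instruction.
# The instruction value "START_PORTRAIT_RECOGNITION" is an assumption.
class RecognitionController:
    def __init__(self) -> None:
        self.enabled = False

    def handle_instruction(self, instruction: str) -> None:
        # Enable recognition only once the start instruction is received.
        if instruction == "START_PORTRAIT_RECOGNITION":
            self.enabled = True

    def maybe_recognize(self, frame, recognize):
        # 'recognize' is any callable mapping a frame to portrait images.
        return recognize(frame) if self.enabled else []
```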
3. The method of claim 1, wherein performing portrait recognition on the first video stream data to obtain a portrait image further comprises:
identifying lip feature data from the portrait image;
judging whether the lip feature data meets a preset condition;
when the preset condition is met, determining the portrait image to be a target portrait image; and
adding a visual identifier to the target portrait image.
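One way to read claim 3 is as simple speaker detection: measure how open the mouth is over a short window of frames and, when the averaged movement exceeds a threshold (the "preset condition"), mark that portrait as the target. The sketch below follows that reading only as an assumption; the landmark layout, threshold, and window size are invented for illustration, and the lip landmarks would come from whatever face-landmark model the implementation uses.

```python
# Sketch of claim 3 read as lip-movement-based speaker detection (assumed
# interpretation). Lip landmarks are expected as (x, y) pixel coordinates.
from collections import deque
import cv2
import numpy as np

MOUTH_OPEN_THRESHOLD = 0.35   # assumed "preset condition"
WINDOW = 10                   # frames of lip history to average

def mouth_aspect_ratio(upper_lip, lower_lip, left_corner, right_corner):
    """Vertical lip gap relative to mouth width: larger means more open."""
    gap = np.linalg.norm(np.array(upper_lip) - np.array(lower_lip))
    width = np.linalg.norm(np.array(left_corner) - np.array(right_corner))
    return gap / max(width, 1e-6)

class SpeakerMarker:
    def __init__(self):
        self.history = deque(maxlen=WINDOW)

    def is_target(self, lip_points):
        # lip_points: dict with "upper", "lower", "left", "right" landmarks.
        ratio = mouth_aspect_ratio(lip_points["upper"], lip_points["lower"],
                                   lip_points["left"], lip_points["right"])
        self.history.append(ratio)
        return sum(self.history) / len(self.history) > MOUTH_OPEN_THRESHOLD

    def mark(self, frame, box):
        # Visual identifier on the target portrait: a highlighted bounding box.
        x, y, w, h = box
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, "speaking", (x, y - 8),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        return frame
```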
4. The method of claim 3, wherein identifying lip feature data from the portrait image comprises:
extracting identity data from the first video stream data; and
identifying the lip feature data from the portrait image when the identity data changes.
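Claim 4 triggers lip analysis only when the identity data extracted from the stream changes (for example, a different set of participants appears in the frame). A minimal state tracker for that trigger is sketched below; representing the identity data as a set of IDs, and the `extract_lip_features` helper, are assumptions made for illustration.

```python
# Illustrative trigger for claim 4: run lip-feature analysis only when the
# identity data extracted from the stream differs from the previous frame.
class IdentityChangeTrigger:
    def __init__(self) -> None:
        self.last_identities = None

    def should_analyze(self, identities: frozenset) -> bool:
        changed = identities != self.last_identities
        self.last_identities = identities
        return changed

# Usage sketch:
# trigger = IdentityChangeTrigger()
# if trigger.should_analyze(frozenset(current_ids)):
#     lip_features = extract_lip_features(portrait_image)  # hypothetical helper
```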
5. A data processing method applied to an electronic device, the method comprising:
receiving first video stream data of a video conference and a portrait image sent by a sound recording device, the portrait image being obtained by the sound recording device through portrait recognition on the first video stream data; and
displaying the portrait image while the first video stream data is displayed.
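On the receiving side, claim 5 amounts to accepting a frame and a portrait image from the recording device and rendering both. The transport-agnostic sketch below assumes JPEG-encoded payloads; the message format and function name are assumptions, since the claim does not specify a wire format.

```python
# Illustrative receiver for claim 5: decode a JPEG frame and a JPEG portrait
# sent by the recording device, then display them side by side.
import cv2
import numpy as np

def show_received(frame_jpeg: bytes, portrait_jpeg: bytes) -> None:
    frame = cv2.imdecode(np.frombuffer(frame_jpeg, np.uint8), cv2.IMREAD_COLOR)
    portrait = cv2.imdecode(np.frombuffer(portrait_jpeg, np.uint8), cv2.IMREAD_COLOR)
    if frame is None or portrait is None:
        return  # drop malformed messages
    # Resize the portrait to the frame height so the two can sit side by side.
    scale = frame.shape[0] / portrait.shape[0]
    portrait = cv2.resize(portrait, (int(portrait.shape[1] * scale), frame.shape[0]))
    cv2.imshow("video conference", np.hstack([frame, portrait]))
    cv2.waitKey(1)
```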
6. A data processing apparatus for use in a sound recording device, the apparatus comprising:
a video stream data acquisition module, configured to acquire first video stream data of a video conference collected by the sound recording device;
a portrait recognition module, configured to perform portrait recognition on the first video stream data to obtain a portrait image; and
a video stream data display module, configured to display the portrait image while the first video stream data is displayed.
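Claim 6 recasts the method of claim 1 as three cooperating modules. The skeleton below mirrors that decomposition; the module names follow the claim, while the class layout and the internals (left as placeholders) are illustrative assumptions.

```python
# Skeleton of the claim-6 apparatus: three modules wired in sequence.
class VideoStreamAcquisitionModule:
    def acquire(self):
        raise NotImplementedError  # e.g. read a frame from the device camera

class PortraitRecognitionModule:
    def recognize(self, frame):
        raise NotImplementedError  # e.g. detect faces and crop portrait images

class VideoStreamDisplayModule:
    def display(self, frame, portraits):
        raise NotImplementedError  # e.g. render the frame and portraits together

class DataProcessingApparatus:
    def __init__(self, acquisition, recognition, display):
        self.acquisition = acquisition
        self.recognition = recognition
        self.display = display

    def step(self):
        frame = self.acquisition.acquire()
        portraits = self.recognition.recognize(frame)
        self.display.display(frame, portraits)
```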
7. A data processing apparatus for use in an electronic device, the apparatus comprising:
a video stream data receiving module, configured to receive first video stream data of a video conference and a portrait image sent by a sound recording device, the portrait image being obtained by the sound recording device through portrait recognition on the first video stream data; and
a video stream data display module, configured to display the portrait image while the first video stream data is displayed.
8. An audio recording apparatus comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
acquiring first video stream data of a video conference collected by the audio recording apparatus;
performing portrait recognition on the first video stream data to obtain a portrait image; and
displaying the portrait image while the first video stream data is displayed.
9. An electronic device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
receiving first video stream data of a video conference and a portrait image sent by a sound recording device, the portrait image being obtained by the sound recording device through portrait recognition on the first video stream data; and
displaying the portrait image while the first video stream data is displayed.
10. A readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data processing method according to any one of claims 1 to 5.
CN202011312594.1A 2020-11-20 2020-11-20 Data processing method and device and electronic equipment Pending CN112541402A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011312594.1A CN112541402A (en) 2020-11-20 2020-11-20 Data processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011312594.1A CN112541402A (en) 2020-11-20 2020-11-20 Data processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112541402A true CN112541402A (en) 2021-03-23

Family

ID=75014508

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011312594.1A Pending CN112541402A (en) 2020-11-20 2020-11-20 Data processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112541402A (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104754285A (en) * 2013-12-26 2015-07-01 广达电脑股份有限公司 Video conference system
CN105005430A (en) * 2015-07-17 2015-10-28 深圳市金立通信设备有限公司 Window display method and terminal
CN108702483A (en) * 2016-02-19 2018-10-23 微软技术许可有限责任公司 Communication event
CN105893948A (en) * 2016-03-29 2016-08-24 乐视控股(北京)有限公司 Method and apparatus for face identification in video conference
CN105915798A (en) * 2016-06-02 2016-08-31 北京小米移动软件有限公司 Camera control method in video conference and control device thereof
CN108932519A (en) * 2017-05-23 2018-12-04 中兴通讯股份有限公司 A kind of meeting-place data processing, display methods and device and intelligent glasses
CN110505399A (en) * 2019-08-13 2019-11-26 聚好看科技股份有限公司 Control method, device and the acquisition terminal of Image Acquisition
CN111770299A (en) * 2020-04-20 2020-10-13 厦门亿联网络技术股份有限公司 Method and system for real-time face abstract service of intelligent video conference terminal
CN111651632A (en) * 2020-04-23 2020-09-11 深圳英飞拓智能技术有限公司 Method and device for outputting voice and video of speaker in video conference

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王忆勤 (Wang Yiqin): "Traditional Chinese Medicine Facial Diagnosis and Computer-Aided Diagnosis" (中医面诊与计算机辅助诊断), 30 November 2010, 上海科学技术出版社 (Shanghai Scientific and Technical Publishers), page 79 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113473061A (en) * 2021-06-10 2021-10-01 荣耀终端有限公司 Video call method and electronic equipment

Similar Documents

Publication Publication Date Title
CN106791893B (en) Video live broadcasting method and device
CN107105314B (en) Video playing method and device
US20170304735A1 (en) Method and Apparatus for Performing Live Broadcast on Game
WO2017181551A1 (en) Video processing method and device
US20170034430A1 (en) Video recording method and device
CN106210757A (en) Live broadcasting method, live broadcast device and live broadcast system
US10230891B2 (en) Method, device and medium of photography prompts
CN107743244B (en) Video live broadcasting method and device
CN110677734B (en) Video synthesis method and device, electronic equipment and storage medium
CN112114765A (en) Screen projection method and device and storage medium
CN106775403B (en) Method and device for acquiring stuck information
KR20170005783A (en) Method and apparatus for displaying conversation interface
CN106254939B (en) Information prompting method and device
CN112532931A (en) Video processing method and device and electronic equipment
CN107885016B (en) Holographic projection method and device
CN106791563B (en) Information transmission method, local terminal equipment, opposite terminal equipment and system
CN107247794B (en) Topic guiding method in live broadcast, live broadcast device and terminal equipment
CN107105311B (en) Live broadcasting method and device
CN112463274B (en) Interface adjustment method and device and electronic equipment
CN111355973B (en) Data playing method and device, electronic equipment and storage medium
EP3767624A1 (en) Method and apparatus for obtaining audio-visual information
CN112087653A (en) Data processing method and device and electronic equipment
CN112541402A (en) Data processing method and device and electronic equipment
CN112685599A (en) Video recommendation method and device
CN116758896A (en) Conference audio language adjustment method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination