CN111862705A - Method, device, medium and electronic equipment for prompting live broadcast teaching target - Google Patents


Info

Publication number
CN111862705A
CN111862705A (application CN202010586514.5A)
Authority
CN
China
Prior art keywords
audio
identity information
video stream
teaching
prompting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010586514.5A
Other languages
Chinese (zh)
Inventor
王珂晟
黄劲
黄钢
许巧龄
郝缘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Anbo Shengying Education Technology Co ltd
Original Assignee
Beijing Anbo Shengying Education Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Anbo Shengying Education Technology Co ltd filed Critical Beijing Anbo Shengying Education Technology Co ltd
Priority to CN202010586514.5A
Publication of CN111862705A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00: Electrically-operated educational appliances
    • G09B5/08: Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations
    • G09B5/14: Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations with provision for individual teacher-student communication
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification
    • G10L17/02: Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60: Network streaming of media packets

Abstract

Because the first audio stream is received before the first identity information is determined, network congestion is reduced, delay and stalling during the interaction are avoided, and the teaching interaction proceeds smoothly. The first identity information is indicated in the first audio/video stream, and the generated second audio/video stream is transmitted to the teacher terminal registered for the live teaching session. Among the many students attending the lesson, the teacher can directly see the image and the identity information of the student taking part in the interaction, which narrows the distance between the two sides of the teaching and makes the lesson more engaging. The number of first audio/video streams transmitted to the teacher terminal is limited in a manner determined by the teacher terminal, ensuring that the teaching interaction proceeds in an orderly way. At the same time, the volume of network data is reduced, delay and stalling caused by network congestion are avoided, and the continuity of teaching is preserved.

Description

Method, device, medium and electronic equipment for prompting live broadcast teaching target
Technical Field
The disclosure relates to the technical field of computers, in particular to a method, a device, a medium and electronic equipment for prompting a live broadcast teaching target.
Background
Distance education is a form of education that uses a variety of media for systematic teaching and communication, delivering courses to one or more student terminals outside a campus. Modern distance education delivers courses to remote students through audio, video and computer technologies, both real-time and non-real-time.
Live-broadcast teaching is a form of distance education in which remote teaching is conducted as real-time video. Its advantages are a wide teaching reach, no limitation on venue or number of participants, and support for direct teaching interaction and on-the-spot question answering. Live teaching is the form of distance education closest to face-to-face teaching. In live teaching, however, the students can see the teacher but the teacher cannot see the students, which creates a gap and a sense of distance in the teaching.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The present disclosure is directed to a method, an apparatus, a medium, and an electronic device for prompting a live broadcast teaching target, which can solve at least one of the above-mentioned technical problems. The specific scheme is as follows:
according to a specific implementation manner of the present disclosure, in a first aspect, the present disclosure provides a method for prompting a live broadcast teaching target, including:
receiving a first audio stream sent by a first student terminal registered for live teaching;
analyzing the first audio stream and extracting a first voiceprint feature;
acquiring first identity information corresponding to the first voiceprint feature from an object voiceprint library based on the first voiceprint feature;
receiving a first audio and video stream sent by the first student terminal in real time, prompting the first identity information in the first audio and video stream to generate a second audio and video stream, and transmitting the second audio and video stream to a teacher terminal registered for the live teaching; wherein the first audio/video stream includes student images associated with the first identity information.
According to a second aspect of the present disclosure, there is provided a device for prompting a live teaching target, including:
the receiving unit is used for receiving a first audio stream sent by a first student terminal registered for live teaching;
An extraction unit configured to analyze the first audio stream and extract a first voiceprint feature;
a first identity information acquiring unit, configured to acquire, from an object voiceprint library, first identity information corresponding to the first voiceprint feature based on the first voiceprint feature;
the prompting unit is used for receiving a first audio and video stream sent by the first student terminal in real time, prompting the first identity information in the first audio and video stream to generate a second audio and video stream, and transmitting the second audio and video stream to a teacher terminal registered for live teaching; wherein the first audio/video stream includes student images associated with the first identity information.
According to a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of prompting a live instructional target as defined in any one of the first aspects.
According to a fourth aspect, the present disclosure provides an electronic device, comprising: one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement a method of prompting a live instructional target as defined in any one of the first aspects.
Compared with the prior art, the scheme of the embodiment of the disclosure at least has the following beneficial effects:
the disclosure provides a method, a device, a medium and electronic equipment for prompting a live teaching target. The first audio stream is received before the first identity information is determined, so that network congestion can be reduced, delay and pause phenomena in the interactive process are avoided, and the teaching interactive process is guaranteed to be smoothly carried out. The first identity information is prompted in the first audio and video stream, and the generated second audio and video stream is transmitted to a teacher terminal which registers the live broadcast teaching. The teaching teacher can visually see images of students participating in interactive communication and identity information of the students among thousands of students attending lessons, so that the distance between two teaching parties is shortened, and the interestingness of teaching is improved. The number of the first audio and video streams transmitted to the teacher terminal is limited in a mode determined by the teacher terminal, so that the ordered progress of teaching interaction is guaranteed. Meanwhile, the network data volume is reduced, delay and blockage caused by network congestion are avoided, and the continuity of teaching is guaranteed.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that components are not necessarily drawn to scale. In the drawings:
FIG. 1 illustrates a flow diagram of a method of prompting a live instructional target in accordance with an embodiment of the present disclosure;
FIG. 2 illustrates a block diagram of elements of an apparatus for prompting a live instructional target, according to an embodiment of the present disclosure;
fig. 3 shows an electronic device connection structure schematic according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Alternative embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
The first embodiment provided by the present disclosure, namely, the embodiment of the method for prompting the live broadcast teaching target.
The embodiments of the present disclosure are described in detail below with reference to fig. 1.
Step S101, receiving a first audio stream sent by a first student terminal registered for live teaching.
Distance education is a form of education that uses a variety of media for systematic teaching and communication, delivering courses to one or more student terminals outside a campus.
Live-broadcast teaching is a form of distance education in which remote teaching is conducted as real-time video.
Before live teaching begins, a virtual classroom is registered on the Internet. A specific teacher is assigned to the virtual classroom and conducts live teaching through a teacher terminal. Students register for the virtual classroom's courses through student terminals to qualify for the live teaching session, and one or more students may sit behind a single student terminal. Because the virtual classroom exists on the Internet, there is in principle no upper limit on the number of participating students, so live teaching can bring high-quality teaching resources to a broad student population while maintaining teaching quality.
The first audio stream is the student's speech, collected and sent by the student terminal during the live teaching interaction. The student terminal is only allowed to send an audio stream before it is permitted to transmit an audio/video stream. For the same duration, an audio stream contains far fewer bytes and consumes far less network traffic than an audio/video stream, which reduces network congestion, avoids delay and stalling during the interaction, and keeps the teaching interaction running smoothly.
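The bandwidth saving can be made concrete with rough arithmetic. The bitrates below are illustrative assumptions for typical speech and 720p video, not values stated in the disclosure:

```python
# Illustrative bitrates (assumed for this sketch, not specified by the disclosure):
AUDIO_KBPS = 64      # mono speech, roughly 64 kbit/s
VIDEO_KBPS = 1500    # 720p video, roughly 1.5 Mbit/s

def stream_bytes(kbps, seconds):
    """Bytes transferred by a stream of `kbps` kilobits per second over `seconds`."""
    return kbps * 1000 // 8 * seconds

# Ten seconds of interaction: audio-only versus full audio/video.
audio_only = stream_bytes(AUDIO_KBPS, 10)                # 80,000 bytes
audio_video = stream_bytes(AUDIO_KBPS + VIDEO_KBPS, 10)  # 1,955,000 bytes
```

Under these assumptions, holding back the video until the speaker is identified cuts the per-student upstream traffic by more than an order of magnitude.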
This step may include simultaneously receiving the voices of different students collected by a plurality of first student terminals.
Step S102, analyzing the first audio stream and extracting a first voiceprint feature.
Audio fingerprinting extracts a unique voiceprint feature from a piece of audio through a specific algorithm, so that the audio can be identified among a huge number of audio stream samples. In this step, a first voiceprint feature is extracted from the first audio stream.
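As an illustration of the idea only (the disclosure does not specify its feature-extraction algorithm), a crude voiceprint-style feature can be sketched as the normalized energy distribution across a few spectral bands. The function names, the naive DFT, and the band count are all illustrative assumptions:

```python
import math

def naive_dft_magnitudes(samples):
    """O(N^2) DFT magnitude spectrum; fine for a short illustrative frame."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        mags.append(math.hypot(re, im))
    return mags

def voiceprint_features(samples, n_bands=4):
    """Crude 'voiceprint': fraction of spectral energy in each of n_bands bands."""
    mags = naive_dft_magnitudes(samples)
    band = len(mags) // n_bands
    energies = [sum(m * m for m in mags[b * band:(b + 1) * band])
                for b in range(n_bands)]
    total = sum(energies) or 1.0
    return [e / total for e in energies]
```

A real system would use far richer features (e.g. cepstral coefficients over many frames), but the resulting vector would be matched against the object voiceprint library in the same way as this sketch's output.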
Step S103, acquiring first identity information corresponding to the first voiceprint feature from an object voiceprint library based on the first voiceprint feature.
The recognition process of audio fingerprinting is unaffected by the audio's storage format, encoding, bit rate or compression technique. The first voiceprint feature extracted from the first audio stream is compared against an established object voiceprint library to complete the recognition.
Because every person's voiceprint characteristics are unique, a voiceprint feature can represent a person's unique identity. In the embodiment of the present disclosure, a one-to-one mapping is therefore established in the object voiceprint library between voiceprint features and identity information: once a voiceprint feature matching the first voiceprint feature is found in the library, the corresponding first identity information is obtained.
The first identity information uniquely identifies the student, for example the student's student number, or the student's name, class and campus.
To improve the recognition rate of the audio fingerprinting and increase its fault tolerance, the step of acquiring the first identity information corresponding to the first voiceprint feature from the object voiceprint library comprises the following steps:
and S103-11, performing similarity matching on the first voiceprint features and prestored voiceprint features in the object voiceprint library to obtain a similarity matching result.
And S103-12, judging whether the similarity matching result meets a preset matching threshold value.
And S103-13, if yes, acquiring first identity information corresponding to the matched pre-stored voiceprint characteristics.
The above fault-tolerance steps provide a similarity-matching method for audio fingerprinting: the first voiceprint feature is matched against the pre-stored voiceprint features in the object voiceprint library, and a difference between them is allowed. If the difference falls within the preset matching threshold, the two features are judged to match successfully. This fault tolerance improves the recognition success rate.
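Steps S103-11 to S103-13 can be sketched as a thresholded similarity lookup. The cosine measure, the threshold value, and the identifiers below are assumptions for illustration; the disclosure does not fix a particular similarity measure:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def lookup_identity(feature, voiceprint_library, threshold=0.85):
    """S103-11..13: return the identity of the best match at or above
    the preset threshold, or None if no pre-stored feature matches."""
    best_id, best_sim = None, threshold
    for identity, stored in voiceprint_library.items():
        sim = cosine_similarity(feature, stored)
        if sim >= best_sim:
            best_id, best_sim = identity, sim
    return best_id
```

Returning the best match above the threshold, rather than the first, keeps the lookup stable when several pre-stored voiceprints are similar to the query.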
Step S104, receiving in real time a first audio/video stream sent by the first student terminal, marking the first identity information in the first audio/video stream so as to indicate it, generating a second audio/video stream, and transmitting the second audio/video stream to the teacher terminal registered for the live teaching session.
After the student who sent the first audio stream has been identified, a request to collect and upload an audio/video stream is sent to the first student terminal that collected the first audio stream, and that terminal uploads the collected first audio/video stream. The first audio/video stream includes a student image associated with the first identity information.
The embodiment of the disclosure indicates the first identity information in the first audio/video stream and transmits the generated second audio/video stream to the teacher terminal registered for the live teaching session. Among the many students attending the lesson, the teacher can directly see the image and the identity information of the student taking part in the interaction, which narrows the distance between the two sides of the teaching and makes the lesson more engaging.
When different first audio streams collected by a plurality of first student terminals are received at the same time, the embodiment of the disclosure can generate a plurality of second audio/video streams and transmit them to the teacher terminal, which can then synchronously display the images from all the student terminals participating in the live teaching interaction.
When several students participate in live teaching at one student terminal, and in order to let the teacher identify the interacting student among them more directly, the step of indicating the first identity information in the first audio/video stream to generate a second audio/video stream further comprises:
and step S104-1, extracting a first frame image of each frame from the first audio and video stream.
That is, the video in the first audio/video stream is decomposed into individual frame images.
Step S104-2, acquiring corresponding first facial feature information from a facial feature information base, based on the first identity information.
Step S104-3, performing face recognition on each first frame image according to the first facial feature information, and acquiring the first face region of each frame.
Step S104-4, indicating the first identity information near the first face region of each first frame image, and generating a second frame image.
Indicating the first identity information may consist of displaying it next to the first face region.
To designate the specific interactive object even more precisely, indicating the first identity information near the first face region of each first frame image and generating a second frame image further comprises:
Step S104-4-1, marking the first face region in each first frame image, indicating the first identity information near the first face region, and generating a second frame image.
Marking the first face region means drawing a box around it to indicate the student currently participating in the interaction.
Step S104-5, combining the second frame images based on a preset video format to generate the second audio/video stream.
In the above steps, the image of the interactive object is located in the first audio/video stream by face recognition, so the teacher can easily tell which of several people on camera is interacting, see that student's face clearly, and know the student's information.
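Steps S104-1 to S104-5 can be sketched as follows. Rendering is abstracted away: a "frame" here is a metadata dict plus an overlay list, and the function names, box format `(x, y, w, h)`, and label-placement rule are illustrative assumptions, not details from the disclosure:

```python
def label_position(face_box, frame_w, frame_h, offset=10):
    """Place the identity label just above the face box, clamped to the frame;
    if there is no room above, fall back to just below the box."""
    x, y, w, h = face_box
    lx = max(0, min(x, frame_w - 1))
    ly = y - offset
    if ly < 0:
        ly = min(y + h + offset, frame_h - 1)
    return lx, ly

def annotate_frame(frame_meta, identity, face_box):
    """S104-4 / S104-4-1: add a box around the face region and the identity
    text near it, returning the annotated (second) frame metadata."""
    lx, ly = label_position(face_box, frame_meta["w"], frame_meta["h"])
    overlays = frame_meta.get("overlays", []) + [
        {"type": "rect", "box": face_box},            # box marking the face
        {"type": "text", "text": identity, "pos": (lx, ly)},  # identity label
    ]
    return {**frame_meta, "overlays": overlays}
```

In a real pipeline the overlay list would be rasterized onto each decoded frame (e.g. with an image library) before the frames are re-encoded into the second audio/video stream per S104-5.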
To prevent disorderly student speech during the live teaching interaction from flooding the teacher terminal with confusing information, the embodiment of the disclosure is further optimized.
Before receiving in real time the first audio/video stream sent by the first student terminal, the method further comprises:
and step S103-21, transmitting the first identity information to the teacher terminal.
And S103-22, acquiring the first identity information determined by the teacher terminal.
In the embodiment of the disclosure, the number of first audio/video streams transmitted to the teacher terminal is limited in a manner determined by the teacher terminal, ensuring that the teaching interaction proceeds in an orderly way. At the same time, the volume of network data is reduced, delay and stalling caused by network congestion are avoided, and the continuity of teaching is preserved.
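The teacher-determined gating of steps S103-21 and S103-22 can be sketched as a small gate object that forwards a student's audio/video stream only after the teacher has selected that student's identity. The class and method names are illustrative assumptions:

```python
class InteractionGate:
    """Forward a student's A/V stream only after the teacher approves
    that student's identity (sketch of S103-21 / S103-22 gating)."""

    def __init__(self):
        self.approved = set()  # identities the teacher has selected

    def teacher_approves(self, identity):
        """Record an identity the teacher chose from the candidates sent to the terminal."""
        self.approved.add(identity)

    def accept_av_stream(self, identity, stream_chunk, forward):
        """Forward the chunk via `forward` if the identity is approved;
        otherwise drop it. Returns True iff the chunk was forwarded."""
        if identity in self.approved:
            forward(stream_chunk)
            return True
        return False
```

Dropping unapproved streams at this gate is what keeps both the interaction orderly and the network load bounded, since video from non-selected students never enters the pipeline.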
So that the other students participating in the live teaching can see not only the teacher but also the students participating in the interaction, the method further comprises:
and step S105, transmitting the second audio and video stream to a student terminal registered for live broadcasting teaching.
To allow the teacher to freely view the teaching site on any student side, the method further comprises the following steps:
and S108-1, acquiring all second identity information registered for the live teaching.
And step S108-2, transmitting the second identity information to the teacher terminal.
And step S108-3, receiving third identity information determined by the teacher terminal from the second identity information.
And S108-4, receiving a corresponding third audio-video stream in real time based on the third identity information, and transmitting the third audio-video stream to a teacher terminal and/or a student terminal which are registered for live teaching.
The above steps ensure that the teacher terminal can view the teaching site of any student end at any time.
According to the embodiment of the disclosure, because the first audio stream is received before the first identity information is determined, network congestion is reduced, delay and stalling during the interaction are avoided, and the teaching interaction proceeds smoothly. The first identity information is indicated in the first audio/video stream, and the generated second audio/video stream is transmitted to the teacher terminal registered for the live teaching session. Among the many students attending the lesson, the teacher can directly see the image and the identity information of the student taking part in the interaction, which narrows the distance between the two sides of the teaching and makes the lesson more engaging. The number of first audio/video streams transmitted to the teacher terminal is limited in a manner determined by the teacher terminal, ensuring that the teaching interaction proceeds in an orderly way. At the same time, the volume of network data is reduced, delay and stalling caused by network congestion are avoided, and the continuity of teaching is preserved.
Corresponding to the first embodiment provided by the disclosure, the disclosure also provides a second embodiment, namely a device for prompting the live teaching target. Since the second embodiment is basically similar to the first, its description is brief; for relevant details, refer to the corresponding description of the first embodiment. The device embodiments described below are merely illustrative.
Fig. 2 shows an embodiment of an apparatus for prompting a live teaching target provided by the present disclosure.
As shown in fig. 2, the present disclosure provides a device for prompting a live teaching target, including:
the receiving unit 201 is configured to receive a first audio stream sent by a first student terminal registered for live teaching;
an extracting unit 202, configured to analyze the first audio stream and extract a first voiceprint feature;
a first identity information acquiring unit 203, configured to acquire first identity information corresponding to the first voiceprint feature from an object voiceprint library based on the first voiceprint feature;
the prompting unit 204 is configured to receive a first audio/video stream sent by the first student terminal in real time, prompt the first identity information in the first audio/video stream to generate a second audio/video stream, and transmit the second audio/video stream to a teacher terminal registered for live teaching; wherein the first audio/video stream includes student images associated with the first identity information.
Optionally, the first identity information acquiring unit 203 includes:
a similarity matching result obtaining subunit, configured to perform similarity matching with pre-stored voiceprint features in the object voiceprint library based on the first voiceprint feature, and obtain a similarity matching result;
the judging subunit is used for judging whether the similarity matching result meets a preset matching threshold value;
and the first identity information acquiring subunit is used for acquiring the first identity information corresponding to the matched pre-stored voiceprint characteristics if the output result of the judging subunit is yes.
Optionally, the apparatus further includes a first identity information determining unit, which includes:
a first identity information transmitting subunit, configured to transmit the first identity information to the teacher terminal;
a first identity information determining subunit, configured to acquire the first identity information determined by the teacher terminal.
Optionally, the prompting unit 204 includes:
the first frame image extracting subunit is used for extracting a first frame image of each frame from the first audio and video stream;
the acquiring first facial feature information subunit is used for acquiring corresponding first facial feature information from a facial feature information base on the basis of the first identity information;
The acquiring first face region subunit is used for carrying out face recognition on a first frame image of each frame according to the first face feature information to acquire a first face region of each frame;
a first identity information prompting subunit, configured to prompt the first identity information near a first face area of each first frame image, and generate a second frame image;
and the second audio and video stream generating subunit is used for combining the second frame image based on a preset video format to generate a second audio and video stream.
Optionally, the first identity information prompting subunit further includes:
and the prompting first face area subunit is used for prompting the first face area in each frame of first frame image, prompting the first identity information near the first face area and generating a second frame image.
Optionally, the apparatus further includes:
and the second audio and video stream transmission unit is used for transmitting the second audio and video stream to the student terminal registered with the live broadcast teaching.
Optionally, the apparatus further includes: a switching unit;
in the switching unit, comprising:
the second identity information acquisition subunit is used for acquiring all second identity information registered for the live broadcast teaching;
A transmitting second identity information subunit, configured to transmit the second identity information to the teacher terminal;
a third identity information receiving subunit, configured to receive third identity information determined by the teacher terminal from the second identity information;
and the third audio and video stream transmitting subunit is used for receiving a corresponding third audio and video stream in real time based on the third identity information and transmitting the third audio and video stream to a teacher terminal and/or a student terminal which are registered for live teaching.
According to the embodiment of the disclosure, because the first audio stream is received before the first identity information is determined, network congestion is reduced, delay and stalling during the interaction are avoided, and the teaching interaction proceeds smoothly. The first identity information is indicated in the first audio/video stream, and the generated second audio/video stream is transmitted to the teacher terminal registered for the live teaching session. Among the many students attending the lesson, the teacher can directly see the image and the identity information of the student taking part in the interaction, which narrows the distance between the two sides of the teaching and makes the lesson more engaging. The number of first audio/video streams transmitted to the teacher terminal is limited in a manner determined by the teacher terminal, ensuring that the teaching interaction proceeds in an orderly way. At the same time, the volume of network data is reduced, delay and stalling caused by network congestion are avoided, and the continuity of teaching is preserved.
The embodiment of the present disclosure provides a third embodiment, namely an electronic device for prompting a live teaching target, the electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method of prompting a live teaching target described in the first embodiment.
The fourth embodiment provides a computer storage medium for prompting a live teaching target, the computer storage medium storing computer-executable instructions capable of executing the method of prompting a live teaching target described in the first embodiment.
Referring now to FIG. 3, shown is a schematic diagram of an electronic device suitable for implementing embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), and a vehicle-mounted terminal (e.g., a car navigation terminal), as well as stationary terminals such as a digital TV and a desktop computer. The electronic device shown in FIG. 3 is only an example and should not limit the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 3, the electronic device may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 302 or a program loaded from a storage device 308 into a Random Access Memory (RAM) 303. The RAM 303 also stores various programs and data necessary for the operation of the electronic device. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
Generally, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage devices 308 including, for example, magnetic tape, hard disk, etc.; and a communication device 309. The communication means 309 may allow the electronic device to communicate wirelessly or by wire with other devices to exchange data. While fig. 3 illustrates an electronic device having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication means 309, or installed from the storage means 308, or installed from the ROM 302. The computer program, when executed by the processing device 301, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as C or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description presents only the preferred embodiments of the present disclosure and illustrates the principles of the technology employed. Those skilled in the art will appreciate that the scope of the disclosure is not limited to technical solutions formed by the particular combination of features described above, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure; for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in the present disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (10)

1. A method for prompting a live teaching target is characterized by comprising the following steps:
receiving a first audio stream sent by a first student terminal registered for live teaching;
analyzing the first audio stream and extracting a first voiceprint feature;
acquiring first identity information corresponding to the first voiceprint feature from an object voiceprint library based on the first voiceprint feature;
receiving a first audio and video stream sent by the first student terminal in real time, prompting the first identity information in the first audio and video stream to generate a second audio and video stream, and transmitting the second audio and video stream to a teacher terminal registered for the live teaching; wherein the first audio/video stream includes student images associated with the first identity information.
2. The method according to claim 1, wherein the acquiring, from an object voiceprint library, of the first identity information corresponding to the first voiceprint feature based on the first voiceprint feature comprises:
performing similarity matching between the first voiceprint feature and prestored voiceprint features in the object voiceprint library to obtain a similarity matching result;
judging whether the similarity matching result meets a preset matching threshold;
and if yes, acquiring the first identity information corresponding to the matched prestored voiceprint feature.
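Claim 2 leaves the similarity measure and the threshold value open. Cosine similarity over feature vectors is one common choice; the sketch below uses it, with the caveat that the library layout and the 0.8 threshold are illustrative assumptions, not values fixed by the claim:

```python
import math


def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def match_identity(first_feature, voiceprint_library, threshold=0.8):
    """voiceprint_library: {identity_info: prestored_feature_vector}.
    Returns the identity whose prestored feature best matches the first
    voiceprint feature, provided the similarity meets the preset matching
    threshold; otherwise returns None (no identity is prompted)."""
    best_id, best_score = None, -1.0
    for identity, stored in voiceprint_library.items():
        score = cosine_similarity(first_feature, stored)
        if score > best_score:
            best_id, best_score = identity, score
    return best_id if best_score >= threshold else None
```

Returning `None` below the threshold mirrors the claim's "if yes" branch: only a match that satisfies the preset threshold yields first identity information.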
3. The method according to claim 1, wherein, before the receiving in real time of the first audio and video stream sent by the first student terminal, the method further comprises:
transmitting the first identity information to the teacher terminal;
and acquiring the first identity information determined by the teacher terminal.
4. The method of claim 1, wherein the prompting of the first identity information in the first audio and video stream to generate the second audio and video stream comprises:
extracting a first frame image of each frame from the first audio and video stream;
acquiring corresponding first facial feature information from a facial feature information base based on the first identity information;
performing face recognition on the first frame image of each frame according to the first face feature information to acquire a first face area of each frame;
prompting the first identity information near a first face area of each frame of first frame image, and generating a second frame image;
and combining the second frame images based on a preset video format to generate the second audio and video stream.
5. The method of claim 4, wherein the prompting of the first identity information near the first face area of each first frame image and the generating of the second frame image further comprises:
prompting the first face area in each first frame image, prompting the first identity information near the first face area, and generating the second frame image.
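Claims 4 and 5 reduce to per-frame face localization plus a label placed near the detected region (and, per claim 5, the region itself also prompted). Treating face recognition as a black box that has already returned bounding boxes, a minimal placement sketch follows; all names and the margin value are hypothetical, and a real implementation would draw on the frame pixels via a graphics library:

```python
def place_label(face_box, identity, frame_w, frame_h, margin=4):
    """face_box = (x, y, w, h) in pixels. Places the identity label just
    above the face area; if that would fall outside the frame, places it
    just below instead, so the prompt always stays visible."""
    x, y, w, h = face_box
    label_x = max(0, min(x, frame_w - 1))          # clamp horizontally
    label_y = y - margin if y - margin >= 0 else y + h + margin
    return (label_x, label_y, identity)


def annotate_frame(face_boxes, identity, frame_w=1280, frame_h=720):
    # One labeled position per detected face: the "second frame image"
    # of claim 4 is the original frame plus these overlays.
    return [place_label(box, identity, frame_w, frame_h)
            for box in face_boxes]
```

The fallback placement (below the box when the face touches the top edge) is an implementation choice, not something the claims prescribe; the claims only require the identity to appear "near" the face area.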
6. The method of claim 1, further comprising:
and transmitting the second audio and video stream to a student terminal registered with the live broadcast teaching.
7. The method of claim 1, further comprising:
acquiring all second identity information registered for the live broadcast teaching;
transmitting the second identity information to the teacher terminal;
receiving third identity information determined by the teacher terminal from the second identity information;
and receiving a corresponding third audio-video stream in real time based on the third identity information, and transmitting the third audio-video stream to a teacher terminal and/or a student terminal which are registered for live teaching.
8. An apparatus for prompting a live teaching target, comprising:
the receiving unit is used for receiving a first audio stream sent by a first student terminal registered for live teaching;
an extraction unit configured to analyze the first audio stream and extract a first voiceprint feature;
a first identity information acquiring unit, configured to acquire, from an object voiceprint library, first identity information corresponding to the first voiceprint feature based on the first voiceprint feature;
the prompting unit is used for receiving a first audio and video stream sent by the first student terminal in real time, prompting the first identity information in the first audio and video stream to generate a second audio and video stream, and transmitting the second audio and video stream to a teacher terminal registered for live teaching; wherein the first audio/video stream includes student images associated with the first identity information.
9. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
10. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method of any one of claims 1 to 7.
CN202010586514.5A 2020-06-24 2020-06-24 Method, device, medium and electronic equipment for prompting live broadcast teaching target Pending CN111862705A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010586514.5A CN111862705A (en) 2020-06-24 2020-06-24 Method, device, medium and electronic equipment for prompting live broadcast teaching target

Publications (1)

Publication Number Publication Date
CN111862705A true CN111862705A (en) 2020-10-30

Family

ID=72988112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010586514.5A Pending CN111862705A (en) 2020-06-24 2020-06-24 Method, device, medium and electronic equipment for prompting live broadcast teaching target

Country Status (1)

Country Link
CN (1) CN111862705A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112330997A (en) * 2020-11-13 2021-02-05 北京安博盛赢教育科技有限责任公司 Method, device, medium and electronic equipment for controlling demonstration video
CN112637613A (en) * 2020-11-16 2021-04-09 深圳市声扬科技有限公司 Live broadcast audio processing method and device, computer equipment and storage medium
CN112907408A (en) * 2021-03-01 2021-06-04 北京安博创赢教育科技有限责任公司 Method, device, medium and electronic equipment for evaluating learning effect of students

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102176713A (en) * 2011-03-22 2011-09-07 宋健 Method for realizing multiuser network video chat system for intensifying simplex video quality
CN104834849A (en) * 2015-04-14 2015-08-12 时代亿宝(北京)科技有限公司 Dual-factor identity authentication method and system based on voiceprint recognition and face recognition
CN105551330A (en) * 2015-12-25 2016-05-04 北京荣之联科技股份有限公司 Cloud classroom system
CN106488173A (en) * 2015-08-26 2017-03-08 宇龙计算机通信科技(深圳)有限公司 A kind of implementation method of mobile terminal video meeting, device and relevant device
CN108600656A (en) * 2018-04-19 2018-09-28 北京深醒科技有限公司 The method and device of facial label is added in video
WO2018213481A1 (en) * 2017-05-16 2018-11-22 Sportscastr.Live Llc Systems, apparatus, and methods for scalable low-latency viewing of integrated broadcast commentary and event video streams of live events, and synchronization of event information with viewed streams via multiple internet channels
CN109413002A (en) * 2017-08-16 2019-03-01 Tcl集团股份有限公司 A kind of classroom interaction live broadcasting method, system and terminal
CN110163777A (en) * 2019-04-17 2019-08-23 平安科技(深圳)有限公司 A kind of public good assistance to improve schooling in backward areas management method and device
CN110609619A (en) * 2019-08-27 2019-12-24 格局商学教育科技(深圳)有限公司 Multi-screen live broadcast interactive system based on panoramic immersion type teaching
CN110620934A (en) * 2019-08-27 2019-12-27 格局商学教育科技(深圳)有限公司 Interaction method and device for live broadcast teaching


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201030)