WO2021218379A1 - Multimedia interaction method, apparatus, device, and storage medium - Google Patents
Multimedia interaction method, apparatus, device, and storage medium
- Publication number
- WO2021218379A1 (PCT application PCT/CN2021/079166)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- control instruction
- multimedia
- image
- multimedia interactive
- frame
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
- G06Q50/2057—Career enhancement or continuing education service
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/63—Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
- H04N21/637—Control signals issued by the client directed to the server or network components
- H04N21/6371—Control signals issued by the client directed to the server or network components directed to network
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/06—Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
- G09B5/065—Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/414—Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/4312—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
- H04N21/4334—Recording operations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
- H04N21/4355—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream involving reformatting operations of additional data, e.g. HTML pages on a television screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/4394—Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/4398—Processing of audio elementary streams involving reformatting operations of audio signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44016—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/442—Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
- H04N21/44213—Monitoring of end-user related data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/443—OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
- H04N21/4438—Window management, e.g. event handling following interaction with the user interface
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/63—Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
- H04N21/647—Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
- H04N21/64746—Control signals issued by the network directed to the server or the client
- H04N21/64761—Control signals issued by the network directed to the server or the client directed to the server
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/66—Remote control of cameras or camera parts, e.g. by remote control devices
- H04N23/661—Transmitting camera control signals through networks, e.g. control via the Internet
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/765—Interface circuits between an apparatus for recording and another apparatus
- H04N5/77—Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera
Definitions
- the present disclosure relates to the field of artificial intelligence technology, and in particular to a multimedia interaction method, device, equipment, and storage medium.
- online teaching platforms have become more and more popular, and online teaching usually requires interaction between teachers and students, or among students, to improve the teaching effect and experience.
- the present disclosure provides a multimedia interaction method, device, equipment and storage medium.
- the first aspect of the embodiments of the present disclosure provides a multimedia interaction method, including: calling a multimedia interaction component of a teaching platform; using the multimedia interaction component to obtain a control instruction; using the multimedia interaction component to extract information based on the control instruction; and displaying or playing the extracted information through the multimedia interaction component.
- the control instruction includes at least one of an image shooting control instruction, a video recording control instruction, and an audio recording control instruction.
- in this way, image, video, or audio information can be obtained conveniently and quickly, without additional equipment, when interacting on the teaching platform, which improves the effect of online teaching.
- the use of the multimedia interaction component to extract information based on the control instruction includes: when the control instruction is an image shooting control instruction, triggering a camera device according to the image shooting control instruction to acquire one frame of image; and/or, when the control instruction is a video recording control instruction, triggering the camera device according to the video recording control instruction to acquire multiple frames of images; and/or, when the control instruction is an audio recording control instruction, triggering a recording device according to the audio recording control instruction to perform audio recording.
- in this way, the multimedia interaction component can be used directly for video recording, image shooting, and audio recording when interacting on the teaching platform; no additional equipment is required, which improves the convenience of information acquisition and makes the effect of online teaching better.
- the method further includes: acquiring the number of frames transmitted per second for the multiple frames of images, and selectively playing the multiple frames of images according to the correspondence between frame number and time given by that frames-per-second value. In this way, when interacting on the teaching platform, playback can follow the image transmission frame rate, making the video presentation smoother and the interaction more flexible.
- the step of displaying or playing the extracted information through the multimedia interaction component includes: playing the multiple frames of images sequentially in order of acquisition time, which makes the playback of the multiple frames smooth and saves the time and processing needed to save the images as a video file; or synthesizing the multiple frames of images into a video file and playing the video file, so that the video can be played normally.
- the step of using the multimedia interaction component to extract information based on the control instruction includes: using the multimedia interaction component to extract the information from pre-stored preset information based on the control instruction.
- in this way, a local preset voice or preset image is called through the multimedia interaction component to imitate recording or image capture, and there is no need to upload multimedia files such as student videos and photos to a server for processing. Therefore, the teaching interaction does not need to rely on high bandwidth, and real-time performance is better.
- the use of the multimedia interaction component to extract information based on the control instruction further includes: setting the timing duration of a timer according to the control instruction, and, when the timer period is reached, controlling the multimedia interaction component to extract information.
- in this way, the interaction on the teaching platform is more flexible.
- after the extracted information is displayed or played through the multimedia interaction component, the method further includes: controlling the displayed or played window to perform any of position movement, window zooming, and window hiding. In this way, the interaction on the teaching platform is more flexible.
- said displaying or playing the extracted information through the multimedia interaction component further includes: obtaining a preset code through the multimedia interaction component, and preprocessing the extracted information according to the preset code;
- the preprocessed information is displayed or played through the multimedia interactive component.
- in this way, the user can obtain the required information through the multimedia interaction component and preprocess it according to the preset code to obtain a processing result, which better matches teaching requirements and helps improve the efficiency of programming learning. Because the information processing is completed on the local device, there is no need to transmit the information to a server for processing, which reduces dependence on bandwidth, reduces network interaction, and increases the information processing rate.
- the preprocessing includes: performing image processing operations on the acquired one frame of image or multiple frames of images, and/or performing at least one of speech noise reduction, speech-to-text, and speech synthesis on the acquired audio.
- the method further includes: while the information is being played, acquiring one frame of image from the multiple frames of images or the one frame of image being played, that is, performing a screenshot operation.
- a second aspect of the embodiments of the present disclosure provides a multimedia interaction apparatus, including: a calling module configured to call a multimedia interaction component of a teaching platform; an input module configured to use the multimedia interaction component to obtain a control instruction; an information extraction module configured to use the multimedia interaction component to extract information based on the control instruction; and an output module configured to display or play the extracted information through the multimedia interaction component.
- a third aspect of the embodiments of the present disclosure provides a multimedia interaction device, including a memory and a processor, wherein the memory stores program instructions, and the processor retrieves the program instructions from the memory to execute the multimedia interaction method described in any one of the foregoing items.
- a fourth aspect of the embodiments of the present disclosure provides a computer-readable storage medium that stores a program file, and the program file can be executed to implement the multimedia interaction method described in any one of the foregoing.
- a fifth aspect of the embodiments of the present disclosure provides a computer program product, including computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method described in the first aspect.
- the present disclosure obtains control instructions through the multimedia interaction components by calling the multimedia interaction components of the teaching platform, then extracts information based on the control instructions, and displays or plays the extracted information through the multimedia interaction components.
- in this way, multimedia interaction is carried out based on the multimedia interaction components, making multimedia interaction more flexible and convenient and improving the teaching effect.
- FIG. 1 is a schematic flowchart of an embodiment of the multimedia interaction method of the present disclosure.
- FIG. 2 is a schematic diagram of the connection structure between the web front end and the local machine in the multimedia interaction method of the present disclosure.
- FIG. 3 is a schematic flowchart of another embodiment of the multimedia interaction method of the present disclosure.
- FIG. 4A is a schematic flowchart of another embodiment of the multimedia interaction method of the present disclosure.
- FIG. 4B is a schematic diagram of the overall framework of the multimedia interaction method of the present disclosure.
- FIG. 5 is a schematic structural diagram of the multimedia interaction apparatus of the present disclosure.
- FIG. 6 is a schematic structural diagram of the multimedia interaction device of the present disclosure.
- FIG. 7 is a schematic structural diagram of a computer-readable storage medium of the present disclosure.
- Multimedia interaction on the teaching platform mainly refers to human-computer interaction through photographs, audio recordings, video recordings, and screenshots during the teaching process.
- in related solutions, such interaction often depends on additional external equipment and cloud processing, which makes multimedia interaction more difficult. Therefore, the embodiments of the present disclosure provide a multimedia interaction method.
- after the teaching platform uses the multimedia interaction component to obtain the control instruction input by the user, it directly extracts information based on the control instruction through the multimedia interaction component.
- if the extracted information needs to be preprocessed according to a preset code, the information is preprocessed and the processed information is displayed or played. Therefore, the information can be obtained and processed on the local machine, and multimedia interaction can be carried out without the aid of external equipment.
- FIG. 1 is a schematic flowchart of an embodiment of the disclosed multimedia interaction method.
- Step S11: Call the multimedia interaction component of the teaching platform.
- the teaching platform is a network teaching system logged in through a native browser, such as a programming teaching platform and an artificial intelligence teaching platform.
- the multimedia interaction component may be a preset component in the teaching platform that performs processing operations such as acquiring multimedia information.
- it may be a component in the teaching platform that calls the local camera to acquire images and transmits the images to the local computer for operation.
- the multimedia interactive component can be called according to the user's operation on the teaching platform.
- the server running on the local machine is preset software that has been developed in advance (equivalent to a local engine). It can be downloaded from the teaching platform and installed on the local device (the local machine), and it is used to implement the functions of the multimedia interaction component.
- the multimedia interactive component needs to be activated, for example, the user clicks the button of the multimedia interactive component on the teaching platform or the user enters the user code to call the multimedia interactive component, etc.
- when the teaching platform calls the multimedia interaction component, the web front end connects to the server running on the local machine.
- the web front end and the server running on the machine are connected through a communication interface.
- in one embodiment, the web front end 201 and the server 202 running on the machine are connected through a socket input/output port 203 (socket IO), as shown in FIG. 2.
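- as an illustration of such a socket IO link, the following is a minimal sketch of the local-engine side, assuming the python-socketio and eventlet packages; the event name "control" and the port number are illustrative and not specified by the present disclosure.

```python
# Minimal sketch of the local-engine side of the socket IO link (assumptions:
# python-socketio + eventlet; event name and port are illustrative).
import eventlet
import socketio

sio = socketio.Server(cors_allowed_origins="*")  # let the web front end connect
app = socketio.WSGIApp(sio)

@sio.event
def connect(sid, environ):
    print("web front end connected:", sid)

@sio.on("control")
def on_control(sid, data):
    # data might be e.g. {"type": "image_shooting"} sent by the web front end;
    # the local engine would dispatch it to the multimedia interaction component.
    print("received control instruction:", data)
    sio.emit("ack", {"status": "ok"}, to=sid)

if __name__ == "__main__":
    eventlet.wsgi.server(eventlet.listen(("127.0.0.1", 5000)), app)
```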
- the front end of the webpage may be a browser
- the browser may be a general-purpose browser on a computer, such as 360 Browser, Baidu Browser, Google Chrome, QQ Browser, or Sogou Browser, or another type of browser, which is not limited here; in another embodiment, the front end of the webpage may also be application (APP) software, such as a third-party application of a smart device.
- the front end of the webpage can be a programming teaching interface of a browser or application software.
- Step S12: Obtain a control instruction using the multimedia interaction component.
- the control instruction is an instruction input by the user, such as a code instruction entered by the user, or a control instruction input remotely through another device.
- the control instruction may be a voice control instruction, or a manually triggered control instruction, such as a control instruction triggered by a button.
- the control instruction may also be an automatically triggered control instruction. For example, a timer is set in the multimedia interaction component, and the control instruction is generated when the timer period is reached; or, when the user enters an experimental course that requires multimedia interaction, the teaching platform automatically generates the control instruction corresponding to that experimental course.
- the teaching platform uses the multimedia interactive component to obtain the control instruction.
- the control instruction includes at least one of image shooting control instructions, video recording control instructions, and audio recording control instructions.
- in the embodiments of the present disclosure, the multimedia interaction component can be used to perform operations such as taking pictures, recording audio, recording video, and taking screenshots.
- Step S13: Use the multimedia interaction component to extract information based on the control instruction.
- the teaching platform uses the multimedia interactive component to extract information according to the control instruction.
- when the control instruction is an image shooting control instruction, the camera device is triggered according to the image shooting control instruction to acquire one frame of image.
- the multimedia interactive component is triggered to control the camera device of the machine to take pictures, and the camera device can be the camera of the machine that logs in to the teaching platform, or it can be an external camera.
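- a minimal sketch of this single-frame acquisition, using OpenCV (the cross-platform computer vision library named later in this disclosure); the device index and the output filename are assumptions:

```python
# Trigger the camera device and acquire one frame of image (sketch).
import cv2

cap = cv2.VideoCapture(0)            # open the machine's camera (or an external one)
ok, frame = cap.read()               # grab a single frame
cap.release()
if ok:
    cv2.imwrite("photo.jpg", frame)  # keep the frame so it can be displayed later
```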
- the timing time of the timer may also be set, and when the timing period of the timer is reached, the multimedia interaction component is controlled to extract information.
- for example, if the timer is set to 5 seconds, then starting from the moment the control instruction is received, the camera is automatically activated to take a picture when the 5th second is reached; 5 seconds after that picture is taken, the camera device is activated again to take pictures.
- the photographing process can end when the next control instruction is received, or the number of photographs can be set at the same time when the timing time is set, and the photographing will automatically stop when the number of photographs is reached.
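- the timed photographing described above might look like the following sketch; the 5-second interval follows the example in the text, while the shot count and the helper name are illustrative:

```python
# Timer-driven photographing: take a picture every `interval` seconds and stop
# automatically once `max_shots` pictures have been taken (sketch).
import time
import cv2

def timed_photographing(interval=5.0, max_shots=3):
    cap = cv2.VideoCapture(0)
    for shot in range(max_shots):
        time.sleep(interval)            # wait for the timer period to elapse
        ok, frame = cap.read()          # timer fires: take one picture
        if ok:
            cv2.imwrite(f"photo_{shot}.jpg", frame)
    cap.release()                       # stop after the set number of photographs
```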
- the teaching platform uses the multimedia interaction component to trigger the camera device according to the video recording control instruction, for example, a built-in or external camera of the machine logged in to the teaching platform, to acquire multiple frames of images; the multiple frames of images are connected to form a video.
- the timing duration of the timer may also be set, and the multimedia interaction component is controlled to extract information when the timer period is reached. For example, if the timer is set to 5 seconds, then starting from the moment the control instruction is received, the camera is automatically activated for video recording when the 5th second is reached; 5 seconds after that recording is completed, the camera device is activated again for video recording.
- video recording can end when the next control instruction is received, or the number of recordings can be set when the timing is set; when that number is reached, recording stops automatically.
- in one embodiment, during video recording, the audio is muted and audio recording is not performed, that is, the recorded video does not include sound information.
- audio recording may also be performed at the same time, that is, the recorded video also includes sound information.
- the teaching platform uses a multimedia interactive component to trigger a recording device, such as a recorder, a microphone, etc., to perform audio recording according to the audio recording control instruction.
- the timing duration of the timer can also be set, and the multimedia interaction component is controlled to extract information when the timer period is reached. For example, if the timer is set to 5 seconds, then starting from the moment the control instruction is received, the recording device is automatically activated for audio recording when the 5th second is reached; 5 seconds after that recording is completed, the recording device is activated again for audio recording.
- audio recording can end when the next control instruction is received, or the number of recordings can be set when the timing is set, with audio recording stopping automatically when that number is reached. It should be noted that, in one embodiment, the camera device can also be turned on for video recording during the audio recording process, so that the obtained recording also includes image information.
- Step S14: Display or play the extracted information through the multimedia interaction component.
- when the extracted information is one frame of image information, that is, photographing information, the frame of image is displayed through the multimedia interaction component after it is acquired.
- when multiple frames of images are displayed, they can be displayed at a set time frequency, or played and displayed according to the transmission rate of the image frames.
- in the embodiments of the present disclosure, the number of frames transmitted per second of the multiple frames of images is acquired, and the multiple frames of images are selectively played according to the correspondence between frame number and time given by that frames-per-second value. For example, suppose 1000 frames of images are acquired in total, and 200 frames are transmitted per second during acquisition. To play from the 3rd second of video, playback can start directly from the 401st frame; conversely, to play from the 401st frame, the progress can be dragged directly to the 3rd second.
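- the frame/time correspondence in this example can be made concrete with a short OpenCV sketch; the filename is an assumption:

```python
# With 200 frames transmitted per second, the 3rd second of video starts at
# frame 401, so playback can jump straight to that frame (sketch).
import cv2

fps = 200
target_second = 3
start_frame = fps * (target_second - 1)        # 400 frames elapse in seconds 1-2

cap = cv2.VideoCapture("lesson.avi")           # filename is an assumption
cap.set(cv2.CAP_PROP_POS_FRAMES, start_frame)  # position at the 401st frame
ok, frame = cap.read()                         # next read returns frame 401
cap.release()
```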
- the timing duration of the timer can also be set so that, at each timer tick, one frame of the multiple frames of images is acquired for display, iterating in sequence. For example, if the timing duration is 1 second, one frame of image is played each time one second elapses, which slows down the playback speed of the video so that it can be seen clearly during the interaction and deepens the memory.
- a screenshot of the video being played can also be taken. For example, one frame of image can be obtained from the multi-frame image or one frame of image being played, that is, the screenshot operation can be performed. After the screenshot has acquired the current frame of image, it can be displayed through the display window of the multimedia interactive component at the same time, or image processing operations such as target recognition can be performed on the screenshot image.
- when the extracted information is audio information, it is played through the display window of the multimedia interaction component.
- when the recorded audio is played through the multimedia interaction component, it can also be converted to text, that is, the text corresponding to the audio can be displayed while the audio is played, so that the meaning of the audio can be understood.
- the multimedia interaction component can be used to extract the information from the pre-stored preset information based on the control instruction .
- in one embodiment, the teaching interaction can be performed with a preset voice imitating a recording; in another embodiment, the teaching interaction can be performed with a preset image imitating the image acquisition process.
- the teaching platform can implement voice playback by calling pre-recorded voice content.
- the user pre-records preset voice content and preprocesses the voice content to imitate the audio recording and processing process in actual teaching to achieve the purpose of teaching.
- in this way, calling a local preset voice to imitate recording, or a local preset image to imitate image capture, through the multimedia interaction component does not require uploading multimedia files such as student videos and photos to a server for processing. Therefore, this teaching interaction does not need to rely on high bandwidth, and real-time performance is better.
- the teaching platform controls the pop-up question window to display the questions that need to be answered.
- the user inputs the audio recording control instruction through the front end of the webpage, that is, the question window, and the teaching platform calls the multimedia interactive component according to the audio recording control instruction to directly record the voice, and recognizes the recorded voice to check whether it is correct.
- for example, the teaching platform pops up the window "Who is the author of Shiji?", the user enters the answer "Sima Qian" through the multimedia interaction component, and the teaching platform uses the multimedia interaction component to preprocess the user's answer and then recognize and verify it to check whether the answer is correct, in order to enrich the interactive teaching methods.
- the teaching platform will pop up and simultaneously display multiple selectable answers based on the preset voice content. For example, the teaching platform pops up the window "Who is the author of Shiji?" and calls multiple pre-recorded answers in the multimedia interactive component.
- the teaching platform pops up multiple answer windows, such as "Sima Qian", "Luo Guanzhong", and "Shi Nai'an", and the user clicks the selected answer directly, for example, the voice answer "Sima Qian".
- the teaching platform recognizes and verifies the user's answer to check whether the answer is correct, so as to achieve the purpose of enriching the interactive teaching methods.
- in actual application, the teaching platform can pop up a voice recording window and call the preset voice content; at the same time, it can also simulate the voice recording process and then process the called preset voice content, playing and/or displaying the resulting voice or text information (for example, after voice noise reduction or voice-to-text), so as to simulate the voice recording and processing process without networking and achieve the teaching effect.
- in the above solution, a control instruction is obtained through the multimedia interaction component, the multimedia interaction component is used to extract information based on the control instruction, and the extracted information is displayed or played through the multimedia interaction component. Therefore, multimedia interaction can be completed using components of the teaching platform running locally, and there is no need to call external special equipment for information extraction and then upload the extracted information to the machine, which simplifies the operation, makes multimedia interaction more engaging, and improves the teaching effect.
- in the embodiments of the present disclosure, the locally running server is installed on the computer and connected with the browser, so that the multimedia interaction component of the teaching platform can be called to realize the multimedia interaction function.
- the teaching platform in the embodiments of the present disclosure may involve computer vision scenarios, such as face recognition, image recognition, object tracking and other algorithms.
- the server running on the machine can call the camera and microphone of the machine for multimedia interaction during operation. Users can take pictures and record independently to get the video, photos or audio they want. Therefore, there is no need to call an external device for information extraction and upload the extracted information to the machine, which simplifies the operation, improves the interest of multimedia interaction, and makes the teaching effect better.
- the multimedia interaction components in the embodiments of the present disclosure can realize local voice or image processing, without uploading multimedia files such as voice, video, or images to a server for processing; they do not need to rely on high bandwidth, and at the same time real-time performance is better.
- FIG. 3 is a schematic flow chart of another embodiment of the multimedia interactive method of the present disclosure.
- Steps S31, S32, and S33 are the same as steps S11, S12, and S13 in the first embodiment; please refer to FIG. 1 and the related text, and the descriptions will not be repeated here.
- the difference between this embodiment and the embodiment corresponding to FIG. 1 is that this embodiment further includes after step S33:
- Step S34: Obtain a preset code through the multimedia interaction component, and preprocess the extracted information according to the preset code.
- the teaching platform obtains the preset code through the multimedia interactive component, and then preprocesses the extracted information according to the preset code.
- the preset code is the code written into the multimedia interactive component.
- in the embodiments of the present disclosure, the teaching platform uses the preset code written into the multimedia interaction component to perform the preprocessing.
- the preprocessing includes performing image processing on one or more frames of images obtained, or performing any one or more of speech noise reduction, speech to text, and speech synthesis on the obtained audio.
- target recognition or target tracking can be performed on the target in one frame of image or multiple frames of image according to the preset code.
- the preset code may be a preset model integrated in the multimedia interaction component. The preset model may be a model integrating a neural network algorithm capable of target recognition or target tracking, and of course it may also integrate other algorithms that can perform target recognition or target tracking.
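- as a hedged illustration of such a preset model, the following sketch uses an OpenCV Haar-cascade face detector in place of the neural network the disclosure mentions; the image filename is an assumption:

```python
# Target recognition on one frame with a classical detector standing in for
# the preset model (sketch); the cascade file ships with OpenCV.
import cv2

model = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("photo.jpg")                         # frame from the camera
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = model.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:                              # mark recognized targets
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```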
- the audio information can be processed according to the preset code, such as speech noise reduction, speech to text, and speech synthesis.
- the multimedia interactive component uses a preset code to perform noise reduction processing on the audio file.
- in one embodiment, the multimedia interaction component may perform speech synthesis processing on the audio file through a preset code; in another embodiment, the multimedia interaction component can also convert the acquired audio information to text and then display it. The conversion to text can be performed during audio playback or, of course, before voice playback, which is not limited here.
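- a hedged sketch of the speech-to-text preprocessing, assuming the third-party SpeechRecognition package with the offline pocketsphinx engine (the disclosure does not name a specific engine; an offline engine fits its no-upload, local-processing design):

```python
# Offline speech-to-text (sketch); package, engine, and filename are assumptions.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("recording.wav") as source:   # filename is an assumption
    audio = recognizer.record(source)           # read the whole audio file
text = recognizer.recognize_sphinx(audio)       # offline speech-to-text
print(text)                                     # display text alongside playback
```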
- the bit rate and channel number of the extracted images, videos, and audios need to meet the requirements of the multimedia interactive component.
- the supported bit rate and the number of channels can be set according to the user code or preset processing algorithm to reduce the chance of recognition errors when preprocessing the extracted information.
- the preset code may also be a code input by the user.
- when the user needs to perform an information processing operation (such as a target detection operation), the user code is input in the teaching platform, and the code may indicate the processing to be performed. After the multimedia interaction component obtains the user code, it can parse the code to determine which kind of preprocessing needs to be performed on the information, and then call the corresponding algorithm module for information preprocessing.
- the preset code may also be a code instruction obtained at the same time as the control instruction is obtained. When the multimedia interactive component obtains the control instruction, it can parse and obtain the control instruction and the preset code at the same time.
- Step S35: Display or play the preprocessed information through the multimedia interaction component.
- the acquired audio, video, and image can also be saved in a designated folder.
- the recorded video is saved in a designated folder, and during playback, the video in the folder is automatically opened for playback.
- the acquired audio, video, and image can be displayed or played directly, and there is no need to save.
- the multi-frame image is played and displayed through the multimedia interactive component.
- when the multiple frames of images that make up a video are acquired through the multimedia interaction component, and in particular when the acquired multiple frames are preprocessed, that is, subjected to image processing (for example, target detection or target recognition), the processed multiple frames of images form an image set.
- the images in the image collection do not include time information, but each frame of image acquisition itself has a time point, that is, each frame of image has its corresponding acquisition time.
- displaying the images in this way makes the multi-frame processed images play smoothly, while saving the time and processing needed to save the images into a video file.
- the obtained multiple frames of images may also be combined into a video file before being played and displayed.
- in the embodiments of the present disclosure, a control instruction is acquired through a locally running multimedia interaction component; after the multimedia interaction component is used to extract information based on the control instruction, the acquired information is further preprocessed through the multimedia interaction component using a preset code, for example, by performing image processing operations such as target recognition and target tracking on one or more acquired frames of images, or performing any one or more of speech noise reduction, speech-to-text, and speech synthesis on the acquired audio. After preprocessing, the information is displayed or played through the multimedia interaction component. Therefore, there is no need to call an external device for information extraction and upload the extracted information to the machine, which simplifies the operation, makes multimedia interaction more engaging, and improves the teaching effect.
- FIG. 4A is a schematic flowchart of another embodiment of the multimedia interaction method of the present disclosure, in which steps S41, S42, S43, S44, and S45 are the same as steps S31, S32, S33, S34, and S35 described in FIG. 3, except that this embodiment of the present disclosure further includes:
- Step S46: Control the displayed or played window to perform any operation of position movement, window zooming, and window hiding.
- in the embodiments of the present disclosure, the displayed or played window can be dragged to move its position, zoomed to change its size, or hidden, making multimedia interaction more flexible. For example, if the displayed or played window blocks the current display interface for teaching, live broadcast, or chat, the blocked interface can be revealed by moving, hiding, or zooming the window. For another example, if the displayed or played window is too small and the video or image playback interface is not clear, the window can be zoomed in.
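- in PyQt terms (the platform named below), these window operations reduce to a few calls; a minimal sketch, with a label widget standing in for the display/playback window:

```python
# Position movement, window zooming (resizing), and window hiding (sketch).
from PyQt5.QtWidgets import QApplication, QLabel

app = QApplication([])
window = QLabel("playback window")
window.show()

window.move(100, 100)      # position movement: drag the window to (100, 100)
window.resize(640, 480)    # window zooming: change the window size
window.hide()              # window hiding: reveal whatever it was blocking
window.show()              # ...and show it again when needed

app.exec_()
```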
- a series of multimedia interactive interfaces based on the PyQt platform are implemented, which mainly involve multimedia interactive interfaces such as taking photos, videos, recordings, screenshots, playing audio, and playing videos.
- the multimedia interaction components are packaged directly into an installation package. After the machine logs in to the teaching platform and the components are downloaded and installed, multimedia interactions such as photographing, video recording, and audio recording can be performed directly, without other dependencies such as decoders. This is simple, clear, easy to operate, and conducive to the convenient realization of multimedia interaction in online education.
- the teaching platform is designed and developed to have a local engine, which is installed in the machine, and started as a service carrier for running user-written codes, and communicates with the browser front end through socket input and output ports.
- the teaching platform is also designed and encapsulated with a multimedia interactive code library, involving computer vision scenarios such as face recognition, image recognition, object tracking and other algorithm teaching, which can be called by the local engine.
- students are supported in calling the encapsulated algorithm modules. When the code is running, they can interact through pop-up windows: students can take pictures and record independently to obtain the video and photo resources they want; if they need to select objects, they can also call the screenshot function and drag the mouse to take a screenshot, then call the algorithm of the course for processing, and finally call the play-video or display-picture interface to show the final result of the algorithm.
- the user inputs an audio recording control instruction, and the multimedia interactive component performs voice recording based on the control instruction.
- the second code can also be used to perform processing operations on the audio such as speech synthesis, speech analysis, and speech-to-text.
- the playback window can also be displayed.
- the playback window can include playback speed, playback progress bar, etc., and the user can also control the playback window to move, zoom, and hide.
- the audio may or may not include video images.
- in some embodiments, the audio processing module (pyAudio) is used for audio acquisition, and the wave component is then used to convert the acquired audio into the standard file format used for recording, which supports setting parameters such as bit rate and number of channels and can therefore support artificial intelligence algorithm requirements more flexibly. During audio playback, the QMedia component, which handles standard audio playback formats, is used for playback, and the main functions of play, pause, dragging, and time display are realized on the main interface.
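- a sketch of this acquisition pipeline under the stated assumptions: pyAudio captures the audio, and the standard-library wave module writes it as a WAV file; the sample rate, channel count, and filename are example values:

```python
# Audio acquisition with pyAudio, saved via the wave module (sketch).
import wave
import pyaudio

RATE, CHANNELS, CHUNK, SECONDS = 16000, 1, 1024, 5

p = pyaudio.PyAudio()
sample_width = p.get_sample_size(pyaudio.paInt16)
stream = p.open(format=pyaudio.paInt16, channels=CHANNELS,
                rate=RATE, input=True, frames_per_buffer=CHUNK)
frames = [stream.read(CHUNK) for _ in range(int(RATE / CHUNK * SECONDS))]
stream.stop_stream()
stream.close()
p.terminate()

with wave.open("recording.wav", "wb") as wf:   # standard recording file format
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(sample_width)
    wf.setframerate(RATE)
    wf.writeframes(b"".join(frames))
```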
- in some embodiments, the user inputs an image shooting control instruction, and the multimedia interaction component performs image shooting based on the control instruction to obtain one frame of image.
- the second code can also be used to perform operations such as target recognition and target tracking on the image.
- the user can also perform operations such as moving, zooming, and hiding on the image display window.
- a cross-platform computer vision library such as OpenCV can be triggered to acquire the camera image, and then displayed on a multimedia interactive interface such as the main interface of the PyQt platform.
- when a picture is taken, the current frame at the moment the picture was triggered can be saved.
- the user inputs a video recording control instruction, and the multimedia interactive component performs video recording based on the control instruction, and obtains multiple frames of images.
- the second code can also be used to perform processing operations on the video such as target recognition, target tracking, speech synthesis, speech analysis, and speech-to-text.
- while the captured video is displayed and played, a screenshot can also be taken on the displayed video to obtain one frame of image, and the captured image can then be processed.
- the user can also perform operations such as moving, zooming, and hiding on the video playback window.
- the video may or may not contain audio information.
- the cross-platform computer vision library is regularly triggered to obtain the camera image, and then displayed on the main interface of the PyQt platform.
- during recording, each frame can be written to a local video file through the cross-platform computer vision library; taking a photo saves the current frame at the moment the photo is triggered.
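- a sketch of this per-frame writing with OpenCV's VideoWriter; the codec, frame rate, frame size, and filename are examples:

```python
# Write each captured frame to a local video file during recording (sketch).
import cv2

cap = cv2.VideoCapture(0)
fps, size = 20.0, (640, 480)
fourcc = cv2.VideoWriter_fourcc(*"XVID")
writer = cv2.VideoWriter("recording.avi", fourcc, fps, size)

for _ in range(int(fps * 5)):                   # record about 5 seconds
    ok, frame = cap.read()
    if not ok:
        break
    writer.write(cv2.resize(frame, size))       # write each frame to the file

cap.release()
writer.release()
```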
- when the video is played, the video file is opened based on the cross-platform computer vision library, and a timer triggers the acquisition of images according to the frames-per-second (FPS) information of the video and displays them on the main interface, with support for functions such as dragging and pausing.
- the multimedia interactive component obtains an image collection after processing the video through the algorithm.
- the multimedia interaction component also supports taking the image collection as an input parameter, which can be played without additionally saving it as a video file: a timer triggers retrieval of one frame of the image collection at a time for display, iterating in sequence. The effect is similar to that of a video player, including pause and progress-bar dragging, and realizes a more flexible multimedia display.
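- a hedged sketch of this timer-driven playback of an image collection, assuming PyQt5 and OpenCV; the filenames and the frame interval are illustrative:

```python
# Play an image collection without saving it as a video file: a QTimer shows
# the next image of the set on each tick, iterating in sequence (sketch).
import cv2
from PyQt5.QtCore import QTimer
from PyQt5.QtGui import QImage, QPixmap
from PyQt5.QtWidgets import QApplication, QLabel

app = QApplication([])
label = QLabel()
label.show()

images = [cv2.imread(f"frame_{i}.jpg") for i in range(100)]  # processed image set
index = 0

def show_next():
    global index
    if index >= len(images):
        timer.stop()                      # end of the collection
        return
    frame = cv2.cvtColor(images[index], cv2.COLOR_BGR2RGB)
    h, w, _ = frame.shape
    qimg = QImage(frame.data, w, h, 3 * w, QImage.Format_RGB888)
    label.setPixmap(QPixmap.fromImage(qimg))  # pixmap copies the frame data
    index += 1

timer = QTimer()
timer.timeout.connect(show_next)
timer.start(50)                           # 50 ms per frame, i.e. about 20 FPS
app.exec_()
```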
- the multimedia interaction method of the embodiments shown in FIG. 1 to FIG. 3 of the present disclosure uses multimedia interaction components to extract information (such as video recording, photographing, and audio recording) and uses local resources for multimedia interaction, with no need to purchase additional teaching and research equipment. In addition, the extracted information can be preprocessed through the multimedia interaction components, without uploading to the cloud and without additional bandwidth, saving computing resources and network resources.
- the present disclosure realizes video recording, audio recording, image shooting, screenshots, and so on based on the local machine, without third-party decoders or programs; it is only necessary to install the web-side components on the local machine so that the first code can be run through the locally running server to call the multimedia interaction components. The requirements on the host are therefore low, and the operating environment is clean, with no extra dependencies.
- The multimedia interaction component is capable of multiple kinds of multimedia interaction, which is more in line with teaching needs. Compared with performing multimedia interaction with external equipment, it reduces additional conversion steps.
- Multimedia interaction mainly refers to scenes such as taking photos, recordings, videos, and screenshots.
- For some artificial intelligence algorithms in computer vision, supporting students in taking pictures for image processing (such as face recognition and object recognition), and using video and screenshot interaction for object tracking, will greatly enhance the interest of teaching.
- Recordings can be used for demonstration teaching of speech recognition; combined with speech-to-text, students can experience voice control, and the playback function can be used to experience speech generation.
- Artificial intelligence teaching inevitably involves teaching content in computer vision and natural speech processing. In order to increase interactivity and interest, supporting students or teachers in recording, photographing, and video recording for algorithm display is an indispensable functional requirement. Some existing approaches provide complete embedded equipment for video and audio recording, requiring the additional purchase of teaching and research equipment. Others collect videos and images with local equipment, upload them to the cloud for algorithm processing, and then return the results. In addition, some do not provide a local green operating environment and need additional dependencies installed in order to run normally.
- This example implements a series of multimedia interactive interfaces based on the PyQt platform, which mainly involve multimedia interactive interfaces such as taking photos, videos, recordings, screenshots, playing audio, and playing videos, and realizes the use of local resources for interaction without the need to purchase additional teaching and research equipment.
- The multimedia interactive interface is directly packaged into an installation package. After it is installed on the device, no other decoders or dependencies are required, and multimedia interactions such as taking pictures, videos, and recordings can be performed directly. It is simple and easy to operate, and is an excellent realization of online education and multimedia interaction.
- this example is also deeply customized in conjunction with education scenarios.
- the multimedia interactive interface provided in this example is implemented based on the PyQt platform.
- The web front end 410 and the local engine 420 are connected through a communication interface, and the web front end 410 initiates scheduling; that is, the user's operation on the web front end triggers the local engine 420 to run on the local machine. The local engine 420 calls a multimedia application programming interface (API) to pop up a display window 430, which is the multimedia interactive interface for playing or displaying the extracted information; for example, the captured image is displayed.
- The user can also control the display window to perform operations such as moving, zooming, and hiding.
- The multimedia interaction component can perform a variety of multimedia interactions, which is more in line with teaching needs. Compared with performing multimedia interaction with external equipment, it reduces additional conversion steps.
- the web front end 410 may be a programming teaching interface of a browser or application software.
- The local engine 420 is a server running locally, and is preset software obtained through research and development.
- After logging in to the teaching platform through the local device, that is, the local computer, the local engine 420 can run on the local device.
- the display window 430 may also be a question window, which displays the question that needs to be answered, for example, the user inputs an audio recording control instruction through the question window.
- The local engine 420 directly uses the encapsulated Python function (equivalent to a multimedia interactive component) to record the voice according to the audio recording control instruction, and recognizes the recorded voice to check whether it is correct.
- The following takes video recording and playback in a teaching scene as an example to illustrate the basic realization of video interaction; that is, the extracted information contains only images and no audio information.
- In the video recording and photographing stage, based on a timer, the cross-platform computer vision library is periodically triggered to obtain the camera image, which is then displayed on the main interface of the multimedia interactive interface. For the video recording function, each frame is written to a local video file through the cross-platform computer vision library. For the photographing function, the current frame at the moment the photograph was triggered is saved.
- In the video playback stage, the video file is opened based on the cross-platform computer vision library, and a timer triggers image acquisition according to the frames-per-second information of the video; the images are displayed on the main interface, with support for functions such as dragging and pausing.
- The cross-platform computer vision library is used to realize video recording, playback, and screenshots without relying on third-party decoders or programs, making the operating environment controllable and green. That is, green installation is possible, installation is convenient, and the configuration requirements on the running host are low.
- an image collection is obtained after the video is processed by the algorithm.
- the interface also supports the image collection as an input parameter, which can be played without additional saving as a video file.
- The main principle is to use a timer to trigger the acquisition of one frame of the image collection for display, iterating in sequence, achieving an effect similar to the pause and progress-bar dragging of a video player and realizing more flexible multimedia display. In this way, deep customization is made in combination with the teaching scene, which is more in line with teaching needs and reduces additional conversion steps.
- In the recording stage, the audio processing module is used for audio acquisition, and then the wave component is used to convert the audio into a standard file format, which supports setting parameters such as bit rate and channel number and supports the needs of artificial intelligence algorithms more flexibly.
- In the playback stage, the Qmedia component is used for audio playback, and the main interface realizes the main functions of play, pause, progress dragging, and time display.
- the voice-to-text function is supported after recording.
- The main part of this function is realized based on the open application programming interface of a cloud platform. When the user does not have access to the external network, the interaction from recording to speech recognition can be imitated according to preset speech content to achieve the teaching purpose. In this way, computer-vision-related algorithms perform calculation directly on the local machine without uploading to the cloud, requiring no additional bandwidth and saving computing resources.
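The offline fallback described here can be sketched as a simple dispatch, where `cloud_asr` is a hypothetical callable wrapping the cloud platform's open API and `preset_text` is the preset speech content (both names are assumptions for the sketch):

```python
def speech_to_text(wav_path, online, cloud_asr=None, preset_text="hello world"):
    """Return a transcript for the recording.

    With network access, recognition goes through the cloud platform's
    open API; without it, a preset transcript imitates the
    recording-to-recognition interaction for teaching purposes.
    """
    if online and cloud_asr is not None:
        return cloud_asr(wav_path)   # real recognition via the cloud API
    return preset_text               # offline: preset content stands in
```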
- Multimedia interaction of the teaching platform is thus realized. Students can take photos, videos, and audio recordings to learn related algorithms, instead of being limited to multimedia content preset by the teaching platform, which is more interesting and flexible. This also solves the problems of inconsistent student computer configurations and environments.
- The multimedia interactive function is realized without installing additional decoders or other dependencies. There is no need to upload multimedia files such as student videos and photos to a server for processing, no need to rely on large bandwidth, and the real-time performance is better.
- The embodiments of the present disclosure call the recording and video recording equipment of the local machine, without additional dedicated hardware, and use the local machine's equipment for interaction to achieve the teaching purpose.
- the embodiments of the present disclosure do not require users to upload multimedia files obtained by taking photos, videos, and recordings to the network, and everything is performed locally for teaching demonstrations.
- Video playback supports two forms: video and image collection. This is because, in the artificial intelligence teaching scene, the original video is analyzed frame by frame, and after processing is completed the result is a set of images. The video playback interface supports the image collection as an input parameter, which can be played directly, with support for pause, progress dragging, and so on, making the interactive mode more flexible.
- the embodiments of the present disclosure can be applied to computer vision scenarios, such as face recognition, image recognition, object tracking and other algorithm teaching.
- Students can call the module of the packaged window program (equivalent to the multimedia interaction component) and interact with pop-up windows while the code is running. Students take pictures and record independently to obtain the video and photo resources they want. If an object needs to be selected, the screenshot function can also be called: the user drags the mouse to take a screenshot, then calls the algorithm of the course for processing, and finally calls the play-video or display-picture interface to show the final result of the algorithm.
- The embodiments of the present disclosure can also be applied to natural speech processing scenarios, such as voice command and speech synthesis scenarios.
- FIG. 5 is a schematic structural diagram of the multimedia interactive device of the present disclosure, including: a calling module 51, an input module 52, an information extraction module 53, and an output module 54.
- the calling module 51 is configured to call the multimedia interactive components of the teaching platform.
- the teaching platform is a network teaching system logged in through a native browser, such as a programming teaching platform and an artificial intelligence teaching platform.
- the multimedia interaction component may be a preset component in the teaching platform that performs processing operations such as acquiring multimedia information.
- it may be a component in the teaching platform that calls the local camera to acquire images and transmits the images to the local computer for operation.
- the multimedia interactive component can be called according to the user's operation on the teaching platform.
- The server running on the local machine is preset software (equivalent to the local engine) that has been researched and developed. It can be downloaded from the teaching platform through the local device (the local machine) and run on that machine. The server running on the local machine can be used to realize the functions of the multimedia interactive component.
- The front end of the webpage may be a browser. The browser may be a general-purpose computer browser, such as 360 Browser, Baidu Browser, Google Chrome, QQ Browser, or Sogou Browser; it may also be another type of browser, which is not limited here. In another embodiment, the front end of the webpage may also be application software, such as a third-party application of a smart device.
- the front end of the webpage can be a programming teaching interface of a browser or application software.
- the calling module 51 is configured to call multimedia interactive components, and the front end of the web page is connected with the server running on the local machine.
- The front end of the web page and the server running on the local machine are connected through a communication interface; for example, they are connected through socket input and output ports.
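As an illustration of this socket connection, a minimal local-engine loop might look like the following; the JSON message format and the command names are assumptions for the sketch, not the patent's wire protocol:

```python
import json
import socket

def serve(host='127.0.0.1', port=9000):
    """Accept one control instruction from the web front end over a
    socket and dispatch it to a handler on the local machine."""
    handlers = {
        'photo': lambda: print('trigger camera, save one frame'),
        'record_audio': lambda: print('trigger recording device'),
    }
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    conn, _ = srv.accept()
    with conn:
        msg = json.loads(conn.recv(4096).decode('utf-8'))
        handlers.get(msg.get('cmd'), lambda: None)()  # run the instruction
        conn.sendall(b'{"status": "ok"}')
```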
- The calling module 51 is also configured to call the multimedia interactive component to preprocess the extracted information according to the preset code, including performing image processing on the acquired one or more frames of images, or performing any one or more of speech noise reduction, speech-to-text, and speech synthesis on the acquired audio. For example, if the acquired information is one frame of image or multiple frames of images, target recognition or target tracking can be performed on the targets in the images according to the preset code.
- The preset code may be a preset model integrated in the multimedia interactive component, which may be a model integrating a neural network algorithm capable of target recognition or target tracking; of course, it may also integrate other algorithms capable of performing target recognition or target tracking.
- the audio information can be processed according to the preset code, such as speech noise reduction, speech to text, and speech synthesis.
- In one embodiment, if the acquired audio contains noise, the multimedia interaction component uses a preset code to perform noise reduction processing on the audio; in another embodiment, if the audio information acquired within a fixed time consists of multiple pieces of audio, the multimedia interaction component may perform speech synthesis processing on them through a preset code. In another embodiment, the multimedia interaction component may also convert the acquired audio information into text for display; the text conversion can be performed during audio playback, or the text can be processed and displayed first and the voice played afterwards, which is not limited here.
- the input module 52 is configured to obtain a control instruction by using the multimedia interactive component.
- The input module 52 is configured to obtain at least one of an image shooting control instruction, a video recording control instruction, and an audio recording control instruction.
- The control instruction can control the multimedia interactive component to perform operations such as photographing, video recording, audio recording, and screenshots.
- The input module 52 is also configured to obtain the preset code through the multimedia interactive component. After the information is extracted, the teaching platform obtains the preset code through the multimedia interactive component and preprocesses the extracted information according to the preset code.
- The preset code is the code written into the multimedia interactive component. After the multimedia interactive component extracts information according to the control instruction, the teaching platform uses the preset code written in the multimedia interactive component to perform the preprocessing.
- the information extraction module 53 is configured to use the multimedia interaction component to extract information based on the control instruction.
- The information extraction module 53 is configured to: in a case where the control instruction is an image shooting control instruction, trigger the camera device according to the image shooting control instruction to obtain a frame of image; and/or, in a case where the control instruction is a video recording control instruction, trigger a camera device according to the video recording control instruction to acquire multiple frames of images; and/or, in a case where the control instruction is an audio recording control instruction, trigger the recording device according to the audio recording control instruction to perform audio recording.
- the teaching platform uses the multimedia interactive component to extract information according to the control instruction.
- the camera device is triggered according to the image shooting control instruction to acquire a frame of image.
- the control instruction is a video recording control instruction
- The teaching platform uses the multimedia interactive component to trigger the camera device, for example, a built-in or external camera of the device logged in to the teaching platform, according to the video recording control instruction, to acquire multiple frames of images; the multiple frames of images are connected to form a video.
- the control instruction is an audio recording control instruction
- the teaching platform uses the multimedia interactive component to trigger a recording device such as a recorder, a microphone, etc. to perform audio recording according to the audio recording control instruction.
- the information extraction module 53 is also configured to obtain one frame of image from the multi-frame image or the one frame of image being played. When playing or displaying a video or image, the multimedia interactive component can obtain a frame of image from the video or image being played or displayed to complete the screen capture operation.
- The information extraction module 53 is further configured to set the timing duration of a timer according to the control instruction, and to control the multimedia interaction component to extract information when the timing period of the timer is reached. For example, if the timer is set to 5 seconds, then starting from the moment the control instruction is received, information extraction (such as photographing, video recording, or audio recording) starts automatically after 5 seconds; after the extraction is completed, extraction starts again when the next 5-second period is reached. This can end when the next control instruction is received; alternatively, the number of extractions can be set together with the timing duration, and extraction stops automatically when that number is reached.
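A sketch of this timer-driven extraction, assuming PyQt5's `QTimer`; `extract` stands in for whichever operation (photographing, video recording, or audio recording) the control instruction selected:

```python
from PyQt5.QtCore import QTimer

def schedule_extraction(extract, period_s=5, max_count=None):
    """Run `extract` every `period_s` seconds, stopping after `max_count`
    extractions if a count was configured alongside the timing duration."""
    state = {'n': 0}
    timer = QTimer()

    def tick():
        extract()                      # e.g. photograph / record / capture
        state['n'] += 1
        if max_count is not None and state['n'] >= max_count:
            timer.stop()               # reached the configured count

    timer.timeout.connect(tick)
    timer.start(period_s * 1000)
    return timer   # caller may call stop() when a new instruction arrives
```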
- the output module 54 is configured to display or play the extracted information through the multimedia interactive component.
- When the output module 54 displays multiple frames of images, it can display them according to a set time and frequency, or play and display them according to the transmission rate of the image frames.
- After the teaching platform obtains the multiple frames of images that make up the video through the multimedia interactive component, it also obtains the number of frames transmitted per second, and selectively plays the multiple frames of images according to the correspondence between frame number and time. For example, suppose 1000 frames of images are acquired in total, and 200 frames are transmitted per second during acquisition. To play from the 3rd second of video, playback can start directly from the 401st frame; conversely, to play the 401st frame, the video can be dragged directly to the 3rd second.
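The frame-to-time correspondence in this example reduces to simple arithmetic. The following sketch uses the 1-based frame indexing of the example (200 frames per second, second 3 beginning at frame 401):

```python
def time_to_frame(seconds, fps):
    """First frame index (1-based) that belongs to the given second."""
    return int(seconds - 1) * fps + 1   # fps=200, second 3 -> frame 401

def frame_to_time(frame, fps):
    """Second of video in which a 1-based frame index falls."""
    return (frame - 1) // fps + 1       # fps=200, frame 401 -> second 3
```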
- the output module 54 is further configured to display or play the preprocessed information through the multimedia interactive component.
- The preprocessing includes: performing image processing operations on the acquired one frame of image or multiple frames of images, and/or performing at least one of speech noise reduction, speech-to-text, and speech synthesis on the acquired audio. For example, if the acquired information is one frame of image or multiple frames of images, image processing operations such as target recognition or target tracking can be performed on the targets in the images according to the preset code. If the extracted information is multiple frames of images, that is, video shooting information, then after the multiple frames of images are acquired, they are played and displayed through the multimedia interactive component.
- The multiple frames of images form an image set. The image set itself does not include time information, but each frame has its own time point; that is, each frame has a corresponding acquisition time. Therefore, in order to play the video smoothly, the acquisition time of each frame is recorded when the images are recorded, and during playback the acquired frames are displayed directly according to their acquisition times. This makes the multi-frame playback smooth while saving the processing time and steps of saving the images into a video file.
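A sketch of this timestamp-driven display, where `frames` is a list of (acquisition time in seconds, image) pairs and `show` is a hypothetical display callback:

```python
import time

def play_by_timestamps(frames, show):
    """Display frames according to the acquisition time recorded with each
    frame, so the set plays smoothly without first being muxed into a
    video file."""
    start = time.monotonic()
    t0 = frames[0][0]
    for t, img in frames:
        delay = (t - t0) - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)   # wait until this frame's recorded offset
        show(img)
```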
- the acquired multiple frames of images may also be combined into a video file before being played and displayed.
- The preset code may be a preset model, which may be a model integrating a neural network algorithm capable of target recognition or target tracking; of course, it may also be a model integrating other algorithms capable of performing target recognition or target tracking.
- If the acquired information is audio information, the audio information may be processed according to a preset code, for example, speech noise reduction, speech-to-text, or speech synthesis. In one embodiment, if the acquired audio contains noise, the multimedia interactive component uses a preset code to reduce the noise; in another embodiment, if the audio information acquired within a fixed time consists of multiple pieces of audio, the multimedia interaction component can perform speech synthesis processing on them through a preset code. In another embodiment, the multimedia interaction component can also convert the acquired audio information into text for display; the text conversion can be performed during audio playback, or the text can be processed and displayed first and the voice played afterwards, which is not limited here.
- the output module 54 is also configured to control any operation of position movement, window zooming, and window hiding of the window to be displayed or played, so as to make multimedia interaction more flexible.
- The multimedia interactive device uses multimedia interactive components to extract information (such as video recording, photographing, and audio recording) and uses local resources for multimedia interaction, without the need to purchase additional teaching and research equipment.
- It can also preprocess the extracted information through the multimedia interactive components, without uploading to the cloud and without additional bandwidth, saving computing and network resources. Video recording, audio recording, image capture, screenshots, and the like are realized on the local machine, without third-party decoders or programs; only the web page needs to be installed on the local machine so that the first code, run through the server running on the local machine, can call the multimedia interactive components. Therefore, the requirements on the host are low and the operating environment is green. The multimedia interaction components are capable of multiple kinds of multimedia interaction, which is more in line with teaching needs; compared with performing multimedia interaction with external equipment, additional conversion steps are reduced.
- The embodiments of the present disclosure also propose a computer-readable storage medium in which at least one instruction or at least one program is stored; the above method is realized when the at least one instruction or the at least one program is loaded and executed by a processor.
- the computer-readable storage medium may be a non-volatile computer-readable storage medium.
- FIG. 6 is a schematic diagram of the structure of the multimedia interactive device of the present disclosure.
- the multimedia interactive device includes a memory 62 and a processor 61 connected to each other.
- the memory 62 is configured to store program instructions for implementing any one of the above-mentioned multimedia interaction methods.
- the processor 61 is configured to execute program instructions stored in the memory 62.
- the processor 61 may also be referred to as a central processing unit (Central Processing Unit, CPU).
- the processor 61 may be an integrated circuit chip with signal processing capability.
- The processor 61 may also be a general-purpose processor, a digital signal processor (Digital Signal Processing, DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- The memory 62 can be a memory stick, a flash (Trans-Flash, TF) card, etc., and can store all the information in the multimedia interactive device, including the input raw data, computer programs, intermediate running results, and final running results. It stores and retrieves information according to the locations specified by the controller. Only with the memory can the multimedia interactive device have a memory function and ensure normal operation.
- The storage of multimedia interactive devices can be divided into main storage (memory) and auxiliary storage (external storage) according to usage; there is also a classification into external storage and internal storage. External storage is usually magnetic media or optical discs, which can store information for a long time.
- Memory refers to the storage components on the motherboard, used to store the data and programs currently being executed; it only temporarily stores programs and data, and the data is lost when the power is turned off.
- the disclosed method and device may be implemented in other ways.
- The device implementation described above is only illustrative. For example, the division into modules or units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of this embodiment.
- the functional units in the various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
- If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- The technical solution of the present disclosure, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a system server, a network device, etc.) or a processor execute all or part of the steps of the methods of the various embodiments of the present disclosure.
- FIG. 7 is a schematic structural diagram of a computer-readable storage medium of the present disclosure.
- the storage medium of the present disclosure stores a program file 71 that can implement all the above-mentioned multimedia interaction methods.
- The program file 71 can be stored in the above-mentioned storage medium in the form of a software product, and includes several instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the various embodiments of the present disclosure.
- The aforementioned storage devices include: media that can store program code, such as U disks, mobile hard disks, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disks, and optical discs, or terminal devices such as computers, servers, mobile phones, and tablets.
- The multimedia interaction component of the teaching platform is invoked; a control instruction is obtained by using the multimedia interaction component; information is extracted based on the control instruction by using the multimedia interaction component; and the extracted information is displayed or played through the multimedia interaction component, so that multimedia interaction can be realized without the aid of external equipment, thereby making the effect of network teaching better.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Business, Economics & Management (AREA)
- Educational Administration (AREA)
- Educational Technology (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computer Networks & Wireless Communication (AREA)
- Economics (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- General Business, Economics & Management (AREA)
- Software Systems (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- User Interface Of Digital Computer (AREA)
- Electrically Operated Instructional Devices (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Claims (25)
- A multimedia interaction method, comprising: calling a multimedia interaction component of a teaching platform; obtaining a control instruction by using the multimedia interaction component; performing information extraction based on the control instruction by using the multimedia interaction component; and displaying or playing the extracted information through the multimedia interaction component.
- The multimedia interaction method according to claim 1, wherein the control instruction comprises at least one of an image shooting control instruction, a video recording control instruction, and an audio recording control instruction.
- The multimedia interaction method according to claim 2, wherein performing information extraction based on the control instruction by using the multimedia interaction component comprises: in a case where the control instruction is the image shooting control instruction, triggering a camera device according to the image shooting control instruction to acquire one frame of image; and/or, in a case where the control instruction is the video recording control instruction, triggering a camera device according to the video recording control instruction to acquire multiple frames of images; and/or, in a case where the control instruction is the audio recording control instruction, triggering a recording device according to the audio recording control instruction to perform audio recording.
- The multimedia interaction method according to claim 3, wherein, in the case where the control instruction is the video recording control instruction, after triggering the camera device according to the video recording control instruction to acquire the multiple frames of images, the method further comprises: obtaining the number of frames transmitted per second of the multiple frames of images; and selectively playing the multiple frames of images by the number of frames transmitted per second according to a correspondence between frame number and time.
- The multimedia interaction method according to claim 3, wherein, in the case where the control instruction is the video recording control instruction or the image shooting control instruction, displaying or playing the extracted information through the multimedia interaction component comprises: playing the multiple frames of images in sequence according to their acquisition times; or synthesizing the multiple frames of images to form a video file, and playing the video file.
- The multimedia interaction method according to any one of claims 1 to 5, wherein performing information extraction based on the control instruction by using the multimedia interaction component comprises: extracting, by using the multimedia interaction component, the information from pre-stored preset information based on the control instruction.
- The multimedia interaction method according to any one of claims 1 to 5, wherein performing information extraction based on the control instruction by using the multimedia interaction component comprises: setting a timing duration of a timer according to the control instruction; and controlling the multimedia interaction component to perform information extraction in a case where the timing period of the timer is reached.
- The multimedia interaction method according to any one of claims 1 to 7, wherein, after displaying or playing the extracted information through the multimedia interaction component, the method further comprises: controlling a window performing the display or playback to perform any one of position movement, window zooming, and window hiding.
- The multimedia interaction method according to any one of claims 1 to 8, wherein displaying or playing the extracted information through the multimedia interaction component comprises: obtaining preset code through the multimedia interaction component, and preprocessing the extracted information according to the preset code; and displaying or playing the preprocessed information through the multimedia interaction component.
- The multimedia interaction method according to claim 9, wherein the preprocessing comprises: performing an image processing operation on one or more acquired frames of images; and/or performing at least one of speech noise reduction, speech-to-text, and speech synthesis on acquired audio.
- The multimedia interaction method according to any one of claims 3 to 10, wherein, in the case where the control instruction is the video recording control instruction or the image shooting control instruction, after displaying or playing the extracted information through the multimedia interaction component, the method further comprises: obtaining one frame of image from the multiple frames of images or the one frame of image being played.
- A multimedia interaction apparatus, comprising: a calling module configured to call a multimedia interaction component of a teaching platform; an input module configured to obtain a control instruction by using the multimedia interaction component; an information extraction module configured to perform information extraction based on the control instruction by using the multimedia interaction component; and an output module configured to display or play the extracted information through the multimedia interaction component.
- The apparatus according to claim 12, wherein the control instruction comprises at least one of an image shooting control instruction, a video recording control instruction, and an audio recording control instruction.
- The apparatus according to claim 13, wherein the information extraction module is further configured to: in a case where the control instruction is the image shooting control instruction, trigger a camera device according to the image shooting control instruction to acquire one frame of image; and/or, in a case where the control instruction is the video recording control instruction, trigger a camera device according to the video recording control instruction to acquire multiple frames of images; and/or, in a case where the control instruction is the audio recording control instruction, trigger a recording device according to the audio recording control instruction to perform audio recording.
- The apparatus according to claim 14, wherein the information extraction module is further configured to obtain the number of frames transmitted per second of the multiple frames of images, and selectively play the multiple frames of images by the number of frames transmitted per second according to a correspondence between frame number and time.
- The apparatus according to claim 14, wherein, in a case where the control instruction is the video recording control instruction or the image shooting control instruction, the information extraction module is further configured to: play the multiple frames of images in sequence according to their acquisition times; or synthesize the multiple frames of images to form a video file, and play the video file.
- The apparatus according to any one of claims 12 to 16, wherein the information extraction module is further configured to extract, by using the multimedia interaction component, the information from pre-stored preset information based on the control instruction.
- The apparatus according to any one of claims 12 to 17, wherein the information extraction module is further configured to set a timing duration of a timer according to the control instruction, and control the multimedia interaction component to perform information extraction in a case where the timing period of the timer is reached.
- The apparatus according to any one of claims 12 to 18, wherein the output module is further configured to control a window performing display or playback to perform any one of position movement, window zooming, and window hiding.
- The apparatus according to any one of claims 12 to 19, wherein the input module is further configured to obtain preset code through the multimedia interaction component; the calling module is further configured to call the multimedia interaction component to preprocess the extracted information according to the preset code; and the output module is further configured to display or play the preprocessed information through the multimedia interaction component.
- The apparatus according to claim 20, wherein the preprocessing comprises: performing an image processing operation on one or more acquired frames of images; and/or performing at least one of speech noise reduction, speech-to-text, and speech synthesis on acquired audio.
- The apparatus according to any one of claims 14 to 21, wherein, in a case where the control instruction is the video recording control instruction or the image shooting control instruction, the output module is further configured to obtain one frame of image from the multiple frames of images or the one frame of image being played.
- A multimedia interaction device, comprising a memory and a processor, wherein the memory stores program instructions, and the processor retrieves the program instructions from the memory to execute the multimedia interaction method according to any one of claims 1 to 11.
- A computer-readable storage medium storing a program file, wherein the program file can be executed to implement the multimedia interaction method according to any one of claims 1 to 11.
- A computer program product, comprising computer-readable code, wherein, in a case where the computer-readable code runs in an electronic device, a processor in the electronic device executes the method according to any one of claims 1 to 11.
Priority Applications (3)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| SG11202111719SA (en) | 2020-04-28 | 2021-03-04 | Multimedia interaction method, device, and equipment, and storage medium |
| JP2021562332A JP2022533911A (ja) | 2020-04-28 | 2021-03-04 | マルチメディアインタラクティブ方法、装置、機器及び記憶媒体 |
| KR1020217034309A KR20210143857A (ko) | 2020-04-28 | 2021-03-04 | 멀티미디어 상호 작용 방법, 장치, 기기 및 저장 매체 |

Applications Claiming Priority (2)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010352029.1 | 2020-04-28 | | |
| CN202010352029.1A CN111586490A (zh) | 2020-04-28 | 2020-04-28 | 一种多媒体互动方法、装置、设备及存储介质 |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| WO2021218379A1 (zh) | 2021-11-04 |

Family ID: 72111748

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2021/079166 WO2021218379A1 (zh) | 一种多媒体互动方法、装置、设备及存储介质 | 2020-04-28 | 2021-03-04 |

Country Status (6)

| Country | Link |
|---|---|
| JP (1) | JP2022533911A (zh) |
| KR (1) | KR20210143857A (zh) |
| CN (1) | CN111586490A (zh) |
| SG (1) | SG11202111719SA (zh) |
| TW (1) | TW202141446A (zh) |
| WO (1) | WO2021218379A1 (zh) |
Also Published As

| Publication Number | Publication Date |
|---|---|
| JP2022533911A (ja) | 2022-07-27 |
| TW202141446A (zh) | 2021-11-01 |
| SG11202111719SA (en) | 2021-12-30 |
| CN111586490A (zh) | 2020-08-25 |
| KR20210143857A (ko) | 2021-11-29 |
Ref document number: 21796263 Country of ref document: EP Kind code of ref document: A1 |