WO2019019406A1 - Teaching recording data updating device - Google Patents


Info

Publication number
WO2019019406A1
WO2019019406A1 PCT/CN2017/105553 CN2017105553W WO2019019406A1 WO 2019019406 A1 WO2019019406 A1 WO 2019019406A1 CN 2017105553 W CN2017105553 W CN 2017105553W WO 2019019406 A1 WO2019019406 A1 WO 2019019406A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
courseware
text
voice
content
Prior art date
Application number
PCT/CN2017/105553
Other languages
French (fr)
Chinese (zh)
Inventor
陈滢西
赵鹏祥
滕凯
Original Assignee
深圳市鹰硕技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市鹰硕技术有限公司
Publication of WO2019019406A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/65 Updates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/065 Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Definitions

  • The invention belongs to the field of network teaching recording and broadcasting technology and can be used to record and play back a teaching activity or a conference conducted through network teaching or an online conference. In particular, it relates to a device capable of updating the voice data and courseware data in teaching recording data after the recording has been completed.
  • The recorder mainly includes a camera and a wireless digital microphone, which record the video information and voice data of the courseware.
  • The first network transmits the courseware information to the server.
  • The server further processes the courseware information to generate courseware data; it also searches for and retrieves courseware data in the database and converts it back into courseware information.
  • The database stores the courseware data.
  • The second network connects the client to the server.
  • The client lets the user query and retrieve courseware information.
  • That patent application discloses a technique for recording courses in a typical streaming media format. Its main disadvantages are that the recorded file is relatively large, uploading and downloading are slow, and the required storage space is large.
  • More recent teaching recording technologies, such as CN105306861A (publication date February 3, 2016), disclose a classroom teaching and recording method and system.
  • In that system, the user's use of the multimedia whiteboard functions, lecture or speech voice, communication with other users, and/or coaching are recorded as separate data streams, and the network teaching recording system generates a unified timestamp for the various data streams.
  • The playback end obtains the data streams, reproduces them according to the timestamp, and plays them in combination for the user, thereby completing on-demand browsing.
  • That patent application discloses separate storage in three data stream formats according to timestamps.
  • CN101354748A (publication date January 28, 2009) discloses a character recognition apparatus comprising an image pickup device, a character recognition device, a voice conversion device, and a voice output device. The image pickup device captures text information by photographing it and sends it to the character recognition device in the form of a picture; the character recognition device identifies the text information in the picture and sends it to the voice conversion device; the voice conversion device converts the text information into voice data and sends it to the voice output device; and the voice output device plays the voice data.
  • That patent application discloses a technique for collecting and recognizing text symbols in image information and then converting the text symbols into speech.
  • CN102956231A discloses an apparatus and method, in the field of speech recognition technology, for recording key speech information based on semi-automatic correction. The apparatus comprises a key information extraction unit and an information correction unit connected to it. The key information extraction unit obtains the uncorrected text data, extracts the key information, and outputs it to the information correction unit, which outputs the text data confirmed by user feedback.
  • That invention reduces the workload of manual correction by using a semi-automatic information correction unit; uses a database to correct special nouns such as place names and professional tool names, reducing the effect of the operator's limited knowledge on manual correction; and extracts the key information in the voice data, increasing the amount of usable information in the record.
  • That patent application aims to solve the problem of semi-automatically correcting text data after speech has been converted into text.
  • CN105159870A (publication date December 16, 2015) discloses a processing system for accurately completing continuous natural speech text. The processing system includes a cloud speech recognition engine and a post-recognition update platform connected to that engine. The post-recognition update platform comprises a display unit, an update operation unit, a control unit, and a three-dimensional integrated generation unit, and the update operation unit supports voice, keyboard, mouse, and combined keyboard-and-mouse update modes.
  • It also discloses that the voice file to be recognized can be finely segmented to achieve accurate recognition.
  • CN105808197A discloses an information processing method applied to an electronic device having a speech recognition module. The method comprises: receiving input voice data; recognizing the input voice data to obtain a recognition result; and, when first information in the recognition result is content that needs to be updated, the first information being at least one character in the recognition result, updating the first information by means of operation-body input. Because only the part targeted for update is changed, the user does not need to input the voice data again to obtain the target result, the operation is simple, and the overall speed of information input is improved.
  • That patent application discloses that only the content that needs updating has to be changed after voice recognition, which speeds up the update, but the update applies only to the recognized text data. During voice recognition, the information to be recognized is compared with standard voice data, which improves recognition accuracy.
  • CN106328145A (publication date January 11, 2017) discloses a voice update method and apparatus, comprising: acquiring voice data input by a user; recognizing the voice data to obtain the corresponding text content; when the text content includes a first preset keyword, dividing the text content into original text and edit text according to that keyword, where the edit text is used to update the original text; extracting the text to be updated from the original text according to the edit text; and updating the original text according to the edit text and the text to be updated to obtain updated text.
  • That patent application discloses that the edit text for the original text can be obtained by keyword recognition, so that text content produced by voice recognition is updated in a targeted manner.
  • CN102215233A (publication date October 12, 2011) discloses an information system client installed on a user's terminal device, which can be applied to a microblog, blog, forum, personal space, and the like. It includes a user interaction module and a voice module connected to the user interaction module and, preferably, a feedback module and a conversion module. The voice module includes a voice collection unit, a voice recognition unit, and a voice synthesis unit. The voice collection unit collects the user's voice; the voice recognition unit recognizes the collected voice as text and outputs it to the user interaction module; and the voice synthesis unit converts text that the user interaction module obtains from the information system server into voice output for the user.
  • The feedback module is connected to the voice recognition unit and confirms whether the voice recognition is correct. If it is correct, the feedback module outputs the text to the user interaction module; if not, the feedback module has the voice collection unit re-collect the user's voice, or has the voice recognition unit update the text, until the text is confirmed to be correct.
  • That patent application discloses a technology for converting between voice and text, aiming to convert information in one format into information in another. If the output text information is incorrect, the feedback module re-collects the user's voice or directly updates the output text information.
  • CN106486113A discloses a method for recording a conference, comprising: acquiring a voice signal; converting the voice signal into corresponding text information with voice conversion software and placing the text information in a document, where the text information includes correct text information and incorrect text information; marking the incorrect text information in the document and linking it to the voice signal that corresponds to it; when the marked incorrect text information is clicked, using the voice conversion software to perform secondary recognition on the associated voice signal and displaying the re-recognized text information in the document in editable form; correcting and editing the text information in the editable display to obtain corrected text information; and replacing the incorrect text information with the corrected text information.
  • The present invention aims to provide an apparatus for updating teaching recording and broadcasting data. By replacing text content derived from the voice data, and by replacing old courseware data (especially PPT courseware data) with new courseware data, it allows the teaching recording data to be updated after recording of the teaching process has been completed.
  • The present invention is directed to an apparatus for updating teaching recording data, where the data comprise separately stored data streams such as voice data, video data, and action data. Specifically, a recording device converts the voice signal captured during network teaching or an online conference into original voice data with timestamps, saved in the format "voice data stream + timestamp"; and action data recording software captures the operation action data for the courseware, especially a PPT file, saved in the format "courseware operation action data stream + timestamp".
  • The original voice data is converted into original text data using a speech recognition model. The original text data is proofread, and the old text content that needs updating is replaced with new text content, forming updated text data. Using the timestamps for positioning, the voice data segment corresponding to the old text content is replaced with standard voice data for the new text content, forming updated voice data.
  • The new courseware is produced by updating the old courseware, and a correspondence between the new and old courseware is established, including the correspondence between pages and between updated contents. The timestamp information of the old courseware data in the recorded data is obtained, and the old courseware is replaced with the new courseware according to the correspondence and the timestamp information, so that when the recorded data is played back on demand, the new courseware content is presented to the user in the same way the old courseware was displayed during recording.
  • The apparatus of the present invention may also be used for recording, playing, and updating other online communication processes. That is, the present invention relates to a method, system, and computer program product for recording and playback in network teaching, online training, emergency command (map annotation and voice recording), financial systems (marketing explanation), online conference systems, or conference processes.
  • In each of these settings, the voice data is recognized and converted into text, the text data is updated, and the corresponding recorded voice data is replaced with standard voice data for the updated text content. In this way the voice data is updated, the old courseware is replaced with the new courseware, and the courseware data is updated.
  • The present invention provides an apparatus for updating teaching recording data. During recording and on-demand review of a multimedia classroom (or online classroom) or the like, and particularly when recording a multimedia classroom, the voice data, the action data on the multimedia whiteboard (electronic whiteboard writing), the operation action data on the screen of the user terminal, the video data recorded by the recording device, and so on are each timestamped and saved in data stream format to form the teaching recording data. After logging in to the network teaching recording and broadcasting system, the user obtains the teaching recording data over a wired or wireless local area or wide area network, and the teaching process of the classroom is reproduced or simulated on the user terminal using the timestamps, thereby realizing review playback or on-demand playback of the recorded classroom.
  • The device for updating teaching recording data includes a file identification generating unit, a voice data collecting unit, a voice data updating unit, a courseware data collecting unit, a courseware data updating unit, other data collecting units, a recorded data playing unit, an error information feedback unit, and so on, wherein:
  • the file identification generating unit is configured to generate a file identification ID when the recorded teaching process starts;
  • the voice data collecting unit is configured to convert a voice signal into original voice data using an audio collecting device and save it in a voice data stream format; preferably, it collects at least one item of voice data from at least one voice source, adds a timestamp, and saves it in the voice data stream format;
  • the voice data updating unit is configured to update the parts of the original voice data that need updating, forming updated voice data, where the update is implemented by a replacement operation on the text content obtained by recognizing the voice data;
  • the courseware data collecting unit is configured to acquire the courseware data, namely the operation action data for the courseware file, in particular a PPT file, during the teaching process;
  • the courseware data updating unit is configured to update the courseware data in the teaching recording data, where the update is implemented by replacing the old courseware with a new courseware;
  • the other data collecting unit is configured to collect at least one of the following: action data on the multimedia whiteboard, operation data on the screen of the user terminal, and video data from the video recording device; it adds a timestamp to each kind of data collected and saves each separately in a data stream format, which together with the voice data stream and the courseware data forms recorded data that can be played on demand;
  • the recorded data playing unit lets the user acquire the recorded data over a network with a terminal and combines the different data streams according to the timestamps, so that the recorded data is played on the terminal, the teaching process is reproduced and/or simulated, and the user can learn from and/or review the teaching process;
  • the error information feedback unit lets a user who is playing the recorded data on the terminal select and submit erroneous text content found in the updated text data; after the feedback is confirmed by the administrator, the updated text data is updated again, and the voice data replacing unit is run again to update the updated voice data.
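  • As an illustration of how a playing unit might combine separately stored streams by timestamp, the following is a minimal sketch and not the patent's implementation. The `Event` type, its field names, and the stream names are assumptions made for this example; each stream is assumed to be already sorted by time.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds from the start of the recording (assumed unit)
    stream: str       # hypothetical stream name, e.g. "voice" or "courseware"
    payload: str      # placeholder for the actual data chunk


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Interleave separately stored, time-ordered streams into playback order."""
    return heapq.merge(*streams, key=lambda e: e.timestamp)


voice = [Event(0.0, "voice", "chunk-0"), Event(2.0, "voice", "chunk-1")]
courseware = [Event(0.5, "courseware", "page 1"), Event(2.5, "courseware", "page 2")]
playback = list(merge_streams(voice, courseware))
```

  • Because the merge is lazy, a player built along these lines could consume events as they become due rather than loading every stream into memory first.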
  • The voice data updating unit further includes:
  • a voice data recognition unit configured to convert the original voice data into original text data according to the voice recognition model, and to determine the time coordinate of each text content in the text data according to the timestamps;
  • a text data updating unit configured to proofread the original text data and replace the old text content that needs updating with new text content, forming updated text data;
  • a voice data replacing unit configured to replace the voice data segment of the old text content in the original voice data with standard voice data for the new text content, forming updated voice data and thereby updating the voice data in the teaching recording data.
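  • The replacement step can be pictured as splicing new samples into the original audio at the time coordinates of the old text. The sketch below assumes the audio is held as a plain list of samples at a known sample rate; `replace_segment` and its parameters are hypothetical names for this example, and a real system would work on an actual audio format.

```python
from typing import List


def replace_segment(samples: List[int], rate: int, start_s: float,
                    end_s: float, new_samples: List[int]) -> List[int]:
    """Replace the samples between start_s and end_s with new_samples."""
    lo = int(round(start_s * rate))  # first sample of the old segment
    hi = int(round(end_s * rate))    # one past the last sample of the old segment
    if not 0 <= lo <= hi <= len(samples):
        raise ValueError("segment lies outside the recording")
    return samples[:lo] + new_samples + samples[hi:]


original = [1, 1, 2, 2, 2, 3, 3, 3]  # 8 samples at 4 Hz, i.e. 2 seconds
updated = replace_segment(original, rate=4, start_s=0.5, end_s=1.25,
                          new_samples=[9, 9])
```

  • Note that when the new segment differs in length from the old one, every later timestamp shifts, which is one reason a duration adjustment step is useful.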
  • The courseware data updating unit further includes:
  • a courseware content updating unit configured to update content on the basis of the old courseware, replacing old content with new content to form a new courseware, and to record correspondence data between the old and new courseware, including the correspondence between pages before and after the update;
  • a courseware data updating unit configured to acquire the timestamp identifiers of the courseware data in the teaching recording data, process the new courseware according to the correspondence data, and add the timestamp identifiers to the new courseware to form new courseware data;
  • a courseware replacement updating unit configured to replace the old courseware data in the teaching recording data with the new courseware data, thereby updating the courseware data of the teaching recording data.
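  • One way to read the three sub-units above is as a re-mapping of recorded courseware events onto the new file while keeping their timestamps. The sketch below is only an illustration under that reading: `CoursewareEvent`, `replace_courseware`, and the page map are hypothetical, and leaving unmapped pages unchanged is an assumption of this example.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class CoursewareEvent:
    timestamp: float    # when the action happened during recording
    courseware_id: str  # which courseware file the action refers to
    page: int           # page shown or acted upon
    action: str         # e.g. "show", "next"


def replace_courseware(events: List[CoursewareEvent], new_id: str,
                       page_map: Dict[int, int]) -> List[CoursewareEvent]:
    """Point each recorded event at the new courseware, keeping its timestamp."""
    return [
        # Pages without an entry in page_map keep their number (assumption).
        replace(ev, courseware_id=new_id, page=page_map.get(ev.page, ev.page))
        for ev in events
    ]


old_events = [CoursewareEvent(0.0, "old", 1, "show"),
              CoursewareEvent(30.0, "old", 2, "show")]
new_events = replace_courseware(old_events, "new", page_map={2: 3})
```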
  • The courseware is preferably a PPT file, the standard voice data is obtained by searching a standard voice database, and the voice data and the courseware data are saved separately.
  • The voice data is saved in the format "voice data stream + timestamp", and the courseware data is saved in the format "courseware operation data stream + timestamp".
  • The text data updating unit is further configured, after the courseware data update is completed, to determine the display time of each page of the PPT file according to the timestamp identifiers and, according to the correspondence between the old and new content of each updated page, to search the text data within that display time for the old content and replace it with the new content. This forms further-updated text data, so that the updated voice data matches the updated courseware data.
  • The courseware update and the voice update are thereby linked: when content referred to in the voice data has been replaced in the new PPT file, the related content in the speech is replaced according to the replacement relationship between the old and new PPT content. The voice explanation and the PPT display content are modified in step, further improving the completeness and usability of the recorded data after updating.
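  • The window-limited search described here can be sketched as follows. This is an illustrative simplification: the `Segment` layout and `update_text_in_window` are assumptions for the example, the match is a plain substring replacement, and no administrator confirmation step is modeled.

```python
from typing import List, Tuple

Segment = Tuple[float, str]  # (timestamp in seconds, recognized text)


def update_text_in_window(segments: List[Segment], start_s: float, end_s: float,
                          old: str, new: str) -> List[Segment]:
    """Replace `old` with `new` only in segments spoken while the page was shown."""
    result = []
    for ts, text in segments:
        if start_s <= ts < end_s:  # segment falls inside the page's display time
            text = text.replace(old, new)
        result.append((ts, text))
    return result


segments = [(5.0, "the rate is 5%"), (40.0, "the rate is 5%")]
# Hypothetical case: the updated page was displayed from 0 s to 30 s.
updated_segments = update_text_in_window(segments, 0.0, 30.0, "5%", "6%")
```

  • Restricting the search to the page's display window avoids rewriting the same phrase where it was spoken in a different context later in the recording.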
  • During this further updating of the text data, the correspondence between the old content and the new content in the text data is shown to the administrator before the replacement is completed, and the administrator confirms, according to the context, whether to replace.
  • The updated text data is displayed on the screen as subtitles according to the timestamp identifiers, preferably in the screen area in which the video data is played; more preferably, the text data is displayed in an editable, selectable mode in a specific area of the screen.
  • The device further includes an error information feedback unit.
  • When the user plays the teaching recording data on the terminal, the error information feedback unit lets the user select and submit erroneous text content found in the updated text data. After the feedback is confirmed by the administrator, the updated text data is updated again, and the voice data replacing unit updates the updated voice data again.
  • An update history is formed, which may include the update time, update content, update operator, issue finder, and the like.
  • The voice data replacing unit is configured to calculate a smoothing coefficient from the pronunciation time of the replaced old text content in the original voice data and the pronunciation time of the standard voice data of the new text content, and to adjust the pronunciation time of the new text content according to the smoothing coefficient, thereby smoothing and synchronizing the voice data before and after the replacement.
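  • The text does not give a formula for the smoothing coefficient. One plausible reading, shown here purely as an assumption, is the ratio of the old pronunciation time to the new one, used to stretch or compress the new segment so that it fills the slot left by the old one. Both function names are hypothetical.

```python
def smoothing_coefficient(old_duration_s: float, new_duration_s: float) -> float:
    """Assumed definition: ratio of the old segment's duration to the new one's."""
    if old_duration_s <= 0 or new_duration_s <= 0:
        raise ValueError("durations must be positive")
    return old_duration_s / new_duration_s


def adjusted_duration(new_duration_s: float, coefficient: float) -> float:
    """Target duration of the new segment after time-scaling by the coefficient."""
    return new_duration_s * coefficient


# The old phrase took 1.5 s; the standard voice for the new phrase takes 2.0 s.
k = smoothing_coefficient(1.5, 2.0)
target = adjusted_duration(2.0, k)
```

  • Under this assumed definition the adjusted segment always matches the old duration exactly, so later timestamps stay valid; a practical system would probably bound the coefficient so the speech is not audibly distorted.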
  • The level of classroom recording is improved. With the various data streams saved separately on the basis of timestamps, the text data is updated and the voice data is then updated according to the updated text content. Replacing the content that needs updating in the original voice data overcomes the problems caused by saying too little, misspeaking, or leaving things out in the classroom, and yields both updated voice data and updated text data (subtitle information).
  • A new courseware is formed from the old courseware, a correspondence is established between the new and old courseware, and the timestamp information in the recorded data is used to replace the courseware.
  • Updating the voice data resolves non-standard spoken expression in the teaching process, and the courseware replacement operation displays the new courseware to users who study the recorded data. Thus, after the teaching process is finished, that is, after the teaching recording data has been formed, problems in the teaching process can still be remedied and improved through the update operations.
  • FIG. 1 is a block diagram of a recording and broadcasting system according to the present invention.
  • FIG. 2 is a flow chart showing the steps of updating the recording data according to the present invention.
  • FIG. 3 is a flow chart showing the steps of updating courseware according to the present invention.
  • FIG. 4 is a flow chart of a voice update step in accordance with the present invention.
  • Network teaching in the invention is not limited to classroom teaching between students and teachers. It may include online, remote, and local network teaching with teachers, students, or trainers as participants; online, remote, and local web conferences attended by employees of enterprises and institutions; and other forms of communication or interaction that use the network for online communication and/or presentation of file content, such as remote collaborative work.
  • The teacher 1 and the student 2 each connect to the teaching server 3 via the Internet using a terminal device on which a client of the network teaching recording system is installed. The teaching server 3 is also connected to the multimedia device 4, which includes at least one of an intelligent electronic whiteboard, a camera, a microphone, and a document camera, thereby enabling network lecturing, listening, recording, on-demand playback, and review of the multimedia classroom.
  • The terminal device includes a processor, a network module, a control module, a display module, and a smart operating system, and may be a smartphone, tablet, notebook computer, desktop computer, or the like.
  • The terminal may be provided with a plurality of data interfaces for connecting various expansion devices and accessories through a data bus.
  • The smart operating system may be Windows, Android and its derivatives, or iOS, on which application software can be installed and run, providing the functions of the various applications, services, and application stores/platforms available under that operating system.
  • Terminal devices can connect to the Internet via RJ45, Wi-Fi, Bluetooth, 2G/3G/4G, G.hn, Zigbee, Z-Wave, or RFID connections, and through the Internet to other terminals, computers, and devices. They can also use connection methods such as audio and video interfaces to connect various expansion devices and accessories, forming an interactive conference/teaching device system.
  • This system provides image access, sound access, use control and screen recording of the electronic whiteboard, and an RFID reading function; it can access and control mobile storage devices, digital devices, and other devices through the corresponding interfaces; and it uses DLNA/IGRS technology and Internet technology to implement manipulation, interaction, and screen switching between multi-screen devices.
  • A processor is defined to include, but is not limited to, an instruction execution system such as a computer- or processor-based system, an application-specific integrated circuit (ASIC), a computing device, or a hardware and/or software system that can fetch or obtain logic from a non-transitory storage medium or non-transitory computer-readable storage medium and execute the instructions contained in it.
  • The processor may also include any controller, state machine, microprocessor, internetwork-based entity, service or feature, or any other analog, digital, and/or mechanical implementation thereof.
  • the Internet may include a local area network and a wide area Internet, and may be a wired Internet or a wireless Internet, or any combination of these networks.
  • the main updating steps of the network teaching recording data according to the present invention are as follows:
  • S100 Starting the recording and broadcasting system: after the user (such as a teacher) logs in using the terminal, the various multimedia devices 4, such as the intelligent electronic whiteboard, the screen operation capture program on the teacher terminal, the camera, and the microphone, are put into a working state. There may be more than one camera, and there is at least one microphone, for capturing the teacher's voice and the students' voices respectively; the captured voice is saved together with a digital timestamp in a voice data stream format. The screen operation capture program captures the operations performed on the teacher terminal on the courseware file, especially a PPT file, obtaining the page information, action information, and timestamps of the operations on the PPT file and thereby forming the courseware operation action stream data. The teaching server of the recording and broadcasting system can be used to generate the digital timestamps.
  • S200 Starting online teaching: the teacher starts classroom teaching, and the document identification generating unit generates a teaching document ID.
  • The teacher uses the intelligent electronic whiteboard for display (for board writing or working through problems), explains with real-time voice, communicates with real-time interactive voice, and presents and explains electronic documents such as PPT documents on the teacher terminal, conducting multimedia teaching and interactive Q&A with the students. The corresponding data streams are collected and formed by the devices described in S100.
  • S300 Saving the recorded data: during recording, the actions on the intelligent whiteboard are transmitted and saved as "action data stream + timestamp"; the voice during teaching and interaction is transmitted and saved by the voice data collecting unit as "voice data stream + timestamp"; the operations on electronic documents such as PPT documents on the teacher terminal are transmitted and saved by the courseware data collecting unit as "electronic document operation data stream + timestamp"; and other data streams can be collected by the other data collecting units, for example video data transmitted and saved as "video data stream + timestamp". All of the data streams for the whole course are bound to the teaching document ID, which identifies the recorded course. Data streams can be added or removed as needed.
  • a typical case is that the recorded data includes voice data, video data, and PPT document presentation data.
  • classified recording and split-screen display are relatively mature technologies.
  • the various recorded data can be saved to a local database or a terminal database and then uploaded to the remote teaching server over the network, or saved directly to the remote teaching server.
  • a voice acquisition device such as various available microphones may be used to collect the voice signal, and the voice signal is converted into voice data and saved in a data stream format.
  • the gender of the voice source can be marked so that when a subsequent voice update (replacement) operation is performed, the standard voice of the corresponding gender can be selected.
  • the gender of each voice source can be identified separately, and multiple voice sources can be identified and clustered to form voice files, each marked with its voice source, which are time-stamped and saved separately.
  • the method for separately identifying the voice source can use the prior art, and details are not described herein again.
  • the electronic document is preferably a PPT document.
  • any other electronic document that can serve as courseware and can be divided into pages, such as a WORD document
  • the "electronic document operation data stream + timestamp" can specifically be the page information + action information + timestamp information of the PPT file.
  • S400 Voice data recognition: the recorded original voice data is first recognized and converted by the voice data recognition unit, using a voice model, to form original text data; the original text data is then proofread and updated by the text data update unit.
  • the time stamp of the original voice data is added to the text data so that the text content in the text data can be time-located.
  • the text content may be at least one character, word, sentence, or paragraph in the text data.
  • time positioning yields the clock data of the audio data along its time dimension, that is, the clock parameter of the time point at which a given data segment in a piece of audio data is located.
  • the original voice data can be recognized and converted into the original text data using any available speech model; during recognition and conversion, the gender of the voice source is recognized first, and the gender information is added to the text data.
  • Proofreading updates for text data include manual proofreading, semi-automatic proofreading, and voice proofreading.
  • Voice data update: the original text data is replaced by a voice data replacement unit using a voice update instruction (CN106406807A), although the present invention is not limited thereto.
  • the voice data replacement operation includes: accepting a voice update instruction; identifying, in the text data to be updated, all text whose pronunciation matches the voice update instruction, together with the timestamps of that text content; determining the text to be updated among all the recognized text; displaying an alternative text list corresponding to the text to be updated; accepting an alternative text selection instruction; and performing the replacement to form updated text data, thereby completing the text update. After the text update is completed, processing continues according to the correspondence information of the text update.
  • the voice data segment replacement operation includes: accepting a voice update instruction; identifying, in the text data to be updated, all text whose pronunciation matches the voice update instruction, together with the timestamps of that text content; and determining the text to be updated among all the recognized text.
  • the standard pronunciation information of the updated text is retrieved from the standard voice database, and the corresponding voice data segment in the original voice data is replaced with the standard pronunciation information according to the timestamp of the updated text, so that the voice data is updated and new voice data is formed.
  • the standard voice data is obtained by searching from a standard voice database.
  • the standard speech database may include a female-voice standard speech database, a male-voice standard speech database, and/or a personalized standard speech database.
  • the personalized standard voice database is either a standard voice database formed by recording a specific speaker, or a voice model of a specific speaker formed by corpus training; such a model can be used for voice recognition and can also be used to generate a personalized standard voice database.
  • the corresponding standard voice is selected according to the voice source gender information of the original text data, or other personalized information.
  • the old text content may be empty, that is, content is missing at that position and new text content needs to be added.
  • the new text content may be empty, that is, the old text content being replaced is redundant and needs to be deleted.
  • step S400 may be consistent with the step S500.
  • the step sequence may be adjusted as needed and is not specifically limited; however, after S400 is completed, S401, the courseware data updating step, is performed first.
  • S401 Courseware data updating step: the courseware data in the teaching recording data is updated by the courseware data updating unit through a replacement operation between the new and old courseware.
  • the step of updating the courseware data by the courseware data updating unit further includes the following steps.
  • S4011 Courseware content update step: the content is updated on the basis of the old courseware, with new content replacing old content, to form a new courseware, and the correspondence data between the old courseware and the new courseware, including the update, is recorded.
  • This step can be performed at an appropriate time as needed, but the courseware data update operation can be performed after the courseware content update operation is completed.
  • S4012 Courseware data updating step: the courseware data updating unit obtains the timestamp identifiers of the courseware data in the teaching recording data, processes the new courseware according to the correspondence data, and adds the timestamp identifiers to the new courseware to form new courseware data.
  • S4013 Courseware replacement step: the courseware replacement update unit replaces the old courseware data in the teaching recording data with the new courseware data, thereby updating the courseware data of the teaching recording data.
  • the text data formed by the voice data recognition in step S400 may be further processed by the text data update unit after the courseware data update is completed: the display period of each page of the PPT file is determined from the timestamps, so that positioning can be performed; the text data formed by voice recognition within that display period is searched for the old content according to the content correspondence before and after the update of each page; and the old content is replaced with the new content to form updated text data, so that the text data formed by voice recognition is further updated.
  • using the standard voice library, the corresponding speech segments in the original speech data are replaced with standard speech data segments to form further updated speech data.
  • the parts of the voice data that involve the content of the PPT file can be updated synchronously with the modification of the PPT file, so that the updated voice data matches the updated courseware data, further completing the update of the recording data.
  • the correspondence between the old content and the new content in the text data is presented to the administrator, who confirms according to the context whether to perform the replacement.
  • the system finds such problems, generates prompts or reports to the administrator, and the administrator confirms whether to perform the replacement.
  • the specific steps of the voice update are as follows:
  • the voice update instruction is received; for example, the user can issue the voice instruction "check Hu Jian" through the unit to initiate an update instruction for the problem text "Hu Jian".
  • the user can clarify which text needs to be updated by further voice instructions.
  • the text recognized in the target text as pronounced "hujian" includes "Hu Jian", "mutual see", "shoulder shoulder", and so on; if the user currently wants the first recognized item, the voice command "first" can be issued to determine the first recognized text as the text to be updated.
  • a list of homophone alternatives is displayed near the text so that the user can then select the alternative text. For example, if the first word pronounced "hujian" in the text data, "Hu Jian", is determined as the text to be updated, then in this step a list of alternative texts is displayed near that word: 1, Fujian; 2, accessories; 3, shoulder pads; 4, mutual see, ...
  • the user can speak the position of the alternative text in the alternative text list by voice, and complete the work of selecting the alternative text. For example, use Fujian to replace Hu Jian.
  • the time position information of the text to be updated is marked with a time stamp, so as to accurately locate the time position information of the voice data corresponding to the updated text.
  • an update history is formed, the update history including update time, update content, update operator, and the like.
  • the standard speech data is retrieved according to the alternative text; if the text consists of several words or sentences, the retrieved pieces are combined into a new speech data segment.
  • the text data includes gender information of the voice source, and when the search is performed, a female pronunciation or a male pronunciation, or voice data of various kinds such as different treble and bass voices, may be obtained according to the gender information.
  • the new voice data segment replaces the corresponding voice data segment in the original voice data, according to the previously described time position information, to form new voice data.
  • the pronunciation time is not necessarily the same.
  • a smoothing coefficient can be calculated from the pronunciation durations of the two voice segments, and the standard pronunciation is sped up or slowed down according to this coefficient so that the pronunciation duration of the same text content is the same before and after the replacement.
  • the user uses the terminal to log in to the recording and broadcasting system through the Internet, and can realize the review playback or on-demand playback of the recorded classroom.
  • these recorded classes may also be process records of online conferences. The recording and playback system sends the teaching file ID requested by the user for review or on-demand playback to the teaching server through an encrypted Socket channel; using the teaching file ID, the server obtains the time-stamped action data stream, voice data stream, electronic document (such as PPT file) operation data stream, video data stream, and text data of the course and sends them to the user terminal that requested that teaching file ID; the user terminal then locally restores (reproduces or simulates) the entire classroom teaching process based on the timestamps.
  • These data streams can be displayed, or switched between, in the functional areas of the user terminal. Video can generally be reproduced directly on the user terminal, whereas the operation of the electronic whiteboard can be reproduced in simulation by an electronic whiteboard simulation program. The operations on the PPT file can be replayed on the local terminal.
  • the user can choose to play only at least one of these data streams, for example, can only listen to the voice.
  • text data can be displayed in a specific area of the user terminal, such as the video display area, in the form of subtitles.
  • text data (formed by voice recognition) used for subtitle display may be displayed in a specific editable area so that the user can select it; in this way, when non-standard voice data or text information is found, the user only needs to select the corresponding text to submit feedback.
  • the administrator of the recording and broadcasting system verifies the feedback. If an error is confirmed, the previously described updating process for the text data and the voice data is repeated, so that the text data and the voice data are continuously improved.
  • the terminal and the server are configured to be connectable to a communication network including the Internet. Therefore, the medium may be one that carries the program code fluidly, with the program code downloaded via the communication network.
  • when the program code is downloaded from the communication network as described above, the program for downloading may be stored in the main device or may be installed from another recording medium.
  • the present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the above-described program code is embodied by electronic transmission.
  • the level of classroom recording is improved: with the various data streams saved separately on the basis of timestamps, the voice data is recognized as text and the text is updated, and the voice data is then updated according to the updated text content.
  • updating the content that needs to be updated in the original voice data overcomes the problems caused by under-speaking, mis-speaking, and omissions in class, and yields doubly updated voice data and text data (subtitle information).
  • a new courseware is formed from the old courseware, a correspondence is established between the new and old courseware, and the timestamp information in the recorded data is used to carry out the courseware replacement.
  • voice problems in the teaching process, such as non-standard expression, are solved by updating the voice data, and old courseware can be replaced with new courseware through the courseware replacement operation, so that the new courseware is displayed to users who learn from the teaching recording data.
  • in this way, after the teaching process is completed, that is, after the teaching recording data is formed, the problems that occurred in the teaching process can still be remedied and improved through the update operation.
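
The duration-matching step in the list above (a smoothing coefficient computed from the pronunciation durations of the original and the standard voice segments) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the source does not define the coefficient's formula, so it is taken to be the ratio of the two durations; audio is modelled as a plain list of samples; and the function names are hypothetical. Nearest-neighbour resampling is used only to keep the example short; it also shifts pitch, which a real system would likely avoid with a proper time-stretching method.

```python
def smoothing_coefficient(original_len: int, standard_len: int) -> float:
    """Assumed definition: ratio of standard to original duration.

    A value above 1 means the standard clip must be sped up; below 1, slowed down.
    """
    if original_len <= 0 or standard_len <= 0:
        raise ValueError("durations must be positive")
    return standard_len / original_len


def fit_to_duration(clip: list[float], target_len: int) -> list[float]:
    """Resample clip (nearest neighbour) so that it has exactly target_len samples."""
    coeff = smoothing_coefficient(target_len, len(clip))
    last = len(clip) - 1
    return [clip[min(int(i * coeff), last)] for i in range(target_len)]


def replace_segment(voice: list[float], rate_hz: int,
                    start_ms: int, end_ms: int,
                    standard_clip: list[float]) -> list[float]:
    """Replace voice[start_ms:end_ms] with the standard clip, keeping total length."""
    start = start_ms * rate_hz // 1000  # timestamp -> sample index
    end = end_ms * rate_hz // 1000
    if end <= start:
        raise ValueError("end_ms must be greater than start_ms")
    fitted = fit_to_duration(standard_clip, end - start)
    return voice[:start] + fitted + voice[end:]
```

Because the fitted clip has exactly the length of the segment it replaces, the timestamps of everything after the replaced segment remain valid, which is the point of equalising the durations.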

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Computer Security & Cryptography (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The present invention provides a teaching recording data updating device, comprising a voice data updating unit and a courseware data updating unit. Replacement is performed using written content generated by voice data recognition, and old courseware data, particularly old PPT courseware data, is replaced by new courseware data, such that after a teaching process is recorded, updating is performed using the resulting teaching recording data. By using the device of the present invention, after a teaching process is recorded, a voice-related error, such as a mistaken, omitted, or inaccurate expression that occurs during teaching can be corrected by updating voice data. In addition, old courseware can be replaced by new courseware, such that new courseware is displayed to a user learning from the teaching recording data. Therefore, after a teaching process is completed, that is, after teaching recording data is formed, an error that occurs in the teaching process can be corrected by means of updating.

Description

Device for updating teaching recording data
Technical field
The invention belongs to the technical field of network teaching recording and broadcasting. It can be used to record and play back teaching activities or conference sessions conducted through network teaching, online conferences, and the like, and in particular relates to a device capable of updating the voice data and courseware data in teaching recording data whose recording has been completed.
Background art
In recent years, traditional teaching models have become less and less able to meet users' demand for new teaching methods that are multimedia-based, information-based, and easy to play back. With the rapid development and spread of Internet technology, especially mobile Internet technology, various network teaching recording and broadcasting systems have flourished. In network teaching, the teaching process is recorded through classroom recording so that teaching resources can be shared on the Internet; users can access these resources online from their terminals, which meets their needs for remote learning and review.
Early technology in teaching recording and broadcasting, such as CN101141271A (published March 12, 2008), discloses a recording and broadcasting system for network teaching, comprising a recorder, a processor, a first network, a second network, a server, a database, and three clients. The recorder mainly includes a camera and a wireless digital microphone for recording the video information and voice data of the courseware. The first network transmits the courseware information to the server. The server on the one hand further processes the courseware information to generate courseware data, and on the other hand searches for and retrieves the courseware data in the database and converts it back into the courseware information. The database stores the courseware data. The second network connects the clients to the server. The clients allow users to query and retrieve courseware information. That patent application discloses a fairly typical technique for recording courses in a streaming media format; seen from today, its main drawbacks are that the recorded files are large, uploading and downloading are slow, and the required storage space is large.
More recent technology in teaching recording and broadcasting, such as CN105306861A (published February 3, 2016), discloses an effective classroom teaching recording and broadcasting method and system. During network teaching or an online conference, the user's functional operations on a multimedia whiteboard, the lecture or speech voice, and the interactive voice of communication with and/or tutoring of other users are recorded as separate data streams, and the recording and broadcasting system for network teaching generates a unified timestamp to mark each data stream, rather than recording the whole event entirely in a streaming media format. Network users can thus conveniently download the data streams to be played from a cloud server or LAN server over the network anytime and anywhere; after obtaining the data streams, the client on the user terminal reproduces them according to the timestamps and plays them back in combination for the user, completing on-demand viewing. That patent application discloses a classroom recording method that stores and records classroom teaching data separately in three data stream formats according to timestamps.
As the quality expected of recorded courses keeps rising, more and more teaching recording and broadcasting systems adopt speech recognition technology, which usually requires converting speech into text that is displayed on screen as subtitles or saved in text format. In the prior art there are many patent applications on speech recognition, especially on converting speech into text or text into speech, for example:
CN101354748A (published January 28, 2009) discloses a character recognition device comprising a camera device, a character recognition device, a voice conversion device, and a voice output device. The camera device captures text information and sends it to the character recognition device as a picture; the character recognition device recognizes the text information in the picture and sends it to the voice conversion device; the voice conversion device converts the text information into voice data and sends it to the voice output device; and the voice output device plays the voice data. That patent application discloses a technique for capturing and recognizing text symbols in image information and then converting them into speech.
CN102956231A (published March 6, 2013) discloses a device and method, in the field of speech recognition technology, for recording key speech information based on semi-automatic correction. The device comprises a key information extraction unit and an information correction unit connected to it: the key information extraction unit obtains uncorrected text data, extracts the key information, and outputs it to the information correction unit, and the information correction unit outputs the text data after confirmation by user feedback. Through the semi-automatic information correction unit, that invention reduces the workload of manual correction; it uses a database to correct special nouns such as place names and names of professional tools, reducing the effect of the limits of the operator's knowledge in manual correction; and it extracts the key information in the voice data, thereby increasing the amount of useful information recorded. That patent application aims to solve the problem of semi-automatically correcting text data after speech has been converted into text.
CN105159870A (published December 16, 2015) discloses a processing system for accurately converting continuous natural speech into text. The processing system comprises a cloud speech recognition engine and a post-recognition update platform connected to the cloud speech recognition engine. The post-recognition update platform comprises a display unit, an update operation unit, a control unit, and a three-dimensional integrated generation unit. The update operation unit supports voice, keyboard, mouse, and keyboard-plus-mouse update operations. The application discloses that the voice file to be recognized can be finely segmented to achieve accurate recognition.
CN105808197A (published July 27, 2016) discloses an information processing method applied to an electronic device having a speech recognition module. The method comprises: receiving input voice data; and, after the input voice data has been recognized according to a preset speech recognition model to obtain a recognition result, when first information in the recognition result, the first information being at least one character of the recognition result, is content that needs to be updated, updating the first information in the recognition result by input through an operating body. With this approach only the part to be updated needs to be updated, and the desired result is obtained without the user having to input the voice data again; the operation is simple, and the overall speed of information input is improved. That patent application discloses that only the first place needing an update after speech recognition has to be updated, which speeds up updating; but such updating applies only to the recognized text data. During speech recognition, the information to be recognized is compared with standard voice data to improve recognition accuracy.
CN106328145A (published January 11, 2017) discloses a voice update method and device, comprising: acquiring voice data input by a user; recognizing the voice data to obtain the corresponding text content; when the text content contains a first preset keyword, dividing the text content into original text and edit text according to the first preset keyword, the edit text being used to update the original text; extracting the text to be updated from the original text according to the edit text; and updating the original text according to the edit text and the text to be updated to obtain the updated text. That patent application discloses that the text in the original text that needs editing, i.e. the edit text, can be obtained by keyword recognition, so that the text content formed by speech recognition is updated in a targeted manner.
CN102215233A (published October 12, 2011) discloses an information system client installed in a user's terminal device and applicable to microblogs, blogs, forums, personal spaces, and the like. It comprises a user interaction module and a voice module connected to the user interaction module, and preferably also a feedback module and a conversion module. The voice module comprises a voice collection unit, a voice recognition unit, and a voice synthesis unit: the voice collection unit collects the user's voice; the voice recognition unit recognizes the collected voice as text and outputs it to the user interaction module; and the voice synthesis unit converts text obtained by the user interaction module from the information system server into voice output to the user. The feedback module is connected to the voice recognition unit and confirms whether the voice has been correctly recognized as text: if so, the feedback module outputs the text to the user interaction module; if not, the feedback module causes the voice collection unit to collect the user's voice again, or the voice recognition unit to update the text, until it is confirmed correct. That patent application discloses a technique for converting between voice and text in both directions, aiming to convert information in one format into information in another; if the text output is incorrect, the feedback module re-collects the user's voice or directly updates the output text.
CN106486113A (March 8, 2017) discloses a meeting minutes method, comprising: acquiring a voice signal; converting the voice signal into corresponding text information by voice conversion software and displaying it in a document, the text information including correct text information and erroneous text information; marking the erroneous text information in the document and linking each marked item to the voice signal corresponding to it; when the erroneous text information is clicked, using the voice conversion software to re-recognize the linked voice signal and displaying the re-recognized text information in the document in editable form; and correcting the erroneous text information in the editable display to obtain corrected text information, which replaces the erroneous text information.
In summary, in the prior art, neither the field of teaching recording and broadcasting nor the field of speech recognition and conversion addresses the idea of updating classroom recording data that has already been formed, in particular the idea of updating the speech itself by updating the recognized text content, or of replacing old courseware with new courseware so that an old course can use new courseware. The prior art is concerned with the accuracy of speech recognition and conversion, especially of converting speech into text. However, in any teaching or conference session, any speaker may say something wrong, leave something out, pronounce something in a non-standard way, or even express something in a non-standard way. These problems are usually handled at speech recognition time, that is, when the speech is converted into text (for example, presented as subtitles), by adding text annotations (for example, explanations in parentheses) to flag them.
In a teaching recording and broadcasting system in particular, the taught course is recorded and reproduced for users over a network, and because the speech data is compressed, the effect of misstatements, omissions, and non-standard expression becomes more pronounced and significant. On the one hand, users usually find it difficult to recognize these errors, even when they are flagged in subtitles. On the other hand, the usage environment may make it inconvenient for users to read subtitles, so that they can only listen to the audio, and unclear spoken expression further impairs learning.
For courseware data, especially PPT-type courseware whose content can be divided into pages, once an error occurs (for example, a textual error or an error of expression), it can no longer be corrected after the course has been recorded. The content must either be deleted or remedied by adding an explanatory note. Clearly, these problems call for a new solution.
In view of the problems in the prior art, the present invention aims to provide an apparatus for updating teaching recording data. By performing replacement operations on the text content recognized from the voice data and on the old and new courseware data, especially PPT courseware data, the apparatus makes it possible to update the teaching recording data after recording of the teaching process has been completed.
Summary of the Invention
The present invention aims to provide an apparatus for updating teaching recording data. The teaching recording data includes separately stored data streams such as voice data, video data, and action data. Specifically, a recording device converts the voice signal captured during network teaching or an online conference into timestamped original voice data, which is saved in a "voice data stream + timestamp" format. Action data recording software captures the operation actions performed on courseware, especially PPT files, which are saved in a "courseware operation action data stream + timestamp" format.
According to the apparatus of the present invention, a speech recognition model converts the original voice data into original text data. The original text data is proofread, and old text content that needs updating is replaced with new text content, which yields updated text data. The timestamps are then used for positioning, and the voice data segment corresponding to the old text content is replaced with standard voice data for the new text content, which yields updated voice data.
According to the apparatus of the present invention, new courseware is produced by updating the old courseware, and an association between the old and new courseware is established, including the correspondence between pages and the correspondence between updated contents. The timestamp information of the old courseware data within the recording data is obtained. Based on the correspondence and the timestamp information, the new courseware replaces the old courseware, so that when the recording data is played back on demand, the new courseware content is presented to the user in the same manner in which the old courseware was presented during recording.
It should be understood that although the description mainly presents embodiments of the invention in terms of recording and broadcasting network teaching or network conferences, the apparatus of the invention can also be used to record, play, and update other online communication processes. In other words, the invention relates to methods, systems, and computer program products for recording and playing teaching activities or conference proceedings based on network teaching, online training, emergency command (map annotation and voice recording), financial systems, online conference systems, and the like. In network teaching, online training, emergency command (map annotation and voice recording), financial systems (trading commentary), or online conferences, wherever voice data and PPT files are involved, the voice data can be updated by updating the text data recognized from it and replacing the originally recorded voice data with standard voice data for the updated text content, and the courseware data can be updated by replacing the old courseware with the new.
The present invention provides an apparatus for updating teaching recording data. During the recording and on-demand review of a multimedia classroom (or online classroom) or a similar scenario, and in particular when a multimedia classroom is recorded, the voice data, the action data on the multimedia whiteboard (electronic whiteboard writing), the operation action data on the user terminal screen, the video data recorded by the video recording device, and so on are each timestamped and saved separately in data stream format, which forms the teaching recording data. After logging in to the network teaching recording system, a user obtains the teaching recording data over a wired or wireless local area or wide area network. The timestamps are then used to reproduce, or simulate a reproduction of, the classroom teaching process on the user terminal, which provides review playback or on-demand playback of the recorded class.
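The timestamp-based recombination of separately stored streams described above can be illustrated with a minimal Python sketch. The `Event` record, the millisecond timestamps, and the stream names are assumptions made for illustration and are not taken from the patent.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Event:
    timestamp_ms: int   # hypothetical: offset from the start of the recording
    stream: str         # e.g. "voice", "whiteboard", "courseware", "video"
    payload: str        # placeholder for the actual data chunk


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Interleave separately stored streams in timestamp order.

    Each input stream is assumed to be sorted by timestamp already,
    which is the order in which it was recorded.
    """
    return heapq.merge(*streams, key=lambda e: e.timestamp_ms)


voice = [Event(0, "voice", "chunk-0"), Event(400, "voice", "chunk-1")]
slides = [Event(100, "courseware", "open page 1"),
          Event(900, "courseware", "next page")]
board = [Event(250, "whiteboard", "stroke-1")]

timeline: List[Event] = list(merge_streams(voice, slides, board))
order = [(e.timestamp_ms, e.stream) for e in timeline]
```

Because each stream stays in its own store, one stream (for example the voice stream) can later be replaced without touching the others, as long as its timestamps remain aligned.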
The apparatus of the present invention for updating teaching recording data includes: a file identification generating unit, a voice data collection unit, a voice data update unit, a courseware data collection unit, a courseware data update unit, an other-data collection unit, a recorded data playback unit, an error information feedback unit, and so on, wherein:
the file identification generating unit is configured to generate a file identification ID when recording of the teaching process begins;
the voice data collection unit is configured to convert a voice signal into original voice data using an audio capture device and to save it in voice data stream format; preferably, it collects at least one item of voice data from at least one voice source, adds a timestamp, and saves the result in voice data stream format;
the voice data update unit is configured to update the portions of the original voice data that need updating, forming updated voice data, where the voice data is updated by performing replacement operations on the text content recognized from the voice data;
the courseware data collection unit is configured to capture, during the teaching process, the operation action data for courseware files, especially PPT files, to form courseware data;
the courseware data update unit is configured to update the courseware data in the teaching recording data, where the courseware data is updated by replacing old courseware with new courseware;
the other-data collection unit is configured to collect at least one of the following: action data on the multimedia whiteboard, operation data on the user terminal screen, and video data from the video recording device; each kind of collected data is timestamped and saved separately in data stream format, and together with the voice data stream and the courseware data forms recorded data that can be played on demand;
the recorded data playback unit allows a user to obtain the recorded data over the network using a terminal and to combine the different data streams according to the timestamps, so that the recorded data is played on the terminal, the teaching process is reproduced and/or simulated, and the teaching process can be studied and/or reviewed;
the error information feedback unit allows a user who is playing the recorded data on the terminal to select erroneous text content found in the updated text data and submit it as feedback; once an administrator confirms the feedback, the updated text data is updated and the voice data replacement unit is invoked again to update the updated voice data.
The voice data update unit further includes:
a voice data recognition unit, configured to convert the original voice data into original text data according to a speech recognition model, where the time coordinate of each item of text content in the text data can be determined from the timestamps;
a text data update unit, configured to proofread the original text data and replace old text content that needs updating with accurate new text content, forming updated text data;
a voice data replacement unit, configured to replace the voice data segment of the old text content in the original voice data with standard voice data for the new text content, forming updated voice data and thereby updating the voice data of the teaching recording data.
The courseware data update unit further includes:
a courseware content update unit, configured to update content on the basis of the old courseware by replacing old content with new content to form new courseware, and to record correspondence data between the old and new courseware, including the association between pages before and after the update and the correspondence between the content of each page before and after the update;
a courseware data update unit, configured to obtain the timestamp identifiers of the courseware data in the teaching recording data, process the new courseware according to the correspondence data, and add the timestamp identifiers to the new courseware to form new courseware data;
a courseware replacement update unit, configured to replace the old courseware data in the teaching recording data with the new courseware data, thereby updating the courseware data of the teaching recording data.
The courseware is preferably a PPT file. The standard voice data is obtained by searching a standard voice database. The voice data and the courseware data are saved separately; preferably, the voice data is saved in a "voice data stream + timestamp" format and the courseware data in a "courseware operation data stream + timestamp" format.
It is particularly preferred in the present invention that, after the courseware data update is complete, the text data update unit further determines the display time of each page of the PPT file from the timestamp identifiers and, according to the correspondence between each page's content before and after the update, searches the text data within that display time for the old content and replaces it with the new content. This forms updated text data and further updates the text data, so that the updated voice data matches the updated courseware data. Courseware updates and voice updates are thereby linked: once content referred to in the voice data has been updated and replaced through the new PPT file, the related content and speech in the voice data are replaced according to the replacement relationship between the old and new PPT content. The spoken explanation and the displayed PPT content are thus modified in step, which further improves the completeness and practicability of the updated recording data.
For this further update of the text data, before the replacement is carried out, the correspondence between the old content and the new content in the text data is shown to an administrator, who decides from the context whether to perform the replacement.
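The courseware-driven text update and the administrator confirmation step can be sketched as follows. This is a minimal illustration under stated assumptions: the `TimedText` record, the per-page display windows, and the per-page `{old: new}` change map are hypothetical data shapes, not structures defined by the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class TimedText:
    text: str
    start_ms: int   # hypothetical: when the sentence starts in the recording
    end_ms: int


def propose_replacements(
    transcript: List[TimedText],
    page_windows: Dict[int, List[Tuple[int, int]]],
    page_changes: Dict[int, Dict[str, str]],
) -> List[Tuple[int, str, str]]:
    """Return (segment index, old text, new text) proposals for review.

    A proposal is made only when the segment was spoken while the page
    whose content changed was on screen.
    """
    proposals = []
    for index, segment in enumerate(transcript):
        for page, windows in page_windows.items():
            on_screen = any(s <= segment.start_ms < e for s, e in windows)
            if not on_screen:
                continue
            for old, new in page_changes.get(page, {}).items():
                if old in segment.text:
                    proposals.append((index, old, new))
    return proposals


def apply_confirmed(
    transcript: List[TimedText],
    proposals: List[Tuple[int, str, str]],
    confirmed: Set[int],
) -> List[TimedText]:
    """Apply only the proposals whose position in `proposals` was confirmed."""
    for position, (index, old, new) in enumerate(proposals):
        if position in confirmed:
            transcript[index].text = transcript[index].text.replace(old, new)
    return transcript


transcript = [
    TimedText("The boiling point is 90 degrees", 5_000, 8_000),
    TimedText("Recall 90 degrees from earlier", 70_000, 73_000),
]
page_windows = {1: [(0, 60_000)], 2: [(60_000, 120_000)]}
page_changes = {1: {"90 degrees": "100 degrees"}}

proposals = propose_replacements(transcript, page_windows, page_changes)
apply_confirmed(transcript, proposals, confirmed={0})
```

In this example only the sentence spoken while page 1 was displayed is proposed for replacement; the later mention, spoken over page 2, is left for the administrator to handle separately.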
The updated text data is displayed on screen as subtitles according to the timestamp identifiers, preferably in the screen area where the video data is played. More preferably, the text data is displayed in a specific area of the screen in an editable form, for example a selectable form.
The apparatus further includes an error information feedback unit.
With the error information feedback unit, a user who is playing the teaching recording data on a terminal can select erroneous text content found in the updated text data and submit it as feedback. Once an administrator confirms the feedback, the updated text data is updated again, and the updated voice data is updated again by the voice data replacement unit.
Whenever the text data and voice data are updated, an update history is created. The update history may include the update time, the updated content, the person who performed the update, the person who found the problem, and so on.
The voice data replacement unit is configured to calculate a smoothing coefficient from the duration of the replaced old text content in the original voice data and the duration of the standard voice data for the new text content, and then to adjust the duration of the new text content according to the smoothing coefficient, so that the voice data remains smooth and synchronized before and after the replacement.
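The patent does not give a formula for the smoothing coefficient. The following sketch assumes, purely for illustration, that the coefficient is the ratio of the old segment's duration to the new segment's duration, clamped to a range so that the replacement speech is not stretched or compressed unnaturally. The function names and the clamp bounds (0.8 and 1.25) are hypothetical.

```python
def smoothing_coefficient(old_ms: int, new_ms: int,
                          lower: float = 0.8, upper: float = 1.25) -> float:
    """Assumed definition: old duration / new duration, clamped to [lower, upper]."""
    if old_ms <= 0 or new_ms <= 0:
        raise ValueError("durations must be positive")
    return max(lower, min(upper, old_ms / new_ms))


def adjusted_duration_ms(old_ms: int, new_ms: int) -> int:
    """Duration of the new speech after time-scaling by the coefficient."""
    return round(new_ms * smoothing_coefficient(old_ms, new_ms))


# Old segment lasted 1200 ms, standard speech lasts 1000 ms: stretch by 1.2.
k_small = smoothing_coefficient(1200, 1000)
fit_small = adjusted_duration_ms(1200, 1000)

# Old segment lasted 2000 ms: the raw ratio 2.0 is clamped to 1.25.
k_large = smoothing_coefficient(2000, 1000)
fit_large = adjusted_duration_ms(2000, 1000)
```

When the clamp takes effect, the adjusted segment no longer fills the original slot exactly, so the remaining gap would have to be absorbed elsewhere, for example by silence or by shifting later timestamps; the patent leaves this open.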
The method of the present invention raises the quality of classroom recording. The various data streams are stored separately on the basis of timestamps. The voice data is recognized and converted, the text data is updated, and the voice data is then updated according to the updated text content. The parts of the original voice data that need updating are thereby updated, which overcomes the problems caused by saying too little, misspeaking, or leaving things out in class, and yields both updated voice data and updated text data (subtitle information). New courseware is produced by updating the old courseware, a correspondence is established between the old and new courseware, and this correspondence is combined with the courseware's timestamp information in the recording data to replace and update the courseware.
With the apparatus of the present invention, after recording of the teaching process is complete, updating the voice data resolves speech problems that arose during teaching, such as misstatements, omissions, and non-standard expression. In addition, by replacing the old courseware with the new, the new courseware can be shown to users who study the teaching recording data. As a result, even after the teaching process has finished and the teaching recording data has been produced, problems in the teaching process can still be remedied and improved through update operations.
The above and further objects and features of the present invention will become clearer and more complete from the following detailed description taken in conjunction with the accompanying drawings.
Brief Description of the Drawings
Figure 1 is an architecture diagram of a recording and broadcasting system according to the present invention;
Figure 2 is a flowchart of the recording data update steps according to the present invention;
Figure 3 is a flowchart of the courseware update steps according to the present invention; and
Figure 4 is a flowchart of the voice update steps according to the present invention.
Detailed Description
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
In the present invention, network teaching is not limited to classroom teaching between students and teachers. It may include online, remote, or local network teaching in which teachers and students, or trainers, are the participants; online, remote, or local network conferences in which employees of enterprises and institutions are the participants; and other forms of communication and interaction that use a network for online exchange and/or presentation of file content, such as remote collaborative work.
As shown in Figure 1, teacher 1 and student 2 each use a terminal device on which a client of the network teaching recording system is installed, and connect to teaching server 3 over the Internet. Teaching server 3 is also connected to multimedia devices 4, which include at least one of an intelligent electronic whiteboard, a camera, a microphone, and a document camera. This arrangement supports online lecturing, attending, recording, on-demand playback, and review of a multimedia class.
The terminal device includes a processor, a network module, a control module, a display module, and a smart operating system, and may be a smartphone, a tablet (PAD), a notebook computer, a desktop computer, or the like. The terminal may be provided with multiple data interfaces for connecting various expansion devices and accessories through a data bus. The smart operating system may be Windows, Android or its derivatives, or iOS; application software can be installed and run on it to provide the functions of the various applications, services, and application stores or platforms available under that operating system.
The terminal device can connect to the Internet through RJ45, Wi-Fi, Bluetooth, 2G/3G/4G, G.hn, Zigbee, Z-ware, RFID, or similar connections, and through the Internet to other terminals, computers, and devices. It connects to various expansion devices and accessories through data interfaces or buses such as 1394, USB, serial, SATA, SCSI, PCI-E, Thunderbolt, and data card interfaces, and through audio and video interfaces such as HDMI, YPbPr, SPDIF, AV, DVI, VGA, TRS, SCART, and DisplayPort, which together form an interactive conference and teaching equipment system. Voice control and gesture control are provided by a sound capture control module and a motion capture control module implemented in software, or by hardware versions of these modules mounted on the data bus. Display/projection modules, microphones, audio equipment, and other audio and video devices are connected through the audio and video interfaces to provide display, projection, sound input, audio and video playback, and digital or analog audio and video input and output. Cameras, microphones, electronic whiteboards, and RFID readers are connected through the data interfaces to provide image input, sound input, control and screen recording of the electronic whiteboard, and RFID reading; removable storage devices, digital devices, and other devices can be attached and managed through the corresponding interfaces. DLNA/IGRS technology and Internet technology are used to provide control, interaction, and screen casting between multi-screen devices.
In the present invention, a processor is defined to include, but is not limited to: an instruction execution system such as a computer/processor-based system, an application-specific integrated circuit (ASIC), a computing device, or a hardware and/or software system that can fetch or obtain logic from a non-transitory storage medium or non-transitory computer-readable storage medium and execute the instructions contained in it. The processor may also include any controller, state machine, microprocessor, Internet-based entity, service, or feature, or any other analog, digital, and/or mechanical implementation of these.
In the present invention, the Internet may include local area networks and the wide-area Internet, may be wired or wireless, and may be any combination of these networks.
As shown in Figure 2, the main steps for updating network teaching recording data according to the present invention are as follows:
S100: Start the recording system. After a user (for example, a teacher) logs in using a terminal, the multimedia devices 4, such as the intelligent electronic whiteboard, the screen operation capture program on the teacher terminal, the cameras, and the microphones, enter their working state. There may be more than one camera. There is at least one microphone; separate microphones capture the teacher's speech and the students' speech, and the captured speech is saved together with digital timestamps in voice data stream format. The screen operation capture program captures the operations performed on courseware files, especially PPT files, on the teacher terminal. It obtains the page information, action information, and timestamp of each operation on the PPT file, which together form the courseware operation action stream data. The teaching server of the recording system can be used to generate the digital timestamps.
S200: Start network teaching. The teacher begins the class, and the file identification generating unit generates a teaching file ID. During teaching, the teacher may, for example, present material on the intelligent electronic whiteboard (used for lecture notes or for working through problems), explain by live speech, communicate through live interactive speech, and present and explain electronic documents such as PPT documents on the teacher terminal, thereby delivering a multimedia lecture and interacting with students through questions and answers. The devices described in S100 capture each of these and form the corresponding data streams.
S300: Save the recorded data. During recording, actions on the intelligent electronic whiteboard are transmitted and saved as "action data stream + timestamp". Speech during the lecture and interaction is transmitted and saved by the voice data collection unit as "voice data stream + timestamp". Operations on electronic documents such as PPT documents on the teacher terminal are transmitted and saved by the courseware data collection unit as "electronic document operation data stream + timestamp". Other data streams can be collected by the other-data collection unit; for example, video data is transmitted and saved as "video data stream + timestamp". All of these data streams from the lecture are bound to the teaching file ID, which identifies the recorded course. Data streams can be added or omitted as needed; a typical recording contains voice data, video data, and PPT document presentation data. Recording by category and displaying in split screens is a fairly mature technique in the prior art. The recorded data can first be saved to a local database or a terminal database and then uploaded from these databases to the remote teaching server over the network, or it can be saved directly to the remote teaching server.
In one example, the voice data collection unit collects voice data by capturing the voice signal with a voice capture device, such as any available microphone, converting the voice signal into voice data, and saving it in data stream format. With a single voice source, the gender of the voice source can be tagged, so that a standard voice of the corresponding gender can be selected during a later voice update (replacement) operation. With multiple voice sources, the gender of each source can be identified separately; the sources can be recognized and clustered to form voice files tagged by voice source, which are timestamped and saved separately. Existing techniques can be used to distinguish multiple voice sources and are not described further here.
In one example, the electronic document is preferably a PPT document. Optionally, it may be any other electronic document that can serve as courseware and can be divided into pages, such as a WORD document. The "electronic document operation data stream + timestamp" may specifically consist of the page information, action information, and timestamp information of the PPT file.
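As an illustration of the "page information + action information + timestamp" record, the following Python sketch derives the on-screen interval of each page from a sequence of courseware operation events. The `SlideEvent` fields and action names are assumptions made for this example; the patent does not specify a record layout.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SlideEvent:
    page: int           # page shown after the action (hypothetical field)
    action: str         # e.g. "open", "next", "prev" (hypothetical values)
    timestamp_ms: int   # offset from the start of the recording


def page_intervals(events: List[SlideEvent],
                   end_ms: int) -> Dict[int, List[Tuple[int, int]]]:
    """Map each page to the [start, end) intervals during which it was shown."""
    ordered = sorted(events, key=lambda e: e.timestamp_ms)
    intervals: Dict[int, List[Tuple[int, int]]] = {}
    for i, current in enumerate(ordered):
        # A page stays on screen until the next event, or until the recording ends.
        stop = ordered[i + 1].timestamp_ms if i + 1 < len(ordered) else end_ms
        if stop > current.timestamp_ms:
            intervals.setdefault(current.page, []).append(
                (current.timestamp_ms, stop))
    return intervals


events = [
    SlideEvent(page=1, action="open", timestamp_ms=0),
    SlideEvent(page=2, action="next", timestamp_ms=60_000),
    SlideEvent(page=1, action="prev", timestamp_ms=90_000),
]
windows = page_intervals(events, end_ms=120_000)
```

Intervals of this kind are what a courseware update needs in order to attach the old pages' timestamps to the corresponding pages of the new courseware.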
S400: Voice data recognition. The recorded original voice data is first recognized and converted into original text data by the voice data recognition unit using a speech model, and the original text data is then proofread and updated by the text data update unit. When the original text data is formed, the timestamps of the original voice data are added to the text data, so that text content in the text data can be located in time. The text content may be at least one character, word, sentence, or paragraph of the text data. This time location yields clock data that marks the time dimension of the audio data, that is, a clock parameter that gives the relative time position of a particular data segment within an item of audio data.
Any available speech model can be used to recognize the original voice data and convert it into original text data. During recognition, the gender of the voice source is identified first and the gender information is added to the text data. Proofreading and updating of the text data may be manual, semi-automatic, voice-driven, and so on.
S500: Voice data update. The voice data replacement unit replaces the original text data using a voice update instruction, that is, using a voice proofreading method (CN106406807A), although the present invention is not limited to this method. The voice data replacement operation includes: receiving a voice update instruction; identifying, in the text data to be updated, all text pronounced the same as the voice update instruction, together with the timestamps of that text; determining the text to be updated among all the identified text; displaying a list of candidate text corresponding to the text to be updated; receiving a candidate selection instruction; and performing the replacement to form updated text data, which completes the text update. After the text update is complete, the voice data segment replacement described below is carried out according to the correspondence information from the text update.
在完成上述文本更新的过程中,从标准语音数据库中调取更新文字的标准发音信息,根据被更新的文字的时间戳,用标准发音信息替换在原语音数据中对应的语音数据片段,更新语音数据,形成新的语音数据。所述标准语音数据是从标准语音数据库中通过搜索获取的。所述标准语音数据库可以包括女生标准语音数据库、男生标准语音数据库和/或个性化标准语音数据库。所述个性化标准语音数据库是,通过对于特定发音人录制形成的标准语音数据库,或者通过语料训练,形成的特定发音人的语音模型,可以用于语音识别,还可以用于生成个性化标准语音数据库。In the process of completing the above text update, the standard pronunciation information of the updated text is retrieved from the standard voice database, and the corresponding voice data segment in the original voice data is replaced with the standard pronunciation information according to the time stamp of the updated text, and the voice data is updated. , forming new voice data. The standard voice data is obtained by searching from a standard voice database. The standard speech database may include a girls standard speech database, a boys standard speech database, and/or a personalized standard speech database. The personalized standard voice database is a voice model of a specific speaker formed by a standard voice database formed by recording a specific speaker, or by corpus training, and can be used for voice recognition, and can also be used to generate personalized standard voice. database.
在从标准语音数据中调取标准发音信息时,根据所述原始文本数据的语音源性别信息,或者其他个性化信息,选择相应的标准语音。作为一种选择,所述旧文字内容可以为空内容,也就是,替换所述空内容的新文字内容是遗漏的,现在需要添加的文字内容。所述新文字内容可以为空内容,也就是,被替换的所述旧文字内容是多余的,现在需要删除的文字内容。When the standard pronunciation information is retrieved from the standard voice data, the corresponding standard voice is selected according to the voice source gender information of the original text data, or other personalized information. Alternatively, the old text content may be empty content, that is, the new text content replacing the empty content is missing, and the added text content is now required. The new text content may be empty content, that is, the old text content that is replaced is redundant, and the deleted text content is now required.
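The timestamp-based segment replacement can be sketched as a splice on a sample buffer. This is only an illustration under stated assumptions: the audio is modelled as a plain sequence of samples at a fixed sample rate, and the function name `splice_segment` is hypothetical. An empty old segment (start equal to end) corresponds to inserting omitted speech, and an empty replacement corresponds to deleting superfluous speech.

```python
from typing import List, Sequence


def splice_segment(samples: Sequence[float], sample_rate: int,
                   start: float, end: float,
                   replacement: Sequence[float]) -> List[float]:
    """Replace the audio between start and end (seconds) with replacement.

    start == end inserts the replacement without removing anything;
    an empty replacement deletes the old segment.
    """
    if not 0 <= start <= end:
        raise ValueError("require 0 <= start <= end")
    i = round(start * sample_rate)  # first sample of the old segment
    j = round(end * sample_rate)    # one past the last sample of the old segment
    if j > len(samples):
        raise ValueError("segment extends past the end of the recording")
    return list(samples[:i]) + list(replacement) + list(samples[j:])
```

Because the replacement may be longer or shorter than the old segment, every timestamp after the splice point shifts by the difference unless the replacement is first adjusted to the original duration.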
After step S300 (saving the recorded data), step S400 may begin and may proceed together with step S500; the order of the steps can be adjusted as needed and is not specifically limited. Preferably, however, step S401 (courseware data update) is performed first once S400 is complete.
S401: Courseware data update step. The courseware data updating unit updates the courseware data in the teaching recording data by replacing the old courseware with the new courseware.
As shown in FIG. 3, the updating of the courseware data by the courseware data updating unit further comprises the following steps.
S4011: Courseware content update step. The courseware content updating unit updates the content on the basis of the old courseware, replacing old content with new content to form new courseware, and records correspondence data between the old and new courseware, including the association between pages before and after the update and the content correspondence of each page before and after the update. This step may be performed at any suitable time, but the courseware data update can only proceed after the courseware content update is complete.
S4012: Courseware data update step. The courseware data updating unit obtains the timestamp identifiers of the courseware data in the teaching recording data, processes the new courseware according to the correspondence data, and adds the timestamp identifiers to the new courseware, forming new courseware data.
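One way to picture step S4012 is as a re-mapping of timestamped page events through the page correspondence recorded in S4011. The sketch below assumes the old courseware data is a list of (timestamp, old page) events and the correspondence is a dictionary from old page numbers to new page numbers; both representations and the name `retime_courseware` are assumptions made for illustration, and dropping events for pages without a counterpart is a design choice of this sketch only.

```python
from typing import Dict, List, Tuple

PageEvent = Tuple[float, int]  # (timestamp in seconds, page number)


def retime_courseware(old_events: List[PageEvent],
                      page_map: Dict[int, int]) -> List[PageEvent]:
    """Attach the old timestamps to the corresponding pages of the new courseware.

    Events whose old page has no counterpart in the new courseware are dropped.
    """
    new_events: List[PageEvent] = []
    for timestamp, old_page in old_events:
        new_page = page_map.get(old_page)
        if new_page is not None:
            new_events.append((timestamp, new_page))
    return new_events
```

The result has the same timeline as the old courseware data, so it can be swapped in without touching the other recorded streams.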
S4013: Courseware replacement step. The courseware replacement updating unit replaces the old courseware data in the teaching recording data with the new courseware data, thereby updating the courseware data of the teaching recording data.
As a preferred example, after the courseware data update is complete, the text data updating unit may further process the text data formed by voice recognition in step S400. It determines the display time of each page of the PPT file from the timestamp identifiers, which allows positioning, and, according to the content correspondence of each page before and after the update, searches the recognized text data within that display time for the old content and replaces it with the new content, forming updated text data. After this further text update, the corresponding voice segments in the original voice data are replaced with standard voice data segments from the standard voice library, following the voice replacement of step S500, to form further updated voice data. In this way, the spoken explanation that refers to the PPT file is updated in step with the changes to the PPT file, so that the updated voice data matches the updated courseware data and the recording data is updated as a whole.
As an example, in this further update of the recognized text data, before the replacement is completed, the correspondence between the old content and the new content in the text data is shown to an administrator, who confirms from the context whether to perform the replacement.
In other words, after the courseware data has been updated, the correspondence data of the courseware update and the timestamp information are used, for example, to obtain the start and end times during which an updated PPT page was displayed, and the text data produced by speech conversion is searched within that interval for the content that was replaced between the old and new courseware. For example, suppose the text "麻酱" (sesame paste) on a page, which may have been an inaccurate description, is changed to "辣酱" (chili sauce) in the new courseware, and the spoken explanation of that page also mentions "麻酱". That spoken "麻酱" is then inaccurate and should be replaced with "辣酱". Because the semantics can be complex, however, the system preferably detects such cases and generates a prompt or report for the administrator, who confirms whether to perform the replacement.
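The search within a page's display interval can be sketched as follows. The data shapes are assumptions for illustration: page events are (timestamp, page) pairs sorted by time, the transcript is a list of (start, end, text) units, and `page_changes` maps a page number to its (old, new) content pairs. The function only builds a report for the administrator and does not replace anything itself, in line with the preference stated above.

```python
from typing import Dict, List, Tuple

PageEvent = Tuple[float, int]    # (timestamp, page number), sorted by timestamp
Unit = Tuple[float, float, str]  # (start, end, recognized text)


def page_intervals(events: List[PageEvent],
                   recording_end: float) -> List[Tuple[int, float, float]]:
    """Turn page-turn events into (page, begin, end) display intervals."""
    intervals = []
    for k, (timestamp, page) in enumerate(events):
        end = events[k + 1][0] if k + 1 < len(events) else recording_end
        intervals.append((page, timestamp, end))
    return intervals


def find_candidates(events: List[PageEvent], recording_end: float,
                    transcript: List[Unit],
                    page_changes: Dict[int, List[Tuple[str, str]]]) -> List[dict]:
    """List spoken occurrences of replaced content for administrator review."""
    report = []
    for page, begin, end in page_intervals(events, recording_end):
        for old, new in page_changes.get(page, []):
            for start, _stop, text in transcript:
                # Only speech that starts while the page is displayed counts.
                if begin <= start < end and old in text:
                    report.append({"page": page, "time": start,
                                   "old": old, "new": new, "context": text})
    return report
```

Each report entry carries the surrounding text as context, so the administrator can judge whether the spoken phrase really refers to the changed page content.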
As shown in FIG. 4, in one example the voice update proceeds as follows:
S110: Receive instruction
When a problem is found in the recognized text data, for example the text "胡建" needs to be updated, a voice update instruction is received. For example, the user may issue the voice instruction "select 胡建" through this unit to start updating the problem text "胡建".
S120: Find text
All text in the original text data whose pronunciation matches the pronunciation specified by the voice update instruction is identified.
S130: Determine text
The text to be updated is determined among the text identified in the text data.
When the text data contains several pieces of text with the pronunciation specified by the voice update instruction, the user can specify which one to update through a further voice instruction. For example, if the text pronounced "hujian" identified from beginning to end in the text data to be updated is, in order, "胡建", "互见", "护肩", and so on, and the user wants to update the first one, the user can say "the first one" to select the first identified text as the text currently to be updated.
S140: Candidate list
A list of candidate text for the text to be updated is displayed; the candidates are homophones of the text to be updated.
Once the text to be updated has been selected, the list of homophone candidates is displayed near that text so that the user can then choose one. For example, if the first text pronounced "hujian" in the text data, "胡建", is determined to be the text to be updated, a candidate list is displayed near it: 1. 福建; 2. 附件; 3. 护肩; 4. 互见; ...
S150: Selection instruction
A candidate selection instruction is received.
The user can select a candidate by speaking its position in the candidate list, for example to replace "胡建" with "福建".
S160: Update text
The text to be updated is replaced with the candidate specified by the selection instruction. During the replacement, the time position of the text to be updated is marked with a timestamp so that the time position of the voice data corresponding to the updated text can be located accurately. Preferably, an update history is created while the text data and the voice data stream are updated; the update history includes the update time, the updated content, the operator who performed the update, and so on.
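Steps S120 to S160 can be sketched in a few lines. Everything below is a simplified illustration: the pronunciation table `READINGS` and the candidate table `CANDIDATES` are tiny hand-written stand-ins that mirror the example in the text (a real system would need a pronunciation lexicon), and the names `Word`, `find_by_reading`, and `update_word` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


# Hypothetical lookup tables mirroring the example above.
READINGS = {"胡建": "hujian", "互见": "hujian", "护肩": "hujian"}
CANDIDATES = {"hujian": ["福建", "附件", "护肩", "互见"]}


def find_by_reading(words: List[Word], reading: str) -> List[int]:
    """S120: indices of all words whose pronunciation matches the instruction."""
    return [i for i, w in enumerate(words) if READINGS.get(w.text) == reading]


def update_word(words: List[Word], reading: str, which: int, choice: int,
                operator: str, history: List[dict]) -> Word:
    """S130 to S160: pick the which-th match and the choice-th candidate (1-based)."""
    matches = find_by_reading(words, reading)
    target = words[matches[which - 1]]          # S130: "the first one"
    new_text = CANDIDATES[reading][choice - 1]  # S140/S150: spoken list position
    history.append({                            # update history record
        "updated_at": datetime.now(timezone.utc).isoformat(),
        "old": target.text, "new": new_text,
        "start": target.start, "end": target.end,  # time position for S180
        "operator": operator,
    })
    target.text = new_text                      # S160: replace in place
    return target
```

The history entry keeps the start and end times of the replaced word, which is the information the later voice replacement needs in order to find the matching audio segment.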
S170: Voice segment
The standard voice data for the candidate text is looked up in the standard voice library; if the text consists of several characters or words, or a sentence, the pieces are combined into a new voice data segment. Preferably, the text data contains the gender information of the voice source, so that the search can return a female or male pronunciation, or voice data of different pitch, such as higher or lower voices.
S180: Voice replacement
According to the time position information described above, the new voice data segment replaces the corresponding voice data segment in the original voice data, forming new voice data. Preferably, because the duration of the standard pronunciation and the duration of the replaced speech are not necessarily the same even when the text is identical, a smoothing coefficient is first calculated from the durations of the two voice segments for a smooth, seamless replacement; the standard pronunciation is then sped up or slowed down according to the smoothing coefficient so that the same text has the same spoken duration before and after the replacement.
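The text does not give a formula for the smoothing coefficient. One plausible reading, used here purely as an assumption, is the ratio of the replaced segment's duration to the standard pronunciation's duration. The sketch below then stretches or compresses the standard segment by naive linear interpolation; this also shifts the pitch, so a real system would more likely use a pitch-preserving time-stretch method. The function names are hypothetical.

```python
from typing import List, Sequence


def smoothing_coefficient(original_duration: float,
                          standard_duration: float) -> float:
    """Assumed definition: how much longer the standard segment must become."""
    if original_duration <= 0 or standard_duration <= 0:
        raise ValueError("durations must be positive")
    return original_duration / standard_duration


def stretch(samples: Sequence[float], coefficient: float) -> List[float]:
    """Resample to round(len(samples) * coefficient) samples by linear interpolation."""
    n = len(samples)
    if n == 0:
        raise ValueError("cannot stretch an empty segment")
    target = max(1, round(n * coefficient))
    if n == 1 or target == 1:
        return [samples[0]] * target
    out = []
    for k in range(target):
        pos = k * (n - 1) / (target - 1)  # fractional position in the input
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

For example, if the replaced speech lasted 1.0 s and the standard pronunciation lasts 0.5 s, the coefficient is 2.0 and the standard segment is stretched to twice its sample count, so the timestamps after the replacement stay valid.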
A user logs in to the recording and playback system from a terminal over the Internet to review or play on demand a recorded class. For some users, such as participants in online meetings, the recorded class may instead be the record file of an online meeting. The recording and playback system sends the ID of the teaching file that the user asks to review or play to the teaching server over an encrypted socket channel. The teaching file ID is used to retrieve the course's timestamped action data stream, voice data stream, electronic document (such as PPT file) operation data stream, video data stream, and text data, which are sent to the user terminal that requested that teaching file ID. The user terminal then restores (replays or simulates) the entire classroom teaching process locally according to the timestamps. The data streams can be displayed in separate functional areas of the user terminal or shown one at a time by switching. Video can generally be replayed directly on the user terminal, electronic whiteboard operations can be reproduced through a whiteboard simulation program, and PPT file operations can be replayed on the local terminal.
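Timestamp-based restoration on the terminal can be pictured as merging several separately stored, time-ordered streams into one playback order. The sketch below assumes each stream is a list of (timestamp, payload) pairs already sorted by timestamp; the stream names and the function `merge_streams` are illustrative assumptions. The optional `selected` argument restricts playback to a subset of the streams.

```python
import heapq
from typing import Dict, Iterator, List, Optional, Tuple

Event = Tuple[float, str, str]  # (timestamp, stream name, payload)


def merge_streams(streams: Dict[str, List[Tuple[float, str]]],
                  selected: Optional[List[str]] = None) -> Iterator[Event]:
    """Yield events from the chosen streams in timestamp order.

    Each input stream must already be sorted by timestamp.
    """
    names = selected if selected is not None else list(streams)
    tagged = [[(ts, name, payload) for ts, payload in streams[name]]
              for name in names]
    # heapq.merge lazily interleaves the sorted inputs by timestamp.
    return heapq.merge(*tagged, key=lambda event: event[0])
```

A player would walk through the merged events and dispatch each one to the functional area that handles its stream, for example the subtitle area for text events.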
The user can also choose to play only some of these data streams (at least one), for example only the audio. The text data can be displayed as subtitles in a specific area of the user terminal, such as the video display area.
In one example, the text data used for subtitles (formed by voice recognition) is displayed in a specific editable area so that the user can perform operations such as selection. To report non-standard voice data or text found during playback, the user only needs to select the corresponding text and submit feedback. On receiving the feedback, the administrator of the recording and playback system verifies it and, if an error is confirmed, repeats the text data and voice data update steps described above, so that the text data and voice data are continually improved.
In the above embodiments, the terminal and the server can be connected to a communication network including the Internet, so the medium may also be one that carries the program code in a flowing manner, with the program code downloaded via the communication network. When the program code is downloaded from the communication network in this way, the program used for downloading may be stored in the main device in advance or installed from another recording medium.
In addition, the present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the above program code is embodied by electronic transmission.
The preferred embodiments of the present invention have been described above to make the spirit of the invention clearer and easier to understand, not to limit the invention. Any update, substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection outlined by the appended claims.
Industrial applicability
The method of the present invention raises the quality of classroom recordings. Based on storing each data stream separately with timestamps, the voice data is recognized and converted to text, the text data is updated, and the voice data is updated according to the updated text, so that the content of the original voice data that needs updating is updated. This overcomes problems caused by saying too little, misspeaking, or leaving things out in class, and yields both updated voice data and updated text data (subtitle information). New courseware is formed by updating the old courseware, a correspondence is established between the old and new courseware, and, combined with the timestamp information in the recording data, the courseware is replaced and updated.
With the device of the present invention, after the teaching process has been recorded, updating the voice data resolves speech problems such as misstatements, omissions, and non-standard expression, and replacing the old courseware with the new courseware allows the new courseware to be shown to users who study from the teaching recording data. Thus, even after the teaching process is finished, that is, after the teaching recording data has been formed, problems in the teaching process can still be remedied and improved through update operations.

Claims (10)

  1. A device for updating teaching recording data, comprising a voice data updating unit and a courseware data updating unit, characterized in that
    the voice data updating unit is configured to update voice data in the teaching recording data,
    the courseware data updating unit is configured to update courseware data in the teaching recording data,
    the voice data is updated by a replacement operation on text content formed by recognizing the voice data, and
    the courseware data is updated by replacing old courseware with new courseware.
  2. The device according to claim 1, characterized in that
    the voice data updating unit further comprises:
    a voice data recognition unit, configured to convert original voice data into original text data according to a speech recognition model;
    a text data updating unit, configured to proofread the original text data and replace old text content that needs updating with accurate new text content, forming updated text data;
    a voice data replacement unit, configured to replace the voice data segment of the old text content in the original voice data with standard voice data of the new text content, forming updated voice data and thereby updating the voice data of the teaching recording data.
  3. The device according to claim 2, characterized in that
    the courseware data updating unit further comprises:
    a courseware content updating unit, configured to update content on the basis of the old courseware, replacing old content with new content to form new courseware, and to record correspondence data between the old courseware and the new courseware, including the association between pages before and after the update and the content correspondence of each page before and after the update;
    a courseware data updating unit, configured to obtain the timestamp identifiers of the courseware data in the teaching recording data, process the new courseware according to the correspondence data, and add the timestamp identifiers to the new courseware, forming new courseware data;
    a courseware replacement updating unit, configured to replace the old courseware data in the teaching recording data with the new courseware data, thereby updating the courseware data of the teaching recording data.
  4. The device according to claim 3, characterized in that
    the courseware is preferably a PPT file, the standard voice data is obtained by searching a standard voice database, and the voice data and the courseware data are stored separately; preferably, the voice data is stored in the format "voice data stream + timestamp" and the courseware data is stored in the format "courseware operation data stream + timestamp".
  5. The device according to claim 4, characterized in that
    the text data updating unit is further configured, after the courseware data update is complete, to determine the display time of each page of the PPT file from the timestamp identifiers, to search the text data within that display time for old content according to the content correspondence of each page before and after the update, and to replace the old content with new content, forming updated text data and thereby further updating the text data, so that the updated voice data matches the updated courseware data.
  6. The device according to claim 5, characterized in that
    in the further updating of the text data, before the replacement is completed, the correspondence between the old content and the new content in the text data is shown to an administrator, who confirms from the context whether to perform the replacement.
  7. The device according to claim 6, characterized in that
    the updated text data is displayed on the screen as subtitles according to the timestamp identifiers, preferably in the screen area where the video data is played; more preferably, the text data is displayed in a specific area of the screen in an editable manner, for example a selectable manner.
  8. The device according to claim 7, characterized in that
    the device further comprises an error information feedback unit,
    by means of which a user playing the teaching recording data on a terminal can select erroneous text content found in the updated text data and submit feedback; after the feedback has been confirmed by an administrator, the updated text data is updated again, and the updated voice data is updated again by the voice data replacement unit.
  9. The device according to claim 8, characterized in that
    an update history is created when the text data and the voice data are updated, and the update history may include the update time, the updated content, the operator who performed the update, the person who found the problem, and so on.
  10. The device according to claim 4, characterized in that
    the voice data replacement unit is configured to calculate a smoothing coefficient from the duration of the replaced old text content in the original voice data and the duration of the standard voice data of the new text content, and to adjust the duration of the new text content according to the smoothing coefficient, so that the voice data before and after the replacement is smooth and synchronized.
PCT/CN2017/105553 2017-07-28 2017-10-10 Teaching recording data updating device WO2019019406A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710627684.1A CN109324811B (en) 2017-07-28 2017-07-28 Device for updating teaching recorded broadcast data
CN201710627684.1 2017-07-28

Publications (1)

Publication Number Publication Date
WO2019019406A1 true WO2019019406A1 (en) 2019-01-31

Family

ID=65039910

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/105553 WO2019019406A1 (en) 2017-07-28 2017-10-10 Teaching recording data updating device

Country Status (2)

Country Link
CN (1) CN109324811B (en)
WO (1) WO2019019406A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109858005B (en) * 2019-03-07 2024-01-12 百度在线网络技术(北京)有限公司 Method, device, equipment and storage medium for updating document based on voice recognition
CN110148418B (en) * 2019-06-14 2024-05-03 安徽咪鼠科技有限公司 Scene record analysis system, method and device
CN111312219B (en) * 2020-01-16 2023-11-28 上海携程国际旅行社有限公司 Telephone recording labeling method, system, storage medium and electronic equipment
CN115376372B (en) * 2022-08-26 2023-07-25 广东粤鹏科技有限公司 Multimedia teaching method and teaching system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060147891A1 (en) * 2004-12-16 2006-07-06 Ricardo Dreyfous Education management system including lesson plan file compilation
CN202771656U (en) * 2012-09-28 2013-03-06 冯贞 Teaching device capable of being updated automatically
CN103794098A (en) * 2014-01-01 2014-05-14 广州东软科技有限公司 Intelligent teaching management system
CN203689730U (en) * 2014-01-01 2014-07-02 广州东软科技有限公司 Intelligent teaching management system
CN104463483A (en) * 2014-12-17 2015-03-25 天脉聚源(北京)教育科技有限公司 Teaching terminal management platform for intelligent teaching system
WO2015112456A1 (en) * 2014-01-22 2015-07-30 AlvaEDU, Inc. On-line education system with synchronous lectures
CN106971749A (en) * 2017-03-30 2017-07-21 联想(北京)有限公司 Audio-frequency processing method and electronic equipment
CN107220228A (en) * 2017-06-13 2017-09-29 深圳市鹰硕技术有限公司 One kind teaching recorded broadcast data correction device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111914563A (en) * 2019-04-23 2020-11-10 广东小天才科技有限公司 Intention recognition method and device combined with voice
CN113506482A (en) * 2021-06-15 2021-10-15 浙江传媒学院 Remote teaching intelligent blackboard writing system
CN113554904A (en) * 2021-07-12 2021-10-26 江苏欧帝电子科技有限公司 Intelligent processing method and system for multi-mode collaborative education
CN117596433A (en) * 2024-01-19 2024-02-23 自然语义(青岛)科技有限公司 International Chinese teaching audiovisual courseware editing system based on time axis fine adjustment
CN117596433B (en) * 2024-01-19 2024-04-05 自然语义(青岛)科技有限公司 International Chinese teaching audiovisual courseware editing system based on time axis fine adjustment

Also Published As

Publication number Publication date
CN109324811A (en) 2019-02-12
CN109324811B (en) 2021-10-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17919215

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17919215

Country of ref document: EP

Kind code of ref document: A1