JP2020003624A - System for optimizing and virtualizing video and voice for virtual reality lesson room for remote instruction - Google Patents
- Publication number
- JP2020003624A
- Authority
- JP
- Japan
- Prior art keywords
- learner
- instructor
- delay
- ict
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
Description
In ICT distance education, the present invention allows individual video and audio recordings of teachers and learners to be saved, played back, and reviewed individually. It also virtually reproduces the three-dimensional positional relationship between the instructor and the learners in a classroom, so that both are drawn into the sense of presence of a real classroom. Furthermore, by visualizing the unavoidable communication delay, it makes it possible to distinguish whether a delay comes from the person's own slow response or from the network.
Video and audio for ICT education are delivered to remote locations over Internet lines, and are widely used because they are considered especially beneficial for language teaching to learners overseas or in regions short of instructors. However, the learners' audio arriving over the Internet is a mixture of many voices, and it has not been possible to listen selectively to check whether a particular speaker's pronunciation was correct. In addition, because a voice does not arrive from the position where the learner is seated, as it would in an ordinary classroom, the instructor cannot use the learner's position or direction to identify the speaker, and has only voice quality and speech habits to go on. In practice, instructors misidentify the speaker or hesitate over who spoke, so practical instruction tends to be incomplete. Moreover, technical limitations of Internet lines, ICT equipment, and camera systems delay the video and audio, which makes it difficult to judge whether a learner is responding to the instructor's pronunciation without hesitation or with doubt and anxiety. These factors have significantly limited educational effectiveness.
Three things were therefore needed: the ability to listen selectively to a specific individual's speech within the mixed audio of many speakers; a virtual-reality classroom, absent from existing systems, realized through three-dimensional sound; and a way of dealing with delay that does not rely solely on improvements to the Internet environment and ICT equipment. Although Internet line speeds and ICT equipment performance continue to improve, it is difficult to provide every learner with the highest-performance equipment.
In the lesson format using ICT equipment that is currently widespread, when many participants (learners and the instructor) speak at once, their audio is mixed together, and a specific individual's voice cannot be listened to selectively, even after the fact. In addition, when several learners speak at the same time, the instructor cannot instantly identify the speaker from the direction of the learner and can rely only on cues such as voice quality and habits of answering. The instructor therefore cannot respond as accurately and quickly as in an ordinary classroom, which hinders instruction.
Furthermore, in this widespread lesson format, a single instructor is filmed with ICT equipment and the audio is distributed as a monaural or stereo source, while the learners typically respond using the video camera and monaural microphone of a smartphone, tablet, or computer that a student can afford. The connection is an ordinary Internet line, and overseas there are many low-quality lines. The learners' ICT equipment consists of ordinary commercial products. Both video and audio incur delays ranging from 0.2 seconds to about 1 second or more, which hinders communication between the instructor and the learners.
To address this, we devised a method, shown in FIG. 1, in which the video and audio of each learner during a lesson are captured separately, recorded, and at the same time processed individually and reproduced appropriately at the receiving end. This makes it possible to identify each of up to about 30 learners and to check each learner's response to the instructor individually. Next, as shown in FIG. 2, we invented a function that produces a three-dimensional acoustic effect, as if the utterances of the instructor and the learners were placed at positions within a virtual classroom. This acoustic effect lets the instructor picture where each learner's utterance comes from, bringing the experience closer to lecturing in a real classroom. Finally, we invented a technique, shown in FIG. 3, that visualizes the relative delay of video and audio during a lesson. By visualizing the expected time for the instructor's utterance to reach each learner and for the learner's spoken response to return to the instructor, the instructor can tell whether a learner is responding appropriately.
The problems this system aims to solve are that, in the video conferencing used for ICT education, the utterances of the participating instructor and learners are mixed into a single audio stream and cannot be separated; that the mutual positional relationships of a classroom are lost; and that communication is delayed. The delay itself can never be reduced to zero, however: delays of several tens of milliseconds to about 1 second still occur.
The present invention captures the video and audio of the instructor and of each learner separately and records each of them, applies processing such as delay and reverberation to the sound so that the positional relationship between the instructor and the learners approximates that of a classroom, and visualizes the delay of the audio reaching the instructor and the learners. Together, these measures resolve the problems that have been inherent in ICT distance education and bring communication between the instructor and the learners closer.
The virtual-reality classroom for ICT distance education of the present invention captures video and audio per learner, produces a virtual-reality three-dimensional sound effect that simulates lecturing in a classroom, and analyzes and visualizes the delay of audio and video over the Internet. As a result, the timing at which a learner's utterance should be evaluated becomes clear, communication with learners becomes closer, video and audio can be replayed as needed, and the instructor can make sound judgments.
The optimization and virtualization of video and audio in the ICT education system are implemented with various programming techniques in a virtual environment on the Internet.
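FIG. 1 shows an embodiment of the present invention: 1001 is the instructor-side computer, 1002 is a web camera with a built-in microphone, 1003 is the function that records video and audio per learner in the Internet virtual environment, and 1004 is a learner-side smartphone, tablet PC, or personal computer with a built-in camera and microphone.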
The video and audio recording function (1003) in the Internet virtual environment allows an administrator on the teaching side to select particular learners and record their video and audio at will.
Video and audio can be recorded separately for the instructor and for each learner, and all participants in a lesson can also be recorded at the same time. Playback likewise supports a single recording or several recordings played simultaneously.
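The per-participant recording and playback described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `PerParticipantRecorder`, its method names, and the representation of audio as plain lists of numeric samples are hypothetical and do not come from the patent, which does not specify a data format.

```python
from collections import defaultdict


class PerParticipantRecorder:
    """Hypothetical sketch: keep one audio track per participant."""

    def __init__(self):
        # participant id -> list of samples recorded so far
        self._tracks = defaultdict(list)

    def record(self, participant_id, samples):
        """Append samples to the track of a single participant."""
        self._tracks[participant_id].extend(samples)

    def play(self, participant_id):
        """Return one participant's track on its own."""
        return list(self._tracks[participant_id])

    def play_mixed(self, participant_ids):
        """Return several tracks summed sample by sample (simultaneous playback)."""
        tracks = [self._tracks[p] for p in participant_ids]
        length = max((len(t) for t in tracks), default=0)
        return [sum(t[i] for t in tracks if i < len(t)) for i in range(length)]


recorder = PerParticipantRecorder()
recorder.record("instructor", [1, 2])
recorder.record("learner-1", [10])
solo = recorder.play("instructor")
mixed = recorder.play_mixed(["instructor", "learner-1"])
```

Because the tracks stay separate until playback, a single voice can still be isolated after the lesson, which a pre-mixed conference stream does not allow.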
FIG. 2 shows the invention reproducing, as a virtual classroom, the relationship between the instructor and the learners scattered across remote regions (2001).
In the virtual classroom (2003), learners are assigned to seats in the order in which they log in from their remote locations.
The instructor can check visually on their own PC (2004) which seat each learner has been assigned.
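Seat assignment in login order could look something like the sketch below. The row-by-row grid, the six-column layout, and the names `Seat` and `assign_seats` are assumptions made for illustration; the patent states only that seats are filled in login order and that a class of up to about 30 is envisaged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Seat:
    row: int
    col: int


def assign_seats(login_order, cols=6, max_seats=30):
    """Map learner ids to seats, filling the grid row by row in login order.

    cols and max_seats are assumed values, not taken from the patent.
    """
    if len(login_order) > max_seats:
        raise ValueError("classroom is full")
    return {
        learner: Seat(row=i // cols, col=i % cols)
        for i, learner in enumerate(login_order)
    }


seats = assign_seats(["a", "b", "c", "d", "e", "f", "g"])
```

With six columns, the seventh learner to log in starts the second row.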
Next, an acoustic effect is produced so that each learner (2003) placed at a seat sounds as if speaking from the position of that seat.
This is achieved by controlling the delay of the sound reaching the instructor separately for the left and right speakers on the instructor side (2004). In this way, a virtual classroom is realized in which learners are placed at designated seats, assuming a class of up to about 30.
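One way to picture the left and right delay control is the following sketch, which derives a per-channel delay from the seat position. The geometry is an assumption: the instructor is placed at the origin facing the classroom, seat coordinates are in meters, the two ears are 0.2 m apart, and only the delay difference between channels is applied. The patent does not give a formula, and the names `ear_delays` and `spatialize` are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def ear_delays(seat_x, seat_y, ear_spacing=0.2):
    """Return (left_s, right_s): travel time from the seat to each ear.

    Assumed geometry: listener at the origin facing +y, left ear at
    (-ear_spacing/2, 0), right ear at (+ear_spacing/2, 0).
    """
    half = ear_spacing / 2
    d_left = math.hypot(seat_x + half, seat_y)
    d_right = math.hypot(seat_x - half, seat_y)
    return d_left / SPEED_OF_SOUND, d_right / SPEED_OF_SOUND


def spatialize(mono, seat_x, seat_y, sample_rate=48000):
    """Turn mono samples into (left, right) by delaying the farther channel."""
    t_left, t_right = ear_delays(seat_x, seat_y)
    base = min(t_left, t_right)
    n_left = round((t_left - base) * sample_rate)
    n_right = round((t_right - base) * sample_rate)
    length = len(mono) + max(n_left, n_right)
    left = [0.0] * n_left + list(mono)
    right = [0.0] * n_right + list(mono)
    # Pad both channels to the same length.
    left += [0.0] * (length - len(left))
    right += [0.0] * (length - len(right))
    return left, right


# A learner seated to the instructor's right: the left channel lags.
left, right = spatialize([1.0, 0.5], seat_x=2.0, seat_y=3.0)
# A learner seated straight ahead: both channels are identical.
center_left, center_right = spatialize([1.0, 0.5], seat_x=0.0, seat_y=3.0)
```

A fuller treatment would also vary loudness and add reverberation; this sketch covers only the delay.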
FIG. 3 is an image of the screen on the side of the instructor giving the ICT lesson. It visualizes the time taken for the instructor's voice to reach a learner, for the learner to respond, and for the learner's utterance to reach the instructor.
The program analyzes the current delay on the Internet and, at the same time, the delay of the instructor-side ICT equipment and the delay on the learner side, and displays the delay to the learner and the delay from the learner as a graph on the instructor's screen. The instructor can bring up this display whenever needed and use it as a baseline for judging learners' responses.
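As a rough illustration of such a display, the sketch below renders the two delays as a text bar chart. The patent does not describe the form of the graph, so the layout, the 50 ms scale, and the function names `delay_bar` and `render_delays` are all assumptions.

```python
def delay_bar(label, delay_ms, ms_per_mark=50):
    """Render one delay as a labeled bar, one '#' per ms_per_mark milliseconds."""
    marks = "#" * max(1, round(delay_ms / ms_per_mark))
    return f"{label:<14}{marks} {delay_ms} ms"


def render_delays(to_learner_ms, from_learner_ms):
    """Render the delay to the learner and the delay from the learner as two bars."""
    return "\n".join([
        delay_bar("to learner", to_learner_ms),
        delay_bar("from learner", from_learner_ms),
    ])


chart = render_delays(200, 450)
print(chart)
```

Showing the two directions side by side lets the instructor subtract the network share from an observed pause before judging a learner's hesitation.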
In FIG. 3, the instructor-side PC (3002) first emits a sound wave and records the transmission time to microsecond precision. The student-side terminal (3003) receives the sound wave and reflects it (3004), and the time at which the reflected sound wave arrives back at the instructor side is likewise measured to the microsecond (3005). At 3006, the time difference between the two points, the instructor side and the learner side, is determined to microsecond precision. The instructor's PC (3002) visualizes this difference (3007) on screen so that the instructor can get a feel for the perceived delay.
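The measurement steps above amount to timing a round trip. A minimal sketch, assuming the learner terminal can be modeled as a callable that returns once the echo has arrived: the function names are hypothetical, and `fake_learner_terminal` merely simulates the network and the terminal with a sleep.

```python
import time


def measure_round_trip_us(send_and_wait_for_echo):
    """Return the microseconds elapsed between sending and receiving the echo."""
    sent_ns = time.perf_counter_ns()      # transmission time (step 3002)
    send_and_wait_for_echo()              # learner terminal reflects (3003, 3004)
    received_ns = time.perf_counter_ns()  # arrival time of the echo (3005)
    return (received_ns - sent_ns) // 1000  # difference in microseconds (3006)


def fake_learner_terminal(delay_s):
    """Stand-in for a real terminal: simulates the round trip by sleeping."""
    def echo():
        time.sleep(delay_s)
    return echo


rtt_us = measure_round_trip_us(fake_learner_terminal(0.02))
print(f"round trip: {rtt_us / 1000:.1f} ms")
```

Both timestamps are taken on the instructor side, so no clock synchronization with the learner terminal is needed. The price is that the result is a round-trip figure and does not by itself separate the outbound delay from the return delay.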
The virtual classroom in particular is expected to spread widely beyond ICT learning: in the web conferences that companies and organizations hold frequently, meetings run more smoothly when participants can picture where each utterance comes from.
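1001 Instructor-side computer
1002 Web camera
1003 Individual audio and video recording in the Internet environment
1004 Learner-side terminal
2001 Terminals of learners in various locations
2002 System on the Internet
2003 Learners in the virtualized classroom and their placement by audio delay
2004 Instructor-side computer
3001 Program start
3002 Instructor-side computer
3003 Learner terminal
3004 Sound-wave reflection by the learner terminal
3005 Measurement of sound-wave arrival time
3006 Comparison of sound-wave transmission and reception times
3007 Visualization of sound-wave delay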
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018122553A JP2020003624A (en) | 2018-06-27 | 2018-06-27 | System for optimizing and virtualizing video and voice for virtual reality lesson room for remote instruction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018122553A JP2020003624A (en) | 2018-06-27 | 2018-06-27 | System for optimizing and virtualizing video and voice for virtual reality lesson room for remote instruction |
Publications (1)
Publication Number | Publication Date |
---|---|
JP2020003624A (en) | 2020-01-09 |
Family
ID=69099856
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2018122553A Pending JP2020003624A (en) | 2018-06-27 | 2018-06-27 | System for optimizing and virtualizing video and voice for virtual reality lesson room for remote instruction |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP2020003624A (en) |
- 2018-06-27: JP application JP2018122553A filed (publication JP2020003624A), status Pending
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111816012A (en) * | 2020-07-31 | 2020-10-23 | 陕西谦和顺泰智能科技有限公司 | Remote education system based on virtual reality technology |
CN112015783A (en) * | 2020-08-30 | 2020-12-01 | 上海松鼠课堂人工智能科技有限公司 | Interactive learning process generation method and system |
CN112015783B (en) * | 2020-08-30 | 2021-07-16 | 上海松鼠课堂人工智能科技有限公司 | Interactive learning process generation method and system |
CN114745653A (en) * | 2022-02-21 | 2022-07-12 | 上海卓越睿新数码科技股份有限公司 | Method for realizing panoramic real-world teaching based on multi-channel surround sound effect |