LU91549A2 - Online speech repository and client tool therefor - Google Patents

Publication number
LU91549A2
Authority
LU
Luxembourg
Prior art keywords
session
interpretation
user
audio
recording
Prior art date
Application number
LU91549A
Inventor
Marcel Meyer
Hannes Soukoup
Paolo Tosoratti
Original Assignee
European Community
Priority date
Filing date
Publication date
Application filed by European Community filed Critical European Community
Priority to LU91549A priority Critical patent/LU91549A2/en
Publication of LU91549A2 publication Critical patent/LU91549A2/en

Classifications

    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 5/00: Electrically-operated educational appliances


Description

ONLINE SPEECH REPOSITORY AND CLIENT TOOL THEREFOR

Technical field [0001] The present invention generally relates to an online electronic learning system for students of conference interpretation.
Background Art [0002] Electronic learning systems are well known. Electronic learning tools for students of conference interpretation have been developed in some universities and companies to let students train themselves in mock interpretations. A pioneer application, “Iris”, was based on a local area network, but its development has meanwhile been discontinued. “BlackBox” was a more advanced tool but usable only in a laboratory.
Technical problem [0003] The global objective of the present invention is to provide students of conference interpretation with high quality video and/or audio recordings from real life situations (e.g. meetings at EU level). Students should have the widest possible availability of content with a cheap and easy-to-use IT-tool to watch the content and possibly train themselves in the interpretation.
General Description of the Invention [0004] In order to overcome the above-mentioned problem, the present invention proposes a database with training material and an IT-tool (hereinafter referred to as “mock interpretation tool” or SCICREC™), which is a client software to access the database via Internet. The mock interpretation tool permits students of interpretation to watch video material and listen to audio material stored in the database (“Speech Repository”) in a standard but specific format and to record a mock simultaneous and consecutive interpretation with the possibility of self-evaluation and of being evaluated by a distant tutor.
[0005] The fundamental winning idea behind the mock interpretation tool is its simplicity of use:
o The mock interpretation tool plays the audio of the original speech (a video clip) in both the left and right channels in mono.
o While the student speaks his/her interpretation, the mock interpretation tool records the interpretation as an audio-only track in central memory, using time stamps permitting synchronization with the file of the original speech.
o The mock interpretation tool then saves the interpretation audio track on the local disk. If the user chooses this option, the tool can upload it to an interpretation store of the Speech Repository and make it available to other users/teachers.
o During playback of the simultaneous interpretation, the mock interpretation tool plays the original speech and the interpretation as follows: the original audio is muted on one of the audio channels and the synchronized audio of the interpretation is delivered in its place. This enables a student and/or a tutor to listen to the complete performance without the need for special equipment. In the simplest set-up, a headset delivers the original in the left ear and the interpretation in the right ear; this set-up is usable even while travelling with a laptop and an earphone. As an alternative, an external loudspeaker plays the original while the interpretation is delivered in the headset; this set-up requires a loudspeaker with external jacks, easily purchased on the market, and reproduces a condition closer to real life (original in the meeting room and interpretation from earphones).
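The dual-channel playback described above can be illustrated with a minimal sketch (not the patented implementation; real audio I/O would go through a media framework, and all names here are illustrative):

```python
def mix_for_review(original_mono, interpretation_mono):
    """Interleave two mono sample streams into stereo frames:
    left channel = original speech, right channel = interpretation.
    The original is muted on the right channel, where the synchronized
    interpretation is delivered in its place; the shorter track is
    padded with silence (0)."""
    n = max(len(original_mono), len(interpretation_mono))
    stereo = []
    for i in range(n):
        left = original_mono[i] if i < len(original_mono) else 0
        right = interpretation_mono[i] if i < len(interpretation_mono) else 0
        stereo.append((left, right))
    return stereo
```

With a headset, each frame then reaches the listener as original-in-the-left-ear, interpretation-in-the-right-ear, exactly as the set-up above describes.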
[0006] Those skilled in the art will appreciate the simplicity and ease of use of the present mock interpretation tool, which has impressed many people because of the high added value obtained for only a small investment. The mock interpretation tool provides a light, easy-to-use and freely usable video-playing and recording tool to any student, without the need for a specially equipped language laboratory.
[0007] Preferably, the tool is configured so that it may be installed by virtually anybody, without any need for special technical knowledge or "administrator" user rights.
[0008] Preferably, the tool is configured to run on a very basic personal computer. The accessories necessary to listen and record (loudspeaker and/or earphones, microphone) are present on multimedia-compatible computers or, if they need to be purchased, are generally inexpensive. The mock interpretation tool exploits the fact that common video files in electronic format usually have the following basic structure: one video track and two audio tracks for stereo.
[0009] The availability of the mock interpretation tool warrants that anybody can correctly exploit the speeches made available in the Speech Repository system. Because of the different formats and standards used for encoding audiovisual material, the tools currently in use by specific universities may be incompatible with the format of the common repository. The mock interpretation tool - though very simple - gives everybody the possibility to play the clips without having to download any commercial product.
[0010] As students have to listen to an interpretation and to the original at the same time, the arrangement requiring the simplest possible set-up is to feed the original audio of a video clip into one ear and the interpretation into the other ear. Because Western scripts are written from left to right, the most preferred set-up comprises feeding the original into the left ear and the interpretation into the right one. Preferably, the mock interpretation tool comprises a volume balance to individually control the volumes of the original and the interpretation.
[0011] As students have to record the audio in order to listen with the method described above, the mock interpretation tool preferably implements this by encoding an audio-only track in a file separate from the original audio/video file, and then uses a special player capable of feeding audio into the two channels (left and right) from two different files. This avoids the need to re-encode the video clip to substitute one of the two channels and permits the encoding of audio-only files. As will be appreciated, audio-only files can be encoded with simpler software (compared to audio+video files) and use fewer computer resources. This allows the mock interpretation tool to be used even on old and inexpensive personal computers.
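The time-stamp synchronization between the separate audio-only file and the original clip might, in outline, work as follows (a hypothetical sketch; the sample-level details and function names are assumptions, not the actual player logic):

```python
def interpretation_sample_at(clip_time_s, rec_start_s, samples, rate_hz):
    """Return the interpretation sample that should play at a given clip
    time, given the time stamp (rec_start_s) at which recording started
    relative to the original clip. Before the recording started, or
    after it ended, silence (0) is returned, so the two files stay
    synchronized without re-encoding the video."""
    offset_s = clip_time_s - rec_start_s
    if offset_s < 0:
        return 0
    idx = int(offset_s * rate_hz)
    return samples[idx] if idx < len(samples) else 0
```

A player built this way only needs the original audio/video file, the audio-only recording, and one stored time stamp to replay both in lockstep.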
[0012] Preferably, the mock interpretation tool is capable of uploading a student's interpretation audio tracks to the server of the Speech Repository. Most preferably, the mock interpretation tool uses audio-only files for exchanging interpretation audio tracks, e.g. between students and teachers. This improves usability because uploads from a PC to a server are quicker and perfectly compatible with the philosophy of ADSL (download speed is usually about five times higher than upload speed).
[0013] Preferably, the mock interpretation tool is also configured to download interpretation files from the Speech Repository. This feature is particularly useful for a teacher assessing students' work, or for students who want access to their interpretation files from different computers.
[0014] Preferably, the mock interpretation tool allows the training of consecutive interpretation in addition to the training of simultaneous interpretation.
[0015] On the server side, a content management system (CMS) is used to store the video and audio files (originals and/or uploaded interpretation files). Preferably, the CMS uses an association mechanism, which associates several uploaded audio files (interpretations from students) to the same original video or audio file. This addresses the typical teaching situation, where several students work on the same exercise. Preferably, the content management system allows the association of an unlimited number of audio-only interpretation files to each original video or audio file. Advantageously, the content management system comprises an access management to grant and/or deny user access to uploaded interpretations based on a user profile and/or restriction data associated with the interpretations.
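The association mechanism and access management described above can be illustrated as follows (class and field names are assumptions for illustration, not the actual CMS schema):

```python
class InterpretationStore:
    """Associates any number of uploaded audio-only interpretation files
    with the same original video or audio file, and grants or denies
    access based on a user profile and restriction data."""

    def __init__(self):
        self._by_original = {}  # original clip ID -> list of uploads

    def upload(self, original_id, owner, audio_file, restricted_to=None):
        """Attach an interpretation to its original clip.
        restricted_to: optional set of usernames allowed to access it
        (None means unrestricted)."""
        entry = {"owner": owner, "file": audio_file,
                 "restricted_to": restricted_to}
        self._by_original.setdefault(original_id, []).append(entry)

    def accessible_to(self, original_id, user):
        """All interpretation files of a clip that the user may access:
        unrestricted uploads, the user's own uploads, and uploads whose
        restriction data explicitly names the user."""
        result = []
        for e in self._by_original.get(original_id, []):
            allowed = e["restricted_to"]
            if allowed is None or e["owner"] == user or user in allowed:
                result.append(e["file"])
        return result


# Typical teaching situation: several students work on the same clip.
store = InterpretationStore()
store.upload("42", "alice", "alice.aac", restricted_to={"teacher"})
store.upload("42", "bob", "bob.aac")
```

The teacher sees both uploads for clip "42", while an unrelated user sees only the unrestricted one.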
[0016] Preferably, uploaded recordings are maintained until the user profile is erased or a purge on old files is launched.
[0017] Most preferably, video files are managed in the standard ITU H.264 (a.k.a. MPEG-4 Part 10) format and audio recordings in standard MPEG AAC (a.k.a. MPEG-4 Part 3). Podcasting of video clips is also possible, given the market availability of iPod™, iPhone™ or similar widely available devices using the standard H.264 format.
[0018] Access to the Speech Repository is provided under the address http://multilingualspeeches.tv/scic/portal/index.html with personal login and password.
Description of a Preferred Embodiment [0019] The initial idea of the Speech Repository originates from a long-standing demand of universities for access to relevant and original audiovisual material, permitting teachers to better focus on the pedagogical needs of future interpreters of European institutions and bodies, and students to train themselves.
[0020] Selected speeches, in all languages of interest for European institutions, are stored in a special-purpose system (Content Management System, CMS), called Speech Repository, accessible via a web browser, and are associated with relevant information (metadata). The speeches are in the form of video clips in ITU H.264 format, including recordings from public conferences in Commission premises, press conferences, parliamentary debates, interviews, etc. The material is collected from European institutions in Brussels and Strasbourg but also from international organizations and national institutions. Specially conceived pedagogical material, such as speeches given by professors, professional interpreters, officials of institutions or MEPs, is included. Universities and other collaborating schools and institutions can contribute with their own material.
[0021] All clips are categorized taking into account language, accent, different levels of difficulty, content domain, and more. Graders, e.g. professional interpreters, evaluate difficulty level and add information, helping students to grasp the context and understand the speech.
[0022] A front-end software application, called mock interpretation tool, enables students to watch the clips on a personal computer, perform their own mock interpretation and self-evaluation. They can subsequently upload their own-recorded interpretation to the repository for the teacher to evaluate.
Content Management System (CMS) [0023] The content management system of the Speech Repository permits content-based navigation with content-based linking, i.e. contents are identified by unique URLs. These URLs contain the content ID, which is used to retrieve the information associated with the content from the content management system and render its corresponding presentation. The CMS supports template-based rendering of contents, i.e. depending on the chosen template type (defined in the requested URL or in the content information), one content may be rendered in an arbitrary number of ways. The system supports multi-language contents, i.e. contents can be displayed in the language chosen by the website visitor; the preferred languages configured in the user's browser are used. The CMS offers the possibility to group contents within topic categories, which may be linked separately. It may also group categories within parent categories in order to improve navigation in the future. The CMS includes a web-based content-editing user interface enabling the editing and creation of contents: creating speakers, creating several profiles for the same speaker (e.g. the same person can appear as Commissioner, then as minister, then as representative of an organisation, etc.), creating speeches, etc.; each profile of the same speaker is associated with the proper speech. The CMS has an improved search function to support searching for speakers, speakers' abstracts or other content. The content management system was developed based upon the open source framework "Struts". The following components are used in a redundant configuration:
o 2 Apache HTTP web servers;
o 2 Tomcat 5 application servers with JSP servlet engine;
o 1 Firebird open source database.
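The content-based linking and template-based rendering described above can be sketched minimally as follows (the URL shape, content fields and template names are illustrative assumptions, not the actual CMS implementation):

```python
# Hypothetical content table and rendering templates: one content,
# many possible presentations, chosen by the template type in the URL.
CONTENTS = {"1234": {"title": "Press conference", "lang": "en"}}

TEMPLATES = {
    "summary": lambda c: c["title"],
    "details": lambda c: f"{c['title']} [{c['lang']}]",
}

def render(url):
    """Resolve a content-based URL of the assumed form
    '/content/<id>/<template>' into a rendered presentation:
    the content ID retrieves the information, the template type
    decides how that one content is rendered."""
    _, _, content_id, template = url.split("/")
    content = CONTENTS[content_id]
    return TEMPLATES[template](content)
```

The same content ID thus yields different presentations depending only on the template named in the requested URL.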
User Portal [0024] The web portal of the Speech Repository is accessible from the Internet using the domain name multilingualspeeches.tv, registered since August 2005, or using the domain name multilingualspeeches.eu, associated with the same IP address and registered in March 2008.
[0025] The look and feel is compliant with the inter-institutional web site europa.eu: the web pages use the “Languages and Europe” banner as this has been conceived as the future area where the portal should be integrated and become accessible to users.
[0026] Access to the system is controlled by user authentication. After logging in, the user finds him/herself on a search page with a few simple search criteria: all official EU languages and all candidate languages (currently 26), 4 different difficulty levels, 32 domains of interest (different EU policies, matching the "Europa" web site grouping), 2 speech uses (simultaneous vs. consecutive) and a numerical identifier (to immediately retrieve already known content).
[0027] A couple of web pages give information about the project, contacts and copyright: links are in the left frame, as in other "Europa" pages.
[0028] The main goal when designing the user interface was to permit searching for relevant content with the minimum possible number of clicks while keeping flexibility. A design with a series of pages on which to search and progressively refine the search was discarded because it was judged annoying and inefficient; this evaluation was already made during the design of the very first user interface. The search combines "Language", "Difficulty", "Domain" and "Use" with a logical AND: this has been considered the search method covering the widest possible number of cases. A search on the unique speech identifier, if known, is also possible.
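The conjunctive search described above can be illustrated with a short sketch (the record fields follow the search criteria named in the text; the data layout itself is an assumption):

```python
# Illustrative speech records carrying the four search criteria.
SPEECHES = [
    {"id": 1, "language": "en", "difficulty": "beginner",
     "domain": "agriculture", "use": "simultaneous"},
    {"id": 2, "language": "fr", "difficulty": "advanced",
     "domain": "energy", "use": "consecutive"},
]

def search(speeches, language=None, difficulty=None, domain=None, use=None):
    """Return speeches matching ALL non-empty criteria (logical AND);
    a criterion left empty (None) matches everything, so an empty
    search yields the whole set of clips."""
    criteria = {"language": language, "difficulty": difficulty,
                "domain": domain, "use": use}
    return [s for s in speeches
            if all(v is None or s[k] == v for k, v in criteria.items())]
```

An empty search returns every clip; each filled-in criterion narrows the result set further.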
[0029] Launching a search from the "Search" page leads to a search "Results" page listing all speeches fulfilling the search criteria entered. If no criteria are entered, the search result gives the whole set of clips available in the system. The "Results" page of the current version's prototype has been slightly modified: the direct link to the clip has been replaced by a link to a "Details" page, where all relevant metadata entered by graders becomes visible.
[0030] An audio-only 20-second trailer lets the student quickly evaluate whether pace, accent or other details fit her/his expectations. To get more information on a specific clip, the student can click on "Details": almost all metadata entered becomes visible on this page; however, the name of the grader who entered the data, and any other metadata not relevant for the final student/user, does not appear in the user interface. Thanks to the user management system, each user can save and name the results of his/her own searches. Saved search results can be used from within the mock interpretation tool.
Streaming and Downloading [0031] On the "Details" web page, the student can see all metadata entered by graders. From this page it is also possible to immediately watch a clip in streaming (without waiting for the entire file to be downloaded); if a transcript is available, it becomes visible to the right of the clip. The student can also download the whole clip for any further use (e.g. uploading it onto a portable multimedia device, such as an iPod™, an iPhone™ or the like).
Video Clips and Audio Files Formats
The format chosen for video clips stored in the Speech Repository is currently the following:
o encoding format for video: MPEG-4.10, a.k.a. ITU-T H.264;
o video size: 320x240 pixels;
o frame rate: 25 fps (frames per second), to assure full-motion video;
o video compression ratio: 320 kbps;
o encoding format for audio: AAC (Advanced Audio Coding, replacing MP3 and available on all personal computers and all new (portable) electronic devices);
o audio data rate and resolution: 64 kbps, mono, 48 kHz, 16 bit.
With such parameters, a clip takes about 2.8 MB of disk space per minute. On a 3 Mbps ADSL line, an average clip is downloaded in 30 seconds. Every clip is encoded with a watermark clearly identifying its origin (multilingualspeeches.tv). The audio recording performed during the mock interpretation is an audio-only AAC file recorded at 44.1 kHz. Video clips and audio recordings are stored locally. The metadata associated with each clip is stored locally inside XML files. Clips from broadcast-level quality sources result in surprising quality even if compressed at 384 kbps with the new H.264 standard format, maintaining an acceptable quality even in full-screen mode and allowing projection in a classroom.
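The disk-space figure above follows from simple arithmetic: 320 kbps of video plus 64 kbps of audio give a combined 384 kbps stream, i.e. roughly 2.8 MB per minute of clip:

```python
# Back-of-the-envelope check of the encoding parameters listed above.
video_kbps = 320   # video compression ratio
audio_kbps = 64    # audio data rate

# total bits per second -> bytes per second -> bytes per minute
bytes_per_minute = (video_kbps + audio_kbps) * 1000 / 8 * 60
mb_per_min = bytes_per_minute / 1_000_000   # about 2.88 MB per minute
```

This matches the stated figure of about 2.8 MB of disk space per minute of clip.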
In the future, the video size of clips may be enlarged, e.g. to take into account improvements in terms of average bandwidth available to European university students.
Selection of Speeches [0032] The selection of clips in the Speech Repository is based upon coherence with image and audio quality requirements (e.g. ISO 2603). ISO standards for interpretation are not applicable to digitally encoded material; however, careful consideration has been taken to ensure coherence with the necessity of providing future interpreters with clips having an acceptable level of:
o image resolution: the higher in the source video material, the better the results in the final clip, after editing, re-encoding and adding a watermark;
o image sharpness: broadcast-level cameras (even not HD) guarantee the best possible final result;
o light conditions: the ideal conditions are TV-studio ones; however, professional cameras with a correct white balance can compensate for scarce light and can cope with lighting conditions normally present in a big meeting/conference room;
o perfect lip synchronization: this is a "must", probably the most important quality aspect to allow interpretation from a shot and reproduced image. Transcoding from non-MPEG formats (e.g. from Windows Media™ files) leads to unacceptable results;
o good quality of sound and virtual absence of reverberation or other disturbing factors: recordings done with microphones far from the speaker's mouth and capturing the room reverberation must be discarded, unless of immense value.
Material received that is not up to the desired quality standards has been discarded, with few temporary exceptions (exotic languages or extremely valuable content).
The Data-Entry Interface [0033] To guarantee ease of use of the Speech Repository also for the graders, the data-entry interface of the Speech Repository includes only the fields strictly necessary to graders. The order and layout of the data fields have been carefully elaborated together with graders in order to be coherent with the logical order of the content to enter. The data-entry interface matches the needs of graders with high turnover and heterogeneous technical skills. The background colours used are meaningful:
o dark grey for content not displayed to users but necessary during the encoding phase and for future reference tracking;
o light grey for objective descriptive content that can be copied or copied & pasted from the description of the source video material;
o yellow for content specifically requiring interpreters' skills;
o orange for relevant non-editable information.
[0034] The data-entry interface window fits into a 1024x768-pixel display. This avoids the need for scrolling while entering data and permits a full view of all the fields to fill.
Access to the Portal and Accounting [0035] The software development during the year 2008 has led to the creation of a user management system to control access to and use of the system. DG Interpretation C1 - Multilingualism Unit grants top-level authorization to all universities or institutions requesting access to the system. A person from each university or institution is thus delegated by DG Interpretation (SCIC) for the granting and revoking of user access authorization. The system has a three-layered management:
a. top-level coordinator (contact person);
b. teachers, creating their own courses;
c. students, registering themselves (no burden for teachers) to each course and thus gaining full access to all speeches in the system until expiry of the account.
The registration page is dynamically created by the CMS upon the creation of the course. The profile so created is associated with that specific course. Students already registered in a course can re-register in any additional course (maintaining the initial username/password) to prolong the use of the system until the last expiry date of all their courses, and in order to make the uploaded interpretations automatically available to their own teachers.
[0036] Each institution is responsible for the use of the system and, in case of evident violation of the agreed rules (no further copying and redistribution of the content), DG Interpretation may revoke the access authorization. Because of the complexity of the system, top-level authorization is managed only by DG Interpretation.
[0037] The three Institutions collaborating on the project and participating in the Management Board (Steering Committee from late 2008) are granted a unique shared access for all staff: access is virtually password-free if it occurs via the Commission DG Interpretation intranet or the European Parliament one.
User Web Interface [0038] The user web interface of the Speech Repository is kept relatively simple. It provides, in particular: a) a Search Results page:
o a results counter after a search: this simple counter reproduces what any user is used to getting from any search engine (e.g. Google™);
o the possibility of re-ordering the results by clicking on a column title. The default order is by identification number (so that the most recently added content is at the top of the column), but the user can choose six more sorting criteria;
o the possibility of saving a search result and naming it: this is now possible for all "personal" accounts. From the mock interpretation tool it is thus possible to restrict the pool of clips for which the description is downloaded to only those of a saved search;
o an audio sample: the search result gives the user the opportunity to listen to 20 seconds of a speech. All audio samples (20 seconds randomly chosen within the video clip) are encoded in Flash format for a quick check by users.
b) a Details page:
Once the user has chosen a clip, he clicks on "Details", whereupon he is directed to a full details page:
o information is grouped in the most logical order and the layout permits all the content to fit on the page without scrolling;
o there is an option to display a transcript of the speech. However, the transcript does not appear by default, since most users consider that it should not be accessible before they try the interpretation; it becomes visible only after the user has clicked on a specific button;
o a download button is available for those who wish to download the video clip;
o another button opens the clip in "streaming", so that it can be watched without using a special application (the mock interpretation tool or any other): the clip can be played within QuickTime™ (for high-quality viewing) or Flash (widely available and able to pass through most firewalls).
More H.264-capable players may be supported in the future (e.g. VLC, a FOSS alternative player commonly available on Linux PCs).
[0039] A special page is available where video clips related to the job of interpreter are collected. The clips are documentaries, interviews and pedagogical material publicly available on YouTube™, but here collected together. This page is available to everybody, without any need to log into the system.
[0040] A FAQs page is provided to address technical questions on use, access and content itself. The FAQs are organised in sections.
[0041] Notwithstanding the simplicity and intuitiveness of the system, a concise online manual for the Speech Repository is accessible once the user has installed the mock interpretation application. The web site will also have a support/download page where the online manual can be downloaded in PDF format.
[0042] A download area is provided where the mock interpretation tool can be downloaded in different versions (for different operating systems, such as e.g. Windows™ , Linux™ and MacOS™).
[0043] An on-line manual is available.
The Mock Interpretation Tool [0044] The mock interpretation tool is configured to download speeches from the Speech Repository. The student then uses the tool as follows:
o The mock interpretation tool plays the audio of the original speech in both the left and right channels in mono while it displays the video in a window on the screen.
o While the student speaks his/her interpretation, the mock interpretation tool records the interpretation as an audio-only track in central memory, using time stamps permitting synchronization with the file of the original speech.
o The student can perform both simultaneous interpretation (speaking while the video clip plays) and consecutive interpretation (first watching the clip and taking notes, then recording him/herself).
o The mock interpretation tool then saves the interpretation audio track on the local disk. The user may also choose the option of uploading her/his interpretation to the interpretation store of the Speech Repository.
o During the playback of a simultaneous interpretation, the mock interpretation tool plays the original speech on one of the audio channels while the synchronized audio of the interpretation is delivered on the other audio channel.
o During the playback of a consecutive interpretation, the mock interpretation tool plays back just the audio of the interpretation, with no time stamp associated.
o Each audio file is unambiguously associated with a specific video clip.
[0045] The mock interpretation tool provides a light, easy-to-use and freely usable video-playing and recording tool to any student without the need of using a specially equipped language-laboratory.
[0046] The tool is configured so that it may be installed by virtually anybody, without any need for special technical knowledge or "administrator" user rights. Furthermore, it may be executed on a very basic personal computer. The availability of the mock interpretation tool warrants that anybody can correctly exploit the speeches made available in the Speech Repository system. The mock interpretation tool - though very simple - gives everybody the possibility to play the clips without having to download any commercial product.
[0047] As students have to listen to an interpretation and to the original at the same time, the arrangement requiring the simplest possible set-up is to feed the original audio of a video clip into one ear and the interpretation into the other ear. Because Western scripts are written from left to right, the most preferred set-up comprises feeding the original into the left ear and the interpretation into the right one. The mock interpretation tool comprises a volume balance to individually control the volumes of the original and the interpretation. A student and/or a tutor may listen to the original speech and its interpretation using a headset rendering the original in the left ear and the interpretation in the right ear, or a loudspeaker (for the original speech) plus a headset (for the interpretation). The latter set-up provides a condition closer to real life (original in the meeting room and interpretation from earphones). Adjusting the two volumes permits focusing the attention on what was originally said or, conversely, on the way the student delivers his/her interpretation. This is particularly useful while assessing the interpretation performance.
[0048] During recording, the mock interpretation tool encodes the interpreted speech as an audio-only track in a file separate from the original audio/video file and then uses a special player capable of feeding audio into the two channels (left and right) from two different files.
[0049] The mock interpretation tool is capable of uploading a student's interpretation audio tracks to the server of the Speech Repository. These interpretations can then later be downloaded from the Speech Repository by a teacher to assess students' work, or by students themselves using a different computer.
[0050] The mock interpretation tool comprises a selection button to select between simultaneous (recording audio while watching the video clip) and consecutive (watching the video while taking notes and then recording audio once the video clip has ended) interpretation.
[0051] With the mock interpretation tool, a personal computer and a headset, any student can:
o perform his/her own mock interpretation (performing several trials on the same clip is of course possible);
o evaluate her/himself by listening to her/his own audio recording synchronized with the original video and adjusting the audio levels;
o upload her/his audio recording to the system and have a tutor evaluate the performance from any distant place, via the Internet and using the same tool (audio recordings by students remain available in the system for a predetermined time, e.g. at least one month).
[0052] The mock interpretation tool includes user management support: in the case of a Windows™ system, all user-specific parameters (username, password, proxy parameters and personalized options) are written in the Windows™ registry for the specific user profile (so no administrator privileges are needed for installation or any modification). It further supports extended character sets, e.g. those of Eastern languages (Bulgarian, Greek, etc.).
[0053] Different releases of the mock interpretation tool are available (for Apple™, Linux or Windows™ computers). Further releases may follow.
Operation of the Mock Interpretation Tool [0054] The mock interpretation tool helps a conference interpretation student to train their skills in simultaneous or consecutive interpretation. The program enables a user to watch the recording of a speech and record an interpretation thereof. This recording can then be uploaded to a central repository so that the student can get comments from a tutor.
[0055] After being started by the user, the mock interpretation tool checks online for a new version. If the local installation is no longer up to date, the user is informed and an update is suggested. A local copy of the speeches available on multilingualspeeches.tv is also synchronized with the servers and updated. The user can choose to delay the update but will then work with outdated data. All these jobs have fallbacks, so the program can start without a network connection if the needed data was downloaded before.
[0056] The first time-consuming activity of the mock interpretation tool is the downloading of all speech information, which is held in the server database. This information is saved into a local cache in an XML format. In memory, the resulting objects share a single list object. While creating these objects and adding them to the list, the thumbnails with speaker pictures are also downloaded so that they can be presented to the user from the beginning. They too are saved in the local file cache and do not need to be reloaded every time the application is started.
[0057] All objects in the main window are independent of their content and are not recreated but re-initialised when a new speech is loaded. In detail, they are:
o a small filter;
o the speech listing;
o a video window with a video and a dedicated audio player object in the background;
o a volume control, connected to the video and audio objects;
o the control widget for the recording management;
o a text area, where details for a selected speech are displayed.
[0058] The user should always accept the suggested update as long as there are no specific reasons for not updating. The welcome screen allows the user to start the main application, change his/her local proxy settings or open a simple help screen.
Change Proxy Settings [0059] If the user's computer is located within a network using an HTTP proxy, the user must tell the mock interpretation tool about the proxy server. The proxy settings can be entered or modified by clicking on "Change proxy settings" on the welcome screen. Here the user can enter the hostname or the IP address of the server, the port to use and user identification information. The mock interpretation tool will from here on use these settings for all download routines. Working with Speeches [0060] By clicking the corresponding button in the welcome screen, the user starts the main application. The screen now displays, on the left side, all the speeches that are currently available on the server. The text field on the top is a filter to find a specific speech by its ID. The right side contains a video player, the volume controls, the managing interface for the user's recordings and, below these, a field for detailed information about the currently selected session.
Opening Speeches [0061] On start, the cursor is placed inside the filter field, so the user can simply start typing an ID and hit enter to open it. He can also search for an interesting speech within the scrollable list on the left and simply click on the speech he wants to see. If he wants to limit the shown list to a few speeches he knows about, he can enter the IDs of these speeches separated by spaces and click on the button labelled "filter". To get the full listing back, the user just has to click on "reset". To select a speech, the user has to click it in the list.
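The multi-ID filter described in [0061] amounts to splitting the filter field on whitespace. The following is a minimal sketch of that step; the function name parseFilterIds is illustrative and not taken from the iRec source.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Sketch of the ID filter in [0061]: the filter field accepts several
// speech IDs separated by spaces; splitting them yields the list of IDs
// to match against the session listing.
std::vector<std::string> parseFilterIds(const std::string& field) {
    std::istringstream in(field);
    std::vector<std::string> ids;
    std::string id;
    while (in >> id)          // operator>> skips any amount of whitespace
        ids.push_back(id);
    return ids;
}
```

Clicking "reset" would simply discard this list and show the full session listing again.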
Watching Speeches [0062] Once the user has selected a speech, he will be presented with information (metadata) in the bottom part of the application window. The user may manually start the download of the video clip. It is not downloaded automatically (it comprises several megabytes of data), so the user can decide whether he wants to download it. To start the download, he has to press the download button. If the user cancels the download before it is complete, he cannot watch the speech.
[0063] If the user enters the ID of the desired speech into the filter field and presses enter, or selects it within the list widget (with or without prefiltering by the filter field), a speech object is loaded and the information from the list object in memory is propagated to all the other widgets.
o Given the speech ID, the video control searches for a copy of the speech video within the local file cache by looking for specially crafted file names. If none is found, the download button is activated and the URL of the video is prepared for the generic download class. If one is found, the play controls are activated and a video player object is prepared with the local video file.
o Not being directly linked to the speech, the volume control simply activates the video volume slider and connects it to the video player.
o If there are already local recordings or downloaded interpretations from other people associated with this speech, the recordings management control includes them in its list and offers the possibility to listen to or delete them. A new recording can also be started if the speech video is already in the file cache.
o The text area just displays some formatted data fields.
[0064] Clicking on the control buttons within the video widget will spawn an external mplayer process, which is embedded into the black rectangle using the window ID of the widget. It is controlled over the standard pipes (stdin/stdout) by its slave commands. Downloading files is delegated to a dedicated download class, which spawns a dialog for user interaction.
[0065] Once the download finishes, the AV controls change: the download button is deactivated and the control buttons are activated. The functions associated with these buttons are:
o download the video file (deactivated once the download has been carried out);
o play/pause the recorded video;
o stop the playback (and get back to the beginning);
o start a new recording (explained later);
o show the transcript, if available;
o delete the local copy of this video.
[0066] While watching the video, the user can only pause or stop the playback and adjust the audio volume. All other actions are deactivated until the video is stopped. When the user plays a normal speech (video plus original audio track only), only one volume slider is active, with which the volume of the video playback can be adjusted. The right slider is used when playing back recordings; this will be explained later. The volume can be reset to its default value by clicking on the little speaker symbol.
[0067] If the user runs out of disk space, it is a good idea to delete videos that are no longer needed. The user can always download a clip again from the server if he needs it in the future. Typically, the videos take around 10 to 20 MB of hard disk space for each locally cached speech (for speeches of 5-15 minutes at 384 kbps).
Recording a Speech [0068] The first step for a user to record his/her own interpretation is selecting and downloading a speech as described above. As soon as the system is ready, the "record" button is activated. By clicking on it, the system starts the playback of the video and simultaneously records everything from the user's default audio input device into local memory. As soon as the user has finished or the video clip ends, he/she can stop the recording by clicking on the normal stop button. The system will then ask whether the recording is to be encoded. If the user clicks on "discard", the recording is lost and the memory freed. To save the recording, the user has to click on "save", which starts the audio encoding process. The audio track is encoded into a compressed standard format (AAC at 32 kbps). The newly saved recording is then made available within the recordings control on the right side of the screen. Whenever the user loads this speech again, his/her own recordings are also loaded and shown here.
[0069] When starting a recording, a video playback object is created just as when playing back a speech. Additionally, a recording object is created. As a first step, it allocates memory dependent on the length of the video, which is saved within the speech information. Recording is done using callback methods within the audio recorder object, which are called whenever the portaudio library has new data available. As soon as the buffer is filled or the user requests an interruption, the user is asked whether the recording should be saved. When confirmed, the buffer is written in 16-bit raw mode to the disk. This raw file is fed to a faac process, also controlled through its standard pipes, and encoded to AAC, which is then stored in the local file cache. By encoding the corresponding speech ID and a unique identifier into the file name, this recording can always be reassigned to the correct speech.
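The length-dependent allocation and callback-driven capture in [0069] can be sketched as follows. The 44100 Hz mono capture format is an assumption for illustration; the text only states that the raw data is 16-bit.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the pre-allocation step in [0069]: reserve enough 16-bit mono
// samples to cover the whole video. 44100 Hz is an assumed sample rate;
// iRec's actual capture parameters are not given in the text.
constexpr std::size_t kSampleRate = 44100;

std::size_t recordingBufferBytes(std::size_t videoSeconds) {
    return videoSeconds * kSampleRate * sizeof(short);  // 16-bit raw mono
}

// A portaudio-style callback would then append incoming samples until the
// buffer is full or the user stops the recording (hypothetical helper).
void appendFrames(std::vector<short>& buffer, const short* in, std::size_t n,
                  std::size_t capacitySamples) {
    for (std::size_t i = 0; i < n && buffer.size() < capacitySamples; ++i)
        buffer.push_back(in[i]);
}
```

Once capture stops, the filled buffer would be written to disk as raw 16-bit data and handed to the external faac process for AAC encoding, as the paragraph describes.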
Managing Recordings [0070] Whenever the recording management control gets the signal to update its data according to the loaded speech, it searches for previous recordings within the local file cache by looking at specially crafted filenames and adds appropriate entries to its internal drop-down list, labeling each with the creation time of the file.
[0071] Selecting a valid recording will activate the control buttons below the drop down list. After selecting an entry, the user can choose to listen to, to delete or to upload this file.
[0072] The list is limited to 9 of the user's own recordings to avoid wasting disk space. If the user has filled all recording slots, he/she must delete one or more recordings to free a slot before being able to make a further recording or download one from the Speech Repository. To free a recording slot, the user may also upload one or more of his/her recordings to the Speech Repository.
[0073] Listening will spawn a second mplayer process in addition to the video one and instruct both processes to play their audio on a single channel each, so that the user can listen to the original speech and the recorded voice concurrently. After verifying the quality of the recording, the student can upload his/her interpretation to the central Speech Repository. By clicking on upload, an upload object, similar to the download objects used before, is created. It sends the encoded data stream by HTTP POST to the server, reads an ID sent back by the server and presents this ID to the uploader within a simple dialog box.
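iRec realises the two-channel playback of [0073] with two mplayer processes, each restricted to one channel. The equivalent mixing operation, interleaving the original speech into the left channel and the student's recording into the right one, can be sketched directly; the function name is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Equivalent of the two-channel playback in [0073]: the original speech
// feeds the left channel and the student's recording the right one. iRec
// delegates this to two mplayer processes; here the interleaving into one
// stereo buffer is done directly, padding the shorter track with silence.
std::vector<short> interleaveStereo(const std::vector<short>& left,
                                    const std::vector<short>& right) {
    std::size_t frames = std::max(left.size(), right.size());
    std::vector<short> out(frames * 2, 0);
    for (std::size_t i = 0; i < frames; ++i) {
        out[2 * i]     = i < left.size()  ? left[i]  : 0;  // left channel
        out[2 * i + 1] = i < right.size() ? right[i] : 0;  // right channel
    }
    return out;
}
```

With independent volume sliders per channel (as in [0075]), each input track would simply be scaled before interleaving.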
Uploading [0074] If the user wants to make one of his/her interpretations available to someone else using the mock interpretation tool, he/she only needs to choose the local recording and click on "upload". The mock interpretation tool will establish a connection to the Speech Repository server and place the audio file on it. Uploading takes a few seconds on a common ADSL line. Once the file is saved on the server, the mock interpretation tool shows a dialog box in which an "ID" is presented. The user can select this ID with the computer mouse, copy and paste it into an email and send the latter to his/her tutor. The tutor only needs to know this ID to download the recording from the Speech Repository.
Listening to a Recording [0075] To listen to his/her own interpretations, the user simply chooses the recording he/she wants to hear from the drop-down list and clicks on the activated "listen to this recording" button. He/she will then hear the original speech on the left audio channel and his/her own recording on the right one. Both sliders of the volume control are now activated, so the user can adjust the two channels independently of each other. Clicking on the speaker symbol resets both channels to their default values.
[0076] To listen to someone else's recording, a user downloads that recording from the Speech Repository. To download an interpretation from the server, the user must give the ID of the wanted recording to the program. Again, the download class is initialised with the URL of the audio file, composed from the ID, and downloads the wanted file to the local file cache. The file name is chosen such that the file can be reassigned to a specific speech: session and recording ID are used within the file name.
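The URL composition mentioned in [0076] can be sketched as a simple concatenation of a server base URL and the recording ID. The path layout ("recordings/<id>.aac") is an assumption for illustration; the actual server layout is not given in the text.

```cpp
#include <string>

// Sketch of the URL composition in [0076]: the download class is fed a URL
// derived from the recording ID. The "recordings/<id>.aac" layout is an
// assumed example, not the real Speech Repository path scheme.
std::string recordingUrl(const std::string& base, const std::string& id) {
    std::string url = base;
    if (!url.empty() && url.back() != '/')
        url += '/';                       // normalise the base URL
    return url + "recordings/" + id + ".aac";
}
```

The resulting URL would then be handed to the DownloadFile class described in the Source Code chapter.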
Playing this audio file triggers the same operations as listening to the user's own recordings, giving the downloaded audio file as parameter to the audio-only mplayer process.
[0077] In practice, to listen to someone else's recording, a user has to select the option "download a recording". The user will then be prompted to enter the ID of the recording to download and to start the download. After the download is finished, the recording is added to the drop-down list with the recordings and can be used just like the user's own local recordings. Uploading is deactivated, and deleting will only delete the local copy, not the one on the server.
Source Code [0078] In this chapter an overview of the architecture is given with shortened snippets of the source code and/or class diagrams of an embodiment of the mock interpretation tool (called “iRec”). References in parentheses refer to lines of source code. The source code is available under the GNU General Public License.
[0079] The mock interpretation tool iRec 1.x was developed for Microsoft™ Windows™ using C++, compiled by gcc 3.4.2, with Qt 4.2.2 in the GPL version as the GUI toolkit. For audio recording, the portaudio library with the V19 API is used. Encoding the audio streams is done by calling an external faac 1.25 process, and the whole playback is delegated to external mplayer 1.0rc1 processes, controlled through stdin and stdout pipes.
Main Window [0080] As a first object on program start, the class MainWindow is created. It inherits QMainWindow and offers a frame with tool bar, status bar and place for a main widget. In this version, only the main widget is used.
Constructor [0081] The constructor first restores the old window settings (l.17) and loads the WelcomeWidget (l.19). After loading, its signals are connected to slots within the MainWindow. As soon as everything is prepared, the widget is shown within the main area of the MainWindow. The WelcomeWidget initialises the first variables and checks online for updates. Further details can be found in the section about the WelcomeWidget.

017 readWindowSettings();
018
019 m_welcomeWidget = new WelcomeWidget(this);
020 m_applicationPath = new QDir(QDir::homePath().append(APPLICATION_PATH));
021 m_sessionListFile = new QFile(m_applicationPath->path().append("/sessionlist.xml"));
022
023 connect(m_welcomeWidget, SIGNAL(startNewSession()),
024         this, SLOT(startNewSession()));
025 connect(m_welcomeWidget, SIGNAL(downloadSessionList()),
026         this, SLOT(downloadSessionList()));
027 connect(m_welcomeWidget, SIGNAL(setSessionListUrl(const QString &)),
028         this, SLOT(setSessionListUrl(const QString &)));
029
030 setCentralWidget(m_welcomeWidget);
SessionList handling [0082] The MainWindow class is also responsible for managing the SessionList object. This array stores information for all available sessions by holding references to the different Session objects.

startNewSession [0083] On receiving the signal startNewSession, the equally named local function is called. It initialises the SessionList object (l.147) and calls the constructor of the SessionWidget, which is the main interaction object for the user. As parameter, it gets the newly initialised SessionList object (l.151).

147 if (!fillSessionLists())
148     return;
149
150 m_sessionWidget = new SessionWidget(m_sessionList);
151 setCentralWidget(m_sessionWidget);

fillSessionLists [0084] The whole logic to fetch the database contents, parse them and map them into fitting objects in memory is included in MainWindow. When fillSessionLists is called, it first looks for a local copy of the data (l.39). When there is no file, the user may decline the download of the data set (ll.41-49) and cancel any further initialisation.

039 if (!m_sessionListFile->exists())
040 {
041     if (QMessageBox::question(this, \
042         QObject::tr("iRec Updater"), \
043         QObject::tr("There is no local copy of the session listing. \
044         Do you want to download it now?"), \
045         QMessageBox::Yes | QMessageBox::No) == QMessageBox::Yes)
046     {
047         downloadSessionList();
048     }
049     return false;
050 }

[0085] The local cache of the database contents is stored in an own XML format. This file only contains a listing of the speeches available on the server, with a URL where to find the detailed information and a version number (l.3). Comparing this version number with the locally saved one of an already cached session enables iRec to recognize updated entries and download only the changes, to save bandwidth.
001 <?xml version="1.0" encoding="UTF-8" ?>
002 <sessionlist serial="20070707070701" base="http://www.multilingualspeeches.tv/scic_portal/scic/" count="2" clientversion="1.0">
003 <session version="23" id="1">1/session.xml</session>
004 <session version="42" id="2">2/session.xml</session>
005 </sessionlist>

[0086] After ensuring the presence of the local file (ll.52-59), an XML parser is initiated and fed with the XML content. It checks the file for compatibility with the program (ll.86-114) and then creates a mapping object which contains newly created Session objects (l.125). This mapping object is then converted to a list object and saved as a member variable.

085 QDomElement root = doc.documentElement();
088 if (root.tagName() != "sessionlist")
097     return false;
101 QString xmlVersion = root.attribute("clientversion", "ERROR");
102 if (xmlVersion != QString(IREC_CLIENT_VERSION))
113     return false;
115
116 m_sessionBase = new QString(root.attribute("base"));
117
118 QMap<int, Session*> map;
119
120 QDomNode node = root.firstChild();
121 while (!node.isNull())
122 {
123     if (node.toElement().tagName() == "session")
124     {
125         Session *session = new Session(node.toElement().attribute("id"),
            node.toElement().attribute("version").toInt(),
            QString(*m_sessionBase).append(node.toElement().text()));
126         map.insert(node.toElement().attribute("id").toInt(), session);
127     }
128     node = node.nextSibling();
129 }
130
131 m_sessionList = new QList<Session*>(map.values());

[0087] The last operation on the XML content is the storing of the serial number and the list count of the session list itself into the local program settings.

downloadSessionList [0088] Whenever the user agrees to download a new session list, the function downloadSessionList is called. Its sole purpose is to create a DownloadFileDialog and give it the source and the target URL of the session list.
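The incremental update described in [0085] boils down to comparing per-session version numbers between the local cache and the server's session list, and refreshing only the stale entries. A minimal Qt-free sketch of that selection, with illustrative names:

```cpp
#include <map>
#include <string>
#include <vector>

// Sketch of the incremental update in [0085]: given the versions of the
// cached sessions and the versions announced in sessionlist.xml, collect
// only the IDs whose remote version is newer, or which are not cached yet.
std::vector<std::string> sessionsToRefresh(
        const std::map<std::string, int>& local,
        const std::map<std::string, int>& remote) {
    std::vector<std::string> stale;
    for (const auto& [id, remoteVersion] : remote) {
        auto it = local.find(id);
        if (it == local.end() || it->second < remoteVersion)
            stale.push_back(id);   // missing or outdated: download again
    }
    return stale;
}
```

In iRec this comparison happens per Session object against the version attribute of the corresponding session element; only the stale session.xml files (and their dependent binaries) are downloaded again, saving bandwidth.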
WelcomeWidget [0089] The WelcomeWidget is the first UI the user sees. It enables the user to open the SessionWidget, change the network settings and read a short help file. Additionally, the WelcomeWidget checks for updates of the session list and of the program itself.
Constructor [0090] After setting up the GUI, the constructor checks the multilingualspeeches.tv server for updates (l.51). Further, it connects the emittable signals with its own signals or slots.

049 setLayout(m_centerVLayout);
050
051 checkStatus();
052
053 connect(m_startNewSession, SIGNAL(clicked()), this, SIGNAL(startNewSession()));
054 connect(m_changeSettings, SIGNAL(clicked()), this, SLOT(changeSettings()));
055 connect(m_showIntroduction, SIGNAL(clicked()), this, SLOT(showIntroduction()));

checkStatus [0091] checkStatus only creates a non-visible DownloadFile object which gets the current version information from the server and calls checkStatusDownloadDone when ready.

checkStatusDownloadDone [0092] This function is automatically called when the previously called download function returns. Its only parameter is a boolean indicating whether the download succeeded. The downloaded file, fetched from a static URL, is an XML file containing the version number of the current up-to-date iRec program (l.3) and further information about the location and status of the session list for every still supported client (l.4).

001 <?xml version='1.0' encoding='UTF-8'?>
002 <irec>
003 <client currentversion="1.0" />
004 <sessionlist serial="20070707070701" count="2" clientversion="1.0">http://www.multilingualspeeches.tv/scic_portal/scic/irec_list</sessionlist>
005 </irec>

[0093] When the download was successful, the file is parsed on the one hand for the up-to-date iRec version (l.144) and on the other hand for the details about the session listing (l.149).

118 if (!doc.setContent(m_downloadStatusfile->getTempFile(), \
        &errorStr, &errorLine, &errorColumn))
119 {
126     return;
127 }
128
129 QDomElement root = doc.documentElement();
130 if (root.tagName() != "irec")
131 {
135     return;
136 }
137
138 QDomNode node = root.firstChild();
139 bool foundList = false;
140 while (!node.isNull())
141 {
142     if (node.toElement().tagName() == "client")
143     {
144         parseClientInfo(node.toElement());
145     }
146     else if (node.toElement().tagName() == "sessionlist")
147     {
148         if (!foundList)
149             foundList = parseSessionlistInfo(node.toElement());
150     }
151     node = node.nextSibling();
152 }

parseClientInfo [0094] The version number of the running program is compared to the one submitted within the file. The user is informed about new releases and asked to update his software.

parseSessionlistInfo [0095] After downloading the new session list information, it is parsed by this function. First, it checks the passed data for compatibility with the running program (l.207). On error it just returns and awaits a new data set from its caller. If the data on the server was updated (l.213), the function calculates the differences and asks the user whether he wants to bring his local cache up to date (ll.219-230).

204 QString xmlVersion = element.attribute("clientversion", "ERROR");
207 if (xmlVersion == IREC_CLIENT_VERSION)
208 {
209     emit setSessionListUrl(element.text());
210     QSettings settings("scic", "iRec");
211     quint64 sessionlistserial = \
            settings.value("sessionlistserial", "0").toULongLong();
212     QString xmlSerial = element.attribute("serial");
213     if (sessionlistserial < xmlSerial.toULongLong())
214     {
215         int sessionlistcount = \
                settings.value("sessionlistcount", "0").toUInt();
216         QString xmlCount = element.attribute("count");
217         int diff = xmlCount.toInt() - sessionlistcount;
218         QString message("The database has been updated.");
219         if (diff > 0)
220             message.append(QString(" %1 sessions have been \
                added.").arg(diff));
221         else if (diff < 0)
222             message.append(QString(" %1 sessions have been \
                deleted.").arg(-diff));
223         message.append("\n\nDo you want to update your local copy?");
224         if (QMessageBox::question(this, \
225             QObject::tr("iRec Updater"), \
226             message, QMessageBox::Yes | QMessageBox::No) == QMessageBox::Yes)
227         {
228             emit downloadSessionList();
229             return true;
230         }
240 }
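The difference calculation in [0095] (ll.215-222) subtracts the cached session count from the server's and phrases the signed result as additions or deletions. A Qt-free sketch of the same message construction, with an illustrative function name:

```cpp
#include <string>

// Sketch of the user message built in parseSessionlistInfo ([0095]): the
// signed difference between the server's session count and the cached one
// is reported as added or deleted sessions.
std::string updateMessage(int localCount, int remoteCount) {
    int diff = remoteCount - localCount;
    std::string msg = "The database has been updated.";
    if (diff > 0)
        msg += " " + std::to_string(diff) + " sessions have been added.";
    else if (diff < 0)
        msg += " " + std::to_string(-diff) + " sessions have been deleted.";
    return msg;
}
```

In iRec, this message is then shown in a QMessageBox asking the user whether to refresh the local copy.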
Session [0096] The central object around which iRec is organised is the Session object. It contains all relevant information about a specific speech, its local cache and the already available interpretations of the video. Looking at the source code, it is the second largest part of iRec, surpassed only by AVPlayer, which is the main widget of the GUI. All Session objects together form the SessionList object, explained above.
Constructor [0097] Each Session object is initialised with a unique ID, the serial number of the session loaded and the URL of the remote data source on the server (ll.26-28). In addition to saving this information in local variables, the constructor prepares several data fields to hold the other speech data (ll.55-75). This includes a few file objects which represent binary data within the local file cache, such as thumbnails, audio files or the video of the speech itself (ll.31-53). Here you can see which file masks are used to store and retrieve the data linked to a session.

026 m_id = new QString(id);
027 m_remoteversion = remoteversion;
028 m_url_contents = new QUrl(remotefilename);
029 m_isInitialised = false;
030
031 m_applicationPath = new QDir(QDir::homePath().append(APPLICATION_PATH));
032
033 m_localContentsfilePath = new QString(m_applicationPath->path()
        .append("/").append(*m_id).append("_session.xml"));
034 m_localContentsfile = new QFile(*m_localContentsfilePath);
055 m_title = new QString();

init [0098] init is called by the UpdateSessionListDialog after the SessionList itself has been constructed and filled with freshly constructed Session objects. init is responsible for calling the functions which download, parse (l.100) and store the contents, and for searching locally stored interpretations by calling updateRecordingsList (l.104).

100 if (!parseContents())
101     if (!parseContents())
102         return false;
103
104 updateRecordingsList();

parseContents [0099] The longest Session function parses the XML file which contains all the detailed information and saves it into local variables, which can be read by other objects through currently very simple getter functions.
[00100] First it ensures the existence of the content data within the local file cache (l.115) and initiates a download if it is not available (l.117). When the parsing itself succeeds (l.125), the state of the content of the local file is compared to the state on the server (l.142). When the local copy is no longer up to date, all local files are deleted (e.g. l.148) and the function returns false. In this case init calls parseContents again (l.101), so it can retry by downloading the now missing files (l.117).
[00101] With up-to-date data in the local cache, the file is parsed and transferred into local variables (l.55 ff.).

115 if (!m_localContentsfile->exists())
116 {
117     downloadContentsfile(true /* blocking */);
118 }
125 if (!doc.setContent(m_localContentsfile, &errorStr, &errorLine, &errorColumn))
131     return false;
141 int xmlVersion = root.attribute("version", "0").toInt();
142 if (m_remoteversion > xmlVersion)
143 {
148     deleteVideofile();
149     return false;
150 }
151
152 QDomNode node = root.firstChild();
153 while (!node.isNull())
154 {
155     QDomElement element = node.toElement();
156     if (element.tagName() == "title")
157     {
158         m_title->append(element.text());
159     }
160     else if (element.tagName() == "speakername")
241     node = node.nextSibling();
242 }

updateRecordingsList [00102] In addition to the content and video files, the recordings of the user's own interpretations are also saved within the local file cache. When a Session object is initialised or any managing process assumes changes within the list of recordings, all recordings are searched and saved internally within a QList (l.264). The recordings are matched with the Session by searching for the session ID within the filename of the recording XML file (l.261).

255 QStringList filters;
256 filters << *m_id + "_recording_*.xml";
257 QStringList filelist = m_applicationPath->entryList(filters);
258
259 foreach(QString filename, filelist)
260 {
261     QString rec = filename.mid(0, filename.lastIndexOf(".")) \
            .mid(filename.lastIndexOf("_")+1);
262     Recording* pRecording = new Recording(rec);
263     if (pRecording->isInitialised())
264         m_localRecordings->append(pRecording);
265 }

file cache management [00103] The Session object itself manages the state of the binary files within the local file cache. It therefore offers functions to get their paths, to get a loaded object in some cases, to delete the files from the hard drive and also to download new copies from the server.
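The filename scheme that updateRecordingsList relies on (filter "<sessionid>_recording_*.xml", l.256) can be sketched as a pair of build/parse helpers. The exact format of the recording identifier is not specified in the text; the helper names are illustrative.

```cpp
#include <cstddef>
#include <string>

// Filename scheme implied by the filter in [00102] (l.256): each recording
// is stored as "<sessionid>_recording_<recordingid>.xml", so a recording
// can always be reassigned to its session.
std::string makeRecordingFilename(const std::string& sessionId,
                                  const std::string& recordingId) {
    return sessionId + "_recording_" + recordingId + ".xml";
}

// Recover the recording ID: the substring between the last '_' and the
// final extension dot (mirrors the mid/lastIndexOf logic around l.261).
std::string parseRecordingId(const std::string& filename) {
    std::size_t us = filename.rfind('_');
    std::size_t dot = filename.rfind('.');
    if (us == std::string::npos || dot == std::string::npos || dot <= us)
        return "";
    return filename.substr(us + 1, dot - us - 1);
}
```

The same convention lets a downloaded interpretation (see [0076]) be filed next to the user's own recordings for the matching speech.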
downloadContentsfile is explained as an example for all the other download functions; the others are trivial.

downloadContentsfile [00104] To prevent two parallel download threads conflicting with each other, each download function is secured with a binary mutex (l.273). When another thread is already downloading the needed file, it makes no sense to start another download, and the function returns immediately (l.274). When the currently active function is the only one running, it sets the mutex (l.276). Checking and setting are not atomic, leaving a very small time frame for conflicts. But since all successive downloads are initiated by human interaction, this case can safely be ignored here.
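The non-atomic check-then-set window noted in [00104] could be closed entirely with an atomic test-and-set. This is an alternative sketch, not what iRec 1.x actually does (iRec uses a plain boolean guard and accepts the tiny race); the class name is hypothetical.

```cpp
#include <atomic>

// Alternative to the boolean guard in [00104]: std::atomic_flag makes the
// test and the set a single indivisible operation, removing even the small
// race window the text describes. Sketch only; not iRec's actual code.
class DownloadGuard {
    std::atomic_flag m_busy = ATOMIC_FLAG_INIT;
public:
    // Returns true if the caller acquired the guard and may start a download.
    bool tryAcquire() {
        return !m_busy.test_and_set(std::memory_order_acquire);
    }
    void release() {
        m_busy.clear(std::memory_order_release);
    }
};
```

As the text notes, since successive downloads are triggered by human interaction, the simpler boolean guard is in practice sufficient there.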
[00105] Downloading data within iRec creates dedicated download objects, which are destroyed after each successful download. To be safe, any erroneous still-active download object is deleted first (ll.278-282). The newly created DownloadFile object gets only the source URL and the destination file as parameters (ll.284-285). Its signal is connected to a local handling function, named like the calling function with "Done" appended (downloadContentsfileDone in this case) (ll.286-287). After this setup, the download is started, blocking or not blocking according to the parameter (l.288).

273 if (m_mutex_downloadContentsfile)
274     return;
275
276 m_mutex_downloadContentsfile = true;
277
278 if (m_downloadContentsfile != NULL)
279 {
280     delete m_downloadContentsfile;
281     m_downloadContentsfile = NULL;
282 }
283
284 m_downloadContentsfile = new DownloadFile(m_url_contents,
285     m_localContentsfile);
286 connect(m_downloadContentsfile, SIGNAL(downloadDone(bool)),
287     this, SLOT(downloadContentsfileDone(bool)));
288 m_downloadContentsfile->download(blocking);

downloadContentsfileDone [00106] When the previously initiated download finishes, its connected slot - this function - is called with the outcome of the operation as a boolean parameter. First the existing download object and the mutex are reset (ll.295-297). If the requested file has been successfully downloaded and saved, a new signal is emitted, telling the system about the change in the file cache.

295 delete m_downloadContentsfile;
296 m_downloadContentsfile = NULL;
297 m_mutex_downloadContentsfile = false;
298
299 if (!error)
300     emit contentsfileUpdated();
DownloadFile [00107] Many objects in iRec need to download files from the central server or other media sources. To hide the complexity of these operations, a dedicated class for downloading is offered.
Constructor [00108] The DownloadFile object gets the source URL and the destination path as constructor arguments. This information is stored in local variables, a temporary file is created as the download destination (l.23) and the needed proxy settings (if available) are read from the program settings (ll.27-30) (the Registry on Microsoft™ Windows™).

019 m_url = url;
020 m_localFile = localFile;
021 m_httpConnection = new QHttp();
022 m_wasAborted = false;
023 m_tempFile = new QTemporaryFile(QDir::tempPath().append("/iRec_downloader"));
024
025 m_settings = new QSettings(REG_CATEGORY, REG_SUBKEY);
026
027 m_proxyHost = new QString(m_settings->value("proxyhost", "").toString());
028 m_proxyPort = m_settings->value("proxyport", "").toInt();
029 m_proxyUser = new QString(m_settings->value("proxyuser", "").toString());
030 m_proxyPassword = new QString(m_settings->value("proxypassword", "").toString());

download [00109] The download itself is initiated by calling the function download. Since the constructor already got all relevant information for the download object, the only needed parameter is a boolean value indicating whether this specific download should be handled synchronously or asynchronously, i.e. blocking the caller or not.
[00110] First the finish (ll.45-46) and progress (ll.47-48) signals are connected to local functions and signals, enabling the object to inform external objects about the progress of the download. When a GET query is encoded within the URL, this query is appended to the requested URI (ll.59-64).
[00111] After setting the proxy and host settings, the download thread is started (l.68). This is a separate thread, so the call to get returns immediately. To offer the possibility of downloading with a blocking call, a new event loop is created and started, depending on the boolean parameter the programmer gives (ll.70-75). This loop runs until it gets the quit signal when the download finishes, successfully or not (l.73).

045 connect(m_httpConnection, SIGNAL(done(bool)),
046     this, SLOT(dwnldDone(bool)));
047 connect(m_httpConnection, SIGNAL(dataReadProgress(int, int)),
048     this, SIGNAL(dataReadProgress(int, int)));
049
059 QString uri(m_url->path());
060 if (m_url->hasQuery())
061 {
062     uri.append("?")
063         .append(m_url->encodedQuery());
064 }
065
066 m_httpConnection->setProxy(*m_proxyHost, m_proxyPort, \
        *m_proxyUser, *m_proxyPassword);
067 m_httpConnection->setHost(m_url->host(), m_url->port(80));
068 m_httpConnection->get(uri, m_tempFile);
069
070 if (blocking)
071 {
072     QEventLoop loop;
073     connect(m_httpConnection, SIGNAL(done(bool)), &loop, SLOT(quit()));
074     loop.exec(QEventLoop::ExcludeUserInputEvents);
075 }

dwnldDone [00112] When the download finishes, the local slot dwnldDone is called. Its parameter is a boolean indicating whether the download was successful. The function closes the open connection (l.83), disconnects all of its signals (l.85) and handles the moving of the temporary download target into the final local file (l.109), if this was demanded at the time of the construction of the download object (l.20). Not all downloads need to be saved into a local file, since calling objects can also request a handle to the temporary file directly. Finally, the downloadDone signal is emitted, potentially reporting an error to the calling object (l.119).
083 m_httpConnection->close();
084
085 disconnect(m_httpConnection, 0, this, 0);
086
087 if (error)
093     emit downloadDone(error);
094     return;
098 if (m_localFile)
099 {
100     if (!QDir().mkpath(QFileInfo(*m_localFile).absolutePath()))
104         error = true;
106     else
107     {
108         m_localFile->remove();
109         if (!m_tempFile->rename(m_localFile->fileName()))
114             error = true;
116     }
117 }
118
119 emit downloadDone(error);
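The move of the finished temporary download into its final location (ll.98-116) follows a common pattern: create the target directory, drop any stale file, then rename. A minimal standard-C++ sketch of the same pattern, with illustrative paths and a hypothetical function name:

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Mirrors dwnldDone: mkpath (l.100), removal of the old file (l.108),
// rename of the temporary file (l.109). Returns false on error, like
// the error flag in the listing.
static bool moveIntoPlace(const fs::path &tempFile, const fs::path &localFile)
{
    std::error_code ec;
    fs::create_directories(localFile.parent_path(), ec);
    if (ec)
        return false;                    // could not create target directory
    fs::remove(localFile, ec);           // ignore "file did not exist"
    ec.clear();
    fs::rename(tempFile, localFile, ec); // atomic on the same filesystem
    return !ec;
}
```

Removing the old target first matters on platforms where rename does not overwrite an existing file.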
DownloadFileDialog [00113] While DownloadFile works in the background without being noticed by the user, DownloadFileDialog uses the services of DownloadFile while presenting the user a small dialog that informs him about the progress of the current download and offers him the possibility to cancel the process.
Constructor
[00114] When creating a DownloadFileDialog the source URL and destination file name are given to the constructor. DownloadFile is used transparently for the calling object within DownloadFileDialog (l.35). By connecting the signals of the DownloadFile object the GUI is updated.
019 m_label = new QLabel(tr("Downloading %1...").arg(url->toString()));
020 m_progressBar = new QProgressBar();
021 m_button = new QPushButton(tr("Cancel download"));
035 m_downloadFile = new DownloadFile(url, localfile);
036
037 connect(m_downloadFile, SIGNAL(downloadDone(bool)),
038         this, SLOT(dwnldDone(bool)));
039 connect(m_downloadFile, SIGNAL(dataReadProgress(int, int)),
040         this, SLOT(updateDataReadProgress(int, int)));
041 connect(m_button, SIGNAL(clicked()),
042         m_downloadFile, SLOT(abort()));
download
[00115] Calling download simply starts the download process by drawing the dialog (l.49) and instructing the DownloadFile object to start its asynchronous download (l.50). It also accepts a boolean parameter. With this parameter the modality of the dialog can be switched (l.48) and the GUI blocked.
048 setModal(modal);
049 show();
050 m_downloadFile->download(false /* blocking */);
dwnldDone
[00116] As soon as the signal about the finished download is received from the DownloadFile object, it emits the same signal (l.74) and closes the dialog (l.75). In case of errors, the error message is retrieved from DownloadFile (l.68) and displayed instead (l.69). The parameter of the emitted signal represents the error state of the download process itself (l.70 and l.74).
058 if (error)
059 {
060     m_mainLayout->removeWidget(m_progressBar);
061     delete m_progressBar;
062     m_button->disconnect();
063     m_button->setText(tr("OK"));
064     connect(m_button, SIGNAL(clicked()), this,
        SLOT(close()));
065
066 QString errorMessage;
067 errorMessage.append(QObject::tr("There was a problem \
        downloading the file:\n"));
068 errorMessage.append(*m_downloadFile->errorString());
069 m_label->setText(errorMessage);
070 emit downloadDone(true);
071 }
072 else
073 {
074     emit downloadDone(false);
075     this->done(0);
076 }
UpdateSessionListDialog [00117] As explained previously, the Session objects are organised within a list which is created at program start in MainWindow. This list is filled with newly created Session objects containing the information from the session digest: id, version number and remote data URL. Note that no detailed data is loaded yet. This list is given to the SessionWidget for further processing, which delegates the initialisation of the Session objects within this list to the UpdateSessionListDialog.
Constructor
[00118] The constructor only assembles and shows a small dialog with a progress bar and does not contain any further logic.
updateSessionlist
[00119] The main work is iterating through all registered sessions (l.49) and initialising them by calling their own init function (l.54). Sessions which fail to initialise are removed from the session list (l.57) and therefore never displayed within iRec. The user is informed about the progress and about potential problems if a session is removed.
049 for (int i = listsize-1; i >= 0; i--)
054     if (!sessionlist->at(i)->init())
057         sessionlist->removeAt(i);
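The backward iteration (from listsize-1 down to 0) is essential: removing an element while walking forward would shift the indices and skip the successor of each removed entry. A minimal standard-C++ sketch of the same pattern, with a hypothetical stand-in for the init predicate:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for Session::init(): sessions with an empty id fail.
static bool initSession(const std::string &id) { return !id.empty(); }

// Walk the list from the back so that removeAt()-style erasure never skips
// an element, mirroring UpdateSessionListDialog::updateSessionlist.
static void pruneFailingSessions(std::vector<std::string> &sessions)
{
    for (int i = static_cast<int>(sessions.size()) - 1; i >= 0; --i) {
        if (!initSession(sessions[i]))
            sessions.erase(sessions.begin() + i);   // removeAt(i) equivalent
    }
}
```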
SessionWidget [00120] The SessionWidget is the main interaction widget for the user and the most complex one. It assembles the part of iRec in which the user is working and coordinates the used sub-widgets. It is created within MainWindow and replaces the WelcomeWidget when the user clicks on Start iRec.
Constructor
[00121] The interface for the user is loaded and assembled in the constructor of the SessionWidget. There are the filter input field (l.21), the list widget which shows all the selectable sessions (l.35), the AV-player with its controls (l.44), the volume controls (l.50), the recording controls (l.57) and the field for the presentation of the detailed information of each session (l.75).
[00122] After setting up the GUI the untreated session list is given to a newly created UpdateSessionlistDialog which initialises the sessions and removes the erroneous ones (ll.91-92). The list widget is then filled with the content of this session list (l.94).
018 m_sessionlist = sessionlist;
021 m_filterEdit = new QLineEdit();
023 m_filterButton = new QPushButton("open");
031 QRegExp regex("(\\d+ ?)*");
032 QValidator *validator = new QRegExpValidator(regex, this);
033 m_filterEdit->setValidator(validator);
034
035 listWidget = new QListWidget();
044 m_avPlayer = new AVPlayer();
050 m_volumeControl = new VolumeControl();
057 m_recordingsControl = new RecordingsControl();
075 m_sessionInfoWidget = new SessionInfoWidget();
091 UpdateSessionlistDialog *updateDialog = new UpdateSessionlistDialog(this);
092 updateDialog->updateSessionlist(m_sessionlist);
tbc
[00123] Besides the loading of all the other widgets, the coordination between them is set up in the SessionWidget. It connects the different signals and slots of the widgets with each other to establish independent flows of information. Each widget needs to know which session is currently active and is informed about it by the calling of a slot, which is connected to the local signal sessionSelected (ll.108-113).
094 fillListWidget();
095
096 connect(m_filterEdit, SIGNAL(textChanged(QString)),
097         this, SLOT(filterChanged(QString)));
108 connect(this, SIGNAL(sessionSelected(Session*)),
109         m_avPlayer, SLOT(loadSession(Session*)));
110 connect(this, SIGNAL(sessionSelected(Session*)),
111         m_sessionInfoWidget, SLOT(loadSession(Session*)));
112 connect(this, SIGNAL(sessionSelected(Session*)),
113         m_recordingsControl, SLOT(loadSession(Session*)));
122 connect(m_avPlayer, SIGNAL(encodingDone()),
123         m_recordingsControl, SLOT(reloadRecs()));
124 connect(m_avPlayer, SIGNAL(playbackStopped()),
125         m_recordingsControl, SLOT(stopPlayback()));
132 connect(m_recordingsControl, SIGNAL(playRecording(int)),
133         m_avPlayer, SLOT(playRecording(int)));
134 connect(m_recordingsControl, SIGNAL(playbackStopped()),
135         m_avPlayer, SLOT(btnStop()));
136 connect(m_recordingsControl, SIGNAL(loadRecording(QString)),
137         this, SLOT(loadRecording(QString)));
fillListWidget
[00124] The ListWidget on the left side is filled by iterating over the SessionList object (l.269) and extracting the needed information from the saved Session objects (l.271). The start of the epoch is filtered out (ll.278-284), since this is the value received for unsaved dates.
269 for (int i = 0; i < m_sessionlist->size(); i++)
270 {
271     Session *session = m_sessionlist->at(i);
272     QListWidgetItem *item = new QListWidgetItem(QString(*session->title())
273         .append(" (%1)\n")
274         .arg(*session->id())
275         .append(*session->shortdesc())
276         .append("\nrecorded in %1 %2")
277         .arg(*session->recordingplace())
278         .arg(
279             ((!session->recordingdate()->compare("01-01-1970")) ||
280              (!session->recordingdate()->compare("01.01.1970")) ||
281              (!session->recordingdate()->compare("01/01/1970")) ||
282              (!session->recordingdate()->compare("1970-01-01")) ||
283              (!session->recordingdate()->compare("1970.01.01")) ||
284              (!session->recordingdate()->compare("1970/01/01"))) ? \
                 "" : " on " + *session->recordingdate()),
285         listWidget);
286     item->setIcon(*session->thumbnail());
287     item->setData(Qt::UserRole, i);
288 }
itemActivated
[00125] Whenever the user clicks on a session entry in the list widget or selects it with the keyboard, the local slot itemActivated is called. Its sole purpose is to extract the chosen session from the sessionList object and pass it to interested objects by emitting a signal with the session object as parameter. The interested objects were already registered within the constructor (ll.108-113).
146 Session *session = m_sessionlist->at(item->data(Qt::UserRole).toInt());
147 emit sessionSelected(session);
loadRecording
[00126] Being the super widget for all GUI objects, the SessionWidget needs to transfer signals originating from subwidgets to other subwidgets. So it lies within the SessionWidget's area of responsibility to coordinate the loading of recorded interpretations from the server.
[00127] The RecordingsControl tells the SessionWidget about the recording to be loaded by emitting a signal with the recording id as parameter (ll.136-137). A new Recording object is then created (l.300), with the id and the current session list as parameters so its constructor can download the needed files. After a successful loading (l.302) the Session object - referenced from within the Recording object - is asked to update its internal list of recordings (l.304) and the session in question is loaded within all other widgets (l.305).
m_recording = new Recording(id, m_sessionlist);
if (m_recording->isInitialised())
{
    m_recording->session()->updateRecordingsList();
    sessionSelected(m_recording->session());
}
filterChanged
[00128] Each time something is entered into the filter field above the list widget the slot filterChanged is called (ll.96-97). It controls the state of the push button left of it and sets its text.
[00129] As long as nothing is contained within the input field, the button is disabled and its text set to the default string "open" (ll.153-159). Entering anything will activate it (l.161) and set its text to "open" (ll.167-171). When a white space character is added to the field (only spaces and digits are allowed by the regular expression) the text is changed to "filter" to show the user that he will now filter the list below for the entered IDs.
153 if (filter.isEmpty())
154 {
155     m_filterButton->setEnabled(false);
156     m_filterButton->setText("open");
157     m_filterButton->setToolTip("enter session Id to continue");
158     return;
159 }
160
161 m_filterButton->setEnabled(true);
162 if (filter.indexOf(" ") != -1)
163 {
164     m_filterButton->setText("filter");
165     m_filterButton->setToolTip("filter sessionlist for ids");
166 }
167 else
168 {
169     m_filterButton->setText("open");
170     m_filterButton->setToolTip("open session template");
171 }
btnFilter
[00130] The push button to the right of the filter field emits a signal when pushed, which is handled by the local slot btnFilter. The button supports three states which, to simplify matters, are represented by its label text rather than an internal variable. These three states are "reset", "open" and "filter". Depending on its state, the actions taken by the slot are different.
[00131] Resetting the listWidget is the most trivial operation. After a filtering action, the listWidget hides all non-relevant sessions. These are now redisplayed (ll.184-187) and the filter field is cleared (l.181).
181 m_filterEdit->clear();
184 for (int i = 0; i < listWidget->model()->rowCount(); i++)
185 {
186     listWidget->setRowHidden(i, false);
187 }
[00132] The default action of the button is to "open" a single session whose ID was entered into the filter field. It simply iterates through all entries within the local SessionList (l.247) and compares the entered id (limited to digits by design) with the ids of the Session objects (l.249). If a match is found the session within the listWidget is chosen (l.252) and loaded within the other widgets by emitting the known signal sessionSelected (l.253).
247 for (int i = 0; i < m_sessionlist->count(); i++)
249     if (*m_sessionlist->at(i)->id() == m_filterEdit->text())
252         listWidget->setCurrentRow(i);
253         emit sessionSelected(m_sessionlist->at(i));
[00133] The third possible operation is the filtering, called when the button was labelled "filter". After switching the button and the filter field into a resettable state (ll.191-193) the text entered into the filter field is split up into usable tokens (l.194). Remember that only digits and spaces are allowed in the filter field. A boolean array with as many fields as entries in the filter field is allocated to record for each searched session whether it was found (l.195). Another array with the size of the session list is created to hold the information about which sessions will be shown in the list widget. Its entries are initialised with false (ll.196-199).
[00134] The main loop iterates over the just created list of filter field entries (l.201). For each entry in the list, the whole session list is searched (l.206). When a matching session is found, the corresponding entry in the found array is set to true to indicate this session was successfully found (l.212). The matching session is also marked as to be shown in the show array (l.213).
[00135] Finally, all session entries not marked within the show array are hidden within the list widget (ll.221-225).
191 m_filterButton->setText("reset");
192 m_filterButton->setToolTip("reset the filter");
193 m_filterEdit->setEnabled(false);
194 QStringList list = m_filterEdit->text().split(" ");
195 bool found[list.count()];
196 bool show[m_sessionlist->count()];
197
198 for (int i = 0; i < m_sessionlist->count(); i++)
199     show[i] = false;
200
201 for (int i = 0, j = 0; i < list.count(); i++)
202 {
204     found[i] = false;
205     j = 0;
206     while (j < m_sessionlist->count())
207     {
208         if (!list.at(i).compare(*m_sessionlist->at(j)->id()))
209         {
212             found[i] = true;
213             show[j] = true;
214             j++;
215             break;
216         }
217         j++;
218     }
219 }
220
221 for (int i = m_sessionlist->count()-1; i >= 0; i--)
222 {
223     if (!show[i])
224         listWidget->setRowHidden(i, true);
225 }
226
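The core of the filter operation - tokenising the digits-and-spaces input and marking every matching session as visible - can be sketched in standard C++ (function name and data types are illustrative; the real code works on Qt types and the listWidget):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Given the space-separated id filter and the list of session ids, compute
// which rows stay visible, mirroring the show[] array filled in btnFilter.
static std::vector<bool> filterRows(const std::string &filter,
                                    const std::vector<std::string> &ids)
{
    std::vector<bool> show(ids.size(), false);   // all hidden by default
    std::istringstream tokens(filter);
    std::string token;
    while (tokens >> token) {                    // split on spaces (l.194)
        for (std::size_t j = 0; j < ids.size(); ++j) {
            if (ids[j] == token) {               // id match (l.208)
                show[j] = true;                  // mark row visible (l.213)
                break;
            }
        }
    }
    return show;
}
```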
Recording [00136] All recordings available within the local file cache are internally represented by Recording objects. It is partly comparable to the Session object.
Constructor
[00137] Being responsible for the management of its files, the Recording object only gets an ID and a reference to the program-wide session list. From this id the file paths for the audio file and for an additional XML file containing further information, like author, date of recording or the ID of the parent speech, are derived. This XML file is parsed at the initialisation of the object (l.45), which includes downloading the file if it is not locally available.
[00138] To ease the finding of recordings belonging to a specific session a symbolic link (.lnk file on MS Windows) is created whose filename starts with the parental session id of the recording (ll.48-53).
045 if (!parseContentsfile())
046     return;
047
048 m_contentsfile->link(m_applicationPath->path()
049     .append("/")
050     .append(*m_sessionId)
051     .append("_recording_")
052     .append(*m_id)
053     .append(".xml"));
054
055 m_isInitialised = true;
056
057 if (!audiofileExists())
058     downloadAudiofile(false);
[00139] When constructing a Recording object it extracts the needed information from an XML file which was either already downloaded or will be when trying to parse it (ll.155-157). After checking the file for semantic and syntactic correctness each node is traversed and parsed (l.192).
[00140] The ID of the speech on which this recording is based is contained in the basedon field. Its value is stored locally (l.197). The session list is searched for this string (l.209) and a reference to the corresponding Session object is stored in the Recording object.
[00141] All other information is extracted directly from the tags without further treatment (ll.225-245).
155 if (!m_contentsfile->exists())
157     downloadContentsfile(true /* blocking */);
190 QDomNode node = root.firstChild();
191
192 while (!node.isNull())
193 {
194     QDomElement element = node.toElement();
195     if (element.tagName() == "basedon")
196     {
197         m_sessionId->append(element.text());
206         bool found = false;
207         for (int i = 0; i < m_sessionlist->size(); i++)
208         {
209             if (*m_sessionlist->at(i)->id() == sessionId)
210             {
211                 m_session = m_sessionlist->at(i);
212                 found = true;
213                 break;
214             }
215         }
216         if (!found)
222             return false;
224     }
225     else if (element.tagName() == "recordingdate")
226     {
227         m_date->append(element.text());
228     }
245     node = node.nextSibling();
246 }
247 return true;
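The basedon lookup (ll.206-222) is a plain linear search over the session list that rejects the whole parse when no session matches. A standard-C++ sketch of that step, with hypothetical names and the session list reduced to its ids:

```cpp
#include <string>
#include <vector>

// Search the session list for the id read from the recording's XML file;
// returns the index of the matching session, or -1 when none matches
// (in which case parseContentsfile returns false, l.222).
static int findSessionIndex(const std::vector<std::string> &sessionIds,
                            const std::string &basedOnId)
{
    for (std::size_t i = 0; i < sessionIds.size(); ++i) {
        if (sessionIds[i] == basedOnId)   // id comparison, as on l.209
            return static_cast<int>(i);
    }
    return -1;                            // recording refers to no known session
}
```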
Further Functions [00142] All further functions are comparable to their counterparts within the Session object and are trivial.
RecordingsControl [00143] On the right-hand side of the GUI the RecordingsControl is loaded to offer an interface to the locally stored and downloadable recordings.
Constructor
[00144] The constructor creates the GUI (ll.30-50), connects the buttons with local slots (ll.53-60) and starts the first painting (l.51).
028 m_state = RecordingsControl::Undefined;
050 setLayout(m_mainLayout);
051 updateGuiState();
052
053 connect(m_deleteButton, SIGNAL(clicked()),
054         this, SLOT(btnDelete()));
updateGuiState
[00145] Saving its internal state within an enumeration variable (l.28), the object can easily adapt its painting to its state. Depending on the value of the internal state (l.75) the buttons are labelled differently (l.81/l.117) and activated (l.79) or deactivated (l.91). The possible states are Undefined, Ready and Playing.
075 switch (m_state)
076 {
077 case RecordingsControl::Undefined:
078     m_comboBox->setEnabled(false);
079     m_deleteButton->setEnabled(false);
080     m_listenButton->setEnabled(false);
081     m_listenButton->setText(tr("listen to this recording"));
082     m_listenButton->setIcon(QIcon(":/images/player_play.png"));
083     m_uploadButton->setEnabled(false);
084     m_downloadButton->setEnabled(true);
085     break;
086 case RecordingsControl::Ready:
087     m_comboBox->setEnabled(true);
088     if ((m_comboBox->currentIndex() < MAX_RECORDING_FILES) && \
            (m_session->localAudiofileExists(m_comboBox->currentIndex())))
091         m_deleteButton->setEnabled(true);
115     break;
116 case RecordingsControl::Playing:
117     m_listenButton->setText(tr("stop the current playback"));
118     m_listenButton->setIcon(QIcon(":/images/player_stop.png"));
119     break;
120 }
reloadRecs
[00146] Downloading a new recording, deleting a local one or changing the active session within iRec triggers a call of reloadRecs. For each possible local audio file (l.138) an entry is added to the just emptied combo box. It is either the creation date of the file (l.142) or the string "free recording slot" when no file is present within this slot.
[00147] Afterwards the Recording objects referenced by the active Session object are added to the combo box, too (ll.149-152).
138 for (int i = 0; i < MAX_RECORDING_FILES; i++)
139 {
140     if (m_session->localAudiofileExists(i))
141     {
142         m_comboBox->addItem(QFileInfo(*(m_session->localAudiofile(i))). \
                created().toString());
143     }
144     else
145     {
146         m_comboBox->addItem(tr("free recording slot"));
147     }
148 }
149 foreach(Recording* rec, *m_session->localRecordings())
150 {
151     m_comboBox->addItem(*rec->id() + " (" + *rec->date() + ")");
152 }
btnUpload
[00148] When a local recording is selected within the combo box an upload is enabled by activating the upload button, which is connected to the slot btnUpload. It creates a new UploadFileDialog and passes the session id as parameter for the POST data (l.211). This UploadFileDialog object is then used to upload the corresponding audio file (l.212).
210 QString* parameter = new QString(*m_session->id());
211 m_uploadFileDialog = new UploadFileDialog(parameter);
212 m_uploadFileDialog->upload(const_cast<QFile *>(m_session-> \
        localAudiofile(m_comboBox->currentIndex())));
UploadFile [00149] Similar to DownloadFile there exists a small framework to upload files to the multilingualspeeches.tv server. It is called UploadFile and brings a dedicated dialog, UploadFileDialog, which is very similar to DownloadFileDialog.
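The central task of UploadFile is assembling a multipart/form-data POST body by hand ([00152]). A minimal standard-C++ sketch of such a body - the boundary string and field name are illustrative, standing in for UPLOAD_DIVIDER and the "content" field of the listing:

```cpp
#include <string>

// Sketch of a multipart/form-data request body: opening boundary, one
// form field, and the closing boundary with its trailing "--".
static std::string buildMultipartBody(const std::string &boundary,
                                      const std::string &fieldName,
                                      const std::string &fieldValue)
{
    std::string body;
    body += "--" + boundary + "\r\n";
    body += "Content-Disposition: form-data; name=\"" + fieldName + "\"\r\n\r\n";
    body += fieldValue + "\r\n";
    body += "--" + boundary + "\r\n";
    // ... the file part of the upload would follow here ...
    body += "--" + boundary + "--\r\n";   // closing boundary
    return body;
}
```

The matching Content-Type header must carry the same boundary, e.g. "multipart/form-data; boundary=" followed by the boundary string.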
Constructor
[00150] Since uploading may depend on a proxy server, the proxy settings are loaded from the program settings (ll.26-31) and the httpConnection is initialised with this data (ll.33-34).
[00151] The common connection of signals and slots is at the end of the constructor.
026 m_settings = new QSettings(REG_CATEGORY, REG_SUBKEY);
027
028 m_proxyHost = new QString(m_settings->value("proxyhost", "").toString());
029 m_proxyPort = m_settings->value("proxyport", "").toInt();
030 m_proxyUser = new QString(m_settings->value("proxyuser", "").toString());
031 m_proxyPassword = new QString(m_settings->value("proxypassword", "").toString());
032
033 m_httpConnection->setHost(m_url->host(), m_url->port(80));
034 m_httpConnection->setProxy(*m_proxyHost, m_proxyPort, \
        *m_proxyUser, *m_proxyPassword);
upload
[00152] When the object is ready the upload can be started by giving a reference to the file object to the function upload. It uses the same concepts, such as an own event loop, as the download function in DownloadFile. The most significant difference is that upload needs to construct its own HTTP header so that the parameter and the file can be sent.
064 m_header = new QHttpRequestHeader("POST", m_url->path());
065 m_header->setValue("Host", m_url->host());
066 m_header->setValue("User-Agent", "iRec " + QString(IREC_CLIENT_VERSION));
067 m_header->setValue("Accept", "text/xml");
068 m_header->setValue("Content-Type", "multipart/form-data; boundary=" + \
        QString(UPLOAD_DIVIDER));
067 m_header->setValue("Connection", "Keep-Alive");
068
069 m_bytes = new QByteArray();
070 m_bytes->append("--" + QString(UPLOAD_DIVIDER) + "\r\n");
071 m_bytes->append("Content-Disposition: form-data; name=\"content\"\r\n\r\n");
072 m_bytes->append(*m_parameter + "\r\n");
073 m_bytes->append("--" + QString(UPLOAD_DIVIDER) + "\r\n");
081 m_bytes->append("--" + QString(UPLOAD_DIVIDER) + "--\r\n");
082
083 m_httpRequestId = m_httpConnection->request(*m_header, *m_bytes);
AVPlayer
[00153] A very important interaction widget for the user is AVPlayer. It is visible as the video window with the playback and recording controls below the output area.
It also controls the used MPlayer processes, the recording and the encoding.
Constructor
[00154] Most of the code within the constructor is dedicated to building the GUI and connecting the widget's signals with the slots and signals of the subwidgets. Since AVPlayer is the parent object for playing and recording it supports more states than the other objects: "Undefined", "Recording", "Encoding", "Ready", "Playing", "Paused", "ReadyRecording", "PlayingRecording" and "PausedRecording".
[00155] Two MPlayer objects are already created within the constructor. The external processes will be started when the user clicks on play, however.
111 m_state = AVPlayer::Undefined;
115 m_mplayerVideo = new MPlayer(m_videoWindow);
116 m_mplayerAudio = new MPlayer();
117 m_encoderDialog = new AacEncoderDialog();
loadSession
[00156] When a new session is loaded from the list widget on the left side, or a recording opened on the right side wants to change the GUI to represent its own session, the signal sessionSelected is emitted and finally fed into the local slot loadSession.
[00157] It stops all playback or recording by simulating a click on the stop button (l.158), resets the state (l.180/l.187) and readjusts the slider beneath the video window (ll.189-190). If a transcription of this speech is available the transcript button is activated (ll.192-193). After changing the internal state it is important to update the GUI to reflect these changes, which is centralised within updateGuiState (l.196).
158 btnStop();
159
180 m_session = session;
187 m_state = AVPlayer::Ready;
188 m_videoReady = m_session->localVideofileExists();
189 m_videoSlider->setMaximum(m_session->length());
190 m_videoSlider->setValue(0);
191
192 m_transcriptButton->setEnabled(!m_session->transcription()->isEmpty());
193 m_transcriptButton->setToolTip(m_session->transcription()->isEmpty()?\
        "no transcription available":"show transcription");
194
195
196 updateGuiState();
updateGuiState
[00158] Each time the internal state changes the GUI must be changed too. To ease the handling of GUI updates they are centralised within this function.
[00159] It checks the current state of the widget (l.204) and changes the buttons and slider to suit the state (ll.206-302).
204 switch (m_state)
205 {
206 case AVPlayer::Undefined:
207 case AVPlayer::Encoding:
208     m_downloadButton->setEnabled(false);
209     m_deleteButton->setEnabled(false);
210     m_playPauseButton->setEnabled(false);
211     m_playPauseButton->setIcon(QIcon(":/images/player_play.png"));
212     m_playPauseButton->setToolTip("start the playback");
213     m_stopButton->setEnabled(false);
214     m_recordButton->setEnabled(false);
215     m_boostButton->setEnabled(false);
216     m_videoSlider->setEnabled(false);
217     m_videoSlider->setValue(0);
218     emit setVolumeLeftEnabled(false);
219     emit setVolumeRightEnabled(false);
220     break;
221 case AVPlayer::Ready:
303 }
playRecording
[00160] Before starting to play another audio file, potentially running playbacks must be stopped (l.313). Local recordings and downloaded ones are inserted at different positions in the combo box within the recordings control, so the distinction is already made by the index of the chosen entry (l.315). The corresponding audio and video files are each given to an MPlayer process and played back on different audio channels (see MPlayer object documentation) (ll.316-319).
313 btnStop();
314 m_state = AVPlayer::ReadyRecording;
315 if (index < MAX_RECORDING_FILES)
316     m_mplayerAudio->play(m_session->localAudiofile(index), 1); // right
317 else
318     m_mplayerAudio->play(m_session->localRecordings()-> \
            at(index-MAX_RECORDING_FILES)->audiofile(), 1); // right
319 m_mplayerVideo->play(m_session->localVideofile(), 0); // left
320
321 m_videoWindow->resize(30, 20);
btnPlayPause
[00161] Depending on the state of the AVPlayer widget (l.346) this slot either spawns a new MPlayer process for playback (see MPlayer object documentation) (l.349) or toggles the playing state within the same (l.353).
346 switch (m_state)
347 {
348 case AVPlayer::Ready:
349     m_mplayerVideo->play(m_session->localVideofile());
350     break;
351 case AVPlayer::Playing:
352 case AVPlayer::Paused:
353     m_mplayerVideo->togglePlayPause();
362 }
btnRecord
[00162] To start the recording of an interpretation the user has to click on the record button. The then called slot btnRecord creates a new AudioRecorder object with enough space to hold a recording which is 110% as long as the original speech (l.410). After marking the inner state as Recording (l.421) the signals of the newly created AudioRecorder are connected to the local slot recFinished to make encoding outside of the AudioRecorder easily possible (ll.423-424).
[00163] The video of the speech is started (l.426) and the AudioRecorder object is instructed to start the capturing of audio (l.427).
410 m_recorder = new AudioRecorder(m_session->length() + (m_session->length()/10));
421 m_state = AVPlayer::Recording;
422
423 connect(m_recorder, SIGNAL(done()), this, SLOT(recFinished()));
424 connect(m_recorder, SIGNAL(stopped()), this, SLOT(recFinished()));
425
426 m_mplayerVideo->play(m_session->localVideofile());
427 m_recorder->start();
428 updateGuiState();
recFinished
[00164] Almost all status-signal-receiving slots change the inner state of the widget and update the GUI afterwards. RecFinished, called when a recording action is ending, additionally initiates the encoding of a recorded interpretation by giving the newly filled buffer of the local AudioRecorder object and a reference to the target file to the session-independent EncoderDialog (ll.578-579).
559 int retMsg = QMessageBox::question(this, tr("iRec Audio Recording"),
560     tr("Do you want to save this recording to your \
        local hard disk?"),
561     QMessageBox::Save | QMessageBox::Discard,
562     QMessageBox::Save);
563 if (retMsg == QMessageBox::Save)
564 {
578     m_encoderDialog->encode(m_recorder->saveBuffer(),
579         const_cast<QFile *>(m_session-> \
            localAudiofile(fileIndex)));
582 }
MPlayer
[00165] Playback of audio and video in this version of iRec is done by an external MPlayer process. It is controlled by the MPlayer object from within iRec.
Constructor
[00166] As with several other objects already described, MPlayer saves an internal state (l.15) which can be "Undefined", "Playing" or "Paused".
[00167] To control the external process, a QProcess and a QTimer are created (ll.18-19). QProcess is responsible for passing the byte streams between the external process and iRec. The timer is used to ask MPlayer for its current position within the stream to be able to update the GUI. They are connected with local slots to process the events (ll.28-37).
[00168] The widget in which the video shall be displayed is given to the constructor of the MPlayer object and a reference is saved within the object.
015 m_state = MPlayer::Undefined;
016 m_mplayerPath = new QString(MPLAYER_PATH);
017
018 m_timer = new QTimer(this);
019 m_process = new QProcess(this);
020 m_errorString = new QString();
026 m_outputWidget = outputWidget;
027
028 connect(m_process, SIGNAL(started()),
029         this, SLOT(processStarted()));
030 connect(m_process, SIGNAL(finished(int, QProcess::ExitStatus)),
031         this, SLOT(processFinished(int, QProcess::ExitStatus)));
032 connect(m_process, SIGNAL(error(QProcess::ProcessError)),
033         this, SLOT(processError(QProcess::ProcessError)));
034 connect(m_process, SIGNAL(readyReadStandardOutput()),
035         this, SLOT(processReadStandardOutput()));
036 connect(m_timer, SIGNAL(timeout()),
037         this, SLOT(pollCurrentTime()));
play
[00169] When the MPlayer object is told to start playing, it assembles a command line and spawns an additional process (l.102). All arguments to the external process are saved into a QStringList (l.57) which is used by start to assemble the whole command line.
[00170] Play must also find the window ID of the target widget (l.66) and coordinate the parallel-running video and audio mplayer processes by limiting the audio output channels (ll.76-94).
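The assembly of such a slave-mode command line can be sketched in standard C++; the function name and the simplified argument handling are illustrative, the option names ("-slave", "-wid") are those used in the listing:

```cpp
#include <string>
#include <vector>

// Sketch of the argument list handed to the external mplayer process,
// as assembled in MPlayer::play; windowId = 0 means no embedding.
static std::vector<std::string> buildPlayerArgs(const std::string &mediaFile,
                                                long long windowId)
{
    std::vector<std::string> args;
    args.push_back("-slave");                      // accept commands on stdin
    if (windowId != 0) {
        args.push_back("-wid");                    // render into this window
        args.push_back(std::to_string(windowId));
    }
    args.push_back(mediaFile);                     // media file comes last
    return args;
}
```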
[00171] Since now the mplayer process is running and showing media the timer is started to poll MPlayer for the current position within the video (1.103). 357 QStringList args; 058 args << "-slave" 063 if(m_outputWidget) 066 args << “-wid" << QString::number(\ reinterpret_cast<qlonglong>(m_outputWidget->winId())) 076 switch (channel) 077 { 078 case Θ: 079 /* nur links */ 080 args << ”-af" 081 « "channels*^ :2:0:0:1:0"; 082 break; 094 ) 095 096 args « mediaFile->fileNameO ; 100 m_process->setProcessChannelMode(QProcess::MergedChannels); 102 m_process->start(*m_mplayerPath, args); 103 i mer - >s tar t (1Θ0ΘΤ; pause [00172] All controlling functions work similar. Pause is taken as an example. First the current state is checked to ensure that this slot can actually be called at this situation (1.110). When the slot is runnable it writes the command into the process pipe (1.112), eventually resetting the state (1.113) and emitting a signal on success (1.114). 110 if (m state == HPlayer:.Playing) 111 { 112 m_process->wri te(”pause\n"): 113 m_state = MPlayer:: Paused; 114 emit playbackPausedQ; 115 } pollCurrentTime [00173] Whenever the timer is fired the current position within the media file is read.
This is done by telling MPlayer to print this information onto the default pipe (l.289). When MPlayer writes the time into the pipe, the local slot processReadStandardOutput is called.
288 if (m_state == MPlayer::Playing)
289     m_process->write("get_time_pos\n");
processReadStandardOutput
[00174] Each line of output is read (l.257) and parsed for specific keywords (l.274). When relevant information is found, it is parsed, saved into a local variable (l.277) and the change is propagated by emitting a signal already containing the new value as parameter (l.278). This signal can be used within other objects to update the GUI (AVPlayer).
257 while (m_process->canReadLine())
258 {
259     QByteArray buffer(m_process->readLine());
274     if (buffer.startsWith("ANS_TIME_POSITION"))
277         m_pos = buffer.toFloat();
278         emit posChanged((int)(m_pos + 0.5));
280 }
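The parsing of such a status line can be sketched in standard C++. The listing elides the lines between l.274 and l.277, so this sketch assumes the numeric value follows the "=" sign of the "ANS_TIME_POSITION=" answer; the function name is illustrative:

```cpp
#include <cstdlib>
#include <string>

// Parses one line of mplayer slave-mode output. Returns the position in
// seconds rounded to the nearest integer (same rounding as l.278), or -1
// for unrelated output lines.
static int parseTimePosition(const std::string &line)
{
    const std::string key = "ANS_TIME_POSITION=";
    if (line.compare(0, key.size(), key) != 0)
        return -1;                                  // not a position answer
    float pos = std::strtof(line.c_str() + key.size(), nullptr);
    return static_cast<int>(pos + 0.5f);            // round half up
}
```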
AudioRecorder

[00175] To capture the audio signals, the portaudio library is used. It is encapsulated within a dedicated object which offers well-known interfaces to the rest of the program.
Constructor

[00176] The sole parameter the AudioRecorder constructor gets is the maximum length of the audio to be recorded. This value is used to calculate the number of needed frames (ll.29-32) and to allocate enough memory in advance (l.34). The newly allocated buffer is filled with "silence" (ll.41-42); PortAudio is then initialised (l.44) and configured as needed (ll.53-57) so that an input stream can be opened (ll.59-66). As soon as the stream is opened, AudioRecorder changes to ready and awaits the instruction to begin recording.

027 m_recordingBuffer = new sampleBuffer;
028
029 m_recordingBuffer->maxFrameIndex = totalFrames = int(length * SAMPLE_RATE);
030 m_recordingBuffer->frameIndex = 0;
031 numSamples = totalFrames * NUM_CHANNELS;
032 numBytes = numSamples * sizeof(SAMPLE);
033
034 m_recordingBuffer->samples = (SAMPLE *) malloc(numBytes);
041 for(int i = 0; i < numSamples; i++)
042     m_recordingBuffer->samples[i] = SAMPLE_SILENCE;
043
044 err = Pa_Initialize();
059 err = Pa_OpenStream(&m_stream,
060                     &inputParameters,
061                     NULL,
062                     SAMPLE_RATE,
063                     FRAMES_PER_BUFFER,
064                     paClipOff,
065                     recordCallback,
066                     this);
075 m_state = AudioRecorder::Ready;

saveBuffer

[00177] After a recording is finished, the audio data is only stored in main memory. The saveBuffer function is called from outside after the user has confirmed that he wants to save and encode the data.
[00178] As target, a temporary file is created (l.205) and opened for writing. The number of bytes to be written is calculated using the whole buffer as base (l.209). This buffer is then written unformatted into the file (l.210), resulting in a collection of unsigned 16-bit integers. The file pointer is returned so that the calling object can pass this file to an encoding process.

205 m_tempFile = new QTemporaryFile(QDir::tempPath().append("/iRec recording"));
206
207 if (m_tempFile->open())
208 {
209     int bytesToWrite = m_recordingBuffer->frameIndex * NUM_CHANNELS * \
            sizeof(SAMPLE);
210     int writtenBytes = m_tempFile->write((char*)m_recordingBuffer->samples, \
            bytesToWrite);
211     m_tempFile->flush();
213     return m_tempFile;
214 }
AacEncoder

[00179] Encoding is taken over by faac, which is controlled by the AacEncoder object. It uses the same techniques to communicate with the external process as the MPlayer object does; only the commands are different.
Further Objects

[00180] There are still some classes and objects in iRec which have not been described within this document. They can be considered trivial and need no more explanation than the source code itself, since they only provide further GUI elements and no program logic.

Claims (1)

1. Mock interpretation tool permitting students of conference interpretation to watch video material and listen to audio material stored in a standard but specific format in a database, and to record a mock interpretation with the possibility of being evaluated by a distant tutor.