AU2021104873A4 - An audio-visual analysing system for automated presentation delivery feedback generation - Google Patents


Info

Publication number
AU2021104873A4
Authority
AU
Australia
Prior art keywords: audio, video, presentation, analysis, audio analysis
Prior art date: 2021-02-25
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2021104873A
Inventor
Gail Bower
Tim Kirkman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2021-02-25
Filing date: 2021-08-03
Publication date: 2021-09-30
Priority claimed from AU2021900520A0
Application filed by Individual
Application granted
Publication of AU2021104873A4
Ceased legal status
Anticipated expiration of legal status

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/69 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for evaluating synthetic or decoded voice signals
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00 Teaching not covered by other main groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/193 Preprocessing; Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/197 Matching; Classification
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B17/00 Teaching reading
    • G09B17/003 Teaching reading electrically operated apparatus or devices
    • G09B17/006 Teaching reading electrically operated apparatus or devices with audible presentation of the material to be studied
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information

Abstract

An audio-visual analysing system for automated presentation delivery feedback generation comprising a computing device having a: processor and memory device operably coupled thereto, the memory device comprising computer program code instructions and associated data fetched, interpreted and executed by the processor in use; a video interface interfacing a video camera; an audio interface interfacing a microphone; wherein the computer program code instructions comprise: a video analysis controller which analyses video captured by the video camera to generate a video analysis feedback score; and an audio analysis controller which analyses audio captured by the microphone to generate an audio analysis feedback score.

[Figure 1: block diagram of the computing device showing the profile, display, video and audio interfaces; the presentation structure and data; and the text-to-speech, video analysis, audio analysis and templating controllers coupled to the processor and memory.]

Description

[Figure 1, sheet 1/4: block diagram showing the display, video and audio interfaces; the profile, presentation structure and data; and the text-to-speech, video analysis, audio analysis and templating controllers coupled to the processor and memory.]
An audio-visual analysing system for automated presentation
delivery feedback generation
Field of the Invention
[0001] This invention relates generally to an audio-visual analysing system for automated presentation delivery feedback generation.
Summary of the Disclosure
[0002] There is provided herein an audio-visual analysing system for automated generation of presentation delivery feedback.
[0003] The system comprises a processor and memory device operably coupled thereto. The memory device comprises computer program code instructions and associated data which is fetched, interpreted and executed by the processor in use.
[0004] The computer device has a video interface interfacing a video camera and an audio interface interfacing a microphone.
[0005] The computer program code instructions comprise a video analysis controller which analyses video captured by the video camera to generate a video analysis feedback score. Furthermore, the computer program code instructions comprise an audio analysis controller which analyses audio captured by the microphone to generate an audio analysis feedback score.
[0006] Other aspects of the invention are also disclosed.
Brief Description of the Drawings
[0007] Notwithstanding any other forms which may fall within the scope of the present invention, preferred embodiments of the disclosure will now be described, by way of example only, with reference to the accompanying drawings in which:
[0008] Figure 1 shows an audio-visual analysing system for automated presentation delivery feedback generation in accordance with an embodiment;
[0009] Figure 2 shows exemplary processing by the system of Figure 1 in accordance with an embodiment;
[0010] Figure 3 shows exemplary video processing by the system of Figure 1 in
accordance with an embodiment; and
[0011] Figure 4 shows exemplary audio processing by the system of Figure 1 in
accordance with an embodiment.
Description of Embodiments
[0012] Figure 1 shows an audio-visual analysing system 100 for presentation delivery
feedback.
[0013] The system 100 comprises a computing device 101 having a processor 102 in
operable communication with a memory device 103 across a system bus 104.
[0014] The memory device 103 comprises computer program code instructions and
associated data which are fetched, interpreted and executed by the processor 102 in
use for implementing the functionality described herein.
[0015] The computing device 101 comprises a display interface 105 interfacing a
digital display device 106. The processor 102 controls the display interface 105 to
display a user interface 107 on the digital display 106 comprising digital information
108.
[0016] The computing device 101 further comprises a video interface 109 which
captures video from a video camera 110. The computing device 101 further comprises
an audio interface 111 which captures audio from a microphone 112.
[0017] The memory 103 may comprise data 113 comprising presentation templates
114 which are used to generate a presentation structure 115. The data 113 may
further store parameters 116 in relation to a presentation. The data 113 may further
store a profile 117 generated according to calculated video and audio analysis
feedback scores.
[0018] The memory 103 may further comprise controllers 118, comprising a templating controller 119 for generating the presentation structure 115 from the presentation templates 114.
[0019] The controllers 118 may further comprise an audio analysis controller 120
which analyses audio captured by the microphone 112. Furthermore, the controllers
118 may comprise a video analysis controller 121 which analyses video captured by
the video camera 110. In embodiments, the controllers 118 may further comprise a text-to-speech controller 122 which converts text from the presentation structure 115 to speech.
[0020] Figure 2 illustrates exemplary processing 123 by the system 100.
[0021] The processing 123 may comprise template interface interaction at step 124 wherein the templating controller 119 generates the presentation data structure 115 and associated parameters 116 at step 125 using at least one template 114.
[0022] Specifically, for the generation of a presentation structure 115, a user of the
system 100 may select a template 114 for generating a presentation wherein the
system 100 displays a user interface 107 comprising on-screen controls to generate
the presentation structure 115.
[0023] The interface 107 may request various information such as the objectives of
the presentation, introduction, talking points, conclusions and the like. In
embodiments, the interface 107 may have input fields for what the presenter would
like the audience to think, feel and do.
[0024] The information input may be inserted into placeholders of the relevant
template 114 for the generation of the presentation structure 115.
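By way of non-limiting illustration, the placeholder insertion of this step may be sketched as follows in Python; the template wording and field names (objective, think, feel, do) are illustrative assumptions rather than content taken from the specification:

```python
from string import Template

# Hypothetical template 114: placeholder names are illustrative only.
PERSUASIVE_TEMPLATE = Template(
    "Objective: $objective\n"
    "Introduction: $introduction\n"
    "Talking points: $talking_points\n"
    "Audience should think: $think; feel: $feel; do: $do\n"
    "Conclusion: $conclusion"
)

def generate_presentation_structure(template: Template, inputs: dict) -> str:
    """Insert the interface inputs into the template's placeholders."""
    return template.substitute(inputs)

structure_115 = generate_presentation_structure(PERSUASIVE_TEMPLATE, {
    "objective": "Secure project funding",
    "introduction": "Why this matters now",
    "talking_points": "Cost; Risk; Timeline",
    "think": "The plan is credible",
    "feel": "Confident in the team",
    "do": "Approve the budget",
    "conclusion": "Call to action",
})
```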
[0025] The user may also configure various parameters, such as the intended
audience, style of presentation and the like. These parameters may be stored in
relation to the presentation structure 115.
[0026] Once having generated the presentation at steps 124 and 125, the system
displays the presentation in the interface 107 according to the generated presentation
structure 115. For example, the interface 107 may display key points, tips and content
for each stage of the presentation, such as for the introduction, main part and
conclusion stages of the presentation.
[0027] The user then uses the interface 107 to deliver the presentation while the system 100 records the user.
[0028] Specifically, at step 127 the system 100 captures audio via the microphone
112 and, at step 128, the audio analysis controller 120 analyses the audio to generate
an audio analysis feedback score.
[0029] Simultaneously, at step 126, the system 100 captures video data using the
video camera 110 and, at step 129, the video analysis controller 121 analyses the
video for generating a video analysis feedback score.
[0030] In a preferred embodiment, the system 100 captures both audio and video.
However, in embodiments, the system 100 captures either audio or video, including
depending on the hardware capabilities of the computing device 101.
[0031] At step 130 the system 100 updates the user profile 117 with the audio and
video feedback scores and, at step 131, displays the results thereof on the interface
107.
[0032] Figure 3 illustrates exemplary video analysis processing 132 performed by the
video analysis controller 121 for gaze detection in accordance with an embodiment.
[0033] The processing 132 is used to classify the gaze of the user during presentation and, more specifically, to classify whether the user is looking directly ahead or up, down, left or right during the presentation.
[0034] In accordance with this classification, the video analysis controller 121 may
increase the video feedback score when the user is looking directly ahead as opposed
to the sides or up and down.
[0035] In embodiments, the video analysis controller 121 may further adjust the video
analysis feedback score depending on whether the user is looking to the sides or
down wherein the video analysis controller 121 penalises the user (such as by
decrementing the video analysis feedback score or incrementing the video analysis
feedback score by a smaller amount) when the user is looking down at presentation
material as opposed to looking centre and side-to-side and engaging with the
audience.
[0036] The video analysis processing 132 may comprise facial key point detection at
step 133 wherein facial key points are identified from the video data. The facial key
points detected may comprise nose, chin, facial outline, eyebrow, ear, hairline key
points.
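A sketch of one way step 133 may be realised, using dlib's 68-point facial landmark model; the choice of library and the model file path are assumptions, as the specification prescribes neither:

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
# The 68-point model file must be obtained separately (assumption).
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

LEFT_EYE = range(36, 42)   # landmark indices of the left eye contour
RIGHT_EYE = range(42, 48)  # landmark indices of the right eye contour

def eye_key_points(frame):
    """Return ([(x, y)] left eye, [(x, y)] right eye) or None if no face."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    return ([points[i] for i in LEFT_EYE],
            [points[i] for i in RIGHT_EYE])
```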
[0037] Steps 134-136 may be used for detecting gaze at step 137. Step 134 may comprise binary masking wherein the video data is converted to black and white. Such conversion may depend on ambient lighting conditions and the like, and the video analysis controller 121 may dynamically adjust a threshold accordingly.
[0038] At step 134, the video analysis controller 121 specifically attempts to binary mask the eye region of the user. As such, the video analysis controller 121 may segment the eye region using the facial key points detected at step 133 and then dynamically adjust the threshold until such time that two independent white regions (i.e. the eyes) are detected within a continuous black background.
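A sketch of this adaptive thresholding using OpenCV (an assumption; the specification names no library), raising the threshold until exactly two white regions appear:

```python
import cv2

def binary_mask_eyes(eye_region_gray):
    """Raise the threshold until exactly two white regions remain."""
    for threshold in range(30, 220, 5):  # sweep range is an assumption
        # Pixels darker than the threshold (iris/pupil) become white.
        _, mask = cv2.threshold(eye_region_gray, threshold, 255,
                                cv2.THRESH_BINARY_INV)
        # Count connected white regions; label 0 is the black background.
        num_labels, _ = cv2.connectedComponents(mask)
        if num_labels - 1 == 2:
            return mask
    return None  # two independent regions could not be isolated
```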
[0039] Step 135 may comprise segmentation wherein the white regions (correlating
to each eye) are segmented.
[0040] Step 136 may comprise contour finding. Specifically, contour finding may
comprise complete or partial circular contour finding to identify the generally circular
iris within the sclera. In embodiments, contour finding may identify the general shape
of the sclera.
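Contour finding at step 136 may, for example, be sketched with OpenCV as fitting a minimum enclosing circle to the largest contour in the eye mask; again the library choice is an assumption:

```python
import cv2

def find_iris_centre(mask):
    """Fit a circle to the largest contour; return (x, y, radius) or None."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    iris = max(contours, key=cv2.contourArea)  # largest blob as the iris
    (x, y), radius = cv2.minEnclosingCircle(iris)
    return x, y, radius
```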
[0041] At step 137, the video analysis controller 121 can therefore detect the gaze.
Specifically, the video analysis controller 121 may identify the centre point of the iris
region identified by the aforedescribed contour finding wherein the respective position
of the centre point with respect to the surrounding segmented region is indicative of
the gaze of the user.
[0042] In other words, if the centre point is substantially within the centre of the
segmented area, the video analysis controller 121 determines that the gaze is directly
forward whereas if the centre point is to one side of the segmented area, the video
analysis controller 121 determines that the gaze is to one side.
[0043] At step 138, the video analysis controller 121 classifies the gaze. Classification may comprise classifying the gaze into five regions comprising centre, up, down, left and right.
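This five-region classification may be sketched as follows; the 30% boundary proportion is an illustrative assumption, as the specification does not state where the regions begin:

```python
def classify_gaze(centre_x, centre_y, region_w, region_h, margin=0.3):
    """Assign the iris centre to one of the five gaze regions."""
    rel_x = centre_x / region_w  # 0.0 = far left of the eye segment
    rel_y = centre_y / region_h  # 0.0 = top of the eye segment
    if rel_x < margin:
        return "left"
    if rel_x > 1.0 - margin:
        return "right"
    if rel_y < margin:
        return "up"
    if rel_y > 1.0 - margin:
        return "down"
    return "centre"
```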
[0044] Each region may comprise an associated weighting/score used for updating
the video analysis feedback score. For example, the central region may comprise a
weighting of 10, the left and right regions may each comprise a weighting of five, the
upper region may comprise a weighting of three and a lower region may comprise a
weighting of zero.
[0045] As such, for each time period, such as one minute, the video analysis controller 121 may detect the duration of the gaze within each of these regions and calculate a video analysis feedback score in proportion to the time within each region and the associated score.
[0046] For example, for a one-minute period, should the gaze be detected as being within the central region for 30 seconds and within the lower region for 30 seconds, the video analysis controller 121 may assign a video analysis feedback score of five, being the time-weighted average of the scores of ten and zero for the respective central and lower regions.
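The time-weighted calculation of paragraphs [0044] to [0046] may be expressed as follows, using the weightings given above:

```python
# Region weightings from paragraph [0044].
REGION_WEIGHTS = {"centre": 10, "left": 5, "right": 5, "up": 3, "down": 0}

def video_feedback_score(durations):
    """Time-weighted average of region weights; durations in seconds."""
    total = sum(durations.values())
    if total == 0:
        return 0.0
    return sum(REGION_WEIGHTS[r] * t for r, t in durations.items()) / total

# Paragraph [0046]: 30 s centre (weight 10) and 30 s down (weight 0)
# over one minute gives (10 * 30 + 0 * 30) / 60 = 5.
assert video_feedback_score({"centre": 30, "down": 30}) == 5.0
```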
[0047] Figure 4 illustrates exemplary audio processing 139 in accordance with an embodiment. An initial score, such as zero, may be set at step 142.
[0048] Step 140 comprises capturing audio from the microphone 112.
[0049] Step 141 may comprise the audio analysis controller 120 performing volume measurement of the captured audio. The audio analysis controller 120 may compare
the volume average or time period volume average against a target threshold region
such as between 70 and 80 dB.
[0050] The audio analysis controller 120 may positively adjust the score at step 143
when the detected volume is within this target range and negatively adjust the score
when the detected volume is outside this target range.
[0051] In embodiments, the audio analysis controller 120 may set the target threshold region depending on the various parameters. For example, the parameters may include the size of the venue (such as a meeting room or theatre), the number of attendees, whether an audio amplifier is being used, distance from the microphone and the like. Depending on these parameters, the audio analysis controller 120 may select the target threshold region from a lookup table. For example, for a meeting room venue, the target threshold region may be between 60 and 70 dB as opposed to a theatre venue wherein the target threshold region would be higher, such as from 70 to 80 dB.
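A sketch of the volume measurement and venue lookup of paragraphs [0049] to [0051]; the decibel reference and the score increments are assumptions, and absolute sound pressure cannot be obtained without microphone calibration:

```python
import numpy as np

# Venue-dependent target ranges (dB) per the lookup table; the mapping
# beyond the 60-70 and 70-80 dB figures given above is an assumption.
TARGET_RANGES_DB = {"meeting_room": (60, 70), "theatre": (70, 80)}

def volume_db(samples, reference=1.0):
    """RMS level of an audio window in decibels relative to `reference`."""
    rms = np.sqrt(np.mean(np.square(samples.astype(np.float64))))
    return 20.0 * np.log10(max(rms / reference, 1e-12))

def adjust_score_for_volume(score, samples, venue="meeting_room"):
    """Reward a level inside the venue's target range, else penalise."""
    low, high = TARGET_RANGES_DB[venue]
    return score + 1.0 if low <= volume_db(samples) <= high else score - 1.0
```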
[0052] The processing 139 may comprise speech-to-text at step 144 wherein the audio analysis controller 120 converts the audio to text. Specifically, the speech-to-text conversion converts the audio into a string of words, each having an associated timing marker.
[0053] As such, at step 145, the processing 139 may comprise pace measurement wherein the audio analysis controller 120 adjusts the score at step 146 with reference to a target pace threshold range. Similarly, the audio analysis controller 120 may adjust the target pace threshold range depending on the various parameters. For
example, depending on the type of presentation, audience and/or topic of the
presentation, the target pace may vary accordingly.
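Pace measurement over the timing markers may be sketched as follows; the 110 to 150 words-per-minute target range is an illustrative assumption standing in for the parameter-dependent range:

```python
def pace_wpm(timed_words):
    """Words per minute from (word, start_seconds) speech-to-text output."""
    if len(timed_words) < 2:
        return 0.0
    span = timed_words[-1][1] - timed_words[0][1]
    return len(timed_words) / span * 60.0 if span > 0 else 0.0

def adjust_score_for_pace(score, timed_words, target=(110, 150)):
    """Reward a pace inside the target range, else penalise."""
    wpm = pace_wpm(timed_words)
    return score + 1.0 if target[0] <= wpm <= target[1] else score - 1.0
```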
[0054] Step 147 may comprise pause measurement wherein the pauses between words and between sentences are measured. At step 148, the audio analysis controller 120 may adjust the score accordingly. Similarly, the audio analysis controller 120 may utilise a target pause threshold range depending on the configured parameters 116.
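Pause measurement may be sketched analogously; the 0.5 second pause floor and the target count range are illustrative assumptions:

```python
def count_pauses(timed_words, min_pause=0.5):
    """Count gaps of at least min_pause seconds between timing markers."""
    starts = [t for _, t in timed_words]
    return sum(1 for a, b in zip(starts, starts[1:]) if b - a >= min_pause)

def adjust_score_for_pauses(score, timed_words, target=(4, 12)):
    """Reward a pause count inside the target range, else penalise."""
    pauses = count_pauses(timed_words)
    return score + 1.0 if target[0] <= pauses <= target[1] else score - 1.0
```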
[0055] The processing 139 may comprise filler word detection at step 149. Filler word detection 149 may comprise cross-referencing the detected words with a filler word dictionary 151. The filler word dictionary 151 may comprise filler words such as "umm", "ah" and the like. The audio analysis controller 120 may decrement the score at step 150 proportionate to the number of filler words detected.
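Filler word detection against the dictionary 151 may be sketched as follows; the dictionary contents beyond "umm" and "ah", and the per-word penalty, are assumptions:

```python
# Dictionary 151: entries beyond "umm" and "ah" are assumptions.
FILLER_WORDS = {"umm", "um", "ah", "uh", "er", "like"}

def filler_penalty(words, per_word=0.5):
    """Score decrement proportionate to the detected filler words."""
    return per_word * sum(1 for w in words if w.lower() in FILLER_WORDS)

score = 10.0
score -= filler_penalty(["so", "umm", "we", "ah", "begin"])  # score -> 9.0
```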
[0056] At step 152, the audio analysis controller 120 may assign the score to various
categories.
[0057] The display of the feedback results at step 131 may comprise displaying the
actual video and audio analysis feedback score. Alternatively, the video and audio
analysis feedback score may be assigned to various categories.
[0058] For example, the gaze detection may be used to classify that the user gaze
during presentation is "good", or "looking down too much".
[0059] Furthermore, the audio analysis feedback score may be used to classify that the user's
speech is "a bit rushed for the intended audience" or "needs more pauses for greater
emphasis for this type of topic".
[0060] In embodiments, the text-to-speech controller 122 may convert text of a presentation as defined by the presentation structure 115 to speech. In embodiments, the text-to-speech controller 122 may present the presentation depending on the configured parameters 116. For example, depending on the target pace, pause and volume thresholds which depend on the configured parameters 116, the text-to-speech controller 122 may convert the text of the presentation to speech with volume, pace and pauses within the target thresholds.
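A sketch of such parameter-driven delivery using the pyttsx3 library (an assumption; the specification names no text-to-speech engine):

```python
import pyttsx3

def speak_presentation(text, target_wpm=130, volume=0.9):
    """Deliver the presentation text at the configured pace and volume."""
    engine = pyttsx3.init()
    engine.setProperty("rate", target_wpm)  # rate is roughly words/minute
    engine.setProperty("volume", volume)    # 0.0 to 1.0
    engine.say(text)
    engine.runAndWait()
```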
[0061] In embodiments, the system may convert the presentation structure to a
conventional presentation format, such as Microsoft™ PowerPoint™.
[0062] The foregoing description, for purposes of explanation, used specific
nomenclature to provide a thorough understanding of the invention. However, it will
be apparent to one skilled in the art that specific details are not required in order to
practise the invention. Thus, the foregoing descriptions of specific embodiments of
the invention are presented for purposes of illustration and description. They are not
intended to be exhaustive or to limit the invention to the precise forms disclosed as
obviously many modifications and variations are possible in view of the above
teachings. The embodiments were chosen and described in order to best explain the
principles of the invention and its practical applications, thereby enabling others
skilled in the art to best utilize the invention and various embodiments with various
modifications as are suited to the particular use contemplated. It is intended that the
following claims and their equivalents define the scope of the invention.
[0063] The term "approximately" or similar as used herein should be construed as
being within 20% of the value stated unless otherwise indicated.

Claims (27)

  1. An audio-visual analysing system for automated presentation delivery feedback generation comprising a computing device having a: processor and memory device operably coupled thereto, the memory device comprising computer program code instructions and associated data fetched, interpreted and executed by the processor in use; a video interface interfacing a video camera; an audio interface interfacing a microphone; wherein the computer program code instructions comprise: a video analysis controller which analyses video captured by the video camera to generate a video analysis feedback score; and an audio analysis controller which analyses audio captured by the microphone to generate an audio analysis feedback score.
  2. The system as claimed in claim 1, wherein the computer program code instructions further comprise a templating controller which generates a presentation data structure using at least one presentation template.
  3. The system as claimed in claim 1, wherein at least one parameter is stored in relation to the presentation data structure.
  4. The system as claimed in claim 2, wherein the system further comprises a display interface interfacing a digital display and wherein the system displays a user interface configured according to the presentation data structure whilst the video and audio is captured.
  5. The system as claimed in claim 1, wherein video analysis comprises gaze detection.
  6. The system as claimed in claim 5, wherein gaze detection comprises segmentation to segment eye regions and contour finding to identify an iris within each segment.
  7. The system as claimed in claim 6, wherein segmentation comprises selecting eye regions according to detected facial key points.
  8. The system as claimed in claim 6, wherein segmentation comprises binary masking.
  9. The system as claimed in claim 8, wherein binary masking comprises adaptive binary masking thresholding.
  10. The system as claimed in claim 9, wherein adaptive binary masking thresholding comprises adapting a threshold until two regions are detected.
  11. The system as claimed in claim 5, wherein gaze detection comprises assigning a detected gaze to a plurality of gaze regions.
  12. The system as claimed in claim 11, wherein the gaze regions comprise a central region, upper region, lower region and side regions.
  13. The system as claimed in claim 12, wherein each region is associated with a score and wherein the video analysis comprises adjusting the video analysis feedback score depending on the score associated with each region.
  14. The system as claimed in claim 13, wherein video analysis further comprises adjusting the video analysis feedback score depending on a time period associated with each region.
  15. The system as claimed in claim 1, wherein audio analysis comprises volume measurement.
  16. The system as claimed in claim 15, wherein the audio analysis controller adjusts the audio analysis feedback score with reference to a target volume threshold range.
  17. The system as claimed in claim 16, wherein the target volume threshold range depends on a presentation parameter.
  18. The system as claimed in claim 1, wherein the audio analysis comprises speech-to-text to convert the audio to words, each having a timing marker associated therewith, and pace measurement which determines a pace according to the timing marker associated with each word.
  19. The system as claimed in claim 18, wherein the audio analysis controller adjusts the audio analysis feedback score with reference to a target pace threshold range.
  20. The system as claimed in claim 19, wherein the target pace threshold range depends on a presentation parameter.
  21. The system as claimed in claim 1, wherein the audio analysis comprises speech-to-text to convert the audio to words, each having a timing marker associated therewith, and pause measurement which determines pauses between words according to the timing markers associated with each word.
  22. The system as claimed in claim 21, wherein the audio analysis controller adjusts the audio analysis feedback score with reference to a target pause threshold range.
  23. The system as claimed in claim 22, wherein the target pause threshold range depends on a presentation parameter.
  24. The system as claimed in claim 1, wherein the audio analysis comprises speech-to-text to convert the audio to words and filler word detection which detects filler words from a filler word dictionary.
  25. The system as claimed in claim 24, wherein the audio analysis controller adjusts the audio analysis feedback score proportionate to a number of detected filler words.
  26. The system as claimed in claim 1, wherein the computer program code instructions comprise a text-to-speech controller which converts text of a presentation as defined by a presentation structure to speech.
  27. The system as claimed in claim 26, wherein the text-to-speech controller converts the text depending on at least one parameter configured for the presentation structure.
AU2021104873A 2021-02-25 2021-08-03 An audio-visual analysing system for automated presentation delivery feedback generation Ceased AU2021104873A4 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2021900520A AU2021900520A0 (en) 2021-02-25 An audio-visual analysing system for automated presentation delivery feedback generation
AU2021900520 2021-02-25

Publications (1)

Publication Number Publication Date
AU2021104873A4 (en) 2021-09-30

Family

ID=77857775

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2021104873A Ceased AU2021104873A4 (en) 2021-02-25 2021-08-03 An audio-visual analysing system for automated presentation delivery feedback generation

Country Status (2)

Country Link
AU (1) AU2021104873A4 (en)
WO (1) WO2022178587A1 (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050119894A1 (en) * 2003-10-20 2005-06-02 Cutler Ann R. System and process for feedback speech instruction
DK2012304T3 (en) * 2007-07-06 2012-11-19 Zero To One Technology Comscope Methods for electronic analysis of a dialogue and similar systems
US20170213190A1 (en) * 2014-06-23 2017-07-27 Intervyo R&D Ltd. Method and system for analysing subjects
US10446055B2 (en) * 2014-08-13 2019-10-15 Pitchvantage Llc Public speaking trainer with 3-D simulation and real-time feedback
US20180025303A1 (en) * 2016-07-20 2018-01-25 Plenarium Inc. System and method for computerized predictive performance analysis of natural language
WO2019017922A1 (en) * 2017-07-18 2019-01-24 Intel Corporation Automated speech coaching systems and methods
US10963841B2 (en) * 2019-03-27 2021-03-30 On Time Staffing Inc. Employment candidate empathy scoring system

Also Published As

Publication number Publication date
WO2022178587A1 (en) 2022-09-01


Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry