US20180375993A1 - Transcribing media files - Google Patents
- Publication number
- US20180375993A1 (application US 16/116,671)
- Authority
- US
- United States
- Prior art keywords
- user device
- message
- transcript
- media file
- media
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04M—TELEPHONIC COMMUNICATION
- H04M3/42391—Systems providing special services or facilities to subscribers where the subscribers are hearing-impaired persons, e.g. telephone devices for the deaf
- H04M7/0051—Services and arrangements where telephone services are combined with data services where the data service is a multimedia messaging service
- H04M2201/60—Medium conversion
- H04M2203/4527—Voicemail attached to other kind of message
- H04M2203/4581—Sending message identifiers instead of whole messages
Definitions
- the embodiments discussed herein are related to transcribing media files of multimedia messages.
- Modern telecommunication services provide features to assist those who are deaf or hearing-impaired.
- One such feature is a text captioned telephone system for the hearing impaired.
- a text captioned telephone system may be a telecommunication intermediary service that is intended to permit a hearing-impaired user to utilize a normal telephone network.
- a computer-implemented method to provide transcriptions of a multimedia message may include receiving, at a server, a message with an attached media file.
- the message may be directed to a user device and the server may be configured to receive and direct messages to the user device.
- the server may also be configured to separate the media file from the message before the message is provided to the user device.
- the method may further include generating, at a transcription system, a transcript of audio data in the media file and providing the message to the user device for presentation of the message on the user device.
- the method may further include, providing the transcript and the media file to the user device for presentation of the transcript and the media file on the user device.
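As a non-limiting sketch, the computer-implemented method summarized above may be modeled as a short pipeline: receive the message, separate the media file, transcribe its audio, and return the pieces that would be delivered to the user device. The message model and function names below are illustrative assumptions, not the claimed implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MultimediaMessage:
    text: str
    media_file: Optional[bytes]  # attached audio or video data, if any

def transcribe(audio: bytes) -> str:
    # stand-in for the transcription system; a real system would run
    # speech recognition over the audio data in the media file
    return "<transcript of {} bytes of audio>".format(len(audio))

def handle_message(msg: MultimediaMessage):
    """separate the media file from the message, generate a transcript,
    and return the three items provided to the user device."""
    media = msg.media_file
    stripped = MultimediaMessage(text=msg.text, media_file=None)
    transcript = transcribe(media) if media is not None else None
    return stripped, media, transcript
```

In this sketch the stripped message, the media file, and the transcript are returned separately, mirroring the separate deliveries described in the method.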
- FIG. 1 illustrates a first example environment related to providing transcriptions of a multimedia message
- FIG. 2 illustrates a second example environment related to providing transcriptions of a multimedia message
- FIG. 3 illustrates a third example environment related to providing transcriptions of a multimedia message
- FIG. 4 illustrates an example system that may be used in providing transcriptions of a multimedia message
- FIG. 5 is a flowchart of an example computer-implemented method to provide transcriptions of a multimedia message
- FIG. 6 is a flowchart of another example computer-implemented method to provide transcriptions of a multimedia message.
- FIG. 7 illustrates an example communication system that may provide transcriptions of a multimedia message.
- a device, for example a smartphone, may send a multimedia message, which may include a media file, over a network to a user device.
- a user of the user device may be hearing-impaired.
- the user may not be able to fully understand audio received as part of the media file included in the multimedia message.
- the audio may be voice data generated by another device that is attached to the multimedia message.
- the audio may be audio from a video file attached to the multimedia message.
- the multimedia message may be sent to a processing system before being provided to the user device.
- the processing system may be configured to separate the media file from the multimedia message prior to the multimedia message being delivered to the user device.
- the media file may be provided to a transcription system.
- the transcription system may be configured to transcribe the audio from the media file and send a transcript of the audio to the user device for presentation to the user.
- the transcription system may also be configured to send the media file of the multimedia message to the user device for presentation to the user.
- the transcript may assist the user to understand the audio from the media file.
- the processing system may send the multimedia message to the user device for presentation of the multimedia message on the user device along with the media file.
- the systems and/or methods described in this disclosure may help to enable the transcription of a media file attached to a multimedia message received at a user device or other devices.
- the systems and/or methods provide at least a technical solution to a technical problem associated with the design of user devices in the technology of telecommunications.
- FIG. 1 illustrates a first example environment 100 related to providing transcripts of a multimedia message.
- the environment 100 may be arranged in accordance with at least one embodiment described in the present disclosure.
- the environment 100 may include a communication system 170 including a processing system 110 and a transcription system 130 , and a user device 160 .
- the communication system 170 may be configured to direct multimedia messages 102 a to the user device 160 .
- the communication system 170 may be configured to direct messages from the user device 160 .
- the user device 160 may be configured to receive messages only through the communication system 170 and to send messages only through the communication system 170 .
- the communication system 170 may be a host system that is configured to receive messages destined for the user device 160 and relay the messages to the user device 160 .
- the network address of the user device 160 may be such that the multimedia message 102 a is routed through the communication system 170 to the user device 160 .
- the user device 160 may be any electronic or digital device.
- the user device 160 may be a smartphone, a mobile phone, a tablet computer, a laptop computer, a desktop computer, a phone console, or other processing device.
- the user device 160 may include or be a phone console.
- the user device 160 may include a video screen and a speaker system.
- the user device 160 may be configured to present the message 102 b, to present the media file 104 , and to present the transcript 136 for viewing and/or listening on the user device 160 .
- the user device 160 may be configured to present the transcript 136 during playback of the media file 104 on the user device 160 .
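Presenting the transcript during playback of the media file can be modeled as looking up the transcript segment that covers the current playback position. The segment format below (start time, end time, text) is an assumed convention for illustration; the disclosure does not specify one.

```python
def caption_at(segments, position):
    """return the transcript text to present at the given playback
    position (in seconds), or None if no segment covers it.
    each segment is an assumed (start, end, text) triple."""
    for start, end, text in segments:
        if start <= position < end:
            return text
    return None
```

A user device application could call this once per display refresh while the media file plays, so the transcript tracks the audio.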
- the user device 160 may be configured to interact with the multimedia message 102 a through the communication system 170 .
- the processing system 110 may include any configuration of hardware, such as one or more processors, servers, and databases that are networked together and configured to perform a task.
- the processing system 110 may include multiple computing systems, such as multiple servers that each include memory and at least one processor, which are networked together and configured to perform operations as described in this disclosure, among other operations.
- the processing system 110 may include computer-readable instructions that are configured to be executed by the processing system 110 to perform operations described in this disclosure.
- the processing system 110 may be configured to receive a multimedia message 102 a and direct the multimedia message 102 a to a user device 160 .
- the multimedia message 102 a may be any electronic or digital message.
- the multimedia message 102 a may include a Multimedia Messaging Service (MMS) message, an email message, or another messaging type that may include a media file 104 .
- the media file 104 may be included in the multimedia message 102 a by being attached to the multimedia message 102 a.
- the multimedia message 102 a may be an MMS message that includes text and an attached video file or an attached audio file as the media file 104 .
- the multimedia message 102 a may be sent from any user or any technology device.
- the multimedia message 102 a may be sent from a smartphone, a mobile phone, a tablet computer, a laptop computer, a desktop computer, a phone console, or other processing device.
- the multimedia message 102 a may be received by the processing system 110 over a network.
- the network may include a peer-to-peer network.
- the network may also be coupled to or may include portions of a telecommunications network for sending data in a variety of different communication protocols.
- the network may include Bluetooth® communication networks or cellular communication networks for sending and receiving communications and/or data including via short message service (SMS), multimedia messaging service (MMS), hypertext transfer protocol (HTTP), direct data connection, wireless application protocol (WAP), e-mail, etc.
- the network may also include a mobile data network that may include third-generation (3G), fourth-generation (4G), long-term evolution (LTE), long-term evolution advanced (LTE-A), Voice-over-LTE (“VoLTE”) or any other mobile data network or combination of mobile data networks.
- the network may include one or more IEEE 802.11 wireless networks, optical networks, a conventional type network, a wired network, and may have numerous different configurations.
- the processing system 110 may be configured to separate the media file 104 from the multimedia message 102 a. As a result, the processing system 110 may separately transmit the media file 104 and a message 102 b without the media file 104 .
- the message 102 b may be the multimedia message 102 a, but stripped of the media file 104 .
- the processing system 110 may be configured to send the media file 104 without the message 102 b to the transcription system 130 .
- the processing system 110 may also be configured to send the message 102 b to the user device 160 .
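For an email-style multimedia message, the separation step described above can be sketched with the standard library's email package: keep the text body as the stripped message and pull out the audio or video attachments. This is an illustrative sketch under assumed function names, not the patented processing system.

```python
from email.message import EmailMessage

def separate_media(raw_msg: EmailMessage):
    """return (message_without_media, media_payloads)."""
    stripped = EmailMessage()
    for header in ("From", "To", "Subject"):
        if raw_msg[header]:
            stripped[header] = raw_msg[header]
    media = []
    body_text = []
    # walk the MIME parts: keep text, pull out audio/video attachments
    for part in raw_msg.walk():
        ctype = part.get_content_type()
        if ctype.startswith(("audio/", "video/")):
            media.append(part.get_payload(decode=True))
        elif ctype == "text/plain":
            body_text.append(part.get_content())
    stripped.set_content("".join(body_text))
    return stripped, media
```

The stripped message would go to the user device while the extracted media payloads go to the transcription system, matching the two separate transmissions described above.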
- the transcription system 130 may be communicatively coupled to the processing system 110 .
- the transcription system 130 may be configured to receive the media file 104 from the processing system 110 .
- the transcription system 130 may be communicatively coupled to the processing system 110 over a network.
- the network may include any network or configuration of networks configured to send and receive communications between devices.
- the network may include a conventional type network, a wired or wireless network, and may have numerous different configurations.
- the network may include a local area network (LAN), a wide area network (WAN) (e.g., the Internet), or other interconnected data paths across which multiple devices and/or entities may communicate.
- the network over which the transcription system 130 is communicatively coupled to the processing system 110 may not be the same as the network over which the processing system 110 receives the multimedia message 102 a.
- the transcription system 130 may be configured to generate a transcript 136 by transcribing audio data of the media file 104 received from the processing system 110 .
- the transcription system 130 may include any configuration of hardware, such as processors, servers, and databases that are networked together.
- the transcription system 130 may include multiple computing systems, such as multiple servers that each include memory and at least one processor, which are networked together and configured to perform operations as described in this disclosure, among other operations.
- the transcription system 130 may include computer-readable instructions that are configured to be executed by the transcription system 130 to perform operations described in this disclosure.
- the transcription system 130 may be configured to transcribe audio data of the media file 104 received from the processing system 110 to generate a transcript 136 of the audio data.
- a call assistant may listen to the audio data and “revoice” the words of the audio data to a speech recognition computer program tuned to the voice of the call assistant.
- the call assistant may be an operator who serves as a human intermediary between a hearing impaired user and the media file 104 .
- the transcript 136 may be generated by the speech recognition computer.
- the media file 104 may be sent to a speech recognition computer program without “revoicing” of the audio data of the media file 104 by a call assistant or other human intermediary.
- the audio data of the media file 104 may be sent directly to a speech recognition computer program which may generate the transcript 136 without the use of a call assistant or human intermediary.
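The fully automatic path described above, with no revoicing by a call assistant, amounts to handing the audio data straight to a recognition engine. The engine interface below is a hypothetical stand-in; a real deployment would wrap an actual speech recognizer behind it.

```python
class RecognitionEngine:
    """hypothetical speech recognition interface (an assumption,
    not an API named in the disclosure)."""
    def recognize(self, audio: bytes) -> str:
        raise NotImplementedError

def generate_transcript(audio: bytes, engine: RecognitionEngine) -> str:
    # no call assistant or revoicing step: the audio data of the
    # media file goes directly to the recognition engine
    return engine.recognize(audio)
```

The revoiced path would differ only in what sits behind `recognize`: a recognizer tuned to the call assistant's voice rather than to the original speaker.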
- the transcription system 130 may be configured to provide the generated transcript 136 to the user device 160 over a network.
- the transcription system 130 may also be configured to provide the media file 104 to the user device 160 over a network.
- the network over which the media file 104 may be provided to the user device 160 may not be the same network as the network over which the multimedia message 102 a is received by the processing system 110 .
- the network over which the media file 104 may be provided to the user device 160 may or may not be the same network as the network over which the media file 104 is provided to the transcription system 130 .
- the user device 160 may be configured to present the generated transcripts 136 .
- the generated transcripts 136 may be displayed on a display of the user device 160 .
- the user device 160 may be configured to present the media file 104 on a display of the user device 160 .
- the video of the audiovisual file may be presented on a display of the user device 160 .
- the user device 160 may be configured to present the audio file, such as through one or more speakers, and present an image on a display of the user device 160 .
- the image may be a picture of a contact that sent the multimedia message 102 a to the user device 160 .
- the image may be a picture of a musical note to denote that the media file 104 is an audio file.
- a multimedia message 102 a may be sent from a device and directed to the user device 160 .
- the multimedia message may be sent via one or more networks.
- the networks may include cellular networks, the Internet, and other wireless or wired networks.
- the network address of the user device 160 may be such that the multimedia message 102 a is provided to the communication system 170 .
- the multimedia message 102 a may be provided to the processing system 110 before the multimedia message 102 a is sent to the user device 160 .
- an email with an attached video file may be sent from a laptop computer to a user device 160 .
- an MMS message with an audio file may be sent from a smartphone to a user device 160 .
- the user device 160 may be used by a user that is hearing-impaired.
- a “hearing-impaired user” may refer to a person with diminished hearing capabilities. Hearing-impaired users often retain some level of hearing ability that has diminished over time, such that the hearing-impaired user can communicate by speaking but often struggles to hear and/or understand others.
- the multimedia message 102 a may include an attached media file 104 that may include audio data.
- the audio data and the attached media file 104 may originate from any other device.
- the media file 104 may be a video recorded on a tablet computer.
- the media file 104 may be audio data recorded on a smartphone.
- the audio data may be based on a voice signal from a user of a smartphone device.
- the voice signal may be words spoken by the user of the smartphone device prior to sending the multimedia message 102 a to the user device 160 .
- the processing system 110 may separate the media file 104 from the multimedia message 102 a.
- the processing system 110 may send the media file 104 to the transcription system 130 .
- the processing system 110 may send the media file 104 received from a user or from the user device 160 to the transcription system 130 via one or more networks.
- the transcript 136 of the audio data of the media file 104 may be provided over a network to the user device 160 .
- the media file 104 may be provided over a network to the user device 160 from the transcription system 130 .
- the media file 104 may be provided to the user device 160 together with the message 102 b.
- the user device 160, by way of an application associated with the transcription system 130 and an electronic display, may display the transcript 136 while the media file 104 is also displayed on the electronic display.
- the media file 104 may be a video and the transcript 136 of the audio data of the media file 104 may be displayed while the video is being displayed on an electronic display of the user device 160 .
- the transcript 136 may allow the hearing-impaired user to supplement the audio data received from the user device 160 and confirm his or her understanding of the words spoken in the media file 104 .
- the environment 100 may be configured to provide the transcript 136 of the audio data in substantially real-time or real-time.
- the message 102 b may be presented on the user device 160 prior to the generation of a transcript 136 of the audio data of the media file at the transcription system 130 .
- the transcript 136 of the audio data of the media file may be provided to the user device 160 in less than 2, 3, 5, or 10 seconds after the audio data is presented to the user of the user device 160 by an audio output device.
- the transcript 136 may be generated prior to the transmission of the multimedia message to the user device 160 and the transcript 136 of the audio data may not be provided in substantially real-time.
- the environment 100 may be configured to provide transcripts of media files attached to multimedia messages 102 a directed to the user device 160 .
- the user device 160 may be associated with a hearing impaired user and may be in communication with the transcription system 130 .
- the media file 104 may capture words spoken by a variety of individuals in the format of an audiovisual recording or an audio recording.
- the processing system 110 may send the media file 104 to the user device 160 via the transcription system 130 and over a network.
- the media file 104 may be transmitted over a network and may be sent to the user device 160 as a media file for presentation to the user of the user device 160 as a normal media file.
- the processing system 110 may transmit the media file 104 received over a network to the transcription system 130 with an indication that the transcript 136 and the media file 104 be provided to the user device 160 .
- the transcription system 130 may transcribe the audio data of the media file 104 and provide the transcript 136 to the user device 160 .
- the user device 160 may present the transcript 136 along with the media file 104 on a display of the user device 160 for the hearing-impaired user of the user device 160 .
- the environment 100 may include additional devices similar to the user device 160 .
- the multimedia message 102 a may be directed to multiple user devices 160 .
- one or more of the communication couplings between the sender of the multimedia message 102 a and the user device 160 may be a wired connection.
- the processing system 110 and the user device 160 may be the same device.
- the processing system 110 may create a copy of the media file 104 and may not separate the media file 104 from the multimedia message 102 a.
- the audio data of the media file 104 may be obtained by the transcription system 130 via a link to a webpage on the Internet.
- the media file 104 may be a link to a media file or a link to a media stream.
- the processing system 110 may remove or make a copy of the link.
- the processing system 110 may send the link to the transcription system 130 . The link may be used by the transcription system 130 to obtain audio that may be transcribed.
- the transcription system 130 may use the link to access a webpage on the Internet to obtain the audio data of an audiovisual stream or of an audiovisual file.
- the link may or may not be sent by the transcription system 130 to the user device 160 .
- the transcription system 130 may send the link to the media stream to the user device 160 .
- the user device 160 may use the link to access a webpage on the Internet to obtain the audiovisual stream or the audiovisual file.
- the transcript 136 may be automatically linked with the link such that it is presented along with the media from the media link on the user device 160 .
- the transcript 136 may be presented on the user device 160 together with the media from the media link.
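When the "media file" is actually a link to a media file or media stream, the processing system must decide whether the payload is media data or a URL before dispatching it. A minimal sketch using the standard library, under the assumed convention that string payloads parsing as http(s) URLs are treated as links:

```python
from urllib.parse import urlparse

def is_media_link(media) -> bool:
    """treat string payloads that parse as http(s) URLs as links to
    media rather than media data itself (an assumed convention)."""
    if not isinstance(media, str):
        return False
    parts = urlparse(media)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

A link would then be copied or removed and forwarded to the transcription system, which could use it to fetch the audio to transcribe, as described above.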
- FIG. 2 illustrates a second example environment 200 related to providing transcripts of a multimedia message.
- the environment 200 may be arranged in accordance with at least one embodiment described in the present disclosure.
- the environment 200 may include a communication system 270 including a processing system 210 , a transcription system 230 , and a queue 220 , and a user device 260 .
- the communication system 270 may be configured to direct multimedia messages 202 a to the user device 260 .
- the communication system 270 may also be configured to direct messages from the user device 260 .
- the user device 260 may be configured to receive messages only through the communication system 270 and to send messages only through the communication system 270 .
- the communication system 270 may be a host system that is configured to receive messages for the user device 260 and relay the messages to the user device 260 .
- the network address of the user device 260 may be such that the multimedia message 202 a is routed through the communication system 270 to the user device 260 .
- the communication system 270 may include functionality similar to a Multimedia Messaging Service Center (MMSC).
- the communication system 270 may include functionality similar to an email exchange server.
- the communication system 270 may receive the multimedia messages 202 a from another server and be configured to relay the multimedia messages 202 a to the user device 260 .
- the user device 260 may be any electronic or digital device and may be analogous to the user device 160 of FIG. 1 , except the user device 260 may interact with the communication system 270 in a manner different than the user device 160 interacts with the communication system 170 of FIG. 1 .
- the processing system 210 may include an analogous configuration of hardware as the processing system 110 of FIG. 1 , but may perform one or more of the same tasks or one or more different tasks than the processing system 110 of FIG. 1 . Thus, further description of hardware of the processing system 210 is not provided with respect to FIG. 2 .
- the processing system 210 may be configured to receive a multimedia message 202 a and direct the multimedia message 202 a to a user device 260 .
- the multimedia message 202 a may be analogous to the multimedia message 102 a of FIG. 1 and may be received in an analogous manner, as such further description is not provided with respect to FIG. 2 .
- the processing system 210 may be configured to separate the media file 204 from the multimedia message 202 a. As a result, the processing system 210 may separately transmit the media file 204 and a message 202 b without the media file 204 .
- the multimedia messages 202 a may be an email.
- the processing system 210 may remove an attachment to the email that is a media file 204 .
- the multimedia messages 202 a may be an MMS that includes text and an embedded uniform resource locator (URL) for a temporary storage location of the media file 204 in a server with an HTTP front-end of a network that provided the multimedia messages 202 a to the processing system 210 .
- the processing system 210 may strip the URL from the MMS, such that the media file 204 is the URL and the message 202 b is the text from the MMS.
- the processing system 210 may also be configured to add a media tag 218 to the message 202 b.
- the message 202 b may include the multimedia message 202 a, but stripped of the media file 204 and with the addition of the media tag 218 .
- the processing system 210 may be configured to send the media file 204 without the message 202 b to the transcription system 230 .
- the processing system 210 may also be configured to send the message 202 b with the media tag 218 to the queue 220 .
- the media tag 218 may include information regarding the media file 204 stripped from the multimedia messages 202 a. The information may be used to properly associate the media file 204 with the message 202 b that has been stripped of the media file 204 .
- the media tag 218 may be configured to include metadata about the media file 204 and/or the message 202 b.
- the media tag 218 may include a name of the media file 204 and a length of the media file 204 .
- the media tag 218 may be a unique identifier of the media file 204 within the communication system 270 .
- the unique identifier may be used by the communication system 270 to locate and relay the media file 204 to the user device 260 or other devices that request the media file 204 using the unique identifier.
- the media tag 218 may include a file type of the media file 204 such as an audio file type or a video file type. In some embodiments, the media tag 218 may include information about a storage location of the media file 204 in the communication system 270 . In these and other embodiments, the media file 204 may be stored in a database that may be accessed by the processing system 210 and the transcription system 230 . Alternatively or additionally, the media tag 218 may include a link, such as a URL or type of location information, that may allow a device, such as the user device 260 , to retrieve the media file 204 and files associated with the media file 204 , such as a transcript of the media file 204 .
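The media tag fields described above (name, length, file type, unique identifier, and an optional retrieval link) can be modeled as a small record. The field names below are illustrative assumptions, not terms from the claims.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MediaTag:
    media_id: str       # unique identifier of the media file within the system
    file_name: str      # name of the separated media file
    length_seconds: float
    file_type: str      # e.g. "audio" or "video"
    storage_url: Optional[str] = None  # link for retrieving the media file

    def describe(self) -> str:
        return "{} ({}, {:.1f}s)".format(self.file_name, self.file_type,
                                         self.length_seconds)
```

A processing system could attach such a record to the stripped message, and a user device could later use `media_id` or `storage_url` to request the media file and its transcript.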
- the transcription system 230 may be communicatively coupled to the processing system 210 and may be configured to receive the media file 204 from the processing system 210 .
- the transcription system 230 may not physically receive the media file 204 from the processing system 210 , but may receive a storage location of the media file 204 in a database in the communication system 270 .
- the transcription system 230 may receive a link to the media file 204 .
- the transcription system 230 may be configured to generate a transcript 236 by transcribing audio data of the media file 204 .
- the transcription system 230 may include a hardware configuration analogous to hardware configuration of the transcription system 130 of FIG. 1 , but may perform one or more of the same tasks or one or more different tasks than the transcription system 130 of FIG. 1 . Thus, further description of hardware of the transcription system 230 and operations analogous to the operations performed by the transcription system 130 are not provided with respect to FIG. 2 .
- the transcription system 230 may be configured to provide an indication of completion 238 to the queue 220 over a network in response to completing the transcript 236 of the audio of the media file 204 .
- the network over which the indication of completion 238 may be provided to the queue 220 may be the same network as the network over which the media file 204 is received by the transcription system 230 .
- the indication of completion 238 may be a signal to the queue 220 that the transcript 236 of the audio of the media file 204 is complete.
- the indication of completion 238 may help enable the user device 260 to present the message 202 b, the media file 204 , and the transcript 236 to a user of the user device 260 at a single time as will be explained hereafter.
- the queue 220 may be communicatively coupled to the processing system 210 .
- the queue 220 may be configured to receive the message 202 b with the media tag 218 from the processing system 210 .
- the queue 220 may be communicatively coupled to the processing system 210 over a network.
- the network over which the queue 220 is communicatively coupled to the processing system 210 may not be the same as the network over which the processing system 210 receives the multimedia message 202 a.
- the network over which the queue 220 is communicatively coupled to the processing system 210 may be the same as the network over which the processing system 210 provides the media file 204 to the transcription system 230 .
- the queue 220 may be configured to retain the message 202 b with the media tag 218 until the queue 220 receives the indication of completion 238 from the transcription system 230 .
- the queue 220 may provide the message 202 b with the media tag 218 to the user device 260 .
- the queue 220 may prevent the user device 260 from being aware of the multimedia messages 202 a or the message 202 b until after completion of the transcript 236 .
- Retaining the message 202 b with the media tag 218 may facilitate the simultaneous presentation of the media file 204 and the transcript 236 on the user device 260 .
- the generation of the transcript 236 may not occur immediately after the transcription system 230 receives the media file 204 .
- until the generation of the transcript 236 is complete, the transcript 236 may be unavailable.
- Retaining the message 202 b at the queue 220 until the transcript 236 is generated may help the communication system 270 to ensure that the transcript 236 is available for presentation by the user device 260 as soon as the user device 260 receives the message 202 b and is able to present the media file 204 .
- the queue 220 may facilitate the simultaneous presentation of the media file 204 and the transcript 236 on the user device 260 .
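The hold-and-release behavior of the queue 220 described above can be sketched as follows. This is a minimal illustration with hypothetical names (HoldingQueue, enqueue, indicate_completion); the disclosure does not specify an implementation.

```python
class HoldingQueue:
    """Retains stripped messages until their transcripts are complete."""

    def __init__(self):
        self._held = {}        # media id -> message awaiting its transcript
        self._released = []    # messages ready for delivery to the user device

    def enqueue(self, media_id, message_with_tag):
        # Hold the message (202b with media tag 218) pending transcription.
        self._held[media_id] = message_with_tag

    def indicate_completion(self, media_id):
        # Called when the transcription system signals completion (238):
        # release the held message toward the user device.
        message = self._held.pop(media_id, None)
        if message is not None:
            self._released.append(message)
        return message

    def deliverable(self):
        # Messages that may now be provided to the user device.
        return list(self._released)
```

Keying the held messages by a media identifier lets a completion indication for one media file release exactly the message that references it.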
- the user device 260 may be configured to receive the message 202 b with the media tag 218 . After receiving the message 202 b with the media tag 218 , the user device 260 may be configured to generate a request for media 268 using the information in the media tag 218 . The request for media 268 may be directed to the transcription system 230 and may be configured to allow the transcription system 230 to determine the media file 204 that is associated with the message 202 b and the media tag 218 and the transcript 236 associated with the media file 204 .
- the user device 260 may generate and send the request for media 268 in response to receiving the message 202 b from the queue 220 .
- the user device 260 may generate and send the request for media 268 in response to a user interaction with the user device 260 .
- the user device 260 may be configured to provide an indication that the message 202 b is available on the user device 260 .
- an indicator on the user device 260 may be used to alert a user that the message 202 b is available on the user device 260 .
- the user may interact with the message 202 b.
- the user device 260 may provide the request for media 268 in response to the user interacting with the message 202 b on the user device 260 .
- interacting with the message may include opening the message 202 b, selecting an option to download or stream the media file 204 on the user device 260 , among other interactions with the message 202 b.
- the transcription system 230 may be configured to provide the generated transcript 236 to the user device 260 over a network in response to receiving a request for media 268 from the user device 260 .
- the transcription system 230 may be configured to use the information from the request for media 268 to locate the media file 204 and the transcript 236 of the media file 204 .
- the transcription system 230 may be configured to provide the media file 204 and the transcript 236 to the user device 260 over a network in response to receiving a request for media 268 from the user device 260 .
- the network over which the media file 204 may be provided to the user device 260 may not be the same network as the network over which the multimedia message 202 a is received by the processing system 210 .
- the network over which the media file 204 may be provided to the user device 260 may or may not be the same network as the network over which the media file 204 is provided to the transcription system 230 .
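One way the lookup described above might work is sketched below, assuming the media tag carries a media identifier that keys both the stored media file and its transcript; all names and structures are illustrative, not part of the disclosure.

```python
class TranscriptionStore:
    """Resolves a request for media into the media file and its transcript."""

    def __init__(self):
        self._media = {}        # media id -> media file (or a link to it)
        self._transcripts = {}  # media id -> generated transcript

    def store(self, media_id, media_file, transcript):
        self._media[media_id] = media_file
        self._transcripts[media_id] = transcript

    def handle_request_for_media(self, request):
        # Use the identifier from the request (derived from the media tag)
        # to locate both the media file and the transcript associated with it.
        media_id = request["media_id"]
        return {
            "media_file": self._media[media_id],
            "transcript": self._transcripts[media_id],
        }
```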
- the user device 260 may be configured to present the media file 204 and the transcript 236 .
- the generated transcript 236 may be displayed on a display of the user device 260 .
- the user device 260 may be configured to present the media file 204 on a display of the user device 260 .
- the video of the audiovisual file may be presented on a display of the user device 260 .
- the user device 260 may be configured to present the audio file, such as through one or more speakers, and present an image on a display of the user device 260 .
- a multimedia message 202 a may be sent from a device and directed to a user device 260 .
- the processing system 210 may separate the media file 204 from the multimedia message 202 a.
- the processing system 210 may generate a media tag 218 with an identifier of the media file 204 for the communication system 270 .
- the processing system 210 may attach the media tag 218 to the separated message 202 b.
- the processing system 210 may send the media file 204 to the transcription system 230 .
- the processing system 210 may send the media file 204 received from a user or from the user device 260 to the transcription system 230 via one or more networks.
- the processing system 210 may provide the message 202 b with the media tag 218 to the queue 220 .
- the message 202 b with the media tag 218 may remain in the queue 220 until the queue 220 receives the indication of completion 238 from the transcription system 230 in response to the transcription system 230 completing the transcript 236 of the media file 204 .
- the queue 220 may provide the message 202 b with the media tag 218 to the user device 260 .
- the user device 260 may send the request for media 268 based on the media tag 218 .
- the transcription system 230 may locate the media file 204 and the transcript 236 generated based on the media file 204 .
- the transcription system 230 may provide the media file 204 and the transcript 236 associated with the media file 204 to the user device 260 .
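The separation-and-tagging flow outlined in the steps above can be sketched end to end, under the assumption that the media tag is a generated identifier; transcribe() is a placeholder for actual speech recognition, and all names are illustrative.

```python
import uuid

def transcribe(media_file):
    # Placeholder: a real transcription system would run speech
    # recognition on the audio data of the media file.
    return "transcript of " + media_file["name"]

def process_multimedia_message(multimedia_message):
    """Separate the media file, tag the stripped message, and transcribe."""
    media_file = multimedia_message["media_file"]
    # Generate a media tag identifying the separated file.
    media_id = str(uuid.uuid4())
    # The stripped message (202b): the original message text with the
    # media tag attached in place of the media file itself.
    message = {"text": multimedia_message["text"], "media_tag": media_id}
    transcript = transcribe(media_file)
    return message, media_id, media_file, transcript
```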
- the environment 200 may include additional devices similar to the user device 260 .
- the multimedia message 202 a may be directed to multiple user devices 260 .
- the media tags 218 provided to each of the multiple user devices 260 may include information identifying a single media file 204 and a single transcript 236 that may be shared among the multiple user devices 260 such that multiple transcripts are not generated for the same media file 204 .
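Transcript sharing among multiple recipient devices can be illustrated with a simple cache keyed by media identifier, so that the transcription work happens only once per media file regardless of how many user devices request it (hypothetical names):

```python
class SharedTranscripts:
    """Generates at most one transcript per media file, shared by all tags."""

    def __init__(self, transcribe):
        self._transcribe = transcribe
        self._cache = {}   # media id -> transcript
        self.calls = 0     # how many times transcription actually ran

    def transcript_for(self, media_id, media_file):
        # Every media tag for the same message carries the same media id,
        # so later requests reuse the cached transcript.
        if media_id not in self._cache:
            self.calls += 1
            self._cache[media_id] = self._transcribe(media_file)
        return self._cache[media_id]
```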
- one or more of the communication couplings between the sender of the multimedia message 202 a and the user device 260 may be a wired connection.
- the processing system 210 and the user device 260 may be the same device.
- the processing system 210 may create a copy of the media file 204 and may not separate the media file 204 from the multimedia message 202 a.
- the transcription system 230 may provide the transcript 236 in response to the request for media 268 and may not send the media file 204 .
- the audio data of the media file 204 may be obtained by the transcription system 230 via a URL to a webpage on the Internet.
- the media file 204 may be a link to a media file or a link to a media stream.
- the transcription system 230 may provide the transcript 236 in response to the request for media 268 and may not send the media file 204 .
- the user device 260 may obtain the media file 204 from the queue 220 or from another source other than the communication system 270 .
- the environment 200 may not include the media tag 218 .
- the processing system 210 may separate the media file 204 from the multimedia message 202 a and may not attach the media tag 218 to the message 202 b.
- the transcription system 230 may send the indication of completion 238 to the queue 220 .
- the queue 220 may transmit the message 202 b without the media tag 218 to the user device 260 .
- the user device 260 may be configured to transmit the request for media 268 to the transcription system 230 in response to receiving the message 202 b from the queue 220 based on information in the message 202 b.
- the transcription system 230 may associate the media file 204 and the transcript 236 with the message 202 b based on a unique identifier from the multimedia message 202 a.
- the environment 200 may not include the queue 220 .
- the message 202 b may be sent directly to the user device 260 prior to the completion of the transcript 236 in the transcription system 230 .
- the transcription system 230 may provide the indication of completion 238 to the user device 260 in response to completing the transcript 236 of the audio of the media file 204 .
- the user device 260 may present an alert on a display that the message 202 b is available on the user device 260 .
- the user device 260 may present the message 202 b with the media tag 218 in response to receiving the indication of completion 238 from the transcription system 230 .
- Presenting the message 202 b with the media tag 218 in response to receiving the indication of completion 238 may facilitate the simultaneous presentation of the media file 204 and the transcript 236 on the user device 260 .
- the generation of the transcript 236 may cause a delay in the presentation of the transcript 236 on the user device 260 .
- not indicating a receipt of the message 202 b until receipt of the indication of completion 238 may facilitate the presentation of the transcript 236 with the media file 204 .
- FIG. 3 illustrates a third example environment 300 related to providing transcripts of a multimedia message.
- the environment 300 may be arranged in accordance with at least one embodiment described in the present disclosure.
- the environment 300 may include a communication system 370 that includes a processing system 310 , a transcription system 330 , and a combining system 340 , and a user device 360 .
- the communication system 370 may be configured to direct multimedia messages 302 a to the user device 360 .
- the communication system 370 may be configured to direct messages from the user device 360 .
- the user device 360 may be configured to receive messages only through the communication system 370 and to send messages only through the communication system 370 .
- the communication system 370 may be a host system that is configured to receive messages for the user device 360 and relay the messages to the user device 360 .
- the network address of the user device 360 may be such that the multimedia message 302 a is routed through the communication system 370 to the user device 360 .
- the user device 360 may be any electronic or digital device and may be analogous to the user device 160 of FIG. 1 , except the user device 360 may interact with the communication system 370 in a manner different than the user device 160 interacts with the communication system 170 of FIG. 1 .
- the processing system 310 may include an analogous configuration of hardware to that of the processing system 110 of FIG. 1 , but may perform one or more of the same tasks or one or more different tasks than the processing system 110 of FIG. 1 . Thus, further description of the hardware of the processing system 310 is not provided with respect to FIG. 3 .
- the processing system 310 may be configured to receive a multimedia message 302 a.
- the multimedia message 302 a may be analogous to the multimedia message 102 a of FIG. 1 and may be received in an analogous manner, as such further description is not provided with respect to FIG. 3 .
- the processing system 310 may be configured to separate the media file 304 from the multimedia message 302 a. As a result, the processing system 310 may separately transmit the media file 304 and a message 302 b without the media file 304 .
- the message 302 b may be the multimedia message 302 a, but stripped of the media file 304 .
- the processing system 310 may be configured to send the media file 304 without the message 302 b to the transcription system 330 .
- the processing system 310 may also be configured to send the message 302 b to a combining system 340 .
- the transcription system 330 may include an analogous configuration of hardware to that of the transcription system 130 of FIG. 1 , but may perform one or more of the same tasks or one or more different tasks than the transcription system 130 of FIG. 1 . Thus, further description of the hardware of the transcription system 330 and of operation analogous to the transcription system 130 is not provided with respect to FIG. 3 .
- the transcription system 330 may be configured to receive the media file 304 from the processing system 310 .
- the transcription system 330 may also be configured to generate a transcript 336 by transcribing audio data of the media file 304 received from the processing system 310 and to provide the generated transcript 336 to the combining system 340 .
- the transcription system 330 may also be configured to provide the media file 304 to the combining system 340 .
- the combining system 340 may be communicatively coupled to the processing system 310 and to the transcription system 330 .
- the combining system 340 may be configured to receive the media file 304 and the transcript 336 from the transcription system 330 .
- the combining system 340 may include any configuration of hardware, such as processors, servers, and databases that are networked together.
- the combining system 340 may include multiple computing systems, such as multiple servers that each include memory and at least one processor, which are networked together and configured to perform operations as described in this disclosure, among other operations.
- the combining system 340 may include computer-readable-instructions that are configured to be executed by the combining system 340 to perform operations described in this disclosure.
- the combining system 340 may be configured to combine the message 302 b received from the processing system 310 , the media file 304 received from the transcription system 330 , and the transcript 336 received from the transcription system 330 to generate a combined message 302 c.
- the combining system 340 may attach the media file 304 and the transcript 336 to the message 302 b to generate a combined message 302 c.
- the transcript 336 may be incorporated into the media file 304 as closed-captioning data. Alternatively or additionally, the transcript 336 may be separate from the media file 304 .
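Both variants just described (the transcript as a separate element of the combined message, or incorporated into the media file as closed-captioning data) can be sketched as follows; the dictionary structures are illustrative stand-ins for the message 302 b, the media file 304, and the transcript 336.

```python
def combine(message, media_file, transcript, as_closed_captions=False):
    """Merge the stripped message, media file, and transcript into one
    combined message, with the transcript either embedded as captions
    or attached as a separate element."""
    if as_closed_captions:
        # Incorporate the transcript into the media file as caption data,
        # so it is not a separate element of the combined message.
        media_file = dict(media_file, closed_captions=transcript)
        return {"text": message["text"], "media_file": media_file}
    # Keep the transcript as a separate element of the combined message.
    return {"text": message["text"],
            "media_file": media_file,
            "transcript": transcript}
```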
- the combining system 340 may be configured to provide the generated combined message 302 c to the user device 360 over a network.
- the network over which the combined message 302 c may be provided to the user device 360 may not be the same network as the network used for communication between the processing system 310 , the transcription system 330 , and the combining system 340 .
- the user device 360 may be configured to present the combined message 302 c. For example, the combined message 302 c may be presented on a display of the user device 360 .
- the media file 304 attached to the combined message 302 c may be selected.
- the user device 360 may be configured to present the media file 304 of the combined message 302 c on a display of the user device 360 .
- the user device 360 may be configured to present the transcript 336 of the combined message 302 c.
- the transcript 336 may be displayed on a display of the user device 360 .
- the transcript 336 may be selected from the combined message 302 c and may be presented on a display of the user device 360 .
- the environment 300 may include additional devices similar to the user device 360 .
- the multimedia message 302 a may be directed to multiple user devices 360 .
- one or more of the communication couplings between the sender of the multimedia message 302 a and the user device 360 may be a wired connection.
- the processing system 310 and the user device 360 may be the same device.
- the processing system 310 may create a copy of the media file 304 and may not separate the media file 304 from the multimedia message 302 a.
- the combining system 340 may be configured to insert the transcript 336 generated based on the media file 304 into the multimedia message 302 a to generate the message 302 c.
- the transcript 336 may be incorporated into the media file 304 as closed-captioning data and may not be a separate element of the combined message 302 c.
- the media file 304 may include a link to the transcript 336 .
- the communication system 370 may not include the processing system 310 .
- the transcription system 330 may receive the multimedia message 302 a and generate a transcript of the media file 304 .
- the combining system 340 may combine the transcript 336 with the multimedia messages 302 a to generate the message 302 c provided to the user devices 360 .
- FIG. 4 illustrates an example computing system 400 that may be arranged in accordance with at least one embodiment described in the present disclosure.
- the system 400 may include a processor 410 , a memory 412 , a data storage 414 , a communication unit 416 , a display 418 , a user interface 420 , and peripheral devices 422 , which all may be communicatively coupled.
- the system 400 may be part of any of the electronic devices described in this disclosure.
- the system 400 may be part of the processing system 110 of FIG. 1 , the transcription system 130 of FIG. 1 , and/or the user device 160 of FIG. 1 .
- the system 400 may also be part of system/components illustrated in FIGS. 2 and 3 .
- the processor 410 may include any suitable special-purpose or general-purpose computer, computing entity, or processing device including various computer hardware or software modules and may be configured to execute instructions stored on any applicable computer-readable storage media.
- the processor 410 may include a microprocessor, a microcontroller, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a Field-Programmable Gate Array (FPGA), or any other digital or analog circuitry configured to interpret and/or to execute program instructions and/or to process data, or any combination thereof.
- the processor 410 may include any number of processors distributed across any number of networks or physical locations that are configured to perform individually or collectively any number of operations described herein.
- the processor 410 may interpret and/or execute program instructions and/or process data stored in the memory 412 , the data storage 414 , or the memory 412 and the data storage 414 .
- the processor 410 may fetch program instructions from the data storage 414 and load the program instructions into the memory 412 .
- the processor 410 may execute the program instructions.
- the system 400 may be part of the processing system 110 of FIG. 1 .
- the program instructions may cause the processor 410 to perform the operations of separating a media file from a multimedia message.
- the system 400 may be part of the transcription system 130 of FIG. 1 .
- the program instructions may cause the processor 410 to perform the operations of generating a transcript of a media file.
- the memory 412 and the data storage 414 may include computer-readable storage media for carrying or having computer-executable instructions or data structures stored thereon.
- Such computer-readable storage media may be any available media that may be accessed by a general-purpose or special-purpose computer, such as the processor 410 .
- Such computer-readable storage media may include non-transitory computer-readable storage media including Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory devices (e.g., solid state memory devices), or any other storage media which may be used to carry or store desired program code in the form of computer-executable instructions or data structures and which may be accessed by a general-purpose or special-purpose computer. Combinations of the above may also be included within the scope of computer-readable storage media.
- Computer-executable instructions may include, for example, instructions and data configured to cause the processor 410 to perform a certain operation or group of operations.
- the communication unit 416 may include any component, device, system, or combination thereof that is configured to transmit or receive information over a network. In some embodiments, the communication unit 416 may communicate with other devices at other locations, the same location, or even other components within the same system.
- the communication unit 416 may include a modem, a network card (wireless or wired), an infrared communication device, a wireless communication device (such as an antenna), and/or chipset (such as a Bluetooth device, an 802.6 device (e.g., Metropolitan Area Network (MAN)), a WiFi device, a WiMax device, cellular communication facilities, etc.), plain old telephone service (POTS), and/or the like.
- the communication unit 416 may permit data to be exchanged with a network and/or any other devices or systems described in the present disclosure.
- the display 418 may be configured as one or more displays, such as an LCD, an LED display, or another type of display.
- the display 418 may be configured to present video, text captions, user interfaces, and other data as directed by the processor 410 .
- the display 418 may present a media file, a transcript, a multimedia message, among other information.
- the user interface 420 may include any device that allows a user to interface with the system 400 .
- the user interface 420 may include a mouse, a track pad, a keyboard, a touchscreen, a telephone switch hook, and/or a telephone keypad, among other devices.
- the user interface 420 may receive input from a user and provide the input to the processor 410 .
- the peripheral devices 422 may include one or more devices.
- the peripheral devices may include a microphone, an imager, and/or a speaker, among other peripheral devices.
- the microphone may be configured to capture audio.
- the imager may be configured to capture digital images. The digital images may be captured in a manner to produce video or image data.
- the speaker may broadcast audio received by the system 400 or otherwise generated by the system 400 .
- FIG. 5 is a flowchart of an example computer-implemented method to provide transcriptions of a multimedia message.
- the method 500 may be arranged in accordance with at least one embodiment described in the present disclosure.
- the method 500 may be performed, in whole or in part, in some embodiments by a system and/or environment, such as the environment 100 , the environment 200 , the environment 300 , the system 400 , and/or the communication system 700 of FIGS. 1, 2, 3, 4, and 7 , respectively.
- the method 500 may be performed based on the execution of instructions stored on one or more non-transitory computer-readable media.
- various blocks may be divided into additional blocks, combined into fewer blocks, or eliminated, depending on the desired implementation.
- the method 500 may begin at block 502 , where a message with an attached media file may be received at a server.
- the message may be directed to a user device.
- the server may be configured to receive and direct messages to the user device.
- the media file may be separated from the message before the message is provided to the user device.
- in response to the separation of the media file from the message, the message may be modified to include a tag to the media file.
- the media tag may include information regarding a storage location of the media file.
- the media file may be provided to a transcription system.
- a transcript of audio data in the media file may be generated at the transcription system.
- the generation of the transcript of the audio data may include modifying the media file to include closed-captioning data.
- the message with the tag may be provided to the user device for presentation of the message on the user device.
- the transcript and the media file may be provided to the user device for presentation of the transcript and the media file on the user device.
- the method 500 may further include combining the message, the transcript, and the media file; and providing the combination to the user device.
- the method 500 may include holding the message with the media tag in queue during the generation of the transcript such that the message with the media tag is not provided to the user device during the generation of the transcript.
- FIG. 6 is a flowchart of another example computer-implemented method to provide transcriptions of a multimedia message.
- the method 600 may be arranged in accordance with at least one embodiment described in the present disclosure.
- the method 600 may be performed, in whole or in part, in some embodiments by a system and/or environment, such as the environment 100 , the environment 200 , the environment 300 , the system 400 , and/or the communication system 700 of FIGS. 1, 2, 3, 4, and 7 , respectively.
- the method 600 may be performed based on the execution of instructions stored on one or more non-transitory computer-readable media.
- various blocks may be divided into additional blocks, combined into fewer blocks, or eliminated, depending on the desired implementation.
- the method 600 may begin at block 602 , where a message with an attached media file may be received at a server.
- the message may be directed to a user device.
- the server may be configured to receive and direct messages to the user device.
- the media file may be separated from the message before the message is provided to the user device.
- a transcript of audio data in the media file may be generated at a transcription system.
- the generation of the transcript of the audio data comprises modifying the media file to include closed-captioning data.
- the message may be provided to the user device for presentation of the message on the user device.
- the transcript and the media file may be provided to the user device for presentation of the transcript and the media file on the user device.
- the method 600 may further include in response to the separation of the media file from the message, modifying the message to include a media tag to the media file.
- the method 600 may further include combining the message, the transcript, and the media file and providing the combination to the user device.
- the method 600 may further include holding the message in queue until after the transcript of the audio data is generated. Alternatively or additionally, the method 600 may further include in response to the generation of the transcript of the audio data, providing an indication on the user device of receipt of the message.
- the media file may include information regarding a storage location of the transcript.
- the method 600 may further include combining the message and the media file and providing the combination to the user device.
- FIG. 7 illustrates an example communication system 700 that may provide transcriptions of a multimedia message.
- the communication system 700 may include an electronic device that is capable of sending a message.
- the communication system 700 may be arranged in accordance with at least one embodiment described in the present disclosure.
- the communication system 700 may include a first device 710 , a second device 720 , and a system 730 .
- the first device 710 and the system 730 may be communicatively coupled by a network 740 .
- the first device 710 and the second device 720 may be communicatively coupled by the network 740 .
- the network 740 may be any network or configuration of networks configured to send and receive communications between systems and devices.
- the network 740 may include a conventional type network, a wired or wireless network, and may have numerous different configurations. In some embodiments, the network 740 may also be coupled to or may include portions of a telecommunications network, including telephone lines, for sending data in a variety of different communication protocols, such as a plain old telephone system (POTS).
- the communication system 700 illustrated may be configured to facilitate an assisted call between a hearing-impaired user 702 and a second user 704 .
- a "hearing-impaired user" may refer to a person with diminished hearing capabilities. Hearing-impaired users often have some level of hearing ability that has usually diminished over a period of time, such that the hearing-impaired user can communicate by speaking but often struggles to hear and/or understand others.
- the communication system 700 illustrated may be configured to facilitate a call between a person with medical expertise and/or experience and the second user 704 .
- a “person with medical expertise and/or experience” may be a nurse, doctor, or some other trained medical professional.
- a communication session such as an audio or a video communication session, may be established between the first device 710 and the second device 720 .
- the communication session may be a captioning communication session.
- the system 730 may be an assistive service, which is intended to permit a hearing-impaired person to utilize a communication network and assist their understanding of a conversation by providing text captions to supplement voice conversation occurring during communication sessions with other devices, such as the second device 720 .
- the system 730 may be an assistive service to couple a person with medical expertise and/or experience with a person requesting medical assistance.
- the system 730 and the first device 710 may be communicatively coupled using networking protocols.
- the first device 710 may provide the audio signal from the second device 720 to the system 730 .
- a call assistant may listen to the audio signal of the second user 704 and “revoice” the words of the second user 704 to a speech recognition computer program tuned to the voice of the call assistant.
- the call assistant may be an operator who serves as a human intermediary between the hearing-impaired user 702 and the second user 704 .
- text captions may be generated by the speech recognition computer as a transcription of the audio signal of the second user 704 .
- the text captions may be provided to the first device 710 being used by the hearing-impaired user 702 over the network 740 .
- the first device 710 may display the text captions while the hearing-impaired user 702 carries on a normal conversation with the second user 704 .
- the text captions may allow the hearing-impaired user 702 to supplement the voice signal received from the second device 720 and confirm his or her understanding of the words spoken by the second user 704 .
- the second user 704 may be hearing impaired.
- the system 730 may provide text captions to the second device 720 based on audio data transmitted by the first device 710 .
- the system 730 may include additional functionality.
- the system 730 may edit the text captions or make other alterations to the text captions after presentation of the text captions on the first device 710 .
- the environments 100 , 200 , and/or 300 of FIGS. 1, 2, and 3 may be combined with the communication system 700 .
- the communication system 700 may facilitate live verbal captioning of a communication session and the transcription of media files of multimedia messages.
- a message with a media file may be sent from the second device 720 to the first device 710 .
- the system 730 may be configured to provide text captions to the media file of the multimedia message along with performing the operations described with respect to FIG. 7 .
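The live captioning flow described above, in which far-end audio is revoiced by a call assistant or fed to a speech recognition program and the resulting text captions are delivered to the hearing-impaired user's device, can be sketched as follows. This is an illustrative sketch only; the class and function names are hypothetical and not taken from the patent.

```python
# Hypothetical sketch of the captioning flow: audio from the far-end device is
# transcribed (by a recognizer, or by a call assistant's revoiced audio), and
# the resulting text captions are recorded for delivery to the user's device.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CaptionSession:
    """Tracks the text captions produced during one communication session."""
    captions: List[str] = field(default_factory=list)

    def deliver_caption(self, text: str) -> None:
        # In a real system this would be pushed to the user's device over the
        # network; here we simply record it.
        self.captions.append(text)


def caption_audio_chunks(
    chunks: List[str],
    recognize: Callable[[str], str],
    session: CaptionSession,
) -> None:
    """Run each audio chunk through a speech recognizer (or a call assistant's
    revoiced audio) and deliver the resulting text as captions."""
    for chunk in chunks:
        session.deliver_caption(recognize(chunk))


# Example with a stand-in recognizer that "transcribes" by upper-casing.
session = CaptionSession()
caption_audio_chunks(["hello", "how are you"], lambda a: a.upper(), session)
print(session.captions)  # ['HELLO', 'HOW ARE YOU']
```

The stand-in lambda takes the place of the speech recognition program tuned to the call assistant's voice; only the flow of audio in and captions out is being illustrated.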
- embodiments described herein may include the use of a special purpose or general purpose computer (e.g., the processor 410 of FIG. 4 ) including various computer hardware or software modules, as discussed in greater detail below. Further, as indicated above, embodiments described herein may be implemented using computer-readable media (e.g., the memory 412 or data storage 414 of FIG. 4 ) for carrying or having computer-executable instructions or data structures stored thereon.
- the different components, modules, engines, and services described herein may be implemented as objects or processes that execute on a computing system (e.g., as separate threads). While some of the systems and methods described herein are generally described as being implemented in software (stored on and/or executed by general purpose hardware), specific hardware implementations or a combination of software and specific hardware implementations are also possible and contemplated.
- any disjunctive word or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms.
- the phrase “A or B” should be understood to include the possibilities of “A” or “B” or “A and B.”
- first,” “second,” “third,” etc. are not necessarily used herein to connote a specific order or number of elements.
- the terms “first,” “second,” “third,” etc. are used to distinguish between different elements as generic identifiers. Absence a showing that the terms “first,” “second,” “third,” etc., connote a specific order, these terms should not be understood to connote a specific order. Furthermore, absence a showing that the terms first,” “second,” “third,” etc., connote a specific number of elements, these terms should not be understood to connote a specific number of elements.
- a first widget may be described as having a first side and a second widget may be described as having a second side.
- the use of the term “second side” with respect to the second widget may be to distinguish such side of the second widget from the “first side” of the first widget and not to connote that the second widget has two sides.
Abstract
Description
- This application is a continuation of U.S. patent application Ser. No. 15/380,589, filed on Dec. 15, 2016, the disclosure of which is incorporated herein by reference in its entirety.
- The embodiments discussed herein are related to transcribing media files of multimedia messages.
- Modern telecommunication services provide features to assist those who are deaf or hearing-impaired. One such feature is a text captioned telephone system for the hearing impaired. A text captioned telephone system may be a telecommunication intermediary service that is intended to permit a hearing-impaired user to utilize a normal telephone network.
- The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one example technology area where some embodiments described herein may be practiced.
- A computer-implemented method to provide transcriptions of a multimedia message is disclosed. The method may include receiving, at a server, a message with an attached media file. The message may be directed to a user device, and the server may be configured to receive and direct messages to the user device. The server may also be configured to separate the media file from the message before the message is provided to the user device. The method may further include generating, at a transcription system, a transcript of audio data in the media file and providing the message to the user device for presentation of the message on the user device. The method may further include providing the transcript and the media file to the user device for presentation of the transcript and the media file on the user device.
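The steps of the claimed method, receiving a message with an attached media file, separating the media file, generating a transcript, and providing the message, media file, and transcript to the user device, might be sketched as follows. All names here are hypothetical illustrations, not taken from the disclosure.

```python
# Illustrative sketch (not the patent's implementation) of the claimed method:
# the server separates the attached media file from the message, a transcription
# step produces a transcript of its audio data, and the message, media file, and
# transcript are all made available to the user device.
from typing import Dict, Tuple


def separate_media(message: Dict) -> Tuple[Dict, bytes]:
    """Split a multimedia message into its text part and its media file."""
    media = message.pop("media_file")
    return message, media


def transcribe(media: bytes) -> str:
    """Stand-in for the transcription system; a real system would run speech
    recognition on the audio data of the media file."""
    return f"<transcript of {len(media)} bytes of audio>"


def handle_multimedia_message(message: Dict) -> Dict:
    message, media = separate_media(message)
    transcript = transcribe(media)
    # Everything the user device needs for simultaneous presentation.
    return {"message": message, "media_file": media, "transcript": transcript}


result = handle_multimedia_message(
    {"to": "user-device", "text": "see attached", "media_file": b"\x00" * 4}
)
print(result["transcript"])  # <transcript of 4 bytes of audio>
```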
- Example embodiments will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
- FIG. 1 illustrates a first example environment related to providing transcriptions of a multimedia message;
- FIG. 2 illustrates a second example environment related to providing transcriptions of a multimedia message;
- FIG. 3 illustrates a third example environment related to providing transcriptions of a multimedia message;
- FIG. 4 illustrates an example system that may be used in providing transcriptions of a multimedia message;
- FIG. 5 is a flowchart of an example computer-implemented method to provide transcriptions of a multimedia message;
- FIG. 6 is a flowchart of another example computer-implemented method to provide transcriptions of a multimedia message; and
- FIG. 7 illustrates an example communication system that may provide transcriptions of a multimedia message.
- Some embodiments in this disclosure relate to a method and/or system that may transcribe multimedia messages. In some embodiments, a device, for example a smartphone, may send a multimedia message, which may include a media file, over a network to a user device. A user of the user device may be hearing-impaired. As a result, the user may not be able to fully understand audio received as part of the media file included in the multimedia message. For example, the audio may be voice data, generated by another device, that is attached to the multimedia message. As another example, the audio may be audio from a video file attached to the multimedia message.
- The multimedia message may be sent to a processing system before being provided to the user device. The processing system may be configured to separate the media file from the multimedia message prior to the multimedia message being delivered to the user device. After separating the media file from the multimedia message, the media file may be provided to a transcription system. In these and other embodiments, the transcription system may be configured to transcribe the audio from the media file and send a transcript of the audio to the user device for presentation to the user. The transcription system may also be configured to send the media file of the multimedia message to the user device for presentation to the user. The transcript may assist the user to understand the audio from the media file. The processing system may send the multimedia message to the user device for presentation of the multimedia message on the user device along with the media file.
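For an email-type multimedia message, the kind of separation the processing system performs can be illustrated with Python's standard email library. This is a minimal sketch under the assumption that the message is an ordinary MIME email with one audio attachment; it is not the patent's implementation, and the sample addresses and filenames are invented.

```python
# Minimal sketch: strip the media attachment out of an email-type multimedia
# message, keeping the attachment bytes for transcription and rebuilding the
# text-only message to forward to the user device.
from email.message import EmailMessage

# Build a sample multimedia message: text body plus an audio attachment.
msg = EmailMessage()
msg["To"] = "user@example.com"
msg.set_content("Listen to this recording.")
msg.add_attachment(
    b"fake-audio-bytes", maintype="audio", subtype="mpeg", filename="note.mp3"
)

# Separate the media file from the message.
attachments = list(msg.iter_attachments())
media_file = attachments[0].get_payload(decode=True)  # bytes for transcription

# Rebuild the message without the attachment.
stripped = EmailMessage()
stripped["To"] = msg["To"]
stripped.set_content(msg.get_body(preferencelist=("plain",)).get_content())

print(media_file)               # b'fake-audio-bytes'
print(stripped.is_multipart())  # False
```

The `media_file` bytes would go to the transcription system, while `stripped` is the message delivered to the user device.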
- In some embodiments, the systems and/or methods described in this disclosure may help to enable the transcription of a media file attached to a multimedia message received at a user device or other devices. Thus, the systems and/or methods provide at least a technical solution to a technical problem associated with the design of user devices in the technology of telecommunications.
- Turning to the figures, FIG. 1 illustrates a first example environment 100 related to providing transcripts of a multimedia message. The environment 100 may be arranged in accordance with at least one embodiment described in the present disclosure. The environment 100 may include a communication system 170 including a processing system 110 and a transcription system 130, and a user device 160.
- In some embodiments, the communication system 170 may be configured to direct multimedia messages 102 a to the user device 160. The communication system 170 may also be configured to direct messages from the user device 160. In some embodiments, the user device 160 may be configured to receive messages only through the communication system 170 and to send messages only through the communication system 170. In these and other embodiments, the communication system 170 may be a host system that is configured to receive messages destined for the user device 160 and relay the messages to the user device 160. For example, when the multimedia message 102 a is directed to the user device 160, the network address of the user device 160 may be such that the multimedia message 102 a is routed through the communication system 170 to the user device 160.
- The user device 160 may be any electronic or digital device. For example, the user device 160 may be a smartphone, a mobile phone, a tablet computer, a laptop computer, a desktop computer, a phone console, or other processing device. In some embodiments, the user device 160 may include or be a phone console. The user device 160 may include a video screen and a speaker system. The user device 160 may be configured to present the message 102 b, to present the media file 104, and to present the transcript 136 for viewing and/or listening on the user device 160. The user device 160 may be configured to present the transcript 136 during playback of the media file 104 on the user device 160. In some embodiments, the user device 160 may be configured to interact with the multimedia message 102 a through the communication system 170.
- In some embodiments, the processing system 110 may include any configuration of hardware, such as one or more processors, servers, and databases that are networked together and configured to perform a task. For example, the processing system 110 may include multiple computing systems, such as multiple servers that each include memory and at least one processor, which are networked together and configured to perform operations as described in this disclosure, among other operations. In some embodiments, the processing system 110 may include computer-readable instructions that are configured to be executed by the processing system 110 to perform operations described in this disclosure.
- The processing system 110 may be configured to receive a multimedia message 102 a and direct the multimedia message 102 a to a user device 160. The multimedia message 102 a may be any electronic or digital message. For example, the multimedia message 102 a may include a Multimedia Messaging Service (MMS) message, an email message, or another messaging type that may include a media file 104. In some embodiments, the media file 104 may be included in the multimedia message 102 a by being attached to the multimedia message 102 a. For example, the multimedia message 102 a may be an MMS message that includes text and an attached video file or an attached audio file as the media file 104. The multimedia message 102 a may be sent from any user or any technology device. For example, the multimedia message 102 a may be sent from a smartphone, a mobile phone, a tablet computer, a laptop computer, a desktop computer, a phone console, or other processing device.
- The multimedia message 102 a may be received by the processing system 110 over a network. In some embodiments, the network may include a peer-to-peer network. The network may also be coupled to or may include portions of a telecommunications network for sending data in a variety of different communication protocols. In some embodiments, the network may include Bluetooth® communication networks or cellular communication networks for sending and receiving communications and/or data, including via short message service (SMS), multimedia messaging service (MMS), hypertext transfer protocol (HTTP), direct data connection, wireless application protocol (WAP), e-mail, etc. The network may also include a mobile data network that may include third-generation (3G), fourth-generation (4G), long-term evolution (LTE), long-term evolution advanced (LTE-A), Voice-over-LTE ("VoLTE"), or any other mobile data network or combination of mobile data networks. Further, the network may include one or more IEEE 802.11 wireless networks, optical networks, a conventional type network, a wired network, and may have numerous different configurations.
- In some embodiments, the processing system 110 may be configured to separate the media file 104 from the multimedia message 102 a. As a result, the processing system 110 may separately transmit the media file 104 and a message 102 b without the media file 104. In these and other embodiments, the message 102 b may be the multimedia message 102 a, but stripped of the media file 104. The processing system 110 may be configured to send the media file 104 without the message 102 b to the transcription system 130. The processing system 110 may also be configured to send the message 102 b to the user device 160.
- The transcription system 130 may be communicatively coupled to the processing system 110. The transcription system 130 may be configured to receive the media file 104 from the processing system 110. In some embodiments, the transcription system 130 may be communicatively coupled to the processing system 110 over a network.
- In some embodiments, the network may include any network or configuration of networks configured to send and receive communications between devices. In some embodiments, the network may include a conventional type network, a wired or wireless network, and may have numerous different configurations. Furthermore, the network may include a local area network (LAN), a wide area network (WAN) (e.g., the Internet), or other interconnected data paths across which multiple devices and/or entities may communicate. In some embodiments, the network over which the transcription system 130 is communicatively coupled to the processing system 110 may not be the same as the network over which the processing system 110 receives the multimedia message 102 a.
- In some embodiments, the transcription system 130 may be configured to generate a transcript 136 by transcribing audio data of the media file 104 received from the processing system 110. In some embodiments, the transcription system 130 may include any configuration of hardware, such as processors, servers, and databases that are networked together. For example, the transcription system 130 may include multiple computing systems, such as multiple servers that each include memory and at least one processor, which are networked together and configured to perform operations as described in this disclosure, among other operations. In some embodiments, the transcription system 130 may include computer-readable instructions that are configured to be executed by the transcription system 130 to perform operations described in this disclosure.
- In some embodiments, the transcription system 130 may be configured to transcribe audio data of the media file 104 received from the processing system 110 to generate a transcript 136 of the audio data. In some embodiments, to transcribe the audio data, a call assistant may listen to the audio data and "revoice" the words of the audio data to a speech recognition computer program tuned to the voice of the call assistant. In these and other embodiments, the call assistant may be an operator who serves as a human intermediary between a hearing-impaired user and the media file 104. In some embodiments, the transcript 136 may be generated by the speech recognition computer. In some embodiments, the media file 104 may be sent to a speech recognition computer program without "revoicing" of the audio data of the media file 104 by a call assistant or other human intermediary. For example, the audio data of the media file 104 may be sent directly to a speech recognition computer program, which may generate the transcript 136 without the use of a call assistant or human intermediary.
- In some embodiments, the transcription system 130 may be configured to provide the generated transcript 136 to the user device 160 over a network. The transcription system 130 may also be configured to provide the media file 104 to the user device 160 over a network. The network over which the media file 104 may be provided to the user device 160 may not be the same network as the network over which the multimedia message 102 a is received by the processing system 110. The network over which the media file 104 may be provided to the user device 160 may or may not be the same network as the network over which the media file 104 is provided to the transcription system 130. In these and other embodiments, the user device 160 may be configured to present the generated transcripts 136. For example, the generated transcripts 136 may be displayed on a display of the user device 160.
- In response to the media file 104 being an audiovisual file, the user device 160 may be configured to present the media file 104 on a display of the user device 160. The video of the audiovisual file may be presented on a display of the user device 160. In response to the media file 104 being an audio file, the user device 160 may be configured to present the audio file, such as through one or more speakers, and present an image on a display of the user device 160. In some embodiments, the image may be a picture of a contact that sent the multimedia message 102 a to the user device 160. Alternatively or additionally, the image may be a picture of a musical note to denote that the media file 104 is an audio file.
- An example of the operation of the environment 100 follows. A multimedia message 102 a may be sent from a device and directed to the user device 160. The multimedia message may be sent via one or more networks. The networks may include cellular networks, the Internet, and other wireless or wired networks. The network address of the user device 160 may be such that the multimedia message 102 a is provided to the communication system 170. For example, the multimedia message 102 a may be provided to the processing system 110 before the multimedia message 102 a is sent to the user device 160. For example, an email with an attached video file may be sent from a laptop computer to a user device 160. Alternatively or additionally, an MMS message with an audio file may be sent from a smartphone to a user device 160.
- The user device 160 may be used by a user that is hearing-impaired. As used in the present disclosure, a "hearing-impaired user" may refer to a person with diminished hearing capabilities. Hearing-impaired users often have some level of hearing ability that has usually diminished over a period of time such that the hearing-impaired user can communicate by speaking, but the hearing-impaired user often struggles in hearing and/or understanding others.
- The multimedia message 102 a may include an attached media file 104 that may include audio data. The audio data and the attached media file 104 may originate from any other device. For example, the media file 104 may be a video recorded on a tablet computer. Alternatively or additionally, the media file 104 may be audio data recorded on a smartphone. The audio data may be based on a voice signal from a user of a smartphone device. For example, the voice signal may be words spoken by the user of the smartphone device prior to sending the multimedia message 102 a to the user device 160.
- The processing system 110 may separate the media file 104 from the multimedia message 102 a. The processing system 110 may send the media file 104 to the transcription system 130. The processing system 110 may send the media file 104 received from a user or from the user device 160 to the transcription system 130 via one or more networks.
- The transcript 136 of the audio data of the media file 104 may be provided over a network to the user device 160. The media file 104 may be provided over a network to the user device 160 from the transcription system 130. Alternatively or additionally, the media file 104 may be provided to the user device 160 together with the message 102 b. In some embodiments, the user device 160, by way of an application associated with the transcription system 130 and an electronic display, may display the transcript 136 while the media file 104 is also displayed on the electronic display. For example, the media file 104 may be a video, and the transcript 136 of the audio data of the media file 104 may be displayed while the video is being displayed on an electronic display of the user device 160. The transcript 136 may allow the hearing-impaired user to supplement the audio data received from the user device 160 and confirm his or her understanding of the words spoken in the media file 104.
- The environment 100 may be configured to provide the transcript 136 of the audio data in substantially real-time or real-time. In some embodiments, the message 102 b may be presented on the user device 160 prior to the generation of a transcript 136 of the audio data of the media file at the transcription system 130. For example, the transcript 136 of the audio data of the media file may be provided to the user device 160 in less than 2, 3, 5, or 10 seconds after the audio data is presented to the user of the user device 160 by an audio output device. In some embodiments, the transcript 136 may be generated prior to the transmission of the multimedia message to the user device 160, and the transcript 136 of the audio data may not be provided in substantially real-time. As described, the environment 100 may be configured to provide transcripts of media files attached to multimedia messages 102 a directed to the user device 160.
- In some embodiments, the user device 160 may be associated with a hearing-impaired user and may be in communication with the transcription system 130. In these and other embodiments, the media file 104 may capture words spoken by a variety of individuals in the format of an audiovisual recording or an audio recording. The processing system 110 may send the media file 104 to the user device 160 via the transcription system 130 and over a network. The media file 104 may be transmitted over a network and may be sent to the user device 160 as a media file for presentation to the user of the user device 160 as a normal media file. The processing system 110 may transmit the media file 104 received over a network to the transcription system 130 with an indication that the transcript 136 and the media file 104 be provided to the user device 160. The transcription system 130 may transcribe the audio data of the media file 104 and provide the transcript 136 to the user device 160. The user device 160 may present the transcript 136 along with the media file 104 on a display of the user device 160 for the hearing-impaired user of the user device 160.
- Modifications, additions, or omissions may be made to the environment 100 without departing from the scope of the present disclosure. For example, in some embodiments, the environment 100 may include additional devices similar to the user device 160. Alternatively or additionally, the multimedia message 102 a may be directed to multiple user devices 160. Alternatively or additionally, one or more of the communication couplings between the sender of the multimedia message 102 a and the user device 160 may be a wired connection. In some embodiments, the processing system 110 and the user device 160 may be the same device.
- Alternatively or additionally, the processing system 110 may create a copy of the media file 104 and may not separate the media file 104 from the multimedia message 102 a. Alternatively or additionally, the audio data of the media file 104 may be obtained by the transcription system 130 via a link to a webpage on the Internet. In some embodiments, the media file 104 may be a link to a media file or a link to a media stream. In these and other embodiments, the processing system 110 may remove or make a copy of the link. In these and other embodiments, the processing system 110 may send the link to the transcription system 130. The link may be used by the transcription system 130 to obtain audio that may be transcribed. For example, the transcription system 130 may use the link to access a webpage on the Internet to obtain the audio data of an audiovisual stream or of an audiovisual file. The link may or may not be sent by the transcription system 130 to the user device 160. In some embodiments, the transcription system 130 may send the link to the media stream to the user device 260. The user device 160 may use the link to access a webpage on the Internet to obtain the audiovisual stream or the audiovisual file. In some embodiments, the transcript 136 may be automatically linked with the link such that it is presented along with the media from the media link on the user device 160. The transcript 136 may be presented on the user device 160 together with the media from the media link.
FIG. 2 illustrates asecond example environment 200 related to providing transcripts of a multimedia message. Theenvironment 200 may be arranged in accordance with at least one embodiment described in the present disclosure. Theenvironment 200 may include acommunication system 270 including aprocessing system 210, atranscription system 230, and aqueue 220, and a user device 260. - In some embodiments, the
communication system 270 may be configured to directmultimedia messages 202 a to the user device 260. Thecommunication system 270 may also be configured to direct messages from the user device 260. In some embodiments, the user device 260 may be configured to receive messages only through thecommunication system 270 and to send messages only through thecommunication system 270. In these and other embodiments, thecommunication system 270 may be a host system that is configured to receive messages for the user device 260 and relay the messages to the user device 260. For example, when themultimedia message 202 a is directed to the user device 260, the network address of the user device 260 may be such that themultimedia message 202 a is routed through thecommunication system 270 to the user device 260. For example, when themultimedia message 202 a is a MMS, thecommunication system 270 may include functionality similar to Multimedia Messaging Service Center (MMSC). Alternatively or additionally, when themultimedia message 202 a is an email, thecommunication system 270 may include functionality similar to an email exchange server. In these and other embodiments, thecommunication system 270 may receive themultimedia messages 202 a from another server and be configured to relay themultimedia messages 202 a to the user device 260. - The user device 260 may be any electronic or digital device and may be analogous to the user device 160 of
FIG. 1 , except the user device 260 may interact with thecommunication system 270 in a manner different than the user device 160 interacts with thecommunication system 170 ofFIG. 1 . - In some embodiments, the
processing system 210 may include an analogous configuration of hardware as theprocessing system 110 ofFIG. 1 , but may perform one or more of the same tasks or one or more different tasks than theprocessing system 110 ofFIG. 1 . Thus, further description of hardware of theprocessing system 210 is not provided with respect toFIG. 2 . - The
processing system 210 may be configured to receive amultimedia message 202 a and direct themultimedia message 202 a to a user device 260. Themultimedia message 202 a may be analogous to themultimedia message 102 a ofFIG. 1 and may be received in an analogous manner, as such further description is not provided with respect toFIG. 2 . - In some embodiments, the
processing system 210 may be configured to separate the media file 204 from themultimedia message 202 a. As a result, theprocessing system 210 may separately transmit themedia file 204 and amessage 202 b without themedia file 204. - For example, the
multimedia messages 202 a may be an email. In these and other embodiments, theprocessing system 210 may remove an attachment to the email that is amedia file 204. Alternatively or additionally, themultimedia messages 202 a may be an MMS that includes text and an embedded uniform resource locator (URL) for a temporary storage location of themedia file 204 in a server with an HTTP front-end of a network that provided themultimedia messages 202 a to theprocessing system 210. In these and other embodiments, theprocessing system 210 may strip the URL from the MMS, such that themedia file 204 is the URL and themessage 202 b is the text from the MMS. - The
processing system 210 may also be configured to add amedia tag 218 to themessage 202 b. In these and other embodiments, themessage 202 b may include themultimedia message 202 a, but stripped of themedia file 204 and with the addition of themedia tag 218. Theprocessing system 210 may be configured to send themedia file 204 without themessage 202 b to thetranscription system 230. Theprocessing system 210 may also be configured to send themessage 202 b with themedia tag 218 to thequeue 220. - In some embodiments, the
media tag 218 may include information regarding themedia file 204 stripped from themultimedia messages 202 a. The information may be used to properly associate themedia file 204 with themessage 202 b that has been stripped of themedia file 204. For example, in some embodiments, themedia tag 218 may be configured to include metadata about themedia file 204 and/or themessage 202 b. For example, in some embodiments, themedia tag 218 may include a name of themedia file 204 and a length of themedia file 204. - In some embodiments, the
media tag 218 may be a unique identifier of themedia file 204 within thecommunication system 270. The unique identifier may be used by thecommunication system 270 to locate and relay themedia file 204 to the user device 260 or other devices that request themedia file 204 using the unique identifier. - In some embodiments, the
media tag 218 may include a file type of themedia file 204 such as an audio file type or a video file type. In some embodiments, themedia tag 218 may include information about a storage location of themedia file 204 in thecommunication system 270. In these and other embodiments, themedia file 204 may be stored in a database that may be accessed by theprocessing system 210 and thetranscription system 230. Alternatively or additionally, themedia tag 218 may include a link, such as a URL or type of location information, that may allow a device, such as the user device 260, to retrieve themedia file 204 and files associated with themedia file 204, such as a transcript of themedia file 204. - The
transcription system 230 may be communicatively coupled to theprocessing system 210 and may be configured to receive the media file 204 from theprocessing system 210. In some embodiments, thetranscription system 230 may not physically receive the media file 204 from theprocessing system 210, but may receive a storage location of themedia file 204 in a database in thecommunication system 270. For example, thetranscription system 230 may receive a link to themedia file 204. - In some embodiments, the
transcription system 230 may be configured to generate atranscript 236 by transcribing audio data of themedia file 204. Thetranscription system 230 may include a hardware configuration analogous to hardware configuration of thetranscription system 130 ofFIG. 1 , but may perform one or more of the same tasks or one or more different tasks than thetranscription system 130 ofFIG. 1 . Thus, further description of hardware of thetranscription system 230 and operations analogous to the operations performed by thetranscription system 130 are not provided with respect toFIG. 2 . - In some embodiments, the
transcription system 230 may be configured to provide an indication ofcompletion 238 to thequeue 220 over a network in response to completing thetranscript 236 of the audio of themedia file 204. The network over which the indication ofcompletion 238 may be provided to the user device 260 may be the same network as the network over which themedia file 204 is received by thetranscription system 230. The indication ofcompletion 238 may be a signal to thequeue 220 that thetranscript 236 of the audio of themedia file 204 is complete. The indication ofcompletion 238 may help enable the user device 260 to present themessage 202 b, themedia file 204, and thetranscript 236 to a user of the user device 260 at a single time as will be explained hereafter. - The
queue 220 may be communicatively coupled to the processing system 210. The queue 220 may be configured to receive the message 202b with the media tag 218 from the processing system 210. In some embodiments, the queue 220 may be communicatively coupled to the processing system 210 over a network. In some embodiments, the network over which the queue 220 is communicatively coupled to the processing system 210 may not be the same as the network over which the processing system 210 receives the multimedia message 202a. In some embodiments, the network over which the queue 220 is communicatively coupled to the processing system 210 may be the same as the network over which the processing system 210 provides the media file 204 to the transcription system 230. - In some embodiments, the
queue 220 may be configured to retain the message 202b with the media tag 218 until the queue 220 receives the indication of completion 238 from the transcription system 230. In response to receiving the indication of completion 238 from the transcription system 230, the queue 220 may provide the message 202b with the media tag 218 to the user device 260. Thus, the queue 220 may prevent the user device 260 from being aware of the multimedia message 202a or the message 202b until after completion of the transcript 236. - Retaining the
message 202b with the media tag 218 may facilitate the simultaneous presentation of the media file 204 and the transcript 236 on the user device 260. For example, the generation of the transcript 236 may not occur directly after the transcription system 230 receives the media file 204 to generate the transcript 236. There may be a period of time between the receipt of the media file 204 and the generation of the transcript 236. As a result, if the message 202b were delivered to the user device 260 directly after separation of the media file 204 from the message 202b, the transcript 236 might be unavailable. Retaining the message 202b at the queue 220 until the transcript 236 is generated may help the communication system 270 to ensure that the transcript 236 is available for presentation by the user device 260 as soon as the user device 260 receives the message 202b and is able to present the media file 204. Thus, the queue 220 may facilitate the simultaneous presentation of the media file 204 and the transcript 236 on the user device 260. - In some embodiments, the user device 260 may be configured to receive the
message 202b with the media tag 218. After receiving the message 202b with the media tag 218, the user device 260 may be configured to generate a request for media 268 using the information in the media tag 218. The request for media 268 may be directed to the transcription system 230 and may be configured to allow the transcription system 230 to determine the media file 204 that is associated with the message 202b and the media tag 218, and the transcript 236 associated with the media file 204. - In some embodiments, the user device 260 may generate and send the request for
media 268 in response to receiving the message 202b from the queue 220. Alternatively or additionally, the user device 260 may generate and send the request for media 268 in response to a user interaction with the user device 260. For example, the user device 260 may be configured to provide an indication that the message 202b is available on the user device 260. For example, in response to receiving the message 202b from the queue 220, an indicator on the user device 260 may be used to alert a user that the message 202b is available on the user device 260. In these and other embodiments, the user may interact with the message 202b. The user device 260 may provide the request for media 268 in response to the user interacting with the message 202b on the user device 260. In some embodiments, interacting with the message may include opening the message 202b or selecting an option to download or stream the media file 204 on the user device 260, among other interactions with the message 202b. - In some embodiments, the
transcription system 230 may be configured to provide the generated transcript 236 to the user device 260 over a network in response to receiving a request for media 268 from the user device 260. For example, the transcription system 230 may be configured to use the information from the request for media 268 to locate the media file 204 and the transcript 236 of the media file 204. In these and other embodiments, the transcription system 230 may be configured to provide the media file 204 and the transcript 236 to the user device 260 over a network in response to receiving a request for media 268 from the user device 260. The network over which the media file 204 may be provided to the user device 260 may not be the same network as the network over which the multimedia message 202a is received by the processing system 210. The network over which the media file 204 may be provided to the user device 260 may or may not be the same network as the network over which the media file 204 is provided to the transcription system 230. - In these and other embodiments, the user device 260 may be configured to present the
media file 204 and the transcript 236. For example, the generated transcript 236 may be displayed on a display of the user device 260. In response to the media file 204 being an audiovisual file, the user device 260 may be configured to present the media file 204 on a display of the user device 260. The video of the audiovisual file may be presented on a display of the user device 260. In response to the media file 204 being an audio file, the user device 260 may be configured to present the audio file, such as through one or more speakers, and present an image on a display of the user device 260. - An example of the operation of the
environment 200 follows. A multimedia message 202a may be sent from a device and directed to a user device 260. - The
processing system 210 may separate the media file 204 from the multimedia message 202a. The processing system 210 may generate a media tag 218 with an identifier of the media file 204 for the communication system 270. The processing system 210 may attach the media tag 218 to the separated message 202b. The processing system 210 may send the media file 204 to the transcription system 230. The processing system 210 may send the media file 204 received from a user or from the user device 260 to the transcription system 230 via one or more networks. The processing system 210 may provide the message 202b with the media tag 218 to the queue 220. The message 202b with the media tag 218 may remain in the queue 220 until the queue 220 receives the indication of completion 238 from the transcription system 230 in response to the transcription system 230 completing the transcript 236 of the media file 204. In response to receiving the indication of completion 238 from the transcription system 230, the queue 220 may provide the message 202b with the media tag 218 to the user device 260. - In response to receiving the
message 202b with the media tag 218, the user device 260 may send the request for media 268 based on the media tag 218. In response to receiving the request for media 268, the transcription system 230 may locate the media file 204 and the transcript 236 generated based on the media file 204. The transcription system 230 may provide the media file 204 and the transcript 236 associated with the media file 204 to the user device 260. - Modifications, additions, or omissions may be made to the
environment 200 without departing from the scope of the present disclosure. For example, in some embodiments, the environment 200 may include additional devices similar to the user device 260. Alternatively or additionally, the multimedia message 202a may be directed to multiple user devices 260. In these and other embodiments, the media tags 218 provided to each of the multiple user devices 260 may include information identifying a single media file 204 and a single transcript 236 that may be shared among the multiple user devices 260 such that multiple transcripts are not made of the same media file 204. - In some embodiments, one or more of the communication couplings between the sender of the
multimedia message 202a and the user device 260 may be a wired connection. In some embodiments, the processing system 210 and the user device 260 may be the same device. - Alternatively or additionally, the
processing system 210 may create a copy of the media file 204 and may not separate the media file 204 from the multimedia message 202a. In these and other embodiments, the transcription system 230 may provide the transcript 236 in response to the request for media 268 and may not send the media file 204. Alternatively or additionally, the audio data of the media file 204 may be obtained by the transcription system 230 via a URL to a webpage on the Internet. In some embodiments, the media file 204 may be a link to a media file or a link to a media stream. In these and other embodiments, the transcription system 230 may provide the transcript 236 in response to the request for media 268 and may not send the media file 204. In these and other embodiments, the user device 260 may obtain the media file 204 from the queue 220 or from another source other than the communication system 270. - Alternatively or additionally, in some embodiments, the
environment 200 may not include the media tag 218. In these and other embodiments, the processing system 210 may separate the media file 204 from the multimedia message 202a and may not attach the media tag 218 to the message 202b. In these and other embodiments, the transcription system 230 may send the indication of completion 238 to the queue 220. The queue 220 may transmit the message 202b without the media tag 218 to the user device 260. In these and other embodiments, the user device 260 may be configured to transmit the request for media 268 to the transcription system 230, in response to receiving the message 202b from the queue 220, based on information in the message 202b. In these and other embodiments, the transcription system 230 may associate the media file 204 and the transcript 236 with the message 202b based on a unique identifier from the multimedia message 202a. - Alternatively or additionally, in some embodiments, the
environment 200 may not include the queue 220. In these and other embodiments, the message 202b may be sent directly to the user device 260 prior to the completion of the transcript 236 in the transcription system 230. In these and other embodiments, the transcription system 230 may provide the indication of completion 238 to the user device 260 in response to completing the transcript 236 of the audio of the media file 204. In response to receiving the indication of completion 238 from the transcription system 230, the user device 260 may present an alert on a display that the message 202b is available on the user device 260. In some embodiments, the user device 260 may present the message 202b with the media tag 218 in response to receiving the indication of completion 238 from the transcription system 230. Presenting the message 202b with the media tag 218 in response to receiving the indication of completion 238 may facilitate the simultaneous presentation of the media file 204 and the transcript 236 on the user device 260. In some embodiments, the generation of the transcript 236 may cause a delay in the presentation of the transcript 236 on the user device 260. In some embodiments, not indicating a receipt of the message 202b until receipt of the indication of completion 238 may facilitate the presentation of the transcript 236 with the media file 204. -
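As an illustrative sketch only (the class and method names are hypothetical and are not part of the claimed embodiments), the retention behavior described above for the queue 220 might be modeled as follows: a stripped message is retained under its media-file identifier and is released to the user device only after an indication of completion arrives from the transcription system.

```python
import queue

class HoldingQueue:
    """Sketch of queue 220: retains a stripped message until an
    indication of completion arrives for its associated media file."""

    def __init__(self):
        self._held = {}              # media_id -> retained message
        self._ready = queue.Queue()  # messages releasable to the user device

    def hold(self, media_id, message):
        # Retain the message; the user device is not yet aware of it.
        self._held[media_id] = message

    def indicate_completion(self, media_id):
        # Signal from the transcription system that the transcript is
        # complete, so the retained message may now be delivered.
        message = self._held.pop(media_id, None)
        if message is not None:
            self._ready.put(message)

    def deliver(self):
        # Return the next releasable message, or None if nothing is ready.
        try:
            return self._ready.get_nowait()
        except queue.Empty:
            return None
```

Because deliver() yields nothing until indicate_completion() has been called, the transcript can be available for presentation as soon as the user device receives the message.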
FIG. 3 illustrates a third example environment 300 related to providing transcripts of a multimedia message. The environment 300 may be arranged in accordance with at least one embodiment described in the present disclosure. The environment 300 may include a communication system 370, which includes a processing system 310, a transcription system 330, and a combining system 340, and a user device 360. - In some embodiments, the
communication system 370 may be configured to direct multimedia messages 302a to the user device 360. The communication system 370 may be configured to direct messages from the user device 360. In some embodiments, the user device 360 may be configured to receive messages only through the communication system 370 and to send messages only through the communication system 370. In these and other embodiments, the communication system 370 may be a host system that is configured to receive messages for the user device 360 and relay the messages to the user device 360. For example, when the multimedia message 302a is directed to the user device 360, the network address of the user device 360 may be such that the multimedia message 302a is routed through the communication system 370 to the user device 360. - The
user device 360 may be any electronic or digital device and may be analogous to the user device 160 of FIG. 1, except that the user device 360 may interact with the communication system 370 in a manner different than the user device 160 interacts with the communication system 170 of FIG. 1. - In some embodiments, the
processing system 310 may include an analogous configuration of hardware as the processing system 110 of FIG. 1, but may perform one or more of the same tasks or one or more different tasks than the processing system 110 of FIG. 1. Thus, further description of the hardware of the processing system 310 is not provided with respect to FIG. 3. - The
processing system 310 may be configured to receive a multimedia message 302a. The multimedia message 302a may be analogous to the multimedia message 102a of FIG. 1 and may be received in an analogous manner; as such, further description is not provided with respect to FIG. 3. - In some embodiments, the
processing system 310 may be configured to separate the media file 304 from the multimedia message 302a. As a result, the processing system 310 may separately transmit the media file 304 and a message 302b without the media file 304. In these and other embodiments, the message 302b may be the multimedia message 302a, but stripped of the media file 304. The processing system 310 may be configured to send the media file 304 without the message 302b to the transcription system 330. The processing system 310 may also be configured to send the message 302b to a combining system 340. - The
transcription system 330 may include an analogous configuration of hardware as the transcription system 130 of FIG. 1, but may perform one or more of the same tasks or one or more different tasks than the transcription system 130 of FIG. 1. Thus, further description of the hardware of the transcription system 330 and of operation analogous to the transcription system 130 is not provided with respect to FIG. 3. - In some embodiments, the
transcription system 330 may be configured to receive the media file 304 from the processing system 310. The transcription system 330 may also be configured to generate a transcript 336 by transcribing audio data of the media file 304 received from the processing system 310 and to provide the generated transcript 336 to the combining system 340. The transcription system 330 may also be configured to provide the media file 304 to the combining system 340. - The combining
system 340 may be communicatively coupled to the processing system 310 and to the transcription system 330. The combining system 340 may be configured to receive the media file 304 and the transcript 336 from the transcription system 330. - In some embodiments, the combining
system 340 may include any configuration of hardware, such as processors, servers, and databases that are networked together. For example, the combining system 340 may include multiple computing systems, such as multiple servers that each include memory and at least one processor, which are networked together and configured to perform operations as described in this disclosure, among other operations. In some embodiments, the combining system 340 may include computer-readable instructions that are configured to be executed by the combining system 340 to perform operations described in this disclosure. - In some embodiments, the combining
system 340 may be configured to combine the message 302b received from the processing system 310, the media file 304 received from the transcription system 330, and the transcript 336 received from the transcription system 330 to generate a combined message 302c. In some embodiments, to combine the message 302b, the media file 304, and the transcript 336, the combining system 340 may attach the media file 304 and the transcript 336 to the message 302b to generate the combined message 302c. In some embodiments, the transcript 336 may be incorporated into the media file 304 as closed-captioning data. Alternatively or additionally, the transcript 336 may be separate from the media file 304. - In some embodiments, the combining
system 340 may be configured to provide the generated combined message 302c to the user device 360 over a network. The network over which the combined message 302c may be provided to the user device 360 may not be the same network as the network used for communication between the processing system 310, the transcription system 330, and the combining system 340. In these and other embodiments, the user device 360 may be configured to present the combined message 302c. For example, the combined message 302c may be presented on a display of the user device 360. - In these and other embodiments, in response to the combined
message 302c being displayed on a display of the user device 360, the media file 304 attached to the combined message 302c may be selected. The user device 360 may be configured to present the media file 304 of the combined message 302c on a display of the user device 360. In these and other embodiments, the user device 360 may be configured to present the transcript 336 of the combined message 302c. The transcript 336 may be displayed on a display of the user device 360. In some embodiments, the transcript 336 may be selected from the combined message 302c and may be presented on a display of the user device 360. - Modifications, additions, or omissions may be made to the
environment 300 without departing from the scope of the present disclosure. For example, in some embodiments, the environment 300 may include additional devices similar to the user device 360. Alternatively or additionally, the message 302 may be directed to multiple user devices 360. Alternatively or additionally, one or more of the communication couplings between the sender of the multimedia message 302a and the user device 360 may be a wired connection. In some embodiments, the processing system 310 and the user device 360 may be the same device. - Alternatively or additionally, the
processing system 310 may create a copy of the media file 304 and may not separate the media file 304 from the multimedia message 302a. In these and other embodiments, the combining system 340 may be configured to insert the transcript 336 generated based on the media file 304 into the multimedia message 302a to generate the message 302c. In some embodiments, the transcript 336 may be incorporated into the media file 304 as closed-captioning data and may not be a separate element of the combined message 302c. In some embodiments, the media file 304 may include a link to the transcript 336. Alternatively or additionally, the communication system 370 may not include the processing system 310. In these and other embodiments, the transcription system 330 may receive the multimedia message 302a and generate a transcript of the media file 304. The combining system 340 may combine the transcript 336 with the multimedia message 302a to generate the message 302c provided to the user device 360. -
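A minimal sketch of the combining step described above (the function and field names are hypothetical, chosen only for illustration, and do not represent the claimed implementation): the stripped message, the media file, and the transcript are merged into one combined message, with the transcript either carried as a separate element or folded into the media element as closed-captioning data.

```python
def combine_message(message, media_file, transcript, as_captions=False):
    """Sketch of combining system 340: attach the media file and the
    transcript to the stripped message to form a combined message."""
    combined = dict(message)  # do not mutate the original message
    if as_captions:
        # Transcript incorporated with the media element as closed-captioning
        # data rather than carried as a separate element of the message.
        combined["media"] = {"file": media_file, "captions": transcript}
    else:
        combined["media"] = {"file": media_file}
        combined["transcript"] = transcript
    return combined
```

Either form leaves the user device with everything needed to present the media file and the transcript together.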
FIG. 4 illustrates an example computing system 400 that may be arranged in accordance with at least one embodiment described in the present disclosure. The system 400 may include a processor 410, a memory 412, a data storage 414, a communication unit 416, a display 418, a user interface 420, and peripheral devices 422, which all may be communicatively coupled. In some embodiments, the system 400 may be part of any of the electronic devices described in this disclosure. For example, the system 400 may be part of the processing system 110 of FIG. 1, the transcription system 130 of FIG. 1, and/or the user device 160 of FIG. 1. The system 400 may also be part of the systems and components illustrated in FIGS. 2 and 3. - Generally, the
processor 410 may include any suitable special-purpose or general-purpose computer, computing entity, or processing device including various computer hardware or software modules and may be configured to execute instructions stored on any applicable computer-readable storage media. For example, the processor 410 may include a microprocessor, a microcontroller, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a Field-Programmable Gate Array (FPGA), or any other digital or analog circuitry configured to interpret and/or to execute program instructions and/or to process data, or any combination thereof. - Although illustrated as a single processor in
FIG. 4, it is understood that the processor 410 may include any number of processors distributed across any number of networks or physical locations that are configured to perform individually or collectively any number of operations described herein. In some embodiments, the processor 410 may interpret and/or execute program instructions and/or process data stored in the memory 412, the data storage 414, or the memory 412 and the data storage 414. In some embodiments, the processor 410 may fetch program instructions from the data storage 414 and load the program instructions into the memory 412. - After the program instructions are loaded into the
memory 412, the processor 410 may execute the program instructions. For example, the system 400 may be part of the processing system 110 of FIG. 1. In these and other embodiments, the program instructions may cause the processor 410 to perform the operations of separating a media file from a multimedia message. As another example, the system 400 may be part of the transcription system 130 of FIG. 1. The program instructions may cause the processor 410 to perform the operations of generating a transcript of a media file. - The
memory 412 and the data storage 414 may include computer-readable storage media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable storage media may be any available media that may be accessed by a general-purpose or special-purpose computer, such as the processor 410. By way of example, and not limitation, such computer-readable storage media may include non-transitory computer-readable storage media including Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory devices (e.g., solid state memory devices), or any other storage media which may be used to carry or store desired program code in the form of computer-executable instructions or data structures and which may be accessed by a general-purpose or special-purpose computer. Combinations of the above may also be included within the scope of computer-readable storage media. Computer-executable instructions may include, for example, instructions and data configured to cause the processor 410 to perform a certain operation or group of operations. - The
communication unit 416 may include any component, device, system, or combination thereof that is configured to transmit or receive information over a network. In some embodiments, the communication unit 416 may communicate with other devices at other locations, the same location, or even other components within the same system. For example, the communication unit 416 may include a modem, a network card (wireless or wired), an infrared communication device, a wireless communication device (such as an antenna), and/or a chipset (such as a Bluetooth device, an 802.6 device (e.g., Metropolitan Area Network (MAN)), a WiFi device, a WiMax device, cellular communication facilities, etc.), plain old telephone service (POTS), and/or the like. The communication unit 416 may permit data to be exchanged with a network and/or any other devices or systems described in the present disclosure. - The
display 418 may be configured as one or more displays, such as an LCD, LED, or other type of display. The display 418 may be configured to present video, text captions, user interfaces, and other data as directed by the processor 410. For example, the display 418 may present a media file, a transcript, and a multimedia message, among other information. - The user interface 420 may include any device that allows a user to interface with the
system 400. For example, the user interface 420 may include a mouse, a track pad, a keyboard, a touchscreen, a telephone switch hook, and/or a telephone keypad, among other devices. The user interface 420 may receive input from a user and provide the input to the processor 410. - The
peripheral devices 422 may include one or more devices. For example, the peripheral devices may include a microphone, an imager, and/or a speaker, among other peripheral devices. In these and other embodiments, the microphone may be configured to capture audio. The imager may be configured to capture digital images. The digital images may be captured in a manner to produce video or image data. In some embodiments, the speaker may broadcast audio received by the system 400 or otherwise generated by the system 400. - Modifications, additions, or omissions may be made to the
system 400 without departing from the scope of the present disclosure. -
FIG. 5 is a flowchart of an example computer-implemented method to provide transcriptions of a multimedia message. The method 500 may be arranged in accordance with at least one embodiment described in the present disclosure. The method 500 may be performed, in whole or in part, in some embodiments by a system and/or environment, such as the environment 100, the environment 200, the environment 300, the system 400, and/or the communication system 700 of FIGS. 1, 2, 3, 4, and 7, respectively. In these and other embodiments, the method 500 may be performed based on the execution of instructions stored on one or more non-transitory computer-readable media. Although illustrated as discrete blocks, various blocks may be divided into additional blocks, combined into fewer blocks, or eliminated, depending on the desired implementation. - The
method 500 may begin atblock 502, where a message with an attached media file may be received at a server. In some embodiments, the message may be directed to a user device. In some embodiments, the server may be configured to receive and direct messages to the user device. - In block 504, the media file may be separated from the message before the message is provided to the user device.
- In
block 506, in response to the separation of the media file from the message, the message may be modified to include a tag to the media file. In some embodiments, the media tag may include information regarding a storage location of the media file. In block 508, the media file may be provided to a transcription system. - In
block 510, a transcript of audio data in the media file may be generated at the transcription system. In some embodiments, the generation of the transcript of the audio data may include modifying the media file to include closed-captioning data. - In
block 512, the message with the tag may be provided to the user device for presentation of the message on the user device. - In
block 514, in response to a request from the user device based on the tag from the message, the transcript and the media file may be provided to the user device for presentation of the transcript and the media file on the user device. - One skilled in the art will appreciate that, for this and other processes, operations, and methods disclosed herein, the functions and/or operations performed may be implemented in differing order. Furthermore, the outlined functions and operations are only provided as examples, and some of the functions and operations may be optional, combined into fewer functions and operations, or expanded into additional functions and operations without detracting from the essence of the disclosed embodiments. For example, the
method 500 may further include combining the message, the transcript, and the media file; and providing the combination to the user device. Alternatively or additionally, the method 500 may include holding the message with the media tag in queue during the generation of the transcript such that the message with the media tag is not provided to the user device during the generation of the transcript. -
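The flow of blocks 502 through 514 might be sketched as follows. This is an illustrative outline only; the identifiers (such as "m-1") and data shapes are hypothetical stand-ins for the server, the media tag, and the transcription system, not the claimed implementation.

```python
def process_multimedia_message(message, transcribe):
    """Sketch of method 500: separate the media file from the message,
    tag the message, generate a transcript, and return everything the
    user device needs for simultaneous presentation."""
    # Blocks 502/504: receive the message and separate the attached media
    # file before the message is provided to the user device.
    media_file = message.pop("attachment")
    # Block 506: modify the message to include a tag to the media file,
    # including information regarding its (hypothetical) storage location.
    message["media_tag"] = {"media_id": "m-1", "location": "store/m-1"}
    # Blocks 508/510: provide the media file to the transcription step
    # and generate a transcript of its audio data.
    transcript = transcribe(media_file)
    # Blocks 512/514: the tagged message, the media file, and the
    # transcript are now available for presentation on the user device.
    return message, media_file, transcript
```

In the queue-based variant, the tagged message would be held between the tagging step and the final return until the transcript is complete.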
FIG. 6 is a flowchart of another example computer-implemented method to provide transcriptions of a multimedia message. The method 600 may be arranged in accordance with at least one embodiment described in the present disclosure. The method 600 may be performed, in whole or in part, in some embodiments by a system and/or environment, such as the environment 100, the environment 200, the environment 300, the system 400, and/or the communication system 700 of FIGS. 1, 2, 3, 4, and 7, respectively. In these and other embodiments, the method 600 may be performed based on the execution of instructions stored on one or more non-transitory computer-readable media. Although illustrated as discrete blocks, various blocks may be divided into additional blocks, combined into fewer blocks, or eliminated, depending on the desired implementation. - The
method 600 may begin at block 602, where a message with an attached media file may be received at a server. In some embodiments, the message may be directed to a user device. In some embodiments, the server may be configured to receive and direct messages to the user device. - In
block 604, the media file may be separated from the message before the message is provided to the user device. In block 606, a transcript of audio data in the media file may be generated at a transcription system. In some embodiments, the generation of the transcript of the audio data comprises modifying the media file to include closed-captioning data. - In
block 608, the message may be provided to the user device for presentation of the message on the user device. Inblock 610, the transcript and the media file may be provided to the user device for presentation of the transcript and the media file on the user device. - One skilled in the art will appreciate that, for this and other processes, operations, and methods disclosed herein, the functions and/or operations performed may be implemented in differing order. Furthermore, the outlined functions and operations are only provided as examples, and some of the functions and operations may be optional, combined into fewer functions and operations, or expanded into additional functions and operations without detracting from the essence of the disclosed embodiments.
- For example, the
method 600 may further include, in response to the separation of the media file from the message, modifying the message to include a media tag to the media file. Alternatively or additionally, the method 600 may further include combining the message, the transcript, and the media file and providing the combination to the user device. - In some embodiments, the
method 600 may further include holding the message in queue until after the transcript of the audio data is generated. Alternatively or additionally, the method 600 may further include, in response to the generation of the transcript of the audio data, providing an indication on the user device of receipt of the message. - In some embodiments, in the
method 600, the media file may include information regarding a storage location of the transcript. In these and other embodiments, the method 600 may further include combining the message and the media file and providing the combination to the user device. -
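One way to picture the storage-location lookups described above, as a sketch under assumed in-memory stores (the class and method names are illustrative only, not the claimed implementation): a transcription system might keep the media file and its transcript keyed by a shared identifier so that a single request can retrieve both for presentation together.

```python
class TranscriptionStore:
    """Sketch of the lookup a transcription system might perform when a
    user device requests media: the media file and its transcript are
    kept under one shared identifier."""

    def __init__(self):
        self._media = {}        # media_id -> media file (or a link to it)
        self._transcripts = {}  # media_id -> generated transcript

    def store(self, media_id, media, transcript):
        self._media[media_id] = media
        self._transcripts[media_id] = transcript

    def handle_request_for_media(self, media_id):
        # Locate the media file and its transcript together so the user
        # device can present them at the same time.
        return self._media[media_id], self._transcripts[media_id]
```

Keying both items by one identifier is also what allows multiple user devices to share a single transcript of the same media file.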
FIG. 7 illustrates an example communication system 700 that may provide transcriptions of a multimedia message. The communication system 700 may include an electronic device that is capable of sending a message. The communication system 700 may be arranged in accordance with at least one embodiment described in the present disclosure. The communication system 700 may include a first device 710, a second device 720, and a system 730. The first device 710 and the system 730 may be communicatively coupled by a network 740. Alternately or additionally, the first device 710 and the second device 720 may be communicatively coupled by the network 740. In some embodiments, the network 740 may be any network or configuration of networks configured to send and receive communications between systems and devices. In some embodiments, the network 740 may include a conventional-type network, a wired or wireless network, and may have numerous different configurations. In some embodiments, the network 740 may also be coupled to or may include portions of a telecommunications network, including telephone lines, for sending data in a variety of different communication protocols, such as a plain old telephone system (POTS).
- In some embodiments, the
communication system 700 illustrated may be configured to facilitate an assisted call between a hearing-impaired user 702 and a second user 704. As used in the present disclosure, a "hearing-impaired user" may refer to a person with diminished hearing capabilities. Hearing-impaired users often retain some level of hearing ability, usually diminished over time, such that they can communicate by speaking but often struggle to hear and/or understand others.
- Alternatively or additionally, the
communication system 700 illustrated may be configured to facilitate a call between a person with medical expertise and/or experience and the second user 704. As used in the present disclosure, a "person with medical expertise and/or experience" may be a nurse, doctor, or some other trained medical professional.
- In some embodiments, a communication session, such as an audio or a video communication session, may be established between the
first device 710 and the second device 720. In one example embodiment, the communication session may be a captioning communication session.
- In some embodiments, the
system 730 may be an assistive service, which is intended to permit a hearing-impaired person to utilize a communication network and assist their understanding of a conversation by providing text captions to supplement the voice conversation occurring during communication sessions with other devices, such as the second device 720. Alternatively or additionally, the system 730 may be an assistive service to couple a person with medical expertise and/or experience with a person requesting medical assistance.
- During a communication session, the
system 730 and the first device 710 may be communicatively coupled using networking protocols. In some embodiments, during the communication session between the first device 710 and the second device 720, the first device 710 may provide the audio signal from the second device 720 to the system 730.
- In some embodiments, at the
system 730, a call assistant may listen to the audio signal of the second user 704 and "revoice" the words of the second user 704 to a speech recognition computer program tuned to the voice of the call assistant. In these and other embodiments, the call assistant may be an operator who serves as a human intermediary between the hearing-impaired user 702 and the second user 704. In some embodiments, text captions may be generated by the speech recognition computer program as a transcription of the audio signal of the second user 704. The text captions may be provided over the network 740 to the first device 710 being used by the hearing-impaired user 702. The first device 710 may display the text captions while the hearing-impaired user 702 carries on a normal conversation with the second user 704. The text captions may allow the hearing-impaired user 702 to supplement the voice signal received from the second device 720 and confirm his or her understanding of the words spoken by the second user 704.
- Modifications, additions, or omissions may be made to the
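The captioning loop just described, audio from the second device relayed to the system, revoiced, recognized, and returned as text captions, might be sketched as a simple pipeline. Every function here is a stand-in invented for illustration; in the described system the revoicing step is performed by a human call assistant, not code.

```python
def revoice(audio: bytes) -> bytes:
    # Stand-in for the call assistant repeating the far-end speech
    # in a voice the recognizer is tuned to.
    return audio

def recognize(revoiced_audio: bytes) -> str:
    # Stand-in for the speech recognition program tuned to the assistant.
    return f"caption for {len(revoiced_audio)} bytes"

def caption_stream(audio_chunks, display):
    # Relay each audio chunk through revoicing and recognition, then
    # send the resulting text caption back to the first device's display.
    for chunk in audio_chunks:
        display(recognize(revoice(chunk)))
```

The point of the structure is that captions flow alongside, not instead of, the live audio: the hearing-impaired user hears the call and reads the captions as a supplement.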
communication system 700 without departing from the scope of the present disclosure. For example, in some embodiments, the second user 704 may be hearing impaired. In these and other embodiments, the system 730 may provide text captions to the second device 720 based on audio data transmitted by the first device 710. Alternately or additionally, the system 730 may include additional functionality. For example, the system 730 may edit or make other alterations to the text captions after presentation of the text captions on the first device 710.
- In some embodiments, the
environments of FIGS. 1, 2, and 3, respectively, may be combined with the communication system 700. For example, the communication system 700 may facilitate live verbal captioning of a communication session as well as the transcription of media files of multimedia messages. In some embodiments, a message with a media file may be sent from the second device 720 to the first device 710. The system 730 may be configured to provide text captions for the media file of the multimedia message along with performing the operations described with respect to FIG. 7.
- As indicated above, the embodiments described herein may include the use of a special purpose or general purpose computer (e.g., the
processor 410 of FIG. 4) including various computer hardware or software modules, as discussed in greater detail below. Further, as indicated above, embodiments described herein may be implemented using computer-readable media (e.g., the memory 412 or data storage 414 of FIG. 4) for carrying or having computer-executable instructions or data structures stored thereon.
- In some embodiments, the different components, modules, engines, and services described herein may be implemented as objects or processes that execute on a computing system (e.g., as separate threads). While some of the systems and methods described herein are generally described as being implemented in software (stored on and/or executed by general purpose hardware), specific hardware implementations or a combination of software and specific hardware implementations are also possible and contemplated.
- In accordance with common practice, the various features illustrated in the drawings may not be drawn to scale. The illustrations presented in the present disclosure are not meant to be actual views of any particular apparatus (e.g., device, system, etc.) or method, but are merely idealized representations that are employed to describe various embodiments of the disclosure. Accordingly, the dimensions of the various features may be arbitrarily expanded or reduced for clarity. In addition, some of the drawings may be simplified for clarity. Thus, the drawings may not depict all of the components of a given apparatus (e.g., device) or all operations of a particular method.
- Terms used herein and especially in the appended claims (e.g., bodies of the appended claims) are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including, but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes, but is not limited to,” etc.).
- Additionally, if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases “at least one” and “one or more” to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an” (e.g., “a” and/or “an” should be interpreted to mean “at least one” or “one or more”); the same holds true for the use of definite articles used to introduce claim recitations.
- In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should be interpreted to mean at least the recited number (e.g., the bare recitation of “two recitations,” without other modifiers, means at least two recitations, or two or more recitations). Furthermore, in those instances where a convention analogous to “at least one of A, B, and C, etc.” or “one or more of A, B, and C, etc.” is used, in general such a construction is intended to include A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B, and C together, etc. For example, the use of the term “and/or” is intended to be construed in this manner.
- Further, any disjunctive word or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase “A or B” should be understood to include the possibilities of “A” or “B” or “A and B.”
- Additionally, the terms "first," "second," "third," etc., are not necessarily used herein to connote a specific order or number of elements. Generally, the terms "first," "second," "third," etc., are used to distinguish between different elements as generic identifiers. Absent a showing that the terms "first," "second," "third," etc., connote a specific order, these terms should not be understood to connote a specific order. Furthermore, absent a showing that the terms "first," "second," "third," etc., connote a specific number of elements, these terms should not be understood to connote a specific number of elements. For example, a first widget may be described as having a first side and a second widget may be described as having a second side. The use of the term "second side" with respect to the second widget may be to distinguish such side of the second widget from the "first side" of the first widget and not to connote that the second widget has two sides.
- All examples and conditional language recited herein are intended for pedagogical objects to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Although embodiments of the present disclosure have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the present disclosure.
Claims (17)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/116,671 US20180375993A1 (en) | 2016-12-15 | 2018-08-29 | Transcribing media files |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/380,589 US10091354B1 (en) | 2016-12-15 | 2016-12-15 | Transcribing media files |
US16/116,671 US20180375993A1 (en) | 2016-12-15 | 2018-08-29 | Transcribing media files |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/380,589 Continuation US10091354B1 (en) | 2016-12-15 | 2016-12-15 | Transcribing media files |
Publications (1)
Publication Number | Publication Date |
---|---|
US20180375993A1 true US20180375993A1 (en) | 2018-12-27 |
Family
ID=63638785
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/380,589 Active US10091354B1 (en) | 2016-12-15 | 2016-12-15 | Transcribing media files |
US16/116,671 Abandoned US20180375993A1 (en) | 2016-12-15 | 2018-08-29 | Transcribing media files |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/380,589 Active US10091354B1 (en) | 2016-12-15 | 2016-12-15 | Transcribing media files |
Country Status (1)
Country | Link |
---|---|
US (2) | US10091354B1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11373654B2 (en) * | 2017-08-07 | 2022-06-28 | Sonova Ag | Online automatic audio transcription for hearing aid users |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10091354B1 (en) * | 2016-12-15 | 2018-10-02 | Sorenson Ip Holdings, Llc | Transcribing media files |
RU2712101C2 (en) * | 2018-06-27 | 2020-01-24 | Общество с ограниченной ответственностью "Аби Продакшн" | Prediction of probability of occurrence of line using sequence of vectors |
US10573312B1 (en) * | 2018-12-04 | 2020-02-25 | Sorenson Ip Holdings, Llc | Transcription generation from multiple speech recognition systems |
US11017778B1 (en) * | 2018-12-04 | 2021-05-25 | Sorenson Ip Holdings, Llc | Switching between speech recognition systems |
US11488604B2 (en) | 2020-08-19 | 2022-11-01 | Sorenson Ip Holdings, Llc | Transcription of audio |
CN114168105B (en) * | 2021-12-08 | 2023-12-01 | 深圳市研强物联技术有限公司 | Implementation method and medium for audio media playing based on dual-system wearable product |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020056123A1 (en) * | 2000-03-09 | 2002-05-09 | Gad Liwerant | Sharing a streaming video |
US20070263259A1 (en) * | 2004-10-19 | 2007-11-15 | Shin Yoshimura | E-Mail Transmission System |
US8887306B1 (en) * | 2011-12-20 | 2014-11-11 | Google Inc. | System and method for sending searchable video messages |
US10091354B1 (en) * | 2016-12-15 | 2018-10-02 | Sorenson Ip Holdings, Llc | Transcribing media files |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020199096A1 (en) | 2001-02-25 | 2002-12-26 | Storymail, Inc. | System and method for secure unidirectional messaging |
US20030088687A1 (en) | 2001-12-28 | 2003-05-08 | Lee Begeja | Method and apparatus for automatically converting source video into electronic mail messages |
US7016844B2 (en) * | 2002-09-26 | 2006-03-21 | Core Mobility, Inc. | System and method for online transcription services |
US20070094333A1 (en) | 2005-10-20 | 2007-04-26 | C Schilling Jeffrey | Video e-mail system with prompter and subtitle text |
US20080284910A1 (en) * | 2007-01-31 | 2008-11-20 | John Erskine | Text data for streaming video |
US8064576B2 (en) * | 2007-02-21 | 2011-11-22 | Avaya Inc. | Voicemail filtering and transcription |
CN101252636A (en) | 2007-12-27 | 2008-08-27 | 深圳市同洲电子股份有限公司 | Information publishing system, method, information server as well as display terminal |
EP2454733A1 (en) | 2009-07-15 | 2012-05-23 | Google, Inc. | Commands directed at displayed text |
US8543652B2 (en) * | 2010-07-22 | 2013-09-24 | At&T Intellectual Property I, L.P. | System and method for efficient unified messaging system support for speech-to-text service |
US8417223B1 (en) | 2010-08-24 | 2013-04-09 | Google Inc. | Advanced voicemail features without carrier voicemail support |
US20130033560A1 (en) | 2011-08-05 | 2013-02-07 | Alcatel-Lucent Usa Inc. | Method and apparatus for adding closed captioning to video mail |
CN202602809U (en) | 2012-04-17 | 2012-12-12 | 浙江中烟工业有限责任公司 | Video conferencing system |
US20150011251A1 (en) | 2013-07-08 | 2015-01-08 | Raketu Communications, Inc. | Method For Transmitting Voice Audio Captions Transcribed Into Text Over SMS Texting |
US9710140B2 (en) * | 2015-03-17 | 2017-07-18 | Adobe Systems Incorporated | Optimizing layout of interactive electronic content based on content type and subject matter |
US10332506B2 (en) * | 2015-09-02 | 2019-06-25 | Oath Inc. | Computerized system and method for formatted transcription of multimedia content |
US9374536B1 (en) * | 2015-11-12 | 2016-06-21 | Captioncall, Llc | Video captioning communication system, devices and related methods for captioning during a real-time video communication session |
- 2016-12-15: US 15/380,589 (US 10091354 B1), Active
- 2018-08-29: US 16/116,671 (US 2018/0375993 A1), Abandoned
Also Published As
Publication number | Publication date |
---|---|
US10091354B1 (en) | 2018-10-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10091354B1 (en) | Transcribing media files | |
WO2017036290A1 (en) | Voice conference method, conference client, and system | |
US10582044B2 (en) | Selecting audio profiles | |
US10224057B1 (en) | Presentation of communications | |
US8489696B2 (en) | Instant messaging exchange incorporating user-generated multimedia content | |
US10362173B2 (en) | Web real-time communication from an audiovisual file | |
US11782674B2 (en) | Centrally controlling communication at a venue | |
US10313502B2 (en) | Automatically delaying playback of a message | |
JP2015100073A (en) | Communication apparatus, terminal, communication system, communication method, and communication program | |
US20170270948A1 (en) | Method and device for realizing voice message visualization service | |
US11050871B2 (en) | Storing messages | |
US20230247131A1 (en) | Presentation of communications | |
US10789954B2 (en) | Transcription presentation | |
JP7052335B2 (en) | Information processing system, information processing method and program | |
US20190007775A1 (en) | Integration of audiogram data into a device | |
US10818295B1 (en) | Maintaining network connections | |
US20200184973A1 (en) | Transcription of communications | |
US11431767B2 (en) | Changing a communication session | |
US10237402B1 (en) | Management of communications between devices |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: SORENSON IP HOLDINGS, LLC, UTAH Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CAPTIONCALL, LLC;REEL/FRAME:046753/0357 Effective date: 20170103 Owner name: CAPTIONCALL, LLC, UTAH Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BOEHME, KENNETH;ROYLANCE, SHANE;REEL/FRAME:046753/0306 Effective date: 20161214 |
|
STCT | Information on status: administrative procedure adjustment |
Free format text: PROSECUTION SUSPENDED |
|
AS | Assignment |
Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLAT Free format text: PATENT SECURITY AGREEMENT;ASSIGNORS:SORENSEN COMMUNICATIONS, LLC;CAPTIONCALL, LLC;REEL/FRAME:050084/0793 Effective date: 20190429 Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT, NEW YORK Free format text: PATENT SECURITY AGREEMENT;ASSIGNORS:SORENSEN COMMUNICATIONS, LLC;CAPTIONCALL, LLC;REEL/FRAME:050084/0793 Effective date: 20190429 |
|
AS | Assignment |
Owner name: SORENSON COMMUNICATIONS, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:049109/0752 Effective date: 20190429 Owner name: SORENSON IP HOLDINGS, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:049109/0752 Effective date: 20190429 Owner name: CAPTIONCALL, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:049109/0752 Effective date: 20190429 Owner name: INTERACTIVECARE, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:049109/0752 Effective date: 20190429 |
|
AS | Assignment |
Owner name: SORENSON IP HOLDINGS, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:U.S. BANK NATIONAL ASSOCIATION;REEL/FRAME:049115/0468 Effective date: 20190429 Owner name: CAPTIONCALL, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:U.S. BANK NATIONAL ASSOCIATION;REEL/FRAME:049115/0468 Effective date: 20190429 Owner name: SORENSON COMMUNICATIONS, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:U.S. BANK NATIONAL ASSOCIATION;REEL/FRAME:049115/0468 Effective date: 20190429 Owner name: INTERACTIVECARE, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:U.S. BANK NATIONAL ASSOCIATION;REEL/FRAME:049115/0468 Effective date: 20190429 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
AS | Assignment |
Owner name: CORTLAND CAPITAL MARKET SERVICES LLC, ILLINOIS Free format text: LIEN;ASSIGNORS:SORENSON COMMUNICATIONS, LLC;CAPTIONCALL, LLC;REEL/FRAME:051894/0665 Effective date: 20190429 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STCB | Information on status: application discontinuation |
Free format text: EXPRESSLY ABANDONED -- DURING EXAMINATION |
|
AS | Assignment |
Owner name: CAPTIONCALL, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CORTLAND CAPITAL MARKET SERVICES LLC;REEL/FRAME:058533/0467 Effective date: 20211112 Owner name: SORENSON COMMUNICATIONS, LLC, UTAH Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CORTLAND CAPITAL MARKET SERVICES LLC;REEL/FRAME:058533/0467 Effective date: 20211112 |