WO2023048803A1 - Botcasts - personalized artificial intelligence (AI) based podcasts - Google Patents


Info

Publication number
WO2023048803A1
Authority
WO
WIPO (PCT)
Prior art keywords
content
user
botcast
media
selected content
Prior art date
Application number
PCT/US2022/037652
Other languages
English (en)
Inventor
Karen Master Ben-Dor
Adi Diamant
Stav Yagev
Eshchar ZYCHLINSKI
Yoni SMOLIN
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 17/530,068 (published as US 2023/0092783 A1)
Application filed by Microsoft Technology Licensing, LLC
Publication of WO2023048803A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63Querying
    • G06F16/638Presentation of query results
    • G06F16/639Presentation of query results using playlists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings

Definitions

  • the Internet stores and indexes a variety of media content, including audio content, literary content, and mixed-media content, all of which can be searched and rendered with specialized browsers, media players and other specialized user interfaces.
  • podcasts, which are configured as audio files categorized within genres or topics, have become popular, at least in part, because of the ease with which the underlying content can be tagged and categorized and because consumers can search for podcasts containing content that may be of interest to them.
  • Some podcasts are published without restrictions and/or for consumption by the public-at-large.
  • Other podcasts are restricted and are only available to subscribers or users having verifiable credentials.
  • the search tools and functionality provided by browsers for enabling users to search for podcasts can also be used to search for other types of media content available on the Internet, as well as to search for content saved on other public and private enterprise storage systems.
  • many conventional browsers and media players are configured with query tools that enable users to search for and identify media from any accessible and indexed storage location that contains content associated with a specified keyword or attribute of interest to the user (e.g., file type, format, size, duration, creation date, author, etc.).
  • Logistical time constraints can also impede the manner in which users are able to find and consume media that is most relevant to their current needs and desires. For instance, some content is published at times when users are busy or unavailable, such as when they are sleeping, eating, driving, working, etc. Likewise, even when users are available to search for desired content, they may not have the time to consume all of the content that they found or that is available. The foregoing problems can be made even worse when the desired content, such as a particular snippet of an audio recording, is buried within a relatively large and lengthy media file that the user does not have time to listen to.
  • Distribution constraints can also negatively affect a consumer’s ability to find and obtain specific content that is relevant to their current needs and desires.
  • some publishers try to generalize their content in an effort to make their content more palatable and broadly applicable to many different interests and parties by casting a wide net, by essentially addressing many different concepts within a single publication and/or by restating the same content in many different ways within the same media product.
  • This type of media is very commercially viable because it can resonate with a relatively large and diverse base of consumers that have differing perspectives and interests.
  • this type of media will inevitably include various content that is not particularly relevant and/or of interest to each and every individual consumer.
  • New and improved methods, systems, products, and devices are provided for identifying, accessing, and presenting media content to different users.
  • the disclosed embodiments that provide this functionality include methods, systems, and devices for identifying, accessing, filtering, augmenting, customizing, personalizing and/or otherwise modifying media content for user consumption. These embodiments are operable for facilitating the accessibility and presentation of the media content in a personalized or customized manner to the individual users.
  • the disclosed embodiments include generating unique and/or customized botcasts for a plurality of different users, where each botcast comprises an audio file of assembled/compiled audio content that is personalized for each individual user.
  • the assembled/compiled audio content for a user’s botcast comprises similar underlying audio content, the same underlying audio content and/or different underlying audio content than the audio content that is assembled/compiled for the same user in different contextual circumstances and/or that is assembled/compiled for different user botcasts, according to the different users’ preference and profile settings and/or different contextual circumstances.
  • the audio content that is selected, formatted, augmented, summarized, filtered and/or otherwise modified for each botcast is done so in a manner that is determined to be of a personal interest and/or of a contextual relevance for each corresponding individual user according to their preference and profile settings and/or current contextual circumstances.
  • the disclosed embodiments include, for example, systems and methods for configuring and/or utilizing a botcast of media content customized for a particular user. These embodiments include systems identifying and analyzing selected content to include in a botcast for a particular user based on one or more profile or preference settings associated with the particular user (which may include contextual circumstances associated with the particular user), as well as for generating a transition associated with the selected content that is personalized to the particular user and that will be assembled into the botcast with the selected content to identify the relevance of the content to the user.
  • the transition includes audio that is supplementary to the selected content and includes at least one of an identification of a relevance of the selected content to the particular user based on the one or more profile or preference settings associated with the particular user, and/or a summary of the selected content that is formatted in a selected summary format that is selected from a plurality of available different summary formats, each of which is based on the one or more profile or preference settings associated with the particular user and which setting may include contextual circumstances associated with the particular user.
  • the disclosed embodiments also include systems sequencing the selected content with the transition sequence into a playback sequence that is used by media players for presenting and/or rendering the botcast transition(s) and content in the ordered sequence. This sequencing may also be based on the profile/preference settings and/or contextual circumstances associated with the particular user.
  • the disclosed embodiments also include systems formatting and/or otherwise linking or assembling the selected content and transition(s) into a digital structure comprising the botcast and with the selected content and transition(s) being stored in an audio format or audio playable format that is selected from a plurality of different audio/audio playable formats, the audio/audio playable format being selected from the plurality of different formats based on the one or more profile or preference settings associated with the particular user and which may include contextual circumstances associated with the particular user.
  • Some embodiments also include providing the botcast to an audio player that renders the botcast (including the selected content and transition(s)) according to the ordering of the botcast sequence and in the format selected for the botcast, based on the particular user settings/circumstances.
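The overall flow described above (selecting content by relevance, generating a personalized transition for each item, and sequencing transitions with content for playback) can be sketched minimally as follows. All names here (Item, make_transition, assemble_botcast) are illustrative assumptions, not identifiers from the disclosure:

```python
from dataclasses import dataclass

@dataclass
class Item:
    """A selected piece of content plus its determined relevance."""
    title: str
    relevance: float  # derived from the user's profile/preference data

def make_transition(item: Item, user: str) -> str:
    # A transition is supplementary audio identifying why the content
    # is relevant to this particular user.
    return f"Next up for {user}: '{item.title}', selected for you."

def assemble_botcast(user: str, items: list[Item]) -> list[str]:
    """Sequence each transition immediately before its content,
    ordering the items from most to least relevant."""
    ordered = sorted(items, key=lambda i: i.relevance, reverse=True)
    playback = []
    for item in ordered:
        playback.append(make_transition(item, user))
        playback.append(item.title)
    return playback

playlist = assemble_botcast("alice", [Item("Weather brief", 0.4),
                                      Item("Project update", 0.9)])
# Highest-relevance content plays first, each item preceded by its transition.
```

In an actual implementation the playback entries would be audio segments or links to media rather than strings; the sketch only shows the transition-before-content sequencing.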
  • Other embodiments also include modifying the botcast for the same user and/or for different users, based on detecting different dynamic changes to user settings, available content and/or contextual circumstances for the user/different users.
  • modifications can include adding new content and/or removing content from the botcasts, based on a detected increase/decrease in relevance of the content, respectively, for the different users.
  • the modifications can also include reordering/resequencing of the content and/or transitions in the botcast, reformatting of the content/transitions into different playback audio formats, and/or creating, augmenting, deleting, or otherwise modifying the transitions/content in the botcast.
  • Yet other embodiments include storing, publishing and/or playing the botcasts.
  • Figure 1 illustrates a computing environment in which a computing system incorporates and/or is utilized to perform disclosed aspects of the disclosed embodiments for configuring, generating and/or otherwise utilizing personalized botcasts.
  • Figure 2 illustrates an embodiment of system model components and a flow corresponding to the overall generation, configuration and/or utilization of personalized botcasts.
  • Figures 3A-3C illustrate various example embodiments of user interfaces for selecting content for personalized botcasts.
  • Figure 4 illustrates an embodiment of a user interface for selecting segmented content that has been segmented for selection and inclusion into personalized botcasts.
  • Figures 5-7 illustrate various example embodiments of user interfaces for selecting content for a personalized botcast.
  • Figures 8-10 illustrate various example embodiments of user interfaces and interface components for selecting content for personalized botcasts.
  • Figure 11 illustrates an embodiment of a botcast user interface media player that is showing different personalized botcasts and their current statuses.
  • Figure 12 illustrates an embodiment of a botcast user interface media player that is currently playing a personalized botcast.
  • Figure 13 illustrates one embodiment of a flow diagram having a plurality of acts associated with methods for configuring, generating and/or utilizing personalized botcasts.
  • Disclosed embodiments include methods, systems, products, and devices for identifying, accessing and/or for presenting media in a personalized/customized format for individual users. These embodiments include methods, systems, and devices for identifying, accessing, filtering, augmenting, customizing, personalizing and/or otherwise modifying media content with corresponding content transitions for user consumption in the form of a botcast.
  • the disclosed botcasts comprise files of selected media content from one or more media sources, as well as transitions that are created for the different media content contained in the botcast.
  • the media content and transitions in the botcasts are selected, created, sequenced and/or formatted in a customized/personalized manner based on one or more user profile or preference settings and contextual circumstances associated with the particular user/users for whom the botcasts are created.
  • the technical benefits include improved efficiency in identifying, assembling, and formatting content of a variety of underlying formats for playback and consumption by different users that have different system capabilities that may not initially be operable to render/play all of the underlying formats in a desired playback format.
  • Figure 1 illustrates components of a computing system 110 which may include and/or be used to implement aspects of the disclosed invention.
  • the computing system 110 is illustrated as being incorporated within a broader computing environment 100 that also includes one or more remote system(s) 120 communicatively connected to the computing system 110 through a network 130 (e.g., the Internet, cloud, or other network connection(s)).
  • the remote system(s) 120 comprise one or more processor(s) 122 and one or more computer-executable instruction(s) stored in corresponding hardware storage device(s) 124, for facilitating processing/functionality at the remote system(s), such as when the computing system 110 is distributed to include remote system(s) 120.
  • the computing system 110 incorporates and/or utilizes various components that enable the disclosed functionality for configuring and utilizing botcasts and other similar products/structures.
  • the functionality and processing performed by and/or incorporated into the computing system 110 and the corresponding computing system components includes, but is not limited to, identifying, accessing, filtering, augmenting, customizing, personalizing and/or otherwise modifying media content for user consumption in the form of a botcast or other similar product/structure comprising audio content and corresponding transitions personalized for individual users.
  • the disclosed functionality also includes generating and/or obtaining training data and training the disclosed models to perform the underlying and described functionality of each of the disclosed models and which functionality will be described in more detail throughout this disclosure.
  • Figure 1 illustrates the computing system 110 and some of the referenced components that incorporate and/or that enable the disclosed functionalities.
  • the computing system 110 is shown to include one or more processor(s) 112 (such as one or more hardware processor(s)) and a storage 140 (i.e., hardware storage device(s)) storing computer-executable instructions 118, wherein the storage 140 is able to house any number of data types and any number of computer-executable instructions 118 by which the computing system 110 is configured to implement one or more aspects of the disclosed embodiments when the computer-executable instructions 118 are executed by the one or more processor(s) 112.
  • the computing system 110 is also shown to include one or more user interface(s) 114 and input/output (I/O) device(s) 116.
  • the one or more user interface(s) 114 and input/output (I/O) device(s) 116 include, but are not limited to, speakers, microphones, vocoders, display devices, browsers, media players and application displays and controls for receiving and displaying/rendering user inputs and for accessing, selecting, formatting media content and for configuring and utilizing the disclosed botcasts.
  • the user inputs received and processed by the user interface(s) 114 include user inputs for identifying and/or generating the referenced user preferences and profiles (160), for identifying or selecting media content to include in botcasts and for selecting and playing botcasts, as will be described in more detail throughout this disclosure.
  • interface menus with selectable control elements, stand-alone control objects and other control features are also included with the interface(s) 114, including the menus, icons, controls, and other objects described in reference to Figures 3A-12, and which are configured for receiving user input that is operable, when received, to trigger access to the referenced functionalities associated with configuring and utilizing the disclosed botcasts.
  • the term botcast should be broadly interpreted to comprise a data structure or file that includes or operably links to media content and corresponding transitions created for the different media content in a sequenced ordering for sequenced playback in an audio format.
  • the media content is preferably formatted into an audio format.
  • some botcasts are configured with media and/or links to other media content that is not in a pure audio format (e.g., in a format comprising any combination of audio, text, image and/or video formatting) and which is accessed in real-time, during playback or rendering of the botcast, and transformed into audio for playback by the media player and/or systems used to access and render the botcast.
  • the botcast may include or comprise a listing or links to media content to be rendered, along with instructions for how the media is to be transformed and/or rendered during playback, as well as the sequencing for playback.
  • the botcasts also include transitions, created according to the disclosed embodiments, for providing relevance or summaries of the content that is played. In some instances, the transitions are prepended to the content they correspond to. In other instances, they are separate structures that are sequenced/linked to play prior to the content they correspond to.
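The botcast structure described above (ordered entries that link to media, carry transformation/playback instructions, and pair each transition with the content it precedes) could be sketched as a simple data structure; the class and field names here are illustrative assumptions, not terms from the disclosure:

```python
from dataclasses import dataclass, field

@dataclass
class BotcastEntry:
    """One sequenced unit: a transition linked to the content it precedes."""
    transition_text: str  # relevance note or summary, rendered as audio
    content_uri: str      # link to media content (audio, text, video, ...)
    transform: str        # e.g. "tts" for non-audio media converted at playback
    duration_s: float = 0.0

@dataclass
class Botcast:
    """Digital structure: ordered entries plus per-user playback format."""
    user_id: str
    audio_format: str  # selected based on the user's profile/preferences
    entries: list[BotcastEntry] = field(default_factory=list)

    def playback_sequence(self) -> list[str]:
        """Flatten to the ordered sequence a media player would render,
        with each transition playing before its corresponding content."""
        seq = []
        for e in self.entries:
            seq.append(e.transition_text)
            seq.append(e.content_uri)
        return seq
```

A media player consuming this structure would resolve each `content_uri`, apply the `transform` instruction where the linked media is not already audio, and render the flattened sequence in order.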
  • all of the botcast media content, transitions, links, sequencing instructions, playback instructions, playback status and access event records are included in the referenced botcast content 150, which is stored in storage 140.
  • the botcast content 150 also includes a plurality of different botcasts corresponding to a single user or a plurality of different users. These botcasts can be stored and managed over time in a static state, without change to the content or the transitions. Additionally, in some instances, the botcasts are stored in a dynamic and modifiable format, such as when they are modified in view of different detected circumstances or user preferences and profiles 160, such that the botcasts can operate as a dynamic streaming channel, for example.
  • the storage 140 is configured to store the referenced user preferences and profiles 160 along with the botcasts, although they may be stored in different containers or locations of the storage 140.
  • the data included in the user preferences and profiles 160 is also sometimes referred to herein as preference and profile data or more simply as profile data.
  • This profile data includes, but is not limited to, any combination of (i) language, prosody and/or other speech-related profile preferences, which reflect one or more preferential languages for particular users, (ii) calendar and schedule information associated with one or more calendars and scheduled events of particular users or user contacts, (iii) user account login and credential(s) for accessing one or more applications, media content resources, user calendar and meeting event interfaces, and/or other files and resources, (iv) learned or entered user preferences for topics of interest, hobbies, resources of media content, navigation histories and/or media playback preferences, (v) user device capabilities and characteristics, device logs and/or device usage histories, (vi) contacts, contact information, titles, responsibilities, memberships and/or assigned tasks or projects, (vii) location and address information associated with the users’ residences, workplaces, places of travel, and/or other location information, (viii) tagged content and other explicitly
  • the profile data will also include certain circumstantial and contextual information that is determined to be relevant to the corresponding users and/or that can be used to determine said relevance.
  • the circumstantial and contextual information of the profile data may include, but is not limited to (i) weather and environmental conditions associated with user locations or events, (ii) meeting schedules, locations, attendees, participants, materials, credentials, access information, etc., (iii) current status, activities, map, travel and logistical information associated with the user, user events and other user contacts, (iv) reports, tasks, participants, equipment, progress and status information associated with projects or events associated with the users and user contacts, (v) and/or any other contextual information associated with circumstantially relevant environments, events and/or other conditions associated with the users and their other profile data.
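The enumerated profile and contextual categories above could be grouped into a simple record structure consulted during content selection. This is a minimal sketch; the class and field names (UserProfile, ContextSnapshot, keywords) are assumptions for illustration only:

```python
from dataclasses import dataclass, field

@dataclass
class ContextSnapshot:
    """Circumstantial data that can shift content relevance over time."""
    weather: str = ""
    current_activity: str = ""
    upcoming_meetings: list[str] = field(default_factory=list)

@dataclass
class UserProfile:
    """Preference and profile data consulted when selecting content."""
    languages: list[str] = field(default_factory=list)
    topics_of_interest: list[str] = field(default_factory=list)
    device_capabilities: dict = field(default_factory=dict)
    context: ContextSnapshot = field(default_factory=ContextSnapshot)

    def keywords(self) -> set[str]:
        """Terms used to match candidate content against this profile."""
        return set(self.topics_of_interest) | set(self.context.upcoming_meetings)
```

In practice such profile data may be consolidated or distributed across storage locations, as the disclosure notes; the sketch shows only how static preferences and dynamic context can feed a common keyword set for relevance matching.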
  • profile data can take different forms and can be consolidated or distributed.
  • This profile data can be stored, for instance, in one or more different data structures, with one or more different formats in local storage 140.
  • These data structures can also include links or pointers to other profile/preference data that is stored separately within storage 140, such as in different directories or domains, and/or remotely from storage 140.
  • the term entity may be viewed as broader than the term user when the term user is interpreted as a single individual person.
  • the term entity can apply to an entire enterprise or group composed of a plurality of individual people when that group is collectively associated with the organization, or another common affiliation.
  • each individual person of the collective entity may have their own profile data set stored in the preferences and profiles 160, as well as having an additional shared set of profile data that includes and/or that at least identifies common individual preferences and profiles for each of the individuals in the collective entity.
  • the system will create, access, update, store and/or otherwise utilize a separate corresponding set of profile data for each of the different user entities (e.g., groups, organizations and affiliations), each of which, for example, may specify correspondingly different roles, circumstances, contextual and/or other profile data for each of the different users within those different groups, organizations and affiliations and which may impact relative relevance of content to the different users and/or for the overall entities.
  • storage 140 is shown as a single storage unit. However, it will be appreciated that the storage 140 is, in some embodiments, a distributed storage that is distributed to several separate and sometimes remote systems 120.
  • system 110 will comprise a distributed system, in some embodiments, with one or more of the system 110 components being maintained/run by different discrete systems that are remote from each other and that each perform different tasks. In some instances, a plurality of distributed systems performs similar and/or shared tasks for implementing the disclosed functionality, such as in a distributed cloud environment.
  • storage 140 is also configured to store one or more of the following: content selection interfaces and controls 170, as well as the different machine learning or machine learned models (ML models 180) used to implement the functionality described herein, and which will now be described in more detail.
  • the content selection interfaces and controls 170 include, but are not limited to, menus and interfaces that display or reference content, interface tools and objects that enable a user to flag, tag or otherwise identify and select content for inclusion in a botcast and/or for botcast processing, interfaces and tools for parsing content and for segmenting content, interfaces, menus and tools for entering and/or selecting profile data, as well as interfaces and tools for selecting botcast media files to play and for playing/rendering and sharing or distributing the botcast media files.
  • Some non-limiting examples of these content selection interfaces and controls 170 are reflected in Figures 3A-12.
  • the ML models 180 that are utilized by the systems described herein to implement the disclosed functionality include, but are not limited to, a content selector model, a sequencer model, a transition model, a formatting model, a modification model, and a presentation model. Each of these models is trained with training data to perform the different functions attributed to them. These functions and additional descriptions of the different ML models 180 will now be provided in reference to Figure 2.
  • Figure 2 illustrates various components and elements associated with the botcast systems referenced above, as well as a general flow and organization of processing that occurs by the various elements and ML models to implement the functionality for configuring and utilizing the botcasts described herein.
  • the content selector model accesses various content resources to identify relevant content and to select the content that is determined to be relevant to one or more users based on the profile data for those different users.
  • the content selector model can operate automatically, based on the various profile data, and based on referencing directories of resources for the potential content it already has stored in storage 140.
  • the content selector can also respond to new and direct user input that identifies the content to be selected.
  • the content selector will utilize the aforementioned content selector interfaces and controls 170. Although the content selector interfaces and controls 170 are not shown in Figure 2, it will be appreciated that they can still be utilized by the content selector, as well as by each of the other referenced and described ML models, to interface with the users, content and botcasts, to perform the described functionality.
  • Some non-limiting examples of utilizing the content selector interfaces and controls 170 for receiving explicit user input for selecting the content are described in reference to Figures 3A-10, and some non-limiting examples of utilizing the content selector interfaces and controls 170 for interfacing with the botcasts (botcast media files) are shown in Figures 11 and 12.
  • the content selector is trained to search/crawl target resources for content that is determined to be relevant to the users based on their user profile data and/or based on newly received user input, which user input, when received is also used to update the user profile data.
  • the content selector can perform both of these functions and is trained with training data to perform these functions. Details about how an ML model is trained will not be provided at this time, as training ML models is well known to those in the industry.
  • the content selector can be trained with keyword and content source pairs, as well as keyword and resource content pairings, to learn associations and relevance of keywords to particular content sources/resources, as well as discrete segments or portions of different resources.
  • training data sets include pairings in which the paired content comprises text, audio, images, video and/or mixed media that includes combinations of text, audio, images and/or video, along with characterization terms for the associated content.
  • the training data sets also include combinations of different media (e.g., text, audio, video, images) along with different terms, contexts or other characterizing information for the content that can be used to train the content selector model to characterize content and the relevance of the content to particular terms and contexts, such as the terms and contexts that are obtained from the user’s profile data.
  • the training data sets further include combinations of characterizations and content that enable the training of the content selector to perform analysis and processing of all types of content to determine possible characterizations of the content and potential relevance to the user’s profile data.
  • These different types of processing include but are not limited to combinations of natural language processing and analysis, image processing and analysis, context analysis, video analysis, body language analysis, intonation analysis, emotional processing, concept inclusion analysis and/or other processing and analysis of underlying content that enable the characterizations and relative importance of the content to corresponding user preferences and contexts.
  • the training data sets also include keywords in the pairings that are reflective of or that correspond to the user profile data, such as terms, words, ideas, concepts, speech elements, media preferences, events, projects, tasks, locations, names and/or other elements that may be included in the user profile data.
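The keyword/content pairings described above could be represented as simple labeled records, with relevance arising from overlap between learned characterizations and terms drawn from the user's profile data. This is a toy illustration; the record layout and the function name are assumptions, and a trained model would learn such associations rather than use literal set intersection:

```python
# Each training example pairs a piece of content (of any media type) with
# the characterization terms the model should learn to associate with it.
training_pairs = [
    {"content": "transcript: quarterly sales grew twelve percent ...",
     "media_type": "audio",
     "characterizations": ["sales", "quarterly report", "finance"]},
    {"content": "diagram of a turbine assembly",
     "media_type": "image",
     "characterizations": ["engineering", "turbine", "assembly"]},
]

def matches_profile(pair: dict, profile_keywords: set[str]) -> bool:
    """Naive stand-in for the learned association: content is relevant
    when its characterizations overlap the user's profile keywords."""
    return bool(set(pair["characterizations"]) & profile_keywords)
```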
  • the content selector model is trained to identify sources and resources that may be determined to be relevant to each particular user, based on correlations between determined characterizations of the content being considered and the terms/characterizations of the user preferences, profiles and contexts referenced in the user profile data.
  • the content selector model is also trained to prioritize or weight different elements of the profile data, as well as to determine significance and/or relevance of content to the user, based on the prioritized/weighted profile data elements of each user, and to thereby be able to determine the potential relevance of content that is discovered when the content selector identifies different content that may be of interest or that may be relevant to each user.
  • This training and real-time analysis can be done at different levels of granularity, to determine the relevance of an entire resource (e.g., document or file) to a user, as well as to determine the relative relevance of different discrete segments within the resource to the user. For instance, if the content/resource identified by the content selector is an entire paper or book, the content selector model can determine a collective relevance of the entire resource, as well as the discrete segment relevance for each chapter, paragraph, or other portion of the content.
  • the determined relevance can be categorized by a numerical scale (e.g., 0-100) or by tier (e.g., High, Medium, Low) or by temporal relevance to impending events (e.g., very urgent, urgent, low urgency, no urgency, such as based on a corresponding urgency scale of event scheduled now, in next hour, next day, next month, respectively, etc.).
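The numerical-scale, tier, and temporal categorizations just described can be sketched as simple mapping functions. The exact cutoffs below are illustrative assumptions; the disclosure specifies only the scale (0-100), the tiers (High/Medium/Low), and the urgency levels keyed to event timing:

```python
from datetime import datetime, timedelta

def relevance_tier(score: float) -> str:
    """Map a 0-100 relevance score onto High/Medium/Low tiers
    (assumed cutoffs of 70 and 40)."""
    if score >= 70:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"

def urgency(event_time: datetime, now: datetime) -> str:
    """Temporal relevance based on how soon the related event occurs:
    now -> very urgent, next hour -> urgent, next day -> low urgency,
    further out (e.g. next month) -> no urgency."""
    delta = event_time - now
    if delta <= timedelta(0):
        return "very urgent"
    if delta <= timedelta(hours=1):
        return "urgent"
    if delta <= timedelta(days=1):
        return "low urgency"
    return "no urgency"
```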
  • the relevance can also be categorized by prevalence of discovered content that is similar or redundant (e.g., many articles, resources, podcasts, broadcasts, or other resources covering a common or similar topic/issue, as well as emails, voicemails or other communications that are determined to be directed to a common task, event, project, person, or topic, etc.).
  • the relevance can also be a relative relevance, rather than a relevance of magnitude or category per se.
  • the different items of content identified for a user for a particular botcast can be sorted in an ordering of relative relevance (e.g., reference C is more relevant than reference B, but less relevant than reference A, etc.)
  • the content selector model is trained to identify a fixed number or quantity of content resources to use for the botcast. In other instances, the content selector model is trained to incorporate all content resources that meet a predetermined relevance threshold (e.g., all urgent content items and/or all references having a relevance of at least high relevance and/or all references having a relevance score of at least n-value, wherein n is a value in the range of 0 to a max value).
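The threshold-based selection described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the 0-100 scoring scale, urgency tier names, and field names are assumptions chosen for demonstration:

```python
# Minimal sketch of threshold-based content selection: keep all urgent
# items plus any item whose relevance score meets the n-value threshold.
URGENT_TIERS = {"very urgent", "urgent"}

def select_content(candidates, min_score=70):
    """Keep urgent items and items meeting the score threshold."""
    selected = []
    for item in candidates:
        if item["urgency"] in URGENT_TIERS or item["score"] >= min_score:
            selected.append(item)
    # Sort by score so the most relevant content leads the botcast.
    return sorted(selected, key=lambda i: i["score"], reverse=True)

candidates = [
    {"id": "A", "score": 90, "urgency": "no urgency"},
    {"id": "B", "score": 40, "urgency": "urgent"},
    {"id": "C", "score": 55, "urgency": "low urgency"},
]
print([i["id"] for i in select_content(candidates)])  # ['A', 'B']
```

Item B is retained despite its low score because it is tagged urgent, illustrating how the urgency tier and the numeric threshold can combine.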
  • the content selector is also trained to parse and analyze the content/resources identified from the different sources to segment the content into different portions that are independently evaluated for determined relevance. In some instances, the relevance of each segment is further determined based on user input. For instance, once the content selector model parses and segments the content into a plurality of segments, the content selector will cause the segments to be presented to the user for further selection of segments that are of interest to the user.
  • a user first selects File/Resource 2 to access the specific segments of the File/Resource 2 that are of interest (i.e., segments 1, 3, 5 and 6), while leaving segments 2 and 4 unselected, such that the botcast will include the selected segments and omit the unselected segments from a base selected file.
  • the user can specifically identify the discrete segments/portions of a resource that are of most interest to the user for inclusion into the user’s botcast, without requiring the user to have to listen to all of the different segments/portions of the resource.
  • the content selector model automatically and independently determines which segments are sufficiently relevant to include in the botcast for the user.
  • the determined relevance of each content resource and/or resource segment is tracked and stored, in some instances, in a table or other data structure within the botcast content 150 mentioned in reference to Figure 1.
  • a separate relevance tracking table can be maintained separately for each user and/or each resource and/or collectively for a plurality of different users and/or resources.
  • the content selector model (and other models) will reference the relevance tracking table(s) during implementation to identify content for inclusion in the botcasts, to identify transitions to use for the content and to update and calculate the relative relevance of other content to be included or excluded or modified for the different botcasts for each particular user.
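A relevance tracking table of the kind described above might be sketched as follows; storing rows in a plain dict keyed by (user, resource, segment) is an assumption for illustration, as are the method and field names:

```python
# Hedged sketch of a per-user relevance tracking table that the models
# could consult when selecting or resequencing botcast content.
class RelevanceTable:
    def __init__(self):
        self._rows = {}  # (user, resource, segment) -> relevance score

    def record(self, user, resource, segment, score):
        self._rows[(user, resource, segment)] = score

    def for_user(self, user):
        """Return this user's tracked segments, most relevant first."""
        rows = [(res, seg, s) for (u, res, seg), s in self._rows.items()
                if u == user]
        return sorted(rows, key=lambda r: r[2], reverse=True)

table = RelevanceTable()
table.record("alice", "file2", 1, 88)
table.record("alice", "file2", 3, 95)
table.record("bob", "file2", 1, 20)
print(table.for_user("alice"))  # [('file2', 3, 95), ('file2', 1, 88)]
```

Because rows are keyed per user, the same structure can serve a single user's table or a collective table spanning many users and resources.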
  • the content selector model is also trained to identify credentials, tokens and other verification and authentication information from the user profile data and to, likewise, identify requirements by the different content sources for verifying/authenticating the user to obtain access to the resources contained in the different content sources.
  • the content selector model is enabled to access content from individual user accounts, such as personal email, voicemail, video conference, text and other personal communication applications and accounts, as well as from enterprise directories and databases associated with the user.
  • the content sources can include any public and/or private source of content that may be relevant to the users.
  • sources such as the world wide web (Internet), as well as enterprise systems, private and public databases, broadcast services, location services, as well as specific applications like email applications, event, task and project scheduling applications, personal and business calendar applications, gaming applications, communication applications, location service applications, weather applications, and so forth.
  • the content selector model, once trained, can also use any new user input for tuning or refining the training of the content selector model. For instance, user input that explicitly identifies content to include or exclude from a botcast, or that provides new profile data and/or that is used to trigger the playing, sharing, or modifying of a botcast by the user can be used to update the stored profile data and to impact which relevance determinations are made for selecting the content to include in the botcasts.
  • the content selector model will be trained to initiate their functionality/processing directly in response to a user input that explicitly requests the processing, such as the input entered from a user to create a botcast for the user (e.g., a selection of a menu item to create a botcast - prior to or subsequent to selecting the content for the botcast - and which will trigger processes for creating the botcast from previously selected content and/or for the user to manually select the content for the botcast and/or for the system to automatically identify and select the content for the botcast).
  • the user input for triggering the selection of content and/or for triggering the creation of a botcast includes a user selection of a botcast control icon presented in an application that generates content (e.g., icon described in Figures 8-10), which when selected, triggers the creation of a botcast for content provided by the application.
  • Other explicit user input for selecting content and/or for triggering the botcast processing/functionality includes a user indicating that highlighted or tagged portions of content should be included in a botcast.
  • user input for triggering disclosed functionalities is more indirect and not as isolated, such as user input that is received incrementally/periodically over a duration of time, such as when a user incrementally/collectively selects a predetermined quantity of content resources or content to include in a botcast.
  • the system will trigger the processes required to create and configure the botcast based on the selected content.
  • the content selector and other models perform their functionality automatically according to predetermined schedules (e.g., once a day, once a week, etc.) and/or in response to detecting certain events independent of user input (e.g., detecting an urgent communication directed to the user, detecting a meeting or other event has concluded or is about to take place, in response to detecting a broadcast or notification regarding content from a third party, detecting numerous new resources of content, etc.).
  • Each of these events can be a triggering condition.
  • each of the triggering conditions can be a triggering condition for selecting content and/or for more broadly generating a new botcast for a particular user and/or modifying an existing botcast for a particular user.
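The triggering conditions above can be sketched as a simple dispatch; the event names and the create-versus-modify rule here are illustrative assumptions rather than the patent's actual trigger set:

```python
# Illustrative sketch of triggering conditions mapped to botcast actions:
# a recognized trigger either creates a new botcast for the user or
# modifies the user's existing one.
def handle_trigger(event, user_has_botcast):
    """Decide whether a trigger creates or modifies a botcast."""
    triggering = {"urgent_communication", "meeting_concluded",
                  "scheduled_refresh", "new_content_detected"}
    if event not in triggering:
        return "ignore"
    return "modify_botcast" if user_has_botcast else "create_botcast"

print(handle_trigger("meeting_concluded", user_has_botcast=False))
print(handle_trigger("urgent_communication", user_has_botcast=True))
print(handle_trigger("keypress", user_has_botcast=True))
```

Scheduled refreshes (e.g., a daily run) and event-driven triggers (e.g., an urgent communication) funnel through the same decision point.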
  • the computing system will automatically access the one or more profile or preference settings of the particular user, which is stored in memory and/or other storage, in response to any of the triggering events or conditions described herein, which may reflect or comprise a trigger for generating or modifying a botcast for the particular user;
  • the content selector is enabled to identify and select the content for user botcasts that is determined to be sufficiently relevant to the corresponding and particular user(s) that the botcasts are being created or modified for.
  • the content selector then obtains the selected content, from memory and/or other storage, which may include entire resources and/or discrete subsets or segments of the resources, and which may be configured as different types and in different formats, including various forms of text data, audio data, image data, video data, and/or mixed-media data.
  • the selected content is processed by the sequencer model, transition model and the formatting model to create the final botcasts (i.e., botcast media file(s) which comprise formatted audio files and/or other file structures that are configured to be processed and rendered as audio).
  • the configured botcasts are then further utilized by the modification model that modifies the botcasts and/or by the presentation model that presents the botcasts for publication, storage and/or audio play.
  • Each of the foregoing models is trained with training data to perform its functionalities, as generally described above with reference to the content selector model.
  • the training and training data sets may be unique to each model and/or may be shared between the different models.
  • the sequencer model is trained to sequence different portions of the selected content into sequences based on relative importance/relevance.
  • the training data includes different pairings of relevance valuations with sequence and ordering rules.
  • the sequencer model is also trained with pairings of relevance valuations with circumstantial and contextual information to re-evaluate/modify relevance valuations.
  • the trained sequencer model is enabled to evaluate user profile data (including new and dynamic context/circumstances) to re-evaluate the relevance valuations of the different selected content, based on the new and dynamic context/circumstances and/or other updated profile data, subsequent to the initial identification of the selected content. This can be important, particularly when the selected content is identified over a lengthy period of time and/or before a significant new contextual event relevant to the user has occurred.
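The re-evaluation step can be sketched as re-scoring previously selected segments against fresh context and then ordering them for playback. The boost rule below (bumping items that mention an impending event) is an assumption chosen to illustrate the idea:

```python
# Sketch of the sequencing step: adjust each segment's stored relevance
# score using new contextual signals, then order segments for playback.
def resequence(segments, impending_topics):
    def adjusted(seg):
        boost = 25 if seg["topic"] in impending_topics else 0
        return seg["score"] + boost
    return sorted(segments, key=adjusted, reverse=True)

segments = [
    {"id": "s1", "topic": "budget", "score": 80},
    {"id": "s2", "topic": "launch", "score": 65},
]
# "launch" is now imminent, so s2 overtakes s1 (65 + 25 = 90 > 80).
print([s["id"] for s in resequence(segments, {"launch"})])  # ['s2', 's1']
```

With no impending topics, the original relevance ordering is preserved; a new contextual event can reorder content selected days earlier.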
  • the transition model creates transitions for each of the different content segments/resources that are to be included in the botcast for the user.
  • the transition model is trained with various training data sets, to accommodate different needs and preferences, and to perform desired analysis of the selected content and to generate transitions for the content that summarize the context of the content and/or to clarify the relevance of the content to the user in a personalized manner for the user, based on different preference and profile settings of the user.
  • the analysis and processing performed by the transition model on the content includes, but is not limited to, any combination of natural language processing and analysis, image processing and analysis, context analysis, video analysis, body language analysis, intonation analysis, emotional processing, concept inclusion analysis and/or other processing and analysis of underlying content (e.g., text, audio, video, images) to determine context and/or other characterizing information of the content that can be used to characterize relevance of the content to the user.
  • the transition model evaluates metadata associated with the content to obtain information associated with timing, location, participants and/or other contextual information related to the content.
  • Any combination of the foregoing processing is performed by the transition model during implementation, as appropriate for the corresponding types of content, to identify key terms, key phrases, authors, instructions, tasks, intentions, concepts, summaries, participants, deadlines, events, dates, times, emphasis, and emotional tones for the underlying content.
  • the transition model obtains results of such analysis from the content selector model if the content selector model already performed such an analysis on the content.
  • the transition model is enabled to perform functionality for identifying context and relative relevance of the selected content to a user and to summarize this context/relevance in a message that is spoken to the user.
  • the transition, which prepends the corresponding content, can be helpful for enabling a user to selectively decide whether to listen to the actual content or not. Such a transition can also be helpful to clarify what the user is listening to and to prepare the user for changes in the current content from previous content in the botcast that may be related to completely different topics or concepts.
  • the transition can include an audio message that identifies context or relevance of the meeting for the particular user to prepare the user for what they are going to hear.
  • the transition for the meeting might summarize the context/relevance of the meeting for a particular user in the following way: “This is where your boss instructed you to perform task A during the all-hands meeting on date Y,” or “In the meeting last Thursday, you presented your summary of the project progress to the group. Meeting attendees included persons A, B, C . . .,” or “Person A said this to you and about you in the email they referenced during the meeting.”
  • various transition messages can be created, depending on the underlying content and the determined relevance/context to the particular user.
  • the messages may convey information spoken by or about the user, temporal information associated with a scheduled calendar event associated with the particular user, and/or any other relevant information.
  • the transition model can perform its functionality for each of the different entities/individuals separately and can configure/utilize different versions of a similar botcast having similar or the same underlying selected content. For instance, when an enterprise has a botcast created with selected content applicable to a plurality of different members, the transition model will still customize/personalize a plurality of discrete botcast versions of the same underlying botcast, with the same underlying content, but different customized/personalized transitions for each individual that personalize/customize how the content is summarized to the individual and/or how the relevance of the content to each individual is characterized.
  • one way in which transitions are customized and/or personalized includes generating the speaking style with a language, prosody, timbre and/or other speaking attribute that is preferred by the different users to use for the summary of the content in the transition, which is selected from a plurality of different summary formats based on profile or preference settings of the different users.
  • This speaking attribute information, or profile data that can be used to ascertain the preferred speaker attributes, is included in the user profile data mentioned previously.
  • the prosody and language selected for presenting the transition messages and/or that is selected for presenting content that is converted into an audio format can be automatically and/or manually identified/selected from a plurality of different available prosodies and languages, based on preconfigured settings, or based on contextual environmental/user conditions (e.g., detected user, geography, mood, or other user profile data) and/or based on explicit user input and/or system settings.
  • the transition summary/message format and prosody that is used, which is selected from a plurality of different summary/message formats, will be based on different prosody style attributes, such as speaking styles of a particular target speaker and/or a style mode of speaking (e.g., John Wayne or western style, storytelling style, news reporting style, comedic style, intellectual style, etc.).
  • Some prosody attributes associated with the prosody style also include typical human-expressed emotions such as a happy emotion, a sad emotion, an excited emotion, a nervous emotion, or other emotion.
  • a particular speaker is feeling a particular emotion and thus the way the speaker talks is affected by the particular emotion in ways that would indicate to a listener that the speaker is feeling such an emotion.
  • a speaker who is feeling angry may speak in a highly energized manner, at a loud volume, and/or in truncated speech.
  • a speaker may wish to convey a particular emotion to an audience, wherein the speaker will consciously choose to speak in a certain manner.
  • a speaker may wish to instill a sense of awe into an audience and will speak in a hushed, reverent tone with slower, smoother speech.
  • the prosody styles are not further categorized or defined by descriptive identifiers.
  • the disclosed systems and models are configured to select a prosody/style of speech, from a plurality of available prosodies/styles/languages, based on the user profile data that reflects a relevance of a particular prosody/style to the user and/or the content.
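The style selection just described can be sketched as a lookup against the user's profile data; the style names, profile fields, and the mood-based fallback below are illustrative assumptions:

```python
# Hedged sketch of selecting a prosody/style of speech from a plurality
# of available styles, based on user profile data.
AVAILABLE_STYLES = ["news reporting", "storytelling", "comedic", "western"]

def pick_prosody(profile, default="news reporting"):
    preferred = profile.get("preferred_style")
    if preferred in AVAILABLE_STYLES:
        return preferred
    # Fall back on a contextual hint (e.g., detected mood) before the default.
    if profile.get("mood") == "relaxed":
        return "storytelling"
    return default

print(pick_prosody({"preferred_style": "comedic"}))  # comedic
print(pick_prosody({"mood": "relaxed"}))             # storytelling
print(pick_prosody({}))                              # news reporting
```

An explicit preference wins over contextual conditions, mirroring the precedence of explicit user input over inferred profile data described above.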
  • the user profile data also specifies whether music or other background sounds or effects should be used with the transition messages when they are presented. Some effects can be used to emphasize the messages and/or to make the message more pleasant to listen to and/or help the listener know that they are listening to a transition rather than the underlying content.
  • the formatting model is trained to prepend transitions to different content and content segments that they are associated with.
  • This training and processing includes formatting all of the content into a common format (e.g., an audio format or another format that can be processed by the presentation model and/or an audio media player in the form of audio).
  • the formatting model converts the content into a common format. This process may include summarizing the concepts and contexts of the non-audio content into text and then performing natural language processing to generate audio representations of the text. In some instances, this also includes generating audio representations that are formatted to render the audio in a particular language or speaking style. Techniques for converting text to speech are known and will not be described in detail at this time. Importantly, however, the formatting model selects the speaking style to use for the audio representation, from a plurality of available speaking styles available to the formatting model, based on the profile data of the user associated with the botcast for which the transition is to be created.
  • the formatting model will still generate different formatted transitions for each user, with each formatted transition being formatted with a speaking style (e.g., language, prosody, timbre, etc.) and/or presentation style (e.g., with sound effects, with background music, etc.) that is unique to the user relative to other users, based on their unique set of user preferences and profile data.
  • the formatting model creates the botcast media file having the transitions prepended before each of the different corresponding portions of content in the botcast file, as a single integrated audio file.
  • the formatting model also formats the botcast with tags, markers and/or index features that identify the start and stop portions of each transition and corresponding content within the botcast, so as to enable a media player to jump to the different portions of the botcast for playback.
  • the formatting model creates a playlist of the transitions and content segments for the botcast, which are each stored separately, but presented and/or played together in the sequenced ordering during playback.
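The tagged structure described above can be sketched as a playlist builder that prepends each transition to its content segment and records start/stop markers so a media player can jump between items. The durations and dict layout are assumptions for illustration:

```python
# Sketch of assembling a botcast playlist: transitions are prepended to
# their content segments, with start/stop markers for player navigation.
def build_playlist(items):
    """items: list of (transition_secs, content_secs) per segment."""
    playlist, cursor = [], 0.0
    for n, (t_dur, c_dur) in enumerate(items, start=1):
        playlist.append({"type": "transition", "segment": n,
                         "start": cursor, "stop": cursor + t_dur})
        cursor += t_dur
        playlist.append({"type": "content", "segment": n,
                         "start": cursor, "stop": cursor + c_dur})
        cursor += c_dur
    return playlist

pl = build_playlist([(5.0, 60.0), (4.0, 30.0)])
print(pl[2])  # the second segment's transition starts at 65.0 seconds
```

The same marker table works whether the botcast is a single integrated audio file or a set of separately stored segments played back in sequence.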
  • the botcast file is formatted into text, containing text that corresponds to the transition messages and the content and which is converted to audio by the presentation model and/or a media player.
  • the presentation model is trained/configured to present the botcasts to the user for rendering, on demand, through the various interfaces (e.g., Figures 11-12) that the user can query and select the botcast from, and/or by sending the botcast files directly to a user device (e.g., a message containing the botcast file or link that is sent to a user’s device via a messaging application).
  • the presentation model is also configured to enable playback of the botcasts with various playback functionality (e.g., play, pause, fast-forward to next transition, rewind to previous transition, etc.).
  • the modification model is configured and trained to identify user input and updated profile data and to further evaluate user playback of different botcasts, to determine whether to modify the botcasts.
  • the modification model re-performs processing previously performed by the other models to update or modify botcast(s) based on new user input and/or updated profile data.
  • the modification model causes the other models to implement their processes iteratively, relative to the existing botcast and/or with newly discovered content and/or new user profile data, and which causes a change to content and/or the transition(s) in the existing botcast(s).
  • the modification processing causes validating or invalidating of content in the botcasts for particular users.
  • the modification processing may diminish relative relevance of content in a botcast for a particular user based on the new/updated profile data and contextual circumstances of the user(s) and/or based on newly discovered content and/or based on a determination the user has already listened to some of the content.
  • the modification model is configured to cause the botcast to be modified by omitting, resequencing, reformatting, replacing and/or otherwise modifying the content and/or the corresponding transition(s) for the content in the botcast.
  • the modifications merely resequence the content in the botcast by newly determined relevance, with the more urgent/prioritized/relevant content being sequenced before relatively less urgent, prioritized, or relevant content in the botcast.
  • Figures 3A-3B illustrate a content selection UI 300 that can be used by a user to select from a selection of possible resources to identify and select content for a botcast.
  • the listing of resources can be made through a directory of resources (e.g., a user’s database of recordings and other content, for example).
  • the system will, in some instances, query for available resources in a particular domain or directory in response to user input identifying the domain/directory from an initial query entered on an interface to find content for a botcast.
  • the system parses the selected content/resource into a plurality of different segments, as described previously, for subsequent identification/selection. This is shown, for example, in Figure 3C, where a user is presented with segments 1-6 for file/resource 2, subsequent to the user selecting file/resource 2 for inclusion in the user’s botcast. The user is also shown as having now selected segments 1, 3, 5 and 6 for inclusion. In this instance, the botcast will include the content of these segments, while excluding the other segments. If other resources and/or segments are selected, they can also be included in the same or different botcast, according to different embodiments.
  • Figure 4 illustrates another embodiment for selecting content
  • an interface is presented with segments of an underlying resource (e.g., a recording of an “All Hands Meeting” conducted September 10, 2021).
  • This resource, after initial selection, has been segmented and presented to the user for subsequent selection.
  • the segments are further provided with additional details, like duration and speaker information. This information can be used for the transitions and/or merely to help the user determine which segments to select for final inclusion in the botcast.
  • each botcast will be specific to a single resource.
  • a botcast may include content from multiple different resources.
  • Figures 5-7 illustrate additional embodiments for identifying and selecting content.
  • an interface is provided with browser controls that are operable, when selected or interacted with by a user to navigate to content for selection.
  • input fields can also be used to enable a user to specify the address of a particular resource to include for the selection and other botcast processing described herein.
  • Figure 6 illustrates how the selection of content can also include interfaces that present scheduled meeting events and other scheduled events/content (e.g., video conference events, meetings that will be recorded, teleconferences, scheduled releases of publications, etc.). These interfaces may list scheduled meeting or other scheduled events, for example, that are selectable by the user as resources to include for the selection and other botcast processing described herein.
  • Figure 7 illustrates how the selection of content can also include interfaces that present dynamic and streaming content that is available for selection (e.g., website(s), streaming channel(s), and other streaming services/content). These interfaces may list the various dynamic streaming sources as selectable content to undergo the botcast processing described herein.
  • Figure 8 illustrates an embodiment of a web browser showing a webpage of content 800.
  • a user is enabled to select or flag content for inclusion in a botcast and for the referenced processing by selecting a selectable icon (e.g., icon 810) which, when selected, causes the system to identify the webpage that the icon 810 is displayed with for selection by the content selector model.
  • Figure 9 illustrates a related embodiment, in which a portion of a website 900 is highlighted by a user.
  • the system displays a selectable control 910 which, when selected, causes the highlighted portion of content to be selected by the content selector model for the user’s botcast and referenced processing by the ML models.
  • Figure 10 illustrates another embodiment, in which a control 1010 is presented with an application 1000, such as a video conferencing application.
  • the content selector model is caused to identify the video conferencing application 1000 and/or a video conference meeting associated with the conferencing application 1000 to be a source of media to include in the selected content.
  • FIGS 11 and 12 illustrate embodiments of interfaces (1100 and 1200) that display a set of botcasts associated with and configured for a particular user (1100). These botcasts are currently in interface 1100 as files and are displayed with playback statuses showing total duration, dates of creation and a listening progress bar. Titles for the botcasts are also shown, which may be transcribed portions of the transition messages. Various different metadata and status information can be displayed with the botcasts, including dates referenced in the content, storage locations, expiration dates (if any), modification dates (if any), or any other information. Controls can also be provided for sharing, playing, and deleting the botcasts.
  • Interface 1200 shows a playback interface for a botcast that is presented when one of the botcasts in interface 1100 is selected.
  • the interface includes functional buttons which, when selected, enable the user to advance to a previous or subsequent topic (e.g., segment in the botcast - such as when each botcast has multiple topics, and/or to a different botcast - such as when each botcast is focused on a single topic).
  • each content audio file/segment of the botcast(s) will include a separate transition, as previously described, to characterize/summarize the relevance of the content in the file/segment to the user.
  • Figure 13 illustrates a flow diagram 1300 that includes various acts associated with exemplary methods that can be implemented by computing systems, such as computing system 110 of Figure 1.
  • the flow diagram 1300 includes a plurality of acts (act 1310, act 1320, act 1330, act 1340, and act 1350) which are associated with various methods for configuring and/or using a botcast of media content customized for a particular user or entity based on one or more profile or preference settings associated with the particular user (which may include contextual circumstances associated with the particular user/entity), as well as for generating one or more transitions associated with the selected content that is personalized to the particular user/entity and that will be incorporated into the botcast with the selected content to identify the relevance of the content to the user.
  • the first illustrated act includes the system identifying selected content to include in a botcast for a particular user (e.g., a single individual user or, alternatively, an entity comprising a plurality of users).
  • the content selection is based on one or more profile or preference settings associated with the particular user and which may include contextual circumstances or conditions associated with the particular user.
  • the system is configured to use the content selector model to perform this act.
  • the system first identifies the user for which the botcast information is to be identified. Then, the system identifies the corresponding stored profile and preference settings and contextual information associated with that user.
  • identifying the content includes identifying specific content that has been manually or explicitly tagged by the user for use in a botcast through the selection of a botcast content flagging/tagging icon or control for the botcast, for example, or through a user selection or highlighting of content that is identified and/or displayed to the user through one or more content display/identification interfaces, or user input identifying a URL or storage address where the content is located.
  • selecting the content includes automatically identifying the content based on identifying content that is relevant to a user based on discovering information that is associated with upcoming meetings, missed meetings, project events, scheduled events, assigned tasks, preferential resource topics or resources, or other profile/preference data of the user.
  • This information can be automatically identified by evaluating the different profile/preference data and scanning metadata and information from the profile/preference data that references the identified content and by also determining the referenced content is materially relevant or contextually relevant to the user based on the processing and functionality described throughout, based on a user’s profile data.
  • the next illustrated act includes the system generating a transition associated with the selected content that is personalized to the particular user. This act is performed by the transition model described above.
  • the transition includes an audio message that is supplementary to the selected content and that includes at least one of: an identification of the relevance of the selected content to the particular user, based on the one or more profile or preference settings associated with the particular user; and/or a summary of the selected content that is formatted in a summary format selected from a plurality of available summary formats, each of which is based on the one or more profile or preference settings associated with the particular user, which settings may include contextual circumstances associated with the particular user.
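A transition of this kind could be sketched as text composition before any speech synthesis, as below. The `SUMMARY_FORMATS` table, the `generate_transition` helper, and the "brief"/"full" format names are hypothetical stand-ins for the transition model's format selection:

```python
# Hypothetical summary formats selected per the user's preference settings.
SUMMARY_FORMATS = {
    "brief": lambda text: text.split(".")[0] + ".",  # first sentence only
    "full": lambda text: text,                        # whole text
}

def generate_transition(item, profile_format="brief", relevance_note=""):
    """Compose the spoken transition text: a note on why the item is
    relevant to this user, followed by a summary in the user's
    preferred summary format."""
    summary = SUMMARY_FORMATS[profile_format](item["text"])
    return f"{relevance_note} {summary}".strip()

item = {"text": "Quarterly numbers improved. Detailed figures follow."}
t = generate_transition(item, "brief", "Selected for your budget meeting.")
```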
  • the next illustrated act includes the system sequencing the selected content and the transition(s) into a playback sequence that is used by media players for presenting and/or rendering the botcast transition(s) and content in the ordered sequence.
  • This sequencing, which is performed by the sequencer model, is sometimes based on the profile/preference settings and/or contextual circumstances associated with the particular user, as described, and as may be evaluated subsequent to the selection of the content.
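The sequencing act can be illustrated as ordering transition/content pairs by a user-derived priority. The `sequence_botcast` function and the topic-to-rank `priority` mapping are hypothetical simplifications of the sequencer model:

```python
def sequence_botcast(items, priority):
    """Order (transition, content) pairs into a flat playback sequence.
    `priority` maps a topic to a rank derived from the user's profile
    or context; lower ranks play first, unknown topics play last."""
    playback = []
    for item in sorted(items, key=lambda i: priority.get(i["topic"], 99)):
        playback.append(("transition", item["transition"]))
        playback.append(("content", item["id"]))
    return playback

items = [
    {"id": "b", "topic": "news", "transition": "Now the news."},
    {"id": "a", "topic": "meetings", "transition": "First, your meetings."},
]
playback = sequence_botcast(items, {"meetings": 0, "news": 1})
```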
  • the next illustrated act (act 1340) includes the system formatting and/or otherwise linking or assembling the selected content and transition(s) into a digital structure comprising the botcast. The selected content and transition(s) are stored in an audio or audio-playable format that is selected from a plurality of different such formats, based on the one or more profile or preference settings associated with the particular user, which may include contextual circumstances associated with the particular user.
  • the formatting is performed by the formatting model described previously and may include formatting the transition and the underlying content into a single botcast audio file.
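Assembling the ordered segments into a single botcast file can be sketched as follows. The `assemble_botcast` name is hypothetical, and the byte strings stand in for real encoded audio segments, which a real formatting model would transcode into the selected audio format:

```python
def assemble_botcast(playback_segments, audio_format="mp3"):
    """Link the ordered transition/content segments into one digital
    structure (a single botcast audio file). Segments here are
    stand-in byte strings rather than real encoded audio frames."""
    return {
        "format": audio_format,
        "audio": b"".join(playback_segments),
    }

botcast = assemble_botcast([b"<t1>", b"<c1>", b"<t2>", b"<c2>"], "mp3")
```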
  • the next illustrated act includes the system using the presentation model to store the botcast in a particular location, or to broadcast, render, share, or distribute the botcast to a particular user, device, or third party.
  • the presentation model provides the botcast to an audio player that renders the botcast (including the selected content and transition(s)) according to the ordering of the botcast sequence and in the format selected for the botcast, based on the particular user settings/circumstances.
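A minimal dispatch over these presentation options might look like the sketch below. The `present_botcast` function and its in-memory `target` store are hypothetical stand-ins for real storage locations, players, or distribution endpoints:

```python
def present_botcast(botcast, action, target):
    """Route the assembled botcast either into storage or to a player.
    Both targets here are in-memory stand-ins for real endpoints."""
    if action == "store":
        target[botcast["id"]] = botcast
        return "stored"
    if action == "render":
        return f"playing {botcast['id']} as {botcast['format']}"
    raise ValueError(f"unknown action: {action}")

store = {}
status = present_botcast({"id": "bc1", "format": "mp3"}, "store", store)
```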
  • Other embodiments also include modifying the botcast for the same user and/or for different users, with the modification model, based on detecting different dynamic changes to user settings, available content and/or contextual circumstances for the user/ different users.
  • modifications can include adding new content and/or removing content from the botcasts, based on a detected increase/decrease in relevance of the content, respectively, for the different users.
  • the modifications can also include reordering/resequencing of the content and/or transitions in the botcast, reformatting of the content/transitions into different playback audio formats, and/or creating, augmenting, deleting, or otherwise modifying the transitions/content in the botcast.
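The removal branch of the modification model can be illustrated as re-filtering the sequence against updated relevance scores. The `modify_botcast` name, the 0-to-1 relevance scores, and the threshold are all hypothetical:

```python
def modify_botcast(sequence, new_relevance, threshold=0.5):
    """Drop entries whose updated relevance fell below the threshold
    (e.g., after a calendar change or after the user already listened),
    keeping the remaining entries in their original order. Entries
    without an updated score are assumed still fully relevant."""
    return [e for e in sequence if new_relevance.get(e["id"], 1.0) >= threshold]

sequence = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
updated = modify_botcast(sequence, {"b": 0.1, "c": 0.9})
```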
  • Other embodiments include identifying and parsing the media content into a plurality of media subcomponents/segments, presenting the plurality of media subcomponents to a particular user for user selection, identifying user input selecting the one or more selected subcomponents from the plurality of media subcomponents, and identifying the one or more selected subcomponents as the selected content.
  • Other embodiments include presenting each of the plurality of media subcomponents with a textual description of one or more topics associated with each corresponding media subcomponent or segment.
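The parsing-and-selection embodiment can be sketched as splitting a media item at topic boundaries and letting the user pick segments by topic. The `parse_segments` helper, the string stand-in for media, and the boundary list are hypothetical:

```python
def parse_segments(media, boundaries):
    """Split a media item (a stand-in string here; real media would be
    audio/video) at the given (end_offset, topic) boundaries into
    labelled subcomponents the user can select from."""
    segments, start = [], 0
    for end, topic in boundaries:
        segments.append({"topic": topic, "span": media[start:end]})
        start = end
    return segments

media = "introbudgetqa"
segments = parse_segments(media, [(5, "intro"), (11, "budget"), (13, "q&a")])
# Simulated user selection: the user picks only the "budget" segment.
chosen = [s for s in segments if s["topic"] in {"budget"}]
```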
  • Other embodiments include identifying and selecting content for a botcast that is in a text format and performing TTS (text-to-speech) processing on the selected content to format the selected content into an audio format.
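The TTS step can be sketched with the synthesizer injected as a callable, so no particular speech engine is assumed. The `text_to_audio` name and the trivial stand-in synthesizer are hypothetical; a real system would call an actual TTS engine at that point:

```python
def text_to_audio(text, synthesize):
    """Convert text-format selected content into an audio-format payload
    via an injected `synthesize` callable (a real implementation would
    invoke a TTS engine here and return encoded speech audio)."""
    return {"format": "wav", "audio": synthesize(text)}

# Trivial stand-in synthesizer: just encode the text as bytes.
clip = text_to_audio("Project update", lambda t: t.encode())
```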
  • Other embodiments include generating transitions for the selected content, wherein each transition includes a summary of the selected content that is formatted in a summary format (e.g., a prosody style and/or other format based on language attributes) selected from a plurality of available summary formats, based on the one or more profile or preference settings associated with the particular user.
  • Other embodiments include modifying the botcast by omitting a portion of the selected content that has been determined to have diminished relevance to the particular user subsequent to formatting the selected content and transition as the botcast, and/or by modifying or deleting the transition associated with the selected content having the portion omitted, and/or by reformatting the botcast to reflect the selected content with the portion omitted and the modified or deleted transition.
  • the modifying is performed dynamically in response to detecting the diminished relevance of the selected content to the particular user, and the system detects the diminished relevance in response to determining the particular user has listened to the portion of the selected content that has been determined to have the diminished relevance.
  • the modifications occur dynamically in response to detecting the diminished relevance of the selected content to the particular user, wherein detecting the diminished relevance comprises detecting change in a scheduled event associated with a calendar of the particular user.
  • the modification includes modifying the botcast by adding new content and by adding a new corresponding transition that is associated with the new content and/or by resequencing the botcast audio playback sequence.
  • the resequencing can include moving the new content and the new corresponding transition to be interposed between at least two other portions of content and at least two other transitions that correspond to the at least two other portions of the content.
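The interposing described above amounts to inserting a new transition/content pair at a pair boundary in the flat playback sequence. The `interpose` helper below is a hypothetical sketch of that resequencing step:

```python
def interpose(sequence, new_pair, index):
    """Insert a (transition, content) pair between existing pairs.
    `sequence` is a flat list alternating transition and content
    entries, so every even index is a pair boundary."""
    assert index % 2 == 0, "insertion must fall on a pair boundary"
    return sequence[:index] + list(new_pair) + sequence[index:]

seq = ["t1", "c1", "t2", "c2"]
reseq = interpose(seq, ("t-new", "c-new"), 2)
```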
  • Other embodiments include generating derivative, supplemental, or additional botcasts that include similar or the same underlying content as one or more other botcasts, but which include one or more different transition(s) associated with the content, each different transition being personalized to a different user and omitting transitions that are personalized to other users.
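Such a derivative botcast reuses shared content while swapping in per-user transitions, as in this hypothetical sketch (the `derive_botcast` name and the per-user transition table are illustrations only):

```python
def derive_botcast(content_items, transitions_by_user, user):
    """Build a per-user variant that reuses the shared content items but
    includes only that user's transitions, omitting transitions that
    are personalized to other users."""
    playback = []
    for item in content_items:
        playback.append(transitions_by_user[user][item])
        playback.append(item)
    return playback

transitions = {
    "alice": {"c1": "Alice, this relates to your 3pm meeting."},
    "bob": {"c1": "Bob, you asked to follow this project."},
}
alice_cast = derive_botcast(["c1"], transitions, "alice")
```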
  • the disclosed embodiments include systems and methods for utilizing trained machine learning models for configuring and utilizing botcasts of media content that are personalized and customized for different users based on contexts associated with the users and based on one or more profile or preference settings associated with the users, and in a manner that facilitates computational efficiencies for configuring and utilizing the botcast, and which can reduce the waste of time and computational processing otherwise required by users that attempt to generate their own media compilations of the same content included in the personalized botcasts.
  • Embodiments of the present invention may comprise or utilize a special purpose or general- purpose computer (e.g., computing system 110) including computer hardware, as discussed in greater detail below.
  • Embodiments within the scope of the present invention also include physical and other computer-readable media for carrying or storing computer-executable instructions and/or data structures.
  • Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer system.
  • Computer-readable media (e.g., storage 140 of Figure 1) that store computer-executable instructions (e.g., component 118 of Figure 1) are physical storage media.
  • Computer-readable media that carry computer-executable instructions are transmission media.
  • embodiments of the invention can comprise at least two distinctly different kinds of computer-readable media: physical computer-readable storage media and transmission computer-readable media.
  • Physical computer-readable storage media, which are distinct and distinguished from transmission computer-readable media, include physical and tangible hardware.
  • Examples of physical computer-readable storage media include hardware storage devices such as RAM, ROM, EEPROM, CD-ROM or other optical disk storage (such as CDs, DVDs, etc.), magnetic disk storage or other magnetic storage devices, or any other hardware which can be used to store desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer and which are distinguished from merely transitory carrier waves and other transitory media that are not configured as physical and tangible hardware.
  • a “network” (e.g., network 130 of Figure 1) is defined as one or more data links that enable the transport of electronic data between computer systems and/or modules and/or other electronic devices.
  • Transmission media can include any network links and/or data links, including transitory carrier waves, which can be used to carry or transport desired program code means in the form of computer-executable instructions or data structures, and which can be accessed by a general purpose or special purpose computer. Combinations of the above are also included within the scope of computer-readable media.
  • program code means in the form of computer-executable instructions or data structures can be transferred automatically from transmission computer-readable media to physical computer-readable storage media (or vice versa).
  • program code means in the form of computer-executable instructions or data structures received over a network or data link can be buffered in RAM within a network interface module (e.g., a “NIC”), and then eventually transferred to computer system RAM and/or to less volatile computer-readable physical storage media at a computer system.
  • computer-readable physical storage media can be included in computer system components that also (or even primarily) utilize transmission media.
  • Computer-executable instructions comprise, for example, instructions and data which cause a general-purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions.
  • the computer-executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, or even source code.
  • the invention may be practiced in network computing environments with many types of computer system configurations, including, personal computers, desktop computers, laptop computers, message processors, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, mobile telephones, PDAs, pagers, routers, switches, and the like.
  • the invention may also be practiced in distributed system environments where local and remote computer systems, which are linked (either by hardwired data links, wireless data links, or by a combination of hardwired and wireless data links) through a network, both perform tasks.
  • program modules may be located in both local and remote memory storage devices.
  • the functionality described herein can be performed, at least in part, by one or more hardware logic components.
  • illustrative types of hardware logic components include Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), System-on-a-Chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Systems and methods for configuring and utilizing botcasts, which contain audio content comprising transitions corresponding to the audio content, to facilitate the accessibility and presentation of the media content in the botcasts according to contextual relevance for different individual users. The systems identify, access, filter, augment, adapt, personalize, create and/or otherwise configure the media content, as well as the content transitions in the botcasts, according to each user's individual preferences and profiles, and also the contextual circumstances for each user.
PCT/US2022/037652 2021-09-22 2022-07-20 Botcasts - podcasts personnalisés basés sur l'intelligence artificielle (ia) WO2023048803A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163247242P 2021-09-22 2021-09-22
US63/247,242 2021-09-22
US17/530,068 2021-11-18
US17/530,068 US20230092783A1 (en) 2021-09-22 2021-11-18 Botcasts - ai based personalized podcasts

Publications (1)

Publication Number Publication Date
WO2023048803A1 true WO2023048803A1 (fr) 2023-03-30

Family

ID=82748170

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/037652 WO2023048803A1 (fr) 2021-09-22 2022-07-20 Botcasts - podcasts personnalisés basés sur l'intelligence artificielle (ia)

Country Status (1)

Country Link
WO (1) WO2023048803A1 (fr)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210208842A1 (en) * 2015-10-27 2021-07-08 Super Hi Fi, Llc Computerized systems and methods for hosting and dynamically generating and providing customized media and media experiences


Similar Documents

Publication Publication Date Title
US11929069B2 (en) Proactive incorporation of unsolicited content into human-to-computer dialogs
US9111534B1 (en) Creation of spoken news programs
RU2471251C2 (ru) Устройство на основе личности
US7844215B2 (en) Mobile audio content delivery system
US9978365B2 (en) Method and system for providing a voice interface
US20160055245A1 (en) Systems and methods for providing information discovery and retrieval
US20130231931A1 (en) System, method, and apparatus for generating, customizing, distributing, and presenting an interactive audio publication
US8027999B2 (en) Systems, methods and computer program products for indexing, searching and visualizing media content
US9075874B2 (en) Making user generated audio content on the spoken web navigable by community tagging
US7707268B2 (en) Information-processing apparatus, information-processing methods and programs
JP7171911B2 (ja) ビジュアルコンテンツからのインタラクティブなオーディオトラックの生成
US20070208564A1 (en) Telephone based search system
US10665237B2 (en) Adaptive digital assistant and spoken genome
CA2596456C (fr) Systeme de livraison de contenu audio mobile
Pauletto et al. Exploring expressivity and emotion with artificial voice and speech technologies
US10909999B2 (en) Music selections for personal media compositions
US20230092783A1 (en) Botcasts - ai based personalized podcasts
Anerousis et al. Making voice knowledge pervasive
WO2023048803A1 (fr) Botcasts - podcasts personnalisés basés sur l'intelligence artificielle (ia)
KR102492380B1 (ko) 사내 방송 시스템에서 사내 방송 서비스를 제공하는 방법
US20150079947A1 (en) Emotion Express EMEX System and Method for Creating and Distributing Feelings Messages
McCullough Sound mobility: Smartphones and tablets as ubiquitous polymedia music production studios
Wall Music as heritage: historical and ethnographic perspectives: edited by Barley Norton and Naomi Matsumoto, London and New York, Routledge, 2019, xiv+ 320 pp., $152.00 (hbk), ISBN 978-1-1382-22804-7
Lee et al. Mi-DJ: a multi-source intelligent DJ service
Kleij Theatricality and the Public: Theatre and the English Public from Reformation to Revolution. By Katrin Beushausen

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22748674

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022748674

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022748674

Country of ref document: EP

Effective date: 20240422