WO2024065690A1 - Method, device and system for delivering audio advertisements - Google Patents

Method, device and system for delivering audio advertisements

Info

Publication number
WO2024065690A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
advertisement
slot
cloud
advertisement slot
Prior art date
Application number
PCT/CN2022/123309
Other languages
English (en)
French (fr)
Inventor
夏曾华
马中瑞
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2022/123309
Publication of WO2024065690A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04H: BROADCAST COMMUNICATION
    • H04H60/00: Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/61: Arrangements for services using the result of monitoring, identification or recognition covered by groups H04H60/29-H04H60/54
    • H04H60/63: Arrangements for services using the result of monitoring, identification or recognition covered by groups H04H60/29-H04H60/54 for services of sales

Definitions

  • the present application relates to the field of computer technology, and in particular to a method, device and system for delivering audio advertisements.
  • Podcasts are recorded online radio or online audio programs, such as audiobooks, crosstalk, current affairs news, etc.
  • the podcast market is also growing, and the number of users has reached hundreds of millions.
  • the corresponding advertising market share is also growing.
  • audio advertising has also become an important form of advertising.
  • This type of offline audio advertisement has a low match with the audio program, often affecting the continuity of users listening to the audio program, and has a poor delivery effect.
  • the present application provides a method for delivering audio advertisements, which is used to deliver audio advertisements that meet the personalized needs of users in audio programs.
  • the present application also provides corresponding devices, systems, computer-readable storage media, and computer program products.
  • the first aspect of the present application provides a method for delivering audio advertisements, comprising: a cloud receives an advertisement request from a client, the advertisement request includes information of an audio program, an identifier of a target advertisement slot, and user characteristics, the target advertisement slot is one of at least one advertisement slot mined from the audio program, and the advertisement request is triggered when the client plays the audio program; the cloud determines a vector representation of the target advertisement slot based on the information of the audio program and the identifier of the target advertisement slot, the vector representation of the target advertisement slot is used to describe the content involved in the audio program within a period of time before the target advertisement slot; the cloud obtains an audio advertisement matching the target advertisement slot based on the user characteristics and the vector representation of the target advertisement slot; the cloud sends the audio advertisement to the client, and the audio advertisement is used to be played by the client when the audio program is played to the target advertisement slot.
  • the cloud can be software or services of the cloud platform, or software or services deployed on nodes in the network, such as edge nodes.
  • the cloud can run on independent physical machines or on virtualized resources.
  • the client may be a terminal device or an application, for example, the application runs on the terminal device for the user to use.
  • the phrase "when the client plays the audio program" usually refers to the moment when the client is playing the audio program and is about to reach the target advertising slot.
  • an advertising request is triggered at a preset time point before reaching the target advertising slot.
  • the preset time point can be a time point 5 seconds away from the target advertising slot or other numerical values representing the duration.
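  • As an illustration only (not part of the application), the trigger condition described above can be sketched as follows. The function name `should_request_ad`, its parameters, and the one-request-per-slot flag are assumptions; only the 5-second lead time comes from the text.

```python
def should_request_ad(position_s: float,
                      slot_start_s: float,
                      lead_s: float = 5.0,
                      already_requested: bool = False) -> bool:
    """Return True when playback has entered the lead window before the slot.

    position_s   -- current playback position in seconds
    slot_start_s -- start time of the target advertising slot in seconds
    lead_s       -- preset lead time (5 seconds in the example from the text)
    """
    if already_requested:
        # Hypothetical guard: request at most once per slot.
        return False
    return slot_start_s - lead_s <= position_s < slot_start_s
```

  • A client could call this check on every playback progress update and send the advertisement request the first time it returns True.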
  • audio ads refer to ads played in audio format.
  • Ad requests are used to request audio ads from the cloud.
  • the information of the audio program may be an identifier or index of the audio program.
  • the audio program is an audio program that the client is about to play, is playing, or has just played.
  • the audio program may be an audiobook, a song, crosstalk, or current affairs news.
  • each ad slot in an audio program will have a unique identifier.
  • Each ad slot will have a vector representation, and the identifier of an ad slot is associated with the vector representation of that ad slot. The identifier and vector representation of at least one ad slot of each audio program are stored in association with the audio program, and this information can be stored in the audio content library of the cloud platform.
  • the vector representation of the ad slot refers to the vector obtained by encoding the content involved in the previous period of time of the ad slot.
  • the "period of time" in the present application can be, for example, 1 minute or another numerical value representing a duration.
  • in one implementation the specific numerical value is preset, and in another implementation the duration can be a random value within a range.
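  • A minimal sketch of how the text spoken in the period before an ad slot might be collected prior to encoding, assuming per-word timestamps are available from the text conversion. The `Word` type and the function `text_before_slot` are hypothetical names; the encoder that turns the resulting text into a vector is not specified here and is left out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Word:
    text: str
    start_s: float  # time at which the word starts, in seconds
    end_s: float    # time at which the word ends, in seconds


def text_before_slot(words: Iterable[Word],
                     slot_start_s: float,
                     window_s: float = 60.0) -> str:
    """Join the words that fall entirely inside the window before the slot."""
    window_start_s = slot_start_s - window_s
    return " ".join(
        w.text for w in words
        if w.start_s >= window_start_s and w.end_s <= slot_start_s
    )
```

  • With a 60-second window and a slot starting at 100 s, only words spoken between 40 s and 100 s are kept; the resulting string would then be passed to whichever text encoder is used.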
  • user characteristics may include user portraits and user behavior characteristics.
  • User portraits may include basic information about users, such as gender, age, hobbies, etc.
  • User behavior characteristics may include user behavior information such as clicks, favorites, and comments on historical audio programs.
  • the identifiers of all advertising slots associated with the audio program and the vector representation of the advertising slots can be found based on the information of the audio program. Furthermore, based on the identifier of the target advertising slot, the vector representation of the target advertising slot can be determined.
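  • The request fields and the two-step lookup described above can be sketched as follows. This is an assumed in-memory layout, not the storage format of the application: `AdRequest`, `ContentLibrary`, and `lookup_slot_vector` are hypothetical names, and a nested dictionary stands in for the audio content library.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AdRequest:
    program_id: str       # information (identifier or index) of the audio program
    target_slot_id: str   # identifier of the target advertising slot
    user_features: Dict[str, object] = field(default_factory=dict)


# program identifier -> slot identifier -> vector representation of the slot
ContentLibrary = Dict[str, Dict[str, List[float]]]


def lookup_slot_vector(library: ContentLibrary,
                       request: AdRequest) -> Optional[List[float]]:
    """Find the program's slots first, then the target slot's vector."""
    slots = library.get(request.program_id)
    if slots is None:
        return None
    return slots.get(request.target_slot_id)
```

  • Returning `None` for an unknown program or slot is a design choice of this sketch; a real service would decide separately how to handle such requests.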
  • one or more audio ads that are strongly related to the audio program that the user is listening to before the target ad slot can be determined based on the vector representation of the target ad slot, and the audio ads can be further screened or processed based on the user characteristics to obtain the audio ads that match the target ad slot. Because the audio ads determined by this application have a higher degree of match with the audio programs, and combined with the user characteristics, they can better meet the personalized needs of users and improve the delivery effect of audio ads.
  • the cloud determines the audio advertisement that matches the target ad slot based on user characteristics and the vector representation of the target ad slot, including: the cloud recalls multiple audio advertisements from the audio advertisement library based on the vector representation of the target ad slot; the cloud obtains the audio advertisement that matches the target ad slot from the multiple audio advertisements based on the user characteristics.
  • the cloud recalls multiple audio ads that are strongly related to the audio program that the user is listening to before the target ad slot based on the vector representation of the target ad slot, and then selects the ad that has the highest match with the user's features, thereby improving the match between the audio ad and the audio program being played on the client.
  • the above-mentioned step the cloud obtains an audio advertisement that matches a target ad slot from multiple audio advertisements based on user characteristics, including: the cloud predicts the completion rate of multiple audio advertisements based on user characteristics and an advertisement sorting model, wherein the audio advertisement with the largest completion rate is the audio advertisement that matches the target ad slot, or the audio advertisement with the largest completion rate is the source advertisement of the audio advertisement that matches the target ad slot, and the advertisement sorting model is a model that takes user characteristics as input and completion rate as output.
  • the completion rate refers to the predicted probability that an audio ad will be played completely. The closer the content and style of an audio ad is to the user's preferences, the greater the probability that it will be played completely, and the better the delivery effect after being delivered. Therefore, the completion rate of multiple recalled audio ads can be predicted based on user characteristics, and the audio ad with the highest completion rate can be determined as the audio ad or as the source ad of the audio ad, which can improve the delivery effect of the audio ad.
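  • The recall-then-rank flow can be illustrated with the sketch below. Several parts are assumptions: the text does not state how recall is computed, so cosine similarity between the slot vector and a per-advertisement vector is used as a stand-in, and `predict_completion` is a placeholder for the advertisement sorting model, whose internals are not shown.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_ads(slot_vec: Sequence[float],
               ad_vectors: Dict[str, Sequence[float]],
               k: int) -> List[str]:
    """Recall the k advertisements most similar to the slot vector."""
    ranked = sorted(ad_vectors,
                    key=lambda ad_id: cosine(slot_vec, ad_vectors[ad_id]),
                    reverse=True)
    return ranked[:k]


def pick_ad(candidates: List[str],
            user_features: dict,
            predict_completion: Callable[[dict, str], float]) -> str:
    """Select the recalled advertisement with the highest predicted completion rate."""
    return max(candidates, key=lambda ad_id: predict_completion(user_features, ad_id))
```

  • The selected advertisement is either delivered as is or used as the source advertisement for style adjustment, as described in the text.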
  • the method when the audio advertisement with the highest completion rate is the source advertisement of the audio advertisement that matches the target advertisement slot, the method further includes: the cloud adjusts the style of the audio advertisement with the highest completion rate according to the style of the audio program and user characteristics to obtain the audio advertisement that matches the target advertisement slot.
  • the style of the audio advertisement with the highest completion rate can be adjusted according to the style of the audio program that the user wants to play or is playing and the user characteristics. This can increase the user's acceptance of audio advertisements, thereby improving the delivery effect of audio advertisements.
  • the cloud adjusts the style of the audio advertisement with the highest completion rate according to the style of the audio program and the user characteristics to obtain an audio advertisement that matches the target advertising slot, including: the cloud adjusts the object sound in the audio advertisement with the highest completion rate according to the style vector of the object sound in the audio program and the style vector of the user's preference, the style vector of the object sound in the audio program is obtained by encoding the object sound in the audio program, and the style vector of the user's preference is obtained by encoding the user characteristics; the cloud adjusts the background music in the audio advertisement with the highest completion rate according to the style vector of the background music in the audio program and the style vector of the user's preference, the style vector of the background music in the audio program is obtained by encoding the background music in the audio program; the cloud integrates the adjusted object sound in the audio advertisement with the highest completion rate and the adjusted background music in the audio advertisement with the highest completion rate to obtain the audio advertisement that matches the target advertising slot.
  • the object sounds and background music of the audio program can be separated, and then encoded to obtain the style vector of the object sounds and the style vector of the background music, respectively, and then adjusted in combination with the style vector preferred by the user to obtain the audio advertisement with the highest score.
  • the style of the audio advertisement is consistent with the style of the audio program, which can meet the style preference of the user, improve the user experience, and thus improve the delivery effect of the audio advertisement.
  • the method before the cloud receives the advertising request sent by the client, the method also includes: the cloud determines at least one advertising slot based on the time domain information of the audio program in the voice state and the text content after the audio program is converted into text; the cloud encodes the text content of each advertising slot in the at least one advertising slot within a previous period of time to obtain a vector representation of each advertising slot.
  • the time domain information may include amplitude (amplitude may also be described as sound intensity), change of amplitude over time, etc.
  • the cloud can also perform the task of mining advertising slots before receiving advertising requests.
  • the process of mining advertising slots can be to determine the advertising slots of the audio program from the time domain information of the audio program in the voice state and the text content after the audio program is converted into text.
  • the advertising slots can be mined based on the time domain information and text content of the audio program, the quality of the mined advertising slots is high, and the continuity of the audio program is usually not affected by inserting advertisements in the advertising slots, thereby improving the user experience and the effect of audio advertising delivery.
  • the method further includes: the cloud stores the audio program, the identifier of each advertising slot, and the vector representation of each advertising slot in an associated manner.
  • the audio program, the identifier of each ad slot in the audio program, and the vector representation of each ad slot are stored in association, which can facilitate the rapid determination of the audio ad that matches the requested target ad slot when the client sends an ad request, thereby improving the delivery efficiency of the audio ad.
  • the cloud determines at least one advertising slot based on the time domain information of the audio program in the voice state and the text content after the audio program is converted into text, including: when the time domain information is amplitude, if the duration of the amplitude of the audio program in the voice state continuously lower than the amplitude threshold exceeds the first threshold, the cloud determines the duration of the amplitude continuously lower than the amplitude threshold as the first basic advertising slot; if the time interval between two adjacent words in the text content after the audio program is converted is greater than the second threshold, the cloud determines the time interval between the two adjacent words as the second basic advertising slot, and the time interval between the two adjacent words is determined by the timestamp of each word during text conversion; the cloud determines at least one advertising slot from the union of the first basic advertising slot and the second basic advertising slot.
  • the advertisement slot is selected from the union of the first basic advertisement slot and the second basic advertisement slot, so that the range of advertisement slot selection can be expanded.
  • the corresponding basic advertisement slots can be weighted according to punctuation marks, text segmentation, etc., that is, the weight of the corresponding basic advertisement slot is increased, and then at least one basic advertisement slot with the largest weight is selected from them and determined as at least one advertisement slot of the audio program. In this way, the quality of the selected advertisement slots can be improved.
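  • A minimal sketch of the two detection rules and their union, under stated assumptions: amplitudes are given as one value per fixed-length frame, word timestamps are (start, end) pairs in seconds, and all function names and thresholds are hypothetical. The weighting by punctuation and text segmentation is not shown, and overlapping candidates are not merged.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (start_s, end_s)


def quiet_intervals(amplitudes: Sequence[float], frame_s: float,
                    amp_threshold: float, min_duration_s: float) -> List[Interval]:
    """First basic slots: runs below amp_threshold lasting longer than min_duration_s."""
    found: List[Interval] = []
    run_start = None
    # The trailing sentinel closes a quiet run that reaches the end of the program.
    for i, amp in enumerate(list(amplitudes) + [float("inf")]):
        if amp < amp_threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if (i - run_start) * frame_s > min_duration_s:
                found.append((run_start * frame_s, i * frame_s))
            run_start = None
    return found


def word_gap_intervals(word_times: Sequence[Interval],
                       min_gap_s: float) -> List[Interval]:
    """Second basic slots: gaps between adjacent words longer than min_gap_s."""
    found: List[Interval] = []
    for (_, prev_end), (next_start, _) in zip(word_times, word_times[1:]):
        if next_start - prev_end > min_gap_s:
            found.append((prev_end, next_start))
    return found


def candidate_slots(first: List[Interval], second: List[Interval]) -> List[Interval]:
    """Union of both candidate sets, sorted by start time."""
    return sorted(set(first) | set(second))
```

  • The final advertisement slots would then be chosen from this union, for example by keeping the candidates with the largest weights.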
  • the user characteristics include a user portrait and user behavior characteristics for historical audio programs.
  • a second aspect of the present application provides a method for delivering audio advertisements, comprising: when a client plays an audio program, sending an advertisement request to a cloud, the advertisement request including information of the audio program, an identifier of a target advertisement slot, and user characteristics, the target advertisement slot being one of at least one advertisement slot mined from the audio program; the client receiving an audio advertisement sent by the cloud that matches the target advertisement slot; and the client playing the audio advertisement when playing the audio program to the target advertisement slot.
  • the phrase "when the client plays the audio program" usually refers to the moment when the client is playing the audio program and is about to reach the target advertising slot.
  • an advertising request is triggered at a preset time point before reaching the target advertising slot.
  • the preset time point can be a time point 5 seconds away from the target advertising slot or other numerical values representing the duration.
  • the user characteristics include a user portrait and user behavior characteristics for historical audio programs.
  • a third aspect of the present application provides a method for mining advertising slots, including: obtaining an audio program of the advertising slot to be mined in the cloud; determining at least one advertising slot based on time domain information of the audio program in a voice state and text content after the audio program is converted into text in the cloud; and encoding the text content of each advertising slot in at least one advertising slot within a previous period of time in the cloud to obtain a vector representation of each advertising slot.
  • the method further includes: the cloud stores the audio program, the identifier of each advertising slot, and the vector representation of each advertising slot in an associated manner.
  • the cloud determines at least one advertising slot based on the time domain information of the audio program in the voice state and the text content after the audio program is converted into text, including: when the time domain information is amplitude, if the duration for which the amplitude of the audio program in the voice state is continuously lower than the amplitude threshold exceeds the first threshold, the cloud determines the duration for which the amplitude is continuously lower than the amplitude threshold as the first basic advertising slot; if the time interval between two adjacent words in the text content after the audio program is converted is greater than the second threshold, the cloud determines the time interval between the two adjacent words as the second basic advertising slot, and the time interval between the two adjacent words is determined by the timestamp of each word during text conversion; the cloud determines at least one advertising slot from the union of the first basic advertising slot and the second basic advertising slot.
  • a cloud device for executing the method in the first aspect or any possible implementation of the first aspect.
  • the cloud device includes a module or unit for executing the method in the first aspect or any possible implementation of the first aspect, such as a processing unit, a sending unit, and a receiving unit.
  • a client for executing the method in the second aspect.
  • the client includes a module or unit for executing the method in the second aspect or any possible implementation of the second aspect, such as a receiving unit, a display unit, and a sending unit.
  • a cloud device for executing the method in the third aspect or any possible implementation of the third aspect.
  • the cloud device includes a module or unit for executing the method in the third aspect or any possible implementation of the third aspect, such as a processing unit, a sending unit, and a receiving unit.
  • a cloud device may include at least one processor, a memory, and a communication interface.
  • the processor is coupled to the memory and the communication interface.
  • the memory is used to store instructions.
  • the processor is used to execute the instructions.
  • the communication interface is used to communicate with other network elements under the control of the processor.
  • a client including a transceiver, a processor and a memory, wherein the transceiver and the processor are coupled to the memory, and the memory is used to store programs or instructions.
  • the client executes the method in the aforementioned second aspect or any possible implementation of the second aspect.
  • a cloud device may include at least one processor, a memory, and a communication interface.
  • the processor is coupled to the memory and the communication interface.
  • the memory is used to store instructions.
  • the processor is used to execute the instructions.
  • the communication interface is used to communicate with other network elements under the control of the processor.
  • the tenth aspect of the present application provides a chip system, which includes one or more interface circuits and one or more processors; the interface circuit and the processor are interconnected by lines; the interface circuit is used to receive signals from the memory of the cloud device and send signals to the processor, and the signals include computer instructions stored in the memory; when the processor executes the computer instructions, the cloud device executes the method in the first aspect or any possible implementation of the first aspect.
  • a chip system which includes one or more interface circuits and one or more processors; the interface circuit and the processor are interconnected by lines; the interface circuit is used to receive signals from the client's memory and send signals to the processor, and the signals include computer instructions stored in the memory; when the processor executes the computer instructions, the client executes the method in the aforementioned second aspect or any possible implementation of the second aspect.
  • the twelfth aspect of the present application provides a chip system, which includes one or more interface circuits and one or more processors; the interface circuit and the processor are interconnected by lines; the interface circuit is used to receive signals from the memory of the cloud device and send signals to the processor, and the signals include computer instructions stored in the memory; when the processor executes the computer instructions, the cloud device executes the method in the aforementioned third aspect or any possible implementation of the third aspect.
  • the thirteenth aspect of the present application provides a computer-readable storage medium on which a computer program or instruction is stored.
  • When the computer program or instruction is executed on a computer device, the computer device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.
  • the present application provides a computer-readable storage medium on which a computer program or instruction is stored.
  • When the computer program or instruction is executed on a computer device, the computer device executes the method in the aforementioned second aspect or any possible implementation of the second aspect.
  • the present application provides a computer-readable storage medium on which a computer program or instruction is stored.
  • When the computer program or instruction is executed on a computer device, the computer device executes the method in the aforementioned third aspect or any possible implementation of the third aspect.
  • the sixteenth aspect of the present application provides a computer device program product, which includes a computer device program code.
  • When the computer device program code is executed on a computer device, the computer device executes the method in the aforementioned first aspect or any possible implementation of the first aspect.
  • the seventeenth aspect of the present application provides a computer device program product, which includes a computer device program code.
  • When the computer device program code is executed on a computer device, the computer device executes the method in the aforementioned second aspect or any possible implementation of the second aspect.
  • the present application provides a computer device program product, which includes a computer device program code.
  • When the computer device program code is executed on a computer device, the computer device executes the method in the aforementioned third aspect or any possible implementation of the third aspect.
  • an audio advertising system which includes a cloud device and a client, wherein the cloud device is used to execute the method in the aforementioned first aspect or any possible implementation of the first aspect, and the client is used to execute the method in the aforementioned second aspect or any possible implementation of the second aspect.
  • the twentieth aspect of the present application provides an audio advertising system, which includes a cloud device and an audio content library.
  • the cloud device obtains audio programs from the audio content library and executes the method in the third aspect or any possible implementation of the third aspect.
  • the technical effects brought about by the second to twentieth aspects or any possible implementation methods thereof can refer to the technical effects brought about by the first aspect or different possible implementation methods of the first aspect, and will not be repeated here.
  • FIG1A is a schematic diagram of an architecture of an audio advertising system provided by an embodiment of the present application.
  • FIG1B is another schematic diagram of the architecture of the audio advertising system provided in an embodiment of the present application.
  • FIG2A is a schematic diagram of a structure of a client provided in an embodiment of the present application.
  • FIG2B is a schematic diagram of a structure of a cloud device provided in an embodiment of the present application.
  • FIG3 is a schematic diagram of an embodiment of a method for delivering audio advertisements provided in an embodiment of the present application.
  • FIG4 is a schematic diagram of the structure of an advertisement ranking model provided in an embodiment of the present application.
  • FIG5 is a schematic diagram of another embodiment of the method for delivering audio advertisements provided in an embodiment of the present application.
  • FIG6 is a schematic diagram of an embodiment of a method for mining advertising slots provided in an embodiment of the present application.
  • FIG7 is a schematic diagram of an example scenario provided in an embodiment of the present application.
  • FIG8 is a schematic diagram of an embodiment of mining advertisement slots and delivering audio advertisements provided by an embodiment of the present application.
  • FIG9 is another schematic diagram of the structure of a cloud device provided in an embodiment of the present application.
  • FIG10 is another schematic diagram of the structure of a client provided in an embodiment of the present application.
  • FIG. 11 is another schematic diagram of the structure of the cloud device provided in an embodiment of the present application.
  • the embodiment of the present application provides a method for delivering audio advertisements, which is used to deliver audio advertisements that meet the personalized needs of users in audio programs.
  • the present application also provides corresponding devices, systems, computer-readable storage media, and computer program products, etc. The following are detailed descriptions.
  • FIG1A is a schematic diagram of an architecture of an audio advertising system provided in an embodiment of the present application.
  • the audio advertising system provided in the embodiment of the present application includes a cloud and multiple clients, and the cloud can communicate with the multiple clients through a network.
  • the audio advertising system provided in the embodiment of the present application may also include an audio content library and an audio advertising library.
  • the audio content library and/or the audio advertising library may also be integrated on the cloud.
  • the cloud can be software or services of the cloud platform, or software or services deployed on nodes in the network, such as edge nodes.
  • the client can be a terminal device, or an application, such as an application running on a terminal device for user use.
  • the client can obtain audio programs from the audio content library, and when playing audio programs, it can send an ad request to the cloud.
  • the cloud can determine an audio ad that meets the user's personalized needs from the audio ad library based on the information related to the audio program carried in the ad request and the user's characteristics, and send it to the client for the client to place in the audio program when playing the audio program.
  • audio ads refer to ads played in audio format.
  • Ad requests are used to request audio ads from the cloud.
  • user characteristics may include user portraits and user behavior characteristics.
  • User portraits may include basic information about users, such as gender, age, hobbies, etc.
  • User behavior characteristics may include user behavior information such as clicks, favorites, and comments on historical audio programs.
  • the audio advertisement system provided in the embodiment of the present application can determine audio advertisements based on user characteristics, so that the audio advertisements determined in this way can better meet the personalized needs of users and improve the delivery effect of audio advertisements.
  • the information related to the audio program may include the identifier of the audio program, and may also include the identifier of the advertising slot pre-mined from the audio program.
  • the cloud can determine the audio advertisement for the advertising slot specified in the advertisement request, which can further improve the matching degree between the audio advertisement and the audio program.
  • the advertising slot refers to the time period in the audio program for playing the audio advertisement.
  • the process of mining advertisement slots is usually offline mining, but of course, it can also be online mining.
  • the following introduces the audio advertisement system for mining advertisement slots in conjunction with FIG. 1B .
  • the audio advertisement system may include a cloud and an audio content library.
  • the audio content library may be integrated in the cloud, which may be the same device as the cloud in Fig. 1A or may be a different device.
  • the cloud can obtain, from the audio content library, the audio program whose advertising slots are to be mined, and then determine at least one advertising slot based on the time domain information of the audio program in the voice state and the text content after the audio program is converted into text; the cloud encodes the text content within a period of time before each advertising slot in the at least one advertising slot to obtain a vector representation of each advertising slot.
  • the time domain information may include amplitude (amplitude may also be described as sound intensity), change of amplitude over time, etc.
  • the cloud stores the identifier and the vector representation of each ad slot of an audio program in association with that audio program. If the audio program is stored in the audio content library, the identifier and the vector representation of each ad slot of the program can be returned to the audio content library.
  • the audio content library stores the audio program in association with the identifier and the vector representation of each ad slot of the audio program. As shown in FIG. 1B, the audio content library can store many audio programs, each in association with the identifiers and vector representations of its ad slots.
  • for audio program 1, there are x corresponding ad slots, whose identifiers and vector representations are ad slot 1, vector representation 1, ..., ad slot x, vector representation x.
  • for audio program M, there are y corresponding ad slots, whose identifiers and vector representations are ad slot 1, vector representation 1, ..., ad slot y, vector representation y.
  • x, y, and M are all positive integers.
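  • the association described above can be pictured with a minimal in-memory sketch. The class name `AudioContentLibrary`, its methods, and the two-dimensional vectors are hypothetical and only illustrate the program-to-slot-to-vector mapping; the application does not specify a storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AudioContentLibrary:
    """Toy stand-in for the audio content library (hypothetical structure)."""

    # program identifier -> {ad slot identifier -> vector representation}
    slots: Dict[str, Dict[str, List[float]]] = field(default_factory=dict)

    def store(self, program_id: str, slot_id: str, vector: List[float]) -> None:
        """Store a slot identifier and its vector in association with a program."""
        self.slots.setdefault(program_id, {})[slot_id] = vector

    def lookup(self, program_id: str, slot_id: str) -> List[float]:
        """Return the vector representation of one ad slot of one program."""
        return self.slots[program_id][slot_id]


library = AudioContentLibrary()
library.store("audio_program_1", "ad_slot_1", [0.1, 0.9])
library.store("audio_program_1", "ad_slot_2", [0.7, 0.2])
library.store("audio_program_M", "ad_slot_1", [0.4, 0.4])
```

  • with this layout, a slot identifier only needs to be unique within its own audio program, which matches the per-program numbering of ad slots in FIG. 1B.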
  • the scheme for mining advertisement slots provided in the embodiment of the present application can mine advertisement slots based on the time domain information of the audio program in the voice state and the text content of the audio program after conversion into text format.
  • the quality of the mined advertisement slots is high, and the continuity of the audio program is usually not affected by inserting an audio advertisement in the advertisement slot, thereby improving the user experience and the effect of delivering audio advertisements.
  • the audio program, the identifier of each advertisement slot in the audio program, and the vector representation of each advertisement slot are stored in association in the audio content library, which can facilitate the rapid determination of the audio advertisement matching the requested target advertisement slot when the client sends an advertisement request, thereby improving the delivery efficiency of audio advertisements.
  • the cloud can be a physical machine or a computing instance such as a virtual machine (VM) or a container.
  • the cloud can also be understood as an advertising system, or the cloud is a device in the advertising system.
  • the terminal device (also called user equipment (UE)) is a device with wireless transceiver function, which can be deployed on land, including indoors or outdoors, handheld or vehicle-mounted; it can also be deployed on the water (such as ships, etc.); it can also be deployed in the air (such as airplanes, balloons and satellites, etc.).
  • the terminal can be a mobile phone, a tablet computer (pad), a computer with wireless transceiver function, a virtual reality (VR) terminal, an augmented reality (AR) terminal, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical care, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a wireless terminal in the Internet of Things (IoT), etc.
  • the structure of the terminal device provided in the embodiment of the present application can be understood by referring to FIG. 2A below, and the structure of the cloud device can be understood by referring to FIG. 2B below.
  • the terminal device may include a processor 101, a transceiver 102, a memory 103 and a bus 104.
  • the processor 101, the transceiver 102 and the memory 103 are interconnected via the bus 104.
  • the processor 101 is used to control and manage the actions of the terminal device 10, for example, the processor 101 is used to control the process of playing audio programs and audio advertisements.
  • the transceiver 102 is used to support the terminal device 10 to communicate, for example: the transceiver 102 can execute the steps of sending an advertisement request and receiving an audio advertisement.
  • the memory 103 is used to store the program code and data of the terminal device 10.
  • the processor 101 can be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute various exemplary logic blocks, modules and circuits described in conjunction with the disclosure of this application.
  • the processor can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
  • the bus 104 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc.
  • the bus can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is used in FIG. 2A, but it does not mean that there is only one bus or one
  • FIG. 2A above introduces the structure of the terminal device, and the structure of the cloud device is introduced below in conjunction with FIG. 2B .
  • FIG. 2B is a possible logical structure diagram of a cloud device provided in an embodiment of the present application.
  • the cloud device 20 provided in an embodiment of the present application includes: a processor 201, a communication interface 202, a memory 203 and a bus 204.
  • the processor 201, the communication interface 202 and the memory 203 are interconnected via the bus 204.
  • the processor 201 is used to control and manage the actions of the cloud device 20.
  • the processor 201 is used to execute the process of determining audio advertisements.
  • the communication interface 202 is used to support the cloud device 20 to communicate.
  • the communication interface 202 can execute the steps of receiving an advertisement request and sending an audio advertisement.
  • the memory 203 is used to store the program code and data of the cloud device 20.
  • the processor 201 can be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute various exemplary logic blocks, modules and circuits described in conjunction with the disclosure of this application.
  • the processor can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
  • the bus 204 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc.
  • the bus can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is used in FIG. 2B, but it does not mean that there is only one bus or one type
  • the following describes the method for delivering audio advertisements provided in the embodiment of the present application.
  • the content involved in the cloud execution in the method can be executed by the cloud, or by a component of the cloud (such as a processor, chip, or chip system, etc.).
  • FIG. 3 is a schematic diagram of an embodiment of a method for delivering audio advertisements provided in an embodiment of the present application.
  • an embodiment of the method for delivering audio advertisements provided in an embodiment of the present application includes:
  • the client sends an advertisement request to the cloud.
  • the cloud receives the advertisement request from the client.
  • Ad requests are triggered when the client plays an audio program.
  • "when the client plays an audio program" usually means that the client is playing the audio program and playback is about to reach the target advertising slot.
  • for example, an advertising request is triggered at a preset time point before the target advertising slot is reached.
  • the preset time point can be 5 seconds before the target advertising slot, or some other preset lead time.
  • the advertisement request includes information of the audio program, an identifier of the target advertisement slot, and user characteristics.
  • the target advertising slot is one of at least one advertising slot mined from the audio program, such as advertising slot 1 corresponding to audio program 1 in FIG. 1B , and of course, it may also be other advertising slots.
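  • the trigger condition and the three fields of the advertisement request can be sketched as follows. The names `AdRequest` and `maybe_build_request`, the field names, and the feature dictionary are hypothetical; only the three request fields and the 5-second example lead time come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Optional

LEAD_TIME_S = 5.0  # example lead time from the text; other values are possible


@dataclass
class AdRequest:
    program_id: str                # information of the audio program
    target_slot_id: str            # identifier of the target advertising slot
    user_features: Dict[str, str]  # user characteristics (hypothetical shape)


def maybe_build_request(
    position_s: float,
    slot_start_s: float,
    program_id: str,
    slot_id: str,
    user_features: Dict[str, str],
    lead_time_s: float = LEAD_TIME_S,
) -> Optional[AdRequest]:
    """Return a request once playback is within lead_time_s of the slot, else None."""
    remaining_s = slot_start_s - position_s
    if 0.0 <= remaining_s <= lead_time_s:
        return AdRequest(program_id, slot_id, user_features)
    return None


features = {"favourite_genre": "travel"}
too_early = maybe_build_request(50.0, 60.0, "audio_program_1", "ad_slot_1", features)
on_time = maybe_build_request(55.0, 60.0, "audio_program_1", "ad_slot_1", features)
```

  • in this sketch `too_early` is `None` because 10 seconds remain, while `on_time` is a populated request because exactly 5 seconds remain.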
  • the cloud determines the vector representation of the target advertising slot based on the information of the audio program and the identifier of the target advertising slot.
  • the vector representation of the target advertisement slot is used to describe the content involved in the audio program within a period of time before the target advertisement slot.
  • the information of the audio program may be an identifier or index of the audio program.
  • the audio program is an audio program that the client is about to play, is playing, or has just played.
  • the audio program may be an audio book, an audio song, a cross talk, or current news.
  • one or more ad slots can be mined in an audio program, and each ad slot in an audio program will have a unique identifier.
  • each ad slot has a vector representation, and the identifier of an ad slot is associated with the vector representation of that slot; the identifier and vector representation of the at least one ad slot of each audio program are stored in association with the audio program, and this information can be stored in the audio content library of the cloud platform.
  • the vector representation of an ad slot is the vector obtained by encoding the content of the audio program in the period of time before the ad slot.
  • the "period of time" in the present application can be a duration such as 1 minute, or another value.
  • the specific value can be preset.
  • alternatively, the duration can be a random value within a range.
  • vector representation 1 of advertisement slot 1 can be determined based on audio program 1 and advertisement slot 1 .
  • the cloud determines an audio advertisement matching the target advertisement slot according to the user characteristics and the vector representation of the target advertisement slot.
  • the user characteristics may reflect the user's preference for the type or style of audio programs, for example, the type and content of the audio programs that the user likes to listen to, and the audio program reader that the user likes, etc.
  • because the vector representation of the ad slot reflects the content of the audio program, audio ads related to the content of the audio program can be selected.
  • such audio ads blend well with the audio program and are less likely to disrupt the continuity of the user's listening.
  • the cloud further narrows the selection based on the user's preferences to obtain the audio ad that best matches the target ad slot.
  • the cloud sends the audio advertisement to the client.
  • the client receives the audio advertisement from the cloud.
  • the audio advertisement is to be played by the client when the audio program reaches the target ad slot.
  • the client plays the audio advertisement in the target advertisement slot.
  • one or more audio advertisements that are strongly related to the audio program that the user is listening to before the target advertisement slot can be determined based on the vector representation of the target advertisement slot, and the audio advertisements can be further screened or processed based on the user characteristics to obtain the audio advertisements that match the target advertisement slot. Because the audio advertisements determined by the present application have a higher degree of matching with the audio program, and are combined with the user characteristics, they can better meet the personalized needs of the user and improve the delivery effect of the audio advertisements.
  • the above step 303 may include: the cloud recalls multiple audio advertisements from the audio advertisement library according to the vector representation of the target advertisement slot; the cloud obtains an audio advertisement matching the target advertisement slot from the multiple audio advertisements according to user characteristics.
  • the cloud predicts the completion rate of multiple audio advertisements based on user characteristics and an advertisement sorting model, wherein the audio advertisement with the highest completion rate is the audio advertisement that matches the target advertisement slot, or the audio advertisement with the highest completion rate is the source advertisement of the audio advertisement that matches the target advertisement slot, and the advertisement sorting model is a model that takes user characteristics as input and completion rate as output.
  • the process of determining an audio advertisement may include advertisement recall, advertisement sorting, and style transfer, which are introduced below respectively.
  • Ad recall refers to fetching multiple audio ads from the audio ad library that are relevant to the content described by the vector representation of the target ad slot.
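  • one way to picture ad recall is a nearest-neighbour search over ad vectors. The sketch below assumes that each audio ad in the library also has a vector in the same space as the slot vector and that relevance is measured by cosine similarity; neither assumption is stated in the application, and the names `recall_ads` and `ad_library` are hypothetical.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_ads(slot_vector: List[float], ad_library: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the identifiers of the k ads most similar to the slot vector."""
    ranked = sorted(
        ad_library,
        key=lambda ad_id: cosine(slot_vector, ad_library[ad_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical two-dimensional ad vectors, for illustration only.
ad_library = {
    "travel_ad": [0.9, 0.1],
    "snack_ad": [0.1, 0.9],
    "ticket_ad": [0.8, 0.3],
}
recalled = recall_ads([1.0, 0.2], ad_library, k=2)
```

  • here the two ads whose vectors point in roughly the same direction as the slot vector are recalled, and the unrelated `snack_ad` is left out.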
  • the cloud can use the ad ranking model to score each recalled audio ad, then sort the ads by score and select the audio ad with the highest score.
  • the advertisement ranking model is a model that takes user features as input and completion rate as output.
  • the advertisement ranking model may be a machine learning model, and the advertisement ranking model may be understood by referring to FIG. 4.
  • the input in the advertisement ranking model may include audio programs, audio advertisements, audio advertisement texts (audio advertisements in text form), slot texts (advertising slots in text form), and may also include slot weights, audio advertisement features, audio program features, context features, etc.
  • the input of the advertisement ranking model provided in the embodiment of the present application also includes user features.
  • the neural network may include a deep neural network (DNN), a deep interest network (DIN) or a deep factorization machine (DeepFM).
  • the recalled multiple audio ads can be input into the ad ranking model separately, or all at once or in batches, to obtain the completion rate of each audio ad.
  • the completion rate value can be understood as the score of the audio ad, and the audio ad with the highest score is the audio ad with the highest completion rate.
  • the completion rate refers to the predicted probability that an audio ad will be played completely. The closer the content and style of an audio ad is to the user's preferences, the greater the probability that it will be played completely, and the better the delivery effect after it is delivered. Therefore, the completion rate of multiple recalled audio ads can be predicted based on user characteristics, and the audio ad with the highest completion rate can be determined as the audio ad, which can improve the delivery effect of audio ads.
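  • the selection by completion rate can be sketched as below. The function `toy_predict` is a placeholder for the trained advertisement ranking model, and the topic table and feature values are invented for illustration; a real model would take the richer inputs listed for FIG. 4.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical mapping from ad identifier to its topic.
AD_TOPICS = {"travel_ad": "travel", "ticket_ad": "tickets"}


def toy_predict(user_features: Dict[str, float], ad_id: str) -> float:
    """Placeholder for the ranking model: read the user's affinity for the ad topic."""
    return user_features.get(AD_TOPICS[ad_id], 0.0)


def rank_by_completion_rate(
    ad_ids: List[str],
    user_features: Dict[str, float],
    predict: Callable[[Dict[str, float], str], float],
) -> List[Tuple[str, float]]:
    """Score each recalled ad and sort from highest to lowest predicted completion rate."""
    scored = [(ad_id, predict(user_features, ad_id)) for ad_id in ad_ids]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_completion_rate(
    ["travel_ad", "ticket_ad"],
    {"travel": 0.35, "tickets": 0.6},
    toy_predict,
)
best_ad, best_rate = ranking[0]
```

  • the first entry of `ranking` is the ad that would either be delivered directly or used as the source advertisement for style transfer.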
  • the style of the audio advertisement with the highest completion rate can be adjusted according to the style of the audio program to be played or being played by the user and the user characteristics, so as to obtain the audio advertisement matching the target advertisement slot. This can improve the user's acceptance of audio advertisements, thereby improving the delivery effect of audio advertisements.
  • the audio advertisement with the highest completion rate can be directly determined as the audio advertisement that matches the target advertisement slot, or the audio advertisement with the highest completion rate can be used as the source advertisement to perform the above-mentioned style migration to obtain the audio advertisement that matches the target advertisement slot. This is not limited in the present application.
  • the process may include:
  • the object sound is usually the main sound in the audio advertisement, such as the narrator who narrates the advertisement.
  • the cloud adjusts the object sound in the audio advertisement with the highest score according to the style vector of the object sound and the style vector of the user preference.
  • step 506 transfers the style of the object sound in the audio advertisement.
  • the cloud adjusts the background music in the audio advertisement with the highest score according to the style vector of the background music and the style vector of the user's preference.
  • step 507 transfers the style of the background music in the audio advertisement.
  • style transfer can replace the style of the object sound or background music entirely, or adjust it partially.
  • the results of step 506 and step 507 are fused to obtain the audio advertisement.
  • steps related to background music processing, such as one or more of steps 501, 502, 504, 507 or 508, may be omitted.
  • after adjustment, the style of the audio advertisement is consistent with the style of the audio program, which can meet the user's style preference, improve the user experience, and thus improve the delivery effect of the audio advertisement.
  • an embodiment of the method for mining advertisement slots provided in an embodiment of the present application includes:
  • the cloud can obtain audio programs with advertising slots to be mined from the audio content library.
  • the cloud detects the audio program in voice state.
  • the cloud determines the duration during which the amplitude is continuously lower than the amplitude threshold as the first basic advertising slot.
  • the cloud converts the audio program into text content and records the timestamps of the words in the text content, and determines the time interval between two adjacent words based on the timestamps of the two adjacent words.
  • the first threshold and the second threshold may be the same or different.
  • the cloud determines the time interval between the two adjacent words as the second basic advertising slot.
  • the cloud increases the weight of each basic ad slot that corresponds to a sentence-ending punctuation mark among the restored punctuation marks.
  • the basic advertisement slot in the embodiment of the present application refers to any slot in the union of the first basic advertisement slots and the second basic advertisement slots.
  • a first basic advertisement slot and a second basic advertisement slot may overlap.
  • the cloud divides the text content with restored punctuation into text segments and increases the weight of the basic advertising slot between two text segments.
  • steps 602 to 608 above can be understood by referring to the example of FIG. 7 .
  • the cloud may first detect the audio program (as shown in FIG. 7 , it may be a segment captured from the audio program), and detect that there is a segment of audio with a very small amplitude (less than an amplitude threshold) and a duration exceeding a first threshold. It may be determined that the object in the audio program did not make any sound during this duration, that is, it is in a pause state, and this duration may also be determined as a first basic advertising slot.
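  • the amplitude-based detection of first basic advertising slots can be sketched as follows. The sketch assumes the audio has already been reduced to one amplitude value per fixed-length frame; the function name, the frame length and the threshold values are hypothetical.

```python
from typing import List, Sequence, Tuple


def find_quiet_spans(
    amplitudes: Sequence[float],
    frame_s: float,
    amplitude_threshold: float,
    first_threshold_s: float,
) -> List[Tuple[float, float]]:
    """Return (start, end) times of runs that stay below amplitude_threshold
    for longer than first_threshold_s."""
    spans = []
    start = None
    # The trailing sentinel closes a quiet run that lasts until the end of the audio.
    for i, amp in enumerate(list(amplitudes) + [float("inf")]):
        if amp < amplitude_threshold:
            if start is None:
                start = i
        elif start is not None:
            duration_s = (i - start) * frame_s
            if duration_s > first_threshold_s:
                spans.append((start * frame_s, i * frame_s))
            start = None
    return spans


# One amplitude value per 0.5 s frame (made-up numbers).
amps = [0.8, 0.7, 0.01, 0.02, 0.01, 0.02, 0.9, 0.01, 0.8]
spans = find_quiet_spans(amps, frame_s=0.5, amplitude_threshold=0.05, first_threshold_s=1.0)
```

  • in this example the four consecutive quiet frames form a 2-second pause from 1.0 s to 3.0 s and become a first basic advertising slot, while the single quiet frame later on is too short to qualify.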
  • the cloud can convert the audio shown in FIG. 7 into text through voice recognition.
  • the converted text content, which has no punctuation at this stage, reads: "the weather is really nice today where shall we go to play the Summer Palace is in Haidian District", where the timestamp of "today" is 1, the timestamp of "weather" is 2, the timestamp of "really good" is 3, the timestamp of "us" is 6, the timestamp of "where to go" is 7, the timestamp of "play" is 8, the timestamp of "Summer Palace" is 12, the timestamp of "in" is 13, and the timestamp of "Haidian District" is 14.
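  • using the word timestamps of this example, the detection of second basic advertising slots can be sketched as below. The sketch assumes the interval between two adjacent words is simply the difference of their timestamps and uses a second threshold of 2, which is a hypothetical value; the function name is also hypothetical.

```python
from typing import List, Tuple


def find_word_gaps(words: List[Tuple[str, float]], second_threshold: float) -> List[Tuple[str, str, float]]:
    """Return (previous word, next word, interval) for every pair of adjacent
    words whose timestamp interval is greater than second_threshold."""
    gaps = []
    for (prev_word, prev_t), (next_word, next_t) in zip(words, words[1:]):
        interval = next_t - prev_t
        if interval > second_threshold:
            gaps.append((prev_word, next_word, interval))
    return gaps


# Words and timestamps taken from the example of FIG. 7.
words = [
    ("today", 1), ("weather", 2), ("really good", 3),
    ("us", 6), ("where to go", 7), ("play", 8),
    ("Summer Palace", 12), ("in", 13), ("Haidian District", 14),
]
gaps = find_word_gaps(words, second_threshold=2)
```

  • with these values two intervals exceed the threshold: the one between "really good" and "us" (3) and the one between "play" and "Summer Palace" (4), so both become second basic advertising slots.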
  • the punctuation marks in the text can be restored.
  • the text content after the punctuation marks are restored includes: "The weather is really nice today. Where shall we go to play? The Summer Palace is in Haidian District." Because the pauses at the question mark "?" and the period "." are usually longer than those at the comma ",", the two basic advertising slots corresponding to "?" and "." can be enhanced, that is, the weights of the two basic advertising slots can be increased.
  • the text content shown in FIG. 7 can be further segmented. "The weather is really nice today. Where shall we go to play?" can be divided into text segment 1, and "The Summer Palace is in Haidian District." can be divided into text segment 2. The pause at the boundary between the two text segments is usually longer, so the weight of the basic advertising slot at that boundary, that is, the basic advertising slot between "play" and "Summer Palace", can be further increased.
  • the cloud selects at least one basic advertisement slot with the largest weight as at least one advertisement slot of the audio program.
  • the basic advertising slot between "play" and "Summer Palace" has the largest weight. If one advertising slot is to be selected in the audio segment shown in FIG. 7, the basic advertising slot between "play" and "Summer Palace" can be selected as the advertising slot of the audio segment.
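  • the weighting by punctuation and text segment boundaries, followed by the selection of the slot with the largest weight, can be sketched as below. The class `BasicSlot`, the base weight of 1.0 and the bonus values are hypothetical; the application only states that these weights are increased, not by how much.

```python
from dataclasses import dataclass
from typing import List

ENDING_MARKS = {".", "?"}  # sentence-ending marks named in the example


@dataclass
class BasicSlot:
    before_word: str           # last word before the slot
    after_word: str            # first word after the slot
    punctuation: str           # restored punctuation at the slot, "" if none
    at_segment_boundary: bool  # True if the slot separates two text segments
    weight: float = 1.0        # hypothetical base weight


def weigh(slot: BasicSlot, punctuation_bonus: float = 1.0, segment_bonus: float = 1.0) -> float:
    """Raise the weight for sentence-ending punctuation and for segment boundaries."""
    weight = slot.weight
    if slot.punctuation in ENDING_MARKS:
        weight += punctuation_bonus
    if slot.at_segment_boundary:
        weight += segment_bonus
    return weight


def select_slots(slots: List[BasicSlot], n: int = 1) -> List[BasicSlot]:
    """Return the n basic slots with the largest weight."""
    return sorted(slots, key=weigh, reverse=True)[:n]


slots = [
    BasicSlot("really good", "us", ".", False),
    BasicSlot("play", "Summer Palace", "?", True),
]
best = select_slots(slots, n=1)[0]
```

  • both basic slots receive the punctuation bonus, but only the second also sits on a text segment boundary, so it ends up with the larger weight and is selected.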
  • the cloud encodes text content of a preset length before each advertisement slot in at least one advertisement slot to obtain a vector representation of each advertisement slot.
  • the cloud can encode the text content "The weather is really nice today. Where shall we go to play?" to obtain the vector representation of the advertising slot between "play" and "Summer Palace".
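  • the encoding step can be sketched as below. The application does not name an encoder, so `toy_encode` is a stand-in that hashes tokens into a fixed number of buckets; a real system would more plausibly use a trained text encoder. The function names, the window length and the vector dimension are all hypothetical.

```python
import zlib
from typing import List, Tuple


def text_before_slot(words: List[Tuple[str, float]], slot_time: float, window: float) -> str:
    """Join the words whose timestamps fall within `window` before the slot."""
    return " ".join(w for w, t in words if slot_time - window <= t < slot_time)


def toy_encode(text: str, dim: int = 8) -> List[float]:
    """Stand-in encoder: count tokens per hash bucket (not a trained model)."""
    vector = [0.0] * dim
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike the built-in hash() for str.
        vector[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    return vector


# Words and timestamps from the example of FIG. 7.
words = [
    ("today", 1), ("weather", 2), ("really good", 3),
    ("us", 6), ("where to go", 7), ("play", 8),
    ("Summer Palace", 12), ("in", 13), ("Haidian District", 14),
]
# Any time inside the slot between "play" (8) and "Summer Palace" (12) works here.
context = text_before_slot(words, slot_time=10, window=60)
slot_vector = toy_encode(context)
```

  • the resulting `slot_vector` is what would be stored alongside the slot identifier for later lookup when an advertisement request arrives.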
  • the cloud can send the identifier and vector representation of each advertising slot to the audio content library, where they are stored in association with the audio program.
  • the process may include ad slot identification and ad slot vector representation.
  • step 801 can be understood by referring to the previous steps 601 to 611, and will not be repeated here.
  • step 802: when the client plays the audio program, it determines whether playback has reached the advertisement slot. If so, step 803 is executed; if not, the audio program continues to play.
  • step 803: when playback reaches the advertisement slot, the client determines whether to send an advertisement request. If so, step 804 is executed; if not, the audio program continues to play.
  • Steps 804 and 805 may be understood by referring to the above introduction of advertisement recall and advertisement scoring.
  • step 807 may be understood by referring to the preceding description of style transfer.
  • the ad slot mining process introduced above combines text content to generate a vector representation. In this way, when placing advertisements, audio ads that better match the audio content can be determined. In addition, user features are also used when placing audio ads. This can better meet the personalized needs of users and improve the effect of audio ad placement.
  • a structure of a cloud device 90 provided in an embodiment of the present application includes:
  • the receiving unit 901 is used to receive an advertisement request from a client, wherein the advertisement request includes information of an audio program, an identifier of a target advertisement slot, and user characteristics, wherein the target advertisement slot is one of at least one advertisement slot mined from the audio program, and the advertisement request is triggered when the client plays the audio program.
  • the receiving unit 901 may execute step 301 in the above method embodiment.
  • the first processing unit 902 is used to determine the vector representation of the target advertising slot according to the information of the audio program and the identifier of the target advertising slot, wherein the vector representation of the target advertising slot is used to describe the content involved in the audio program within a period of time before the target advertising slot.
  • the first processing unit 902 can perform step 302 in the above method embodiment.
  • the second processing unit 903 is configured to obtain an audio advertisement matching the target advertisement slot according to the user characteristics and the vector representation of the target advertisement slot.
  • the second processing unit 903 may execute step 303 in the above method embodiment.
  • the sending unit 904 is used to send an audio advertisement to the client, and the audio advertisement is used for the client to play when the audio program is played to the target advertisement slot.
  • the sending unit 904 can execute step 304 in the above method embodiment.
  • one or more audio advertisements that are strongly related to the audio program that the user is listening to before the target advertisement slot can be determined based on the vector representation of the target advertisement slot, and the audio advertisements can be further screened or processed based on the user characteristics to obtain the audio advertisements that match the target advertisement slot. Because the audio advertisements determined by the present application have a higher degree of matching with the audio program, and are combined with the user characteristics, they can better meet the personalized needs of the user and improve the delivery effect of the audio advertisements.
  • the second processing unit 903 is specifically configured to recall a plurality of audio advertisements from the audio advertisement library according to the vector representation of the target advertisement slot; and obtain an audio advertisement matching the target advertisement slot from the plurality of audio advertisements according to user characteristics.
  • the second processing unit 903 is specifically used to predict the completion rate of multiple audio advertisements based on user characteristics and an advertisement sorting model, wherein the audio advertisement with the highest completion rate is the audio advertisement that matches the target advertisement slot, or the audio advertisement with the highest completion rate is the source advertisement of the audio advertisement that matches the target advertisement slot, and the advertisement sorting model is a model that takes user characteristics as input and completion rate as output.
  • the second processing unit 903 is specifically used to adjust the style of the audio advertisement with the highest completion rate according to the style of the audio program and user characteristics to obtain the audio advertisement that matches the target advertising slot when the audio advertisement with the highest completion rate is the source advertisement of the audio advertisement that matches the target advertising slot.
  • the second processing unit 903 is specifically used to adjust the object sound in the audio advertisement with the highest completion rate according to the style vector of the object sound in the audio program and the style vector of the user's preference, the style vector of the object sound in the audio program is obtained by encoding the object sound in the audio program, and the style vector of the user's preference is obtained by encoding user features; adjust the background music in the audio advertisement with the highest completion rate according to the style vector of the background music in the audio program and the style vector of the user's preference, the style vector of the background music in the audio program is obtained by encoding the background music in the audio program; and fuse the adjusted object sound in the audio advertisement with the highest completion rate and the adjusted background music in the audio advertisement with the highest completion rate to obtain an audio advertisement that matches the target advertising slot.
  • the first processing unit 902 is further used to determine at least one advertising slot based on time domain information of the audio program in a voice state and text content after the audio program is converted into text; and encode the text content within a period of time before each advertising slot in the at least one advertising slot to obtain a vector representation of each advertising slot.
  • the first processing unit 902 is specifically used for, when the time domain information is amplitude, if the duration for which the amplitude of the audio program in the voice state is continuously lower than the amplitude threshold exceeds a first threshold, then the duration for which the amplitude is continuously lower than the amplitude threshold is determined as a first basic advertising slot; if the time interval between two adjacent words in the text content after the audio program is converted is greater than a second threshold, then the time interval between the two adjacent words is determined as a second basic advertising slot, and the time interval between the two adjacent words is determined by the timestamp of each word during text conversion; and at least one advertising slot is determined from the union of the first basic advertising slot and the second basic advertising slot.
  • the first processing unit 902 is specifically used to select, from the union of the first basic advertisement slots and the second basic advertisement slots, at least one advertisement slot with the largest weight as the at least one advertisement slot of the audio program, and the weight of each advertisement slot in the at least one advertisement slot is determined by the punctuation marks and/or text segment division positions corresponding to each advertisement slot.
  • the user characteristics include a user portrait and user behavior characteristics regarding historical audio programs.
  • a structure of a client 100 provided in an embodiment of the present application includes:
  • the sending unit 1001 is used to send an advertisement request to the cloud when playing an audio program.
  • the advertisement request includes information of the audio program, an identifier of a target advertisement slot, and user characteristics.
  • the target advertisement slot is one of at least one advertisement slot mined from the audio program.
  • the receiving unit 1002 is used to receive an audio advertisement sent by the cloud that matches the target advertisement slot.
  • the processing unit 1003 is configured to play an audio advertisement when the audio program is played to a target advertisement slot.
  • the user characteristics include a user portrait and user behavior characteristics regarding historical audio programs.
  • the functions of each unit in the client 100 are similar to those described in the embodiments shown in the aforementioned FIGS. 3 to 8, and will not be repeated here.
  • the embodiment of the present application further provides another structure of the cloud device 110 including:
  • the acquisition unit 1101 is used to acquire the audio program of the advertisement slot to be mined.
  • the first processing unit 1102 is configured to determine at least one advertisement slot based on time domain information of the audio program in a voice state and text content after the audio program is converted into text.
  • the second processing unit 1103 is configured to encode the text content of each advertisement slot in the at least one advertisement slot within a previous period of time to obtain a vector representation of each advertisement slot.
  • the first processing unit 1102 is specifically used for, when the time domain information is amplitude, if the duration for which the amplitude of the audio program in the voice state is continuously lower than the amplitude threshold exceeds a first threshold, then the duration for which the amplitude is continuously lower than the amplitude threshold is determined as a first basic advertising slot; if the time interval between two adjacent words in the text content after the audio program is converted is greater than a second threshold, then the time interval between the two adjacent words is determined as a second basic advertising slot, and the time interval between the two adjacent words is determined by the timestamp of each word during text conversion; and at least one advertising slot is determined from the union of the first basic advertising slot and the second basic advertising slot.
  • the first processing unit 1102 is specifically used to select, from the union of the first basic advertisement slots and the second basic advertisement slots, at least one advertisement slot with the largest weight as the at least one advertisement slot of the audio program, and the weight of each advertisement slot is determined by the punctuation mark and/or the text-segment division position corresponding to that advertisement slot.
  • each unit in the cloud device 110 is similar to those described in the embodiments shown in the aforementioned Figures 3 to 8, and will not be repeated here.
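  • a computer-readable storage medium is also provided, in which computer-executable instructions are stored; when the processor of the cloud device executes the computer-executable instructions, the cloud device executes the steps executed by the cloud device in Figures 3 to 8 above.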
  • a computer-readable storage medium is also provided, in which computer-executable instructions are stored; when the processor of the client executes the computer-executable instructions, the client executes the steps performed by the client in Figures 3 to 8 above.
  • a computer program product is also provided; the computer program product includes computer program code, and when the computer program code is executed on a computer device, the computer device executes the steps executed by the cloud device or the client in Figures 3 to 8 above.
  • a chip system which includes one or more interface circuits and one or more processors; the interface circuit and the processor are interconnected by lines; the interface circuit is used to receive signals from the memory of the terminal and send signals to the processor, and the signals include computer instructions stored in the memory; when the processor executes the computer instructions, the terminal executes the steps performed by the cloud device or the client in the above-mentioned Figures 3 to 8.
  • the chip system may also include a memory, which is used to store program instructions and data necessary for the control device.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of units is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
  • Another point is that the mutual coupling or direct coupling or communication connection shown or discussed can be an indirect coupling or communication connection through some interfaces, devices or units, which can be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated units may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When the integrated unit is implemented using software, it can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the process or function described in the embodiment of the present application is generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means.
  • the computer-readable storage medium may be any available medium that a computer can access or a data storage device such as a server or data center that includes one or more available media integrated.
  • the available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state drive (SSD)), etc.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Signal Processing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

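A method for delivering audio advertisements, comprising: when playing an audio program, a client (100) sends an advertisement request to a cloud device (20, 90, 110) (301), the advertisement request including information of the audio program, an identifier of a target advertisement slot, and user characteristics; the cloud device (20, 90, 110) determines a vector representation of the target advertisement slot according to the information of the audio program and the identifier of the target advertisement slot (302), the vector representation describing the content involved in the audio program during a period of time before the target advertisement slot; the cloud device (20, 90, 110) determines, according to the user characteristics and the vector representation of the target advertisement slot, an audio advertisement matching the target advertisement slot (303); the cloud device (20, 90, 110) sends the audio advertisement to the client (100) (304); and the client (100) plays the audio advertisement when playback of the audio program reaches the target advertisement slot (305). As a result, the audio advertisement matches the audio program more closely and, because user characteristics are taken into account, better meets the user's personalized needs, which can improve the delivery effect of the audio advertisement.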

Description

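Method, device and system for delivering audio advertisements
Technical Field
The present application relates to the field of computer technology, and in particular to a method, device and system for delivering audio advertisements.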
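Background
A podcast is a recorded online broadcast or online audio program, such as an audiobook, crosstalk (xiangsheng), or current-affairs news. The podcast market is growing steadily, and the number of users has reached hundreds of millions. As podcasts flourish, the corresponding advertising market share keeps growing; with the development of podcasts at home and abroad, audio advertising has become an important form of advertising.
Inserting an audio advertisement into an audio program first requires mining, offline, the advertisement slots in the audio program, that is, the positions at which audio advertisements are inserted into the program. An audio advertisement is then configured for each advertisement slot. When playback of the audio program reaches a slot, the audio advertisement configured for that slot is played.
Audio advertisements configured offline in this way match the audio program poorly, often disrupt the continuity of listening, and have a poor delivery effect.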
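Summary
The present application provides a method for delivering audio advertisements, used to deliver, within an audio program, audio advertisements that meet the user's personalized needs. The present application also provides a corresponding device, system, computer-readable storage medium, computer program product, and the like.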
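A first aspect of the present application provides a method for delivering audio advertisements, comprising: the cloud receives an advertisement request from a client, the advertisement request including information of an audio program, an identifier of a target advertisement slot, and user characteristics, where the target advertisement slot is one of at least one advertisement slot mined from the audio program and the advertisement request is triggered when the client plays the audio program; the cloud determines a vector representation of the target advertisement slot according to the information of the audio program and the identifier of the target advertisement slot, the vector representation describing the content involved in the audio program during a period of time before the target advertisement slot; the cloud obtains, according to the user characteristics and the vector representation of the target advertisement slot, an audio advertisement matching the target advertisement slot; and the cloud sends the audio advertisement to the client, the audio advertisement being for the client to play when playback of the audio program reaches the target advertisement slot.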
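In the present application, the cloud may be software or a service of a cloud platform, or software or a service deployed on a node in the network, such as an edge node. The cloud may run on a standalone physical machine or on virtualized resources.
In the present application, the client may be a terminal device or an application, for example an application that runs on a terminal device for the user to use.
In the present application, "when the client plays the audio program" usually means when playback is about to reach the target advertisement slot: the advertisement request is typically triggered at a preset point in time before the target advertisement slot is reached, for example 5 seconds before the slot, or at some other lead time.
In the present application, an audio advertisement is an advertisement played in audio form. The advertisement request is used to request an audio advertisement from the cloud.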
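In the present application, the information of the audio program may be an identifier or index of the audio program. The audio program is one that the client is about to play, is currently playing, or has just finished playing. The audio program may be an audiobook, a song in audio form, crosstalk, current-affairs news, or the like.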
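In the present application, one or more advertisement slots may be mined from an audio program during the slot-mining stage, and each advertisement slot in an audio program has a unique identifier. Each advertisement slot also has a vector representation; the identifier of a slot is associated with the vector representation of that slot, and the identifiers and vector representations of the at least one advertisement slot of each audio program are stored in association with that audio program. All of this information may be stored in the audio content library of the cloud platform. The vector representation of an advertisement slot is the vector obtained by encoding the content involved during a period of time before the slot. The "period of time" in the present application may be a duration such as 1 minute or another value. In one implementation the specific value is preset; in another, the duration is chosen at random within a range.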
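In the present application, the user characteristics may include a user portrait and user behavior characteristics. The user portrait may include basic information about the user, such as gender, age, and hobbies. The user behavior characteristics may include behavior information such as the user's clicks, favorites, and comments on historical audio programs.
In the present application, the identifiers and vector representations of all advertisement slots associated with an audio program can be looked up from the information of the audio program; then, using the identifier of the target advertisement slot, the vector representation of the target slot can be determined.
In the present application, the vector representation of the target advertisement slot can be used to determine one or more audio advertisements strongly related to the audio program the user is listening to just before the target slot, and the user characteristics can then be used to further filter or process these advertisements to obtain the audio advertisement matching the target slot. Because the audio advertisement determined in this way matches the audio program more closely and also takes user characteristics into account, it better meets the user's personalized needs and can improve the delivery effect of the audio advertisement.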
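In a possible implementation, the above step in which the cloud determines, according to the user characteristics and the vector representation of the target advertisement slot, the audio advertisement matching the target advertisement slot comprises: the cloud recalls multiple audio advertisements from an audio advertisement library according to the vector representation of the target advertisement slot; and the cloud obtains, from the multiple audio advertisements and according to the user characteristics, the audio advertisement matching the target advertisement slot.
In this possible implementation, the cloud uses the vector representation of the target advertisement slot to recall multiple audio advertisements strongly related to the audio program the user is listening to just before the target slot, and then selects from them the advertisement that best matches the user characteristics. This improves how well the audio advertisement matches the audio program being played on the client.
In a possible implementation, the above step in which the cloud obtains, from the multiple audio advertisements and according to the user characteristics, the audio advertisement matching the target advertisement slot comprises: the cloud predicts the completion rates of the multiple audio advertisements according to the user characteristics and an advertisement ranking model, where the audio advertisement with the highest completion rate is the audio advertisement matching the target advertisement slot, or is the source advertisement of the audio advertisement matching the target advertisement slot; the advertisement ranking model takes user characteristics as input and outputs a completion rate.
In this possible implementation, the completion rate is the predicted probability that an audio advertisement is played in full. The closer the content and style of an audio advertisement are to the user's preferences, the more likely it is to be played in full and the better its delivery effect. The completion rates of the recalled audio advertisements can therefore be predicted from the user characteristics, and the advertisement with the highest completion rate is used as the audio advertisement or as its source advertisement, which can improve the delivery effect.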
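In a possible implementation, when the audio advertisement with the highest completion rate is the source advertisement of the audio advertisement matching the target advertisement slot, the method further comprises: the cloud adjusts the style of the audio advertisement with the highest completion rate according to the style of the audio program and the user characteristics, to obtain the audio advertisement matching the target advertisement slot.
In this possible implementation, after determining the audio advertisement with the highest completion rate through the advertisement ranking model, the cloud may adjust its style according to the user characteristics and the style of the audio program that the user is about to play or is playing. This can increase the user's acceptance of the audio advertisement and thereby improve its delivery effect.
In a possible implementation, the above step in which the cloud adjusts the style of the audio advertisement with the highest completion rate according to the style of the audio program and the user characteristics comprises: the cloud adjusts the object voice in the audio advertisement with the highest completion rate according to the style vector of the object voice in the audio program and the style vector of the user's preferences, where the style vector of the object voice in the audio program is obtained by encoding the object voice in the audio program and the style vector of the user's preferences is obtained by encoding the user characteristics; the cloud adjusts the background music in the audio advertisement with the highest completion rate according to the style vector of the background music in the audio program and the style vector of the user's preferences, where the style vector of the background music in the audio program is obtained by encoding the background music in the audio program; and the cloud fuses the adjusted object voice and the adjusted background music of the audio advertisement with the highest completion rate to obtain the audio advertisement matching the target advertisement slot.
In this possible implementation, if the audio program includes an object voice and background music, the two can be separated and encoded individually to obtain a style vector for the object voice and a style vector for the background music. These are then combined with the style vector of the user's preferences to adjust the object voice and background music of the highest-scoring audio advertisement (that is, the one with the highest completion rate), yielding the audio advertisement. After adjustment, the style of the audio advertisement is consistent with that of the audio program, which can satisfy the user's style preferences, improve the user experience, and thereby improve the delivery effect.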
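In a possible implementation, before the cloud receives the advertisement request sent by the client, the method further comprises: the cloud determines at least one advertisement slot based on time-domain information of the audio program in a voice state and on the text content obtained by converting the audio program into text; and the cloud encodes the text content within a period of time before each of the at least one advertisement slot to obtain the vector representation of each advertisement slot.
In this possible implementation, the time-domain information may include the amplitude (which may also be described as sound intensity), the change of amplitude over time, and the like.
The cloud may also perform the slot-mining task before receiving the advertisement request. Slot mining determines the advertisement slots of an audio program from its time-domain information in the voice state together with the text content obtained by converting the program into text. Because slots are mined jointly from the time-domain information and the text content, the mined slots are of high quality: inserting an advertisement at such a slot usually does not disrupt the continuity of the audio program, which can improve the user experience and the delivery effect of the audio advertisement.
In a possible implementation, the method further comprises: the cloud stores the audio program, the identifier of each advertisement slot, and the vector representation of each advertisement slot in association with one another.
In this possible implementation, storing the audio program, the identifier of each of its advertisement slots, and the vector representation of each slot in association makes it possible, when the client sends an advertisement request, to quickly determine the audio advertisement matching the requested target advertisement slot, improving delivery efficiency.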
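In a possible implementation, the above step in which the cloud determines at least one advertisement slot based on the time-domain information of the audio program in the voice state and the text content obtained by converting the audio program into text comprises: when the time-domain information is amplitude, if the duration for which the amplitude of the audio program in the voice state stays continuously below an amplitude threshold exceeds a first threshold, the cloud determines that duration as a first basic advertisement slot; if the time interval between two adjacent words in the converted text content is greater than a second threshold, the cloud determines that interval as a second basic advertisement slot, the interval between two adjacent words being determined from the timestamp of each word recorded during text conversion; and the cloud determines the at least one advertisement slot from the union of the first basic advertisement slots and the second basic advertisement slots.
In this possible implementation, selecting advertisement slots from the union of the first and second basic advertisement slots widens the range of slots to choose from.
In a possible implementation, the above step in which the cloud determines the at least one advertisement slot from the union of the first basic advertisement slots and the second basic advertisement slots comprises: the cloud selects, from that union, at least one advertisement slot with the largest weight as the at least one advertisement slot of the audio program, where the weight of each advertisement slot is determined by the punctuation mark and/or the text-segment division position corresponding to that slot.
In this possible implementation, the weight of a basic advertisement slot can be raised on the basis of punctuation, text segmentation, and the like, that is, the weight of the corresponding advertisement slot is increased; the at least one basic advertisement slot with the largest weight is then selected as the at least one advertisement slot of the audio program. This improves the quality of the selected slots.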
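In a possible implementation, the user characteristics include a user portrait and the user's behavior characteristics regarding historical audio programs.
A second aspect of the present application provides a method for delivering audio advertisements, comprising: when playing an audio program, the client sends an advertisement request to the cloud, the advertisement request including information of the audio program, an identifier of a target advertisement slot, and user characteristics, where the target advertisement slot is one of at least one advertisement slot mined from the audio program; the client receives the audio advertisement, sent by the cloud, that matches the target advertisement slot; and the client plays the audio advertisement when playback of the audio program reaches the target advertisement slot.
In the present application, "when the client plays the audio program" usually means when playback is about to reach the target advertisement slot: the advertisement request is typically triggered at a preset point in time before the target advertisement slot is reached, for example 5 seconds before the slot, or at some other lead time.
In a possible implementation, the user characteristics include a user portrait and the user's preference characteristics regarding audio programs.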
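A third aspect of the present application provides a method for mining advertisement slots, comprising: the cloud obtains an audio program whose advertisement slots are to be mined; the cloud determines at least one advertisement slot based on time-domain information of the audio program in a voice state and on the text content obtained by converting the audio program into text; and the cloud encodes the text content within a period of time before each of the at least one advertisement slot to obtain the vector representation of each advertisement slot.
In a possible implementation, the method further comprises: the cloud stores the audio program, the identifier of each advertisement slot, and the vector representation of each advertisement slot in association with one another.
In a possible implementation, the above step in which the cloud determines at least one advertisement slot based on the time-domain information of the audio program in the voice state and the text content obtained by converting the audio program into text comprises: when the time-domain information is amplitude, if the duration for which the amplitude of the audio program in the voice state stays continuously below an amplitude threshold exceeds a first threshold, the cloud determines that duration as a first basic advertisement slot; if the time interval between two adjacent words in the converted text content is greater than a second threshold, the cloud determines that interval as a second basic advertisement slot, the interval between two adjacent words being determined from the timestamp of each word recorded during text conversion; and the cloud determines the at least one advertisement slot from the union of the first basic advertisement slots and the second basic advertisement slots.
In a possible implementation, the above step in which the cloud determines the at least one advertisement slot from the union of the first basic advertisement slots and the second basic advertisement slots comprises: the cloud selects, from that union, at least one advertisement slot with the largest weight as the at least one advertisement slot of the audio program, where the weight of each advertisement slot is determined by the punctuation mark and/or the text-segment division position corresponding to that slot.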
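A fourth aspect of the present application provides a cloud device configured to perform the method of the first aspect or any possible implementation of the first aspect. Specifically, the cloud device includes modules or units for performing that method, such as a processing unit, a sending unit, and a receiving unit.
A fifth aspect of the present application provides a client configured to perform the method of the second aspect. Specifically, the client includes modules or units for performing the method of the second aspect or any possible implementation of the second aspect, such as a receiving unit, a display unit, and a sending unit.
A sixth aspect of the present application provides a cloud device configured to perform the method of the third aspect or any possible implementation of the third aspect. Specifically, the cloud device includes modules or units for performing that method, such as a processing unit, a sending unit, and a receiving unit.
A seventh aspect of the present application provides a cloud device. The cloud device may include at least one processor, a memory, and a communication interface. The processor is coupled to the memory and the communication interface. The memory is used to store instructions, the processor is used to execute the instructions, and the communication interface is used to communicate with other network elements under the control of the processor. When executed by the processor, the instructions cause the processor to perform the method of the first aspect or any possible implementation of the first aspect.
An eighth aspect of the present application provides a client including a transceiver, a processor, and a memory, the transceiver and the processor being coupled to the memory. The memory is used to store a program or instructions which, when executed by the processor, cause the client to perform the method of the second aspect or any possible implementation of the second aspect.
A ninth aspect of the present application provides a cloud device. The cloud device may include at least one processor, a memory, and a communication interface. The processor is coupled to the memory and the communication interface. The memory is used to store instructions, the processor is used to execute the instructions, and the communication interface is used to communicate with other network elements under the control of the processor. When executed by the processor, the instructions cause the processor to perform the method of the third aspect or any possible implementation of the third aspect.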
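A tenth aspect of the present application provides a chip system including one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected by lines; the interface circuits are used to receive signals from a memory of the cloud device and send the signals to the processors, the signals including computer instructions stored in the memory; when the processors execute the computer instructions, the cloud device performs the method of the first aspect or any possible implementation of the first aspect.
An eleventh aspect of the present application provides a chip system including one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected by lines; the interface circuits are used to receive signals from a memory of the client and send the signals to the processors, the signals including computer instructions stored in the memory; when the processors execute the computer instructions, the client performs the method of the second aspect or any possible implementation of the second aspect.
A twelfth aspect of the present application provides a chip system including one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected by lines; the interface circuits are used to receive signals from a memory of the cloud device and send the signals to the processors, the signals including computer instructions stored in the memory; when the processors execute the computer instructions, the cloud device performs the method of the third aspect or any possible implementation of the third aspect.
A thirteenth aspect of the present application provides a computer-readable storage medium storing a computer program or instructions which, when run on a computer device, cause the computer device to perform the method of the first aspect or any possible implementation of the first aspect.
A fourteenth aspect of the present application provides a computer-readable storage medium storing a computer program or instructions which, when run on a computer device, cause the computer device to perform the method of the second aspect or any possible implementation of the second aspect.
A fifteenth aspect of the present application provides a computer-readable storage medium storing a computer program or instructions which, when run on a computer device, cause the computer device to perform the method of the third aspect or any possible implementation of the third aspect.
A sixteenth aspect of the present application provides a computer program product including computer program code which, when executed on a computer device, causes the computer device to perform the method of the first aspect or any possible implementation of the first aspect.
A seventeenth aspect of the present application provides a computer program product including computer program code which, when executed on a computer device, causes the computer device to perform the method of the second aspect or any possible implementation of the second aspect.
An eighteenth aspect of the present application provides a computer program product including computer program code which, when executed on a computer device, causes the computer device to perform the method of the third aspect or any possible implementation of the third aspect.
A nineteenth aspect of the present application provides an audio advertisement system including a cloud device and a client, the cloud device being used to perform the method of the first aspect or any possible implementation of the first aspect, and the client being used to perform the method of the second aspect or any possible implementation of the second aspect.
A twentieth aspect of the present application provides an audio advertisement system including a cloud device and an audio content library; the cloud device obtains an audio program from the audio content library and performs the method of the third aspect or any possible implementation of the third aspect.
For the technical effects of the second to twentieth aspects or any of their possible implementations, refer to the technical effects of the first aspect or of its various possible implementations; they are not repeated here.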
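Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present application, and a person skilled in the art may derive other drawings from them without creative effort.
FIG. 1A is a schematic architecture diagram of the audio advertisement system provided by an embodiment of the present application;
FIG. 1B is another schematic architecture diagram of the audio advertisement system provided by an embodiment of the present application;
FIG. 2A is a schematic structural diagram of the client provided by an embodiment of the present application;
FIG. 2B is a schematic structural diagram of the cloud device provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of an embodiment of the method for delivering audio advertisements provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an advertisement ranking model provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of another embodiment of the method for delivering audio advertisements provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of an embodiment of the method for mining advertisement slots provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a scenario example provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of an embodiment of advertisement-slot mining and audio advertisement delivery provided by an embodiment of the present application;
FIG. 9 is another schematic structural diagram of the cloud device provided by an embodiment of the present application;
FIG. 10 is another schematic structural diagram of the client provided by an embodiment of the present application;
FIG. 11 is another schematic structural diagram of the cloud device provided by an embodiment of the present application.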
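Detailed Description
The embodiments of the present application are described below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present application. A person of ordinary skill in the art will appreciate that, as technology develops and new scenarios emerge, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
The terms "first", "second" and the like in the description, claims and drawings of the present application are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
The embodiments of the present application provide a method for delivering audio advertisements, used to deliver, within an audio program, audio advertisements that meet the user's personalized needs. The present application also provides a corresponding device, system, computer-readable storage medium, computer program product, and the like. Each is described in detail below.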
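FIG. 1A is a schematic architecture diagram of the audio advertisement system provided by an embodiment of the present application.
As shown in FIG. 1A, the audio advertisement system provided by the embodiments of the present application includes a cloud and multiple clients, and the cloud can communicate with the clients over a network. The system may also include an audio content library and an audio advertisement library; of course, the audio content library and/or the audio advertisement library may also be integrated in the cloud.
In the embodiments of the present application, the cloud may be software or a service of a cloud platform, or software or a service deployed on a node in the network, such as an edge node. The client may be a terminal device or an application, for example an application that runs on a terminal device for the user to use.
When using a podcast-type application, the client can obtain an audio program from the audio content library and, while playing the audio program, send an advertisement request to the cloud. Based on the audio-program-related information carried in the request and on the user characteristics, the cloud can determine from the audio advertisement library an audio advertisement that meets the user's personalized needs and send it to the client, which inserts it into the audio program during playback.
In the present application, an audio advertisement is an advertisement played in audio form. The advertisement request is used to request an audio advertisement from the cloud.
In the present application, the user characteristics may include a user portrait and user behavior characteristics. The user portrait may include basic information about the user, such as gender, age, and hobbies. The user behavior characteristics may include behavior information such as the user's clicks, favorites, and comments on historical audio programs.
The audio advertisement system provided by the embodiments of the present application can take user characteristics into account when determining the audio advertisement, so that the advertisement better meets the user's personalized needs and its delivery effect can be improved.
In the embodiments of the present application, the information related to the audio program may include the identifier of the audio program and may also include the identifier of an advertisement slot mined in advance from the audio program. The cloud can thus determine an audio advertisement for the slot specified in the advertisement request, which can further improve how well the advertisement matches the program. An advertisement slot is a time period in the audio program used for playing an audio advertisement.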
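In the embodiments of the present application, slot mining is usually performed offline, though it can also be performed online. The audio advertisement system used for slot mining is introduced below with reference to FIG. 1B.
As shown in FIG. 1B, the audio advertisement system may include a cloud and an audio content library. The audio content library may be integrated in the cloud, and this cloud may be the same device as the cloud in FIG. 1A or a different device.
During slot mining, the cloud can obtain from the audio content library an audio program whose advertisement slots are to be mined; the cloud then determines at least one advertisement slot based on time-domain information of the audio program in a voice state and on the text content obtained by converting the audio program into text; and the cloud encodes the text content within a period of time before each of the at least one advertisement slot to obtain the vector representation of each advertisement slot.
In the embodiments of the present application, the time-domain information may include the amplitude (which may also be described as sound intensity), the change of amplitude over time, and the like.
The cloud stores the identifier and vector representation of each advertisement slot of an audio program in association with that audio program. If the audio program is stored in the audio content library, the identifiers and vector representations of its slots can be returned to the audio content library, which stores the audio program in association with the identifier and vector representation of each of its slots. As in FIG. 1B, the audio content library can store many audio programs together with the identifiers and vector representations of their advertisement slots. For example, audio program 1 has x advertisement slots, whose identifiers and vector representations are advertisement slot 1, vector representation 1, ..., advertisement slot x, vector representation x. Audio program M has y advertisement slots, whose identifiers and vector representations are advertisement slot 1, vector representation 1, ..., advertisement slot y, vector representation y. Here x, y and M are all positive integers.
In the slot-mining scheme provided by the embodiments of the present application, slots are mined jointly from the time-domain information of the audio program in the voice state and the text content obtained by converting the program into text, so the mined slots are of high quality: inserting an audio advertisement at such a slot usually does not disrupt the continuity of the audio program, which can improve the user experience and the delivery effect of the audio advertisement. In addition, storing the audio program, the identifier of each of its advertisement slots, and the vector representation of each slot in association in the audio content library makes it possible, when the client sends an advertisement request, to quickly determine the audio advertisement matching the requested target advertisement slot, improving delivery efficiency.
In the embodiments of the present application, the cloud may be a physical machine, or a computing instance such as a virtual machine (VM) or a container; the cloud may also be understood as the advertisement system itself, or as a device in the advertisement system.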
客户端为终端设备时,该终端设备(也可以称为用户设备(user equipment,UE))是一种具有无线收发功能的设备,可以部署在陆地上,包括室内或室外、手持或车载;也可 以部署在水面上(如轮船等);还可以部署在空中(例如飞机、气球和卫星上等)。终端可以是手机(mobile phone)、平板电脑(pad)、带无线收发功能的电脑、虚拟现实(virtual reality,VR)终端、增强现实(augmented reality,AR)终端、工业控制(industrial control)中的无线终端、无人驾驶(self driving)中的无线终端、远程医疗(remote medical)中的无线终端、智能电网(smart grid)中的无线终端、运输安全(transportation safety)中的无线终端、智慧城市(smart city)中的无线终端、智慧家庭(smart home)中的无线终端、以物联网(internet of things,IoT)中的无线终端等。
本申请实施例提供的终端设备的结构可以参阅如下图2A进行理解,云端装置的结构可以参阅如下图2B进行理解。
请参考图2A,为本申请实施例提供的一种终端设备的结构示意图。如图2A所示,终端设备可以包括处理器101、收发器102、存储器103以及总线104。处理器101、收发器102以及存储器103通过总线104相互连接。在本申请的实施例中,处理器101用于对终端设备10的动作进行控制管理,例如,处理器101用于控制播放音频节目和音频广告过程。收发器102用于支持终端设备10进行通信,例如:收发器102可以执行发送广告请求和接收音频广告的步骤。存储器103,用于存储终端设备10的程序代码和数据。
其中,处理器101可以是中央处理器单元,通用处理器,数字信号处理器,专用集成电路,现场可编程门阵列或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,数字信号处理器和微处理器的组合等等。总线104可以是外设部件互连标准(Peripheral Component Interconnect,PCI)总线或扩展工业标准结构(Extended Industry Standard Architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图2A中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
以上图2A介绍了终端设备的结构,下面结合图2B介绍云端装置的结构。
图2B为本申请的实施例提供的云端装置的一种可能的逻辑结构示意图。如图2B所示,本申请实施例提供的云端装置20包括:处理器201、通信接口202、存储器203以及总线204。处理器201、通信接口202以及存储器203通过总线204相互连接。在本申请的实施例中,处理器201用于对云端装置20的动作进行控制管理,例如,处理器201用于执行确定音频广告过程。通信接口202用于支持云端装置20进行通信,例如:通信接口202可以执行接收广告请求和发送音频广告的步骤。存储器203,用于存储云端装置20的程序代码和数据。
其中,处理器201可以是中央处理器单元,通用处理器,数字信号处理器,专用集成电路,现场可编程门阵列或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,数字信号处理器和微处理器的组合等等。总线204可以是外设部件互连标准(Peripheral Component Interconnect,PCI)总线或扩展工业标准结构(Extended Industry Standard  Architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图2B中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
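The method for delivering audio advertisements provided by the embodiments of the present application is described below. The operations performed by the cloud in this method may be performed by the cloud itself or by a component of the cloud (for example a processor, a chip, or a chip system).
FIG. 3 is a schematic diagram of an embodiment of the method for delivering audio advertisements provided by an embodiment of the present application.
As shown in FIG. 3, an embodiment of the method for delivering audio advertisements provided by the embodiments of the present application includes:
301. The client sends an advertisement request to the cloud. Correspondingly, the cloud receives the advertisement request from the client.
The advertisement request is triggered when the client plays the audio program.
In the embodiments of the present application, "when the client plays the audio program" usually means when playback is about to reach the target advertisement slot: the advertisement request is typically triggered at a preset point in time before the target advertisement slot is reached, for example 5 seconds before the slot, or at some other lead time.
In the embodiments of the present application, the advertisement request includes information of the audio program, an identifier of the target advertisement slot, and user characteristics.
In the embodiments of the present application, the target advertisement slot is one of at least one advertisement slot mined from the audio program, for example advertisement slot 1 of audio program 1 in FIG. 1B; it may of course be another advertisement slot.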
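302. The cloud determines the vector representation of the target advertisement slot according to the information of the audio program and the identifier of the target advertisement slot.
The vector representation of the target advertisement slot describes the content involved in the audio program during a period of time before the target advertisement slot.
In the embodiments of the present application, the information of the audio program may be an identifier or index of the audio program. The audio program is one that the client is about to play, is currently playing, or has just finished playing. The audio program may be an audiobook, a song in audio form, crosstalk, current-affairs news, or the like.
In the embodiments of the present application, one or more advertisement slots may be mined from an audio program during the slot-mining stage, and each advertisement slot in an audio program has a unique identifier. Each advertisement slot also has a vector representation; the identifier of a slot is associated with the vector representation of that slot, and the identifiers and vector representations of the at least one advertisement slot of each audio program are stored in association with that audio program. All of this information may be stored in the audio content library of the cloud platform. The vector representation of an advertisement slot is the vector obtained by encoding the content involved during a period of time before the slot. The "period of time" in the present application may be a duration such as 1 minute or another value. In one implementation the specific value is preset; in another, the duration is chosen at random within a range.
As illustrated in FIG. 1B, if the information of the audio program is audio program 1 and the identifier of the target advertisement slot is advertisement slot 1, then vector representation 1 of advertisement slot 1 can be determined from audio program 1 and advertisement slot 1.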
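The lookup in step 302 can be pictured as a two-level mapping from program identifier to slot identifier to vector. The following is a minimal sketch only, assuming an in-memory dictionary stands in for the audio content library; `CONTENT_LIBRARY`, `lookup_slot_vector`, the identifiers and the vector values are hypothetical and do not come from the application.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for the audio content library:
# program identifier -> {slot identifier -> slot vector representation}.
CONTENT_LIBRARY: Dict[str, Dict[str, List[float]]] = {
    "program-1": {"slot-1": [0.12, 0.80, 0.05], "slot-2": [0.40, 0.10, 0.33]},
    "program-M": {"slot-1": [0.90, 0.02, 0.11]},
}


def lookup_slot_vector(program_id: str, slot_id: str) -> List[float]:
    """Return the vector stored for (program, slot); raise KeyError if unknown."""
    slots = CONTENT_LIBRARY[program_id]  # all slots mined from this program
    return slots[slot_id]                # vector associated with the target slot
```

Slot identifiers only need to be unique within one program, which is why the same `"slot-1"` key can appear under two different programs.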
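303. The cloud determines, according to the user characteristics and the vector representation of the target advertisement slot, the audio advertisement matching the target advertisement slot.
In the embodiments of the present application, the user characteristics can reflect the user's preferences regarding the type or style of audio programs, for example the types and content of programs the user likes to listen to and the narrators the user likes.
In the embodiments of the present application, because the vector representation of an advertisement slot reflects the content of the audio program, an audio advertisement related to that content can be chosen during matching. Such an advertisement blends well with the audio program and does not disrupt the continuity of listening. By further taking the user's preferences into account, the cloud can obtain the audio advertisement that best matches the target advertisement slot.
304. The cloud sends the audio advertisement to the client. Correspondingly, the client receives the audio advertisement from the cloud.
The audio advertisement is for the client to play when playback of the audio program reaches the target advertisement slot.
305. The client plays the audio advertisement in the target advertisement slot.
In the embodiments of the present application, the vector representation of the target advertisement slot can be used to determine one or more audio advertisements strongly related to the audio program the user is listening to just before the target slot, and the user characteristics can then be used to further filter or process these advertisements to obtain the audio advertisement matching the target slot. Because the audio advertisement determined in this way matches the audio program more closely and also takes user characteristics into account, it better meets the user's personalized needs and can improve the delivery effect of the audio advertisement.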
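On the client side, step 301 amounts to sending a request shortly before playback reaches the target slot. A minimal sketch of that trigger, assuming a 5-second lead time (the one example value given in the text) and a hypothetical `AdRequest` structure carrying the three fields the request is said to contain; the names and the one-request-per-slot rule are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

LEAD_TIME_S = 5.0  # assumed preset lead time before the slot, in seconds


@dataclass
class AdRequest:
    """Hypothetical request payload: program info, slot identifier, user characteristics."""
    program_id: str
    slot_id: str
    user_features: Dict[str, object] = field(default_factory=dict)


def maybe_build_request(position_s: float, slot_start_s: float,
                        program_id: str, slot_id: str,
                        user_features: Dict[str, object],
                        already_sent: bool = False) -> Optional[AdRequest]:
    """Return a request once playback is within LEAD_TIME_S of the slot, else None."""
    if already_sent:  # assumption: at most one request per slot
        return None
    remaining_s = slot_start_s - position_s
    if 0.0 <= remaining_s <= LEAD_TIME_S:
        return AdRequest(program_id, slot_id, dict(user_features))
    return None
```

A player would call `maybe_build_request` on each playback tick and send the returned object to the cloud when it is not `None`.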
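Step 303 above may include: the cloud recalls multiple audio advertisements from the audio advertisement library according to the vector representation of the target advertisement slot; and the cloud obtains, from the multiple audio advertisements and according to the user characteristics, the audio advertisement matching the target advertisement slot.
It may further include: the cloud predicts the completion rates of the multiple audio advertisements according to the user characteristics and an advertisement ranking model, where the audio advertisement with the highest completion rate is the audio advertisement matching the target advertisement slot, or is the source advertisement of the audio advertisement matching the target advertisement slot; the advertisement ranking model takes user characteristics as input and outputs a completion rate.
In other words, in the embodiments of the present application, determining the audio advertisement can involve advertisement recall, advertisement ranking, and style transfer, which are introduced in turn below.
1. Advertisement recall.
Advertisement recall means obtaining, from the audio advertisement library, multiple audio advertisements related to the content described by the vector representation of the target advertisement slot.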
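The application does not say how relatedness between a slot vector and an advertisement is measured. One common choice, used here purely as an assumption, is to give each advertisement its own vector and recall the top-k advertisements by cosine similarity. `recall_ads`, the advertisement identifiers and the vectors below are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recall_ads(slot_vector: Sequence[float],
               ad_vectors: Dict[str, Sequence[float]],
               k: int = 3) -> List[str]:
    """Return the identifiers of the k advertisements most similar to the slot vector."""
    ranked = sorted(ad_vectors,
                    key=lambda ad_id: cosine(slot_vector, ad_vectors[ad_id]),
                    reverse=True)
    return ranked[:k]
```

For a large advertisement library an approximate nearest-neighbour index would normally replace the full sort; the brute-force version is kept here for clarity.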
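2. Advertisement ranking.
The cloud can use the advertisement ranking model to score each recalled audio advertisement, and may also rank the advertisements by score and select the one with the highest score.
In the embodiments of the present application, the advertisement ranking model takes user characteristics as input and outputs a completion rate. The model may be a machine learning model and can be understood with reference to FIG. 4. As shown in FIG. 4, the inputs to the model may include the audio program, the audio advertisement, the audio advertisement text (the audio advertisement in text form) and the slot text (the advertisement slot in text form), and may also include the slot weight, audio advertisement features, audio program features, context features, and so on; in addition, the inputs of the advertisement ranking model provided by the embodiments of the present application include the user characteristics. The embedding layer processes the slot weight, audio advertisement features, audio program features, context features, and user characteristic information; the audio program and the audio advertisement are audio-encoded; and the audio advertisement text and the slot text are text-encoded. All of these results are fed into the concatenation (Concat) and flatten (Flatten) layer, after which a neural network outputs the completion rate of that audio advertisement for that user. The neural network may include a deep neural network (DNN), a deep interest network (DIN), or a deep factorization machine (DeepFM).
The recalled audio advertisements can be fed into the advertisement ranking model one at a time, all at once, or in batches, to obtain the completion rate of each advertisement. The completion rate can be understood as the score of the audio advertisement: the advertisement with the highest score is the one with the highest completion rate.
The completion rate is the predicted probability that an audio advertisement is played in full. The closer the content and style of an audio advertisement are to the user's preferences, the more likely it is to be played in full and the better its delivery effect. The completion rates of the recalled audio advertisements can therefore be predicted from the user characteristics, and the advertisement with the highest completion rate is used as the audio advertisement, which can improve the delivery effect.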
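Once a model can predict a completion rate per advertisement, ranking reduces to scoring and sorting. The sketch below shows that selection logic only; `toy_predict` is a deliberately trivial, hypothetical stand-in for the trained ranking model of FIG. 4 (it scores by tag overlap), and `AD_TAGS` is made-up data.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical advertisement metadata used only by the toy predictor below.
AD_TAGS: Dict[str, Set[str]] = {
    "ad-a": {"travel", "beijing"},
    "ad-b": {"food"},
    "ad-c": {"travel"},
}


def toy_predict(user_features: Dict[str, object], ad_id: str) -> float:
    """Stand-in for the ranking model: share of the ad's tags the user is interested in."""
    interests = set(user_features.get("interests", ()))
    tags = AD_TAGS[ad_id]
    return len(tags & interests) / len(tags)


def rank_by_completion_rate(ad_ids: Iterable[str],
                            user_features: Dict[str, object],
                            predict: Callable[[Dict[str, object], str], float]
                            ) -> List[Tuple[str, float]]:
    """Score every recalled ad and return (ad_id, score) pairs, best first."""
    scored = [(ad_id, predict(user_features, ad_id)) for ad_id in ad_ids]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

The first element of the returned list is the advertisement with the highest predicted completion rate, which the text uses either directly or as the source advertisement for style transfer.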
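3. Style transfer.
In the embodiments of the present application, after determining the audio advertisement with the highest completion rate through the advertisement ranking model, the cloud may adjust its style according to the user characteristics and the style of the audio program that the user is about to play or is playing, to obtain the audio advertisement matching the target advertisement slot. This can increase the user's acceptance of the audio advertisement and thereby improve its delivery effect.
It should be noted that, after the audio advertisement with the highest completion rate is determined, it may be used directly as the audio advertisement matching the target advertisement slot, or it may be used as the source advertisement for the style transfer described above to obtain the audio advertisement matching the target advertisement slot; the present application does not limit this.
The style-transfer process provided by the embodiments of the present application can be understood with reference to FIG. 5.
As shown in FIG. 5, the process may include: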
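501. The cloud separates the object voice and the background music in the audio advertisement.
The object voice is usually the main voice in the audio advertisement, such as the narrator presenting the advertisement.
502. The cloud separates the object voice and the background music in the audio program.
503. The cloud encodes the object voice in the audio program to obtain the style vector of the object voice.
504. The cloud encodes the background music in the audio program to obtain the style vector of the background music.
505. The cloud encodes the user characteristics to obtain the style vector of the user's preferences.
506. The cloud adjusts the object voice in the highest-scoring audio advertisement according to the style vector of the object voice and the style vector of the user's preferences.
Step 506 transfers the style of the object voice in the audio advertisement.
507. The cloud adjusts the background music in the highest-scoring audio advertisement according to the style vector of the background music and the style vector of the user's preferences.
Step 507 transfers the style of the background music in the audio advertisement.
Style transfer may replace the style of the object voice or background music, or partially adjust it.
508. The object voice and background music obtained after the style transfer in steps 506 and 507 are fused to obtain the audio advertisement.
It should be noted that if neither the audio advertisement nor the audio program contains background music, the steps related to background-music processing, such as one or more of steps 501, 502, 504, 507 and 508, need not be performed.
In the embodiments of the present application, after adjustment the style of the audio advertisement is consistent with the style of the audio program, which can satisfy the user's style preferences, improve the user experience, and thereby improve the delivery effect of the audio advertisement.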
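Steps 506 and 507 each combine two style vectors (one from the audio program, one from the user's preferences) into a target style for the advertisement. The application does not specify how the two are combined; a simple convex blend is shown here strictly as an illustrative assumption, and `blend_style` and `user_weight` are hypothetical names. The actual voice or music conversion that would consume the resulting vector is outside the scope of this sketch.

```python
from typing import List, Sequence


def blend_style(program_style: Sequence[float],
                user_style: Sequence[float],
                user_weight: float = 0.5) -> List[float]:
    """Blend a program style vector with a user-preference style vector.

    user_weight = 0.0 keeps the program's style, 1.0 keeps the user's preferred
    style, and values in between adjust the style only partially.
    """
    if len(program_style) != len(user_style):
        raise ValueError("style vectors must have the same length")
    if not 0.0 <= user_weight <= 1.0:
        raise ValueError("user_weight must be between 0 and 1")
    return [(1.0 - user_weight) * p + user_weight * u
            for p, u in zip(program_style, user_style)]
```

The same function would be applied once to the object-voice style vector and once to the background-music style vector before the two adjusted tracks are fused in step 508.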
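The above describes the online delivery of audio advertisements. The offline or online slot-mining process of FIG. 1B is further described below.
As shown in FIG. 6, an embodiment of the method for mining advertisement slots provided by the embodiments of the present application includes:
601. The cloud obtains an audio program whose advertisement slots are to be mined.
The cloud may obtain the audio program whose advertisement slots are to be mined from the audio content library.
602. The cloud examines the audio program in the voice state.
603. When the time-domain information is amplitude, if the duration for which the amplitude of the audio program in the voice state stays continuously below the amplitude threshold exceeds the first threshold, the cloud determines that duration as a first basic advertisement slot.
604. The cloud converts the audio program into text content, records the timestamps of the words in the text content, and determines the time interval between two adjacent words from their timestamps.
The first threshold and the second threshold may be the same or different.
605. If the time interval between two adjacent words in the converted text content is greater than the second threshold, the cloud determines that interval as a second basic advertisement slot.
606. The cloud restores the punctuation marks in the text content.
607. The cloud increases the weight of the basic advertisement slots corresponding to sentence-ending punctuation marks.
In the embodiments of the present application, a basic advertisement slot is an advertisement slot in the union of the first basic advertisement slots and the second basic advertisement slots.
Of course, first and second basic advertisement slots may overlap.
608. The cloud divides the punctuation-restored text content into text segments and increases the weight of the basic advertisement slot between two text segments.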
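The process of steps 602 to 608 can be understood with reference to the example in FIG. 7.
As shown in FIG. 7, the cloud can first examine the audio program (FIG. 7 may show an excerpt from the audio program) and detect a stretch of audio whose amplitude is very small (below the amplitude threshold) and whose duration exceeds the first threshold. It can then be determined that during this period the speaker in the audio program is not producing sound, that is, is pausing, and this period can be determined as a first basic advertisement slot.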
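The amplitude test for a first basic advertisement slot can be sketched as a scan for runs of quiet frames. This is a minimal illustration, assuming the audio has already been reduced to one non-negative amplitude value per fixed-length frame; `find_quiet_slots`, the frame length and all threshold values are hypothetical.

```python
from typing import List, Sequence, Tuple


def find_quiet_slots(amplitudes: Sequence[float], frame_s: float,
                     amp_threshold: float,
                     min_duration_s: float) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) spans whose amplitude stays below amp_threshold
    for longer than min_duration_s (the "first threshold" in the text)."""
    slots: List[Tuple[float, float]] = []
    run_start = None
    # The trailing infinity is a sentinel that closes a quiet run at the end of the audio.
    for i, amp in enumerate(list(amplitudes) + [float("inf")]):
        if amp < amp_threshold:
            if run_start is None:
                run_start = i          # a quiet run begins at this frame
        elif run_start is not None:
            duration_s = (i - run_start) * frame_s
            if duration_s > min_duration_s:
                slots.append((run_start * frame_s, i * frame_s))
            run_start = None           # the quiet run has ended
    return slots
```

With half-second frames, an amplitude threshold of 0.1 and a first threshold of 1 second, a two-second quiet stretch is reported as a slot while an isolated quiet frame is ignored.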
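Then the cloud can convert the audio shown in FIG. 7 into text by speech recognition. As FIG. 7 shows, the converted text content is "今天天气真好咱们去哪玩颐和园在海淀区", where "今天" (today) has timestamp 1, "天气" (weather) has timestamp 2, "真好" (really nice) has timestamp 3, "咱们" (we) has timestamp 6, "去哪" (go where) has timestamp 7, "玩" (to have fun) has timestamp 8, "颐和园" (the Summer Palace) has timestamp 12, "在" (is in) has timestamp 13, and "海淀区" (Haidian District) has timestamp 14. From the timestamps it can be seen that timestamps 4 and 5 are missing between "真好" and "咱们", meaning there is a pause between these two words; if the second threshold equals 1 (or another value between 0 and 2), this position can be determined as a second basic advertisement slot. Similarly, timestamps 9, 10 and 11 are missing between "玩" and "颐和园", so this position can also be determined as a second basic advertisement slot. The position after "海淀区" can likewise be determined as a second basic advertisement slot. As FIG. 7 shows, the first basic advertisement slot largely overlaps the second basic advertisement slot between "玩" and "颐和园". For ease of description, the first and second basic advertisement slots shown in FIG. 7 are collectively referred to below as basic advertisement slots.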
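The word-gap test can be reproduced directly on the FIG. 7 timestamps. A minimal sketch, assuming each word carries a single timestamp as in the example and that the second threshold is 1; the function name `find_word_gaps` is hypothetical. The slot after the final word is not a gap between two adjacent words, so this function does not report it.

```python
from typing import List, Sequence, Tuple

# (word, timestamp) pairs taken from the FIG. 7 example in the text.
FIG7_WORDS: List[Tuple[str, int]] = [
    ("今天", 1), ("天气", 2), ("真好", 3), ("咱们", 6), ("去哪", 7),
    ("玩", 8), ("颐和园", 12), ("在", 13), ("海淀区", 14),
]


def find_word_gaps(words: Sequence[Tuple[str, int]],
                   second_threshold: int) -> List[Tuple[str, str, int, int]]:
    """Return (previous_word, next_word, t_previous, t_next) for every pair of
    adjacent words whose timestamp difference exceeds second_threshold."""
    gaps = []
    for (w1, t1), (w2, t2) in zip(words, words[1:]):
        if t2 - t1 > second_threshold:
            gaps.append((w1, w2, t1, t2))
    return gaps
```

Calling `find_word_gaps(FIG7_WORDS, 1)` yields the two pauses named in the text: between "真好" and "咱们", and between "玩" and "颐和园".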
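Next, the punctuation in the text can be restored. As shown in FIG. 7, the text after punctuation restoration is "今天天气真好,咱们去哪玩?颐和园在海淀区。" ("The weather is really nice today, where shall we go for fun? The Summer Palace is in Haidian District."). Because the pauses at a question mark "?" and a full stop "。" are usually longer than at a comma ",", the two basic advertisement slots corresponding to "?" and "。" can be boosted, that is, their weights are increased.
The text content shown in FIG. 7 can further be divided into segments: "今天天气真好,咱们去哪玩?" can be taken as text segment 1 and "颐和园在海淀区。" as text segment 2. The pause at the boundary between two text segments is longer still, so the weight of the basic advertisement slot at the segment boundary can be boosted further, that is, the weight of the basic advertisement slot between "玩" and "颐和园" is increased again.
609. The cloud selects the at least one basic advertisement slot with the largest weight as the at least one advertisement slot of the audio program.
After the above operations, the basic advertisement slot between "玩" and "颐和园" has the largest weight. If one advertisement slot is to be selected in the audio shown in FIG. 7, the basic advertisement slot between "玩" and "颐和园" can be selected as the advertisement slot of this audio.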
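The weighting in steps 607 to 609 can be illustrated on the three basic advertisement slots of the FIG. 7 example. The text gives no numeric weights, so the base weight and the two boost values below are assumptions chosen only to show the ordering; `slot_weight`, `select_slots` and the candidate records are hypothetical.

```python
from typing import Dict, List

# Sentence-ending punctuation (full-width and half-width forms).
END_PUNCT = {"?", "。", "!", "?", "!", "."}

# The three basic advertisement slots of the FIG. 7 example: the word each slot
# follows, the restored punctuation there, and whether it is a segment boundary.
FIG7_CANDIDATES: List[Dict[str, object]] = [
    {"after": "真好", "punct": ",", "segment_boundary": False},
    {"after": "玩", "punct": "?", "segment_boundary": True},
    {"after": "海淀区", "punct": "。", "segment_boundary": False},
]


def slot_weight(punct: str, segment_boundary: bool, base: float = 1.0,
                punct_boost: float = 1.0, segment_boost: float = 1.0) -> float:
    """Base weight, boosted for sentence-ending punctuation and for segment boundaries."""
    weight = base
    if punct in END_PUNCT:
        weight += punct_boost      # step 607
    if segment_boundary:
        weight += segment_boost    # step 608
    return weight


def select_slots(candidates: List[Dict[str, object]], n: int = 1) -> List[Dict[str, object]]:
    """Step 609: return the n candidates with the largest weight."""
    return sorted(candidates,
                  key=lambda c: slot_weight(str(c["punct"]), bool(c["segment_boundary"])),
                  reverse=True)[:n]
```

With these assumed values the weights are 1.0, 3.0 and 2.0, so the slot after "玩" is selected, matching the outcome described in the text.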
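610. The cloud encodes text content of a preset length before each of the at least one advertisement slot to obtain the vector representation of each advertisement slot.
The cloud can encode the text "今天天气真好,咱们去哪玩?" to obtain the vector representation of the advertisement slot between "玩" and "颐和园".
611. The cloud can send the identifier and vector representation of each advertisement slot to the audio content library, to be stored in association with the audio content.
In this way, when the user plays this audio and playback reaches the advertisement slot between "玩" and "颐和园", a travel-related audio advertisement can be selected for delivery. The selection can also take into account the city the user is in and the types of scenic spots the user has visited, so as to choose an audio advertisement that fits "今天天气真好,咱们去哪玩?" and deliver it in the slot between "玩" and "颐和园". The audio program and the audio advertisement then flow continuously, the listening experience is not disrupted, and the delivery effect of the audio advertisement can be improved.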
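To better understand how the slot-mining process relates to the advertisement-delivery process, the two processes are described together below with reference to FIG. 8.
801. Advertisement-slot mining based on audio content.
This process may include advertisement-slot identification and advertisement-slot vector representation.
Step 801 can be understood with reference to steps 601 to 611 above and is not repeated here.
802. During audio advertisement delivery, while playing the audio program the client determines whether playback has reached an advertisement slot; if so, step 803 is performed; if not, the audio program continues to play.
803. When playback reaches an advertisement slot, it is determined whether to send an advertisement request; if so, step 804 is performed; if not, the audio program continues to play.
If enough advertisements have already been played, that is, the time spent playing audio advertisements has exceeded a preset value, no advertisements are played in subsequent advertisement slots and no advertisement request needs to be sent.
804. Multiple audio advertisements are recalled from the audio advertisement library.
805. Personalized ranking is performed on the multiple audio advertisements.
Steps 804 and 805 can be understood with reference to the descriptions of advertisement recall and advertisement scoring above.
806. It is determined whether to deliver an audio advertisement.
Whether to deliver is decided from the scores: if all audio advertisements score below a score threshold, no audio advertisement is delivered; if the highest-scoring audio advertisement scores above the threshold, that advertisement is delivered and step 807 is performed.
807. Style transfer of the audio advertisement.
Step 807 can be understood with reference to the description of style transfer above.
808. The style-transferred audio advertisement is played, after which the audio program continues to play.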
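The two gating decisions in this flow (step 803: has the advertisement time budget been used up; step 806: does any advertisement score high enough) can be condensed into one function. A minimal sketch with hypothetical names; the text does not say what happens when a score exactly equals the threshold, so delivering in that case is an assumption made here.

```python
from typing import Dict, Optional


def choose_ad(scores: Dict[str, float], score_threshold: float,
              ad_time_played_s: float, ad_time_budget_s: float) -> Optional[str]:
    """Return the identifier of the ad to deliver, or None to keep playing the program."""
    # Step 803: once the ad time already played exceeds the preset value,
    # later slots are skipped and no request is needed.
    if ad_time_played_s > ad_time_budget_s:
        return None
    if not scores:
        return None
    best = max(scores, key=scores.get)
    # Step 806: if even the best ad scores below the threshold, deliver nothing.
    if scores[best] < score_threshold:
        return None
    return best
```

A non-`None` result would then be passed on to style transfer (step 807) and playback (step 808).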
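The slot-mining process described above uses the text content to generate vector representations, so that an audio advertisement better matched to the audio content can be determined at delivery time. User characteristics are also used when delivering the audio advertisement, so the advertisement better meets the user's personalized needs and its delivery effect can be improved.
The above describes the method for mining advertisement slots and the method for delivering audio advertisements. The cloud device and the client in the embodiments of the present application are described below with reference to the drawings.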
如图9所示,本申请实施例提供的云端装置90的一结构包括:
接收单元901,用于接收来自客户端的广告请求,广告请求中包括音频节目的信息、目标广告槽位的标识,以及用户特征,目标广告槽位为从音频节目中挖掘出的至少一个广告槽位中的一个,广告请求是客户端播放音频节目时触发的。该接收单元901可以执行上述方法实施例中的步骤301。
第一处理单元902,用于根据音频节目的信息和目标广告槽位的标识确定目标广告槽位的向量表示,目标广告槽位的向量表示用于描述音频节目中在目标广告槽位前的一段时间内所涉及的内容。该第一处理单元902可以执行上述方法实施例中的步骤302。
第二处理单元903,用于根据用户特征和目标广告槽位的向量表示得到与目标广告槽位匹配的音频广告。该第二处理单元903可以执行上述方法实施例中的步骤303。
发送单元904,用于向客户端发送音频广告,音频广告用于客户端在播放音频节目到目标广告槽位时播放。该发送单元904可以执行上述方法实施例中的步骤304。
本申请实施例中,根据目标广告槽位的向量表示可以确定与在该目标广告槽位前用户正在收听的音频节目强相关的一个或多个音频广告,进一步可以根据用户特征进一步筛选或处理音频广告,得到与该目标广告槽位匹配的音频广告。因为,本申请确定的音频广告与音频节目的匹配度更高,而且结合了用户特征,更能满足用户的个性化需求,可以提高音频广告的投放效果。
可选地,第二处理单元903,具体用于根据目标广告槽位的向量表示从音频广告库中召回多个音频广告;根据用户特征从多个音频广告中得到与目标广告槽位匹配的音频广告。
可选地,第二处理单元903,具体用于根据用户特征和广告排序模型预测多个音频广告的完播率,其中,完播率最大的音频广告为与目标广告槽位匹配的音频广告,或者,完播率最大的音频广告为与目标广告槽位匹配的音频广告的源广告,广告排序模型是以用户特征为输入,以完播率为输出的模型。
可选地,第二处理单元903,具体用于当完播率最大的音频广告为与目标广告槽位匹配的音频广告的源广告时,根据音频节目的风格和用户特征,调整完播率最大的音频广告的风格得到与目标广告槽位匹配的音频广告。
可选地,第二处理单元903,具体用于根据音频节目中对象声音的风格向量和用户偏好的风格向量,调整完播率最大的音频广告中的对象声音,音频节目中对象声音的风格向 量是通过编码音频节目中对象声音得到的,用户偏好的风格向量是通过编码用户特征得到的;根据音频节目中背景音乐的风格向量和用户偏好的风格向量,调整完播率最大的音频广告中的背景音乐,音频节目中背景音乐的风格向量是通过编码音频节目中背景音乐得到的;融合调整后的完播率最大的音频广告中的对象声音,以及调整后的完播率最大的音频广告中背景音乐,得到与目标广告槽位匹配的音频广告。
可选地,第一处理单元902,还用于基于音频节目在语音状态下的时域信息,以及音频节目转换为文本后的文本内容,确定至少一个广告槽位;对至少一个广告槽位中每个广告槽位前一段时间内的文本内容进行编码,以得到每个广告槽位的向量表示。
可选地,第一处理单元902,具体用于时域信息为振幅时,若音频节目在语音状态下的振幅连续低于振幅阈值的时长超过第一阈值,则将振幅连续低于振幅阈值的时长确定为第一基础广告槽位;若音频节目转换后的文本内容中相邻两个词的时间间隔大于第二阈值,则将相邻两个词的时间间隔确定为第二基础广告槽位,相邻两个词的时间间隔时通过文本转换时每个词的时间戳确定的;从第一基础广告槽位和第二基础广告槽位的并集中确定至少一个广告槽位。
可选地,第一处理单元902,具体用于从第一基础广告槽位和第二基础广告槽位的并集中选择权重最大的至少一个广告槽位确定为音频节目的至少一个广告槽位,至少一个广告槽位中每个广告槽位的权重是通过每个广告槽位对应的标点符号和/或文本段的分割位置确定的。
可选地,用户特征包括用户画像,以及用户对历史音频节目的行为特征。
本申请实施例中,云端装置90中各单元所执行的操作与前述图3至图8所示实施例中描述的类似,此处不再赘述。
如图10所示,本申请实施例提供的客户端100的一结构包括:
发送单元1001,用于播放音频节目时,向云端发送广告请求,广告请求中包括音频节目的信息、目标广告槽位的标识,以及用户特征,目标广告槽位为从音频节目中挖掘出的至少一个广告槽位中的一个。
接收单元1002,用于接收云端发送的与目标广告槽位匹配的音频广告。
处理单元1003,用于在播放音频节目到目标广告槽位时播放音频广告。
可选地,用户特征包括用户画像,以及用户对历史音频节目的行为特征。
本申请实施例中,客户端100中各单元所执行的操作与前述图3至图8所示实施例中描述的类似,此处不再赘述。
如图11所示,本申请实施例还提供了云端装置110的另一结构包括:
获取单元1101,用于获取待挖掘广告槽位的音频节目。
第一处理单元1102,用于基于所述音频节目在语音状态下的时域信息,以及所述音频节目转换为文本后的文本内容,确定至少一个广告槽位。
第二处理单元1103,用于对所述至少一个广告槽位中每个广告槽位前一段时间内的文本内容进行编码,以得到所述每个广告槽位的向量表示。
可选地,第一处理单元1102,具体用于时域信息为振幅时,若音频节目在语音状态下 的振幅连续低于振幅阈值的时长超过第一阈值,则将振幅连续低于振幅阈值的时长确定为第一基础广告槽位;若音频节目转换后的文本内容中相邻两个词的时间间隔大于第二阈值,则将相邻两个词的时间间隔确定为第二基础广告槽位,相邻两个词的时间间隔时通过文本转换时每个词的时间戳确定的;从第一基础广告槽位和第二基础广告槽位的并集中确定至少一个广告槽位。
可选地,第一处理单元1102,具体用于从所述第一基础广告槽位和所述第二基础广告槽位的并集中选择权重最大的至少一个广告槽位确定为所述音频节目的至少一个广告槽位,所述至少一个广告槽位中每个广告槽位的权重是通过所述每个广告槽位对应的标点符号和/或文本段的分割位置确定的。
本申请实施例中,云端装置110中各单元所执行的操作与前述图3至图8所示实施例中描述的类似,此处不再赘述。
在本申请的另一实施例中,还提供一种计算机可读存储介质,计算机可读存储介质中存储有计算机执行指令,当云端装置的处理器执行该计算机执行指令时,云端装置执行上述图3至图8中云端装置所执行的步骤。
在本申请的另一实施例中,还提供一种计算机可读存储介质,计算机可读存储介质中存储有计算机执行指令,当客户端的处理器执行该计算机执行指令时,客户端执行上述图3至图8中客户端所执行的步骤。
在本申请的另一实施例中,还提供一种计算机程序产品,该计算机程序产品包括计算机程序代码,当计算机程序代码在计算机上执行时,计算机设备执行上述图3至图8中云端装置或客户端所执行的步骤。
在本申请的另一实施例中,还提供一种芯片系统,该芯片系统包括一个或多个接口电路和一个或多个处理器;接口电路和处理器通过线路互联;接口电路用于从终端的存储器接收信号,并向处理器发送信号,信号包括存储器中存储的计算机指令;当处理器执行计算机指令时,终端执行前述上述图3至图8中云端装置或客户端所执行的步骤。在一种可能的设计中,芯片系统还可以包括存储器,存储器,用于保存控制设备必要的程序指令和数据。该芯片系统,可以由芯片构成,也可以包含芯片和其他分立器件。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元可 以全部或部分地通过软件、硬件、固件或者其任意组合来实现。
当使用软件实现所述集成的单元时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。

Claims (23)

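  1. A method for delivering audio advertisements, characterized by comprising:
    a cloud receiving an advertisement request from a client, the advertisement request including information of an audio program, an identifier of a target advertisement slot, and user characteristics, the target advertisement slot being one of at least one advertisement slot mined from the audio program, and the advertisement request being triggered when the client plays the audio program;
    the cloud determining a vector representation of the target advertisement slot according to the information of the audio program and the identifier of the target advertisement slot, the vector representation of the target advertisement slot describing the content involved in the audio program during a period of time before the target advertisement slot;
    the cloud obtaining, according to the user characteristics and the vector representation of the target advertisement slot, an audio advertisement matching the target advertisement slot;
    the cloud sending the audio advertisement to the client, the audio advertisement being for the client to play when playback of the audio program reaches the target advertisement slot.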
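  2. The method according to claim 1, characterized in that the cloud determining, according to the user characteristics and the vector representation of the target advertisement slot, the audio advertisement matching the target advertisement slot comprises:
    the cloud recalling multiple audio advertisements from an audio advertisement library according to the vector representation of the target advertisement slot;
    the cloud obtaining, from the multiple audio advertisements and according to the user characteristics, the audio advertisement matching the target advertisement slot.
  3. The method according to claim 2, characterized in that the cloud obtaining, from the multiple audio advertisements and according to the user characteristics, the audio advertisement matching the target advertisement slot comprises:
    the cloud predicting completion rates of the multiple audio advertisements according to the user characteristics and an advertisement ranking model, wherein the audio advertisement with the highest completion rate is the audio advertisement matching the target advertisement slot, or the audio advertisement with the highest completion rate is the source advertisement of the audio advertisement matching the target advertisement slot, and the advertisement ranking model is a model that takes user characteristics as input and outputs a completion rate.
  4. The method according to claim 3, characterized in that, when the audio advertisement with the highest completion rate is the source advertisement of the audio advertisement matching the target advertisement slot, the method further comprises:
    the cloud adjusting the style of the audio advertisement with the highest completion rate according to the style of the audio program and the user characteristics, to obtain the audio advertisement matching the target advertisement slot.
  5. The method according to claim 4, characterized in that the cloud adjusting the style of the audio advertisement with the highest completion rate according to the style of the audio program and the user characteristics, to obtain the audio advertisement matching the target advertisement slot, comprises:
    the cloud adjusting the object voice in the audio advertisement with the highest completion rate according to a style vector of the object voice in the audio program and a style vector of the user's preferences, the style vector of the object voice in the audio program being obtained by encoding the object voice in the audio program, and the style vector of the user's preferences being obtained by encoding the user characteristics;
    the cloud adjusting the background music in the audio advertisement with the highest completion rate according to a style vector of the background music in the audio program and the style vector of the user's preferences, the style vector of the background music in the audio program being obtained by encoding the background music in the audio program;
    the cloud fusing the adjusted object voice and the adjusted background music of the audio advertisement with the highest completion rate to obtain the audio advertisement matching the target advertisement slot.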
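  6. The method according to any one of claims 1 to 5, characterized in that, before the cloud receives the advertisement request sent by the client, the method further comprises:
    the cloud determining at least one advertisement slot based on time-domain information of the audio program in a voice state and on text content obtained by converting the audio program into text;
    the cloud encoding the text content within a period of time before each of the at least one advertisement slot to obtain the vector representation of each advertisement slot.
  7. The method according to claim 6, characterized in that the method further comprises:
    the cloud storing the audio program, the identifier of each advertisement slot, and the vector representation of each advertisement slot in association with one another.
  8. The method according to claim 6 or 7, characterized in that the cloud determining at least one advertisement slot based on the time-domain information of the audio program in the voice state and the text content obtained by converting the audio program into text comprises:
    when the time-domain information is amplitude, if the duration for which the amplitude of the audio program in the voice state stays continuously below an amplitude threshold exceeds a first threshold, the cloud determining that duration as a first basic advertisement slot;
    if the time interval between two adjacent words in the converted text content of the audio program is greater than a second threshold, the cloud determining that interval as a second basic advertisement slot, the time interval between the two adjacent words being determined from the timestamp of each word recorded during text conversion;
    the cloud determining the at least one advertisement slot from the union of the first basic advertisement slots and the second basic advertisement slots.
  9. The method according to claim 8, characterized in that the cloud determining the at least one advertisement slot from the union of the first basic advertisement slots and the second basic advertisement slots comprises:
    the cloud selecting, from the union of the first basic advertisement slots and the second basic advertisement slots, at least one advertisement slot with the largest weight as the at least one advertisement slot of the audio program, the weight of each of the at least one advertisement slot being determined by the punctuation mark and/or the text-segment division position corresponding to that advertisement slot.
  10. The method according to any one of claims 1 to 9, characterized in that the user characteristics include a user portrait and the user's behavior characteristics regarding historical audio programs.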
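  11. A method for delivering audio advertisements, characterized by comprising:
    a client, when playing an audio program, sending an advertisement request to a cloud, the advertisement request including information of the audio program, an identifier of a target advertisement slot, and user characteristics, the target advertisement slot being one of at least one advertisement slot mined from the audio program;
    the client receiving an audio advertisement, sent by the cloud, that matches the target advertisement slot;
    the client playing the audio advertisement when playback of the audio program reaches the target advertisement slot.
  12. The method according to claim 11, characterized in that the user characteristics include a user portrait and the user's behavior characteristics regarding historical audio programs.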
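  13. A method for mining advertisement slots, characterized by comprising:
    a cloud obtaining an audio program whose advertisement slots are to be mined;
    the cloud determining at least one advertisement slot based on time-domain information of the audio program in a voice state and on text content obtained by converting the audio program into text;
    the cloud encoding the text content within a period of time before each of the at least one advertisement slot to obtain the vector representation of each advertisement slot.
  14. The method according to claim 13, characterized in that the method further comprises:
    the cloud storing the audio program, the identifier of each advertisement slot, and the vector representation of each advertisement slot in association with one another.
  15. The method according to claim 13 or 14, characterized in that the cloud determining at least one advertisement slot based on the time-domain information of the audio program in the voice state and the text content obtained by converting the audio program into text comprises:
    when the time-domain information is amplitude, if the duration for which the amplitude of the audio program in the voice state stays continuously below an amplitude threshold exceeds a first threshold, the cloud determining that duration as a first basic advertisement slot;
    if the time interval between two adjacent words in the converted text content of the audio program is greater than a second threshold, the cloud determining that interval as a second basic advertisement slot, the time interval between the two adjacent words being determined from the timestamp of each word recorded during text conversion;
    the cloud determining the at least one advertisement slot from the union of the first basic advertisement slots and the second basic advertisement slots.
  16. The method according to claim 15, characterized in that the cloud determining the at least one advertisement slot from the union of the first basic advertisement slots and the second basic advertisement slots comprises:
    the cloud selecting, from the union of the first basic advertisement slots and the second basic advertisement slots, at least one advertisement slot with the largest weight as the at least one advertisement slot of the audio program, the weight of each of the at least one advertisement slot being determined by the punctuation mark and/or the text-segment division position corresponding to that advertisement slot.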
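  17. A cloud device, characterized by comprising: a communication interface, a processor, and a memory, the communication interface and the processor being coupled to the memory, the memory being used to store a program or instructions which, when executed by the processor, cause the cloud device to perform the method according to any one of claims 1 to 10, or to perform the method according to any one of claims 13 to 16.
  18. A client, characterized by comprising: a transceiver, a processor, and a memory, the transceiver and the processor being coupled to the memory, the memory being used to store a program or instructions which, when executed by the processor, cause the client to perform the method according to claim 11 or 12.
  19. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions which, when run on a computer device, cause the computer device to perform the method according to any one of claims 1 to 16.
  20. A computer program product, characterized in that the computer program product includes computer program code which, when run on a computer device, causes the computer device to perform the method according to any one of claims 1 to 16.
  21. A chip system, characterized in that the chip system includes one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected by lines; the interface circuits are used to receive signals from a memory of a cloud device and send the signals to the processors, the signals including computer instructions stored in the memory; when the processors execute the computer instructions, the cloud device performs the method according to any one of claims 1 to 10, or performs the method according to any one of claims 13 to 16.
  22. A chip system, characterized in that the chip system includes one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected by lines; the interface circuits are used to receive signals from a memory of a client and send the signals to the processors, the signals including computer instructions stored in the memory; when the processors execute the computer instructions, the client performs the method according to claim 11 or 12.
  23. An audio advertisement system, characterized by comprising: a client and a cloud, the cloud being used to perform the method according to any one of claims 1 to 10, and the client being used to perform the method according to claim 11 or 12.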
PCT/CN2022/123309 2022-09-30 2022-09-30 Method, device, and system for audio advertisement delivery WO2024065690A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/123309 WO2024065690A1 (zh) 2022-09-30 2022-09-30 Method, device, and system for audio advertisement delivery

Publications (1)

Publication Number Publication Date
WO2024065690A1 (zh) 2024-04-04

Family

ID=90475646

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/123309 WO2024065690A1 (zh) 2022-09-30 2022-09-30 Method, device, and system for audio advertisement delivery

Country Status (1)

Country Link
WO (1) WO2024065690A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080262912A1 (en) * 2007-04-20 2008-10-23 Ullas Gargi Media Advertising
US8995427B1 (en) * 2009-09-18 2015-03-31 Alpine Audio Now, LLC System and method for advertisement augmentation via a called voice connection
CN105869013A (zh) * 2016-03-25 2016-08-17 上海证大喜马拉雅网络科技有限公司 Audio advertisement delivery apparatus and method
US20160337691A1 (en) * 2015-05-12 2016-11-17 Adsparx USA Inc System and method for detecting streaming of advertisements that occur while streaming a media program
CN107977849A (zh) * 2016-10-25 2018-05-01 深圳市百米生活股份有限公司 Method and system for real-time intelligent insertion of information based on an audio stream
CN114663126A (zh) * 2015-09-16 2022-06-24 Google LLC Systems and methods for automatically managing the placement of content slots in information resources

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22960309

Country of ref document: EP

Kind code of ref document: A1