WO2016179921A1

WO2016179921A1 - Method, apparatus and device for processing audio popularization information, and non-volatile computer storage medium

Info

Publication number: WO2016179921A1
Application number: PCT/CN2015/087978
Authority: WO
Inventors: 田彪
Original assignee: 北京音之邦文化科技有限公司
Priority date: 2015-05-12
Filing date: 2015-08-25
Publication date: 2016-11-17
Also published as: CN104882146A; CN104882146B

Abstract

A method, apparatus and device for processing audio popularization information, and a non-volatile computer storage medium. The method comprises: obtaining an audio characteristic of the audio popularization information according to acquired original audio data of the audio popularization information (102); obtaining a text characteristic of the audio popularization information according to at least one of the original audio data and the audio characteristic (103); and obtaining a showing situation of the audio popularization information according to at least one of the audio characteristic and the text characteristic (104). Showing of the audio popularization information is not performed completely depending on the text content attribute of the audio popularization information, but by considering the audio characteristics of the audio popularization information, which can more accurately describe the attribute of the audio popularization information, so that accurate showing of the audio popularization information is ensured, and a conversion rate of the audio popularization information is increased.

Description

Method, apparatus, device and non-volatile computer storage medium for processing audio promotion information

The present application claims priority from Chinese Patent Application No. 201510237646.6, entitled "Method and Apparatus for Processing Audio Promotion Information".

Technical field

The present invention relates to audio processing technology, and in particular, to a method, an apparatus, a device and a non-volatile computer storage medium for processing audio promotion information.

Background technique

In recent years, with the development of Internet technology, audio promotion information has gradually emerged, such as audio advertising, audio games or audio applications. In the process of presenting the audio promotion information to the user, the presentation of the audio promotion information may be determined based on the text content attribute such as the title and content of the audio promotion information, for example, whether the audio promotion information is displayed, the presentation position, the presentation time, and the like.

However, because the audio promotion information is presented by relying entirely on its text content attributes, the conversion rate of the audio promotion information is reduced.

Summary of the invention

Aspects of the present invention provide a method, an apparatus, and a device for processing audio promotion information, and a non-volatile computer storage medium for improving the conversion rate of audio promotion information.

An aspect of the present invention provides a method for processing audio promotion information, including:

Obtaining the original audio data of the audio promotion information;

Obtaining an audio feature of the audio promotion information according to the original audio data;

Obtaining a text feature of the audio promotion information according to at least one of the original audio data and the audio feature;

Obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature.

The aspect as described above and any possible implementation manner further provide an implementation manner, where the acquiring the original audio data of the audio promotion information includes:

Acquiring the original audio data in real time; or

Obtaining the audio promotion information, and performing decoding processing on the audio promotion information to obtain the original audio data.

The aspect as described above and any possible implementation manner further provide an implementation manner, where obtaining the text feature of the audio promotion information according to at least one of the original audio data and the audio feature includes:

Obtaining, according to the audio feature, a text feature of the audio promotion information by using a correspondence between a pre-established audio feature and a text feature; and/or

According to the original audio data, a text feature of the audio promotion information is obtained by using a voice recognition technology.

The aspect as described above and any possible implementation manner further provide an implementation manner, where obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature includes:

Calculating a matching degree of the promotion attribute feature with at least one of the audio feature and the text feature as a presentation score of the audio promotion information;

Obtaining the presentation of the audio promotion information according to the presentation score.

In an aspect as described above and any possible implementation, an implementation is further provided, the promotional attribute feature comprising at least one of the following features:

Attribute characteristics of a page displaying audio promotion information;

The attribute characteristics of the website to which the page displaying the audio promotion information belongs;

The attribute characteristics of the user to whom the audio promotion information is pushed.

Another aspect of the present invention provides an apparatus for processing audio promotion information, including:

An obtaining unit, configured to obtain original audio data of the audio promotion information;

An audio unit, configured to obtain an audio feature of the audio promotion information according to the original audio data;

a mapping unit, configured to obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature;

And a presentation unit, configured to obtain, according to at least one of the audio feature and the text feature, a presentation of the audio promotion information.

The aspect as described above and any possible implementation manner further provide an implementation manner, where the acquiring unit is specifically configured to

Acquiring the original audio data in real time; or

An aspect of the foregoing, and any possible implementation manner, further provide an implementation manner, where the mapping unit is specifically used to

An aspect of the foregoing, and any possible implementation manner, further providing an implementation manner, where the presentation unit is specifically configured to

Calculating a degree of matching of the promotion attribute feature with at least one of the audio feature and the text feature as a presentation score of the audio promotion information;

Attribute characteristics of a page displaying audio promotion information;

In another aspect of the invention, an apparatus is provided, comprising:

One or more processors;

Memory

One or more programs, the one or more programs being stored in the memory, when executed by the one or more processors:

Obtaining the original audio data of the audio promotion information;

In another aspect of the present invention, a non-volatile computer storage medium is provided, which stores one or more programs that, when executed by a device, cause the device to perform the following:

Obtaining the original audio data of the audio promotion information;

According to the foregoing technical solution, in the embodiments of the present invention, an audio feature of the audio promotion information is obtained according to the acquired original audio data of the audio promotion information, and a text feature of the audio promotion information is then obtained according to at least one of the original audio data and the audio feature, so that the presentation of the audio promotion information can be obtained according to at least one of the audio feature and the text feature. Because the audio promotion information is no longer presented by relying entirely on its text content attributes, and its audio feature, which describes the attributes of the audio promotion information more accurately, is also taken into account, accurate presentation of the audio promotion information can be ensured, thereby improving the conversion rate of the audio promotion information.

In addition, with the technical solution provided by the present invention, the audio promotion information can be pushed automatically without manual participation, and therefore the cost of pushing the audio promotion information can be effectively reduced.

In addition, the technical solution provided by the present invention is simple in operation, and therefore, the efficiency of processing the audio promotion information can be effectively improved.

Description of the drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and those of ordinary skill in the art may obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of a method for processing audio promotion information according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of an apparatus for processing audio promotion information according to another embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the drawings in the embodiments of the present invention. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

It should be noted that the terminals involved in the embodiments of the present invention may include, but are not limited to, a mobile phone, a personal digital assistant (PDA), a wireless handheld device, a tablet computer, a personal computer (PC), an MP3 player, an MP4 player, and a wearable device (for example, smart glasses, a smart watch, or a smart bracelet).

In addition, the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate three cases: A exists alone, A and B exist at the same time, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.

FIG. 1 is a schematic flowchart of a method for processing audio promotion information according to an embodiment of the present invention, as shown in FIG. 1 .

101. Obtain original audio data of audio promotion information.

102. Obtain an audio feature of the audio promotion information according to the original audio data.

103. Obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature.

104. Obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature.

The so-called audio promotion information may refer to a complete audio file, which may be pre-stored in the storage device of the terminal. The audio promotion information may include audio files in various existing encoding formats, for example, a Moving Picture Experts Group (MPEG) Layer 3 (MP3) format audio file, a Windows Media Audio (WMA) format audio file, an Advanced Audio Coding (AAC) format audio file, or an APE format audio file, which is not particularly limited in this embodiment.

In a specific implementation process, the storage device of the terminal may be a slow storage device, such as a hard disk of a computer system, or the non-running memory of a mobile phone, that is, physical storage such as a read-only memory (ROM) or a memory card, which is not particularly limited in this embodiment.

In another specific implementation process, the storage device of the terminal may also be a fast storage device, such as the memory of a computer system, or the running memory of a mobile phone, that is, system memory such as a random access memory (RAM), which is not particularly limited in this embodiment.

It should be noted that some or all of the execution entities of 101 to 104 may be an application located in the local terminal, or a functional unit such as a plug-in or a software development kit (SDK) in an application of the local terminal, or a processing engine in a server on the network side, or a distributed system on the network side, which is not particularly limited in this embodiment.

It can be understood that the application may be a local application installed on the terminal (nativeApp), or may be a web application (webApp) of a browser on the terminal. This embodiment is not particularly limited.

In this way, the audio feature of the audio promotion information is obtained according to the acquired original audio data of the audio promotion information, and the text feature of the audio promotion information is then obtained according to at least one of the original audio data and the audio feature, so that the presentation of the audio promotion information can be obtained according to at least one of the audio feature and the text feature. Because the audio promotion information is no longer presented by relying entirely on its text content attributes, and its audio feature, which describes the attributes of the audio promotion information more accurately, is also taken into account, accurate presentation of the audio promotion information can be ensured, thereby improving the conversion rate of the audio promotion information.

Optionally, in one possible implementation manner of this embodiment, in 101, the original audio data may be collected in real time.

Specifically, the sound signal of the audio promotion information may be specifically collected, and then the sound signal is converted into original audio data. For example, the sound signal is sampled, quantized, and encoded to obtain Pulse Code Modulation (PCM) data.
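The sample, quantize, and encode sequence mentioned above can be illustrated with a minimal sketch. It assumes a synthetic sine tone in place of a captured sound signal, a sample rate of 8000 Hz, and 16-bit little-endian PCM; the function name `tone_to_pcm16` and all of these values are illustrative assumptions, not details taken from this application.

```python
import math
import struct

def tone_to_pcm16(freq_hz, duration_s, sample_rate=8000):
    """Sample a sine tone, quantize it to 16 bits, and encode it as PCM bytes."""
    n_samples = int(round(duration_s * sample_rate))
    # Sampling: evaluate the continuous signal at the instants n / sample_rate.
    samples = [math.sin(2 * math.pi * freq_hz * n / sample_rate)
               for n in range(n_samples)]
    # Quantization: map the range [-1.0, 1.0] onto signed 16-bit integers.
    quantized = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    # Encoding: pack the integers as little-endian signed 16-bit PCM.
    return struct.pack("<%dh" % n_samples, *quantized)

pcm = tone_to_pcm16(440.0, 0.01)  # 10 ms of a 440 Hz tone
```

Each sample occupies two bytes, so the resulting byte string is twice as long as the number of samples.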

Optionally, in a possible implementation manner of this embodiment, in 101, the audio promotion information may be specifically acquired, and the audio promotion information is decoded to obtain the original audio data.

In a specific implementation process, the original audio data may be obtained by performing decoding processing on the data block of the audio promotion information. The so-called original audio data is a digital signal converted from an audio signal, for example, the audio signal is sampled, quantized, and encoded to obtain PCM data. For a detailed description of the decoding process, refer to related content in the prior art, and details are not described herein again.

In this embodiment, the original audio data obtained by performing 101 may be the original audio data corresponding to one channel. If the audio promotion information has multiple channels, the subsequent processing, that is, 102 to 104, may be performed separately on the original audio data corresponding to each channel.

In a specific implementation process, specifically, the number of channels of the audio promotion information may be determined, and the data block of the audio promotion information is decoded to obtain original audio data. Then, the original audio data corresponding to each channel can be obtained according to the number of channels and the original audio data.

For example, the frame header of the audio promotion information may be parsed to determine the number of channels of the audio promotion information.

Or, for example, the file header of the audio promotion information may be parsed to determine the number of channels of the audio promotion information.

Alternatively, for example, the other parts of the audio promotion information may be parsed to determine the number of channels of the audio promotion information, which is not specifically limited in this embodiment.

Or, for example, the number of channels of the audio promotion information may be obtained from a configuration file.
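As a rough sketch of determining the number of channels from a file header and then separating the original audio data per channel, the following uses a WAV container and Python's standard `wave` module. WAV is chosen only because the standard library can parse it; the formats named in this embodiment (MP3, WMA, AAC, APE) would need their own decoders. The helper names `split_channels` and `make_stereo_wav` are hypothetical.

```python
import io
import struct
import wave

def split_channels(wav_bytes):
    """Read the channel count from the WAV header and de-interleave the PCM data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        n_channels = wav.getnchannels()  # taken from the file header
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit PCM only")
        raw = wav.readframes(wav.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # PCM is interleaved: sample i belongs to channel i % n_channels.
    return [list(samples[ch::n_channels]) for ch in range(n_channels)]

def make_stereo_wav(left, right, sample_rate=8000):
    """Build a small in-memory stereo WAV file to exercise split_channels."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        interleaved = [v for pair in zip(left, right) for v in pair]
        wav.writeframes(struct.pack("<%dh" % len(interleaved), *interleaved))
    return buf.getvalue()

channels = split_channels(make_stereo_wav([1, 2, 3], [-1, -2, -3]))
```

Each element of `channels` can then be passed through 102 to 104 independently.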

It can be understood that the two steps of "determining the number of channels of the audio promotion information" and "decoding the data block of the audio promotion information to obtain original audio data" have no fixed order. The processing apparatus may first perform the step of "determining the number of channels of the audio promotion information" and then perform the step of "decoding the data block of the audio promotion information to obtain original audio data", or may first perform the step of "decoding the data block of the audio promotion information to obtain original audio data" and then perform the step of "determining the number of channels of the audio promotion information", or may perform both steps simultaneously. This embodiment is not particularly limited.

Optionally, in a possible implementation manner of this embodiment, in 102, the original audio data may be divided into frames to obtain at least one frame of data, and audio analysis processing may then be performed on each frame of the at least one frame of data to obtain the audio feature of each frame of data.

In a specific implementation process, the original audio data may be framed according to a preset time interval, for example, 20 ms, with some data overlap between adjacent frames, for example, 50% overlap, so that at least one frame of data of the original audio data can be obtained.
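The framing described here can be sketched as follows, using the 20 ms interval and 50% overlap given as examples. The function name `frame_signal`, the 8000 Hz sample rate, and the choice to drop trailing samples that do not fill a whole frame are assumptions made for illustration.

```python
def frame_signal(samples, sample_rate, frame_ms=20, overlap=0.5):
    """Split samples into fixed-length frames with fractional overlap."""
    frame_len = int(sample_rate * frame_ms / 1000)
    # The hop is the distance between frame starts; 50% overlap gives half a frame.
    hop = max(1, int(frame_len * (1.0 - overlap)))
    frames = []
    # Trailing samples that do not fill a whole frame are dropped in this sketch.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

frames = frame_signal(list(range(1000)), sample_rate=8000)
```

At 8000 Hz a 20 ms frame holds 160 samples, and consecutive frames start 80 samples apart.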

In another specific implementation process, the audio feature may include, but is not limited to, at least one of a time domain audio feature of the original audio data and a frequency domain audio feature of the original audio data, which is not particularly limited in this embodiment.

The time domain audio feature of the original audio data may include at least one of the following parameters:

Time domain waveform, intensity, zero-crossing rate, Linear Prediction Coding (LPC) coefficient, Linear Prediction Cepstrum Coefficient (LPCC), Mel Frequency Cepstrum Coefficient (MFCC) or Perceptual Linear Predictive (PLP) coefficients, beats, tones, and tonality.
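Two of the listed parameters, intensity and zero-crossing rate, are simple enough to sketch per frame. The sketch assumes intensity is measured as root-mean-square amplitude and that a zero crossing is a sign change between adjacent samples; both definitions are common conventions rather than definitions given in this application.

```python
import math

def rms_intensity(frame):
    """Root-mean-square amplitude of one frame (one common measure of intensity)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

frame = [1, -1, 1, -1]
intensity = rms_intensity(frame)
zcr = zero_crossing_rate(frame)
```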

The frequency domain audio features of the original audio data may include, but are not limited to, spectrum information of the original audio data.
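One way to obtain spectrum information for a frame is a discrete Fourier transform. The sketch below uses a direct DFT written with the standard library so that it stays self-contained; a practical implementation would more likely use an FFT routine. The function name `magnitude_spectrum` is an assumption.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0 .. N/2 for a real-valued frame (direct O(N^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc))
    return spectrum

# A cosine completing exactly 2 cycles over 16 samples peaks in bin 2.
frame = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
spectrum = magnitude_spectrum(frame)
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
```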

Optionally, in a possible implementation manner of this embodiment, in 103, the text feature of the audio promotion information may be obtained, according to the audio feature, by using a pre-established correspondence between audio features and text features.

The so-called text feature may be any textual description of the audio promotion information, for example, that the audio promotion information has a fast rhythm, a slow rhythm, a high sound quality, or a low sound quality.

The so-called sound quality of the audio promotion information refers to the fidelity of the original audio data after compression processing. A high-quality audio file can completely restore the original audio data without any distortion, whereas a low-quality audio file cannot completely restore the original audio data, which results in partial distortion.

In a specific implementation process, a beat threshold, expressed for example in beats per minute (BPM), may be preset as a representation of the correspondence between audio features and text features. If the obtained beat is less than or equal to the beat threshold, it may be mapped to a text feature indicating relief; conversely, if the obtained beat is greater than the beat threshold, it may be mapped to a text feature indicating joy.
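The threshold rule can be written as a one-line mapping. The threshold value of 100 BPM and the function name `rhythm_text_feature` are placeholders chosen for illustration; this application does not specify a value.

```python
def rhythm_text_feature(bpm, beat_threshold=100.0):
    """Map a beat value in BPM to a text feature using a preset threshold."""
    # At or below the threshold maps to "relief"; above it maps to "joy".
    return "relief" if bpm <= beat_threshold else "joy"

slow = rhythm_text_feature(72)
fast = rhythm_text_feature(140)
```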

In another specific implementation process, a correspondence may also be preset between a time domain waveform without clipping distortion and a text feature indicating high sound quality, and between a time domain waveform with clipping distortion and a text feature indicating low sound quality. If the obtained time domain waveform has no clipping distortion, it may be mapped to the text feature indicating high sound quality; conversely, if the obtained time domain waveform has clipping distortion, it may be mapped to the text feature indicating low sound quality.
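A simple heuristic for detecting clipping distortion in 16-bit PCM is to look for a run of consecutive samples stuck at full scale. The run length of 3 and the function names below are assumptions for illustration; this application does not say how clipping is to be detected.

```python
def has_clipping(samples, full_scale=32767, min_run=3):
    """Treat min_run or more consecutive full-scale samples as clipping distortion."""
    run = 0
    for s in samples:
        if abs(s) >= full_scale:
            run += 1
            if run >= min_run:
                return True
        else:
            run = 0  # the run is broken by any sample below full scale
    return False

def quality_text_feature(samples):
    """Map the clipping check onto the two sound-quality text features."""
    return "low sound quality" if has_clipping(samples) else "high sound quality"

clean = quality_text_feature([0, 12000, 32767, 12000, 0])
clipped = quality_text_feature([0, 32767, 32767, 32767, 0])
```

A single full-scale peak is not flagged, since an undistorted waveform can touch full scale momentarily.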

In another specific implementation process, a pre-specified training sample set may be used for training to construct a learning model that describes the correspondence between audio features and text features. The training samples in the training sample set may all be labeled known samples, in which case the known samples can be used directly for training to construct the learning model. Alternatively, some of the training samples may be labeled known samples and the rest unlabeled unknown samples. In that case, the known samples can first be used for training to construct an initial learning model, and the initial learning model can then be used to evaluate the unknown samples to obtain recognition results. The unknown samples can then be labeled according to their recognition results to form newly added known samples, and the newly added known samples and the original known samples can be used for retraining to construct a new learning model, until the constructed learning model or the known samples satisfy the cutoff condition of the learning model, for example, the recognition accuracy is greater than or equal to a preset accuracy threshold, or the number of known samples is greater than or equal to a preset number threshold. This embodiment is not particularly limited.
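The iterative labeling procedure can be sketched as a small self-training loop. The sketch substitutes a nearest-centroid classifier for the unspecified learning model and uses the number of known samples as the cutoff condition; the classifier, the one-dimensional BPM features, and all function names are assumptions for illustration.

```python
def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: centroid}."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(model, vec):
    """Return the label whose centroid is closest to vec (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(model, key=lambda label: dist(model[label]))

def self_train(known, unknown, max_known=None):
    """Label unknown samples with the current model, add them, and retrain."""
    known = list(known)
    pending = list(unknown)
    model = train_centroids(known)  # initial learning model
    # Cutoff: stop when no unknown samples remain or enough known samples exist.
    while pending and (max_known is None or len(known) < max_known):
        vec = pending.pop(0)
        known.append((vec, predict(model, vec)))  # newly added known sample
        model = train_centroids(known)            # retrain on all known samples
    return model, known

known = [([60.0], "relief"), ([140.0], "joy")]
model, labeled = self_train(known, unknown=[[70.0], [130.0]])
```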

Optionally, in a possible implementation manner of the embodiment, in 103, a text feature of the audio promotion information may be obtained by using a voice recognition technology according to the original audio data.

The specific voice recognition technology may be any existing technology, as long as the specific keyword can be identified as the text feature of the audio promotion information, and details are not described herein again.

Optionally, in a possible implementation manner of this embodiment, in 103, the text feature of the audio promotion information may be obtained, according to the audio feature, by using a pre-established correspondence between audio features and text features, and may also be obtained, according to the original audio data, by using a voice recognition technology.

Specifically, the technical solution in the foregoing two implementation manners may be used to perform organic combination to obtain text features of the audio promotion information. For a detailed description, reference may be made to the related descriptions in the foregoing two implementation manners, and details are not described herein again.

Optionally, in a possible implementation manner of this embodiment, in 104, a matching degree between the promotion attribute feature and at least one of the audio feature and the text feature may be calculated as the presentation score of the audio promotion information, and the presentation of the audio promotion information may then be obtained according to the presentation score.

The so-called promotion attribute feature may be described by a topic model of the promotion. A topic model, as its name suggests, is a method for modeling the implicit topics in text, audio, and the like. For example, the word "Apple" relates both to the topic of the company Apple and to the topic of fruit. Specifically, the promotion attribute feature may include, but is not limited to, at least one of the following features:

Attribute characteristics of a page displaying audio promotion information, such as a shopping page, a game page, a news page, and the like;

The attribute characteristics of the website to which the page displaying the audio promotion information belongs, such as a shopping website, a game website, a news website, etc.;

The attribute characteristics of the user to whom the audio promotion information is pushed, such as teenagers, seniors, and the like.
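The matching degree is not given a formula in this embodiment. One plausible sketch, assuming both the promotion attribute feature and the features of each piece of audio promotion information are represented as sparse topic-weight dictionaries, is cosine similarity. The topic names, weights, and function names below are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse topic-weight dictionaries."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)

def choose_promotion(promotion_attributes, candidates):
    """Return (presentation_score, name) for the best-matching candidate."""
    scored = [(cosine_similarity(promotion_attributes, feats), name)
              for name, feats in candidates.items()]
    return max(scored)

page = {"game": 1.0, "joy": 0.5}  # hypothetical attributes of a game page
candidates = {
    "game_ad": {"game": 1.0, "joy": 1.0},
    "news_ad": {"news": 1.0, "relief": 1.0},
}
best_score, best_name = choose_promotion(page, candidates)
```

The presentation, such as whether to show the item and in which position, could then be decided from the score, for example by comparing it with a threshold.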

As is well known, promotion information is the most important profit model of the Internet industry, and traffic monetization has become a very important evaluation criterion for commercial Internet products. Taking advertisements as an example, a Real Time Bidding (RTB) mode may be adopted. Unlike the traditional form of purchasing, RTB is a bidding technique that uses third-party technology to evaluate and bid on each individual ad impression across millions of websites. Therefore, when the matching degree is calculated, the bid of the audio promotion information needs to be considered in addition to the audio feature and the text feature of the audio promotion information.
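How the bid enters the calculation is not specified. A minimal sketch, assuming the bid simply multiplies the matching degree to give a ranking score, might look like this; the multiplication rule, the function name `rank_by_bid`, and the sample values are assumptions.

```python
def rank_by_bid(candidates):
    """candidates: list of (name, matching_degree, bid).

    Returns the names ordered by matching_degree * bid, highest first.
    """
    ordered = sorted(candidates, key=lambda c: c[1] * c[2], reverse=True)
    return [name for name, _match, _bid in ordered]

ranking = rank_by_bid([("a", 0.9, 1.0), ("b", 0.5, 3.0), ("c", 0.1, 2.0)])
```

Under this rule a lower-matching item with a high enough bid can outrank a better-matching one, which is the trade-off such a combined score makes.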

In this embodiment, the audio feature of the audio promotion information is obtained according to the acquired original audio data of the audio promotion information, and the text feature of the audio promotion information is then obtained according to at least one of the original audio data and the audio feature, so that the presentation of the audio promotion information can be obtained according to at least one of the audio feature and the text feature. Because the audio promotion information is no longer presented by relying entirely on its text content attributes, and its audio feature, which describes the attributes of the audio promotion information more accurately, is also taken into account, accurate presentation of the audio promotion information can be ensured, thereby improving the conversion rate of the audio promotion information.

It should be noted that, for the foregoing method embodiments, for the sake of simple description, they are all expressed as a series of action combinations, but those skilled in the art should understand that the present invention is not limited by the described action sequence. Because certain steps may be performed in other sequences or concurrently in accordance with the present invention. In addition, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.

In the above embodiments, the description of each embodiment has its own emphasis, and for details not described in a certain embodiment, reference may be made to the related descriptions of other embodiments.

FIG. 2 is a schematic structural diagram of an apparatus for processing audio promotion information according to another embodiment of the present invention. As shown in FIG. 2, the apparatus for processing audio promotion information of this embodiment may include an obtaining unit 21, an audio unit 22, a mapping unit 23, and a presentation unit 24. The obtaining unit 21 is configured to obtain original audio data of the audio promotion information; the audio unit 22 is configured to obtain an audio feature of the audio promotion information according to the original audio data; the mapping unit 23 is configured to obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and the presentation unit 24 is configured to obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
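The division into four units can be pictured as a pipeline of callables. The class below is only a structural sketch: the class name `ProcessingApparatus` and the stand-in lambda units are hypothetical and say nothing about how each unit is actually implemented.

```python
class ProcessingApparatus:
    """Chains four units; each unit is any callable supplied by the caller."""

    def __init__(self, obtaining_unit, audio_unit, mapping_unit, presentation_unit):
        self.obtaining_unit = obtaining_unit        # unit 21
        self.audio_unit = audio_unit                # unit 22
        self.mapping_unit = mapping_unit            # unit 23
        self.presentation_unit = presentation_unit  # unit 24

    def process(self, promotion):
        raw = self.obtaining_unit(promotion)                  # corresponds to 101
        audio_feature = self.audio_unit(raw)                  # corresponds to 102
        text_feature = self.mapping_unit(raw, audio_feature)  # corresponds to 103
        return self.presentation_unit(audio_feature, text_feature)  # corresponds to 104

# Stand-in units for illustration only.
apparatus = ProcessingApparatus(
    obtaining_unit=lambda promotion: promotion["pcm"],
    audio_unit=lambda raw: {"peak": max(abs(s) for s in raw)},
    mapping_unit=lambda raw, feat: "loud" if feat["peak"] > 1000 else "quiet",
    presentation_unit=lambda feat, text: {"show": text == "loud"},
)
result = apparatus.process({"pcm": [0, 1200, -300]})
```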

It should be noted that part or all of the apparatus for processing audio promotion information provided by this embodiment may be an application located in the local terminal, or a functional unit such as a plug-in or a software development kit (SDK) in an application of the local terminal, or a processing engine in a server on the network side, or a distributed system on the network side, which is not particularly limited in this embodiment.

It is to be understood that the application may be a local application (nativeApp) installed on the terminal, or may be a web application (webApp) of the browser on the terminal, which is not specifically limited in this embodiment.

Optionally, in a possible implementation manner of the embodiment, the acquiring unit 21 may be specifically configured to collect the original audio data in real time.

Optionally, in a possible implementation manner of this embodiment, the obtaining unit 21 may be specifically configured to acquire the audio promotion information and perform decoding processing on the audio promotion information to obtain the original audio data.

Optionally, in a possible implementation manner of this embodiment, the mapping unit 23 may be specifically configured to obtain the text feature of the audio promotion information, according to the audio feature, by using a pre-established correspondence between audio features and text features; and/or to obtain the text feature of the audio promotion information, according to the original audio data, by using a voice recognition technology.

Optionally, in a possible implementation manner of this embodiment, the presentation unit 24 may be specifically configured to calculate a matching degree between the promotion attribute feature and at least one of the audio feature and the text feature as the presentation score of the audio promotion information, and to obtain the presentation of the audio promotion information according to the presentation score.

Specifically, the promotion attribute feature may include, but is not limited to, at least one of the following features:

It should be noted that the method in the embodiment corresponding to FIG. 1 can be implemented by the audio promotion information processing apparatus provided in this embodiment. For details, refer to related content in the embodiment corresponding to FIG. 1, and details are not described herein again.

In this embodiment, the audio unit obtains the audio feature of the audio promotion information according to the original audio data of the audio promotion information acquired by the obtaining unit, and the mapping unit then obtains the text feature of the audio promotion information according to at least one of the original audio data and the audio feature, so that the presentation unit can obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature. Because the audio promotion information is no longer presented by relying entirely on its text content attributes, and its audio feature, which describes the attributes of the audio promotion information more accurately, is also taken into account, accurate presentation of the audio promotion information can be ensured, thereby improving the conversion rate of the audio promotion information.

It will be apparent to those skilled in the art that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described herein again.

In the several embodiments provided by the present invention, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. The division of the units is only a logical function division; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, apparatus, or unit, and may be in an electrical, mechanical, or other form.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.

The above-described integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, an audio processing engine, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), or a disk.

It should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications and substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

A method for processing audio promotion information, comprising:

Obtaining the original audio data of the audio promotion information;

Obtaining an audio feature of the audio promotion information according to the original audio data;

Obtaining a text feature of the audio promotion information according to at least one of the original audio data and the audio feature;

Obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature.

The method according to claim 1, wherein the obtaining the original audio data of the audio promotion information comprises:

Acquiring the original audio data in real time; or

Obtaining the audio promotion information, and performing decoding processing on the audio promotion information to obtain the original audio data.

The method according to claim 1, wherein the obtaining the text feature of the audio promotion information according to at least one of the original audio data and the audio feature comprises:

Obtaining, according to the audio feature, a text feature of the audio promotion information by using a correspondence between a pre-established audio feature and a text feature; and/or

Obtaining, according to the original audio data, a text feature of the audio promotion information by using a speech recognition technology.

The method according to any one of claims 1 to 3, wherein the obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature comprises:

Calculating a matching degree between a promotion attribute feature and at least one of the audio feature and the text feature as a presentation score of the audio promotion information;

Obtaining the presentation of the audio promotion information according to the presentation score.

The method according to claim 4, wherein the promotion attribute feature comprises at least one of the following features:

Attribute characteristics of a page displaying audio promotion information;

The attribute characteristics of the website to which the page displaying the audio promotion information belongs;

The attribute characteristics of the user to whom the audio promotion information is pushed.

An apparatus for processing audio promotion information, comprising:

An obtaining unit, configured to obtain original audio data of the audio promotion information;

An audio unit, configured to obtain an audio feature of the audio promotion information according to the original audio data;

A mapping unit, configured to obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and

A presentation unit, configured to obtain a presentation of the audio promotion information according to at least one of the audio feature and the text feature.

The apparatus according to claim 6, wherein the obtaining unit is specifically configured to perform:

Acquiring the original audio data in real time; or

Obtaining the audio promotion information, and performing decoding processing on the audio promotion information to obtain the original audio data.

The apparatus according to claim 6, wherein the mapping unit is specifically configured to perform:

Obtaining, according to the audio feature, a text feature of the audio promotion information by using a correspondence between a pre-established audio feature and a text feature; and/or

Obtaining, according to the original audio data, a text feature of the audio promotion information by using a speech recognition technology.

The apparatus according to any one of claims 6 to 8, wherein the presentation unit is specifically configured to perform:

Calculating a matching degree between a promotion attribute feature and at least one of the audio feature and the text feature as a presentation score of the audio promotion information;

Obtaining the presentation of the audio promotion information according to the presentation score.

The apparatus according to claim 9, wherein the promotion attribute feature comprises at least one of the following features:

Attribute characteristics of a page displaying audio promotion information;

The attribute characteristics of the website to which the page displaying the audio promotion information belongs;

The attribute characteristics of the user to whom the audio promotion information is pushed.

A device, comprising:

One or more processors;

A memory; and

One or more programs, the one or more programs being stored in the memory and, when executed by the one or more processors, performing the following operations:

Obtaining the original audio data of the audio promotion information;

Obtaining an audio feature of the audio promotion information according to the original audio data;

Obtaining a text feature of the audio promotion information according to at least one of the original audio data and the audio feature;

Obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature.

A non-volatile computer storage medium storing one or more programs, wherein the one or more programs, when executed by a device, cause the device to perform the following operations:

Obtaining the original audio data of the audio promotion information;

Obtaining an audio feature of the audio promotion information according to the original audio data;

Obtaining a text feature of the audio promotion information according to at least one of the original audio data and the audio feature;

Obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature.