WO2016179921A1 - Method, apparatus and device for processing audio popularization information, and non-volatile computer storage medium - Google Patents

Info

Publication number
WO2016179921A1
WO2016179921A1 PCT/CN2015/087978 CN2015087978W WO2016179921A1 WO 2016179921 A1 WO2016179921 A1 WO 2016179921A1 CN 2015087978 W CN2015087978 W CN 2015087978W WO 2016179921 A1 WO2016179921 A1 WO 2016179921A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
promotion information
feature
obtaining
original
Prior art date
Application number
PCT/CN2015/087978
Other languages
French (fr)
Chinese (zh)
Inventor
田彪
Original Assignee
北京音之邦文化科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京音之邦文化科技有限公司
Publication of WO2016179921A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • the present invention relates to audio processing technology, and in particular, to a method, an apparatus, a device and a non-volatile computer storage medium for processing audio promotion information.
  • the presentation of the audio promotion information may be determined based on the text content attribute such as the title and content of the audio promotion information, for example, whether the audio promotion information is displayed, the presentation position, the presentation time, and the like.
  • because the presentation relies entirely on the text content attributes of the audio promotion information, the conversion rate of the audio promotion information decreases.
  • aspects of the present invention provide a method, an apparatus, and a device for processing audio promotion information, and a non-volatile computer storage medium for improving the conversion rate of audio promotion information.
  • An aspect of the present invention provides a method for processing audio promotion information, including:
  • the acquiring the original audio data of the audio promotion information includes:
  • a text feature of the audio promotion information is obtained by using a voice recognition technology.
  • the promotional attribute feature comprising at least one of the following features:
  • the attribute characteristics of the user to whom the audio promotion information is pushed.
  • Another aspect of the present invention provides an apparatus for processing audio promotion information, including:
  • an obtaining unit configured to obtain original audio data of the audio promotion information;
  • an audio unit configured to obtain an audio feature of the audio promotion information according to the original audio data;
  • a mapping unit configured to obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature;
  • a presentation unit configured to obtain, according to at least one of the audio feature and the text feature, a presentation of the audio promotion information.
  • any possible implementation manner further provide an implementation manner, where the acquiring unit is specifically configured to
  • mapping unit is specifically used to
  • a text feature of the audio promotion information is obtained by using a voice recognition technology.
  • the promotional attribute feature comprising at least one of the following features:
  • the attribute characteristics of the user to whom the audio promotion information is pushed.
  • an apparatus comprising:
  • one or more processors;
  • one or more programs, the one or more programs being stored in the memory and, when executed by the one or more processors:
  • a non-volatile computer storage medium storing one or more programs that, when executed by a device, cause the device to:
  • the embodiment of the present invention obtains an audio feature of the audio promotion information according to the acquired original audio data of the audio promotion information, and then obtains a text feature of the audio promotion information according to at least one of the original audio data and the audio feature, so that the presentation of the audio promotion information can be obtained according to at least one of the audio feature and the text feature.
  • because the presentation no longer relies entirely on the text content attributes of the audio promotion information but also takes its audio features into account, the attributes of the audio promotion information are described more accurately, which ensures accurate presentation of the audio promotion information and thereby improves its conversion rate.
  • the automatic pushing of the audio promotion information can be realized without manual participation, and therefore the pushing cost of the audio promotion information can be effectively reduced.
  • the technical solution provided by the present invention is simple in operation, and therefore, the efficiency of processing the audio promotion information can be effectively improved.
  • FIG. 1 is a schematic flowchart of a method for processing audio promotion information according to an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of an apparatus for processing audio promotion information according to another embodiment of the present invention.
  • the terminals involved in the embodiments of the present invention may include, but are not limited to, a mobile phone, a personal digital assistant (PDA), a wireless handheld device, a tablet computer, a personal computer (PC), an MP3 player, an MP4 player, and a wearable device (for example, smart glasses, a smart watch, or a smart bracelet).
  • PDA personal digital assistant
  • PC Personal Computer
  • FIG. 1 is a schematic flowchart of a method for processing audio promotion information according to an embodiment of the present invention, as shown in FIG. 1 .
  • the so-called audio promotion information may refer to a complete audio file, which may be pre-stored in the storage device of the terminal.
  • the audio promotion information may include audio files in any existing encoding format, for example, a Moving Picture Experts Group (MPEG) Layer 3 (MP3) format audio file, a Windows Media Audio (WMA) format audio file, an Advanced Audio Coding (AAC) format audio file, or an APE format audio file; this embodiment imposes no particular limitation.
  • MPEG Moving Picture Experts Group
  • MP3 MPEG Layer 3
  • WMA Windows Media Audio
  • the storage device of the terminal may be a slow storage device, which may be a hard disk of a computer system or the non-running memory of a mobile phone, that is, physical storage, for example, a read-only memory (ROM) or a memory card; this embodiment imposes no particular limitation.
  • the storage device of the terminal may also be a fast storage device, which may be the memory of a computer system or the running memory of a mobile phone, that is, system memory, for example, a random access memory (RAM); this embodiment imposes no particular limitation.
  • the execution entities of 101 to 104 may be an application located in the local terminal, or a plug-in or software development kit (SDK) in an application of the local terminal, or a processing engine located in a server on the network side, or a distributed system located on the network side; this embodiment imposes no particular limitation.
  • SDKs software development kits
  • the application may be a local application installed on the terminal (nativeApp), or may be a web application (webApp) of a browser on the terminal.
  • This embodiment is not particularly limited.
  • the audio feature of the audio promotion information is obtained according to the acquired original audio data of the audio promotion information, and the text feature of the audio promotion information is then obtained according to at least one of the original audio data and the audio feature, so that the presentation of the audio promotion information can be obtained according to at least one of the audio feature and the text feature.
  • because the presentation no longer relies entirely on the text content attributes of the audio promotion information but also considers its audio features, the attributes of the audio promotion information are described more accurately, which ensures accurate presentation of the audio promotion information and thereby improves its conversion rate.
  • the original audio data may be collected in real time.
  • the sound signal of the audio promotion information may be specifically collected, and then the sound signal is converted into original audio data.
  • the sound signal is sampled, quantized, and encoded to obtain Pulse Code Modulation (PCM) data.
  • PCM Pulse Code Modulation
  • the audio promotion information may be specifically acquired, and the audio promotion information is decoded to obtain the original audio data.
  • the original audio data may be obtained by performing decoding processing on the data block of the audio promotion information.
  • the so-called original audio data is a digital signal converted from an audio signal, for example, the audio signal is sampled, quantized, and encoded to obtain PCM data.
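The sample, quantize and encode steps mentioned above can be illustrated with a minimal sketch. The 8 kHz sample rate, the 16-bit little-endian sample format, and the names `encode_pcm16` and `tone_440` are assumptions chosen for illustration; the source does not specify them.

```python
import math
import struct

def encode_pcm16(signal, duration_s, sample_rate_hz=8000):
    """Sample a continuous signal, quantize it to 16 bits, and encode it as PCM bytes.

    `signal` is a hypothetical callable mapping time in seconds to an
    amplitude in [-1.0, 1.0]; it stands in for the collected sound signal.
    """
    n_samples = round(duration_s * sample_rate_hz)
    pcm = bytearray()
    for n in range(n_samples):
        t = n / sample_rate_hz                  # sampling: evaluate at discrete times
        x = max(-1.0, min(1.0, signal(t)))      # keep the amplitude in range
        q = round(x * 32767)                    # quantization: map to a 16-bit integer
        pcm += struct.pack("<h", q)             # encoding: little-endian signed 16-bit
    return bytes(pcm)

def tone_440(t):
    """Illustrative input: a 440 Hz sine tone."""
    return math.sin(2 * math.pi * 440 * t)

pcm_bytes = encode_pcm16(tone_440, duration_s=0.01)
```

Each sample occupies two bytes, so 10 ms at 8 kHz yields 80 samples and 160 bytes.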
  • the obtained original audio data may be the original audio data corresponding to one channel; if the audio promotion information has multiple channels, the subsequent processing, that is, 102 to 104, may be performed separately on the original audio data corresponding to each channel.
  • the number of channels of the audio promotion information may be determined, and the data block of the audio promotion information is decoded to obtain original audio data. Then, the original audio data corresponding to each channel can be obtained according to the number of channels and the original audio data.
  • the frame header of the audio promotion information may be parsed to determine the number of channels of the audio promotion information.
  • the file header of the audio promotion information may be parsed to determine the number of channels of the audio promotion information.
  • the other parts of the audio promotion information may be parsed to determine the number of channels of the audio promotion information, which is not specifically limited in this embodiment.
  • the number of channels of the audio promotion information may be obtained from a configuration file.
  • the processing device may first perform the step of "determining the number of channels of the audio promotion information" and then the step of "decoding the data block of the audio promotion information to obtain original audio data", or may perform these two steps in the reverse order, or may perform both steps simultaneously.
  • This embodiment is not particularly limited.
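Once the number of channels is known and the data block has been decoded, the per-channel original audio data can be separated. The sketch below assumes the decoded data is interleaved 16-bit little-endian PCM, which is a common layout but is not stated in the source; `split_channels` is a hypothetical helper name.

```python
import struct

def split_channels(pcm_bytes, num_channels):
    """Split interleaved 16-bit PCM into one list of samples per channel.

    Assumes samples are stored as L0, R0, L1, R1, ... for two channels,
    and analogously for other channel counts.
    """
    count = len(pcm_bytes) // 2                       # two bytes per sample
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    # Every num_channels-th sample, starting at offset c, belongs to channel c.
    return [list(samples[c::num_channels]) for c in range(num_channels)]

interleaved = struct.pack("<6h", 1, -1, 2, -2, 3, -3)
left, right = split_channels(interleaved, num_channels=2)
```

Each returned list can then be passed through 102 to 104 independently, as the text describes.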
  • the original audio data may be subjected to framing processing to obtain at least one frame of data, and audio analysis processing is then performed on each frame of the at least one frame of data to obtain the audio feature of each frame of data.
  • the original audio data may be framed according to a preset time interval, for example, 20 ms, with some data overlap between adjacent frames, for example, 50% overlap, so that at least one frame of data of the original audio data is obtained.
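A minimal sketch of the framing step, using the 20 ms interval and 50% overlap given as examples above. The function name `frame_signal` and the choice to drop trailing samples that do not fill a whole frame are assumptions.

```python
def frame_signal(samples, sample_rate_hz, frame_ms=20, overlap=0.5):
    """Cut a sample sequence into fixed-length, overlapping frames."""
    frame_len = int(sample_rate_hz * frame_ms / 1000)   # samples per frame
    hop = max(1, int(frame_len * (1 - overlap)))        # step between frame starts
    frames = []
    # Stop when a full frame no longer fits; the remainder is discarded.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

# At 1 kHz, a 20 ms frame holds 20 samples and the hop is 10 samples.
frames = frame_signal(list(range(50)), sample_rate_hz=1000)
```

With 50% overlap, each frame shares its second half with the first half of the next frame.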
  • the audio feature may include, but is not limited to, at least one of a time domain audio feature of the original audio data and a frequency domain audio feature of the original audio data; this embodiment imposes no particular limitation.
  • the time domain audio feature of the original audio data may include at least one of the following parameters:
  • Time domain waveform intensity, zero-crossing rate, Linear Prediction Coding (LPC) coefficient, Linear Prediction Cepstrum Coefficient (LPCC), Mel Frequency Cepstrum Coefficient (MFCC) or Perceptual Linear Predictive (PLP) coefficients, beats, tones, and tonality.
  • LPC Linear Prediction Coding
  • LPCC Linear Prediction Cepstrum Coefficient
  • MFCC Mel Frequency Cepstrum Coefficient
  • PLP Perceptual Linear Predictive
  • the frequency domain audio features of the original audio data may include, but are not limited to, spectrum information of the original audio data.
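Two of the simpler per-frame time domain parameters can be sketched as follows. The zero-crossing rate follows its usual definition; using root-mean-square amplitude as the measure of waveform intensity is an assumption, since the source does not define how intensity is computed.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def rms_intensity(frame):
    """Root-mean-square amplitude of a frame (assumed intensity measure)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))

frame = [3, -3, 3, -3]
features = {"zcr": zero_crossing_rate(frame), "intensity": rms_intensity(frame)}
```

Parameters such as LPC, LPCC, MFCC and PLP coefficients need considerably more machinery and are not attempted here.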
  • the text feature of the audio promotion information may be obtained by using a correspondence between the pre-established audio feature and the text feature according to the audio feature.
  • the so-called text feature may be any textual description of the audio promotion information.
  • the audio promotion information has a fast rhythm
  • the audio promotion information has a slow rhythm
  • the audio promotion information has a high sound quality
  • the audio promotion information has a low sound quality.
  • the sound quality of the so-called audio promotion information refers to the fidelity of the original audio data after the compression processing.
  • a beat threshold, expressed for example in beats per minute (BPM), may be preset as a representation of the correspondence between audio features and text features. If the obtained beat is less than or equal to the beat threshold, it may be mapped to a text feature indicating relief; conversely, if the obtained beat is greater than the beat threshold, it may be mapped to a text feature indicating joy.
  • BPM Beat Per Minute
  • as another example, a correspondence may be preset between a time domain waveform without clipping distortion and a text feature indicating high sound quality, and between a time domain waveform with clipping distortion and a text feature indicating low sound quality. If the obtained time domain waveform has no clipping distortion, it can be mapped to the text feature indicating high sound quality; conversely, if the obtained time domain waveform has clipping distortion, it can be mapped to the text feature indicating low sound quality.
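The two preset correspondences above amount to simple rules. The sketch below is one possible reading: the 100 BPM threshold is a made-up value, and treating a peak sample that reaches 16-bit full scale as evidence of clipping is a crude assumption, not the detection method of the source.

```python
def map_to_text_features(bpm, peak_sample, beat_threshold_bpm=100.0, full_scale=32767):
    """Map a beat value and a peak amplitude to descriptive text features."""
    features = []
    # Beat rule: at or below the threshold maps to "relief", above it to "joy".
    features.append("joy" if bpm > beat_threshold_bpm else "relief")
    # Clipping rule (assumed heuristic): a peak at full scale suggests clipping.
    clipped = abs(peak_sample) >= full_scale
    features.append("low sound quality" if clipped else "high sound quality")
    return features

slow_clean = map_to_text_features(bpm=80, peak_sample=12000)
fast_clipped = map_to_text_features(bpm=140, peak_sample=32767)
```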
  • a pre-specified training sample set may be used to perform training to construct a learning model, which is used to describe a correspondence between an audio feature and a text feature.
  • the training samples included in the training sample set may all be labeled known samples, so that the known samples can be used directly for training to construct the learning model; or some may be labeled known samples and the others unlabeled unknown samples, in which case the known samples are first used for training to construct an initial learning model, and the initial learning model is then used to evaluate the unknown samples to obtain recognition results.
  • the unknown samples can then be labeled according to their recognition results to form newly added known samples, and the newly added known samples together with the original known samples can be used for retraining.
  • retraining may continue until a cutoff condition is met, for example, the recognition accuracy is greater than or equal to a preset accuracy threshold, or the number of known samples is greater than or equal to a preset number threshold; this embodiment imposes no particular limitation.
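The train, evaluate, label and retrain loop described above can be sketched with a deliberately simple stand-in model. The nearest-centroid classifier over a single numeric feature, the rule of labeling the most confident unknown sample first, and all function names are assumptions for illustration; the source does not name a learning algorithm.

```python
def train_centroids(samples):
    """Build a model mapping each label to the mean feature value of its samples."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, value):
    """Return the label whose centroid is closest to the value."""
    return min(model, key=lambda label: abs(model[label] - value))

def self_train(known, unknown, known_count_threshold):
    """Label unknown samples with the current model and retrain after each one."""
    known, pending = list(known), list(unknown)
    model = train_centroids(known)                      # initial learning model
    # Cutoff condition: stop once enough known samples exist.
    while pending and len(known) < known_count_threshold:
        # Assumed heuristic: take the unknown sample nearest to any centroid.
        best = min(pending, key=lambda v: min(abs(c - v) for c in model.values()))
        pending.remove(best)
        known.append((best, predict(model, best)))      # newly added known sample
        model = train_centroids(known)                  # retrain
    return model, known

model, labeled = self_train(
    known=[(60.0, "relief"), (140.0, "joy")],
    unknown=[70.0, 150.0],
    known_count_threshold=4,
)
```

Here the feature is a beat value in BPM, so the trained model plays the role of the correspondence between an audio feature and a text feature.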
  • a text feature of the audio promotion information may be obtained by using a voice recognition technology according to the original audio data.
  • the specific voice recognition technology may be any existing technology, as long as the specific keyword can be identified as the text feature of the audio promotion information, and details are not described herein again.
  • the text feature of the audio promotion information may be obtained according to the audio feature by using a pre-established correspondence between audio features and text features, and a text feature of the audio promotion information may also be obtained according to the original audio data by using a voice recognition technology.
  • the technical solutions of the foregoing two implementation manners may be combined to obtain the text features of the audio promotion information.
  • a matching degree between the promotion attribute feature and at least one of the audio feature and the text feature may be specifically calculated as the presentation score of the audio promotion information, and the presentation of the audio promotion information may then be obtained according to the presentation score.
  • the so-called promotion attribute feature can be described by the topic model of this promotion.
  • the topic model is a modeling method for implicit topics in text, audio, and so on.
  • the word "Apple” contains both the theme of Apple and the theme of fruit.
  • the promotion attribute feature may include, but is not limited to, at least one of the following features:
  • Attribute characteristics of a page displaying audio promotion information such as a shopping page, a game page, a news page, and the like;
  • the attribute characteristics of the website to which the page displaying the audio promotion information belongs such as a shopping website, a game website, a news website, etc.;
  • the attribute characteristics of the user to whom the audio promotion information is pushed, such as teenagers, seniors, and the like.
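The matching degree between promotion attribute features and the audio or text features could be computed in many ways. The sketch below assumes both sides have already been expressed as topic weight vectors of equal length and uses cosine similarity as the matching degree; the 0.5 threshold and all names are hypothetical, and the source does not prescribe this particular measure.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0                      # no information on one side: no match
    return dot / (norm_a * norm_b)

def presentation_decision(promotion_topics, content_topics, show_threshold=0.5):
    """Use the matching degree as the presentation score and decide whether to show."""
    score = cosine_similarity(promotion_topics, content_topics)
    return {"score": score, "show": score >= show_threshold}

# Hypothetical topic order: [shopping, games, news]
page_topics = [0.1, 0.8, 0.1]           # a game page
audio_topics = [0.0, 0.9, 0.1]          # audio whose features suggest a game theme
decision = presentation_decision(page_topics, audio_topics)
```

The same score could also feed a choice of presentation position or time rather than a yes or no decision.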
  • RTB Real Time Bidding
  • the audio feature of the audio promotion information is obtained according to the acquired original audio data of the audio promotion information, and the text feature of the audio promotion information is then obtained according to at least one of the original audio data and the audio feature, so that the presentation of the audio promotion information can be obtained according to at least one of the audio feature and the text feature.
  • because the presentation no longer relies entirely on the text content attributes of the audio promotion information but also considers its audio features, the attributes of the audio promotion information are described more accurately, which ensures accurate presentation of the audio promotion information and thereby improves its conversion rate.
  • the automatic pushing of the audio promotion information can be realized without manual participation, and therefore the pushing cost of the audio promotion information can be effectively reduced.
  • the technical solution provided by the present invention is simple in operation, and therefore, the efficiency of processing the audio promotion information can be effectively improved.
  • the processing apparatus of the audio promotion information of the embodiment may include an acquisition unit 21, an audio unit 22, a mapping unit 23, and a presentation unit 24.
  • the obtaining unit 21 is configured to obtain original audio data of the audio promotion information
  • the audio unit 22 is configured to obtain an audio feature of the audio promotion information according to the original audio data
  • the mapping unit 23 is configured to obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature
  • part or all of the processing apparatus for audio promotion information provided by this embodiment may be an application located in the local terminal, or a plug-in or software development kit (SDK) in an application of the local terminal, or a processing engine located in a server on the network side, or a distributed system located on the network side; this embodiment imposes no particular limitation.
  • the application may be a local application (nativeApp) installed on the terminal, or may be a web application (webApp) of the browser on the terminal, which is not specifically limited in this embodiment.
  • the acquiring unit 21 may be specifically configured to collect the original audio data in real time.
  • the acquiring unit 21 may be specifically configured to acquire the audio promotion information and perform decoding processing on the audio promotion information to obtain the original audio data.
  • the mapping unit 23 may be specifically configured to obtain a text feature of the audio promotion information according to the audio feature by using a pre-established correspondence between audio features and text features; and/or to obtain a text feature of the audio promotion information according to the original audio data by using a voice recognition technology.
  • the presentation unit 24 may be specifically configured to calculate a matching degree between the promotion attribute feature and at least one of the audio feature and the text feature as a presentation score of the audio promotion information, and to obtain a presentation of the audio promotion information according to the presentation score.
  • the promotion attribute feature may include, but is not limited to, at least one of the following features:
  • Attribute characteristics of a page displaying audio promotion information such as a shopping page, a game page, a news page, and the like;
  • the attribute characteristics of the website to which the page displaying the audio promotion information belongs such as a shopping website, a game website, a news website, etc.;
  • the attribute characteristics of the user to whom the audio promotion information is pushed, such as teenagers, seniors, and the like.
  • the audio unit obtains the audio feature of the audio promotion information according to the original audio data of the audio promotion information acquired by the acquiring unit, and the mapping unit then obtains a text feature of the audio promotion information according to at least one of the original audio data and the audio feature, so that the presentation unit can obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
  • because the presentation no longer relies entirely on the text content attributes of the audio promotion information but also considers its audio features, the attributes of the audio promotion information are described more accurately, which ensures accurate presentation of the audio promotion information and thereby improves its conversion rate.
  • the automatic pushing of the audio promotion information can be realized without manual participation, and therefore the pushing cost of the audio promotion information can be effectively reduced.
  • the technical solution provided by the present invention is simple in operation, and therefore, the efficiency of processing the audio promotion information can be effectively improved.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • in actual implementation there may be another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
  • the software functional unit is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, an audio processing engine, a network device, or the like) or a processor to perform part of the steps of the methods of the embodiments of the present invention.
  • the foregoing storage medium includes: a USB flash drive, a mobile hard disk, a read-only memory (ROM), and a random access memory (Random Access).
  • ROM read-only memory
  • RAM random access memory

Abstract

A method, apparatus and device for processing audio popularization information, and a non-volatile computer storage medium. The method comprises: obtaining an audio characteristic of the audio popularization information according to acquired original audio data of the audio popularization information (102); obtaining a text characteristic of the audio popularization information according to at least one of the original audio data and the audio characteristic (103); and obtaining a showing situation of the audio popularization information according to at least one of the audio characteristic and the text characteristic (104). Showing of the audio popularization information is not performed completely depending on the text content attribute of the audio popularization information, but by considering the audio characteristics of the audio popularization information, which can more accurately describe the attribute of the audio popularization information, so that accurate showing of the audio popularization information is ensured, and a conversion rate of the audio popularization information is increased.

Description

Method, apparatus, device and non-volatile computer storage medium for processing audio promotion information
This application claims priority to Chinese Patent Application No. 201510237646.6, filed on May 12, 2015 and entitled "Method and Apparatus for Processing Audio Promotion Information".
Technical field
The present invention relates to audio processing technology, and in particular to a method, an apparatus, a device and a non-volatile computer storage medium for processing audio promotion information.
Background
In recent years, with the development of Internet technology, audio promotion information such as audio advertisements, audio games and audio applications has gradually emerged. When audio promotion information is presented to users, its presentation (for example, whether it is presented, the presentation position and the presentation time) may be determined based on text content attributes of the audio promotion information such as its title and content.
However, because the presentation relies entirely on the text content attributes of the audio promotion information, the conversion rate of the audio promotion information is reduced.
Summary of the invention
Aspects of the present invention provide a method, an apparatus, a device and a non-volatile computer storage medium for processing audio promotion information, so as to improve the conversion rate of audio promotion information.
One aspect of the present invention provides a method for processing audio promotion information, including:
acquiring original audio data of the audio promotion information;
obtaining an audio feature of the audio promotion information according to the original audio data;
obtaining a text feature of the audio promotion information according to at least one of the original audio data and the audio feature;
obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
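The four steps of the method can be read as a simple pipeline. The sketch below only shows how the steps chain together; the four callables and the function name `process_audio_promotion` are placeholders supplied by the caller and are not defined by the source.

```python
def process_audio_promotion(source, acquire, extract_audio_features,
                            map_to_text, decide_presentation):
    """Chain the four method steps; each step is a caller-supplied callable."""
    raw_audio = acquire(source)                                # acquire original audio data
    audio_features = extract_audio_features(raw_audio)         # obtain the audio feature
    text_features = map_to_text(raw_audio, audio_features)     # obtain the text feature
    return decide_presentation(audio_features, text_features)  # obtain the presentation

# Toy stand-ins, purely to show the data flow between steps.
result = process_audio_promotion(
    [0, 1, -1],
    acquire=lambda s: list(s),
    extract_audio_features=lambda raw: {"n": len(raw)},
    map_to_text=lambda raw, feats: ["short"] if feats["n"] < 10 else ["long"],
    decide_presentation=lambda audio, text: {"show": "short" in text},
)
```

Passing both the raw audio and the audio features to the mapping step reflects the "at least one of" wording: an implementation may use either or both.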
The aspect described above and any possible implementation further provide an implementation in which acquiring the original audio data of the audio promotion information includes:
collecting the original audio data in real time; or
acquiring the audio promotion information and performing decoding processing on the audio promotion information to obtain the original audio data.
The aspect described above and any possible implementation further provide an implementation in which obtaining the text feature of the audio promotion information according to at least one of the original audio data and the audio feature includes:
obtaining, according to the audio feature, a text feature of the audio promotion information by using a pre-established correspondence between audio features and text features; and/or
obtaining, according to the original audio data, a text feature of the audio promotion information by using a voice recognition technology.
如上所述的方面和任一可能的实现方式,进一步提供一种实现方式,所述根据所述音频特征和所述文本特征中的至少一项,获得所述音频推 广信息的展现情况,包括:An aspect as described above, and any possible implementation, further providing an implementation, wherein the audio push is obtained according to at least one of the audio feature and the text feature The display of wide information, including:
计算推广属性特征与所述音频特征和所述文本特征中的至少一项的匹配度,以作为所述音频推广信息的展现得分;Calculating a matching degree of the promotion attribute feature with at least one of the audio feature and the text feature as a presentation score of the audio promotion information;
根据所述展现得分,获得所述音频推广信息的展现情况。Obtaining the presentation of the audio promotion information according to the presentation score.
如上所述的方面和任一可能的实现方式,进一步提供一种实现方式,所述推广属性特征包括下列特征中的至少一项:In an aspect as described above and any possible implementation, an implementation is further provided, the promotional attribute feature comprising at least one of the following features:
展现音频推广信息的页面的属性特征;Attribute characteristics of a page displaying audio promotion information;
展现音频推广信息的页面所属网站的属性特征;以及The attribute characteristics of the website to which the page displaying the audio promotion information belongs;
音频推广信息的推送用户的属性特征。The attribute characteristics of the push user of the audio promotion information.
Another aspect of the present invention provides an apparatus for processing audio promotion information, including:
an obtaining unit, configured to obtain original audio data of the audio promotion information;
an audio unit, configured to obtain an audio feature of the audio promotion information according to the original audio data;
a mapping unit, configured to obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and
a presentation unit, configured to obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
In the aspect described above and any possible implementation, an implementation is further provided in which the obtaining unit is specifically configured to:
collect the original audio data in real time; or
obtain the audio promotion information and decode it to obtain the original audio data.
In the aspect described above and any possible implementation, an implementation is further provided in which the mapping unit is specifically configured to:
obtain the text feature of the audio promotion information according to the audio feature, by using a pre-established correspondence between audio features and text features; and/or
obtain the text feature of the audio promotion information according to the original audio data, by using speech recognition technology.
In the aspect described above and any possible implementation, an implementation is further provided in which the presentation unit is specifically configured to:
calculate a degree of matching between a promotion attribute feature and at least one of the audio feature and the text feature, as a presentation score of the audio promotion information; and
obtain the presentation of the audio promotion information according to the presentation score.
In the aspect described above and any possible implementation, an implementation is further provided in which the promotion attribute feature includes at least one of the following features:
an attribute feature of the page on which the audio promotion information is presented;
an attribute feature of the website to which that page belongs; and
an attribute feature of the user to whom the audio promotion information is pushed.
Another aspect of the present invention provides a device, including:
one or more processors;
a memory; and
one or more programs stored in the memory which, when executed by the one or more processors, perform the following:
obtaining original audio data of the audio promotion information;
obtaining an audio feature of the audio promotion information according to the original audio data;
obtaining a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and
obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
Another aspect of the present invention provides a non-volatile computer storage medium storing one or more programs which, when executed by a device, cause the device to:
obtain original audio data of the audio promotion information;
obtain an audio feature of the audio promotion information according to the original audio data;
obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and
obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
As can be seen from the above technical solution, the embodiments of the present invention obtain an audio feature of the audio promotion information according to the obtained original audio data of the audio promotion information, and then obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature, so that the presentation of the audio promotion information can be obtained according to at least one of the audio feature and the text feature. Because the presentation no longer relies entirely on the text content attributes of the audio promotion information, and instead takes into account the audio feature, an attribute that describes the audio promotion information more accurately, the audio promotion information can be presented precisely, which improves its conversion rate.
In addition, with the technical solution provided by the present invention, the audio promotion information can be pushed automatically without manual participation, which can effectively reduce the cost of pushing audio promotion information.
In addition, the technical solution provided by the present invention is simple to operate, and can therefore effectively improve the efficiency of processing audio promotion information.
DRAWINGS
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for processing audio promotion information according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an apparatus for processing audio promotion information according to another embodiment of the present invention.
DETAILED DESCRIPTION
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
It should be noted that the terminal involved in the embodiments of the present invention may include, but is not limited to, a mobile phone, a personal digital assistant (PDA), a wireless handheld device, a tablet computer, a personal computer (PC), an MP3 player, an MP4 player, and a wearable device (for example, smart glasses, a smart watch, or a smart bracelet).
In addition, the term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible. For example, A and/or B may mean: A alone, both A and B, or B alone. The character "/" herein generally indicates an "or" relationship between the objects before and after it.
FIG. 1 is a schematic flowchart of a method for processing audio promotion information according to an embodiment of the present invention, as shown in FIG. 1.
101. Obtain original audio data of the audio promotion information.
102. Obtain an audio feature of the audio promotion information according to the original audio data.
103. Obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature.
104. Obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
The audio promotion information may refer to a complete audio file, which may be stored in advance in a storage device of the terminal. The audio promotion information may include audio files in any of the encoding formats known in the prior art, for example, a Moving Picture Experts Group (MPEG) Layer 3 (MP3) format audio file, a Windows Media Audio (WMA) format audio file, an Advanced Audio Coding (AAC) format audio file, or an APE format audio file, which is not particularly limited in this embodiment.
In a specific implementation, the storage device of the terminal may be a slow storage device, which may be the hard disk of a computer system, or the non-running memory (that is, the physical storage) of a mobile phone, for example, a read-only memory (ROM) or a memory card, which is not particularly limited in this embodiment.
In another specific implementation, the storage device of the terminal may be a fast storage device, which may be the memory of a computer system, or the running memory (that is, the system memory) of a mobile phone, for example, a random access memory (RAM), which is not particularly limited in this embodiment.
It should be noted that part or all of the execution body of 101 to 104 may be an application located in a local terminal, or a functional unit such as a plug-in or a software development kit (SDK) in an application located in the local terminal, or a processing engine in a server on the network side, or a distributed system on the network side, which is not particularly limited in this embodiment.
It can be understood that the application may be a native application (nativeApp) installed on the terminal, or a web application (webApp) running in a browser on the terminal, which is not particularly limited in this embodiment.
In this way, an audio feature of the audio promotion information is obtained according to the obtained original audio data of the audio promotion information, and a text feature of the audio promotion information is then obtained according to at least one of the original audio data and the audio feature, so that the presentation of the audio promotion information can be obtained according to at least one of the audio feature and the text feature. Because the presentation no longer relies entirely on the text content attributes of the audio promotion information, and instead takes into account the audio feature, an attribute that describes the audio promotion information more accurately, the audio promotion information can be presented precisely, which improves its conversion rate.
Optionally, in a possible implementation of this embodiment, in 101, the original audio data may be collected in real time.
Specifically, the sound signal of the audio promotion information may be collected and then converted into original audio data. For example, the sound signal is sampled, quantized, and encoded to obtain Pulse Code Modulation (PCM) data.
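As a rough illustration of the sampling, quantization, and encoding steps, the following Python sketch converts a synthetic sound signal into 16-bit PCM bytes. The 440 Hz test tone, the 16 kHz sample rate, the 16-bit little-endian encoding, and the function name `to_pcm16` are illustrative assumptions and are not specified by the embodiment.

```python
import math
import struct


def to_pcm16(signal, sample_rate, duration_s):
    """Sample a signal (a function of time returning values in [-1.0, 1.0]),
    quantize each sample to 16 bits, and encode the result as little-endian
    signed PCM bytes."""
    n_samples = int(sample_rate * duration_s)
    pcm = bytearray()
    for n in range(n_samples):
        value = signal(n / sample_rate)          # sampling
        value = max(-1.0, min(1.0, value))       # keep within full scale
        quantized = int(round(value * 32767))    # quantization to 16 bits
        pcm += struct.pack("<h", quantized)      # encoding
    return bytes(pcm)


# Hypothetical input: a 440 Hz sine tone standing in for the captured sound.
def tone(t):
    return math.sin(2 * math.pi * 440 * t)


pcm_bytes = to_pcm16(tone, sample_rate=16000, duration_s=0.01)
```

Each sample occupies two bytes here, so 10 ms of audio at 16 kHz yields 160 samples.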
Optionally, in a possible implementation of this embodiment, in 101, the audio promotion information may be obtained and decoded to obtain the original audio data.
In a specific implementation, the original audio data may be obtained by decoding the data blocks of the audio promotion information. The original audio data is a digital signal converted from an audio signal; for example, the audio signal is sampled, quantized, and encoded to obtain PCM data. For a detailed description of the decoding process, refer to the related content in the prior art; details are not repeated here.
In this embodiment, the original audio data obtained by performing 101 may be the original audio data corresponding to one channel. If the audio promotion information has multiple channels, the subsequent processing flow, that is, 102 to 104, may be performed separately on the original audio data corresponding to each channel.
In a specific implementation, the number of channels of the audio promotion information may be determined, and the data blocks of the audio promotion information may be decoded to obtain original audio data. The original audio data corresponding to each channel can then be obtained according to the number of channels and the original audio data.
For example, the frame header of the audio promotion information may be parsed to determine the number of channels of the audio promotion information.
As another example, the file header of the audio promotion information may be parsed to determine the number of channels of the audio promotion information.
As another example, other parts of the audio promotion information may be parsed to determine the number of channels of the audio promotion information, which is not particularly limited in this embodiment.
As another example, the number of channels of the audio promotion information may be obtained from a configuration file.
It can be understood that the two steps "determining the number of channels of the audio promotion information" and "decoding the data blocks of the audio promotion information to obtain original audio data" have no fixed order. The processing apparatus may perform the step of determining the number of channels first and the decoding step second, or perform the decoding step first and the step of determining the number of channels second, or perform both steps at the same time, which is not particularly limited in this embodiment.
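Obtaining the per-channel data from the channel count and the decoded data can be sketched as follows. The sketch assumes that the decoded PCM samples are interleaved by channel (a common layout, but one the embodiment does not prescribe), and the function name `split_channels` is hypothetical.

```python
def split_channels(samples, num_channels):
    """Split interleaved PCM samples, e.g. [L0, R0, L1, R1, ...],
    into one list of samples per channel."""
    if num_channels < 1:
        raise ValueError("num_channels must be at least 1")
    if len(samples) % num_channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    # Channel ch owns every num_channels-th sample, starting at index ch.
    return [samples[ch::num_channels] for ch in range(num_channels)]


# Hypothetical decoded stereo data: left and right samples alternate.
interleaved = [10, -10, 20, -20, 30, -30]
left, right = split_channels(interleaved, num_channels=2)
```

Each returned list can then be passed separately through 102 to 104.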
Optionally, in a possible implementation of this embodiment, in 102, the original audio data may be divided into frames to obtain at least one frame of data, and audio analysis may then be performed on each of the frames to obtain the audio feature of each frame of data.
In a specific implementation, the original audio data may be divided into frames at a preset time interval, for example, 20 ms, with partial data overlap between adjacent frames, for example, 50% overlap. In this way, at least one frame of data of the original audio data can be obtained.
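The framing step can be sketched as below, using the 20 ms frame length and 50% overlap given as examples in the text. The function name `split_into_frames`, the 8 kHz sample rate, and the choice to drop a trailing partial frame are assumptions made for illustration.

```python
def split_into_frames(samples, sample_rate, frame_ms=20, overlap=0.5):
    """Split samples into fixed-length frames in which adjacent frames
    share a fraction `overlap` of their samples."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = max(1, int(frame_len * (1 - overlap)))  # step between frame starts
    frames = []
    # Trailing samples that do not fill a whole frame are dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


# Hypothetical input: 100 ms of samples at 8 kHz.
samples = list(range(800))
frames = split_into_frames(samples, sample_rate=8000)
```

At 8 kHz a 20 ms frame holds 160 samples, and with 50% overlap each frame starts 80 samples after the previous one.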
In another specific implementation, the audio feature may include, but is not limited to, at least one of a time-domain audio feature of the original audio data and a frequency-domain audio feature of the original audio data, which is not particularly limited in this embodiment.
The time-domain audio feature of the original audio data may include at least one of the following parameters:
time-domain waveform, intensity, zero-crossing rate, Linear Prediction Coding (LPC) coefficients, Linear Prediction Cepstrum Coefficients (LPCC), Mel Frequency Cepstrum Coefficients (MFCC), Perceptual Linear Predictive (PLP) coefficients, beat, pitch, and tonality.
The frequency-domain audio feature of the original audio data may include, but is not limited to, spectrum information of the original audio data.
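Three of the simpler features named above (zero-crossing rate, intensity, and spectrum information) can be computed per frame roughly as follows. This is a minimal sketch: the root-mean-square definition of intensity, the naive discrete Fourier transform, and all function names are assumptions chosen for clarity, and a practical system would more likely use an FFT library.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms_intensity(frame):
    """Root-mean-square amplitude, used here as the intensity measure."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2, computed with a naive O(n^2) DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        bin_value = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(frame)
        )
        spectrum.append(abs(bin_value))
    return spectrum


# Hypothetical frame: a sine wave completing 4 cycles in 64 samples.
frame = [math.sin(2 * math.pi * 4 * i / 64) for i in range(64)]
spectrum = magnitude_spectrum(frame)
peak_bin = spectrum.index(max(spectrum))
```

For this test frame the spectral energy concentrates in the bin that matches the number of cycles in the frame.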
Optionally, in a possible implementation of this embodiment, in 103, the text feature of the audio promotion information may be obtained according to the audio feature, by using a pre-established correspondence between audio features and text features.
The text feature may be any descriptive content that can describe the audio promotion information, for example, that the audio promotion information has a fast rhythm, a slow rhythm, high sound quality, or low sound quality.
The sound quality of the audio promotion information refers to the fidelity of the original audio data after compression. A high-quality audio file can fully restore the original audio data without introducing any distortion, whereas a low-quality audio file cannot fully restore the original audio data and introduces partial distortion.
In a specific implementation, a beat threshold may be set in advance, for example, 100 beats per minute (BPM), as one form of the correspondence between audio features and text features. If the obtained beat is less than or equal to the beat threshold, it may be mapped to a text feature indicating a soothing mood; conversely, if the obtained beat is greater than the beat threshold, it may be mapped to a text feature indicating a cheerful mood.
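The beat-threshold rule translates directly into code. In the sketch below, the 100 BPM threshold comes from the example in the text, while the function name and the label strings are illustrative assumptions.

```python
BEAT_THRESHOLD_BPM = 100  # example threshold given in the text


def tempo_text_feature(bpm, threshold=BEAT_THRESHOLD_BPM):
    """Map a measured beat (in BPM) to a text feature: at or below the
    threshold is 'soothing', above it is 'cheerful'."""
    return "soothing" if bpm <= threshold else "cheerful"


labels = [tempo_text_feature(bpm) for bpm in (72, 100, 128)]
```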
In another specific implementation, a correspondence may also be preset between a time-domain waveform without clipping distortion and a text feature indicating high sound quality, and between a time-domain waveform with clipping distortion and a text feature indicating low sound quality. If the obtained time-domain waveform has no clipping distortion, it may be mapped to the text feature indicating high sound quality; conversely, if the obtained time-domain waveform has clipping distortion, it may be mapped to the text feature indicating low sound quality.
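The text does not say how clipping distortion is detected. One simple heuristic, assumed here purely for illustration, is to treat a run of consecutive samples at full scale as clipping; the 16-bit full-scale value, the run length of 3, and the function names are all hypothetical.

```python
def has_clipping(samples, full_scale=32767, min_run=3):
    """Heuristic (an assumption, not from the text): report clipping when
    at least `min_run` consecutive samples sit at or beyond full scale."""
    run = 0
    for s in samples:
        if abs(s) >= full_scale:
            run += 1
            if run >= min_run:
                return True
        else:
            run = 0
    return False


def quality_text_feature(samples):
    """Map the waveform to a sound-quality text feature."""
    if has_clipping(samples):
        return "low sound quality"
    return "high sound quality"


clean = [0, 1000, -2000, 3000]
clipped = [0, 32767, 32767, 32767, 100]
```

A single full-scale peak is not counted as clipping under this heuristic, since only a flattened run suggests that the waveform was cut off.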
In another specific implementation, a pre-specified training sample set may be used for training to construct a learning model that describes the correspondence between audio features and text features. The training samples in the training sample set may all be labeled known samples, in which case the known samples can be used directly for training to construct the learning model. Alternatively, some of the training samples may be labeled known samples and the rest unlabeled unknown samples. In that case, the known samples are first used for training to construct an initial learning model, and the initial learning model is then used to evaluate the unknown samples to obtain recognition results. The unknown samples are labeled according to their recognition results to form newly added known samples, and the newly added known samples are used together with the original known samples for retraining to construct a new learning model. This is repeated until the constructed learning model or the known samples satisfy the cutoff condition of the learning model, for example, the recognition accuracy is greater than or equal to a preset accuracy threshold, or the number of known samples is greater than or equal to a preset quantity threshold, which is not particularly limited in this embodiment.
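The label, evaluate, and retrain loop described above can be sketched as follows. The embodiment does not name a model type; the nearest-centroid classifier, the batch size, and every function name below are assumptions chosen to keep the example small. Only the quantity-based cutoff is shown, because an accuracy-based cutoff would additionally need a held-out labeled set.

```python
def train_centroids(labeled):
    """labeled: list of (feature_vector, label). Returns {label: centroid}."""
    sums, counts = {}, {}
    for vec, label in labeled:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [v / counts[label] for v in acc] for label, acc in sums.items()
    }


def predict(model, vec):
    """Return the label whose centroid is closest to vec."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def self_train(labeled, unlabeled, batch_size=2, max_labeled=None):
    """Train on known samples, label a batch of unknown samples with the
    current model, add them to the known samples, and retrain, until no
    unknown samples remain or the known-sample count reaches max_labeled."""
    labeled = list(labeled)
    pending = list(unlabeled)
    model = train_centroids(labeled)
    while pending and (max_labeled is None or len(labeled) < max_labeled):
        batch, pending = pending[:batch_size], pending[batch_size:]
        labeled += [(vec, predict(model, vec)) for vec in batch]
        model = train_centroids(labeled)
    return model, labeled


# Hypothetical data: one-dimensional feature vectors holding a beat in BPM.
known = [([60.0], "soothing"), ([70.0], "soothing"),
         ([130.0], "cheerful"), ([140.0], "cheerful")]
unknown = [[65.0], [150.0], [80.0], [120.0]]
model, all_labeled = self_train(known, unknown)
```

Because labels assigned by the model are fed back as training data, early mistakes can reinforce themselves; a practical system would typically accept only high-confidence predictions.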
Optionally, in a possible implementation of this embodiment, in 103, the text feature of the audio promotion information may be obtained according to the original audio data, by using speech recognition technology.
Any existing speech recognition technology may be used, as long as it can recognize specific keywords to serve as the text feature of the audio promotion information; details are not repeated here.
Optionally, in a possible implementation of this embodiment, in 103, the text feature of the audio promotion information may be obtained both according to the audio feature, by using the pre-established correspondence between audio features and text features, and according to the original audio data, by using speech recognition technology.
Specifically, the technical solutions of the two foregoing implementations may be combined to obtain the text feature of the audio promotion information. For details, refer to the related descriptions of those two implementations, which are not repeated here.
Optionally, in a possible implementation of this embodiment, in 104, a degree of matching between a promotion attribute feature and at least one of the audio feature and the text feature may be calculated as a presentation score of the audio promotion information, and the presentation of the audio promotion information may then be obtained according to the presentation score.
The promotion attribute feature may be described by the topic model of the current promotion. A topic model, as the name implies, is a way of modeling the implicit topics in content such as text and audio. For example, the word "Apple" carries both the topic of Apple the company and the topic of fruit. Specifically, the promotion attribute feature may include, but is not limited to, at least one of the following features:
an attribute feature of the page on which the audio promotion information is presented, such as a shopping page, a game page, or a news page;
an attribute feature of the website to which that page belongs, such as a shopping website, a game website, or a news website; and
an attribute feature of the user to whom the audio promotion information is pushed, such as teenagers or the elderly.
It is well known that Internet-based promotion information is the main profit model of the Internet industry, and traffic monetization has become a very important evaluation criterion for Internet commercial products. Taking advertising as an example, this evaluation criterion may adopt the Real Time Bidding (RTB) model. Compared with traditional purchasing, RTB is a bidding technique that uses third-party technology to evaluate and bid on each individual ad impression across millions of websites. Therefore, when calculating the degree of matching, in addition to the audio feature and the text feature of the audio promotion information, the bid of the audio promotion information also needs to be considered.
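The text leaves the matching formula open. The sketch below assumes, for illustration only, that promotion attribute features and the audio and text features are expressed as tags in a shared vocabulary, that the degree of matching is the Jaccard overlap of the two tag sets, and that the bid enters the presentation score as a multiplier. None of these choices, nor the function names, come from the embodiment.

```python
def matching_degree(promotion_attributes, item_features):
    """Jaccard overlap between two tag sets (an assumed matching measure)."""
    a, b = set(promotion_attributes), set(item_features)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def presentation_score(promotion_attributes, item_features, bid):
    """Assumed combination: degree of matching weighted by the bid."""
    return matching_degree(promotion_attributes, item_features) * bid


def rank_candidates(promotion_attributes, candidates):
    """candidates: list of (name, features, bid). Returns the names ordered
    from highest to lowest presentation score."""
    scored = [
        (presentation_score(promotion_attributes, features, bid), name)
        for name, features, bid in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical context (page, user) and candidate audio promotions.
context = {"game", "teenager", "cheerful"}
candidates = [
    ("ad_a", {"game", "cheerful", "high sound quality"}, 1.0),
    ("ad_b", {"soothing", "news"}, 3.0),
    ("ad_c", {"cheerful"}, 2.0),
]
ranking = rank_candidates(context, candidates)
```

In this example `ad_c` outranks `ad_a` despite a weaker match because its bid is higher, while `ad_b` ranks last because it shares no tag with the context regardless of its bid.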
In this embodiment, an audio feature of the audio promotion information is obtained according to the obtained original audio data of the audio promotion information, and a text feature of the audio promotion information is then obtained according to at least one of the original audio data and the audio feature, so that the presentation of the audio promotion information can be obtained according to at least one of the audio feature and the text feature. Because the presentation no longer relies entirely on the text content attributes of the audio promotion information, and instead takes into account the audio feature, an attribute that describes the audio promotion information more accurately, the audio promotion information can be presented precisely, which improves its conversion rate.
In addition, with the technical solution provided by the present invention, the audio promotion information can be pushed automatically without manual participation, which can effectively reduce the cost of pushing audio promotion information.
In addition, the technical solution provided by the present invention is simple to operate, and can therefore effectively improve the efficiency of processing audio promotion information.
It should be noted that, for brevity, the foregoing method embodiments are each described as a series of combined actions. However, a person skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. A person skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
FIG. 2 is a schematic structural diagram of an apparatus for processing audio promotion information according to another embodiment of the present invention. As shown in FIG. 2, the apparatus of this embodiment may include an acquiring unit 21, an audio unit 22, a mapping unit 23, and a presentation unit 24. The acquiring unit 21 is configured to acquire original audio data of audio promotion information; the audio unit 22 is configured to obtain an audio feature of the audio promotion information according to the original audio data; the mapping unit 23 is configured to obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and the presentation unit 24 is configured to obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
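The division into four units can be illustrated with a minimal Python sketch. It is not the implementation disclosed in this application: the class name `ProcessingApparatus`, the callable signatures, the toy energy feature, and the page labels are all assumptions introduced only to show how data flows from unit 21 to unit 24.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Set

Samples = Sequence[float]


@dataclass
class ProcessingApparatus:
    """Hypothetical wiring of the four units; each callable is a placeholder."""
    acquire: Callable[[], Samples]                                        # acquiring unit 21
    extract_audio_feature: Callable[[Samples], Dict[str, float]]          # audio unit 22
    map_to_text_feature: Callable[[Samples, Dict[str, float]], Set[str]]  # mapping unit 23
    decide_presentation: Callable[[Dict[str, float], Set[str]], str]      # presentation unit 24

    def run(self) -> str:
        raw = self.acquire()
        audio_feature = self.extract_audio_feature(raw)
        text_feature = self.map_to_text_feature(raw, audio_feature)
        return self.decide_presentation(audio_feature, text_feature)


def mean_energy(samples: Samples) -> Dict[str, float]:
    # Toy audio feature (an assumption): mean squared amplitude.
    return {"energy": sum(s * s for s in samples) / len(samples)}


apparatus = ProcessingApparatus(
    acquire=lambda: [0.0, 0.5, -0.5, 1.0],
    extract_audio_feature=mean_energy,
    map_to_text_feature=lambda raw, feat: {"energetic"} if feat["energy"] > 0.25 else {"calm"},
    decide_presentation=lambda feat, text: "game page" if "energetic" in text else "news page",
)
result = apparatus.run()
```

Because each unit is passed in as a callable, the same wiring could in principle sit in a local application, an SDK, or a server-side engine, which matches the deployment options the embodiment leaves open.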
It should be noted that part or all of the apparatus for processing audio promotion information provided by this embodiment may be an application located in a local terminal, or a functional unit such as a plug-in or a software development kit (SDK) in an application located in the local terminal, or a processing engine located in a server on the network side, or a distributed system located on the network side. This embodiment does not specifically limit this.
It can be understood that the application may be a native program (nativeApp) installed on the terminal, or a web program (webApp) of a browser on the terminal. This embodiment does not specifically limit this.
Optionally, in a possible implementation of this embodiment, the acquiring unit 21 may be specifically configured to collect the original audio data in real time.
Optionally, in a possible implementation of this embodiment, the acquiring unit 21 may be specifically configured to acquire the audio promotion information and decode the audio promotion information to obtain the original audio data.
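The decoding path can be sketched as follows. The application does not name an encoding format, so this example assumes, purely for illustration, that the audio promotion information arrives as 16-bit mono PCM in a WAV container; it uses only Python's standard `wave`, `io`, and `struct` modules. The function names `decode_wav` and `encode_wav` are hypothetical, and `encode_wav` exists only to build a sample input.

```python
import io
import struct
import wave
from typing import List, Tuple


def decode_wav(encoded: bytes) -> Tuple[int, List[int]]:
    """Decode 16-bit PCM WAV bytes into (sample_rate, samples).

    For multi-channel input the returned samples are interleaved.
    """
    with wave.open(io.BytesIO(encoded), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
        count = len(frames) // 2
        samples = list(struct.unpack(f"<{count}h", frames))
        return wav.getframerate(), samples


def encode_wav(samples: List[int], sample_rate: int) -> bytes:
    """Helper used only to build an example input (16-bit mono)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


encoded = encode_wav([0, 1000, -1000, 32767], 16000)
rate, raw_audio = decode_wav(encoded)
```

Compressed formats such as MP3 or AAC would need a third-party decoder, which is outside the scope of this sketch.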
Optionally, in a possible implementation of this embodiment, the mapping unit 23 may be specifically configured to obtain the text feature of the audio promotion information according to the audio feature, by using a pre-established correspondence between audio features and text features; and/or to obtain the text feature of the audio promotion information according to the original audio data, by using speech recognition technology.
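One way to picture the pre-established correspondence is a small lookup table searched by nearest neighbour. The application does not say how the correspondence is stored or queried, so the two-dimensional feature vectors, the table contents, and the function name `text_features_for` below are all assumptions. The speech recognition branch is not covered by this sketch.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical correspondence table: prototype audio feature vector -> text features.
CORRESPONDENCE: List[Tuple[Tuple[float, float], List[str]]] = [
    ((0.9, 0.8), ["rock", "energetic"]),
    ((0.2, 0.1), ["piano", "calm"]),
    ((0.6, 0.3), ["pop", "upbeat"]),
]


def text_features_for(audio_feature: Sequence[float]) -> List[str]:
    """Return the text features of the nearest prototype (Euclidean distance)."""
    _, labels = min(
        CORRESPONDENCE,
        key=lambda entry: math.dist(entry[0], audio_feature),
    )
    return labels


labels = text_features_for([0.85, 0.75])
```

A production system would more plausibly learn this mapping from labelled data; the fixed table only makes the idea of a correspondence concrete.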
Optionally, in a possible implementation of this embodiment, the presentation unit 24 may be specifically configured to calculate a degree of matching between a promotion attribute feature and at least one of the audio feature and the text feature, as a presentation score of the audio promotion information; and to obtain the presentation of the audio promotion information according to the presentation score.
Specifically, the promotion attribute feature may include, but is not limited to, at least one of the following features:
an attribute feature of the page on which the audio promotion information is presented, such as a shopping page, a game page, or a news page;
an attribute feature of the website to which the page presenting the audio promotion information belongs, such as a shopping website, a game website, or a news website; and
an attribute feature of the user to whom the audio promotion information is pushed, such as teenagers or the elderly.
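The presentation score described above can be sketched as a set-overlap computation. The application does not specify how the degree of matching is calculated, so Jaccard similarity is used here as a stand-in; the candidate pages, their attribute tags, the threshold value, and the function names are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def matching_degree(promotion_attributes: Set[str], content_features: Set[str]) -> float:
    """Jaccard similarity, used here as one possible degree of matching."""
    union = promotion_attributes | content_features
    if not union:
        return 0.0
    return len(promotion_attributes & content_features) / len(union)


def choose_presentation(
    candidates: Dict[str, Set[str]],
    content_features: Set[str],
    threshold: float = 0.2,
) -> Tuple[Optional[str], float]:
    """Score every candidate and return (best candidate, presentation score).

    Returns (None, score) when no candidate reaches the threshold.
    """
    best_name: Optional[str] = None
    best_score = 0.0
    for name, attributes in candidates.items():
        score = matching_degree(attributes, content_features)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Hypothetical promotion attribute features combining page and user attributes.
CANDIDATES = {
    "shopping page": {"shopping", "deals", "adults"},
    "game page": {"games", "energetic", "teenagers"},
    "news page": {"news", "calm", "seniors"},
}
decision = choose_presentation(CANDIDATES, {"energetic", "rock", "teenagers"})
```

Here the audio and text features are merged into a single tag set before scoring; weighting the two kinds of feature separately would be an equally valid reading of "at least one of the audio feature and the text feature".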
It should be noted that the method in the embodiment corresponding to FIG. 1 may be implemented by the apparatus for processing audio promotion information provided in this embodiment. For a detailed description, refer to the related content in the embodiment corresponding to FIG. 1; details are not repeated here.
In this embodiment, the audio unit obtains an audio feature of the audio promotion information from the original audio data acquired by the acquiring unit, and the mapping unit then obtains a text feature of the audio promotion information from at least one of the original audio data and the audio feature, so that the presentation unit can obtain the presentation of the audio promotion information from at least one of the audio feature and the text feature. Presentation no longer relies entirely on the text content attributes of the audio promotion information; it also takes into account the audio feature, an attribute that describes the audio promotion information more accurately. This ensures accurate presentation of the audio promotion information and thereby improves its conversion rate.
In addition, with the technical solution provided by the present invention, the audio promotion information can be pushed automatically without manual involvement, which can effectively reduce the cost of pushing the audio promotion information.
In addition, the technical solution provided by the present invention is simple to operate, which can effectively improve the efficiency of processing the audio promotion information.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes a number of instructions for causing a computer apparatus (which may be a personal computer, an audio processing engine, a network apparatus, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (12)

  1. A method for processing audio promotion information, comprising:
    acquiring original audio data of audio promotion information;
    obtaining an audio feature of the audio promotion information according to the original audio data;
    obtaining a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and
    obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
  2. The method according to claim 1, wherein acquiring the original audio data of the audio promotion information comprises:
    collecting the original audio data in real time; or
    acquiring the audio promotion information and decoding the audio promotion information to obtain the original audio data.
  3. The method according to claim 1, wherein obtaining the text feature of the audio promotion information according to at least one of the original audio data and the audio feature comprises:
    obtaining the text feature of the audio promotion information according to the audio feature, by using a pre-established correspondence between audio features and text features; and/or
    obtaining the text feature of the audio promotion information according to the original audio data, by using speech recognition technology.
  4. The method according to any one of claims 1 to 3, wherein obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature comprises:
    calculating a degree of matching between a promotion attribute feature and at least one of the audio feature and the text feature, as a presentation score of the audio promotion information; and
    obtaining the presentation of the audio promotion information according to the presentation score.
  5. The method according to claim 4, wherein the promotion attribute feature comprises at least one of the following features:
    an attribute feature of the page on which the audio promotion information is presented;
    an attribute feature of the website to which the page presenting the audio promotion information belongs; and
    an attribute feature of the user to whom the audio promotion information is pushed.
  6. An apparatus for processing audio promotion information, comprising:
    an acquiring unit, configured to acquire original audio data of audio promotion information;
    an audio unit, configured to obtain an audio feature of the audio promotion information according to the original audio data;
    a mapping unit, configured to obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and
    a presentation unit, configured to obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
  7. The apparatus according to claim 6, wherein the acquiring unit is specifically configured to:
    collect the original audio data in real time; or
    acquire the audio promotion information and decode the audio promotion information to obtain the original audio data.
  8. The apparatus according to claim 6, wherein the mapping unit is specifically configured to:
    obtain the text feature of the audio promotion information according to the audio feature, by using a pre-established correspondence between audio features and text features; and/or
    obtain the text feature of the audio promotion information according to the original audio data, by using speech recognition technology.
  9. The apparatus according to any one of claims 6 to 8, wherein the presentation unit is specifically configured to:
    calculate a degree of matching between a promotion attribute feature and at least one of the audio feature and the text feature, as a presentation score of the audio promotion information; and
    obtain the presentation of the audio promotion information according to the presentation score.
  10. The apparatus according to claim 9, wherein the promotion attribute feature comprises at least one of the following features:
    an attribute feature of the page on which the audio promotion information is presented;
    an attribute feature of the website to which the page presenting the audio promotion information belongs; and
    an attribute feature of the user to whom the audio promotion information is pushed.
  11. A device, comprising:
    one or more processors;
    a memory; and
    one or more programs stored in the memory, which, when executed by the one or more processors, perform the following:
    acquiring original audio data of audio promotion information;
    obtaining an audio feature of the audio promotion information according to the original audio data;
    obtaining a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and
    obtaining the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
  12. A non-volatile computer storage medium storing one or more programs which, when executed by a device, cause the device to:
    acquire original audio data of audio promotion information;
    obtain an audio feature of the audio promotion information according to the original audio data;
    obtain a text feature of the audio promotion information according to at least one of the original audio data and the audio feature; and
    obtain the presentation of the audio promotion information according to at least one of the audio feature and the text feature.
PCT/CN2015/087978 2015-05-12 2015-08-25 Method, apparatus and device for processing audio popularization information, and non-volatile computer storage medium WO2016179921A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510237646.6A CN104882146B (en) 2015-05-12 2015-05-12 The processing method and processing device of audio promotion message
CN201510237646.6 2015-05-12

Publications (1)

Publication Number Publication Date
WO2016179921A1 true WO2016179921A1 (en) 2016-11-17

Family

ID=53949614

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/087978 WO2016179921A1 (en) 2015-05-12 2015-08-25 Method, apparatus and device for processing audio popularization information, and non-volatile computer storage medium

Country Status (2)

Country Link
CN (1) CN104882146B (en)
WO (1) WO2016179921A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106919662B (en) * 2017-02-14 2021-08-31 复旦大学 Music identification method and system
CN107808305A (en) * 2017-09-28 2018-03-16 百度在线网络技术(北京)有限公司 Popularization fact implementation method, device and the storage medium of information flow promotion message

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1582444A (en) * 1999-12-30 2005-02-16 诺基亚有限公司 Selective media stream advertising technique
CN101034455A (en) * 2006-03-06 2007-09-12 腾讯科技(深圳)有限公司 Method and system for implementing online advertisement
WO2007133754A2 (en) * 2006-05-12 2007-11-22 Owl Multimedia, Inc. Method and system for music information retrieval
CN102254265A (en) * 2010-05-18 2011-11-23 北京首家通信技术有限公司 Rich media internet advertisement content matching and effect evaluation method
US20130339343A1 (en) * 2012-06-18 2013-12-19 Ian Paul Hierons Systems and methods to facilitate media search
CN103631802A (en) * 2012-08-24 2014-03-12 腾讯科技(深圳)有限公司 Song information searching method, device and corresponding server
CN103685520A (en) * 2013-12-13 2014-03-26 深圳Tcl新技术有限公司 Method and device for pushing songs on basis of voice recognition
CN103853778A (en) * 2012-12-04 2014-06-11 大陆汽车投资(上海)有限公司 Methods for updating music label information and pushing music, as well as corresponding device and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111818225A (en) * 2020-06-30 2020-10-23 深圳传音控股股份有限公司 Audio data processing method, terminal equipment and storage
CN112863518A (en) * 2021-01-29 2021-05-28 深圳前海微众银行股份有限公司 Method and device for voice data theme recognition
CN112863518B (en) * 2021-01-29 2024-01-09 深圳前海微众银行股份有限公司 Method and device for recognizing voice data subject

Also Published As

Publication number Publication date
CN104882146A (en) 2015-09-02
CN104882146B (en) 2018-05-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15891624

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 06.04.2018)

122 Ep: pct application non-entry in european phase

Ref document number: 15891624

Country of ref document: EP

Kind code of ref document: A1