CN109657094A - Audio-frequency processing method and terminal device - Google Patents
- Publication number
- CN109657094A (application CN201811423356.0A)
- Authority
- CN
- China
- Prior art keywords
- entry
- text
- audio
- searched
- keyword
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The present invention, applicable to the field of computer application technology, provides an audio processing method, a terminal device, and a computer-readable storage medium. The method comprises: obtaining an audio file to be processed; parsing the audio file to obtain original text information, the original text information including the entry text of the audio file and the playing time of each entry in the entry text; obtaining search text input by a user, determining the entry in the entry text that matches the search text, and the target playing time of that entry; and playing the audio corresponding to the entry according to the entry and the target playing time. By locating the position and playing time of the user-specified entry within the entry text and playing from there, the audio file can be presented flexibly in the way the user chooses, improving both the intelligence of audio playback and the user experience.
Description
Technical field
The invention belongs to the field of computer application technology, and in particular relates to an audio processing method, a terminal device, and a computer-readable storage medium.
Background
With the development of computer multimedia technology, there are now many kinds of audio and video playback software through which users can play music and watch videos, enriching everyday entertainment. However, existing audio playback software can only play audio files according to preset play modes; it cannot play audio according to a user's specific playback needs, and its flexibility is therefore low.
Summary of the invention
In view of this, embodiments of the present invention provide an audio processing method, a terminal device, and a computer-readable storage medium, to solve the prior-art problem that audio cannot be played according to a user's playback needs and that playback flexibility is low.
A first aspect of the embodiments of the present invention provides an audio processing method, comprising:
obtaining an audio file to be processed;
parsing the audio file to obtain original text information, the original text information including the entry text of the audio file and the playing time of each entry in the entry text;
obtaining search text input by a user, determining the entry in the entry text that matches the search text, and the target playing time of the entry;
playing the audio corresponding to the entry according to the entry and the target playing time.
A second aspect of the embodiments of the present invention provides a terminal device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, performs the following steps:
obtaining an audio file to be processed;
parsing the audio file to obtain original text information, the original text information including the entry text of the audio file and the playing time of each entry in the entry text;
obtaining search text input by a user, determining the entry in the entry text that matches the search text, and the target playing time of the entry;
playing the audio corresponding to the entry according to the entry and the target playing time.
A third aspect of the embodiments of the present invention provides a terminal device, comprising:
an acquiring unit, configured to obtain an audio file to be processed;
a parsing unit, configured to parse the audio file to obtain original text information, the original text information including the entry text of the audio file and the playing time of each entry in the entry text;
a matching unit, configured to obtain search text input by a user, determine the entry in the entry text that matches the search text, and the target playing time of the entry;
a playing unit, configured to play the audio corresponding to the entry according to the entry and the target playing time.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method of the first aspect above.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects:
An embodiment of the present invention obtains an audio file to be processed; parses the audio file to obtain original text information, the original text information including the entry text of the audio file and the playing time of each entry in the entry text; obtains search text input by a user, determines the entry in the entry text that matches the search text and the target playing time of that entry; and plays the audio corresponding to the entry according to the entry and the target playing time. By locating the position and playing time of the user-specified entry in the entry text and playing from there, the audio file can be presented flexibly in the way the user chooses, improving the intelligence of audio playback and the user experience.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art may obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the audio processing method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the audio processing method provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic diagram of the terminal device provided by Embodiment 3 of the present invention;
Fig. 4 is a schematic diagram of the terminal device provided by Embodiment 4 of the present invention.
Detailed description of the embodiments
In the following description, specific details such as particular system structures and techniques are set forth for the purpose of illustration rather than limitation, so that the embodiments of the invention may be thoroughly understood. However, it will be clear to those skilled in the art that the invention may also be practiced in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the invention.
To illustrate the technical solutions of the invention, specific embodiments are described below.
Referring to Fig. 1, Fig. 1 is a flowchart of the audio processing method provided by Embodiment 1 of the present invention. The execution subject of the audio processing method in this embodiment is a terminal. The terminal includes, but is not limited to, mobile terminals such as smartphones, tablet computers, and wearable devices, and may also be a desktop computer or the like. The audio processing method shown in the figure may comprise the following steps:
S101: Obtain an audio file to be processed.
Before the audio file is processed, it is first obtained. It may be obtained by wireless transmission, a wired network, or other means, which are not limited here. Audio files generally fall into two classes: sound files and Musical Instrument Digital Interface (MIDI) files. A sound file holds the original sound captured by a recording device and directly records the binary sampled data of the actual sound. A MIDI file is a sequence of musical performance instructions that is played through an audio output device or an electronic instrument connected to a computer. In this embodiment, the audio file is a MIDI file.
An audio file is an important kind of file in Internet multimedia. The formats of the audio file in this embodiment may include, but are not limited to: Waveform Audio File Format (WAVE), Audio Interchange File Format (AIFF), the AU audio format (Audio, AU), Moving Picture Experts Group (MPEG), RealAudio (RAM), and Musical Instrument Digital Interface (MIDI). The WAVE format is a sound file format developed by Microsoft; it conforms to the Resource Interchange File Format (RIFF) specification, is used to store the audio resources of the WINDOWS platform, and is supported by the WINDOWS platform and its applications. The WAVE format supports multiple audio bit depths, sampling frequencies, and channel counts, and is a popular sound file format on personal computers; its files are rather large, so it is mostly used to store short sound clips. The AU type is a compressed digital audio format and a common sound file format in web applications. The MPEG type represents the moving-picture compression standard; the audio file format here refers to the audio part of the MPEG standard, that is, the MPEG audio layer, which offers a relatively high ratio of sound quality to storage space. The RealAudio type is mainly used for real-time transmission of audio over low-rate wide-area networks; the sound quality obtained by the client varies with the connection rate. The MIDI type is the unified international standard for digital music and synthesized instruments; it defines how computer music programs, synthesizers, and other electronic devices exchange music signals, and also specifies the protocols between electronic instruments of different manufacturers and the cables and hardware that connect them to computers. It can be used to create digital audio that simulates instruments such as the cello, violin, and piano. Every audio file contains audio file information, which may include the original text information of the audio file, the file format, the number of frames, the playing time and end time of each piece of text, the text duration, and so on. For example, the audio file information of a song may include the lyrics, duration, composer, lyricist, singer, and so on.
S102: Parse the audio file to obtain original text information; the original text information includes the entry text of the audio file and the playing time of each entry in the entry text.
After the audio file to be processed is obtained, the audio file is parsed to obtain the original text information. Specifically, since audio files are of different types, each audio file's encoding and the corresponding parsing method also differ. In this scheme, parsing is performed according to the type of each audio file: by determining the format and encoding of the audio file, the original text information of the audio file can be parsed out according to that encoding. The original text information includes the entry text of the audio file and the playing time of each entry in the entry text.
Illustratively, the audio file of a song contains at least the song information and the lyric text. By reading and parsing the audio lyric file information, the original text information is obtained; for a song, the original text information is simply the song's lyrics and their playing times, where the times before and after each lyric line are the start time and end time of that line. The parsed text format is as follows:
32.34: having passed through the lane street Hou Gu: 35.89
35.89: it is oblique to look at the setting sun afar for you by green wall: 39.03
39.03: only because being casual quick glance: 42.79
42.79: upsetting my state of mind regardless of day and night: 46.07
46.07: wanting to be turned into village Zhou Biancheng butterfly: 49.57
49.57: driving high official position across numerous luxuriant leaf: 52.74
52.74: although be that hills and mountains are layer upon layer of: 56.56
56.56: not also being in the mood for flowing herein and even rest: 59.82
In this embodiment, the playing time can be used to indicate the moment at which each sentence, entry, or word in the audio text starts to play; that is, taking the 0th second of the audio file as the origin, the moment the first character of an entry is played is that entry's playing time. For example, in the example above, the playing time of the entry "although be that hills and mountains are layer upon layer of" is the 52.74th second, meaning the entry is played 52.74 seconds after the audio file starts playing. Further, to indicate the playing time of each word in the audio file more finely and accurately, the playing time of each individual word in the audio text information, i.e., the moment each word starts to play, may also be determined.
Among the many audio compression methods, the goal is to compress digital audio as much as possible while preserving sound quality, so that it occupies less storage space. MPEG compression is lossy, which means that compressing with this method is certain to lose part of the audio information; but thanks to the design of the compression method, this loss is hard to perceive. Several extremely complex and rigorous mathematical algorithms are used so that only the parts of the original audio that would be barely audible are lost, leaving more space for the important information. In this way, MPEG audio achieves roughly 12:1 compression while preserving quality, which is why it caught on. An MP3 (Moving Picture Experts Group Audio Layer III) file is broadly divided into three parts: the ID3V2 tag, the audio data frames, and the ID3V1 tag. ID3V2, at the start of the file, contains the author, composer, album, and similar information; its length is not fixed, and it extends the amount of information in ID3V1. The frame section, in the middle of the file, contains a series of frames whose number is determined by the file size and frame length. The length of each frame may or may not be fixed and is determined by the bit rate; each frame is further divided into a frame header and a data body. The frame header records the bit rate, sample rate, MPEG version, and other information, and the frames are mutually independent.
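The ID3V1 tag mentioned above has a well-known fixed 128-byte layout at the end of the file: the marker "TAG", then a 30-byte title, 30-byte artist, 30-byte album, 4-byte year, 30-byte comment, and a 1-byte genre. A minimal reading sketch (not from the patent):

```python
def parse_id3v1(data):
    """Parse the 128-byte ID3v1 tag at the end of an MP3 byte stream.

    Returns a dict of tag fields, or None if the 'TAG' marker is absent.
    Fixed-width text fields are padded with NULs or spaces, stripped here.
    """
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def field(start, size):
        return tag[start:start + size].rstrip(b"\x00 ").decode("latin-1")

    return {
        "title": field(3, 30),
        "artist": field(33, 30),
        "album": field(63, 30),
        "year": field(93, 4),
        "comment": field(97, 30),
        "genre": tag[127],  # index into the ID3v1 genre table
    }
```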
Illustratively, an MP3 file consists of frames, the frame being the smallest structural unit of the file. MPEG audio files are divided into three layers according to compression quality and coding complexity, corresponding to the MP1, MP2, and MP3 audio file types, and the coding layer is chosen according to purpose. The higher the MPEG audio coding layer, the more complex the encoder and the higher the compression ratio: MP1 and MP2 achieve ratios of 4:1 and 6:1-8:1 respectively, while MP3 reaches 10:1 to 12:1. One minute of CD-quality music requires 10 MB of storage uncompressed, but only about 1 MB after MP3 encoding. MP3 compresses the audio signal lossily; to reduce audio distortion, MP3 encoding first performs spectral analysis on the audio file, then filters out the noise level with a filter, then rearranges the remaining components through quantization, finally forming an MP3 file with a high compression ratio while allowing the compressed file to approach the sound of the original source during playback.
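Because each MP3 frame's length is determined by its bit rate and sample rate, a parser can step from frame header to frame header. For MPEG-1 Layer III the commonly cited relation is 144 × bitrate / sample rate (plus an optional padding byte), since each frame carries 1152 samples. A small illustration:

```python
def mp3_frame_length(bitrate_bps, sample_rate_hz, padding=0):
    """Frame length in bytes for an MPEG-1 Layer III frame.

    Each frame holds 1152 samples, which works out to
    144 * bitrate / sample_rate bytes, plus one optional padding byte.
    Integer division matches the truncation used by real decoders.
    """
    return 144 * bitrate_bps // sample_rate_hz + padding

# A typical 128 kbps / 44.1 kHz frame is 417 bytes (418 with padding).
print(mp3_frame_length(128000, 44100))  # 417
```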
WMA is a media file format defined by Microsoft, a form of streaming media. The first 16 bytes of every WMA file are fixed, the hexadecimal sequence "30 26 B2 75 8E 66 CF 11 A6 D9 00 AA 00 62 CE 6C", and are used to identify whether a file is a WMA file. The next 8 bytes are an integer, low byte first, indicating the size of the whole WMA file header; this header contains all the non-audio information such as tag information, and the audio information follows the header. Starting at offset 31 from the beginning of the file, the header houses many frames, including the standard tag information we need, extended tag information, WMA file control information, and so on. The frames are not of equal length, but each frame header is a fixed 24 bytes, of which the first 16 bytes name the frame and the last 8 bytes give the size of the frame. Since we only need to read and write tag information, and the tag information is stored in two frames, the standard tag frame and the extended tag frame, only these two frames need to be processed; the other frames can be skipped entirely using the frame lengths obtained. The standard tag frame contains only four items: song title, artist, copyright, and remarks. Its frame name is the hexadecimal "33 26 B2 75 8E 66 CF 11 A6 D9 00 AA 00 62 CE 6C"; after the 24-byte frame header come five 2-byte integers, the first four of which give the sizes of the song title, artist, copyright, and remarks respectively. After these 10 bytes, the content of the five items is stored in order. In a WMA file, all text is stored in wide-character encoding, and each string is followed by a 0 terminator. The number of items inside the extended tag frame is not fixed, and each item is likewise organized frame-style. The frame name of the extended tag frame is the hexadecimal "40 A4 D0 D2 07 E3 D2 11 97 F0 00 A0 C9 5E A8 50"; after the 24-byte frame header, a 2-byte integer first gives the total number of extended items, followed by the extended items themselves. Each extended item consists of an item name and a corresponding value: first a 2-byte integer gives the size of the item name, then the item name itself, then a 2-byte integer type flag, then a 2-byte integer giving the size of the value, and then the value. When the item name is WMFSDKVersion, the value indicates the version of the WMA file; when the item name is WM/AlbumTitle, the value is the album name; when the item name is WM/Genre, the value is the genre; similarly, the purpose of a value is easy to infer from the name of the extended item. The names and values of these extended items are almost all stored as wide-character strings. The type flag matters only for the two item names WM/TrackNumber and WM/Track: when the flag is 3, the value that follows is song information expressed as a 4-byte integer; when the flag is 0, the song information is expressed as an ordinary string.
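The fixed 16-byte magic and the 8-byte little-endian header size described above can be checked with a few lines. This is a hedged sketch of just those two fields, under the layout as described in the text, not a full WMA/ASF parser:

```python
import struct

# The fixed 16 leading bytes described above.
WMA_HEADER_GUID = bytes.fromhex("3026B2758E66CF11A6D900AA0062CE6C")

def read_wma_header_size(data):
    """Return the declared header size of a WMA byte stream.

    Checks the 16 fixed magic bytes, then reads the following 8 bytes
    as a little-endian ('low byte first') integer giving the size of
    the whole file header. Returns None if the magic does not match.
    """
    if len(data) < 24 or data[:16] != WMA_HEADER_GUID:
        return None
    (header_size,) = struct.unpack_from("<Q", data, 16)
    return header_size
```

Frames inside the header could then be walked the same way: read the 16-byte frame name and 8-byte frame size, and skip ahead by that size until the two tag frames are found.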
AMR (Adaptive Multi-Rate) is an audio compression coding format that optimizes speech coding and is dedicated to compressing speech efficiently. AMR audio is mainly used for audio compression on mobile devices; its compression ratio is very high but its sound quality is poor, so it is used mainly for voice audio and is not suitable for music audio, where sound-quality demands are higher. It uses eight different bit-rate codings. AMR has 16 coding mode values in total: 0-7 correspond to eight different coding modes, each with different sampling characteristics, and 8-15 are used for noise or reserved. The header magic of every AMR file is 6 bytes, and in this file each frame is 21 bytes. In the AMR file header, the head of the file differs between the mono and multichannel cases: in the mono case the file header contains only a magic number, while in the multichannel case it contains both the magic number and, after it, a 32-bit channel description field. In the multichannel case, the first 28 bits of the 32-bit channel description are all reserved and must be set to 0; the last 4 bits indicate the number of channels used. After the AMR file header come speech frame blocks that are consecutive in time; each frame block contains, for the several channels, several octet-aligned speech frames arranged in order starting from the first channel. Every speech frame begins with an 8-bit frame header, in which P is a padding bit that must be set to 0, and every frame is octet-aligned.
It should be noted that for different coding modes the audio frame size differs, and the bit rate also differs. The audio data frame size is calculated as follows: one AMR frame corresponds to 20 ms, so one second contains 50 frames of audio data. Because the bit rates differ, the data size of each frame also differs. If the bit rate is 12.2 kbps, the audio data per second is 12200 bits, so each frame holds 12200/50 = 244 bits = 30.5 bytes, which is rounded up to 31 bytes. Adding the one-byte frame header, the size of such a data frame is 32 bytes.
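The frame-size arithmetic above generalizes to any AMR mode. A small sketch of that calculation (the function name and defaults are illustrative, not from the patent):

```python
import math

def amr_frame_size(bitrate_bps, frame_ms=20, header_bytes=1):
    """Bytes per stored AMR frame at a given bit rate.

    One frame covers 20 ms, i.e. 50 frames per second, so each frame
    carries bitrate/50 bits; round up to whole bytes and add the
    one-byte frame header, matching the 12.2 kbps -> 32 byte example.
    """
    frames_per_second = 1000 // frame_ms
    payload_bytes = math.ceil(bitrate_bps / frames_per_second / 8)
    return payload_bytes + header_bytes

print(amr_frame_size(12200))  # 32, matching the worked example above
```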
S103: Obtain the search text input by the user, determine the entry in the entry text that matches the search text, and the target playing time of the entry.
After the entry text is parsed out of the original text information, the search text input by the user is obtained, the entry in the entry text that matches the search text is determined, and the target playing time of that entry is found. Illustratively, the entry text may be the lyrics of a song audio file, and the search text input by the user may be a word or a sentence; the word or sentence input by the user is used to look up the corresponding playing time in the original lyric file. In this embodiment, the object searched for is the entry input by the user, and each entry and its playing time are stored in the original text information; by looking up the entry corresponding to the search text and that entry's playing time in the original text information, the target playing time of the entry is determined.
In practical applications, the search text input by the user may be obtained by the user typing an entry into the window of the audio player, or by placing the cursor at some position in the entry text to determine the search text. When matching the search text against the entry text, a similarity coefficient may be computed to find the part of the entry text with the highest similarity to the search text, thereby determining the entry and the target playing time at which to play it.
Specifically, when computing the similarity coefficient between the search text and the entry text, the two objects may first be segmented to determine at least one entry in each. The similarity may be computed as a distance or a similarity coefficient between the two sequences, yielding a deviation value. Illustratively, the distance between the search text and the entry text may be computed by Euclidean distance, standardized Euclidean distance, Mahalanobis distance, Manhattan distance, Chebyshev distance, Minkowski distance, or Hamming distance; alternatively, the similarity coefficient between the search text and the entry text may be computed by cosine similarity, adjusted cosine similarity, the Pearson correlation coefficient, log-likelihood similarity, mutual information gain, or other similarity measures.
Illustratively, the deviation value between the search text and the entry text can be computed with the Jaccard similarity coefficient: J(X, Y) = |X ∩ Y| / |X ∪ Y|, where X and Y denote the quantized entry sets of the search text and the entry text respectively. By computing the Jaccard similarity coefficient between the search text and the entry text, the deviation value between the two can be determined.
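As a concrete sketch of this matching step, the following scores each timed entry against the query with the Jaccard coefficient and returns the best one. The whitespace tokenization and (start_time, text) entry shape are illustrative assumptions, not prescribed by the patent:

```python
def jaccard(a_tokens, b_tokens):
    """Jaccard similarity of two token collections: |X ∩ Y| / |X ∪ Y|."""
    x, y = set(a_tokens), set(b_tokens)
    if not x and not y:
        return 0.0
    return len(x & y) / len(x | y)

def best_entry(query, entries):
    """Return the (start_time, text) entry most similar to the query.

    `entries` is assumed to be a list of (start_time, text) pairs as
    produced by parsing the lyric text; tokens are split on whitespace
    for illustration.
    """
    return max(entries, key=lambda e: jaccard(query.split(), e[1].split()))
```

The start time of the returned entry is then the target playing time used in step S104.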
S104: Play the audio corresponding to the entry according to the entry and the target playing time.
After the entry matching the search text is determined in the entry text and the target playing time of the entry is determined, the audio corresponding to the entry is played according to the entry and the target playing time. Illustratively, in practical applications, after the user selects a certain lyric line or inputs a word, the position of those words in the entry text and the target playing time are determined from the entry, and playback starts from there.
Further, in this embodiment, after the entry and the target playing time are determined, the target entry may be played. The playback mode may be to loop the entry continuously, to play the entry only once, or to play the entry and then continue immediately with the audio that follows it. Other playback modes are also possible and are not limited in this embodiment.
In the above scheme, an audio file to be processed is obtained; the audio file is parsed to obtain original text information, the original text information including the entry text of the audio file and the playing time of each entry in the entry text; the search text input by the user is obtained, the entry in the entry text that matches the search text is determined, and the target playing time of the entry is found; and the audio corresponding to the entry is played according to the entry and the target playing time. By locating the position and playing time of the user-specified entry in the entry text and playing from there, the audio file can be presented flexibly in the way the user chooses, improving the intelligence of audio playback and the user experience.
Referring to Fig. 2, Fig. 2 is a flowchart of the audio processing method provided by Embodiment 2 of the present invention. The execution subject of the audio processing method in this embodiment is a terminal. The terminal includes, but is not limited to, mobile terminals such as smartphones, tablet computers, and wearable devices, and may also be a desktop computer or the like. The audio processing method shown in the figure may comprise the following steps:
S201: Obtain an audio file to be processed.
The implementation of S201 in this embodiment is identical to that of S101 in the embodiment corresponding to Fig. 1; for details, refer to the description of S101 in that embodiment, which is not repeated here.
S202: Parse the audio file to obtain original text information; the original text information includes the entry text of the audio file and the playing time of each entry in the entry text.
The implementation of S202 in this embodiment is identical to that of S102 in the embodiment corresponding to Fig. 1; for details, refer to the description of S102 in that embodiment, which is not repeated here.
S203: Obtain the search text input by the user, and extract at least one keyword from the search text.
After the audio file to be processed is obtained and the original text information in it is parsed out, the search text input by the user is obtained; the search text may be a word or a sentence. From the search text input by the user, at least one keyword is extracted.
Further, step S203 may specifically include steps S2031 to S2032:
S2031: obtaining the text to be searched input by the user, and preprocessing the text to be searched to obtain preprocessed text.
In practical applications, the text to be searched input by the user may be obtained in real time or at regular intervals, where the interval may be a period set by the user. Obtaining the user's input periodically ensures that the processing of the audio file remains controllable.
After the text to be searched is obtained, it is preprocessed. In this embodiment, preprocessing may include operations such as deleting redundant entries, correcting data, and filtering data, without limitation. Specifically, the text input or selected by the user often contains punctuation marks and other items that carry no lexical meaning; such redundant items can be deleted to improve audio processing efficiency. The input text also frequently contains wrongly written words; during preprocessing, these can be identified, the intended entry predicted, and the text corrected, improving the accuracy of the entry search. In many cases the user also inputs duplicate entries, which would increase the search time and error rate; identifying duplicate entries and filtering them out reduces the amount of entry data in the text to be searched and improves the efficiency and accuracy of the entry search.
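The preprocessing operations described for step S2031 — stripping punctuation and removing duplicate entries — can be sketched as follows. The patent does not specify an implementation, so the function name and the exact normalization rules are illustrative assumptions.

```python
import re

def preprocess_search_text(text: str) -> str:
    """Illustrative preprocessing for step S2031: strip punctuation and
    remove duplicate entries from the user's search text."""
    # Remove punctuation and other symbols that carry no lexical meaning.
    cleaned = re.sub(r"[^\w\s]", "", text, flags=re.UNICODE)
    # Filter out duplicate entries while preserving first-seen order.
    seen, kept = set(), []
    for token in cleaned.split():
        if token not in seen:
            seen.add(token)
            kept.append(token)
    return " ".join(kept)

print(preprocess_search_text("hello, world! hello again..."))  # hello world again
```

A spelling-correction step, which the text also mentions, would slot in between the two stages above but needs a dictionary or language model and is omitted here.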
S2032: segmenting the preprocessed text according to a pre-trained word segmentation model to obtain at least one keyword.
For any language, words are the most basic units; for a computer to understand and analyze natural language, the original long text must first undergo word segmentation. Word segmentation is the technique of automatically identifying the words in a text by computer. Languages such as English have a natural advantage here, since words are separated by spaces by default. Chinese word segmentation, by contrast, is considerably more complex and difficult: the minimum unit of Chinese text is the character, and there are no obvious separators between words.
In this embodiment, the text to be matched is first segmented and at least one keyword is extracted. Optionally, segmentation may be performed by a string-matching algorithm: dictionary data is loaded into a suitable data structure, the input text string is cut according to a certain scanning order and matching strategy, and each candidate string is matched against the words in the dictionary; if the match succeeds, a word is considered identified. Dictionary-based segmentation follows a clear approach with a simple principle and is easy to implement. Segmentation may also be based on understanding, where the algorithm imitates the human process of understanding a sentence and analyzes the text from semantic and grammatical perspectives; this requires a large amount of linguistic and grammatical information and knowledge to be prepared in advance.
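A minimal sketch of the dictionary-based string-matching approach just described, using forward maximum matching (scan left to right, always take the longest dictionary word). The toy dictionary and the fallback-to-one-character rule are illustrative assumptions, not part of the patent.

```python
def fmm_segment(text, dictionary, max_len=4):
    """Forward maximum matching: at each position take the longest
    dictionary word that matches; fall back to a single character."""
    words, i = [], 0
    while i < len(text):
        match = text[i]  # fallback: emit one character
        for length in range(min(max_len, len(text) - i), 1, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        words.append(match)
        i += len(match)
    return words

vocab = {"audio", "file", "play"}  # toy dictionary
print(fmm_segment("audioplayfile", vocab, max_len=5))  # ['audio', 'play', 'file']
```

Real systems layer disambiguation and out-of-vocabulary handling on top of this, which is what motivates the statistical model trained in the next paragraphs.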
The word segmentation model is trained in advance on the historical entry-text data. When training the model, a manually labeled training set is first obtained, in which the segmentation positions are annotated so that the position of each character within a word is known; the positions include the beginning, end, and middle of a word. Next, the training data is preprocessed and features are extracted. Non-target characters are filtered out: given a Chinese character, it is judged whether it is a punctuation mark, a digit, a Chinese numeral, or a letter; if it belongs to none of these, the position the character occupies within its word, as observed in the training corpus, is recorded with the tags B, M, E and S. Here B indicates that the character is the beginning of a word; M indicates that the character is in the middle of a word; E indicates that the character is the end of a word; and S indicates that the character forms a word on its own. The positions of each character are tallied by rule statistics to determine its position class. Illustratively, this scheme uses a threshold of 90%: as long as one position accounts for more than 90% of a character's occurrences, the character is considered to occupy that position within a word in most cases.
Next, the positions of key characters are predicted by the segmentation model. The features used by the segmentation model in this embodiment may include N-gram features, including but not limited to features such as ci, cici+1 and cici+2. Here ci denotes the character at offset i from the current character, with i = -2, -1, 0, 1, 2 (five features); cici+1 denotes the combination of two adjacent characters, with i = -2, -1, 0, 1 (four features); and cici+2 denotes the combination of two characters separated by one character, with i = -1, 0 (two features). The features may also include a character-repetition feature, which tests whether the current character repeats one of the preceding characters, using the function duplication(c0, ci) with i = -2, -1 (two features); and character-class features, which record the types of the three characters preceding the current character.
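The B/M/E/S tagging scheme and the windowed character features just described can be illustrated roughly as follows; the padding symbol and the exact feature names are assumptions made for the sketch.

```python
def bmes_tags(words):
    """Label each character of a segmented sentence with its position
    in its word: B(egin), M(iddle), E(nd), S(ingle)."""
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append((w, "S"))
        else:
            tags.append((w[0], "B"))
            tags.extend((c, "M") for c in w[1:-1])
            tags.append((w[-1], "E"))
    return tags

def ngram_features(chars, i, pad="#"):
    """N-gram features around position i: unigrams c-2..c2, the
    adjacent bigram c0c1, and the skip bigram c0c2."""
    def at(j):
        return chars[j] if 0 <= j < len(chars) else pad
    feats = {f"c{k}": at(i + k) for k in range(-2, 3)}
    feats["c0c1"] = at(i) + at(i + 1)
    feats["c0c2"] = at(i) + at(i + 2)
    return feats

print(bmes_tags(["中国", "人"]))  # [('中', 'B'), ('国', 'E'), ('人', 'S')]
```

These (character, tag) pairs and feature dictionaries are what a sequence-labeling segmenter would be trained on.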
Finally, in this embodiment the model parameters are learned by a grid traversal method. The indices traversed mainly include the learning rate, the number of training iterations, the batch size, and the termination error. The conditions under which training terminates include, but are not limited to, the number of iterations reaching a preset value or the error reaching a given threshold. When learning the parameters, the candidate values for each index include, but are not limited to, the following: three learning rates, 0.01, 0.02 and 0.03; three iteration counts, 500, 1000 and 2000; three batch sizes, 100, 200 and 500; and three termination errors, 0.05, 0.01 and 0.5. Training over the different combinations yields a set of models with distinct parameters, {params1, params2, params3, ..., paramsN}, where paramsN denotes one set of trained parameters. After the training parameters are obtained, the models formed by these parameters are tested, the accuracy of each is determined, and the model with the highest accuracy is selected as the word segmentation model. The preprocessed text to be searched is then segmented by this model to obtain at least one keyword representing the entries in the text to be searched, which is used for the entry search.
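The grid traversal over learning rate, iteration count, batch size, and termination error can be sketched as below. The `evaluate` function is a stand-in for actually training a segmentation model and measuring its test accuracy, which the patent leaves unspecified.

```python
from itertools import product

# Candidate values quoted in the text for each hyperparameter.
grid = {
    "learning_rate": [0.01, 0.02, 0.03],
    "iterations": [500, 1000, 2000],
    "batch_size": [100, 200, 500],
    "stop_error": [0.05, 0.01, 0.5],
}

def evaluate(params):
    """Placeholder for: train a segmentation model with these
    parameters and return its accuracy on a held-out test set."""
    return 1.0 - params["stop_error"] - params["learning_rate"]

def grid_search(grid):
    """Try every parameter combination and keep the most accurate."""
    keys = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

best, score = grid_search(grid)
print(best["learning_rate"], best["stop_error"])
```

With the placeholder scorer this simply prefers the smallest learning rate and termination error; a real run would rank the 81 combinations by measured accuracy.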
S204: performing fuzzy matching in the original text information according to the keyword, obtaining the target entry matched with the text to be searched.
After the keywords in the text to be searched have been determined, fuzzy matching is performed in the entry text of the original text information according to the keywords, obtaining the target entry matched with the text to be searched.
Further, step S204 may specifically include steps S2041 to S2044.
S2041: generating, according to the keyword, a first word vector corresponding to the keyword.
When the text to be matched consists of one or more keywords, the keywords can be searched for directly in the target text to determine the position and playing moment of the matching entry, so that the audio segment corresponding to that entry can be played. In practice, however, a song often contains repeated text, so the user may need to input at least two keywords; matching in the target text with a larger number of keywords locates the target entry more accurately. In this embodiment, the keywords are vectorized to obtain the first word vector corresponding to the keywords.
Assuming that the keywords of the text to be matched and of the original text information are mutually independent, the keywords in a text can be represented as vectors, which simplifies the complex relationships between them. A text is regarded as being composed of a group of mutually independent entries (T1, T2, T3, ..., Ti, ..., Tn); each entry Ti is assigned a weight wi according to its importance in the text. Treating (T1, T2, T3, ..., Tn) as the axes of an n-dimensional coordinate system, with w1, w2, ..., wi, ..., wn as the corresponding coordinate values, the orthogonal entry vectors obtained by decomposing (T1, T2, ..., Tn) constitute a text vector space.
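The vector-space construction above, in which each entry Ti receives a weight wi, can be sketched with simple term-frequency weights. The text says only that the weight reflects importance, so using raw term frequency (rather than, say, TF-IDF) is an assumption of this sketch.

```python
def to_vector(tokens, vocabulary):
    """Map a token list onto the axes (T1..Tn) of the text vector
    space, using term frequency as the weight wi for each axis."""
    return [tokens.count(term) for term in vocabulary]

vocab = ["night", "star", "moon"]            # the axes T1..Tn
a = to_vector(["star", "night"], vocab)       # text to be matched
b = to_vector(["night", "night", "moon"], vocab)  # one simple sentence
print(a, b)  # [1, 1, 0] [2, 0, 1]
```

Once both texts live in the same coordinate system, the matching degree of step S2043 is just a vector distance or similarity between `a` and `b`.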
S2042: dividing the original text information into simple sentences, and determining a second word vector for each simple sentence.
Based on the text vector space of step S2041, the text to be matched is reduced to a vector composed of keyword weights, a = (wa1, wa2, ..., wai, ..., wam)T; each simple sentence of the original text information is likewise reduced to a vector composed of keyword weights, b = (wb1, wb2, ..., wbi, ..., wbm)T.
S2043: calculating, according to the first word vector and each second word vector, the simple-sentence matching degree between each entry in the original text information and the keyword.
In practical applications, the simple-sentence matching degree between each entry in the original text information and the keyword can be calculated from either the distance or the similarity between the two vectors. Optionally, the distance between the text to be searched and the entry text may be calculated as the Euclidean distance, standardized Euclidean distance, Mahalanobis distance, Manhattan distance, Chebyshev distance, Minkowski distance, or Hamming distance; alternatively, the similarity coefficient between the text to be searched and the entry text may be calculated as the cosine similarity, adjusted cosine similarity, Pearson correlation coefficient, log-likelihood similarity, mutual information gain, or word-pair similarity coefficient.
The simple-sentence matching degree between the two is calculated from the keyword vectors as:
sim(a, b) = (a · b) / (|a| × |b|) = Σi=1..m (wai × wbi) / ( sqrt(Σi=1..m wai²) × sqrt(Σi=1..m wbi²) )
where a = (wa1, wa2, ..., wai, ..., wam)T denotes the vector of keyword weights to which the text to be matched is reduced, and b = (wb1, wb2, ..., wbi, ..., wbm)T denotes the vector of keyword weights to which a simple sentence of the original text information is reduced.
S2044: identifying the entry with the highest simple-sentence matching degree as the target entry.
The simple sentence of the original text information with the greatest keyword similarity to the text to be matched is identified as the target text corresponding to the text to be matched. If the audio file is a song, the lyric keyword search is generally performed over the audio lyric file to quickly locate the position of the matching lyric; clicking an item in the simple search result list jumps directly to the playing moment of the current lyric. The audio segment corresponding to the target entry is played, with the audio source and the audio lyric text scrolling and playing in unison; that is, the user can point at the corresponding line of the original lyrics with the mouse to control playback of the audio at the same time.
Further, a matching-degree threshold may also be preset, so that the simple sentences whose matching degree is greater than or equal to the threshold are screened out and presented to the user, who then selects the corresponding entry; this increases the user's control.
S205: determining the target playing moment of the target entry according to the target entry and the playing moment of each entry in the original text information.
The implementation of S205 in this embodiment is identical to that of S103 in the embodiment corresponding to Fig. 1; refer to the related description of S103 in the embodiment corresponding to Fig. 1, and details are not repeated here.
S206: playing the audio corresponding to the target entry according to the target entry and the target playing moment.
The implementation of S206 in this embodiment is identical to that of S104 in the embodiment corresponding to Fig. 1; refer to the related description of S104 in the embodiment corresponding to Fig. 1, and details are not repeated here.
S207: obtaining the target audio played at the current playing moment, and identifying the text content of the target audio.
While the audio is playing, the currently played target audio is obtained, i.e. the segment of audio within a preset time around the current playing moment. The speech in the target audio is recognized to obtain the text content of the target audio. Optionally, an audio recognition model may be built by analyzing historical audio files, and the text content of the target audio identified by this model. Specifically, building the audio recognition model involves two stages: a training stage and a recognition stage. In the training stage, the text content of audio files is identified manually, and the feature vectors of the audio files are stored in a template library as templates. In the recognition stage, the feature vector of the input audio file is compared for similarity against each template in the template library in turn, and the template with the highest similarity is output as the recognition result.
S208: correcting, according to the text content and the current playing moment, the target playing moment corresponding to the entry text matched with the text content recorded in the original text information.
After the text content corresponding to the currently played target audio has been recognized, the entry text and playing moments of the original text information are corrected according to the text content and the current playing moment. Illustratively, while a music player is playing a song it may also scroll the lyrics, but the lyrics the user hears may differ from the lyrics being scrolled, or the playing moments may be out of sync; in such cases the text content of the currently played audio must be recognized in order to correct the original lyrics.
While the audio file is playing, its information is modified, including the text content and the playing moments of the text, where the playing moment of the text can be refined down to the moment at which each sentence, or even each word, is played. When correcting the text content, the text content played at the current moment is obtained and subjected to speech recognition; the recognized text result is compared with the text in the audio file information, and if they are inconsistent, the playing moment and content in the audio file information are modified. When correcting the target playing moment corresponding to the entry text in the original text information, the audio playback progress component can be monitored by an Angular timed task so that the scrolling of the text stays synchronized with playback. Specifically, when correcting the playing moment of a text, the current moment at which that text is played is compared with the playing moment recorded for it in the original text information; if they are inconsistent, the inconsistent sentence, entry, or single word is determined, the current moment at which it is played is recorded as the correct playing moment, and the playing moment previously recorded for it in the original text information is modified. After all entries and playing moments in the audio file have been corrected, a history file is backed up in the format "lyric name + timestamp", and the modified audio file is saved, realizing error correction of the recognized text file during playback.
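The timestamp-correction loop just described (compare the recorded playing moment of a recognized line with the moment it was actually heard, and overwrite on mismatch) can be sketched as below. The mapping structure and the mismatch tolerance are assumptions; the patent does not fix a data format.

```python
def correct_playing_moments(original_info, recognized_text, current_moment, tolerance=0.5):
    """Illustrative correction for step S208. original_info maps entry
    text to its recorded playing moment in seconds. If the recognized
    line's recorded moment disagrees with the moment it was actually
    heard by more than `tolerance`, the actual moment is recorded as
    the correct playing moment."""
    corrected = dict(original_info)  # leave the caller's record intact
    recorded = corrected.get(recognized_text)
    if recorded is not None and abs(recorded - current_moment) > tolerance:
        corrected[recognized_text] = current_moment  # actual moment wins
    return corrected

lyrics = {"first line": 10.0, "second line": 20.0}
fixed = correct_playing_moments(lyrics, "second line", 23.2)
print(fixed["second line"])  # 23.2
```

A full implementation would also handle lines whose recognized text is absent from the record (the content-correction case) and write the "lyric name + timestamp" backup file mentioned above.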
In the above scheme, an audio file to be processed is acquired; the audio file is parsed to obtain original text information, the original text information including the entry text of the audio file and the playing moment at which each entry in the entry text is played; the text to be searched input by the user is obtained, and at least one keyword is extracted from the text to be searched; fuzzy matching is performed in the original text information according to the keyword, obtaining the target entry matched with the text to be searched; the target playing moment of the target entry is determined according to the target entry and the playing moment of each entry in the original text information; the audio corresponding to the target entry is played according to the target entry and the target playing moment; the target audio played at the current playing moment is obtained, and its text content is identified; and the target playing moment corresponding to the entry text matched with the text content recorded in the original text information is corrected according to the text content and the current playing moment. By directly playing the entry the user selects, playback events are determined by the user and the playback progress of the player is controlled quickly; and by calibrating and modifying the entry text during playback, the intelligence with which audio software plays audio and the user's experience are improved.
Referring to Fig. 3, Fig. 3 is a schematic diagram of a terminal device provided by Embodiment 3 of the present invention. Each unit included in the terminal device is used to execute the steps in the embodiments corresponding to Fig. 1 and Fig. 2; refer to the related descriptions in the embodiments corresponding to Fig. 1 and Fig. 2 for details. For ease of description, only the parts related to this embodiment are shown. The terminal device 300 of this embodiment includes:
an acquiring unit 301, configured to acquire an audio file to be processed;
a parsing unit 302, configured to parse the audio file to obtain original text information, the original text information including the entry text of the audio file and the playing moment at which each entry in the entry text is played;
a matching unit 303, configured to obtain the text to be searched input by the user, determine, in the entry text, the target entry matched with the text to be searched, and determine the target playing moment at which the target entry is played;
a playing unit 304, configured to play, according to the target entry and the target playing moment, the audio corresponding to the target entry.
Further, the matching unit 303 may include:
an extraction unit, configured to obtain the text to be searched input by the user, and extract at least one keyword from the text to be searched;
a search unit, configured to perform fuzzy matching in the original text information according to the keyword, obtaining the target entry matched with the text to be searched;
a determination unit, configured to determine the target playing moment of the target entry according to the target entry and the playing moment of each entry in the original text information.
Further, the extraction unit may include:
a preprocessing unit, configured to obtain the text to be searched input by the user, and preprocess the text to be searched to obtain preprocessed text;
a segmentation unit, configured to segment the preprocessed text according to the pre-trained word segmentation model, obtaining the at least one keyword.
Further, the search unit may include:
a first vector unit, configured to generate, according to the keyword, the first word vector corresponding to the keyword;
a second vector unit, configured to divide the original text information into simple sentences, and determine the second word vector of each simple sentence;
a calculation unit, configured to calculate, according to the first word vector and each second word vector, the simple-sentence matching degree between each entry in the original text information and the keyword;
a recognition unit, configured to identify the entry with the highest simple-sentence matching degree as the target entry.
Further, the terminal device may also include:
a content recognition unit, configured to obtain the target audio played at the current playing moment, and identify the text content of the target audio;
a correction unit, configured to correct, according to the text content and the current playing moment, the target playing moment corresponding to the entry text matched with the text content recorded in the original text information.
Fig. 4 is a schematic diagram of a terminal device provided by Embodiment 4 of the present invention. As shown in Fig. 4, the terminal device 4 of this embodiment includes a processor 40, a memory 41, and a computer program 42 stored in the memory 41 and runnable on the processor 40. When the processor 40 executes the computer program 42, the steps in each of the audio processing method embodiments above are realized, such as steps 101 to 103 shown in Fig. 1; alternatively, the functions of the modules/units in each of the device embodiments above are realized, such as the functions of units 301 to 303 shown in Fig. 3.
Illustratively, the computer program 42 can be divided into one or more modules/units, which are stored in the memory 41 and executed by the processor 40 to carry out the present invention. The one or more modules/units can be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used to describe the execution process of the computer program 42 in the terminal device 4.
The terminal device 4 can be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 40 and the memory 41. Those skilled in the art will understand that Fig. 4 is only an example of the terminal device 4 and does not constitute a limitation on the terminal device 4, which may include more or fewer components than illustrated, combine certain components, or use different components; for example, the terminal device may also include input/output devices, a network access device, a bus, and the like.
The processor 40 can be a central processing unit (Central Processing Unit, CPU), and can also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 41 can be an internal storage unit of the terminal device 4, such as a hard disk or internal memory of the terminal device 4. The memory 41 can also be an external storage device of the terminal device 4, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card, FC) equipped on the terminal device 4. Further, the memory 41 can include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used to store the computer program and the other programs and data required by the terminal device. The memory 41 can also be used to temporarily store data that has been output or will be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, the division into the functional units and modules above is only used as an example. In practical applications, the functions above can be allocated to different functional units and modules as needed; that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above. The functional units in the embodiments can be integrated in one processing unit, can exist alone physically, or two or more units can be integrated in one unit; the integrated unit can be realized either in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only used for the convenience of distinguishing them from each other and are not intended to limit the protection scope of this application. For the specific working process of the units and modules in the above system, reference can be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis; for parts that are not detailed or recorded in a certain embodiment, reference can be made to the related descriptions of the other embodiments.
The units described as separate members may or may not be physically separated, and the components shown as units may or may not be physical units; they can be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
If the integrated module/unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the present invention realizes all or part of the processes in the methods of the above embodiments, which can also be completed by instructing the relevant hardware through a computer program; the computer program can be stored in a computer-readable storage medium.
The embodiments described above are only used to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been explained in detail with reference to the foregoing embodiments, those skilled in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments, or replace some of the technical features with equivalents; these modifications or replacements do not make the essence of the corresponding technical solutions depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.
Claims (10)
1. An audio processing method, characterized by comprising:
acquiring an audio file to be processed;
parsing the audio file to obtain original text information, the original text information including the entry text of the audio file and the playing moment at which each entry in the entry text is played;
obtaining a text to be searched input by a user, determining, in the entry text, a target entry matched with the text to be searched, and determining a target playing moment at which the target entry is played;
playing, according to the target entry and the target playing moment, audio corresponding to the target entry.
2. The audio processing method according to claim 1, characterized in that the obtaining a text to be searched input by a user, determining, in the entry text, a target entry matched with the text to be searched, and determining a target playing moment at which the target entry is played comprises:
obtaining the text to be searched input by the user, and extracting at least one keyword from the text to be searched;
performing fuzzy matching in the original text information according to the keyword, obtaining the target entry matched with the text to be searched;
determining the target playing moment of the target entry according to the target entry and the playing moment of each entry in the original text information.
3. The audio processing method according to claim 2, characterized in that the obtaining the text to be searched input by the user and extracting at least one keyword from the text to be searched comprises:
obtaining the text to be searched input by the user, and preprocessing the text to be searched to obtain preprocessed text;
segmenting the preprocessed text according to a pre-trained word segmentation model to obtain the at least one keyword.
4. The audio processing method according to claim 2, characterized in that the performing fuzzy matching in the original text information according to the keyword, obtaining the target entry matched with the text to be searched comprises:
generating, according to the keyword, a first word vector corresponding to the keyword;
dividing the original text information into simple sentences, and determining a second word vector of each simple sentence;
calculating, according to the first word vector and each second word vector, the simple-sentence matching degree between each entry in the original text information and the keyword;
identifying the entry with the highest simple-sentence matching degree as the target entry.
5. The audio processing method according to any one of claims 1 to 4, further comprising, after playing the audio corresponding to the entry according to the entry and the target playing moment:
obtaining the target audio played at the current playing moment, and recognizing the text content of the target audio;
correcting, according to the text content and the current playing moment, the target playing moment recorded in the original text information for the entry text matching the text content.
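The correction step of this claim could be sketched like so, again assuming a hypothetical list-of-dicts layout for the parsed original text information, and using exact string equality in place of the content matching:

```python
def correct_timestamps(original_text, recognized_text, current_moment):
    """Snap the stored playing moment of the entry whose text matches the
    recognized audio content to the moment actually observed during playback."""
    for item in original_text:
        if item["entry"] == recognized_text:  # exact match stands in for content matching
            item["moment"] = current_moment
    return original_text

lyrics = [{"entry": "hello darkness", "moment": 12.0},
          {"entry": "my old friend", "moment": 17.5}]
correct_timestamps(lyrics, "my old friend", 18.2)
print(lyrics[1]["moment"])  # 18.2
```

This keeps the stored timestamps in step with real playback even when the initial parse drifted.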
6. A terminal device, comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
obtaining an audio file to be processed;
parsing the audio file to obtain original text information, the original text information comprising the entry text of the audio file and the playing moment of each entry in the entry text;
obtaining a text to be searched input by a user, determining, in the entry text, a target entry matched with the text to be searched, and the target playing moment at which the entry is played;
playing the audio corresponding to the entry according to the entry and the target playing moment.
7. The terminal device according to claim 6, wherein obtaining the text to be searched input by the user, determining, in the entry text, the entry matched with the text to be searched, and the target playing moment at which the entry is played comprises:
obtaining the text to be searched input by the user, and extracting at least one keyword from the text to be searched;
performing fuzzy matching in the original text information according to the keyword, to obtain the entry matched with the text to be searched;
determining the target playing moment of the entry according to the entry and the playing moment of each entry in the original text information.
8. The terminal device according to claim 6, wherein obtaining the text to be searched input by the user and extracting at least one keyword from the text to be searched comprises:
obtaining the text to be searched input by the user, and pre-processing the text to be searched to obtain pre-processed text;
segmenting the pre-processed text according to a pre-trained word segmentation model to obtain the at least one keyword.
9. A terminal device, comprising:
an acquiring unit, configured to obtain an audio file to be processed;
a parsing unit, configured to parse the audio file to obtain original text information, the original text information comprising the entry text of the audio file and the playing moment of each entry in the entry text;
a matching unit, configured to obtain a text to be searched input by a user, determine, in the entry text, the entry matched with the text to be searched, and the target playing moment at which the entry is played;
a playing unit, configured to play the audio corresponding to the entry according to the entry and the target playing moment.
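The four units of this claim map naturally onto a small class. In this sketch the parser and player back ends are injected callables (both hypothetical stand-ins for real speech-recognition and playback components), and a naive substring test stands in for the fuzzy matching of the earlier claims:

```python
class TerminalDevice:
    """Sketch of the claimed unit layout; back ends are assumed, not specified."""

    def __init__(self, parser, player):
        self.parser = parser      # resolution-unit back end: file -> [(entry, moment)]
        self.player = player      # playing-unit back end: (file, moment) -> None
        self.original_text = []

    def acquire(self, audio_file):            # acquiring unit
        self.audio_file = audio_file

    def parse(self):                          # parsing (resolution) unit
        self.original_text = self.parser(self.audio_file)

    def match(self, query):                   # matching unit
        for entry, moment in self.original_text:
            if query in entry:                # substring test stands in for fuzzy matching
                return entry, moment
        return None

    def play(self, query):                    # playing unit
        hit = self.match(query)
        if hit:
            self.player(self.audio_file, hit[1])
        return hit

dev = TerminalDevice(parser=lambda f: [("hello world", 1.5), ("goodbye", 9.0)],
                     player=lambda f, t: None)
dev.acquire("song.mp3")
dev.parse()
print(dev.match("world"))  # ('hello world', 1.5)
```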
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811423356.0A CN109657094A (en) | 2018-11-27 | 2018-11-27 | Audio-frequency processing method and terminal device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109657094A true CN109657094A (en) | 2019-04-19 |
Family
ID=66111614
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811423356.0A Pending CN109657094A (en) | 2018-11-27 | 2018-11-27 | Audio-frequency processing method and terminal device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109657094A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110750230A (en) * | 2019-09-30 | 2020-02-04 | 北京淇瑀信息科技有限公司 | Voice interface display method and device and electronic equipment |
CN111161738A (en) * | 2019-12-27 | 2020-05-15 | 苏州欧孚网络科技股份有限公司 | Voice file retrieval system and retrieval method thereof |
CN114115674A (en) * | 2022-01-26 | 2022-03-01 | 荣耀终端有限公司 | Method for positioning sound recording and document content, electronic equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130158992A1 (en) * | 2011-12-17 | 2013-06-20 | Hon Hai Precision Industry Co., Ltd. | Speech processing system and method |
CN104301771A (en) * | 2013-07-15 | 2015-01-21 | 中兴通讯股份有限公司 | Method and device for adjusting playing progress of video file |
CN104409087A (en) * | 2014-11-18 | 2015-03-11 | 广东欧珀移动通信有限公司 | Method and system of playing song documents |
CN107071542A (en) * | 2017-04-18 | 2017-08-18 | 百度在线网络技术(北京)有限公司 | Video segment player method and device |
CN107798143A (en) * | 2017-11-24 | 2018-03-13 | 珠海市魅族科技有限公司 | A kind of information search method, device, terminal and readable storage medium storing program for executing |
CN108399150A (en) * | 2018-02-07 | 2018-08-14 | 深圳壹账通智能科技有限公司 | Text handling method, device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108304375B (en) | Information identification method and equipment, storage medium and terminal thereof | |
WO2023065544A1 (en) | Intention classification method and apparatus, electronic device, and computer-readable storage medium | |
Kotti et al. | Speaker segmentation and clustering | |
CN108052499B (en) | Text error correction method and device based on artificial intelligence and computer readable medium | |
US7162482B1 (en) | Information retrieval engine | |
US7634407B2 (en) | Method and apparatus for indexing speech | |
CN107741928A (en) | A kind of method to text error correction after speech recognition based on field identification | |
CN108460011B (en) | Entity concept labeling method and system | |
US20080208891A1 (en) | System and methods for recognizing sound and music signals in high noise and distortion | |
CN113836277A (en) | Machine learning system for digital assistant | |
US20050240413A1 (en) | Information processing apparatus and method and program for controlling the same | |
JP2009508156A (en) | Music analysis | |
CN109657094A (en) | Audio-frequency processing method and terminal device | |
CN107247768A (en) | Method for ordering song by voice, device, terminal and storage medium | |
CN110222225A (en) | The abstraction generating method and device of GRU codec training method, audio | |
KR20170136200A (en) | Method and system for generating playlist using sound source content and meta information | |
CN111414513A (en) | Music genre classification method and device and storage medium | |
US20220414338A1 (en) | Topical vector-quantized variational autoencoders for extractive summarization of video transcripts | |
CN113407775B (en) | Video searching method and device and electronic equipment | |
CN110516109B (en) | Music label association method and device and storage medium | |
Hong et al. | Content-based video-music retrieval using soft intra-modal structure constraint | |
Gupta et al. | Songs recommendation using context-based semantic similarity between lyrics | |
CN115359785A (en) | Audio recognition method and device, computer equipment and computer-readable storage medium | |
Owen et al. | Computed synchronization for multimedia applications | |
KR20190009821A (en) | Method and system for generating playlist using sound source content and meta information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||