CN102236686A - Voice sectional song search method - Google Patents

Publication number
CN102236686A
Authority
CN
China
Prior art keywords
client
song
voice
user
server end
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010101692232A
Other languages
Chinese (zh)
Inventor
李霄寒
黄伟
蔡洪滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shengle Information Technolpogy Shanghai Co Ltd
Original Assignee
Shengle Information Technolpogy Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2010-05-07
Filing date
2010-05-07
Publication date
2011-11-09

Abstract

The invention discloses a voice sectional song search method, which is implemented through interaction between a client and a server, with a song database established on the server. During a search, the client prompts the user to speak the song information to be searched in sections and transmits the recorded speech to the server once the user has finished speaking. After receiving the speech signal, the server automatically recognizes the text corresponding to it, searches the song database in layers, and transmits the search result to the client. Finally, the client presents the song search result to the user. Because a sensible voice interaction flow is designed on the client, the server can decompose one very large song search space into a combination of several smaller search spaces. This improves search efficiency and automatic speech recognition accuracy in a very large search space, improves user experience, and saves hardware cost for service providers.

Description

Voice sectional song search method
Technical field
The present invention relates to a voice sectional song search method.
Background
Song download is currently a very active line of business on the mobile Internet. The service takes the search criteria that a user submits through a client and finds matching songs in the server's song library for the user to download. In the most common case, the user enters a singer name or a song title on the terminal, and the server returns the song or list of songs that corresponds to that singer or title.
On a mobile terminal, the physical size of the device usually makes text input inefficient, so entering a complete song title with a keypad or touchscreen can be very time-consuming. Moreover, because phonetic input methods dominate on mobile terminals, users very often enter a homophone written with the wrong characters, for example a same-sounding misspelling of Wang Fei's "Legend", and this may directly prevent some server-side search algorithms from finding the song. A better solution is speech recognition: the user simply speaks the song title, the user terminal or the server uses a speech recognition algorithm to find the target song or list of songs in the song title database, and the actual song is then fetched from the song library and returned to the user. This improves input efficiency, avoids the homophone problem, improves user experience to some extent, and makes users more willing to use the song download service.
However, speech recognition has an inherent weakness. Like human hearing, it is not perfectly accurate, and the accuracy of machine speech recognition falls as the database grows. As the song title database grows, so does the number of similar-sounding titles, such as Pan Weibai's "Favourable Turn" and Wang Fei's "Legend" (the original Chinese titles sound alike), and this seriously reduces recognition accuracy. To cover the needs of most users, even a medium-sized song database usually contains hundreds of thousands of distinct songs, and recognizing speech over so large a range is a major challenge for accuracy. In addition, speech recognition algorithms are usually computation-intensive, and their computational load grows rapidly with database size, which strains the server's computing capacity. To serve many concurrent users promptly, the service provider often has to provide more hardware resources, which raises cost.
Summary of the invention
The technical problem to be solved by the present invention is to provide a voice sectional song search method that improves search efficiency and automatic speech recognition accuracy and saves hardware cost.
To solve this problem, the voice sectional song search method of the present invention is implemented through interaction between a client and a server, the server holding a song database, and comprises the following steps:
(1) the client prompts the user to speak the song information to be searched in segments, and sends the user's speech to the server;
(2) the server receives the speech signal sent by the client;
(3) the server performs automatic speech recognition on the received speech signal and recognizes the text corresponding to it;
(4) the server starts a search engine and searches the song database hierarchically for the song specified by the user;
(5) the server sends the song search result to the client;
(6) the client receives the search result sent by the server;
(7) the client presents the song search result to the user.
Step (1) comprises the following sub-steps:
(1) the client prompts the user to speak the first speech segment;
(2) the client records and, at the same time, detects the speech endpoint;
(3) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (4); if not, it returns to sub-step (2);
(4) the client prompts the user to speak the second speech segment;
(5) the client records and, at the same time, detects the speech endpoint;
(6) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (7); if not, it returns to sub-step (5);
(7) the client sends both speech segments to the server together.
Alternatively, step (1) may comprise the following sub-steps:
(1) the client prompts the user to speak the first speech segment;
(2) the client records and, at the same time, detects the speech endpoint;
(3) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (4); if not, it returns to sub-step (2);
(4) the client prompts the user to speak the second speech segment and, at the same time, sends the first speech segment to the server;
(5) the client records and, at the same time, detects the speech endpoint;
(6) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (7); if not, it returns to sub-step (5);
(7) the client sends the second speech segment to the server.
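The two variants of step (1) differ only in when the recorded segments leave the client. The following is a minimal sketch of that difference, not the patent's implementation: the function names, the `show` and `send` callbacks, and the use of strings in place of audio frames are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List

Segment = List[str]  # one recorded speech segment; strings stand in for audio frames


def record_segment(mic: Iterator[str], is_end: Callable[[Segment], bool]) -> Segment:
    """Record frames until the endpoint detector reports that the user has finished."""
    recorded: Segment = []
    for frame in mic:
        recorded.append(frame)
        if is_end(recorded):
            break
    return recorded


def capture_batched(prompts: List[str], mic: Iterator[str],
                    is_end: Callable[[Segment], bool],
                    show: Callable[[str], None],
                    send: Callable[[List[Segment]], None]) -> None:
    """First variant: record every segment, then send them all in one message."""
    segments: List[Segment] = []
    for prompt in prompts:
        show(prompt)
        segments.append(record_segment(mic, is_end))
    send(segments)


def capture_pipelined(prompts: List[str], mic: Iterator[str],
                      is_end: Callable[[Segment], bool],
                      show: Callable[[str], None],
                      send: Callable[[List[Segment]], None]) -> None:
    """Second variant: send each segment as soon as its endpoint is detected.

    A real client would send in the background while showing the next prompt;
    this sketch does the two one after the other.
    """
    for prompt in prompts:
        show(prompt)
        send([record_segment(mic, is_end)])
```

With the second variant the server can start recognizing the first segment while the user is still speaking the second one.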
By designing a sensible voice interaction flow on the client and having the user enter speech in segments, the method lets the server search hierarchically: one very large song search space is decomposed into a combination of several smaller search spaces, which narrows the search range. Compared with existing song search methods, the method therefore shortens search time, improves automatic speech recognition accuracy and improves user experience, while also reducing server load and saving hardware cost for the service provider.
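The size of the reduction can be estimated with simple arithmetic. The sketch below compares a flat search over every song with a two-layer search that first matches the singer name and then searches only the songs of a few shortlisted singers. The catalogue figures in the usage note are illustrative assumptions, not numbers taken from the patent.

```python
def flat_candidates(total_songs: int) -> int:
    """Entries scored when the spoken title is matched against every song."""
    return total_songs


def layered_candidates(num_singers: int, shortlist_size: int, songs_per_singer: int) -> int:
    """Entries scored by a two-layer search.

    Layer 1 scores every singer name; layer 2 scores only the songs of the
    shortlisted singers (assuming each has about `songs_per_singer` songs).
    """
    return num_singers + shortlist_size * songs_per_singer
```

For a hypothetical catalogue of 300,000 songs by 10,000 singers (about 30 songs each) and a shortlist of 3 singers, `flat_candidates(300_000)` gives 300,000 entries while `layered_candidates(10_000, 3, 30)` gives 10,090, roughly a thirtyfold reduction.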
Description of drawings
The invention is described in further detail below with reference to the drawings and an embodiment:
Fig. 1 is a flow chart of an embodiment of the invention;
Fig. 2 is a schematic diagram of an embodiment of the invention.
Embodiment
To give a more concrete understanding of the technical content, features and effects of the invention, an embodiment is described in detail below with reference to the drawings:
Fig. 1 and Fig. 2 are, respectively, the flow chart and the schematic diagram of an embodiment of the invention. In this embodiment a song is searched for by singer name and song title, and the server holds a singer name database and a song database organized by singer name. As shown in Fig. 1 and Fig. 2, when the user wants to search for the song titled "Legend" sung by Wang Fei, the search proceeds as follows:
First, the client prompts the user to say the singer of the song to be searched, that is, the singer name. The user says "Wang Fei". While recording, the client runs a voice activity detection algorithm to detect the end of the speech signal in real time and so judge whether the user has finished this segment. Most common voice activity detection algorithms today locate the endpoint from time-domain features such as short-time energy and short-time zero-crossing rate. This decision needs no information from the server and is therefore fast. If the client judges that the user has not finished, it keeps recording and detecting the endpoint. If it judges that the user has finished saying the singer name, it prompts the user to go on to the song title and, at the same time, sends the singer-name speech signal to the server over the network.
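Endpoint detection from short-time energy and zero-crossing rate can be sketched as follows. This is one simple rule of the general kind the text mentions, not the patent's specific algorithm; the threshold values, the `hangover` frame count and all function names are assumptions chosen for illustration.

```python
from typing import List, Sequence


def short_time_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_speech(frame: Sequence[float], high_energy: float = 0.01,
              low_energy: float = 0.001, zcr_threshold: float = 0.25) -> bool:
    """Treat a frame as speech if it is loud, or quieter but noisy like a consonant."""
    energy = short_time_energy(frame)
    if energy > high_energy:
        return True
    return energy > low_energy and zero_crossing_rate(frame) > zcr_threshold


def endpoint_reached(frames: List[Sequence[float]], hangover: int = 3) -> bool:
    """True once speech has been heard and the last `hangover` frames are all non-speech."""
    flags = [is_speech(f) for f in frames]
    return any(flags[:-hangover]) and not any(flags[-hangover:])
```

The client would call `endpoint_reached` on the frames recorded so far after each new frame, and move on to the next prompt as soon as it returns true. Requiring several trailing non-speech frames keeps short pauses inside a name from ending the segment early.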
After receiving the singer-name speech signal, the server first recognizes the corresponding text with a speech recognition algorithm, then starts the singer name search engine and looks in the singer name database for singer names close to the name "Wang Fei" sent by the client. The singer name database is far smaller than the song database, so a higher recognition rate can be expected. In this round the system selects the few results closest to the user's speech and puts them in a singer name candidate list. In the embodiment there are three: "Wang Fei", "Dou Wei" and "Wang Feng".
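Building the candidate list amounts to a nearest-match query over the singer name database. The sketch below uses string similarity on romanized names (Python's `difflib`) as a stand-in for the acoustic or phonetic scoring a real recognizer would use; that substitution, the function name and the two extra database entries are assumptions for illustration only.

```python
import difflib
from typing import List


def singer_candidates(recognized_name: str, singer_names: List[str], k: int = 3) -> List[str]:
    """Return the k singer names most similar to the recognized name, best first.

    cutoff=0.0 keeps every name in the ranking so that exactly k candidates
    come back whenever the database holds at least k names.
    """
    return difflib.get_close_matches(recognized_name, singer_names, n=k, cutoff=0.0)


# Illustrative database: the three names from the embodiment plus two filler entries.
SINGERS = ["wang fei", "dou wei", "wang feng", "zhou jielun", "liu dehua"]
```

Calling `singer_candidates("wang fei", SINGERS)` returns `["wang fei", "wang feng", "dou wei"]`, the same three names as the candidate list in the embodiment, though ordered by this sketch's own similarity score.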
Meanwhile, the client prompts the user to go on to the song title. The user says the title "Legend". While recording, the client uses the voice activity detection algorithm to judge whether the user has finished. If the client judges that the user has not finished, it keeps recording and detecting the endpoint. If it judges that the user has finished saying the song title, it sends the song-title speech signal to the server over the network.
After receiving the song-title speech signal, the server first recognizes the corresponding text with a speech recognition algorithm, then starts the song title search engine. In the song database organized by singer name, it searches only the song lists of the three singers in the singer name candidate list, "Wang Fei", "Dou Wei" and "Wang Feng", for the titles closest to "Legend". It obtains, for example, "Wang Fei / Legend", "Dou Wei / Outside the Window" and "Wang Feng / Loyalty", and sends this search result to the client over the network.
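The two server-side rounds can be combined into one layered search. The sketch below is a simplified illustration under stated assumptions: `difflib` string similarity again stands in for speech-based scoring, the catalogue is a small in-memory dictionary, and the function name, the "other song" entry and the "liu dehua" entry are hypothetical.

```python
import difflib
from typing import Dict, List, Tuple


def layered_search(singer_query: str, title_query: str,
                   songs_by_singer: Dict[str, List[str]],
                   shortlist: int = 3) -> List[Tuple[str, str]]:
    """Two-layer search: shortlist singers first, then match titles within the shortlist."""
    # Layer 1: the few singer names closest to the recognized singer name.
    singers = difflib.get_close_matches(
        singer_query, list(songs_by_singer), n=shortlist, cutoff=0.0)
    results: List[Tuple[str, str]] = []
    for singer in singers:
        # Layer 2: the closest title among this singer's songs only.
        best = difflib.get_close_matches(
            title_query, songs_by_singer[singer], n=1, cutoff=0.0)
        results.extend((singer, title) for title in best)
    return results


# Illustrative catalogue organized by singer name.
CATALOGUE = {
    "wang fei": ["legend", "other song"],
    "dou wei": ["outside the window"],
    "wang feng": ["loyalty"],
    "liu dehua": ["legend"],
}
```

Here `layered_search("wang fei", "legend", CATALOGUE)` returns one best title for each shortlisted singer, with ("wang fei", "legend") first. The identically titled song under "liu dehua" is never scored, because that singer did not make the shortlist, which is the pruning effect the method relies on.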
The client receives the song search result sent by the server and shows it to the user as a song list. The user can then select a song from the list to download as needed, for example "Wang Fei / Legend".
In summary, through the interaction between client and server described above, the voice sectional song search method of the present invention solves the problem that the automatic speech recognition rate drops quickly as the search space grows, while also improving search efficiency, saving hardware cost and ensuring a good user experience.

Claims (3)

1. A voice sectional song search method, implemented through interaction between a client and a server, the server holding a song database, characterized in that it comprises the following steps:
(1) the client prompts the user to speak the song information to be searched in segments, and sends the user's speech to the server;
(2) the server receives the speech signal sent by the client;
(3) the server performs automatic speech recognition on the received speech signal and recognizes the text corresponding to it;
(4) the server starts a search engine and searches the song database hierarchically for the song specified by the user;
(5) the server sends the song search result to the client;
(6) the client receives the search result sent by the server;
(7) the client presents the song search result to the user.
2. The voice sectional song search method as claimed in claim 1, characterized in that step (1) comprises the following sub-steps:
(1) the client prompts the user to speak the first speech segment;
(2) the client records and, at the same time, detects the speech endpoint;
(3) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (4); if not, it returns to sub-step (2);
(4) the client prompts the user to speak the second speech segment;
(5) the client records and, at the same time, detects the speech endpoint;
(6) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (7); if not, it returns to sub-step (5);
(7) the client sends both speech segments to the server together.
3. The voice sectional song search method as claimed in claim 1, characterized in that step (1) comprises the following sub-steps:
(1) the client prompts the user to speak the first speech segment;
(2) the client records and, at the same time, detects the speech endpoint;
(3) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (4); if not, it returns to sub-step (2);
(4) the client prompts the user to speak the second speech segment and, at the same time, sends the first speech segment to the server;
(5) the client records and, at the same time, detects the speech endpoint;
(6) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (7); if not, it returns to sub-step (5);
(7) the client sends the second speech segment to the server.
CN2010101692232A 2010-05-07 2010-05-07 Voice sectional song search method Pending CN102236686A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101692232A CN102236686A (en) 2010-05-07 2010-05-07 Voice sectional song search method

Publications (1)

Publication Number Publication Date
CN102236686A true CN102236686A (en) 2011-11-09

Family

ID=44887341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101692232A Pending CN102236686A (en) 2010-05-07 2010-05-07 Voice sectional song search method

Country Status (1)

Country Link
CN (1) CN102236686A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20111109