CN106791916A

CN106791916A - Method, device and system for recommending audio data

Info

Publication number: CN106791916A
Application number: CN201611057391.6A
Authority: CN
Inventors: 宁可
Original assignee: Guangzhou Kugou Computer Technology Co Ltd
Current assignee: Guangzhou Kugou Computer Technology Co Ltd
Priority date: 2016-11-26
Filing date: 2016-11-26
Publication date: 2017-05-31
Anticipated expiration: 2036-11-26
Also published as: CN106791916B

Abstract

The invention discloses a method, device and system for recommending audio data, belonging to the field of computer technology. The method includes: receiving an audio recommendation instruction and capturing an image to obtain a target image; performing image recognition on the target image to obtain object information of a target object contained in the target image; sending the object information of the target object to a server; and receiving, from the server, display information of target audio data corresponding to the object information of the target object, and displaying the display information of the target audio data. The invention can improve the flexibility of audio data recommendation.

Description

Method, device and system for recommending audio data

Technical field

The present invention relates to the computer field, and in particular to a method, device and system for recommending audio data.

Background

With the development of terminal technology and audio processing technology, listening to all kinds of audio data (such as songs, news and stories) on terminals such as mobile phones has become a habit for many people.

The prior art provides a way of recommending audio data to users: typically, a network-side server pushes some currently popular audio data to the user's terminal, where it is displayed for the user to select and play.

In the course of implementing the present invention, the inventor found that the prior art has at least the following problem:

When audio data is recommended in the above way, the recommended audio data is the same for every user. Personalized recommendations cannot be made based on the different needs of different users, so the flexibility of audio data recommendation is poor.

Summary of the invention

To solve the problem in the prior art, embodiments of the present invention provide a method, device and system for recommending audio data. The technical solution is as follows:

In a first aspect, a method for recommending audio data is provided, the method including:

Receiving an audio recommendation instruction and capturing an image to obtain a target image;

Performing image recognition on the target image to obtain object information of a target object contained in the target image;

Sending the object information of the target object to a server;

Receiving, from the server, display information of target audio data corresponding to the object information of the target object, and displaying the display information of the target audio data.

Optionally, after the display information of the target audio data is displayed, the method further includes:

When a selection instruction is received for the display information of first audio data among the display information of the target audio data, sending an acquisition request for the first audio data to the server;

Receiving the first audio data sent by the server, and playing the first audio data.

Optionally, after the display information of the target audio data is displayed, the method further includes:

Sending an acquisition request for the target audio data to the server;

Receiving the target audio data sent by the server, and playing the target audio data.

Optionally, the object information is text information, object type information or object image information.

In a second aspect, a method for recommending audio data is provided, the method including:

Receiving object information of a target object sent by a terminal;

Determining, according to a pre-stored correspondence between object information and audio data, the target audio data corresponding to the object information of the target object;

Sending display information of the target audio data to the terminal.

Optionally, after the display information of the target audio data is sent to the terminal, the method further includes:

Receiving an acquisition request, sent by the terminal, for first audio data in the target audio data;

Sending the first audio data to the terminal.

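Optionally, after the display information of the target audio data is sent to the terminal, the method further includes:

Receiving an acquisition request for the target audio data sent by the terminal;

Sending the target audio data to the terminal.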

In a third aspect, a terminal is provided, the terminal including:

A shooting module, configured to receive an audio recommendation instruction and capture an image to obtain a target image;

A recognition module, configured to perform image recognition on the target image to obtain object information of a target object contained in the target image;

A sending module, configured to send the object information of the target object to a server;

A receiving module, configured to receive, from the server, display information of target audio data corresponding to the object information of the target object, and to display the display information of the target audio data.

Optionally, the sending module is further configured to send an acquisition request for first audio data to the server when a selection instruction is received for the display information of the first audio data among the display information of the target audio data;

The receiving module is further configured to receive the first audio data sent by the server;

The terminal further includes a playing module, configured to play the first audio data.

Optionally, the sending module is further configured to send an acquisition request for the target audio data to the server;

The receiving module is further configured to receive the target audio data sent by the server;

The terminal further includes a playing module, configured to play the target audio data.

In a fourth aspect, a server is provided, the server including:

A receiving module, configured to receive object information of a target object sent by a terminal;

A determining module, configured to determine, according to a pre-stored correspondence between object information and audio data, the target audio data corresponding to the object information of the target object;

A sending module, configured to send display information of the target audio data to the terminal.

Optionally, the receiving module is further configured to receive an acquisition request, sent by the terminal, for first audio data in the target audio data;

The sending module is further configured to send the first audio data to the terminal.

Optionally, the receiving module is further configured to receive an acquisition request for the target audio data sent by the terminal;

The sending module is further configured to send the target audio data to the terminal.

In a fifth aspect, a system for recommending audio data is provided, the system including a terminal and a server, wherein:

The terminal is configured to receive an audio recommendation instruction and capture an image to obtain a target image; perform image recognition on the target image to obtain object information of a target object contained in the target image; send the object information of the target object to the server; and receive, from the server, display information of target audio data corresponding to the object information of the target object, and display the display information of the target audio data;

The server is configured to receive the object information of the target object sent by the terminal; determine, according to a pre-stored correspondence between object information and audio data, the target audio data corresponding to the object information of the target object; and send the display information of the target audio data to the terminal.

The technical solution provided by the embodiments of the present invention has the following beneficial effect:

In the embodiments of the present invention, an audio recommendation instruction is received and an image is captured to obtain a target image; image recognition is performed on the target image to obtain object information of a target object contained in it; the object information of the target object is sent to a server; and display information of the target audio data corresponding to that object information is received from the server and displayed. In this way, related audio can be recommended based on the object the user photographs, which can improve the flexibility of audio data recommendation.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of a system architecture provided by an embodiment of the present invention;

Fig. 2 is a schematic flowchart of a method for recommending audio data provided by an embodiment of the present invention;

Fig. 3 is a schematic flowchart of a method for recommending audio data provided by an embodiment of the present invention;

Fig. 4 is a schematic flowchart of a method for recommending audio data provided by an embodiment of the present invention;

Fig. 5 is a schematic diagram of an interface provided by an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a terminal provided by an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of a server provided by an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of a terminal provided by an embodiment of the present invention;

Fig. 9 is a schematic structural diagram of a server provided by an embodiment of the present invention.

Detailed description

To make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

An embodiment of the present invention provides a method for recommending audio data, which can be implemented jointly by a terminal and a server. As shown in Fig. 1, the system architecture for the method may consist of a terminal and a server. The terminal may be a mobile terminal such as a mobile phone or tablet computer, on which a music application may be installed. The server may be the backend server of the music application.

The server may include components such as a processor, a memory and a transceiver. The processor may be a CPU (Central Processing Unit) or the like, and may be used for processing such as determining the target audio data corresponding to the object information of the target object according to the pre-stored correspondence between object information and audio data. The memory may be RAM (Random Access Memory), Flash (flash memory) or the like, and may be used to store received data, data needed during processing, data generated during processing and so on, such as the correspondence between object information and audio data, and the target audio data. The transceiver may be used for data transmission with the terminal or other servers, for example sending target audio data to the terminal; it may include an antenna, a matching circuit, a modem and the like.

The terminal may include components such as a processor, a memory, a transceiver, an image capture component, a display component and an audio output component. The processor may be a CPU or the like. The memory may be RAM (Random Access Memory), Flash (flash memory) or the like, and may be used to store received data, data needed during processing, data generated during processing and so on, such as the target image, the object information of the target object, and the display information of the target audio data. The transceiver may be used for data transmission with the server, for example receiving target audio data sent by the server and sending the object information of the target object to the server; it may include an antenna, a matching circuit, a modem and the like. The image capture component may be a camera, used to capture the target image. The display component may be used to display the display information, the target image and so on. The audio output component may be a speaker, earphones or the like, and may be used to play the target audio data. The terminal may also include an input component, an audio detection component and the like. The input component may be a touch screen, keyboard, mouse or the like. The audio detection component may be a microphone or the like.

As shown in Fig. 2, the processing flow of the terminal in this method may include the following steps:

Step 201: receive an audio recommendation instruction and capture an image to obtain a target image.

Step 202: perform image recognition on the target image to obtain object information of a target object contained in the target image.

Step 203: send the object information of the target object to the server.

Step 204: receive, from the server, display information of the target audio data corresponding to the object information of the target object, and display the display information of the target audio data.
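The terminal-side steps 201 to 204 can be pictured as a simple pipeline. The following is a minimal sketch only: the function name `run_terminal_flow` and its four callable parameters are hypothetical stand-ins for the camera, the recognizer, the network exchange and the screen, none of which the patent specifies at this level.

```python
from typing import Callable, List


def run_terminal_flow(
    capture_image: Callable[[], bytes],
    recognize: Callable[[bytes], str],
    query_server: Callable[[str], List[str]],
    show: Callable[[List[str]], None],
) -> List[str]:
    """Run steps 201-204 once and return the display information that was shown."""
    target_image = capture_image()            # step 201: capture the target image
    object_info = recognize(target_image)     # step 202: image recognition
    display_info = query_server(object_info)  # step 203 (send) and step 204 (receive)
    show(display_info)                        # step 204: display the information
    return display_info


# Usage with stand-in callables (all hypothetical).
shown: List[str] = []
result = run_terminal_flow(
    capture_image=lambda: b"fake-image-bytes",
    recognize=lambda image: "apple",
    query_server=lambda info: [f"Song about {info}", f"Introduction to {info}"],
    show=shown.extend,
)
```

Passing the four stages in as callables keeps the sketch independent of any particular camera, recognition or networking library.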

As shown in Fig. 3, the processing flow of the server in this method may include the following steps:

Step 301: receive the object information of the target object sent by the terminal.

Step 302: determine, according to the pre-stored correspondence between object information and audio data, the target audio data corresponding to the object information of the target object.

Step 303: send the display information of the target audio data to the terminal.

As shown in Fig. 4, the process in which the terminal and the server interact in this method may include the following steps:

Step 401: the terminal receives an audio recommendation instruction and captures an image to obtain a target image.

The audio in this embodiment may be music, news, stories and the like.

In implementation, the user may install an audio playback application on the terminal and open it when wanting to listen to audio. The application may provide a "Scan to recommend audio" function, as shown in Fig. 5, which lets the user obtain related audio by scanning an object. The user may tap the "Scan to recommend audio" function button, whereupon the terminal receives an audio recommendation instruction. This can trigger the terminal to turn on the image capture component and start capturing images. The terminal may then display a rectangular frame on the screen, prompting the user to keep the image of the object to be scanned within the frame, as shown in Fig. 5. The terminal may also display a confirm button; when it detects a tap on the confirm button, it may obtain the target image captured at the current moment. Alternatively, the terminal may perform object detection while continuously capturing images and obtain the current target image when it detects an object of a pre-specified type (such as text or a physical object). Alternatively, on receiving the audio recommendation instruction, the terminal may directly obtain the target image captured at that moment.
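The three ways of obtaining the target image described above (confirm button, automatic detection during continuous capture, immediate capture) can be sketched as a frame-selection function. This is an illustrative sketch under stated assumptions: frames are represented by plain strings, and `CaptureTrigger`, `pick_target_image`, `confirm_index` and `contains_object` are hypothetical names, not part of the patent.

```python
from enum import Enum
from typing import Callable, Optional, Sequence


class CaptureTrigger(Enum):
    CONFIRM_BUTTON = "confirm_button"  # user taps the confirm button
    AUTO_DETECT = "auto_detect"        # first frame showing a pre-specified object type
    IMMEDIATE = "immediate"            # frame captured when the instruction arrives


def pick_target_image(
    frames: Sequence[str],
    trigger: CaptureTrigger,
    confirm_index: int = 0,
    contains_object: Callable[[str], bool] = lambda frame: True,
) -> Optional[str]:
    """Return the frame used as the target image, or None if no frame qualifies."""
    if not frames:
        return None
    if trigger is CaptureTrigger.IMMEDIATE:
        return frames[0]
    if trigger is CaptureTrigger.CONFIRM_BUTTON:
        # confirm_index models the frame on screen when the button is tapped.
        return frames[confirm_index] if confirm_index in range(len(frames)) else None
    # AUTO_DETECT: scan the continuous capture for the first qualifying frame.
    for frame in frames:
        if contains_object(frame):
            return frame
    return None


frames = ["blank", "blurry", "apple"]
auto = pick_target_image(frames, CaptureTrigger.AUTO_DETECT,
                         contains_object=lambda f: f == "apple")
```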

Step 402: the terminal performs image recognition on the target image to obtain object information of the target object contained in the target image.

The object information is text information, object type information or object image information. The object type information may be, for example, apple, television, desk or glasses. The object image information may be the whole target image containing the target object, or the part of the target image that contains the target object.

In implementation, after obtaining the target image, the terminal may recognize different types of content in it depending on the working mode. If the working mode is the text recognition mode, the target object can be regarded as text content, and the terminal may perform text recognition on the target image to obtain the text it contains. If the working mode is the object type recognition mode, the terminal may determine the object type based on a preset object type recognition algorithm (which may be an existing third-party recognition algorithm). In a specific algorithm, the terminal may pre-store image features corresponding to a large number of different object types, extract image features from the target image, determine which object type's image features the target image matches, and thereby determine the object type. If the working mode is the image acquisition mode, the whole target image may be used as the object information of the target object, or the part of the target image containing the target object may be used as the object information of the target object.
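The object type recognition mode described above, which matches features extracted from the target image against pre-stored features per object type, can be illustrated as follows. The patent does not name a matching rule, so cosine similarity is used here purely as an assumption, and the three-element feature vectors in `STORED_FEATURES` are made-up toy values; feature extraction itself is outside the scope of the sketch.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_object_type(
    features: Sequence[float],
    stored_features: Dict[str, Sequence[float]],
) -> str:
    """Return the object type whose stored features best match the extracted ones."""
    return max(
        stored_features,
        key=lambda name: cosine_similarity(features, stored_features[name]),
    )


# Hypothetical pre-stored features for three of the object types named in the text.
STORED_FEATURES = {
    "apple": [0.9, 0.1, 0.0],
    "television": [0.1, 0.8, 0.3],
    "desk": [0.0, 0.3, 0.9],
}

object_type = classify_object_type([0.8, 0.2, 0.1], STORED_FEATURES)
```

A real recognizer would likely also apply a minimum-similarity threshold so that an unknown object is not forced into the nearest stored type; that refinement is omitted here.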

The audio playback application may provide working-mode options, from which the user can select one of the above working modes. When the user wants to search for related audio based on a character string they have seen, they can turn on the text recognition mode: for example, the user sees the name of a TV series displayed on a television and wants to search for related songs, or sees a bullet-screen comment posted by someone on a computer and wants to search for related articles. When the user wants to search for related audio based on an object, they can turn on the object type recognition mode: for example, the user sees a fruit they have never seen before and wants to find an introduction to it, or sees an object and wants to find songs about it. When the user wants to search for related audio based on a picture, they can turn on the image acquisition mode: for example, the user sees the poster for a singer's new album and wants to listen to the songs on it, or sees the cover of a book and wants to listen to its audio version.

Step 403: the terminal sends the object information of the target object to the server.

Step 404: the server receives the object information of the target object sent by the terminal.

The data can be transmitted in a variety of ways, which are not described in detail in this embodiment.

Step 405: the server determines, according to the pre-stored correspondence between object information and audio data, the target audio data corresponding to the object information of the target object.

In implementation, technicians may set associated audio data for different object information in advance, establish the correspondence between object information and audio data, and store it on the server. For example, if the object information is the text "Castle in the Sky", the corresponding audio data may be set to the song "Castle in the Sky", the original soundtrack of the animation "Castle in the Sky", an audio introduction to the animation "Castle in the Sky", and so on. As another example, if the object information is apple, the corresponding audio data may be set to the song "Little Apple", some songs related to apples, an audio introduction to apples, and so on. As a further example, if the object information is the poster of a film, the corresponding audio data may be set to the film's soundtrack, an audio introduction to the film, news related to the film, and so on. Each piece of object information may correspond to one or more pieces of audio data.

After receiving the object information of the target object sent by the terminal, the server may look up the target audio data corresponding to that object information in the above correspondence.
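The lookup in step 405 amounts to a keyed table from object information to one or more audio entries. Below is a minimal sketch using the text and object-type examples from the description; the dictionary name `CORRESPONDENCE`, the function `find_target_audio` and the choice to return an empty list for unknown object information are all assumptions, since the patent does not say how a miss is handled.

```python
from typing import Dict, List

# Hypothetical pre-stored correspondence between object information and audio data.
CORRESPONDENCE: Dict[str, List[str]] = {
    "Castle in the Sky": [
        'song "Castle in the Sky"',
        'original soundtrack of the animation "Castle in the Sky"',
        'audio introduction to the animation "Castle in the Sky"',
    ],
    "apple": [
        'song "Little Apple"',
        "audio introduction to apples",
    ],
}


def find_target_audio(object_info: str) -> List[str]:
    """Return the audio data associated with the object information (empty if none)."""
    # Copy the list so callers cannot modify the stored correspondence.
    return list(CORRESPONDENCE.get(object_info, []))


apple_audio = find_target_audio("apple")
```

Object image information (for example a film poster) would need an image-matching step before such a lookup; the sketch covers only text and object-type keys.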

Step 406: the server sends the display information of the target audio data to the terminal.

In implementation, after finding the target audio data, the server may further obtain the locally stored display information of each piece of target audio data. The display information may include the title of the target audio data, and may also include summary information, related pictures and the like. The server may then send the display information of each piece of target audio data to the terminal.
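One way to picture the display information sent in step 406 is as a small structured message with a mandatory title and optional summary and picture. The patent does not define a wire format, so the JSON layout, the `DisplayInfo` fields and the `recommendations` key below are illustrative assumptions only.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class DisplayInfo:
    """Display information for one piece of target audio data (hypothetical shape)."""
    audio_id: str
    title: str
    summary: Optional[str] = None
    picture_url: Optional[str] = None


def build_display_message(items: List[DisplayInfo]) -> str:
    """Serialize the display information, leaving out optional fields that are unset."""
    payload = [
        {key: value for key, value in asdict(item).items() if value is not None}
        for item in items
    ]
    return json.dumps({"recommendations": payload}, ensure_ascii=False)


message = build_display_message([
    DisplayInfo(audio_id="a1", title="Little Apple", summary="A pop song"),
    DisplayInfo(audio_id="a2", title="Introduction to apples"),
])
```

The terminal can then render one list option per entry and keep `audio_id` for any later acquisition request.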

Step 407: the terminal receives, from the server, the display information of the target audio data corresponding to the object information of the target object, and displays the display information of the target audio data.

In implementation, after receiving the display information sent by the server, the terminal may display an audio recommendation list, as shown in Fig. 5. The list shows an option for each piece of target audio data, with the corresponding display information shown at the position of each option, so that the user can browse and select audio to play.

In the embodiments of the present invention, the recommended target audio data can be played in a variety of ways. Several feasible processing modes are given below:

Mode one: the user selects, from all the target audio data, the audio data they want to play. Accordingly, after the display information of the target audio data is displayed, the following processing may be performed:

Step 1: when a selection instruction is received for the display information of first audio data among the display information of the target audio data, the terminal sends an acquisition request for the first audio data to the server.

Step 2: the server receives the acquisition request, sent by the terminal, for the first audio data in the target audio data.

Step 3: the server sends the first audio data to the terminal.

Step 4: the terminal receives the first audio data sent by the server and plays the first audio data.

In implementation, after the terminal displays the audio recommendation list, the user can browse the options for the target audio data shown in it. In some cases the recommended target audio data may be quite varied; for example, when audio data is determined from object type information as described above, the results may include some songs, audio introductions to the object, news and so on. The user can look through the audio recommendation list for the audio data they want to play and then tap the option for that audio data (i.e. the first audio data). The terminal then receives the corresponding selection instruction, which can trigger it to send an acquisition request for the first audio data to the server. After receiving the acquisition request, the server can send the first audio data to the terminal. After receiving the first audio data, the terminal can play it automatically.
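The four steps of mode one can be traced with an in-memory stand-in for the server. This is a sketch under stated assumptions: `InMemoryServer`, `Terminal`, `on_option_selected` and the string audio identifiers are hypothetical, and "playing" is modelled by recording the identifier rather than producing sound.

```python
from typing import Dict, List


class InMemoryServer:
    """Stand-in for the server side of mode one (steps 2 and 3)."""

    def __init__(self, audio_store: Dict[str, bytes]) -> None:
        self._audio_store = audio_store

    def handle_acquisition_request(self, audio_id: str) -> bytes:
        # Step 2: receive the request; step 3: return the requested audio data.
        return self._audio_store[audio_id]


class Terminal:
    """Stand-in for the terminal side of mode one (steps 1 and 4)."""

    def __init__(self, server: InMemoryServer, recommendation_list: List[str]) -> None:
        self._server = server
        self.recommendation_list = recommendation_list
        self.played: List[str] = []

    def on_option_selected(self, index: int) -> bytes:
        audio_id = self.recommendation_list[index]                 # selection instruction
        audio = self._server.handle_acquisition_request(audio_id)  # step 1: send request
        self._play(audio_id, audio)                                # step 4: play on receipt
        return audio

    def _play(self, audio_id: str, audio: bytes) -> None:
        # A real terminal would hand the bytes to its audio output component.
        self.played.append(audio_id)


server = InMemoryServer({"a1": b"AAA", "a2": b"BBB"})
terminal = Terminal(server, ["a1", "a2"])
first_audio = terminal.on_option_selected(1)
```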

Mode two: all the target audio data is played. Accordingly, after the display information of the target audio data is displayed, the following processing may be performed:

Step 1: the terminal sends an acquisition request for the target audio data to the server.

Step 2: the server receives the acquisition request for the target audio data sent by the terminal.

Step 3: the server sends the target audio data to the terminal.

Step 4: the terminal receives the target audio data sent by the server and plays the target audio data.

In implementation, after displaying the audio recommendation list, the terminal may display a play button associated with the list. In some cases the recommended target audio data may be highly related and all be audio data the user wants; for example, when audio data is determined from object image information as described above, if the user scans the poster of a song album, the recommended audio data may be all the songs on that album. The user may then want to play all the target audio and can tap the play button. The terminal then receives a play instruction for all the target audio data, which can trigger it to send an acquisition request for the target audio data to the server. After receiving the acquisition request, the server can send all the target audio data to the terminal. After receiving the target audio data, the terminal can play the pieces of target audio data sequentially or in shuffled order.
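The choice between sequential and shuffled playback in mode two can be sketched as a function that computes the play order. The function name `playback_order` and its `seed` parameter are assumptions added so that the shuffled order is reproducible; the patent only states that either order may be used.

```python
import random
from typing import List, Optional, Sequence


def playback_order(
    audio_ids: Sequence[str],
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> List[str]:
    """Return the order in which the received target audio data is played."""
    order = list(audio_ids)  # copy, so the received list itself is left unchanged
    if shuffle:
        # A dedicated Random instance avoids touching the global random state.
        random.Random(seed).shuffle(order)
    return order


album = ["track-1", "track-2", "track-3", "track-4"]
sequential = playback_order(album)
shuffled = playback_order(album, shuffle=True, seed=7)
```

Either result contains every received item exactly once; only the order differs.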

Based on the same technical concept, an embodiment of the present invention further provides a terminal. As shown in Fig. 6, the terminal includes:

A shooting module 610, configured to receive an audio recommendation instruction and capture an image to obtain a target image;

A recognition module 620, configured to perform image recognition on the target image to obtain object information of a target object contained in the target image;

A sending module 630, configured to send the object information of the target object to a server;

A receiving module 640, configured to receive, from the server, display information of target audio data corresponding to the object information of the target object, and to display the display information of the target audio data.

Optionally, the sending module 630 is further configured to send an acquisition request for first audio data to the server when a selection instruction is received for the display information of the first audio data among the display information of the target audio data;

The receiving module 640 is further configured to receive the first audio data sent by the server.

Optionally, the sending module 630 is further configured to send an acquisition request for the target audio data to the server;

The receiving module 640 is further configured to receive the target audio data sent by the server.

Based on the same technical concept, an embodiment of the present invention further provides a server. As shown in Fig. 7, the server includes:

A receiving module 710, configured to receive object information of a target object sent by a terminal;

A determining module 720, configured to determine, according to a pre-stored correspondence between object information and audio data, the target audio data corresponding to the object information of the target object;

A sending module 730, configured to send display information of the target audio data to the terminal.

Optionally, the receiving module 710 is further configured to receive an acquisition request, sent by the terminal, for first audio data in the target audio data;

The sending module 730 is further configured to send the first audio data to the terminal.

Optionally, the receiving module 710 is further configured to receive an acquisition request for the target audio data sent by the terminal;

The sending module 730 is further configured to send the target audio data to the terminal.

Based on the same technical concept, an embodiment of the present invention further provides a system for recommending audio data, the system including a terminal and a server, wherein:

With regard to the devices in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method, and is not elaborated here.

It should be noted that when the terminal and server provided by the above embodiments recommend audio data, the division into the above functional modules is used only as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the terminal or server may be divided into different functional modules to complete all or part of the functions described above. In addition, the terminal and server provided by the above embodiments belong to the same concept as the embodiments of the method for recommending audio data; for their specific implementation, refer to the method embodiments, which are not repeated here.

Refer to Fig. 8, which shows a schematic structural diagram of the terminal involved in the embodiments of the present invention. The terminal can be used to implement the method for recommending audio data provided in the above embodiments. Specifically:

The terminal 1200 may include components such as an RF (Radio Frequency) circuit 110, a memory 120 including one or more computer-readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a WiFi (wireless fidelity) module 170, a processor 180 including one or more processing cores, and a power supply 190. A person skilled in the art will understand that the terminal structure shown in Fig. 8 does not limit the terminal, which may include more or fewer components than shown, combine some components, or use a different arrangement of components. Specifically:

The RF circuit 110 may be used to receive and send signals while sending and receiving information or during a call. In particular, after receiving downlink information from a base station, it passes the information to one or more processors 180 for processing; it also sends uplink data to the base station. Typically, the RF circuit 110 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer and the like. In addition, the RF circuit 110 can communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), email, SMS (Short Messaging Service) and the like.

The memory 120 may be used to store software programs and modules; the processor 180 performs various functional applications and data processing by running the software programs and modules stored in the memory 120. The memory 120 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, the application programs needed for at least one function (such as a sound playback function or an image playback function) and so on; the data storage area may store data created through use of the terminal 1200 (such as audio data or a phone book) and so on. In addition, the memory 120 may include high-speed random access memory, and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Correspondingly, the memory 120 may also include a memory controller to provide the processor 180 and the input unit 130 with access to the memory 120.

Input block 130 can be used to receive the numeral or character information of input, and generation is set and function with user The relevant keyboard of control, mouse, action bars, optics or trace ball signal input.Specifically, input block 130 may include to touch Sensitive surfaces 131 and other input equipments 132.Touch sensitive surface 131, also referred to as touch display screen or Trackpad, can collect use Family thereon or neighbouring touch operation (such as user is using any suitable objects such as finger, stylus or annex in touch-sensitive table Operation on face 131 or near Touch sensitive surface 131), and corresponding attachment means are driven according to formula set in advance.It is optional , Touch sensitive surface 131 may include two parts of touch detecting apparatus and touch controller.Wherein, touch detecting apparatus detection is used The touch orientation at family, and the signal that touch operation brings is detected, transmit a signal to touch controller；Touch controller is from touch Touch information is received in detection means, and is converted into contact coordinate, then give processor 180, and can receiving processor 180 The order sent simultaneously is performed.Furthermore, it is possible to using polytypes such as resistance-type, condenser type, infrared ray and surface acoustic waves Realize Touch sensitive surface 131.Except Touch sensitive surface 131, input block 130 can also include other input equipments 132.Specifically, Other input equipments 132 can include but is not limited to physical keyboard, function key (such as volume control button, switch key etc.), One or more in trace ball, mouse, action bars etc..

Display unit 140 can be used to display information entered by the user or information provided to the user, as well as the various graphical user interfaces of terminal 1200; these graphical user interfaces can be composed of graphics, text, icons, video, and any combination thereof. Display unit 140 may include display panel 141; optionally, display panel 141 can be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. Further, touch-sensitive surface 131 can cover display panel 141; when touch-sensitive surface 131 detects a touch operation on or near it, it passes the operation to processor 180 to determine the type of touch event, and processor 180 then provides the corresponding visual output on display panel 141 according to the type of touch event. Although in Fig. 8 touch-sensitive surface 131 and display panel 141 are two independent components that implement the input and output functions, in some embodiments touch-sensitive surface 131 and display panel 141 can be integrated to implement the input and output functions.

Terminal 1200 may also include at least one sensor 150, such as an optical sensor, a motion sensor, and other sensors. Specifically, the optical sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor can adjust the brightness of display panel 141 according to the ambient light level, and the proximity sensor can turn off display panel 141 and/or the backlight when terminal 1200 is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity; it can be used in applications that recognize the attitude of the mobile phone (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection). Other sensors that can also be configured in terminal 1200, such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, are not described further here.
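
The attitude recognition mentioned above can be illustrated with a minimal sketch. It assumes a single accelerometer sample taken while the terminal is stationary and an axis convention (x along the short edge of the screen, y along the long edge, z out of the screen) that the description does not specify. The function names and the 25-degree threshold are hypothetical and chosen only for illustration.

```python
import math


def gravity_magnitude(ax: float, ay: float, az: float) -> float:
    """Magnitude of the measured acceleration; at rest this approximates gravity."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def screen_orientation(ax: float, ay: float, az: float,
                       flat_threshold_deg: float = 25.0) -> str:
    """Classify a stationary sample as 'flat', 'portrait', or 'landscape'.

    Assumed axes (not from the source): x = short screen edge,
    y = long screen edge, z = out of the screen.
    """
    g = gravity_magnitude(ax, ay, az)
    if g == 0.0:
        return "unknown"  # no usable gravity reading
    # Angle between the gravity vector and the z axis: small when lying flat.
    in_plane = math.hypot(ax, ay)
    tilt_deg = math.degrees(math.asin(min(1.0, in_plane / g)))
    if tilt_deg < flat_threshold_deg:
        return "flat"
    # Gravity mostly along the long edge means the terminal is held upright.
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```

A real terminal would typically filter several samples before switching the screen, since a single reading taken while the device is moving also contains motion acceleration.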

Audio circuit 160, loudspeaker 161, and microphone 162 can provide an audio interface between the user and terminal 1200. Audio circuit 160 can convert received audio data into an electrical signal and transmit it to loudspeaker 161, which converts it into a sound signal for output; conversely, microphone 162 converts a collected sound signal into an electrical signal, which audio circuit 160 receives and converts into audio data; after the audio data is output to processor 180 and processed, it is sent through RF circuit 110 to, for example, another terminal, or it is output to memory 120 for further processing. Audio circuit 160 may also include an earphone jack to provide communication between a peripheral earphone and terminal 1200.

WiFi is a short-range wireless transmission technology. Through WiFi module 170, terminal 1200 can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although Fig. 8 shows WiFi module 170, it is understood that the module is not an essential component of terminal 1200 and can be omitted as needed without changing the essence of the invention.

Processor 180 is the control center of terminal 1200. It connects all parts of the mobile phone through various interfaces and lines, and performs the various functions of terminal 1200 and processes data by running or executing the software programs and/or modules stored in memory 120 and calling the data stored in memory 120, thereby monitoring the mobile phone as a whole. Optionally, processor 180 may include one or more processing cores; preferably, processor 180 can integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It is understood that the modem processor need not be integrated into processor 180.

Terminal 1200 also includes power supply 190 (such as a battery) that powers all components. Preferably, the power supply can be logically connected to processor 180 through a power management system, so that charging, discharging, and power-consumption management are handled through the power management system. Power supply 190 may also include any of the following components: one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.

Although not shown, terminal 1200 may also include a camera, a Bluetooth module, and the like, which are not described here. Specifically, in this embodiment, the display unit of terminal 1200 is a touch-screen display, and terminal 1200 also includes a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs containing instructions for performing the following operations:

Sending the object information of the target object to the server;

Sending an acquisition request for the target audio data to the server;
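
The terminal-side exchange listed above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not define a wire format, so the message fields (`type`, `object_info`, `audio_id`, `display_info`, `audio_data`), the `Transport` callable, and the in-memory `stub_server` are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical transport: takes a request dict and returns the server's reply dict.
Transport = Callable[[Dict], Dict]


def send_object_info(transport: Transport, object_info: str) -> List[Dict]:
    """Send the object information and return the display information received."""
    reply = transport({"type": "object_info", "object_info": object_info})
    return reply.get("display_info", [])


def request_audio(transport: Transport, audio_id: str) -> bytes:
    """Send an acquisition request for one audio item and return its data."""
    reply = transport({"type": "acquire_audio", "audio_id": audio_id})
    return reply.get("audio_data", b"")


def stub_server(request: Dict) -> Dict:
    """In-memory stand-in for the server, used only to make the sketch runnable."""
    if request["type"] == "object_info" and request["object_info"] == "guitar":
        return {"display_info": [{"audio_id": "a1", "title": "Guitar piece"}]}
    if request["type"] == "acquire_audio" and request["audio_id"] == "a1":
        return {"audio_data": b"fake-audio-bytes"}
    return {}  # nothing known for this request


if __name__ == "__main__":
    shown = send_object_info(stub_server, "guitar")
    print(shown)
    print(request_audio(stub_server, shown[0]["audio_id"]))
```

Passing the transport in as a callable keeps the sketch independent of any particular network library; a real terminal would replace `stub_server` with its own communication layer.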

Fig. 9 is a schematic structural diagram of a server provided by an embodiment of the present invention. Server 1900 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1922 (for example, one or more processors), memory 1932, and one or more storage media 1930 (such as one or more mass storage devices) that store application programs 1942 or data 1944. Memory 1932 and storage medium 1930 may provide transient or persistent storage. The programs stored in storage medium 1930 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations for the server. Furthermore, central processing unit 1922 may be configured to communicate with storage medium 1930 and to execute, on server 1900, the series of instruction operations in storage medium 1930.

Server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.

Server 1900 may include a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs containing instructions for performing the following operations:

Receiving the object information of the target object sent by the terminal;

Sending the display information of the target audio data to the terminal.

Sending the first audio data to the terminal.

Sending the target audio data to the terminal.
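
The server-side operations above, together with the lookup through a prestored correspondence between object information and audio data described in the claims, can be sketched as follows. The table contents, identifiers, and function names are hypothetical; the patent does not specify how the correspondence is stored or what the display information contains.

```python
from typing import Dict, List

# Hypothetical prestored correspondence between object information and audio IDs.
OBJECT_TO_AUDIO: Dict[str, List[str]] = {
    "guitar": ["a1", "a2"],
    "cat": ["a3"],
}

# Hypothetical audio store: a display field plus the audio payload.
AUDIO_STORE: Dict[str, Dict] = {
    "a1": {"title": "Guitar piece 1", "data": b"audio-1"},
    "a2": {"title": "Guitar piece 2", "data": b"audio-2"},
    "a3": {"title": "Cat story", "data": b"audio-3"},
}


def display_info_for(object_info: str) -> List[Dict]:
    """Determine the target audio data for the received object information
    and return its display information (here: ID and title only)."""
    audio_ids = OBJECT_TO_AUDIO.get(object_info, [])
    return [{"audio_id": i, "title": AUDIO_STORE[i]["title"]} for i in audio_ids]


def audio_data_for(audio_id: str) -> bytes:
    """Return the audio payload for an acquisition request, or b'' if unknown."""
    entry = AUDIO_STORE.get(audio_id)
    return entry["data"] if entry else b""
```

In this sketch, object information with no stored correspondence simply yields an empty list; how a real server handles that case is not stated in the description.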

A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments can be implemented by hardware, or by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium, and the storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disc, or the like.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution, improvement, or the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for recommending audio data, characterized in that the method comprises:

performing image recognition on the target image to obtain object information of a target object contained in the target image;

sending the object information of the target object to a server;

receiving display information, sent by the server, of target audio data corresponding to the object information of the target object, and displaying the display information of the target audio data.

2. The method according to claim 1, characterized in that, after the displaying of the display information of the target audio data, the method further comprises:

when a selection instruction corresponding to display information of first audio data in the display information of the target audio data is received, sending an acquisition request for the first audio data to the server;

3. The method according to claim 1, characterized in that, after the displaying of the display information of the target audio data, the method further comprises:

sending an acquisition request for the target audio data to the server;

4. The method according to claim 1, characterized in that the object information is text information, object type information, or object image information.

5. A method for recommending audio data, characterized in that the method comprises:

receiving object information of a target object sent by a terminal;

determining, according to a prestored correspondence between object information and audio data, target audio data corresponding to the object information of the target object;

sending display information of the target audio data to the terminal.

6. The method according to claim 5, characterized in that, after the sending of the display information of the target audio data to the terminal, the method further comprises:

sending the first audio data to the terminal.

7. The method according to claim 5, characterized in that, after the sending of the display information of the target audio data to the terminal, the method further comprises:

sending the target audio data to the terminal.

8. The method according to claim 5, characterized in that the object information is text information, object type information, or object image information.

9. A terminal, characterized in that the terminal comprises:

a recognition module, configured to perform image recognition on the target image to obtain object information of a target object contained in the target image;

a receiving module, configured to receive display information, sent by the server, of target audio data corresponding to the object information of the target object, and to display the display information of the target audio data.

10. The terminal according to claim 9, characterized in that the sending module is further configured to send an acquisition request for the first audio data to the server when a selection instruction corresponding to display information of first audio data in the display information of the target audio data is received;

11. The terminal according to claim 9, characterized in that the sending module is further configured to send an acquisition request for the target audio data to the server;

12. The terminal according to claim 9, characterized in that the object information is text information, object type information, or object image information.

13. A server, characterized in that the server comprises:

a determining module, configured to determine, according to a prestored correspondence between object information and audio data, target audio data corresponding to the object information of the target object;

14. The server according to claim 13, characterized in that the receiving module is further configured to receive an acquisition request, sent by the terminal, for first audio data in the target audio data;

15. The server according to claim 13, characterized in that the receiving module is further configured to receive an acquisition request, sent by the terminal, for the target audio data;

16. The server according to claim 13, characterized in that the object information is text information, object type information, or object image information.

17. A system for recommending audio data, characterized in that the system comprises a terminal and a server, wherein:

the terminal is configured to receive an audio recommendation instruction and capture an image to obtain a target image; perform image recognition on the target image to obtain object information of a target object contained in the target image; send the object information of the target object to the server; and receive display information, sent by the server, of target audio data corresponding to the object information of the target object, and display the display information of the target audio data;

the server is configured to receive the object information of the target object sent by the terminal; determine, according to a prestored correspondence between object information and audio data, the target audio data corresponding to the object information of the target object; and send the display information of the target audio data to the terminal.