CN109710796A

CN109710796A - Voice-based image searching method, device, storage medium and terminal

Info

Publication number: CN109710796A
Application number: CN201910032376.3A
Authority: CN
Inventors: 郭子亮
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2019-01-14
Filing date: 2019-01-14
Publication date: 2019-05-03

Abstract

The embodiments of the present application disclose a voice-based image searching method, device, storage medium and terminal. The method comprises: receiving first voice information; extracting a first search term from the first voice information, performing a picture search according to the first search term, and feeding back a first search result to the user; receiving second voice information; extracting a second search term from the second voice information, performing a picture search within the first search result according to the second search term, and feeding back a second search result to the user. By adopting the above technical solution, the embodiments of the present application enable a secondary search over the previous search result when pictures are searched by voice, improving the precision of the search result.

Description

Voice-based image searching method, device, storage medium and terminal

Technical field

The present application relates to the field of terminal technology, and more particularly to a voice-based image searching method, device, storage medium and terminal.

Background technique

Speech recognition is a technology that enables a machine to convert a voice signal into corresponding text or commands through a process of recognition and understanding. In recent years, with the rapid development of speech recognition technology, its fields of application have become increasingly broad. At present, speech recognition has been successfully applied in various intelligent terminals, enriching their functions.

Speech recognition technology is generally present in intelligent terminals in the form of a voice assistant. A user can use the voice assistant to issue commands to the terminal in natural language, and the terminal can recognize and understand the user's natural language and then perform the corresponding operation, which brings great convenience to the user. In the related art, a user can search for pictures by voice; however, current voice-based picture search schemes are still imperfect and need improvement.

Summary of the invention

The embodiments of the present application provide a voice-based image searching method, device, storage medium and terminal, which can optimize the voice-based picture search scheme.

In a first aspect, an embodiment of the present application provides a voice-based image searching method, comprising:

receiving first voice information;

extracting a first search term from the first voice information, performing a picture search according to the first search term, and feeding back a first search result to the user;

receiving second voice information;

extracting a second search term from the second voice information, performing a picture search within the first search result according to the second search term, and feeding back a second search result to the user.

In a second aspect, an embodiment of the present application provides a voice-based picture searching device, comprising:

a first voice receiving module, configured to receive first voice information;

a first search module, configured to extract a first search term from the first voice information, perform a picture search according to the first search term, and feed back a first search result to the user;

a second voice receiving module, configured to receive second voice information;

a second search module, configured to extract a second search term from the second voice information, perform a picture search within the first search result according to the second search term, and feed back a second search result to the user.

In a third aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the voice-based image searching method described in the embodiments of the present application.

In a fourth aspect, an embodiment of the present application provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable by the processor, wherein the processor, when executing the computer program, implements the voice-based image searching method described in the embodiments of the present application.

The voice-based picture search scheme provided in the embodiments of the present application receives first voice information, extracts a first search term from the first voice information, performs a picture search according to the first search term, and feeds back a first search result to the user; it then receives second voice information, identifies a second search term in the second voice information, performs a picture search within the first search result according to the second search term, and feeds back a second search result to the user. By adopting the above technical solution, a secondary search over the previous search result can be performed when pictures are searched by voice, improving the precision of the search result.

Detailed description of the invention

Fig. 1 is a schematic flowchart of a voice-based image searching method provided by an embodiment of the present application;

Fig. 2 is a schematic flowchart of another voice-based image searching method provided by an embodiment of the present application;

Fig. 3 is a schematic flowchart of yet another voice-based image searching method provided by an embodiment of the present application;

Fig. 4 is a structural block diagram of a voice-based picture searching device provided by an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a terminal provided by an embodiment of the present application;

Fig. 6 is a schematic structural diagram of another terminal provided by an embodiment of the present application.

Specific embodiment

The technical solutions of the present application are further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present application, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the present application rather than the entire structure.

Before the exemplary embodiments are discussed in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the steps as sequential processing, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps can be rearranged. The processing can be terminated when its operations are completed, but it may also have additional steps not included in the drawings. The processing can correspond to a method, function, procedure, subroutine, subprogram, and so on.

At present, many terminals are provided with sound collection components such as microphones. In addition to recording, a sound collection component can be combined with speech recognition technology to implement functions such as a voice assistant. After the terminal enters the voice assistant function, the user can interact with the terminal in natural language, and the terminal can answer the user's questions or perform corresponding operations according to the user's voice instructions, which enriches the human-computer interaction functions of the terminal and brings great convenience to the user. In the related art, functions such as the voice assistant are applied to picture search. A terminal may store a large number of pictures, including photos taken by the user in daily life, pictures sent by friends and pictures downloaded from the network, so finding a needed picture is often time-consuming and laborious; with functions such as the voice assistant, the user can express a search intention in natural language and have the terminal complete the picture search automatically. However, in the related art, the search result fed back to the user by the voice assistant or a similar function often still contains many pictures, and the user has to continue screening within the search result, which is still not convenient enough. In the embodiments of the present application, the voice-based picture search scheme is optimized, and the accuracy of the search result can be improved.

Fig. 1 is a schematic flowchart of a voice-based image searching method provided by an embodiment of the present application. The method can be performed by a voice-based picture searching device, which can be implemented in software and/or hardware and can generally be integrated in a terminal. As shown in Fig. 1, the method includes:

Step 101: receive first voice information.

Illustratively, the terminal in the embodiments of the present application may include devices such as mobile phones, tablet computers and computers.

Illustratively, the first voice information can be received under the voice assistant function or another voice interaction function. It should be noted that the voice assistant may also go by other names, such as voice helper, voice butler or voice secretary, which is not limited in the embodiments of the present application. For ease of understanding, the voice assistant function is used as the example below. The user can trigger the voice assistant function by key wake-up, icon wake-up, voice wake-up or the like, which is not limited in the embodiments of the present application. After the voice assistant function is triggered, the terminal enters a listening state: for example, it turns on a sound collection component such as a microphone to collect ambient sound data, and then extracts voice information from the ambient sound data as the first voice information.

Step 102: extract a first search term from the first voice information, perform a picture search according to the first search term, and feed back a first search result to the user.

Illustratively, after the first voice information is obtained, semantic recognition is performed on it using speech recognition technology, and the first search term contained in it is then extracted.

Under the voice assistant function, the user may need the voice assistant to complete many kinds of tasks, of which picture search is only one. Therefore, optionally, extracting the first search term from the first voice information may specifically include: determining whether the first voice information contains a trigger word corresponding to a picture search event and, if so, extracting the first search term from the first voice information. Trigger words may include, for example, "find", "search", "picture" and "photo". When a trigger word corresponding to a picture search event is recognized in the first voice information, the user is considered to want a picture search, and the first search term is extracted from the first voice information. For example, if the user says "help me find photos of Shenzhen", the utterance contains "find" and "photo", so it is considered to contain trigger words corresponding to a picture search event.
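Illustratively, the trigger-word check described above can be sketched as follows. This is only a minimal sketch operating on already-recognized text: the English trigger words, the function names and the list of known search terms are assumptions made for illustration, and plain substring matching stands in for the semantic recognition that an actual embodiment would use.

```python
from typing import List, Sequence

# Hypothetical English trigger words mirroring the examples in the description.
TRIGGER_WORDS = ("find", "look for", "search", "picture", "photo")


def has_picture_search_trigger(transcript: str) -> bool:
    """Return True if the recognized text contains any picture-search trigger word."""
    text = transcript.lower()
    return any(word in text for word in TRIGGER_WORDS)


def extract_search_terms(transcript: str, known_terms: Sequence[str]) -> List[str]:
    """Extract search terms only when a picture-search trigger word is present.

    known_terms is a hypothetical vocabulary (places, people, album names, ...)
    that the terminal could match against; it is not defined by the description.
    """
    if not has_picture_search_trigger(transcript):
        return []
    text = transcript.lower()
    return [term for term in known_terms if term.lower() in text]


terms = extract_search_terms("Help me find photos of Shenzhen", ["Shenzhen", "Beijing"])
print(terms)
```

An utterance without any trigger word yields an empty list, so the voice assistant is free to route it to some other function.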

Illustratively, the first search term can be a keyword corresponding to a picture search condition, that is, the vocabulary with which the user expresses a search intention. The first search term may include the picture source, the time at which the picture was generated, the place at which the picture was generated, the content of the picture, and so on. In the example above, "Shenzhen" refers to the place where the photos were taken and can be used as the first search term.

Illustratively, after the first search term is extracted, the pictures in the range to be searched can be matched against the first search term, and the pictures whose matching results meet a preset requirement are selected as the search result. The range to be searched may include, for example, pictures stored locally on the terminal, pictures in preset applications on the terminal (such as social applications), and pictures on the Internet, and can be configured according to actual needs. Optionally, there may be multiple ranges to be searched, each with a different priority. For example, if the ranges include the pictures stored locally on the terminal and the pictures on the Internet, the locally stored pictures can be searched first, and the Internet is searched for pictures meeting the preset requirement only when no such picture is found locally.
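Illustratively, searching several ranges in priority order can be sketched as follows. The picture representation (a dictionary with a "tags" list), the tag-equality matching rule and the function names are assumptions made for this sketch; the description leaves the matching criterion ("preset requirement") open.

```python
from typing import Dict, Iterable, List, Sequence

Picture = Dict[str, object]  # hypothetical record, e.g. {"name": ..., "tags": [...]}


def matches(picture: Picture, term: str) -> bool:
    """Assumed matching rule: the term equals one of the picture's tags."""
    tags = picture.get("tags", [])
    return term.lower() in (str(tag).lower() for tag in tags)


def search_by_priority(term: str, scopes: Sequence[Iterable[Picture]]) -> List[Picture]:
    """Search each range in priority order and stop at the first range with hits."""
    for scope in scopes:
        hits = [picture for picture in scope if matches(picture, term)]
        if hits:
            return hits
    return []


local = [{"name": "a.jpg", "tags": ["Shenzhen"]}]
internet = [{"name": "w.jpg", "tags": ["Shenzhen"]}, {"name": "b.jpg", "tags": ["Beijing"]}]
print(search_by_priority("Shenzhen", [local, internet]))  # local hit, Internet not consulted
```

Passing the ranges as an ordered sequence keeps the priority configurable, in line with the statement that the ranges can be set according to actual needs.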

In the embodiments of the present application, the search operation can be completed by the terminal itself or by the server corresponding to the voice assistant function, which is not limited here. The advantage of completing it on the terminal is that data transmission is reduced and network traffic is saved. Illustratively, when the search is completed by the server, performing the picture search according to the first search term includes: sending the first search term to the corresponding server, the first search term being used to instruct the server to perform the corresponding picture search according to the first search term; and receiving the first search result returned by the server. The advantage of completing it on the server is that the server's abundant computing resources can be used to speed up the search and thus improve the efficiency of feeding back the search result. Optionally, when the search operation is completed by the server, the embodiments of the present application may further include: upon entering the voice assistant function, determining whether the local atlas in the terminal has changed; and if so, updating the current local atlas to the server. The advantage of this arrangement is that the atlas is synchronized in time, which guarantees the accuracy of the search result.
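Illustratively, the check for whether the local atlas has changed can be sketched as follows. The description does not say how a change is detected; this sketch assumes a fingerprint computed over (path, modification time) pairs, and the upload callable is a hypothetical stand-in for whatever transfer mechanism the server offers.

```python
import hashlib
from typing import Callable, Iterable, List, Optional, Tuple

Entry = Tuple[str, float]  # (picture path, modification time); an assumed representation


def atlas_fingerprint(entries: Iterable[Entry]) -> str:
    """Hash the sorted (path, mtime) pairs so any added, removed or edited picture changes it."""
    digest = hashlib.sha256()
    for path, mtime in sorted(entries):
        digest.update(f"{path}|{mtime}\n".encode("utf-8"))
    return digest.hexdigest()


def sync_if_changed(
    entries: Iterable[Entry],
    last_synced: Optional[str],
    upload: Callable[[List[Entry]], None],
) -> str:
    """Upload the atlas only when its fingerprint differs from the last synced one."""
    entry_list = list(entries)
    current = atlas_fingerprint(entry_list)
    if current != last_synced:
        upload(entry_list)  # hypothetical upload hook, supplied by the caller
    return current


uploads: List[List[Entry]] = []
fingerprint = sync_if_changed([("a.jpg", 1.0)], None, uploads.append)
sync_if_changed([("a.jpg", 1.0)], fingerprint, uploads.append)  # unchanged, no second upload
print(len(uploads))
```

The returned fingerprint would be stored by the terminal and passed back in the next time the voice assistant function is entered.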

Illustratively, after the picture search according to the first search term is completed, the pictures found can be fed back to the user as the first search result. The specific form of feedback is not limited in the present application; for example, the pictures can be displayed as thumbnails. Optionally, while the first search result is being fed back, the voice assistant can also give voice feedback, for example by playing the sound "The following pictures were found for you", which improves the human-computer interaction experience.

Step 103: receive second voice information.

In general, the first search result may still contain a large number of pictures. The reason may be that the search condition the user described in natural language is not precise enough, or that the search condition is too loose, so that many pictures qualify. In the related art, after the search result is fed back to the user, the voice assistant function no longer receives voice information in which the user expresses the intention of the current search. The user can only look manually through the first search result for the picture actually wanted, or abandon the current search and start over, so the problem of the search being time-consuming and laborious remains.

In the embodiments of the present application, after the first search result is fed back to the user, the terminal can continue to receive voice information input by the user under the voice assistant function and use it for a subsequent secondary search, reducing the user's manual search operations. For example, the second voice information can be "find the photos taken last year".

Step 104: extract a second search term from the second voice information, perform a picture search within the first search result according to the second search term, and feed back a second search result to the user.

Illustratively, the second search term can be a keyword corresponding to a picture search condition, that is, the vocabulary with which the user expresses a search intention. The second search term may likewise include the picture source, the time at which the picture was generated, the place at which the picture was generated, the content of the picture, and so on.

Continuing the example above, the first search result contains a large number of photos taken in Shenzhen, while the user may need only those taken more recently and therefore says "find the photos taken last year". Here "last year" refers to the time at which the photos were taken and can be used as the second search term. The terminal then searches again among the photos contained in the first search result, selects the photos taken in Shenzhen last year, and feeds them back to the user as the second search result. The specific form of feedback is not limited in the present application; for example, the pictures can be displayed as thumbnails. Optionally, while the second search result is being fed back, the voice assistant can also give voice feedback, for example by playing the sound "The following pictures were filtered for you", which improves the human-computer interaction experience.
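Illustratively, the Shenzhen example can be sketched as a two-stage filter in which the second stage operates only on the first search result rather than on the whole library. The Photo record, its fields and the concrete dates are assumptions made for this sketch; "last year" is resolved to a fixed year purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Photo:
    """Hypothetical photo metadata used only for this sketch."""
    name: str
    place: str
    taken: date


def search_by_place(library: List[Photo], place: str) -> List[Photo]:
    """First search: filter the whole library by shooting place."""
    return [photo for photo in library if photo.place == place]


def refine_by_year(first_result: List[Photo], year: int) -> List[Photo]:
    """Secondary search: filter only the first search result by shooting year."""
    return [photo for photo in first_result if photo.taken.year == year]


library = [
    Photo("a.jpg", "Shenzhen", date(2018, 5, 1)),
    Photo("b.jpg", "Shenzhen", date(2017, 3, 2)),
    Photo("c.jpg", "Beijing", date(2018, 7, 9)),
]
first_result = search_by_place(library, "Shenzhen")   # "help me find photos of Shenzhen"
second_result = refine_by_year(first_result, 2018)    # "find the photos taken last year"
print([photo.name for photo in second_result])
```

Because the second filter never revisits the full library, the Beijing photo from the same year cannot reappear in the second search result.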

The voice-based image searching method provided in the embodiments of the present application receives first voice information, extracts a first search term from the first voice information, performs a picture search according to the first search term, and feeds back a first search result to the user; it then receives second voice information, identifies a second search term in the second voice information, performs a picture search within the first search result according to the second search term, and feeds back a second search result to the user. By adopting the above technical solution, a secondary search over the previous search result can be performed when pictures are searched by voice, improving the precision of the search result.

In some embodiments, receiving the second voice information includes: receiving the second voice information input by the user within a first preset duration after the first search result is fed back to the user. The advantage of this arrangement is that, while the user is allowed to input voice information again, the time for which the voice assistant stays in the listening state is bounded by the first preset duration, which saves power. In general, after seeing the first search result, the user decides within a short time whether another search is needed. Setting the first preset duration prevents the terminal from collecting too much ambient sound and then performing excessive speech recognition and similar operations, thereby saving power. The first preset duration can be set according to the actual situation and is not limited in the embodiments of the present application; for example, it can be 3 seconds or 5 seconds. Optionally, if it is detected that the second voice information has not yet been completely received when the first preset duration is reached, that is, the user is halfway through speaking, the listening state can be ended once it is detected that the second voice information has been completely received.
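Illustratively, the timing behaviour of the first preset duration, including the optional extension for a user who is still speaking when the duration expires, can be sketched as follows. The event format (timestamps measured from the moment the first search result is fed back, with "speech_start" and "speech_end" markers) is an assumption of this sketch; a real terminal would obtain such events from its voice activity detection.

```python
from typing import Iterable, Optional, Tuple

Event = Tuple[float, str]  # (seconds since the first result was shown, "speech_start" | "speech_end")


def listen_window(events: Iterable[Event], window_s: float) -> Optional[Tuple[float, float]]:
    """Return (start, end) of the second utterance, or None if none is accepted.

    Speech must start within window_s seconds; if the user is still speaking
    when the window expires, listening continues until the utterance ends.
    """
    started_at: Optional[float] = None
    for timestamp, kind in events:
        if kind == "speech_start" and started_at is None:
            if timestamp > window_s:
                return None  # window elapsed before the user started speaking
            started_at = timestamp
        elif kind == "speech_end" and started_at is not None:
            return (started_at, timestamp)
    return None


# The user starts at 2.5 s and finishes at 4.0 s with a 3-second window:
print(listen_window([(2.5, "speech_start"), (4.0, "speech_end")], 3.0))
```

The sketch only decides which utterance is accepted; leaving the listening state and turning off the sound collection component are outside its scope.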

In some embodiments, receiving the second voice information includes: displaying a first identifier within a second preset duration after the first search result is fed back to the user; and, when the first identifier is triggered, entering a voice information acquisition state and receiving the second voice information. The advantage of this arrangement is that, after the first search result is fed back to the user, a trigger identifier for entering the listening state, namely the first identifier, is provided, so that the user can start the secondary search process by triggering the first identifier. The advantage of setting the second preset duration is that the first identifier is displayed only for a limited time and does not occupy the display area of the screen for long. Optionally, the first identifier can be displayed as a floating ball, the floating ball can be translucent, and it can be triggered, for example, by a tap.

In some embodiments, before receiving the second voice information, the method further includes: obtaining the number of pictures in the first search result; and determining whether the number of pictures is greater than a preset quantity threshold and, if so, triggering the receiving of the second voice information. The advantage of this arrangement is that whether the secondary search process needs to be entered is decided automatically from the number of pictures in the first search result. The preset quantity threshold can be set according to actual needs; for example, it can be 5.

In some embodiments, the search dimensions contained in the first search term and the second search term include at least one of: atlas name, time, festival, place, person, scene, picture text and picture type. The advantage of this arrangement is that the search dimensions of the search terms are enriched and search precision is improved.

In some embodiments, the first search term and the second search term each include at least two search dimensions. The advantage of this arrangement is that the user can search along multiple dimensions at once, which improves search efficiency. Continuing the example above, the first voice information can be "help me find the photos taken in Shenzhen last year", so that the first search term includes "last year" and "Shenzhen", that is, the two search dimensions of time and place. If the number of pictures in the first search result is still relatively large, the user can input second voice information such as "find my daytime selfies", so that the second search term includes "daytime", "me" and "selfie", that is, the three search dimensions of time, person and atlas name, and the picture the user needs is found precisely.
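Illustratively, search terms covering several dimensions can be modelled as a mapping from dimension to required value, with a picture matching only if it satisfies every dimension. The metadata field names ("year", "place", "period", "person", "atlas") and the exact-equality matching rule are assumptions made for this sketch.

```python
from typing import Dict, List

Picture = Dict[str, object]   # hypothetical per-picture metadata
SearchTerms = Dict[str, object]  # dimension name -> required value


def search(pictures: List[Picture], terms: SearchTerms) -> List[Picture]:
    """Keep the pictures whose metadata equals the required value in every dimension."""
    return [
        picture
        for picture in pictures
        if all(picture.get(dimension) == value for dimension, value in terms.items())
    ]


pictures = [
    {"name": "1.jpg", "year": 2018, "place": "Shenzhen", "period": "daytime", "person": "me", "atlas": "selfie"},
    {"name": "2.jpg", "year": 2018, "place": "Shenzhen", "period": "night", "person": "me", "atlas": "selfie"},
    {"name": "3.jpg", "year": 2018, "place": "Shenzhen", "period": "daytime", "person": "friend", "atlas": "camera"},
    {"name": "4.jpg", "year": 2017, "place": "Shenzhen", "period": "daytime", "person": "me", "atlas": "selfie"},
]

# "help me find the photos taken in Shenzhen last year": time and place
first_result = search(pictures, {"year": 2018, "place": "Shenzhen"})
# "find my daytime selfies": time, person and atlas name, applied to the first result only
second_result = search(first_result, {"period": "daytime", "person": "me", "atlas": "selfie"})
print([picture["name"] for picture in second_result])
```

Using the same function for both rounds reflects that the two search terms draw on the same set of dimensions; only the range being searched differs.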

In some embodiments, extracting the first search term from the first voice information, performing a picture search according to the first search term and feeding back the first search result to the user includes: extracting the first search term from the first voice information, and identifying the tone information corresponding to each of the at least two search dimensions contained in the first search term; taking the pictures that match the at least two search dimensions contained in the first search term as a candidate atlas; sorting the pictures in the candidate atlas according to the tone information to obtain the first search result; and feeding back the first search result to the user. The advantage of this arrangement is that the degree of emphasis the user places on the different search dimensions can be determined from the tone information, and the pictures can then be sorted by that emphasis, so that the pictures that best fit the user's actual search intention are placed first.

Fig. 2 is a schematic flowchart of another voice-based image searching method provided by an embodiment of the present application. The method includes the following steps:

Step 201: under the voice assistant function, receive first voice information input by the user.

Step 202: extract a first search term from the first voice information, perform a picture search according to the first search term, and feed back a first search result to the user.

The first search term may include at least two search dimensions. The search dimensions may include at least one of: atlas name, time, festival, place, person, scene, picture text and picture type. The atlas name may refer, for example, to system atlases such as camera, selfie, people, video, screenshot and Bluetooth, and may also refer to user-defined atlases and third-party atlases (such as the atlas of a third-party application); the atlas name can be the same as the album name in the photo album. The time may be a specific moment or a period of time. The festival may be, for example, the Spring Festival, Valentine's Day, Christmas or National Day. The scene may be the scene in which the picture was shot, which can be tagged at shooting time or recognized from the picture at search time. The picture text may be text contained in the picture, for example text present in the shot scene such as a shop sign, or text added to the picture by later image processing. The picture type may include, for example, static pictures and dynamic pictures, and may also refer to the format of the picture, such as gif or jpg.

Illustratively, the pictures contained in the first search result can be displayed on the display screen of the terminal. Optionally, the local photo album of the terminal can be searched first. If there is exactly one result, the terminal can jump to the photo album and show the details of that result; if there are multiple results, the terminal can jump to a search result selection page. If there is no result in the local photo album, a network search can be performed and the terminal then jumps to the browser's picture search result page, while voice feedback is given, for example by playing the sound "The following pictures were found for you".

Step 203: obtain the number of pictures in the first search result.

Step 204: determine whether the number of pictures is greater than a preset quantity threshold; if so, perform step 205; otherwise, end the process.

Illustratively, when the first search result contains many pictures, it is inconvenient for the user to look through them manually, so the need to enter the secondary search process is relatively strong. When the first search result contains few pictures, the user can probably find the needed picture quickly, so entering the secondary search process may be unnecessary.

Step 205: within a first preset duration after the first search result is fed back to the user, receive second voice information input by the user.

Illustratively, when the number of pictures in the first search result is greater than the preset quantity threshold, the user is likely to perform a secondary search after seeing the first search result and generally decides this within a short time. The user can then input the second voice information directly, which improves search efficiency.

Step 206: extract a second search term from the second voice information, perform a picture search within the first search result according to the second search term, and feed back a second search result to the user.

Illustratively, the second search term also includes at least two search dimensions. The search dimensions may include at least one of: atlas name, time, festival, place, person, scene, picture text and picture type.

The voice-based image searching method provided by this embodiment of the present application receives first voice information under the voice assistant function, extracts from it a first search term containing multiple search dimensions, performs a picture search according to the first search term, and feeds back the first search result to the user. When the search result contains many pictures, second voice information input by the user is received within a specified duration, a secondary picture search is performed within the first search result according to the second search term extracted from it, and the second search result is fed back to the user. Search efficiency can thus be improved while the precision of the search result is improved.

On the basis of the above embodiment, step 202 may specifically include: extracting the first search term from the first voice information, and identifying the tone information corresponding to each of the at least two search dimensions contained in the first search term; taking the pictures that match the at least two search dimensions contained in the first search term as a candidate atlas; sorting the pictures in the candidate atlas according to the tone information to obtain the first search result; and feeding back the first search result to the user. When speaking, a user often stresses the information that he or she cares about or considers important. The terminal can identify from the first voice information the tone information corresponding to each search dimension and thereby determine which search dimensions the user values more. Illustratively, the order of the pictures can be determined by how heavily each dimension is stressed, as indicated by the tone information; for example, the pictures can be sorted in order of tone from heavy to light. Specifically, after the pictures that match the at least two search dimensions contained in the first search term are taken as the candidate atlas, the method further includes: recording, for each picture in the candidate atlas, the matching degree values corresponding to the at least two search dimensions. Correspondingly, sorting the pictures in the candidate atlas according to the tone information may include: sorting the at least two search dimensions in order of tone from heavy to light to obtain a dimension ordering; for each picture in the candidate atlas, sorting the at least two matching degree values of the current picture to obtain a matching degree ordering, and determining the degree of consistency between the matching degree ordering and the dimension ordering; and sorting the pictures in the candidate atlas in descending order of the degree of consistency to obtain the first search result.

For example, suppose the first search term contains three search dimensions A, B and C, and when the user inputs the first voice information the tone of the search terms corresponding to the three dimensions, from heavy to light, is B, C and A. Then, when sorting, the pictures whose matching degrees decrease in the order B, C, A are placed first. In this way, when the first search result is displayed, the user first sees the pictures that match the search dimensions he or she cares about most.
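Illustratively, the tone-based sorting can be sketched as follows. The description does not define how the degree of consistency between the matching degree ordering and the dimension ordering is computed; this sketch assumes one simple measure, the number of dimension pairs that both orderings rank the same way (a pairwise, Kendall-style count). The emphasis weights, score values and function names are likewise assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List, Sequence


def order_consistency(dimension_order: Sequence[str], match_scores: Dict[str, float]) -> int:
    """Count dimension pairs where the more stressed dimension also has the higher matching degree."""
    concordant = 0
    # combinations() keeps input order, so `heavier` always precedes `lighter` in the dimension ordering.
    for heavier, lighter in combinations(dimension_order, 2):
        if match_scores[heavier] > match_scores[lighter]:
            concordant += 1
    return concordant


def rank_by_tone(candidates: List[dict], emphasis: Dict[str, float]) -> List[dict]:
    """Sort candidates in descending order of consistency with the tone-based dimension ordering."""
    dimension_order = sorted(emphasis, key=emphasis.get, reverse=True)  # heavy to light
    return sorted(
        candidates,
        key=lambda picture: order_consistency(dimension_order, picture["scores"]),
        reverse=True,
    )


# Tone from heavy to light is B, C, A, as in the example above (weights are made up).
emphasis = {"A": 0.2, "B": 0.9, "C": 0.5}
candidates = [
    {"name": "p1", "scores": {"A": 0.9, "B": 0.5, "C": 0.7}},
    {"name": "p2", "scores": {"A": 0.3, "B": 0.9, "C": 0.6}},
    {"name": "p3", "scores": {"A": 0.8, "B": 0.9, "C": 0.6}},
]
print([picture["name"] for picture in rank_by_tone(candidates, emphasis)])
```

Other consistency measures, such as a rank correlation coefficient, would fit the description equally well; the pairwise count is chosen here only because it is easy to verify by hand.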

Fig. 3 is a schematic flowchart of yet another voice-based image searching method provided by an embodiment of the present application. The method includes:

Step 301: detect that the voice assistant function is triggered.

Step 302: enter the voice assistant function and, when it is determined that the local atlas in the terminal has changed, update the current local atlas to the server corresponding to the voice assistant.

Illustratively, the first picture search operation in the present application can be completed by the server, which can improve search efficiency. To guarantee the accuracy of the search range, the local atlas can be updated and synchronized when the voice assistant function starts.

Step 303: receive first voice information input by the user.

Step 304: when the first voice information contains a trigger word corresponding to a picture search event, extract a first search term from the first voice information, and identify the tone information corresponding to each of the at least two search dimensions contained in the first search term.

Wherein, described search dimension can include: atlas title, the time, red-letter day, place, personage, scene, picture character and At least one of picture type.

Step 305: send the first search term to the corresponding server, and instruct the server to perform the corresponding picture search according to the first search term, take the pictures that match the at least two search dimensions contained in the first search term as a candidate picture set, and sort the pictures in the candidate picture set according to the tone information to obtain the first search result; then receive the first search result returned by the server.

Step 306: feed the first search result back to the user.

Step 307: judge whether the number of pictures in the first search result is greater than a preset quantity threshold; if so, execute step 308; otherwise, end the process.

Step 308: within a second preset duration after the first search result is fed back to the user, display a floating-ball identifier.

Step 309: when the floating-ball identifier is triggered, enter the voice information acquisition state and receive the second voice information input by the user.

Step 310: extract the second search term from the second voice information, perform a picture search within the first search result according to the second search term, and feed the second search result back to the user.

In the embodiment of the present application, the secondary search may be completed by the terminal itself. Because the search range has already been narrowed by the first search, this reduces data transmission to and from the server and saves traffic.

Optionally, this step may specifically include: extracting the second search term from the second voice information, and identifying the tone information corresponding to each of at least two search dimensions contained in the second search term; taking the pictures in the first search result that match the at least two search dimensions contained in the second search term as a second candidate picture set; sorting the pictures in the second candidate picture set according to the tone information to obtain the second search result; and feeding the second search result back to the user. The advantage of this arrangement is that the pictures in the second search result can be sorted on the basis of the tone information, so that the pictures that better match the user's actual search intention are placed first.

In the voice-based image searching method provided by the embodiment of the present application, the local album is synchronized to the corresponding server when the voice assistant function is entered. For the first search, the terminal extracts the first search term from the first voice information and sends it to the server, which completes the first picture search quickly, and the search result is sorted for display according to the tone information corresponding to the different search dimensions. When the first search result contains a large number of pictures, the floating-ball identifier is displayed to let the user input the second voice information; the terminal then searches within the first search result according to the second search term extracted from the second voice information and feeds the second search result back to the user. This improves the precision of the search result and, while reducing data interaction, further improves search efficiency.
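
As a rough illustration of the flow summarized above, the following Python sketch separates the first search (performed by the server) from the secondary search (performed on the terminal within the first search result). It is a sketch under stated assumptions, not the implementation described in the application: pictures are modelled as dictionaries of tags, matching is exact equality per search dimension, and the names `server_search`, `local_refine`, `should_offer_refinement` and `PRESET_COUNT_THRESHOLD` are hypothetical.

```python
PRESET_COUNT_THRESHOLD = 3  # hypothetical preset quantity threshold

def matches(picture, term):
    """A picture matches when every dimension in the term equals its tag."""
    return all(picture["tags"].get(dim) == value for dim, value in term.items())

def server_search(album, first_term):
    """Stand-in for the first search, run by the server over the synced album."""
    return [p for p in album if matches(p, first_term)]

def should_offer_refinement(first_result, threshold=PRESET_COUNT_THRESHOLD):
    """Offer a secondary search only when the first result exceeds the threshold."""
    return len(first_result) > threshold

def local_refine(first_result, second_term):
    """Secondary search, run on the terminal and restricted to the first result."""
    return [p for p in first_result if matches(p, second_term)]

# Hypothetical album data.
album = [
    {"id": 1, "tags": {"place": "Beijing", "person": "Mom"}},
    {"id": 2, "tags": {"place": "Beijing", "person": "Dad"}},
    {"id": 3, "tags": {"place": "Beijing", "person": "Mom"}},
    {"id": 4, "tags": {"place": "Shanghai", "person": "Mom"}},
    {"id": 5, "tags": {"place": "Beijing", "person": "Dad"}},
]

first_result = server_search(album, {"place": "Beijing"})
second_result = first_result
if should_offer_refinement(first_result):
    second_result = local_refine(first_result, {"person": "Mom"})
print([p["id"] for p in second_result])
```

Because `local_refine` only reads the list already held by the terminal, no further request to the server is needed for the secondary search, which is the traffic saving the text refers to.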

Fig. 4 is a structural block diagram of a voice-based picture search device provided by an embodiment of the present application. The device may be implemented in software and/or hardware, is typically integrated in a terminal, and can perform picture search by executing the voice-based image searching method. As shown in Fig. 4, the device includes:

A first voice receiving module 401, configured to receive first voice information;

A first search module 402, configured to extract a first search term from the first voice information, perform a picture search according to the first search term, and feed a first search result back to the user;

A second voice receiving module 403, configured to receive second voice information;

A second search module 404, configured to extract a second search term from the second voice information, perform a picture search within the first search result according to the second search term, and feed a second search result back to the user.

The voice-based picture search device provided in the embodiment of the present application receives first voice information, extracts a first search term from the first voice information, performs a picture search according to the first search term, and feeds the first search result back to the user; it then receives second voice information, extracts a second search term from the second voice information, performs a picture search within the first search result according to the second search term, and feeds the second search result back to the user. With this technical solution, a secondary search directed at the previous search result can be performed when pictures are searched by voice, which improves the precision of the search result.

Optionally, receiving the second voice information includes:

Receiving the second voice information within a first preset duration after the first search result is fed back to the user.

Optionally, receiving the second voice information includes:

Displaying a first identifier within a second preset duration after the first search result is fed back to the user;

When the first identifier is triggered, entering the voice information acquisition state and receiving the second voice information.
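
The two optional ways of receiving the second voice information can be sketched as simple time-window checks. This is only an illustrative sketch: the class name `SecondVoiceGate`, its method names and the duration values are hypothetical, and times are plain numbers in seconds rather than readings from a real clock.

```python
from dataclasses import dataclass

@dataclass
class SecondVoiceGate:
    feedback_time: float                   # when the first search result was fed back
    first_preset_duration: float = 10.0    # hypothetical value, in seconds
    second_preset_duration: float = 30.0   # hypothetical value, in seconds
    listening: bool = False

    def accepts_direct_voice(self, now):
        """Option 1: accept second voice input within the first preset duration."""
        return 0 <= now - self.feedback_time <= self.first_preset_duration

    def identifier_visible(self, now):
        """Option 2: show the first identifier within the second preset duration."""
        return 0 <= now - self.feedback_time <= self.second_preset_duration

    def trigger_identifier(self, now):
        """Enter the voice acquisition state if the identifier is still shown."""
        if self.identifier_visible(now):
            self.listening = True
        return self.listening

gate = SecondVoiceGate(feedback_time=100.0)
print(gate.accepts_direct_voice(105.0), gate.identifier_visible(125.0))
```

Keeping the two durations as separate fields reflects that the text names a first and a second preset duration without stating how they relate to each other.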

Optionally, the device may further include:

A quantity obtaining module, configured to obtain the number of pictures in the first search result before the second voice information is received;

A quantity judging module, configured to judge whether the number of pictures is greater than a preset quantity threshold and, if so, to trigger receiving of the second voice information.

Optionally, the search dimensions contained in the first search term and the second search term include at least one of: album name, time, holiday, place, person, scene, picture text and picture type.

Optionally, the first search term and the second search term each contain at least two search dimensions.
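
One possible way to represent a search term with the eight search dimensions listed above is a small record type, sketched below in Python. The class `SearchTerm`, its field names, the helper `is_picture_search` and the `TRIGGER_WORDS` tuple are assumptions made for illustration; the text does not describe how a search term or a trigger word is represented, and real extraction from speech would require speech recognition that is not shown here.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class SearchTerm:
    # One optional field per search dimension named in the text.
    album_name: Optional[str] = None
    time: Optional[str] = None
    holiday: Optional[str] = None
    place: Optional[str] = None
    person: Optional[str] = None
    scene: Optional[str] = None
    picture_text: Optional[str] = None
    picture_type: Optional[str] = None

    def dimensions(self):
        """Return only the search dimensions that were actually extracted."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }

    def has_multiple_dimensions(self):
        """True when the term contains at least two search dimensions."""
        return len(self.dimensions()) >= 2

# Hypothetical trigger words for a picture search event.
TRIGGER_WORDS = ("find pictures", "search pictures")

def is_picture_search(recognized_text):
    """Check whether recognized speech contains a picture search trigger word."""
    lowered = recognized_text.lower()
    return any(word in lowered for word in TRIGGER_WORDS)

term = SearchTerm(place="Beijing", person="Mom")
print(term.dimensions(), term.has_multiple_dimensions())
```

Tone-based sorting only makes sense when `has_multiple_dimensions()` is true, since a single dimension has nothing to be ordered against.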

Optionally, extracting the first search term from the first voice information, performing a picture search according to the first search term, and feeding the first search result back to the user includes:

Extracting the first search term from the first voice information, and identifying the tone information corresponding to each of at least two search dimensions contained in the first search term;

Taking the pictures that match the at least two search dimensions contained in the first search term as a candidate picture set;

Sorting the pictures in the candidate picture set according to the tone information to obtain the first search result;

Feeding the first search result back to the user.

An embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform a voice-based image searching method, the method including:

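Receiving first voice information;

Extracting a first search term from the first voice information, performing a picture search according to the first search term, and feeding a first search result back to the user;

Receiving second voice information;

Extracting a second search term from the second voice information, performing a picture search within the first search result according to the second search term, and feeding a second search result back to the user.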

Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media, such as CD-ROM, floppy disk or tape devices; computer system memory or random access memory, such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory, such as flash memory or magnetic media (e.g., a hard disk or optical storage); registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or may be located in a different, second computer system that is connected to the first computer system through a network (such as the Internet). The second computer system may provide program instructions to the first computer system for execution. The term "storage medium" may include two or more storage media that reside in different locations (e.g., in different computer systems connected through a network). The storage medium may store program instructions (e.g., embodied as a computer program) executable by one or more processors.

Of course, in the storage medium containing computer-executable instructions provided by the embodiment of the present application, the computer-executable instructions are not limited to the voice-based picture search operations described above, and may also perform related operations in the voice-based image searching method provided by any embodiment of the present application.

An embodiment of the present application provides a terminal in which the voice-based picture search device provided by the embodiment of the present application may be integrated. Fig. 5 is a schematic structural diagram of a terminal provided by an embodiment of the present application. The terminal 500 may include: a memory 501, a processor 502, and a computer program stored in the memory 501 and executable on the processor 502. The processor 502 implements the voice-based image searching method described in the embodiment of the present application when executing the computer program.

The terminal provided by the embodiment of the present application can perform a secondary search directed at the previous search result when pictures are searched by voice, which improves the precision of the search result.

Fig. 6 is a schematic structural diagram of another terminal provided by an embodiment of the present application. The terminal may include: a housing (not shown), a memory 601, a central processing unit (CPU) 602 (also called the processor, hereinafter CPU), a circuit board (not shown) and a power circuit (not shown). The circuit board is placed inside the space enclosed by the housing; the CPU 602 and the memory 601 are arranged on the circuit board; the power circuit supplies power to each circuit or device of the terminal; the memory 601 stores executable program code; and the CPU 602 runs a computer program corresponding to the executable program code by reading the executable program code stored in the memory 601, so as to perform the following steps:

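Receiving first voice information;

Extracting a first search term from the first voice information, performing a picture search according to the first search term, and feeding a first search result back to the user;

Receiving second voice information;

Extracting a second search term from the second voice information, performing a picture search within the first search result according to the second search term, and feeding a second search result back to the user.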

The terminal further includes: a peripheral interface 603, an RF (Radio Frequency) circuit 605, an audio circuit 606, a loudspeaker 611, a power management chip 608, an input/output (I/O) subsystem 609, a touch screen 612, other input/control devices 610 and an external port 604. These components communicate through one or more communication buses or signal lines 607.

It should be understood that the illustrated terminal 600 is only one example of a terminal, and that the terminal 600 may have more or fewer components than shown in the figure, may combine two or more components, or may have a different configuration of components. The various components shown in the figure may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application-specific integrated circuits.

The terminal for picture search provided in this embodiment is described in detail below, taking a mobile phone as an example.

Memory 601: the memory 601 can be accessed by the CPU 602, the peripheral interface 603 and the like. The memory 601 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other volatile solid-state storage devices.

Peripheral interface 603: the peripheral interface 603 can connect the input and output peripherals of the device to the CPU 602 and the memory 601.

I/O subsystem 609: the I/O subsystem 609 can connect the input/output peripherals of the device, such as the touch screen 612 and the other input/control devices 610, to the peripheral interface 603. The I/O subsystem 609 may include a display controller 6091 and one or more input controllers 6092 for controlling the other input/control devices 610. The one or more input controllers 6092 receive electrical signals from, or send electrical signals to, the other input/control devices 610, which may include physical buttons (push buttons, rocker buttons, etc.), dials, slide switches, joysticks and click wheels. It is worth noting that an input controller 6092 may be connected to any of the following: a keyboard, an infrared port, a USB interface, and a pointing device such as a mouse.

Touch screen 612: the touch screen 612 is the input interface and output interface between the user terminal and the user, and displays visual output to the user. The visual output may include graphics, text, icons, video and the like.

The display controller 6091 in the I/O subsystem 609 receives electrical signals from, or sends electrical signals to, the touch screen 612. The touch screen 612 detects contact on the touch screen, and the display controller 6091 converts the detected contact into interaction with a user interface object displayed on the touch screen 612, thereby realizing human-computer interaction. The user interface object displayed on the touch screen 612 may be an icon for running a game, an icon for connecting to a corresponding network, and so on. It is worth noting that the device may also include an optical mouse, which is a touch-sensitive surface that does not display visual output, or an extension of the touch-sensitive surface formed by the touch screen.

The RF circuit 605 is mainly used to establish communication between the mobile phone and the wireless network (i.e., the network side), so as to receive and send data between the mobile phone and the wireless network, for example to send and receive short messages and e-mails. Specifically, the RF circuit 605 receives and sends RF signals, which are also called electromagnetic signals; the RF circuit 605 converts electrical signals into electromagnetic signals or electromagnetic signals into electrical signals, and communicates with the communication network and other devices through the electromagnetic signals. The RF circuit 605 may include known circuits for performing these functions, including but not limited to an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC (COder-DECoder) chipset, a Subscriber Identity Module (SIM), and so on.

The audio circuit 606 is mainly used to receive audio data from the peripheral interface 603, convert the audio data into an electrical signal, and send the electrical signal to the loudspeaker 611.

The loudspeaker 611 is used to restore the voice signal that the mobile phone receives from the wireless network through the RF circuit 605 into sound and to play the sound to the user.

The power management chip 608 is used to supply power to, and manage power for, the hardware connected to the CPU 602, the I/O subsystem and the peripheral interface.

The voice-based picture search device, storage medium and terminal provided in the above embodiments can execute the voice-based image searching method provided by any embodiment of the present application, and have the functional modules and beneficial effects corresponding to executing that method. For technical details not described in detail in the above embodiments, reference may be made to the voice-based image searching method provided by any embodiment of the present application.

Note that the above are only the preferred embodiments of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made without departing from the protection scope of the present application. Therefore, although the present application has been described in further detail through the above embodiments, it is not limited to them; without departing from the concept of the present application, it may include other equivalent embodiments, and its scope is determined by the scope of the appended claims.

Claims

1. A voice-based image searching method, characterized by comprising:

receiving first voice information;

extracting a first search term from the first voice information, performing a picture search according to the first search term, and feeding a first search result back to the user;

receiving second voice information;

extracting a second search term from the second voice information, performing a picture search within the first search result according to the second search term, and feeding a second search result back to the user.

2. The method according to claim 1, characterized in that receiving the second voice information comprises:

receiving the second voice information within a first preset duration after the first search result is fed back to the user.

3. The method according to claim 1, characterized in that receiving the second voice information comprises:

displaying a first identifier within a second preset duration after the first search result is fed back to the user;

when the first identifier is triggered, entering the voice information acquisition state and receiving the second voice information.

4. The method according to claim 1, characterized in that, before receiving the second voice information, the method further comprises:

obtaining the number of pictures in the first search result;

judging whether the number of pictures is greater than a preset quantity threshold and, if so, triggering the receiving of the second voice information.

5. The method according to any one of claims 1 to 4, characterized in that the search dimensions contained in the first search term and the second search term include at least one of: album name, time, holiday, place, person, scene, picture text and picture type.

6. The method according to any one of claims 1 to 4, characterized in that the first search term and the second search term each contain at least two search dimensions.

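7. The method according to claim 6, characterized in that extracting the first search term from the first voice information, performing a picture search according to the first search term, and feeding the first search result back to the user comprises:

extracting the first search term from the first voice information, and identifying the tone information corresponding to each of at least two search dimensions contained in the first search term;

taking the pictures that match the at least two search dimensions contained in the first search term as a candidate picture set;

sorting the pictures in the candidate picture set according to the tone information to obtain the first search result;

feeding the first search result back to the user.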

8. A voice-based picture search device, characterized by comprising:

a first voice receiving module, configured to receive first voice information;

a first search module, configured to extract a first search term from the first voice information, perform a picture search according to the first search term, and feed a first search result back to the user;

a second voice receiving module, configured to receive second voice information;

a second search module, configured to extract a second search term from the second voice information, perform a picture search within the first search result according to the second search term, and feed a second search result back to the user.

9. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the voice-based image searching method according to any one of claims 1-7.

10. A terminal, characterized by comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the voice-based image searching method according to claim 1.