CN108989551B - Position prompting method and device, storage medium and electronic equipment - Google Patents

Info

Publication number
CN108989551B
CN108989551B
Authority
CN
China
Prior art keywords
real-time image
voice signal
electronic device
instruction
Prior art date
Legal status
Expired - Fee Related
Application number
CN201810682024.8A
Other languages
Chinese (zh)
Other versions
CN108989551A (en)
Inventor
许钊铵
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201810682024.8A
Publication of CN108989551A
Application granted
Publication of CN108989551B

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/72Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72448User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
    • H04M1/72457User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to geographic location
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/72Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72448User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
    • H04M1/72454User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M2250/00Details of telephonic subscriber devices
    • H04M2250/74Details of telephonic subscriber devices with voice recognition means

Abstract

Embodiments of the present application disclose a position prompting method, a position prompting apparatus, a storage medium, and an electronic device. The electronic device detects whether its position has currently changed and, if so, acquires a real-time image of the current position. Object recognition is performed on the acquired real-time image to obtain object information. A voice signal in the external environment is collected, and the instruction to be executed contained in the voice signal is obtained. When the instruction to be executed is an instruction that triggers a position prompt, position prompt information is generated from the object information and output by voice. In this way, when the user cannot find the electronic device, the device prompts the user with the object information of the objects at its current position, helping the user recall where the device was last used, thereby better guiding the user to the device and improving the probability that it is found.

Description

Position prompting method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of electronic device technologies, and in particular, to a position prompting method and apparatus, a storage medium, and an electronic device.
Background
At present, with the development of technology, the modes of human-machine interaction have become increasingly rich. In the related art, a user can control an electronic device such as a mobile phone or a tablet computer by voice; that is, after receiving a voice signal from the user, the electronic device executes a corresponding operation according to the voice signal. For example, when the user cannot find the electronic device, the device may perform a location prompt by ringing in response to the user's voice signal to guide the user to it, but not every user can locate the device by listening for the sound.
Disclosure of Invention
Embodiments of the present application provide a position prompting method and apparatus, a storage medium, and an electronic device, which can improve the probability of finding the electronic device.
In a first aspect, an embodiment of the present application provides a position prompting method, where the position prompting method includes:
detecting whether the position changes at present, and if so, acquiring a real-time image of the current position;
carrying out object recognition on the real-time image to obtain object information;
acquiring a voice signal in an external environment, and acquiring a command to be executed included in the voice signal;
and if the instruction to be executed is an instruction for triggering position prompt, generating position prompt information according to the object information, and outputting the position prompt information in a voice mode.
In a second aspect, an embodiment of the present application provides a position prompting apparatus, including:
the image acquisition module is used for detecting whether the position changes at present or not, and acquiring a real-time image of the current position if the position changes at present;
the object identification module is used for carrying out object identification on the real-time image to obtain object information;
the instruction acquisition module is used for acquiring voice signals in an external environment and acquiring instructions to be executed included in the voice signals;
and the position prompt module is used for generating position prompt information according to the object information and outputting the position prompt information in a voice mode when the instruction to be executed is an instruction for triggering position prompt.
In a third aspect, the present application provides a storage medium, on which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the steps in the position prompting method provided by the embodiment of the present application.
In a fourth aspect, an embodiment of the present application provides an electronic device, which includes a processor and a memory, where the memory stores a computer program, and the processor is configured to execute, by calling the computer program, the steps in the position prompting method provided in the embodiments of the present application.
The electronic device in the embodiments of the present application can detect whether its position has currently changed and, if so, acquire a real-time image of the current position. Object recognition is performed on the acquired real-time image to obtain object information. A voice signal in the external environment is collected, and the instruction to be executed contained in it is obtained. When the instruction to be executed is an instruction that triggers a position prompt, position prompt information is generated from the object information and output by voice. In this way, when the user cannot find the electronic device, the device prompts the user with the object information of the objects at its current position, helping the user recall where the device was last used, thereby better guiding the user to the device and improving the probability that it is found.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a position prompting method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of a location of an electronic device in an embodiment of the present application.
Fig. 3 is a schematic diagram of an electronic device outputting position prompt information in an embodiment of the present application.
Fig. 4 is another schematic flow chart of the position prompting method according to an embodiment of the present application.
Fig. 5 is another schematic diagram of a location of an electronic device in an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a position prompting apparatus according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Fig. 8 is another schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Referring to the drawings, wherein like reference numbers refer to like elements, the principles of the present application are illustrated as being implemented in a suitable computing environment. The following description is based on illustrated embodiments of the application and should not be taken as limiting the application with respect to other embodiments that are not detailed herein.
In the description that follows, specific embodiments of the present application are described with reference to steps and symbols executed by one or more computers, unless indicated otherwise. These steps and operations are therefore at times described as computer-executed: the computer's processing unit manipulates electronic signals that represent data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, reconfiguring or otherwise altering the computer's operation in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical memory locations with particular properties defined by the data format. However, while the principles of the application are described in the foregoing terms, this is not meant to be limiting; those of ordinary skill in the art will recognize that various of the steps and operations described below may also be implemented in hardware.
The term module, as used herein, may be considered a software object executing on the computing system. The various components, modules, engines, and services described herein may be viewed as objects implemented on the computing system. The apparatus and method described herein may be implemented in software, but may also be implemented in hardware, and are within the scope of the present application.
The terms "first", "second", "third", and the like in this application are used to distinguish between different objects, not to describe a particular order. Furthermore, the terms "include" and "have", as well as any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to the steps or modules listed, but may include other steps or modules not listed or inherent to such a process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The execution subject of the position prompting method may be the position prompting apparatus provided in the embodiments of the present application, or an electronic device integrating that apparatus, where the position prompting apparatus may be implemented in hardware or in software. The electronic device may be a smartphone, a tablet computer, a palmtop computer, a notebook computer, or a desktop computer.
Referring to fig. 1, fig. 1 is a schematic flow chart of a position prompting method according to an embodiment of the present application. As shown in fig. 1, the flow of the position prompting method provided in the embodiment of the present application may be as follows:
101. Detect whether the position has currently changed and, if so, acquire a real-time image of the current position.
In this embodiment of the application, the electronic device can acquire position information in real time and determine from it whether its position has currently changed. For example, the electronic device may compare the currently acquired position information with the previously acquired position information; if the two differ, it determines that a position change has occurred. As another example, the electronic device may calculate the distance between the currently acquired position and the previously acquired position and judge whether that distance exceeds a preset distance value (a person skilled in the art may set this to any appropriate value, which this embodiment does not specifically limit; for example, it may be set to 1 meter, so that a position change is registered once the device moves more than one meter); if so, it determines that a position change has occurred. Upon detecting a position change, the electronic device captures a real-time image of the current position with its camera.
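The distance-threshold variant of this check can be sketched as follows. This is a minimal illustration, not the patent's implementation: the coordinate format, the helper names, and the use of the 1-meter example value are assumptions.

```python
import math

# Preset distance value; 1 meter is the example given in the text.
PRESET_DISTANCE_M = 1.0

def distance_m(pos_a, pos_b):
    """Euclidean distance between two (x, y) positions, in meters."""
    return math.hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1])

def position_changed(current_pos, previous_pos, threshold=PRESET_DISTANCE_M):
    """True when the device has moved farther than the preset distance."""
    return distance_m(current_pos, previous_pos) > threshold
```

With this sketch, moving 1.5 m from the previous position registers a change, while moving 0.5 m does not.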
When acquiring the position information, the electronic device first determines whether it is currently in an outdoor or an indoor environment. For example, it may decide based on the strength of the received satellite positioning signal: if the strength is below a preset threshold, the device is judged to be indoors; if the strength is at or above the threshold, it is judged to be outdoors. Accordingly, the electronic device may use satellite positioning to acquire the position information when outdoors, and an indoor positioning technology when indoors.
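The indoor/outdoor decision described above amounts to a single threshold comparison. In the sketch below the threshold value and the dBm units are assumptions for illustration; the patent only specifies "a preset threshold".

```python
# Assumed example threshold; the text does not fix a value or unit.
PRESET_SIGNAL_THRESHOLD_DBM = -140.0

def classify_environment(signal_dbm, threshold=PRESET_SIGNAL_THRESHOLD_DBM):
    """Below the threshold the satellite signal is too weak: assume indoors."""
    return "indoor" if signal_dbm < threshold else "outdoor"

def positioning_technology(environment):
    """Satellite positioning outdoors; an indoor positioning technology indoors."""
    return "satellite" if environment == "outdoor" else "indoor"
```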
102. Perform object recognition on the acquired real-time image to obtain object information.
After acquiring the real-time image of the current position, the electronic device may perform object recognition on it using a preset object recognition technique to obtain object information. It should be noted that this embodiment does not specifically limit which object recognition technique is used as the preset technique; a person skilled in the art may choose one according to actual needs. For example, the electronic device may recognize the real-time image with a pre-trained image semantic segmentation model to obtain object information for the objects present in it.
The object information may include the category, the color, or any other information that describes the characteristics of an object. For example, referring to fig. 2, a user carries the electronic device from some position to the current position shown in the figure, sits on the chair shown there, and uses the device. At this point the electronic device determines that a position change has occurred and acquires a real-time image of the current position, i.e. the image of the desktop shown in the figure. The real-time image contains a black telephone; performing object recognition on it yields the object information: category "telephone", color "black".
103. Collect a voice signal in the external environment and obtain the instruction to be executed contained in the voice signal.
It should be noted that the electronic device may collect the voice signal in the external environment in several different ways. For example, when no external microphone is connected, the electronic device may collect the voice in the external environment through a built-in microphone to obtain the voice signal; when an external microphone is connected, it may collect the voice through that external microphone instead.
When the electronic device collects a voice signal in the external environment through a microphone (built-in or external), and the microphone is an analog microphone, the collected voice signal is analog; the electronic device must then sample it to convert it into a digitized voice signal, for example at a sampling frequency of 16 kHz. If the microphone is a digital microphone, the electronic device collects the digitized voice signal directly, with no conversion required.
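The 16 kHz figure fixes how many samples digitization produces per unit of time; a tiny arithmetic sketch (the frame-length helper is an illustrative assumption):

```python
SAMPLE_RATE_HZ = 16_000  # sampling frequency named in the text

def samples_for(duration_s, rate_hz=SAMPLE_RATE_HZ):
    """Number of samples produced when digitizing duration_s seconds of audio."""
    return int(duration_s * rate_hz)
```

For instance, one second of audio yields 16 000 samples, and a typical 25 ms analysis frame yields 400.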
To obtain the instruction to be executed contained in the voice signal, the electronic device first judges whether a speech analysis engine exists locally. If so, it inputs the voice signal into the local speech analysis engine for speech analysis and obtains a speech analysis text. Speech analysis of the voice signal here means converting the signal from "audio" to "text".
Furthermore, when multiple speech analysis engines exist locally, the electronic device may select one of them to perform speech analysis on the voice signal in any of the following ways:
First, the electronic device may randomly select one speech analysis engine from the multiple local engines.
Second, it may select the engine with the highest analysis success rate.
Third, it may select the engine with the shortest analysis time.
Fourth, it may select the engine whose analysis success rate reaches a preset success rate and whose analysis time is the shortest among those.
It should be noted that a person skilled in the art may also select a speech analysis engine in a way not listed above, or may combine multiple engines. For example, the electronic device may run two speech analysis engines on the voice signal simultaneously and, when both produce the same speech analysis text, adopt that text as the result for the voice signal; or it may run at least three engines and adopt the text on which at least two of them agree.
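The fourth selection strategy above can be sketched as follows. The shape of the per-engine statistics and the 0.9 preset success rate are assumptions made for illustration; the patent does not fix either.

```python
def select_engine(engines, preset_success_rate=0.9):
    """Among engines whose success rate reaches the preset rate,
    pick the one with the shortest average analysis time."""
    eligible = [e for e in engines if e["success_rate"] >= preset_success_rate]
    if not eligible:
        return None
    return min(eligible, key=lambda e: e["avg_time_s"])

engines = [
    {"name": "A", "success_rate": 0.95, "avg_time_s": 1.2},
    {"name": "B", "success_rate": 0.92, "avg_time_s": 0.8},
    {"name": "C", "success_rate": 0.80, "avg_time_s": 0.3},  # fast but below rate
]
```

Here engine B is chosen: C is the fastest but misses the preset success rate, and B is faster than A among the engines that reach it.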
After the voice analysis text of the voice signal is obtained through analysis, the electronic equipment further obtains the instruction to be executed included in the voice signal from the voice analysis text.
The electronic device stores a plurality of instruction keywords in advance; a single instruction keyword, or a combination of several, corresponds to one instruction. To obtain the instruction to be executed from the speech analysis text, the electronic device first performs a word segmentation operation on the text to obtain its corresponding word sequence, which contains multiple words.
After obtaining the word sequence corresponding to the speech analysis text, the electronic device matches the stored instruction keywords against the word sequence, i.e. looks for instruction keywords within it, thereby matching a corresponding instruction and treating the matched instruction as the instruction to be executed for the voice signal. The keyword lookup may use exact matching and/or fuzzy matching.
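A minimal sketch of the exact-match variant of this lookup. The keyword table and the wake phrase in it are hypothetical examples, not the patent's actual data:

```python
# Hypothetical table: a combination of keywords maps to one instruction.
INSTRUCTION_TABLE = {
    frozenset({"Xiao Ou", "you", "where"}): "trigger_position_prompt",
}

def match_instruction(word_sequence, table=INSTRUCTION_TABLE):
    """Return the instruction whose entire keyword combination appears
    in the segmented word sequence, or None if nothing matches."""
    words = set(word_sequence)
    for keywords, instruction in table.items():
        if keywords <= words:  # all keywords of the combination are present
            return instruction
    return None
```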
If the electronic device determines that no speech analysis engine exists locally, it sends the voice signal to a server that provides a speech analysis service, instructs the server to analyze the signal, and receives in return the speech analysis text obtained from the analysis. After receiving the text from the server, the electronic device obtains the instruction to be executed from it.
104. When the instruction to be executed is an instruction that triggers a position prompt, generate position prompt information according to the object information and output it by voice.
In this embodiment, after obtaining the instruction to be executed contained in the voice signal, the electronic device identifies whether it is an instruction that triggers a position prompt. The triggering instruction can be configured from input provided by the device's owner, and if the instruction to be executed is identified as the triggering instruction, the speaker of the voice signal is taken to be the owner. For example, suppose the owner sets the instruction keyword combination "Xiao Ou" + "you" + "where" as the triggering instruction; then, when the owner cannot find the electronic device, the owner can say "Xiao Ou, where are you?". On receiving this voice signal, the electronic device determines that its speaker is the owner and that the instruction it contains is the instruction that triggers the position prompt.
From the above description, those skilled in the art will understand that whenever the position of the electronic device changes, the device acquires a real-time image of that position and performs object recognition on it to obtain object information. If, after its last position change, the electronic device receives a voice signal triggering a position prompt, the owner evidently cannot find the device; the device's location, i.e. the place where its position last changed, is the "current position" referred to above, and the object information can characterize that position. Therefore, on recognizing that the instruction to be executed is the triggering instruction, the electronic device generates position prompt information from the object information to indicate its location, so that the owner can recall where the device was last used and find it.
To generate the position prompt information, the electronic device may splice the recognized object information with preset information (which a person skilled in the art may set according to actual needs, and which this embodiment does not specifically limit) and use the resulting spliced text as the position prompt information.
For example, suppose the preset information is the template "Owner, there is a ... near me" and the acquired object information is category "telephone", color "black". The electronic device splices the preset information with the color and category information, obtaining the position prompt "Owner, there is a black telephone near me", as shown in fig. 3.
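This splicing step amounts to simple template filling. The English template wording below is an assumption mirroring the example above, not the patent's exact preset information:

```python
PRESET_TEMPLATE = "Owner, there is a {color} {category} near me"

def build_position_prompt(object_info, template=PRESET_TEMPLATE):
    """Splice the preset information with the recognized object information."""
    return template.format(color=object_info["color"],
                           category=object_info["category"])
```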
As can be seen from the above, the electronic device in the embodiments of the present application can detect whether its position has currently changed and, if so, acquire a real-time image of the current position; perform object recognition on the acquired image to obtain object information; collect a voice signal in the external environment and obtain the instruction to be executed contained in it; and, when that instruction is an instruction that triggers a position prompt, generate position prompt information from the object information and output it by voice. When the user cannot find the electronic device, the device thus prompts the user with the object information of the objects at its current position, helping the user recall where the device was last used, thereby better guiding the user to the device and improving the probability that it is found.
Referring to fig. 4, fig. 4 is another schematic flow chart of a position prompting method according to an embodiment of the present application, and as shown in fig. 4, the position prompting method may include:
201. Detect whether the position has currently changed and, if so, acquire a real-time image of the current position.
In this embodiment of the application, the electronic device can acquire position information in real time and determine from it whether its position has currently changed. For example, the electronic device may compare the currently acquired position information with the previously acquired position information; if the two differ, it determines that a position change has occurred. As another example, the electronic device may calculate the distance between the currently acquired position and the previously acquired position and judge whether that distance exceeds a preset distance value (a person skilled in the art may set this to any appropriate value, which this embodiment does not specifically limit; for example, it may be set to 1 meter, so that a position change is registered once the device moves more than one meter); if so, it determines that a position change has occurred. Upon detecting a position change, the electronic device captures a real-time image of the current position with its camera.
When acquiring the position information, the electronic device first determines whether it is currently in an outdoor or an indoor environment. For example, it may decide based on the strength of the received satellite positioning signal: if the strength is below a preset threshold, the device is judged to be indoors; if the strength is at or above the threshold, it is judged to be outdoors. Accordingly, the electronic device may use satellite positioning to acquire the position information when outdoors, and an indoor positioning technology when indoors.
202. A salient region in the acquired real-time image is determined.
It is easy to understand that there are many objects in the real-time image, that recognizing all of them takes considerable time, and that not every object will be noticed by the user. For example, referring to fig. 5, a user holding the electronic device moves it from some position to the current position shown in the figure, sits on the chair shown in fig. 4, and uses the electronic device. At this time, the electronic device determines that a position change has occurred and acquires a real-time image of the current position, that is, the image of the desktop shown in the figure, which includes a black telephone, an object 1, and an object 2.
Therefore, in order to improve the efficiency of object identification, after acquiring the real-time image, the electronic device first identifies a salient region (colloquially, a region that may be noticed) of the real-time image, and determines the salient region in the real-time image, so as to identify an object in the salient region.
Wherein, the electronic device can identify the salient region in the real-time image through a pre-trained identification model. The recognition model is a machine learning algorithm, and the machine learning algorithm can learn which objects in the picture have higher significance through continuous feature learning, that is, learn how to recognize significant regions in the image, such as people and animals which are generally considered to have higher significance than sky, grassland and buildings.
In addition, the electronic device may further identify the salient region in the real-time image by a key region focusing method, and the like, which is not specifically limited in the embodiment of the present application.
203. And carrying out object identification on the salient region in the real-time image.
In the embodiment of the present application, the electronic device performs object identification after determining the salient region in the real-time image, and the identified object information reflects an object that is easily noticed by the user in the real-time image, in other words, an object that can readily help the user recall the last use position (i.e. the "current position" of the electronic device).
The electronic device can perform object recognition on the salient region in the real-time image by adopting a preset object recognition technology to obtain object information. It should be noted that, in the embodiment of the present application, what kind of object recognition technology is used as the preset object recognition technology for performing object recognition on the real-time image is not specifically limited, and a person skilled in the art can select the preset object recognition technology according to actual needs. For example, the electronic device may recognize the real-time image through a pre-trained image semantic segmentation model to obtain object information of an object existing in a salient region of the real-time image.
In addition, the object information may include the type, color, or other information that can describe the characteristics of the object. For example, when the electronic device is moved from a certain position to a current position (assuming that there is a black telephone at the current position), the electronic device determines that a position change occurs at the current time, and acquires a real-time image of the current position and performs object identification on the real-time image, where the obtained object information is: the category "telephone", the color "black".
204. The method comprises the steps of collecting a voice signal with noise in an external environment, and obtaining a historical noise signal corresponding to the voice signal with noise.
It is easily understood that various noises exist in the environment, such as the noise generated by a running computer or by typing on a keyboard in an office. Therefore, when the electronic device collects a voice signal, it is obviously difficult to collect a pure voice signal.
Correspondingly, when the electronic device is in a noisy environment, if the user sends out a voice signal, the electronic device collects a noisy voice signal in the external environment, the noisy voice signal is formed by combining the voice signal sent out by the user and a noise signal in the external environment, and if the user does not send out the voice signal, the electronic device only collects the noise signal in the external environment. The electronic equipment buffers the collected voice signals with noise and noise signals.
In this embodiment of the present application, when the electronic device collects a noisy speech signal in the external environment, it takes the start time of the noisy speech signal as an end time and obtains the noise signal of a preset duration (the preset duration may be set to a suitable value by a person skilled in the art according to actual needs, which is not specifically limited in this embodiment of the present application; it may be set to 500ms, for example) collected before the noisy speech signal was received, taking that noise signal as the historical noise signal corresponding to the noisy speech signal.
For example, with the preset duration configured as 500ms and the start time of the noisy speech signal being 18:24:56.500 on June 20, 2018, the electronic device acquires the 500ms of noise signal buffered from 18:24:56.000 to 18:24:56.500 on June 20, 2018, and takes that noise signal as the historical noise signal corresponding to the noisy speech signal.
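The buffer lookup above can be sketched as follows; the flat-list cache and the 16 kHz sample rate are assumptions of this sketch, while the 500 ms window matches the example duration in the text:

```python
def history_noise(cached_samples, sample_rate_hz, preset_ms=500):
    """Return the last preset_ms of cached audio as the historical noise.

    cached_samples holds the device's buffered noise up to the instant the
    noisy speech starts, so its tail is exactly the window that ends at the
    speech start time. A real device would read from a ring buffer instead.
    """
    n = int(sample_rate_hz * preset_ms / 1000)
    return cached_samples[-n:]

cache = list(range(16000))          # stands in for 1 s of cached noise at 16 kHz
hist = history_noise(cache, 16000)  # the 500 ms ending at speech onset
print(len(hist))                    # 8000
```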
205. And acquiring a noise signal during the acquisition of the voice signal with noise according to the historical noise signal.
After acquiring the historical noise signal corresponding to the voice signal with noise, the electronic equipment further acquires the noise signal during the acquisition of the voice signal with noise according to the acquired historical noise signal.
For example, the electronic device may predict noise distribution during the period of acquiring the noisy speech signal according to the acquired historical noise signal, so as to obtain a noise signal during the period of acquiring the noisy speech signal.
For another example, considering that noise is relatively stationary and usually changes little over a continuous period, the electronic device may use the acquired historical noise signal directly as the noise signal during the acquisition of the noisy speech signal. If the duration of the historical noise signal is greater than that of the noisy speech signal, a segment of the same duration as the noisy speech signal may be intercepted from the historical noise signal; if it is less, the historical noise signal may be copied and several copies spliced together to obtain a noise signal of the same duration as the noisy speech signal, which is then used as the noise signal during the acquisition of the noisy speech signal.
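The truncate-or-splice rule can be sketched directly; sample lists stand in for audio buffers:

```python
def align_noise(history, speech_len):
    """Truncate or tile the historical noise to the noisy-speech length.

    Implements the two cases in the text: a longer history is intercepted
    down to speech_len samples; a shorter one is copied and spliced until
    it covers the speech.
    """
    if len(history) >= speech_len:
        return history[:speech_len]
    copies = -(-speech_len // len(history))   # ceiling division
    return (history * copies)[:speech_len]

print(align_noise([1, 2, 3], 8))     # [1, 2, 3, 1, 2, 3, 1, 2]
print(align_noise([1, 2, 3, 4], 2))  # [1, 2]
```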
206. And performing reverse phase superposition on the noise signal and the voice signal with the noise, and taking the noise-reduced voice signal obtained by superposition as the voice signal to be processed.
After acquiring the noise signal during the collection of the noisy speech signal, the electronic device first performs phase-inversion processing on the acquired noise signal, and then superposes the phase-inverted noise signal on the noisy speech signal to cancel the noise component of the noisy speech signal, obtaining a noise-reduced speech signal. The noise-reduced speech signal is used as the speech signal to be processed in the subsequent steps.
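Sample-by-sample, the anti-phase superposition reduces to addition of the negated noise estimate; a minimal sketch with list-based signals:

```python
def denoise(noisy, noise):
    """Invert the noise estimate and superpose it on the noisy speech.

    With an ideal estimate, noisy[i] = speech[i] + noise[i], so adding the
    phase-inverted noise (-noise[i]) recovers the speech sample-by-sample.
    """
    return [x + (-n) for x, n in zip(noisy, noise)]

speech = [0.1, -0.2, 0.3]
noise = [0.05, 0.05, -0.05]
noisy = [s + n for s, n in zip(speech, noise)]   # what the microphone captures
print([round(x, 6) for x in denoise(noisy, noise)])  # [0.1, -0.2, 0.3]
```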
207. And acquiring an instruction to be executed included in the voice signal.
When the instruction to be executed included in the voice signal is acquired, the electronic equipment firstly judges whether a voice analysis engine exists locally, if so, the electronic equipment inputs the voice signal into the local voice analysis engine for voice analysis, and a voice analysis text is obtained. The voice analysis is performed on the voice signal, that is, the voice signal is converted from "audio" to "text". After the voice analysis text of the voice signal is obtained through analysis, the electronic equipment further obtains the instruction to be executed included in the voice signal from the voice analysis text.
The electronic equipment stores a plurality of instruction keywords in advance, and a single instruction keyword or a plurality of instruction keyword combinations correspond to one instruction. When the to-be-executed instruction included in the voice signal is obtained from the voice analysis text obtained through analysis, the electronic equipment firstly carries out word segmentation operation on the voice analysis text to obtain a word sequence corresponding to the voice analysis text, and the word sequence includes a plurality of words.
After the word sequence corresponding to the voice analysis text is obtained, the electronic device matches the instruction keywords with the word sequence, that is, the instruction keywords in the word sequence are found out, so that the corresponding instruction is obtained through matching, and the instruction obtained through matching is used as an instruction to be executed of the voice signal. Wherein the matching search of the instruction keywords comprises complete matching and/or fuzzy matching.
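The keyword matching step can be sketched as follows, using complete matching only (the text also permits fuzzy matching); the keyword combination and instruction name come from this document's later example, with the assistant name rendered in English as "Xiao Ou":

```python
def match_instruction(word_sequence, keyword_map):
    """Match stored instruction keywords against a segmented word sequence.

    keyword_map maps a frozenset of instruction keywords to an instruction;
    a rule fires only when every keyword in the combination appears in the
    word sequence (complete matching).
    """
    words = set(word_sequence)
    for keywords, instruction in keyword_map.items():
        if keywords <= words:
            return instruction
    return None

rules = {frozenset({"Xiao Ou", "you", "where"}): "trigger_location_prompt"}
print(match_instruction(["Xiao Ou", "where", "are", "you"], rules))
# trigger_location_prompt
```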
After determining whether a voice analysis engine exists locally, if not, the electronic device sends the voice signal to a server (the server is a server providing voice analysis service), instructs the server to analyze the voice signal, and returns a voice analysis text obtained by analyzing the voice signal. After receiving the voice analysis text returned by the server, the electronic device can acquire the instruction to be executed included in the voice signal from the voice analysis text.
208. And when the acquired instruction to be executed is an instruction for triggering position prompt, generating position prompt information according to the object information, and outputting the generated position prompt information in a voice mode.
In the embodiment of the application, after obtaining the to-be-executed instruction included in the voice signal, the electronic device identifies whether the to-be-executed instruction is an instruction for triggering a position prompt, where the instruction for triggering the position prompt can be set according to input data of the machine owner; if the to-be-executed instruction is identified as the instruction for triggering the position prompt, the speaker of the voice signal is determined to be the owner. For example, the owner sets the instruction keyword combination "Xiao Ou" + "you" + "where" as the instruction for triggering the position prompt, so that when the owner cannot find the electronic device, the owner can say "Xiao Ou, where are you". Correspondingly, when the electronic device receives the voice signal "Xiao Ou, where are you", it determines that the speaker of that voice signal is the owner and that the instruction to be executed included in the voice signal is the instruction for triggering the position prompt.
Based on the above description, those skilled in the art will understand that when the position of the electronic device changes, the electronic device obtains a real-time image of the new position and performs object recognition on it to obtain object information. If, after the most recent position change, the electronic device receives a voice signal for triggering a position prompt, this indicates that the owner cannot find the electronic device; the device's location, that is, the position at which the last position change occurred, is the aforementioned "current position", and the aforementioned object information reflects that position. Therefore, when recognizing that the instruction to be executed included in the voice signal is an instruction for triggering a position prompt, the electronic device further generates position prompt information according to the object information to prompt its location, so that the owner can recall where the electronic device was last used and find it.
When the electronic device generates the position prompt information, the identified object information and the preset information (which may be set by a person skilled in the art according to actual needs and is not specifically limited in the embodiments of the present application) may be spliced, and the obtained spliced information is used as the position prompt information.
For example, assume the preset information is the template "Owner, there is a ___ near me", and the acquired object information is: category "telephone", color "black". The electronic device splices the position prompt information in the form of the preset information plus the color and category information, and the resulting position prompt information is "Owner, there is a black telephone near me", as shown in fig. 3.
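The splicing step reduces to string formatting; the template and object fields are taken from the example above, and the placeholder syntax is an assumption of this sketch:

```python
def build_prompt(template, color, category):
    """Splice the recognized object information into the preset template.

    The template and object fields mirror the text's example; a real device
    would fill them from the object-recognition result.
    """
    return template.format(color=color, category=category)

template = "Owner, there is a {color} {category} near me"
print(build_prompt(template, "black", "telephone"))
# Owner, there is a black telephone near me
```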
In one embodiment, "determining a salient region in the captured real-time image" includes:
(1) calling a pre-trained neural network, and acquiring the image gradient of the real-time image through the neural network;
(2) generating an image to be processed corresponding to the real-time image according to the acquired image gradient;
(3) carrying out binarization processing on the image to be processed to obtain a binarized image to be processed;
(4) and obtaining the saliency area of the real-time image according to the connected area in the binary image to be processed.
The electronic device first preprocesses the real-time image, for example, normalizing it to 256 × 256 pixels, then inputs the preprocessed real-time image to a pre-trained neural network and calculates the image gradient of the preprocessed real-time image through the neural network.
After the image gradient of the real-time image is obtained, a to-be-processed image corresponding to the real-time image is further generated according to the maximum absolute value of the image gradient on different color channels (such as an R channel, a B channel, and a G channel), and the to-be-processed image can reflect the saliency region of the real-time image to a certain extent.
And after the to-be-processed image corresponding to the real-time image is obtained, the electronic equipment performs binarization processing on the to-be-processed image to obtain a binarized to-be-processed image. The method of binarizing the image to be processed is not particularly limited, and for example, a maximum inter-class variance method may be used.
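The maximum inter-class variance (Otsu) method named above can be sketched in pure Python; the flat list of 8-bit intensities standing in for the to-be-processed image is an assumption of this sketch:

```python
def otsu_threshold(gray_pixels, levels=256):
    """Binarization threshold by the maximum inter-class variance method.

    Sweeps every candidate threshold t and keeps the one that maximizes the
    weighted variance between the background class (<= t) and the foreground
    class (> t).
    """
    hist = [0] * levels
    for p in gray_pixels:
        hist[p] += 1
    total = len(gray_pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w_bg, sum_bg = 0, -1.0, 0, 0
    for t in range(levels):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Two well-separated intensity clusters; the threshold must land between them.
pixels = [50] * 100 + [200] * 100
print(50 <= otsu_threshold(pixels) < 200)  # True
```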
After the binarized image to be processed is obtained, the electronic device further determines the connected regions in it and then obtains the saliency region of the real-time image from the determined connected regions. For example, if the electronic device determines only one connected region in the image to be processed, that connected region may be used directly as the saliency region of the real-time image. For another example, if the electronic device determines a plurality of connected regions in the image to be processed, the determined connected regions may be used directly as a plurality of salient regions in the real-time image.
In an embodiment, "obtaining the saliency region of the real-time image according to the connected region in the binarized image to be processed" includes:
and when a plurality of connected areas exist in the binary image to be processed, selecting one connected area from the plurality of connected areas as a significant area of the real-time image.
After the electronic device determines the connected regions in the binarized image to be processed, if a plurality of connected regions exist, one connected region is selected from them as the saliency region of the real-time image. For example, the electronic device may randomly select one of the determined connected regions as the saliency region of the real-time image.
In one embodiment, "selecting one connected region from a plurality of connected regions as the salient region of the real-time image" includes:
and selecting the connected region with the largest area from the plurality of connected regions as the saliency region of the real-time image.
The larger the area of a connected region, and hence the larger the corresponding object, the more easily that object is noticed. Therefore, the electronic device may select the connected region with the largest area from the plurality of connected regions as the salient region of the real-time image, so that the object information identified from the salient region reflects the object most easily noticed by the user in the real-time image, thereby better helping the user recall where the electronic device was last used (i.e. the "current position" of the electronic device).
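The connected-region determination and largest-area selection can be sketched together; 4-connectivity and the 0/1 list-of-lists image format are assumptions of this sketch:

```python
from collections import deque

def largest_connected_region(binary):
    """Pick the largest 4-connected foreground region as the saliency region.

    binary is a 2-D list of 0/1 values from the binarized to-be-processed
    image; returns the set of (row, col) pixels of the largest region,
    mirroring the largest-area selection rule in the text.
    """
    rows, cols = len(binary), len(binary[0])
    seen, best = set(), set()
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and (r, c) not in seen:
                region, queue = set(), deque([(r, c)])   # BFS flood fill
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    region.add((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                if len(region) > len(best):
                    best = region
    return best

img = [[1, 1, 0, 0],
       [1, 0, 0, 1],
       [0, 0, 1, 1]]
print(len(largest_connected_region(img)))  # 3
```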
In an embodiment, before "acquiring the instruction to be executed included in the aforementioned voice signal", the method further includes:
(1) acquiring the voiceprint characteristics of the voice signal, and verifying the acquired voiceprint characteristics;
(2) and when the voiceprint feature verification passes, acquiring a command to be executed included in the voice signal.
It is easy to understand that although the instruction for triggering the location prompt is set by the owner, it cannot be excluded that the instruction is known to others. Therefore, the electronic equipment also authenticates the identity of the speaker of the voice signal according to the voiceprint characteristics.
After obtaining the voice signal to be processed, the electronic device further obtains the voiceprint features included in the voice signal. Wherein the voiceprint feature includes, but is not limited to, at least one feature component of a spectrum feature component, a cepstrum feature component, a formant feature component, a pitch feature component, a reflection coefficient feature component, a tone feature component, a speech rate feature component, an emotion feature component, a prosody feature component, and a rhythm feature component.
Then, the electronic device obtains the similarity between the voiceprint feature and a preset voiceprint feature (the preset voiceprint feature is a voiceprint feature pre-recorded by the owner), and determines whether the obtained similarity is greater than or equal to a preset similarity (the preset similarity can be set by a person skilled in the art according to actual needs). When the acquired similarity is greater than or equal to the preset similarity, it is determined that the voiceprint feature verification has passed and that the speaker of the voice signal is the owner.
The electronic device can obtain the distance between the voiceprint feature and a preset voiceprint feature, and the obtained distance is used as the similarity between the voiceprint feature and the preset voiceprint feature. It should be noted that, any feature distance (such as euclidean distance, manhattan distance, chebyshev distance, etc.) may be selected by those skilled in the art according to actual needs to measure the distance between the aforementioned voiceprint feature and the preset voiceprint feature.
For example, the cosine distance between the voiceprint feature and the preset voiceprint feature may be obtained, specifically referring to the following formula:
e = (Σ_{i=1}^{N} f_i · g_i) / ( √(Σ_{i=1}^{N} f_i²) · √(Σ_{i=1}^{N} g_i²) )

wherein e represents the cosine distance between the voiceprint feature and the preset voiceprint feature, f represents the voiceprint feature, g represents the preset voiceprint feature, N represents the dimension of the voiceprint feature and of the preset voiceprint feature (the two have the same dimension), f_i represents the feature component of the i-th dimension of the voiceprint feature, and g_i represents the feature component of the i-th dimension of the preset voiceprint feature.
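The cosine measure between the two feature vectors can be computed directly from the formula in the text:

```python
import math

def cosine_similarity(f, g):
    """Cosine measure between a voiceprint feature f and the enrolled feature g.

    The dot product of the two N-dimensional feature vectors divided by the
    product of their Euclidean norms, as in the text's formula.
    """
    dot = sum(fi * gi for fi, gi in zip(f, g))
    return dot / (math.sqrt(sum(fi * fi for fi in f))
                  * math.sqrt(sum(gi * gi for gi in g)))

print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 6))  # 1.0  (identical)
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 6))  # 0.0  (orthogonal)
```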
In one embodiment, obtaining a noise signal during collection of a noisy speech signal from a historical noise signal comprises:
(1) performing model training by taking the historical noise signal as sample data to obtain a noise prediction model;
(2) and predicting a noise signal during the collection of the voice signal with noise according to the noise prediction model.
After the electronic equipment acquires the historical noise signal, the historical noise signal is used as sample data, model training is carried out according to a preset training algorithm, and a noise prediction model is obtained.
It should be noted that the training algorithm is a machine learning algorithm, and the machine learning algorithm may predict data by continuously performing feature learning, for example, the electronic device may predict a current noise distribution according to a historical noise distribution. Wherein the machine learning algorithm may include: decision tree algorithm, regression algorithm, bayesian algorithm, neural network algorithm (which may include deep neural network algorithm, convolutional neural network algorithm, recursive neural network algorithm, etc.), clustering algorithm, etc., and the selection of which training algorithm to use as the preset training algorithm for model training may be selected by those skilled in the art according to actual needs.
For example, the preset training algorithm configured for the electronic device is a Gaussian mixture model algorithm (a regression-type algorithm). After the historical noise signal is obtained, it is used as sample data and model training is performed according to the Gaussian mixture model algorithm, yielding a Gaussian mixture model (the Gaussian mixture model includes a plurality of Gaussian units and is used to describe the noise distribution), which serves as the noise prediction model. Then, the electronic device takes the start time and the end time of the noisy-speech acquisition period as input to the noise prediction model, and the noise prediction model outputs the noise signal for the noisy-speech acquisition period.
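As a deliberately simplified stand-in for the Gaussian mixture model (a single Gaussian unit rather than several, fit to raw samples), the predict-from-history idea might look like:

```python
import math
import random

def fit_noise_model(history):
    """Fit a single-Gaussian noise model to the historical noise samples.

    A simplified stand-in for the Gaussian mixture model the text describes:
    the history's mean and standard deviation describe the noise distribution.
    """
    mean = sum(history) / len(history)
    var = sum((x - mean) ** 2 for x in history) / len(history)
    return mean, math.sqrt(var)

def predict_noise(model, n_samples, seed=0):
    """Draw predicted noise samples for the noisy-speech acquisition period."""
    mean, std = model
    rng = random.Random(seed)   # seeded for reproducibility of the sketch
    return [rng.gauss(mean, std) for _ in range(n_samples)]

model = fit_noise_model([0.1, -0.1, 0.05, -0.05])
print(len(predict_noise(model, 8)))  # 8
```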
In one embodiment, a position prompting device is further provided. Referring to fig. 6, fig. 6 is a schematic structural diagram of a position indicating device 400 according to an embodiment of the present disclosure. The position prompting device is applied to an electronic device, and comprises an image acquisition module 401, an object recognition module 402, an instruction acquisition module 403 and a position prompting module 404, as follows:
the image obtaining module 401 is configured to detect whether a current position changes, and if so, obtain a real-time image of the current position.
And an object identification module 402, configured to perform object identification on the obtained real-time image to obtain object information.
The instruction obtaining module 403 is configured to collect a voice signal in an external environment, and obtain an instruction to be executed included in the voice signal.
And the position prompt module 404 is configured to generate position prompt information according to the object information and output the position prompt information in a voice manner when the instruction to be executed is an instruction for triggering position prompt.
In one embodiment, object identification module 402 may be configured to:
determining a salient region in the acquired real-time image;
and carrying out object identification on the salient region in the real-time image.
In one embodiment, object identification module 402 may be configured to:
calling a pre-trained neural network, and acquiring the image gradient of the real-time image through the neural network;
generating an image to be processed corresponding to the real-time image according to the acquired image gradient;
carrying out binarization processing on the image to be processed to obtain a binarized image to be processed;
and obtaining the saliency area of the real-time image according to the connected area in the binary image to be processed.
In one embodiment, object identification module 402 may be configured to:
and when a plurality of connected areas exist in the binary image to be processed, selecting one connected area from the plurality of connected areas as a significant area of the real-time image.
In one embodiment, object identification module 402 may be configured to:
and selecting the connected region with the largest area from the plurality of connected regions as the saliency region of the real-time image.
In one embodiment, the instruction obtaining module 403 may be configured to:
acquiring a historical noise signal corresponding to a voice signal with noise when the voice signal with noise in the external environment is acquired;
acquiring a noise signal during the acquisition of a voice signal with noise according to the historical noise signal;
and performing reverse phase superposition on the noise signal and the voice signal with the noise, and taking the noise-reduced voice signal obtained by superposition as the voice signal.
In one embodiment, the instruction obtaining module 403 may be configured to:
performing model training by taking the historical noise signal as sample data to obtain a noise prediction model;
and predicting a noise signal during the collection of the voice signal with noise according to the noise prediction model.
In one embodiment, the instruction obtaining module 403 may be configured to:
acquiring the voiceprint characteristics of the voice signal, and verifying the acquired voiceprint characteristics;
and when the voiceprint feature verification passes, acquiring a command to be executed included in the voice signal.
The steps executed by each module in the position prompting device 400 may refer to the method steps described in the above method embodiments. The position prompting device 400 can be integrated into an electronic device, such as a mobile phone, a tablet computer, and the like.
In specific implementation, the modules may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and specific implementation of the units may refer to the foregoing embodiments, which are not described herein again.
As can be seen from the above, in the position prompt apparatus of this embodiment, the image acquisition module 401 detects whether the current position changes and, if so, acquires a real-time image of the current position. The object recognition module 402 performs object recognition on the acquired real-time image to obtain object information. The instruction obtaining module 403 collects a voice signal in the external environment and obtains the instruction to be executed included in the voice signal. When the instruction to be executed is an instruction for triggering a position prompt, the position prompt module 404 generates position prompt information according to the object information and outputs it in a voice manner. Thus, when the user cannot find the electronic device, the user is given a position prompt based on the object information of objects at the current position, helping the user recall where the electronic device was last used, better guiding the user to find the electronic device, and improving the probability of finding it.
In an embodiment, an electronic device is also provided. Referring to fig. 7, an electronic device 500 includes a processor 501 and a memory 502. The processor 501 is electrically connected to the memory 502.
The processor 501 is the control center of the electronic device 500, connects the various parts of the entire electronic device using various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or loading a computer program stored in the memory 502 and calling data stored in the memory 502.
The memory 502 may be used to store software programs and modules, and the processor 501 executes various functional applications and data processing by running the computer programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, a computer program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to use of the electronic device, and the like. Further, the memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 502 may also include a memory controller to provide the processor 501 with access to the memory 502.
In this embodiment, the processor 501 in the electronic device 500 loads instructions corresponding to one or more processes of the computer program into the memory 502, and the processor 501 runs the computer program stored in the memory 502, so as to implement various functions as follows:
detecting whether the position changes at present, and if so, acquiring a real-time image of the current position;
carrying out object identification on the obtained real-time image to obtain object information;
acquiring a voice signal in an external environment, and acquiring a command to be executed included in the voice signal;
and when the instruction to be executed is an instruction for triggering position prompt, generating position prompt information according to the object information, and outputting the position prompt information in a voice mode.
Referring to fig. 8, in some embodiments, the electronic device 500 may further include: a display 503, a radio frequency circuit 504, an audio circuit 505, and a power supply 506. The display 503, the radio frequency circuit 504, the audio circuit 505, and the power supply 506 are each electrically connected to the processor 501.
The display 503 may be used to display information entered by or provided to the user, as well as various graphical user interfaces, which may be composed of graphics, text, icons, video, and any combination thereof. The display 503 may include a display panel, and in some embodiments the display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like.
The radio frequency circuit 504 may be used to transmit and receive radio frequency signals so as to establish wireless communication with a network device or other electronic devices, and to exchange signals with the network device or the other electronic devices.
The audio circuit 505 may be used to provide an audio interface between the user and the electronic device through a speaker and a microphone.
The power supply 506 may be used to power various components of the electronic device 500. In some embodiments, power supply 506 may be logically coupled to processor 501 through a power management system, such that functions of managing charging, discharging, and power consumption are performed through the power management system.
Although not shown in fig. 8, the electronic device 500 may further include a camera, a bluetooth module, and the like, which are not described in detail herein.
In some embodiments, in performing object recognition on the acquired real-time image, the processor 501 may perform the following steps:
determining a salient region in the acquired real-time image;
and carrying out object identification on the salient region in the real-time image.
In some embodiments, in determining a salient region in the acquired real-time image, processor 501 may perform the following steps:
calling a pre-trained neural network, and acquiring the image gradient of the real-time image through the neural network;
generating an image to be processed corresponding to the real-time image according to the acquired image gradient;
carrying out binarization processing on the image to be processed to obtain a binarized image to be processed;
and obtaining the salient region of the real-time image according to the connected regions in the binarized image to be processed.
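The gradient, binarization, and connected-region steps above can be sketched as follows. This is a minimal illustration using NumPy, assuming a grayscale image, a mean-based binarization threshold, and 4-connectivity for the connected regions; none of these specifics are prescribed by the embodiment, which computes the gradient via a pre-trained neural network rather than finite differences.

```python
import numpy as np
from collections import deque

def salient_region(image, threshold=None):
    """Gradient -> binarization -> largest connected region (4-connectivity).

    Returns the bounding box (r0, c0, r1, c1) of the salient region,
    or None if no pixel exceeds the threshold.
    """
    gy, gx = np.gradient(image.astype(float))
    grad = np.hypot(gx, gy)                 # image gradient magnitude
    if threshold is None:
        threshold = grad.mean()             # assumed binarization threshold
    binary = grad > threshold               # binarized image to be processed
    seen = np.zeros_like(binary, dtype=bool)
    best = []
    h, w = binary.shape
    for r in range(h):
        for c in range(w):
            if binary[r, c] and not seen[r, c]:
                # flood-fill one connected region
                region, queue = [], deque([(r, c)])
                seen[r, c] = True
                while queue:
                    y, x = queue.popleft()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(region) > len(best):  # keep the largest-area region
                    best = region
    if not best:
        return None
    rows = [p[0] for p in best]
    cols = [p[1] for p in best]
    return (min(rows), min(cols), max(rows), max(cols))
```

Keeping the largest connected region corresponds to the selection step of the later embodiments: when several connected regions survive binarization, the one with the largest area is taken as the salient region.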
In some embodiments, when obtaining the salient region of the real-time image from the connected regions in the binarized image to be processed, the processor 501 may perform the following step:
and when a plurality of connected regions exist in the binarized image to be processed, selecting one connected region from the plurality of connected regions as the salient region of the real-time image.
In some embodiments, when selecting one connected region from the plurality of connected regions as the salient region of the real-time image, the processor 501 may perform the following step:
and selecting the connected region with the largest area from the plurality of connected regions as the salient region of the real-time image.
In some embodiments, when collecting the voice signal in the external environment, the processor 501 may further perform the following steps:
when a noisy voice signal in the external environment is collected, acquiring a historical noise signal corresponding to the noisy voice signal;
obtaining, according to the historical noise signal, the noise signal present during collection of the noisy voice signal;
and superimposing the noise signal on the noisy voice signal in anti-phase, and taking the noise-reduced voice signal obtained by the superposition as the voice signal.
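Anti-phase superposition amounts to adding the inverted noise estimate to the noisy signal, i.e. subtracting the estimated noise. A minimal sketch, assuming discrete sampled signals held in NumPy arrays (the sinusoids below merely stand in for real voice and ambient noise):

```python
import numpy as np

def denoise_by_antiphase(noisy, predicted_noise):
    """Superimpose the predicted noise in anti-phase on the noisy
    voice signal, yielding the noise-reduced voice signal."""
    return noisy + (-predicted_noise)

t = np.linspace(0.0, 1.0, 800, endpoint=False)
clean = np.sin(2 * np.pi * 5 * t)            # stand-in for the voice signal
noise = 0.3 * np.sin(2 * np.pi * 50 * t)     # stand-in for ambient noise
recovered = denoise_by_antiphase(clean + noise, noise)
```

In practice the predicted noise only approximates the true noise, so the cancellation is partial; the sketch uses the exact noise to show the principle.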
In some embodiments, when obtaining the noise signal present during collection of the noisy voice signal according to the historical noise signal, the processor 501 may further perform the following steps:
performing model training with the historical noise signal as sample data to obtain a noise prediction model;
and predicting, according to the noise prediction model, the noise signal present during collection of the noisy voice signal.
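The embodiment does not specify what kind of noise prediction model is trained. As an illustrative assumption only, the sketch below uses a simple autoregressive (AR) model fit by least squares on the historical noise samples, then extrapolates the noise forward:

```python
import numpy as np

def fit_ar(history, order=4):
    """Least-squares fit of an AR model x[t] = sum_k a_k * x[t-k].

    history: 1-D array of historical noise samples (the training data).
    Returns the AR coefficients for lags 1..order.
    """
    n = len(history)
    # each row holds [x[t-1], x[t-2], ..., x[t-order]]
    X = np.array([history[t - order:t][::-1] for t in range(order, n)])
    y = history[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def predict_ar(coeffs, recent, steps):
    """Predict `steps` future noise samples from the most recent ones."""
    order = len(coeffs)
    buf = list(recent[-order:])              # chronological order
    out = []
    for _ in range(steps):
        nxt = float(np.dot(coeffs, buf[::-1]))
        out.append(nxt)
        buf = buf[1:] + [nxt]
    return np.array(out)

history = np.sin(0.3 * np.arange(200))       # stand-in for recorded noise
coeffs = fit_ar(history, order=4)
predicted = predict_ar(coeffs, history, steps=10)
```

A sinusoidal stand-in is used because it satisfies an exact AR recursion, so the prediction can be checked against the true continuation; real ambient noise would be predicted only approximately.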
An embodiment of the present application further provides a storage medium storing a computer program which, when run on a computer, causes the computer to execute the position prompting method in any one of the above embodiments, for example: detecting whether the current position has changed, and if so, acquiring a real-time image of the current position; performing object identification on the acquired real-time image to obtain object information; collecting a voice signal in the external environment, and acquiring an instruction to be executed included in the voice signal; and when the instruction to be executed is an instruction for triggering a position prompt, generating position prompt information according to the object information, and outputting the position prompt information in voice form.
In the embodiment of the present application, the storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It should be noted that, as will be understood by those skilled in the art, all or part of the process of implementing the position prompting method of the embodiments of the present application may be completed by controlling the relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, such as a memory of an electronic device, and executed by at least one processor in the electronic device; the execution process may include the processes of the embodiments of the position prompting method. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, or the like.
In the position prompting device according to the embodiments of the present application, the functional modules may be integrated into one processing chip, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in hardware, or as a software functional module. The integrated module, if implemented as a software functional module and sold or used as a stand-alone product, may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The position prompting method, the position prompting device, the storage medium, and the electronic device provided by the embodiments of the present application are described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the embodiments is intended only to help in understanding the method and its core idea. Meanwhile, those skilled in the art may, following the idea of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (9)

1. A position prompting method, characterized by comprising the following steps:
detecting whether the current position has changed, and if so, acquiring a real-time image of the current position;
determining a salient region in the real-time image, and performing object identification on the salient region in the real-time image to obtain object information;
collecting a voice signal in the external environment, and acquiring an instruction to be executed included in the voice signal;
and when the instruction to be executed is an instruction for triggering a position prompt, concatenating the object information and preset information for prompting the position into position prompt information, and outputting the position prompt information in voice form.
2. The position prompting method of claim 1, wherein the step of determining a salient region in the real-time image comprises:
calling a pre-trained neural network, and acquiring the image gradient of the real-time image through the neural network;
generating an image to be processed corresponding to the real-time image according to the image gradient;
carrying out binarization processing on the image to be processed to obtain a binarized image to be processed;
and obtaining the salient region of the real-time image according to the connected regions in the binarized image to be processed.
3. The position prompting method according to claim 2, wherein the step of obtaining the salient region of the real-time image according to the connected regions in the binarized image to be processed comprises:
and when a plurality of connected regions exist in the binarized image to be processed, selecting one connected region from the plurality of connected regions as the salient region.
4. The position prompting method according to claim 3, wherein the step of selecting one connected region from the plurality of connected regions as the salient region comprises:
and selecting the connected region with the largest area from the plurality of connected regions as the salient region.
5. The position prompting method according to any one of claims 1-4, wherein before the step of acquiring the instruction to be executed included in the voice signal, the method further comprises:
acquiring a voiceprint feature of the voice signal and verifying the voiceprint feature;
and when the voiceprint feature passes verification, acquiring the instruction to be executed included in the voice signal.
6. The position prompting method according to any one of claims 1-4, wherein the step of collecting the voice signal in the external environment further comprises:
collecting a noisy voice signal in the external environment, and acquiring a historical noise signal corresponding to the noisy voice signal;
obtaining, according to the historical noise signal, the noise signal present during collection of the noisy voice signal;
and superimposing the noise signal on the noisy voice signal in anti-phase, and taking the noise-reduced voice signal obtained by the superposition as the voice signal.
7. A position prompting device, characterized by comprising:
an image acquisition module, used for detecting whether the current position has changed, and if so, acquiring a real-time image of the current position;
an object identification module, used for determining a salient region in the real-time image and performing object identification on the salient region in the real-time image to obtain object information;
an instruction acquisition module, used for collecting a voice signal in the external environment and acquiring an instruction to be executed included in the voice signal;
and a position prompting module, used for, when the instruction to be executed is an instruction for triggering a position prompt, concatenating the object information and preset information for prompting the position into position prompt information, and outputting the position prompt information in voice form.
8. A storage medium having stored thereon a computer program, characterized in that, when the computer program is run on a computer, it causes the computer to execute a position prompting method according to any one of claims 1 to 6.
9. An electronic device comprising a processor and a memory, wherein the processor is configured to perform the location hint method of any one of claims 1 to 6 by invoking a computer program.
CN201810682024.8A 2018-06-27 2018-06-27 Position prompting method and device, storage medium and electronic equipment Expired - Fee Related CN108989551B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810682024.8A CN108989551B (en) 2018-06-27 2018-06-27 Position prompting method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN108989551A CN108989551A (en) 2018-12-11
CN108989551B (en) 2020-12-01

Family

ID=64538627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810682024.8A Expired - Fee Related CN108989551B (en) 2018-06-27 2018-06-27 Position prompting method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN108989551B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109961045B (en) * 2019-03-25 2021-10-22 联想(北京)有限公司 Position information prompting method and device and electronic equipment
CN111142095B (en) * 2020-01-16 2022-02-08 三星电子(中国)研发中心 Indoor positioning system, method and device

Citations (5)

Publication number Priority date Publication date Assignee Title
CN101334896A (en) * 2008-07-25 2008-12-31 西安交通大学 Processing method for measuring sub-pixel rim of digital picture
CN105809695A (en) * 2016-03-11 2016-07-27 深圳还是威健康科技有限公司 Terminal searching method and device based on wearable device
CN106529375A (en) * 2015-09-11 2017-03-22 上海乐今通信技术有限公司 Mobile terminal and object feature identification method for image of mobile terminal
WO2017096761A1 (en) * 2015-12-10 2017-06-15 杭州海康威视数字技术股份有限公司 Method, device and system for looking for target object on basis of surveillance cameras
CN106878535A (en) * 2015-12-14 2017-06-20 北京奇虎科技有限公司 The based reminding method and device of mobile terminal locations



Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201201