CN109257490B - Audio processing method and device, wearable device and storage medium - Google Patents

Audio processing method and device, wearable device and storage medium

Info

Publication number
CN109257490B
CN109257490B
Authority
CN
China
Prior art keywords
audio
data
preset
information
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811001212.6A
Other languages
Chinese (zh)
Other versions
CN109257490A (en)
Inventor
林肇堃
魏苏龙
麦绮兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201811001212.6A priority Critical patent/CN109257490B/en
Publication of CN109257490A publication Critical patent/CN109257490A/en
Application granted granted Critical
Publication of CN109257490B publication Critical patent/CN109257490B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/72Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/72442User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for playing music files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/165Management of the audio stream, e.g. setting of volume, audio stream path
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/72Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72436User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for text messaging, e.g. short messaging services [SMS] or e-mails
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/72Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72448User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
    • H04M1/72454User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L2015/088Word spotting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M2250/00Details of telephonic subscriber devices
    • H04M2250/12Details of telephonic subscriber devices including a sensor for measuring a physical value, e.g. temperature or motion
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M2250/00Details of telephonic subscriber devices
    • H04M2250/74Details of telephonic subscriber devices with voice recognition means

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Acoustics & Sound (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Environmental & Geological Engineering (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Embodiments of the present application provide an audio processing method and apparatus, a wearable device, and a storage medium. The method includes: acquiring audio data collected by a wearable device; if the audio data is detected to include preset event information, extracting audio clip data corresponding to the preset event information from the audio data; and storing the audio clip data. In the embodiments of the present application, the wearable device captures external sound; if the sound is detected to contain preset event information that can trigger recording, the corresponding sound is captured and recorded. This improves the operating efficiency of recording and reduces redundant information in the audio.

Description

Audio processing method and device, wearable device and storage medium
Technical Field
The embodiments of the present application relate to the technical field of wearable devices, and in particular to an audio processing method and apparatus, a wearable device, and a storage medium.
Background
With the development of wearable devices, they are being applied in more and more fields. Wearable devices are generally worn by users for long periods of time and can collect more user-related data than other mobile devices, which helps them better assist users in their daily lives and tasks. However, the audio capture function of current wearable devices is not sufficiently complete and needs to be improved.
Disclosure of Invention
The audio processing method and apparatus, wearable device, and storage medium provided by the embodiments of the present application can optimize the audio acquisition function of a wearable device such as smart glasses.
In a first aspect, an embodiment of the present application provides an audio processing method, including:
acquiring audio data collected by a wearable device;
if the audio data is detected to include preset event information, extracting audio fragment data corresponding to the preset event information from the audio data;
storing the audio clip data.
In a second aspect, an embodiment of the present application provides an audio processing apparatus, including:
the sound detection module is used for acquiring audio data collected by the wearable device;
the sound acquisition module is used for extracting audio fragment data corresponding to preset event information from the audio data if the audio data is detected to comprise the preset event information;
and the storage module is used for storing the audio fragment data.
In a third aspect, an embodiment of the present application provides a wearable device, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the computer program to implement the audio processing method according to the embodiments of the present application.
In a fourth aspect, embodiments of the present application provide a storage medium containing wearable device-executable instructions, which when executed by a wearable device processor, are configured to perform an audio processing method as described in embodiments of the present application.
According to the audio processing scheme provided by the embodiments of the present application, audio data collected by the wearable device is obtained; if the audio data is detected to include preset event information, audio clip data corresponding to the preset event information is extracted from the audio data; and the audio clip data is stored. In the embodiments of the present application, the wearable device captures external sound; if the sound is detected to contain preset event information that can trigger recording, the corresponding sound is captured and recorded, which improves the operating efficiency of recording and reduces redundant information in the audio.
Drawings
Fig. 1 is a schematic flowchart of an audio processing method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of another audio processing method according to an embodiment of the present application;
fig. 3 is a schematic flowchart of another audio processing method according to an embodiment of the present application;
fig. 4 is a schematic flowchart of another audio processing method according to an embodiment of the present application;
fig. 5 is a schematic flowchart of another audio processing method according to an embodiment of the present application;
fig. 6 is a block diagram of an audio processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a wearable device according to an embodiment of the present disclosure;
fig. 8 is a schematic physical diagram of a wearable device provided in an embodiment of the present application.
Detailed Description
The technical solution of the present application is further explained below through specific embodiments in conjunction with the accompanying drawings. It is to be understood that the specific embodiments described herein are merely illustrative of the application and do not limit it. It should further be noted that, for convenience of description, only the structures related to the present application, rather than all structures, are shown in the drawings.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Fig. 1 is a schematic flowchart of an audio processing method provided in an embodiment of the present application, where the method may be executed by an audio processing apparatus, where the apparatus may be implemented by software and/or hardware, and may be generally integrated in a wearable device, or may be integrated in other devices installed with an operating system. As shown in fig. 1, the method includes:
and S110, acquiring audio data collected by the wearable device.
The wearable device may be a device with a smart operating system; for example, it may be smart glasses, which are generally worn around the user's eyes. The wearable device integrates a variety of sensors capable of collecting information, including a posture sensor for acquiring the user's posture information, a camera module for acquiring images, a microphone for acquiring sound, a vital-sign sensor for detecting the user's vital signs, and the like.
External sound can be monitored, and the audio data corresponding to the external sound can be collected. The external sound may be sound in the environment in which the wearable device is located. Sound in the external environment can be monitored through the microphone; if external sound is detected, the corresponding audio data is collected by the wearable device. The sound may then be processed: it may be converted to output corresponding text information, or subjected to feature extraction to obtain the audio feature information it contains.
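As an illustration of this monitoring flow, the following is a minimal Python sketch of the detection step. It is not the patented implementation: record_chunk and speech_to_text stand in for whatever microphone and speech-recognition interfaces the device actually provides, and the keyword set is an assumption.

    # Minimal sketch of the S110 monitoring step (illustrative only).
    import time

    PRESET_TEXTS = {"meeting", "news"}            # assumed preset keywords

    def record_chunk():
        # Placeholder for the device's microphone API.
        return {"text": "the news starts at nine", "time": time.time()}

    def speech_to_text(chunk):
        # Placeholder for the device's speech recognizer.
        return chunk["text"]

    def contains_preset_event(chunk):
        words = speech_to_text(chunk).lower().split()
        return any(word in PRESET_TEXTS for word in words)

    chunk = record_chunk()
    if contains_preset_event(chunk):
        print("preset event information detected at", chunk["time"])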
S111, if the audio data is detected to include the preset event information, extracting audio clip data corresponding to the preset event information from the audio data.
The preset event information is information that may contain a key event and therefore needs to be stored, where a key event is an event that the user is particularly concerned about, that has a high execution priority, or that has a time requirement. Illustratively, the preset event information may be information about a meeting, a time, a place, and the like. The preset event information may be preset by the system or set by the user; for example, if the user pays attention to the word "news", the user may set the word "news" as preset event information, and the audio clip data corresponding to the sound "news" is collected when the sound is detected to contain "news".
The preset event information may include preset text or preset audio features. If the sound is converted and the corresponding text information is output, it can be judged whether the output text information contains the preset text. If the audio feature information contained in the sound is extracted, it can be determined whether the audio feature information includes the preset audio features. If the output text information contains the preset text, or the audio feature information contains the preset audio features, it can be determined that the sound contains the preset event information.
Once the sound is determined to include the preset event information, the sound is considered to contain a key event and needs to be stored, and the audio clip data corresponding to the preset event information in the sound can be collected. After it is determined that the sound includes the preset event information, the recording function of the microphone of the wearable device is started, and the audio clip data corresponding to the preset event information in the sound is collected.
Optionally, a starting time of the occurrence of the preset event information is determined, and audio fragment data of a preset time period is extracted from the audio data with the starting time as a reference time point.
The starting moment of the preset event information is the starting time point corresponding to the preset event information in the audio data collected by the wearable device. Illustratively, if the sound collected at 19:17:05 matches the preset event information, the moment 19:17:05 is the starting moment of the occurrence of the preset event information. Extracting the audio clip data of the preset time period from the audio data with the starting moment as a reference time point may be: extracting audio data of the preset time period from the audio data with the starting moment as the start time.
The preset time period may be a duration preset by a system or preset by a user, and for example, if the preset time period is five minutes, audio clip data of five minutes is collected and stored with the occurrence time point as a reference time after the collected sound is detected to include preset event information.
If, while audio clip data is being collected within the preset time period, new preset event information is detected in the sound collected by the microphone, audio clip data of the preset time period is extracted from the audio data again, with the starting moment of the new preset event information as the reference time point. The newly extracted audio clip data is stored together with the previously acquired audio clip data.
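As a rough illustration of this extraction step, the sketch below slices a preset-length window out of a sample buffer, starting at the detected starting moment. The sample rate, the five-minute period, and the in-memory list representation are assumptions for illustration, not details fixed by the patent.

    # Sketch of S111's window extraction (all constants illustrative).
    SAMPLE_RATE = 16000                      # samples per second (assumed)
    PRESET_PERIOD_S = 5 * 60                 # preset time period: five minutes

    def extract_clip(audio, start_time_s, duration_s=PRESET_PERIOD_S):
        """Return the samples covering [start_time_s, start_time_s + duration_s)."""
        begin = int(start_time_s * SAMPLE_RATE)
        end = begin + int(duration_s * SAMPLE_RATE)
        return audio[begin:end]              # slicing clips to the buffer end

    audio = [0] * (10 * 60 * SAMPLE_RATE)    # ten minutes of dummy audio
    clip = extract_clip(audio, start_time_s=125.0)
    print(len(clip) / SAMPLE_RATE, "seconds extracted")

If new preset event information is detected during collection, the same function can simply be called again with the new starting moment and both clips kept, matching the behavior described above.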
S112, storing the audio clip data.
The audio clip data may be stored in a memory in the wearable device, or transmitted to a backend server for storage through a communication module of the wearable device. After the collected audio data is detected to include the preset event information, the collected audio clip data corresponding to the preset event information is stored so that the user can retrieve it later as needed.
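The two storage paths mentioned above can be sketched as follows; the WAV format, the file name, and upload_to_server are illustrative assumptions, since the patent does not fix a file format or an uplink API.

    # Sketch of S112: keep the clip locally or hand it to the uplink.
    import wave

    def save_clip_locally(samples, path, sample_rate=16000):
        # Write 16-bit mono PCM samples to a WAV file in device storage.
        with wave.open(path, "wb") as f:
            f.setnchannels(1)
            f.setsampwidth(2)
            f.setframerate(sample_rate)
            f.writeframes(b"".join(
                s.to_bytes(2, "little", signed=True) for s in samples))

    def upload_to_server(samples):
        # Placeholder for the wearable device's communication module.
        print("would upload", len(samples), "samples to the backend server")

    clip = [0] * 16000                       # one second of silence
    save_clip_locally(clip, "clip_0001.wav")
    upload_to_server(clip)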
The embodiment of the application discloses an audio processing method: audio data collected by the wearable device is obtained; if the audio data is detected to include preset event information, audio clip data corresponding to the preset event information is extracted from the audio data; and the audio clip data is stored. In this embodiment, the wearable device captures external sound; if the sound is detected to contain preset event information that can trigger recording, the corresponding sound is captured and recorded, which improves the operating efficiency of recording and reduces redundant information in the audio.
Fig. 2 is a schematic flowchart of another audio processing method provided in an embodiment of the present application. On the basis of the technical solution provided in the above embodiment, the operation of extracting audio clip data of a preset time period from the audio data with the start time as a reference time point is optimized. Optionally, as shown in fig. 2, the method includes:
and S120, acquiring audio data acquired by the wearable device.
Reference may be made to the above description for specific embodiments, which are not repeated herein.
S121, if the sound is detected to include preset event information, determining the starting time of the preset event information, taking the starting time as a reference time point, extracting first audio fragment data of a first time period with the reference time point as an end time from the audio data, and extracting second audio fragment data of a second time period with the reference time point as a start time.
Audio data both before and after the point on the time axis at which the preset event information occurs is acquired. Some speakers mention related information before the key information itself, introducing the key information gradually; if only the audio clip data after the occurrence time point of the preset event information were collected and stored, some audio data might be lost. By taking the starting moment as the reference time point, extracting from the audio data the first audio clip data of a first time period that ends at the reference time point, and extracting the second audio clip data of a second time period that starts at the reference time point, that is, by collecting the audio clip data of the first time period before the reference time point and the audio clip data of the second time period after it, the audio data of key information occurring before the starting moment of the preset event information is not missed when the preset event information is detected.
The combined duration of the first time period and the second time period may be the same as the duration of the preset time period.
S122, storing the first audio clip data and the second audio clip data.
The first audio clip data and the second audio clip data may be combined into complete audio clip data for storage; for specific embodiments, reference may be made to the above description, which is not repeated here.
In this embodiment, the starting moment at which the preset event information appears is determined and taken as the reference time point; by collecting the audio clip data of the first time period before the reference time point and the audio clip data of the second time period after it, the completeness of the collected audio clip data is improved and partial loss of audio data is avoided.
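A sketch of this two-sided window follows. The split between the first and second periods is an assumption; the patent only says their combined duration may equal the preset period.

    # Sketch of S121: one window ending at the reference point, one starting there.
    SAMPLE_RATE = 16000
    FIRST_PERIOD_S = 60                      # assumed: one minute before
    SECOND_PERIOD_S = 4 * 60                 # assumed: four minutes after

    def extract_around(audio, ref_time_s):
        ref = int(ref_time_s * SAMPLE_RATE)
        first = audio[max(0, ref - FIRST_PERIOD_S * SAMPLE_RATE):ref]
        second = audio[ref:ref + SECOND_PERIOD_S * SAMPLE_RATE]
        return first, second

    audio = [0] * (10 * 60 * SAMPLE_RATE)    # ten minutes of dummy audio
    first_clip, second_clip = extract_around(audio, ref_time_s=300.0)
    print(len(first_clip) + len(second_clip), "samples kept in total")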
Fig. 3 is a schematic flowchart of another audio processing method provided in an embodiment of the present application. On the basis of the technical solution provided in the foregoing embodiment, the storage of the audio clip data is optimized. Optionally, as shown in fig. 3, the method includes:
S130, acquiring audio data collected by the wearable device.
S131, if the audio data is detected to include the preset event information, extracting audio fragment data corresponding to the preset event information from the audio data.
For the above-mentioned specific implementation of the operations, reference may be made to the above-mentioned related description, and further description is omitted here.
S132, converting the audio fragment data into corresponding text information, and storing the text information.
The collected audio clip data can be converted into text information and stored. Storing the converted text information improves the efficiency with which the user obtains information: the user does not need to play back the audio data to determine its content, and can determine the captured key information directly from the text.
The text information and the audio data can be stored together, so that the user can roughly determine from the text whether the audio data is needed and then confirm against the audio itself. This avoids the situation in which the user fails to obtain accurate key information because the converted text contains significant errors.
Optionally, converting the audio clip data into corresponding text information and storing the text information may be implemented as follows:
determining a to-be-converted sound clip corresponding to a conversion time period in the audio clip data, and converting the to-be-converted sound clip into text information; and the conversion time period comprises a fixed time interval taking the starting moment of the preset event information as the starting time.
And storing the text information and the audio fragment data.
The conversion time period may include a fixed time interval with the starting moment of the preset event information as its start time; the conversion time period includes the occurrence time point. For example, when the preset event information occurs at the starting moment 19:17:05, the conversion time period may be the ten seconds from 19:17:00 to 19:17:10. The duration of the fixed time interval may be preset by the system or set by the user, and may also be adjusted for different preset event information.
The text information and the audio clip data are stored in correspondence. Only the to-be-converted sound clip corresponding to the conversion time period is subjected to text conversion, so the sound related to the key information is converted into text without converting the entire audio data, which reduces the system's conversion workload. The text obtained from the sound clip related to the preset event information corresponds to the key information and can serve as a summary, allowing the user to quickly learn what the audio clip data concerns and decide whether it is needed, thereby improving the user's operating efficiency.
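The selection of the to-be-converted sound clip can be sketched as below; transcribe is a placeholder recognizer, and the interval placement (starting five seconds before the event, for a ten-second window) simply mirrors the 19:17:00 to 19:17:10 example above.

    # Sketch of S132: convert only the fixed interval around the event.
    SAMPLE_RATE = 16000
    FIXED_INTERVAL_S = 10                    # ten seconds, as in the example
    LEAD_S = 5                               # start five seconds before the event

    def transcribe(samples):
        # Placeholder for the device's speech-to-text engine.
        return "<text for %d samples>" % len(samples)

    def summarize_clip(clip, event_offset_s):
        begin = max(0, int((event_offset_s - LEAD_S) * SAMPLE_RATE))
        to_convert = clip[begin:begin + FIXED_INTERVAL_S * SAMPLE_RATE]
        return transcribe(to_convert)

    clip = [0] * (60 * SAMPLE_RATE)          # one minute of dummy audio
    print(summarize_clip(clip, event_offset_s=5.0))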
According to the embodiment of the application, the audio clip data are converted into the corresponding text information, and the text information is stored, so that the efficiency of obtaining key information by a user is improved.
Fig. 4 is a schematic flow chart of another audio processing method provided in an embodiment of the present application, and on the basis of the technical solution provided in the foregoing embodiment, as shown in fig. 4, optionally, the method includes:
S140, collecting environmental information of the wearable device and collecting position information of the wearable device.
The wearable device integrates an environment acquisition module, which can acquire information about the environment in which the wearable device is located.
Optionally, the environmental information includes at least one of image data, infrared data, brightness data, and sound data. Correspondingly, the environment acquisition module includes a camera assembly for acquiring image data, an infrared sensor for acquiring infrared data, a light sensor for acquiring brightness data, and a sound sensor for acquiring sound data.
Image recognition can be performed on the collected image data to determine the environmental conditions, including whether the environment is indoor or outdoor and which objects appear in the image. When the ambient light is dim, an infrared image of the environment can be obtained from the infrared data, which likewise allows the environmental conditions to be determined. The brightness data can indicate whether the environment is indoor or outdoor, and the sound sensor can indicate whether the user is indoors or outdoors as well as the noise level of the environment.
The wearable device is provided with a positioning module for collecting position information. The positioning module may be a GPS (Global Positioning System) module; correspondingly, the position information includes GPS data of the wearable device.
S141, determining a use scene of the wearable device according to the environment information and the position information, and triggering the wearable device to acquire audio data if the use scene is a preset scene.
The use scene of the wearable device may be indoor or outdoor, and the user's need to capture sound differs with the scene. For example, if the user is walking along a road while wearing the wearable device, the need to collect and record sound is low. If, however, the user is indoors, for example in a conference room or a classroom, the user may need to collect and record sound.
The using scene of the wearable device can be determined according to the environment information and the position information, and whether the wearable device needs to be triggered to acquire audio data or not is judged according to the using scene. The preset scene comprises a scene suitable for the recording operation, and can be system preset or user preset. Illustratively, the preset scenes include scenes such as meeting rooms, classrooms and meeting places.
The place type corresponding to the current position information can be determined from preset map information, which may come from map applications such as Baidu Maps or Gaode Maps; the corresponding place type is determined from the preset map information using the collected position information. The specific condition of the environment in which the wearable device is located can then be determined from the environment information, and the corresponding use scene determined from the place type. For example, if the place type determined from the position information is a teaching building, whether the user is indoors or outdoors can be determined from the environment information; if the user is outdoors, possibly just beside the teaching building, the wearable device need not be triggered to acquire audio data. If the environment information indicates that the user is indoors, the wearable device may be triggered to collect audio data.
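A minimal sketch of this decision follows, combining the place type obtained from map data with an indoor/outdoor guess from brightness; the place types and the brightness threshold are assumptions for illustration.

    # Sketch of S141: trigger audio collection only in a preset scene.
    PRESET_PLACE_TYPES = {"conference room", "classroom", "meeting place"}
    INDOOR_LUX_THRESHOLD = 500               # assumed brightness cutoff

    def is_indoor(brightness_lux):
        return brightness_lux < INDOOR_LUX_THRESHOLD

    def should_collect_audio(place_type, brightness_lux):
        return place_type in PRESET_PLACE_TYPES and is_indoor(brightness_lux)

    print(should_collect_audio("classroom", brightness_lux=220))    # True
    print(should_collect_audio("classroom", brightness_lux=20000))  # False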
S142, audio data collected by the wearable device are obtained, and if the audio data are detected to include preset event information, audio fragment data corresponding to the preset event information are extracted from the audio data.
S143, storing the audio clip data.
For the above-mentioned specific implementation of the operations, reference may be made to the above-mentioned related description, and further description is omitted here.
In this embodiment, the use scene of the wearable device is determined from the environment information and the position information, and audio data collected by the wearable device is obtained only if the use scene is a preset scene. By triggering the wearable device to acquire audio data only when the environment information and position information indicate a scene suitable for recording, the microphone of the wearable device is kept from being continuously on, avoiding high power consumption.
Fig. 5 is a schematic flowchart of another audio processing method provided in an embodiment of the present application. On the basis of the technical solution provided in the above embodiment, the operation of extracting audio clip data corresponding to the preset event information from the audio data when the audio data is detected to include that information is optimized. Optionally, as shown in fig. 5, the method includes:
S150, acquiring audio data collected by the wearable device.
Reference may be made to the above description for specific embodiments, which are not repeated herein.
S151, performing character recognition on the audio data to acquire character information corresponding to the audio data.
S152, if the text information comprises preset keywords, extracting audio fragment data related to the preset keywords from the audio data; the preset keywords comprise at least one of time, place and preset words.
The text information is the text obtained by performing recognition and conversion on the audio data.
It is judged whether the recognized and converted text information includes a preset keyword; if so, the detected sound includes the preset keyword, and the audio clip data corresponding to the preset keyword is collected. The preset keywords relate to events that the user is particularly concerned about, that have a high execution priority, or that have time requirements, and may be preset by the system or set by the user.
For a specific embodiment of collecting audio clip data corresponding to the preset event information, reference may be made to the above description, and details are not repeated here.
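Keyword spotting over the recognized text can be sketched as follows; the keyword list and the time pattern are illustrative, since the patent leaves the concrete keywords to the system or the user.

    # Sketch of S151/S152: spot preset keywords in the recognized text.
    import re

    PRESET_WORDS = {"meeting", "deadline"}            # assumed preset words
    TIME_PATTERN = re.compile(r"\b\d{1,2}:\d{2}\b")   # e.g. "19:17"

    def find_keywords(text):
        lowered = text.lower()
        hits = [word for word in PRESET_WORDS if word in lowered]
        hits += TIME_PATTERN.findall(text)
        return hits

    print(find_keywords("the project meeting moves to 14:30 tomorrow"))
    # -> ['meeting', '14:30']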
S153, storing the audio clip data.
Reference may be made to the above description for specific embodiments, which are not repeated herein.
Optionally, before extracting the audio clip data related to the preset keyword from the audio data, the method further includes the following operation:
and carrying out voice recognition on the audio data to acquire voice characteristic information and corresponding text information contained in the audio data.
Accordingly, if a preset keyword is included in the text information, extracting audio clip data related to the preset keyword from the audio data may be implemented by:
and if the sound characteristic information is matched with preset characteristic information and the text information comprises preset keywords, acquiring audio fragment data corresponding to the preset keywords.
The voice feature information is information that embodies the voice characteristics of the user and distinguishes the user from others; illustratively, it includes voiceprint information. Each person's voice has corresponding unique voiceprint information, which reflects the characteristics of that person's voice and distinguishes it from the voices of others.
The preset feature information is feature information of the user's voice that has been set in advance, and may be preset voiceprint information. For example, in order to have his or her own voice collected, the user may record his or her voice through the wearable device and set the corresponding feature information as the preset feature information. Later, when sound is detected through the wearable device, if the sound characteristic information extracted from the detected sound matches the preset characteristic information, the collected sound is the user's own voice. It is then further judged whether the recognized and converted text information includes a preset keyword; if so, the detected sound is the user's voice and the user's speech contains the preset keyword, so the audio clip data corresponding to the preset keyword can be collected.
In this embodiment, the audio data is recognized to extract its sound characteristic information and corresponding text information; if the sound characteristic information matches the preset characteristic information and the text information includes a preset keyword, the audio clip data corresponding to the preset keyword is collected. This further improves the accuracy of sound collection and allows the user to select a specific person's voice to collect.
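To illustrate how the voiceprint gate composes with the keyword check, here is a sketch in which the "voiceprint" is a plain feature vector compared by cosine similarity; the similarity measure and the 0.8 threshold are stand-ins, not the speaker-recognition method of the patent.

    # Sketch of the optional voiceprint gate before keyword extraction.
    import math

    MATCH_THRESHOLD = 0.8                    # assumed similarity cutoff

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def keep_clip(observed, enrolled, text, keyword="meeting"):
        speaker_ok = cosine(observed, enrolled) >= MATCH_THRESHOLD
        return speaker_ok and keyword in text.lower()

    enrolled = [0.9, 0.1, 0.4]               # preset feature information
    observed = [0.85, 0.15, 0.38]            # features of the detected sound
    print(keep_clip(observed, enrolled, "Meeting at ten"))   # True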
Fig. 6 is a block diagram of an audio processing apparatus according to an embodiment of the present application, where the apparatus may execute an audio processing method, and as shown in fig. 6, the apparatus includes:
the sound detection module 220 is used for acquiring audio data acquired by the wearable device;
the sound collection module 221 is configured to extract audio segment data corresponding to preset event information from the audio data if it is detected that the audio data includes the preset event information;
a storage module 222, configured to store the audio clip data.
The audio processing apparatus provided in this embodiment of the application acquires audio data collected by the wearable device; if the audio data is detected to include preset event information, audio clip data corresponding to the preset event information is extracted from the audio data; and the audio clip data is stored. In this embodiment, the wearable device captures external sound; if the sound is detected to contain preset event information that can trigger recording, the corresponding sound is captured and recorded, which improves the operating efficiency of recording and reduces redundant information in the audio.
Optionally, the sound collection module is specifically configured to:
and determining the starting time of the preset event information, and extracting audio fragment data of a preset time period from the audio data by taking the starting time as a reference time point.
Optionally, the sound collection module is specifically configured to:
extracting first audio segment data of a first time period with the reference time point as an end time and second audio segment data of a second time period with the reference time point as a start time from the audio data with the start time as a reference time point;
correspondingly, the storage module is specifically configured to: storing the first audio clip data and the second audio clip data.
Optionally, the storage module is specifically configured to:
and converting the audio clip data into corresponding text information, and storing the text information.
Optionally, the storage module is specifically configured to:
determining a to-be-converted sound clip corresponding to a conversion time period in the audio clip data, and converting the to-be-converted sound clip into text information; the conversion time period comprises a fixed time interval taking the starting moment of the preset event information as the starting time;
and storing the text information and the audio fragment data.
Optionally, the apparatus further includes:
the information acquisition module is used for acquiring the environmental information of the wearable equipment and acquiring the position information of the wearable equipment before acquiring the audio data acquired by the wearable equipment;
and the scene module is used for determining the use scene of the wearable device according to the environment information and the position information, and triggering the wearable device to acquire audio data if the use scene is a preset scene.
Optionally, the sound collection module specifically includes:
the character acquisition module is used for carrying out character recognition on the audio data so as to acquire character information corresponding to the audio data;
the acquisition module is used for extracting audio fragment data related to a preset keyword from the audio data if the text information comprises the preset keyword; the preset keywords comprise at least one of time, place and preset words.
Optionally, the apparatus further includes:
the voice feature module is used for performing voice recognition on the audio data before audio fragment data related to the preset keywords are extracted from the audio data so as to obtain voice feature information contained in the audio data;
correspondingly, the acquisition module is specifically configured to:
and if the sound characteristic information is matched with preset characteristic information and the text information comprises preset keywords, extracting audio fragment data related to the preset keywords from the audio data.
The present embodiment provides a wearable device on the basis of the foregoing embodiments. Fig. 7 is a schematic structural diagram of the wearable device provided in the embodiment of the present application, and fig. 8 is a schematic physical diagram of the wearable device provided in the embodiment of the present application. As shown in fig. 7 and 8, the wearable device 200 includes: a memory 201, a processor (CPU) 202, a display unit 203, a touch panel 204, a heart rate detection module 205, a distance sensor 206, a camera 207, a bone conduction speaker 208, a microphone 209, and a breathing light 210, which communicate via one or more communication buses or signal lines 211.
It should be understood that the illustrated wearable device 200 is merely one example of a wearable device, and that the wearable device 200 may have more or fewer components than shown in the figures, may combine two or more components, or may have a different configuration of components. The various components shown in the figures may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.
The wearable device for audio processing provided in the present embodiment is described in detail below, and the wearable device is exemplified by smart glasses.
The memory 201 is accessible by the processor 202 and may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
The display component 203 can be used to display image data and the control interface of the operating system. The display component 203 is embedded in the frame of the smart glasses; an internal transmission line 211 is arranged inside the frame and connected to the display component 203.
The touch panel 204 is disposed on the outer side of at least one temple of the smart glasses and is used to acquire touch data; it is connected to the processor 202 through the internal transmission line 211. The touch panel 204 can detect the user's finger slides and clicks and transmit the detected data to the processor 202 for processing to generate corresponding control instructions, for example a left-shift instruction, a right-shift instruction, an up-shift instruction, or a down-shift instruction.
Illustratively, the display part 203 may display virtual image data transmitted by the processor 202, and this virtual image data may change according to the user operations detected by the touch panel 204. Specifically, the display may switch to the previous or next virtual image frame when a left-shift or right-shift instruction is detected. When the display part 203 displays video playback information, the left-shift instruction may rewind the playing content and the right-shift instruction may fast-forward it. When editable text content is displayed, the left-shift, right-shift, up-shift, and down-shift instructions may move the cursor, that is, the cursor position follows the user's touch operations on the touch panel. When the displayed content is a game picture, these instructions can control an object in the game; for example, in an airplane game the flight direction of the airplane can be controlled by the left-shift, right-shift, up-shift, and down-shift instructions respectively. When the display part 203 can display video pictures of different channels, the instructions can switch between channels, where the up-shift and down-shift instructions may switch to preset channels (such as channels the user commonly uses). When the display part 203 displays a still picture, the instructions can switch between pictures: the left-shift instruction may switch to the previous picture, the right-shift instruction to the next picture, the up-shift instruction to the previous set, and the down-shift instruction to the next set.
The touch panel 204 can also be used to control the display switch of the display part 203. For example, a long press on the touch area of the touch panel 204 powers on the display part 203 to show the image interface, another long press powers it off, and while the display is on, sliding up and down on the touch panel 204 adjusts the brightness or resolution of the displayed image.
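The gesture-to-instruction mapping described above amounts to a small dispatch table; the sketch below shows the idea, with gesture names, instruction names, and the two video-context actions chosen purely for illustration.

    # Sketch of the touch-panel dispatch: slides become shift instructions
    # whose effect depends on what the display part is showing.
    INSTRUCTIONS = {
        "slide_left": "left_shift",
        "slide_right": "right_shift",
        "slide_up": "up_shift",
        "slide_down": "down_shift",
    }

    def handle_gesture(gesture, context):
        instruction = INSTRUCTIONS.get(gesture)
        if instruction is None:
            return "ignored"
        if context == "video":
            if instruction == "right_shift":
                return "fast_forward"
            if instruction == "left_shift":
                return "playback"
        return instruction        # cursor move, channel switch, etc.

    print(handle_gesture("slide_right", "video"))   # fast_forward
    print(handle_gesture("slide_up", "text"))       # up_shift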
The heart rate detection module 205 is used to measure the user's heart rate data, where the heart rate is the number of heartbeats per minute; the module is arranged on the inner side of a temple. Specifically, the heart rate detection module 205 may obtain human electrocardiographic data with dry electrodes using electric-pulse measurement and determine the heart rate from the amplitude peaks in the electrocardiographic data. The heart rate detection module 205 may also measure the heart rate photoelectrically with a light-emitting and light-receiving component; in that case it is arranged at the bottom of the temple, at the earlobe. After the heart rate detection module 205 collects the heart rate data, it sends the data to the processor 202 for processing to obtain the wearer's current heart rate value. In one embodiment, after the user's heart rate value is determined, the processor 202 may display it in real time in the display component 203; optionally, the processor 202 may trigger an alarm when it determines that the heart rate value is low (for example, below 50) or high (for example, above 100), and at the same time send the heart rate value and/or the generated alarm information to the server through the communication module.
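The alarm behavior mentioned above reduces to a threshold check; a sketch using the example bounds from the text (below 50 or above 100 beats per minute) follows.

    # Sketch of the heart-rate alarm logic using the example thresholds.
    LOW_BPM, HIGH_BPM = 50, 100

    def check_heart_rate(bpm):
        if bpm < LOW_BPM:
            return "alarm: heart rate low"
        if bpm > HIGH_BPM:
            return "alarm: heart rate high"
        return "normal"

    for bpm in (45, 72, 130):
        print(bpm, check_heart_rate(bpm))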
The distance sensor 206 may be disposed on the frame and is used to sense the distance from the user's face to the frame; it may be implemented using the infrared sensing principle. Specifically, the distance sensor 206 transmits the acquired distance data to the processor 202, and the processor 202 controls the brightness of the display section 203 according to the distance data. Illustratively, the processor 202 is configured to turn on the display 203 when the distance sensor 206 detects a distance of less than 5 cm, and to turn it off when the distance sensor 206 no longer detects an approaching object.
In addition, other types of sensors may be arranged on the frame of the smart glasses, including at least one of an acceleration sensor, a gyroscope sensor, and a pressure sensor, for detecting shaking, touching, or pressing operations performed by the user on the smart glasses and sending the sensing data to the processor 202, which determines whether to turn on the camera 207 for image acquisition. Fig. 7 shows an acceleration sensor 212 as an example; it should be understood that this does not limit the present embodiment.
The breathing light 210 may be arranged at the edge of the frame; when the display part 203 turns off its screen, the breathing light 210 can be lit with a gradually brightening and dimming effect under the control of the processor 202.
The camera 207 may be a front camera module disposed at the upper frame for collecting image data in front of the user, a rear camera module for collecting the user's eyeball information, or a combination of the two. Specifically, when the camera 207 collects a front image, it sends the collected image to the processor 202 for recognition and processing, and a corresponding trigger event is triggered according to the recognition result. Illustratively, when the user wears the smart glasses at home and a furniture item is recognized in the collected front image, the device queries whether a corresponding control event exists; if so, the control interface corresponding to that control event is displayed in the display part 203, and the user can control the furniture item through the touch panel 204, the furniture item being connected to the smart glasses through Bluetooth or a wireless ad hoc network. When the user wears the smart glasses outdoors, a target recognition mode can be started. This mode can be used to recognize specific people: the camera 207 sends the collected images to the processor 202 for face recognition, and if a preset face is recognized, a voice broadcast can be made through the speaker integrated in the smart glasses. The mode can also be used to recognize different plants: for example, in response to a touch operation on the touch panel 204, the processor 202 records the current image collected by the camera 207 and sends it through the communication module to a server for recognition; the server recognizes the plant in the image and feeds the plant name back to the smart glasses, and the feedback data is displayed in the display part 203.
The camera 207 may also be configured to capture images of the user's eyeball and generate different control instructions by recognizing the rotation of the eyeball: rotating the eyeball upward generates an up-shift control instruction, downward a down-shift control instruction, leftward a left-shift control instruction, and rightward a right-shift control instruction. As above, the display unit 203 may display virtual image data transmitted by the processor 202, and this virtual image data may change according to the control instructions generated from the eyeball movements detected by the camera 207. Specifically, frame switching may be performed: when a left-shift or right-shift control instruction is detected, the previous or next virtual image frame is displayed. When the display part 203 displays video playback information, the left-shift control instruction may rewind the playing content and the right-shift control instruction may fast-forward it. When editable text content is displayed, the shift control instructions move the cursor, that is, the cursor position follows the control instructions generated from the eyeball movements. When the displayed content is a game picture, the shift control instructions control an object in the game; for example, in an airplane game the flight direction of the airplane is controlled by the left-shift, right-shift, up-shift, and down-shift control instructions respectively. When the display part 203 can display video pictures of different channels, the shift control instructions switch channels, where the up-shift and down-shift control instructions may switch to preset channels (such as channels the user commonly uses). When the display part 203 displays a still picture, the shift control instructions switch between pictures: left-shift to the previous picture, right-shift to the next picture, up-shift to the previous picture set, and down-shift to the next picture set.
The bone conduction speaker 208 is provided on the inner wall side of at least one temple and is used to convert the audio signal received from the processor 202 into a vibration signal. The bone conduction speaker 208 transmits sound to the inner ear through the skull: the electrical audio signal is converted into a vibration signal that travels through the skull into the cochlea, where it is sensed by the auditory nerve. Using the bone conduction speaker 208 as the sound-producing device reduces the thickness and weight of the hardware structure; it also produces no electromagnetic radiation and is unaffected by it, and it has the advantages of noise resistance, water resistance, and leaving both ears free.
The microphone 209 may be disposed on the lower frame and is used to capture external (user and environmental) sound and transmit it to the processor 202 for processing. Illustratively, the microphone 209 collects the sound made by the user and the processor 202 performs voiceprint recognition on it; if the voiceprint is recognized as that of an authenticated user, subsequent voice control is accepted. Specifically, the user can speak a command, the microphone 209 sends the collected voice to the processor 202 for recognition, and a corresponding control instruction, such as "power on", "power off", "increase display brightness", or "decrease display brightness", is generated according to the recognition result; the processor 202 then executes the corresponding control process according to the generated control instruction.
The audio processing device of the wearable device and the wearable device provided in the above embodiments can execute the audio processing method of the wearable device provided in any embodiment of the present invention, and have corresponding functional modules and beneficial effects for executing the method. Technical details that are not described in detail in the above embodiments may be referred to an audio processing method of a wearable device provided in any embodiment of the present invention.
Embodiments of the present application also provide a storage medium containing wearable device-executable instructions, which when executed by a wearable device processor, are configured to perform an audio processing method, the method including:
acquiring audio data acquired by wearable equipment;
if the audio data is detected to include preset event information, extracting audio fragment data corresponding to the preset event information from the audio data;
storing the audio clip data.
In one possible embodiment, extracting audio clip data corresponding to the preset event information from the audio data includes:
and determining the starting time of the preset event information, and extracting audio fragment data of a preset time period from the audio data by taking the starting time as a reference time point.
In one possible embodiment, extracting audio clip data of a preset time period from the audio data with the start time as a reference time point includes:
extracting first audio segment data of a first time period with the reference time point as an end time and second audio segment data of a second time period with the reference time point as a start time from the audio data with the start time as a reference time point;
accordingly, the storing the audio clip data comprises:
storing the first audio clip data and the second audio clip data.
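A sketch of this two-sided extraction follows, assuming the audio is available as a buffer of samples; the concrete first and second durations are illustrative assumptions, not values taken from the patent.

```python
# Sketch: extract the first clip ending at the reference point and the second
# clip starting at it. The 10 s / 50 s split is an illustrative assumption.
def extract_around(audio_data, sample_rate, ref_index, first_s=10.0, second_s=50.0):
    start = max(0, ref_index - int(first_s * sample_rate))
    first = audio_data[start:ref_index]              # before the event start
    second = audio_data[ref_index:ref_index + int(second_s * sample_rate)]
    return first, second
```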
In one possible embodiment, storing the audio clip data comprises:
converting the audio clip data into corresponding text information, and storing the text information.
In one possible embodiment, converting the audio clip data into corresponding text information and storing the text information includes:
determining a to-be-converted sound clip corresponding to a conversion time period in the audio clip data, and converting the to-be-converted sound clip into text information, the conversion time period being a fixed time interval beginning at the starting time of the preset event information;
storing the text information together with the audio fragment data.
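A sketch of this summary-plus-clip storage follows, assuming a hypothetical speech_to_text function and a save callback; neither is named in the patent.

```python
# Sketch: convert only the conversion time period (a fixed window starting at
# the event's start time) to text, then store the text with the full clip.
def store_with_summary(clip, sample_rate, event_start_index, window_s,
                       speech_to_text, save):
    end = event_start_index + int(window_s * sample_rate)
    to_convert = clip[event_start_index:end]   # fixed conversion window
    text = speech_to_text(to_convert)          # serves as the clip's summary
    save(clip=clip, summary=text)              # stored in correspondence
```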
In one possible embodiment, before acquiring the audio data collected by the wearable device, the method further includes:
collecting environment information of the wearable device and collecting position information of the wearable device;
determining a use scene of the wearable device according to the environment information and the position information, and triggering the wearable device to acquire audio data if the use scene is a preset scene.
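A hedged sketch of this scene gating follows; the classification rule below (ambient sound level plus location) and all thresholds and scene names are purely assumptions for illustration.

```python
# Sketch: decide the usage scene from environment and position information and
# trigger audio capture only in preset scenes. Rules here are illustrative.
PRESET_SCENES = {"meeting", "lecture"}

def usage_scene(ambient_db, location):
    if location == "office" and ambient_db > 40:
        return "meeting"
    if location == "classroom" and ambient_db > 40:
        return "lecture"
    return "other"

def maybe_start_capture(ambient_db, location, start_capture):
    if usage_scene(ambient_db, location) in PRESET_SCENES:
        start_capture()
```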
In one possible embodiment, if it is detected that preset event information is included in the audio data, extracting audio clip data corresponding to the preset event information from the audio data includes:
performing text recognition on the audio data to acquire text information corresponding to the audio data;
if the text information comprises preset keywords, extracting audio segment data related to the preset keywords from the audio data; the preset keywords comprise at least one of time, place and preset words.
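A sketch of this keyword-driven extraction follows, assuming the recognizer returns word-level timestamps (an assumption; the patent does not specify the recognizer's output format).

```python
# Sketch: scan the recognized words for preset keywords (times, places, preset
# words) and extract the audio around each hit. The keyword set and the 5 s
# margin are illustrative assumptions.
PRESET_KEYWORDS = {"tomorrow", "3 pm", "meeting room"}

def extract_by_keyword(words, audio, sample_rate, margin_s=5.0):
    """words: list of (word, start_s, end_s) tuples from speech recognition."""
    clips = []
    for word, start_s, end_s in words:
        if word in PRESET_KEYWORDS:
            lo = max(0, int((start_s - margin_s) * sample_rate))
            hi = int((end_s + margin_s) * sample_rate)
            clips.append(audio[lo:hi])
    return clips
```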
In one possible embodiment, before extracting audio segment data related to the preset keyword from the audio data, the method further includes:
performing sound identification on the audio data to acquire sound characteristic information contained in the audio data;
correspondingly, if the text information includes a preset keyword, extracting audio segment data related to the preset keyword from the audio data includes:
if the sound characteristic information matches preset characteristic information and the text information includes a preset keyword, extracting audio fragment data related to the preset keyword from the audio data.
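Combining the two conditions of this embodiment, a clip is extracted only when both the voiceprint gate and the keyword gate pass. A sketch, reusing the hypothetical helpers from the previous sketches:

```python
# Sketch: extract keyword-related clips only if the sound features match the
# preset voiceprint AND the transcript contains a preset keyword.
def extract_if_speaker_and_keyword(audio, words, sample_rate,
                                   matches_voiceprint, extract_by_keyword):
    if not matches_voiceprint(audio):      # sound feature gate
        return []
    return extract_by_keyword(words, audio, sample_rate)  # keyword gate
```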
Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media such as CD-ROM, floppy disk, or tape devices; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory such as flash memory or magnetic media (e.g., a hard disk or optical storage); registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or in a different second computer system connected to the first computer system through a network (such as the Internet). The second computer system may provide program instructions to the first computer for execution. The term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems connected by a network. The storage medium may store program instructions (e.g., embodied as a computer program) executable by one or more processors.
Of course, the storage medium provided in the embodiments of the present application contains computer-executable instructions, and the computer-executable instructions are not limited to the audio processing operations described above, and may also perform related operations in the audio processing method provided in any embodiment of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present application and the technical principles employed. It will be understood by those skilled in the art that the present application is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the application. Therefore, although the present application has been described in more detail with reference to the above embodiments, the present application is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present application, and the scope of the present application is determined by the scope of the appended claims.

Claims (9)

1. An audio processing method, comprising:
acquiring audio data collected by a wearable device; the audio data is audio fragment data corresponding to preset event information, collected and recorded by starting a recording function of the wearable device when the wearable device detects that external sound includes the preset event information, wherein the preset event information comprises preset audio features;
determining the starting time of the preset event information, and extracting audio fragment data of a preset time period from the audio data with the starting time as a reference time point;
storing the audio clip data;
wherein the extracting of the audio fragment data of the preset time period from the audio data with the starting time as a reference time point includes:
extracting audio fragment data of the preset time period from the audio data with the starting time as the start time,
or,
extracting first audio fragment data of a first time period with the reference time point as an end time and second audio fragment data of a second time period with the reference time point as a start time from the audio data, wherein the combined duration of the first time period and the second time period equals the duration of the preset time period;
and the storing of the audio clip data includes:
converting the audio clip data into text information, and storing the audio clip data and the text information in correspondence, wherein the text information serves as an abstract of the audio clip data.
2. The method according to claim 1, wherein, when first audio fragment data of a first time period with the reference time point as an end time and second audio fragment data of a second time period with the reference time point as a start time are extracted from the audio data with the starting time as the reference time point,
accordingly, the storing the audio clip data comprises:
storing the first audio clip data and the second audio clip data.
3. The method of claim 1, wherein converting the audio clip data into text information and storing the text information comprises:
determining a to-be-converted sound clip corresponding to a conversion time period in the audio clip data, and converting the to-be-converted sound clip into text information, the conversion time period being a fixed time interval beginning at the starting time of the preset event information;
storing the text information together with the audio fragment data.
4. The method of any one of claims 1 to 3, wherein, before acquiring the audio data collected by the wearable device, the method further comprises:
collecting environment information of the wearable device and collecting position information of the wearable device;
determining a use scene of the wearable device according to the environment information and the position information, and triggering the wearable device to acquire audio data if the use scene is a preset scene.
5. The method according to any one of claims 1 to 3, wherein, if it is detected that preset event information is included in the audio data, extracting audio clip data corresponding to the preset event information from the audio data comprises:
performing text recognition on the audio data to acquire text information corresponding to the audio data;
if the text information includes a preset keyword, extracting audio segment data related to the preset keyword from the audio data; the preset keyword comprises at least one of a time, a place, and a preset word.
6. The method according to claim 5, further comprising, before extracting audio segment data related to the preset keyword from the audio data:
performing sound identification on the audio data to acquire sound characteristic information contained in the audio data;
wherein, if the text information includes a preset keyword, extracting audio segment data related to the preset keyword from the audio data includes:
if the sound characteristic information matches preset characteristic information and the text information includes a preset keyword, extracting audio fragment data related to the preset keyword from the audio data.
7. An audio processing apparatus, comprising:
a sound detection module, used for acquiring audio data collected by the wearable device; the audio data is audio fragment data corresponding to preset event information, collected and recorded by starting a recording function of the wearable device when the wearable device detects that external sound includes the preset event information, wherein the preset event information comprises preset audio features;
a sound acquisition module, used for extracting audio fragment data corresponding to the preset event information from the audio data;
a storage module, used for storing the audio fragment data;
wherein the sound acquisition module is specifically configured to determine the starting time of the preset event information and extract audio fragment data of a preset time period from the audio data with the starting time as a reference time point;
the extracting of the audio fragment data of the preset time period from the audio data with the starting time as a reference time point includes:
extracting audio fragment data of the preset time period from the audio data with the starting time as the start time,
or,
extracting first audio fragment data of a first time period with the reference time point as an end time and second audio fragment data of a second time period with the reference time point as a start time from the audio data, wherein the combined duration of the first time period and the second time period equals the duration of the preset time period;
and the storage module is specifically configured to convert the audio clip data into text information and store the audio clip data and the text information in correspondence, wherein the text information serves as an abstract of the audio clip data.
8. A wearable device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the audio processing method according to any one of claims 1-6 when executing the computer program.
9. A storage medium containing wearable device-executable instructions, which when executed by a wearable device processor, are configured to perform the audio processing method of any of claims 1-6.
CN201811001212.6A 2018-08-30 2018-08-30 Audio processing method and device, wearable device and storage medium Active CN109257490B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811001212.6A CN109257490B (en) 2018-08-30 2018-08-30 Audio processing method and device, wearable device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811001212.6A CN109257490B (en) 2018-08-30 2018-08-30 Audio processing method and device, wearable device and storage medium

Publications (2)

Publication Number Publication Date
CN109257490A CN109257490A (en) 2019-01-22
CN109257490B true CN109257490B (en) 2021-07-09

Family

ID=65048964

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811001212.6A Active CN109257490B (en) 2018-08-30 2018-08-30 Audio processing method and device, wearable device and storage medium

Country Status (1)

Country Link
CN (1) CN109257490B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111128243B (en) * 2019-12-25 2022-12-06 苏州科达科技股份有限公司 Noise data acquisition method, device and storage medium
CN111564165B (en) * 2020-04-27 2021-09-28 北京三快在线科技有限公司 Data storage method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6497372B2 (en) * 2016-09-29 2019-04-10 トヨタ自動車株式会社 Voice dialogue apparatus and voice dialogue method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105336329A (en) * 2015-09-25 2016-02-17 联想(北京)有限公司 Speech processing method and system
CN105657129A (en) * 2016-01-25 2016-06-08 百度在线网络技术(北京)有限公司 Call information obtaining method and device
CN106024009A (en) * 2016-04-29 2016-10-12 北京小米移动软件有限公司 Audio processing method and device
CN106448702A (en) * 2016-09-14 2017-02-22 努比亚技术有限公司 Recording data processing device and method, and mobile terminal

Also Published As

Publication number Publication date
CN109257490A (en) 2019-01-22

Similar Documents

Publication Publication Date Title
CN110291489B (en) Computationally efficient human identification intelligent assistant computer
CN109145847B (en) Identification method and device, wearable device and storage medium
CN109259724B (en) Eye monitoring method and device, storage medium and wearable device
JP6574937B2 (en) COMMUNICATION SYSTEM, CONTROL METHOD, AND STORAGE MEDIUM
US11010601B2 (en) Intelligent assistant device communicating non-verbal cues
CN105009598B (en) The equipment of interest when acquisition viewer's viewing content
US20130177296A1 (en) Generating metadata for user experiences
US9100667B2 (en) Life streaming
CN109032384B (en) Music playing control method and device, storage medium and wearable device
US20200175976A1 (en) Contextually relevant spoken device-to-device communication between iot devices
CN109061903B (en) Data display method and device, intelligent glasses and storage medium
KR20170033641A (en) Electronic device and method for controlling an operation thereof
CN109254659A (en) Control method, device, storage medium and the wearable device of wearable device
CN109224432B (en) Entertainment application control method and device, storage medium and wearable device
CN109241900B (en) Wearable device control method and device, storage medium and wearable device
US20210350823A1 (en) Systems and methods for processing audio and video using a voice print
CN109255064A (en) Information search method, device, intelligent glasses and storage medium
CN109257490B (en) Audio processing method and device, wearable device and storage medium
CN109240639A (en) Acquisition methods, device, storage medium and the terminal of audio data
WO2019039591A4 (en) Read-out system and read-out method
US20210368279A1 (en) Smart hearing assistance in monitored property
CN109068126B (en) Video playing method and device, storage medium and wearable device
CN113574525A (en) Media content recommendation method and equipment
CN109067627A (en) Appliances equipment control method, device, wearable device and storage medium
CN106564059B (en) A kind of domestic robot system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant