CN108093124B - Audio positioning method and device and mobile terminal - Google Patents


Info

Publication number
CN108093124B
CN108093124B (application CN201711132035.0A)
Authority
CN
China
Prior art keywords
audio data
sub
key information
content
audio
Prior art date
Legal status
Active
Application number
CN201711132035.0A
Other languages
Chinese (zh)
Other versions
CN108093124A (en)
Inventor
王亚运
Current Assignee
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd filed Critical Vivo Mobile Communication Co Ltd
Priority to CN201711132035.0A priority Critical patent/CN108093124B/en
Publication of CN108093124A publication Critical patent/CN108093124A/en
Application granted granted Critical
Publication of CN108093124B publication Critical patent/CN108093124B/en


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M 1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M 1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M 1/72433 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F 16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/686 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Library & Information Science (AREA)
  • General Business, Economics & Management (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephone Function (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides an audio positioning method, an audio positioning device, and a mobile terminal. The method comprises the following steps: receiving first audio data; extracting first key information from the first audio data; searching second audio data for sub-audio data that matches the first key information; and adding a feature mark to the sub-audio data. Using the feature marks, a user can quickly find the required content in the second audio data, which improves the efficiency of repeatedly listening to a recording and the user's experience when doing so.

Description

Audio positioning method and device and mobile terminal
Technical Field
The invention relates to the technical field of mobile terminals, in particular to an audio positioning method and device and a mobile terminal.
Background
With the development of science and technology, mobile terminals have become indispensable communication tools in people's lives, and the recording function is one of their essential features. Many users rely on recording, and a good recording experience is also an important factor in the working efficiency of professionals such as meeting recorders.
Current improvements to the recording experience mostly focus on enhancing sound quality, and do little to improve the efficiency with which users repeatedly listen to recordings.
Disclosure of Invention
Embodiments of the present invention provide an audio positioning method, an audio positioning device, and a mobile terminal, aiming to solve the problem of low efficiency in locating recorded content when a user repeatedly listens to a recording.
In order to solve the above technical problem, an embodiment of the present invention provides an audio positioning method applied to a mobile terminal, where the method includes:
receiving first audio data;
extracting first key information from the first audio data;
searching second audio data for sub-audio data that matches the first key information;
adding a feature mark to the sub-audio data.
The embodiment of the invention also provides an audio positioning device, which is deployed on a mobile terminal, and comprises:
the audio data receiving module is used for receiving first audio data;
the key information extraction module is used for extracting first key information from the first audio data;
the sub-audio data searching module is used for searching second audio data for sub-audio data that matches the first key information;
and the characteristic mark adding module is used for adding characteristic marks to the sub audio data.
An embodiment of the present invention further provides a mobile terminal, which includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements the steps of the audio positioning method as described above.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the steps of the audio positioning method described above.
In the embodiment of the invention, the mobile terminal extracts the first key information from the first audio data, matches the second audio data against the first key information, and adds a feature mark to the sub-audio data in the second audio data that matches the first key information. The user can then quickly find the required content in the second audio data according to the feature marks, which improves the efficiency of repeatedly listening to the recording and the user's experience when doing so.
The foregoing is only an overview of the technical solutions of the present invention. Specific embodiments are described below so that the technical means of the invention, as well as the above and other objects, features, and advantages, can be understood more clearly.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described here show only some embodiments of the invention; those skilled in the art can derive other drawings from them without inventive labor.
Fig. 1 is a flowchart illustrating the steps of an audio positioning method according to a first embodiment of the present invention;
Fig. 2 is a flowchart illustrating the steps of an audio positioning method according to a second embodiment of the present invention;
Fig. 3 is a block diagram of an audio positioning apparatus according to a third embodiment of the present invention;
Fig. 4 is a second block diagram of an audio positioning apparatus according to the third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a mobile terminal according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention; all other embodiments derived from them by those skilled in the art without creative effort fall within the protection scope of the invention.
Example one
Fig. 1 is a flowchart illustrating steps of an audio positioning method according to an embodiment of the present invention. The method comprises the following steps:
step 101, receiving first audio data.
In this embodiment, a user inputs first audio data containing search content into the mobile terminal, and the mobile terminal receives the first audio data submitted by the user. The first audio data may contain one word or several words. For example, the user inputs first audio data containing the search content "the second content is Y", and the mobile terminal receives it. The embodiment does not limit the first audio data in detail; it may be set according to the actual situation.
Step 102, extracting first key information from the first audio data.
In this embodiment, after the first audio data is received, the first key information is extracted from it. Specifically, the first audio data may be converted into text content and displayed on the screen of the mobile terminal, and the first key information is then received as the portion the user selects from the displayed text. For example, the first audio data is converted into the text "the second content is Y", and the "Y" selected by the user is received as the first key information. Alternatively, the first audio data may be converted into text content, the text split into several phrases, and the resulting phrases used as the first key information. For example, the first audio data is converted into the text "the second content is Y", the phrases "second" and "Y" are split from it, and "second" and "Y" are used as the first key information. The embodiment does not limit how the first key information is extracted; it may be set according to the actual situation.
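The phrase-splitting variant described above could be sketched roughly as follows. This is an illustrative assumption, not the patent's implementation: the stop-word list, function name, and English tokenization all stand in for whatever real extraction the terminal would use.

```python
# Hypothetical sketch of the phrase-splitting variant of step 102: the
# transcript of the first audio data is split into words and common stop
# words are discarded, leaving candidate key information. The stop-word
# list below is an illustrative assumption.
STOP_WORDS = {"the", "is", "a", "an", "of", "to", "content"}

def extract_key_information(transcript: str) -> list[str]:
    """Split a transcript into candidate key phrases (lowercased)."""
    words = transcript.lower().replace(",", " ").split()
    return [w for w in words if w not in STOP_WORDS]

print(extract_key_information("the second content is Y"))
# → ['second', 'y']
```

This reproduces the example in the text: from "the second content is Y", the key information "second" and "Y" remains.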
Step 103, searching the sub audio data matched with the first key information from the second audio data.
In this embodiment, the second audio data is the audio data to be identified. After the first key information is extracted from the first audio data, the second audio data is searched for sub-audio data that matches the first key information. Specifically, the second audio data may be converted into text content, a phrase matching the first key information found in that text, and the audio data where the matching phrase is located taken as the found sub-audio data. For example, the second audio data converts into the following text content: today's conference includes three pieces of content X, Y, Z; the first content is X …; the second content is Y …; the third content is Z …. The phrases "second" and "Y" matching the first key information are found in the converted text, so "the second content is Y …" is the found sub-audio data. The embodiment does not limit how the sub-audio data is delimited; it may be set according to the actual situation.
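The search in step 103 can be sketched as a scan over recognizer output that carries timestamps. The segment format and function name below are assumptions; real speech recognizers expose timestamps differently.

```python
def find_matching_segment(segments, key_info):
    """Return the first segment whose text contains every key phrase.

    `segments` is a list of (start_sec, end_sec, text) tuples, as a
    speech recognizer with timestamps might produce (assumed format).
    """
    for start, end, text in segments:
        lowered = text.lower()
        if all(k.lower() in lowered for k in key_info):
            return (start, end, text)
    return None

segments = [
    (0, 30, "today's conference includes three contents X, Y, Z"),
    (30, 120, "the first content is X ..."),
    (120, 300, "the second content is Y ..."),
]
print(find_matching_segment(segments, ["second", "Y"]))
# → (120, 300, 'the second content is Y ...')
```

Because the segment carries its start and end times, the match in the text maps directly back to a span of the second audio data.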
And 104, adding a characteristic mark to the sub-audio data.
In this embodiment, a feature mark is added to the found sub-audio data. For example, if the found sub-audio data lies between 0:31'45" and 0:57'11" of the second audio data, that span may be marked with a special color to distinguish it from the rest of the audio; a label may also be displayed at 0:31'45". The embodiment does not limit how the feature mark is added; it may be set according to the actual situation.
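A minimal data model for step 104 might look like the sketch below. The class and field names are hypothetical; the point is only that a feature mark pairs the sub-audio time span with its rendering (a color and/or a label at the start position).

```python
from dataclasses import dataclass, field

@dataclass
class FeatureMark:
    """One feature mark on a recording: a time span plus rendering hints."""
    start_sec: int
    end_sec: int
    color: str = "yellow"
    label: str = ""

@dataclass
class Recording:
    duration_sec: int
    marks: list = field(default_factory=list)

    def add_mark(self, start_sec, end_sec, **kwargs):
        self.marks.append(FeatureMark(start_sec, end_sec, **kwargs))

# Mark the span 0:31'45" - 0:57'11" of the second audio data.
rec = Recording(duration_sec=3600)
rec.add_mark(31 * 60 + 45, 57 * 60 + 11, color="yellow", label="Y")
print(rec.marks[0].start_sec)  # → 1905
```

The UI layer would then draw each mark's span in its color on the playback timeline and show the label at `start_sec`.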
In summary, in the embodiment of the present invention, the mobile terminal extracts the first key information from the first audio data, searches for the sub-audio data in the second audio data that matches the first key information, and adds the feature tag to the searched sub-audio data, so that the user can quickly find the required content from the second audio data according to the feature tag, thereby improving the efficiency of repeatedly listening to the recording, and improving the user experience when the user repeatedly listens to the recording.
Example two
Fig. 2 is a flowchart illustrating steps of an audio positioning method according to an embodiment of the present invention. The method comprises the following steps:
step 201, receiving first audio data.
Step 202, extracting first key information from the first audio data.
Step 203, identifying the audio content of the second audio data.
In this embodiment, the second audio data is subjected to speech recognition, and the audio content of the second audio data is recognized. For example, the second audio data is subjected to speech recognition, and the recognized audio content includes: today's conferences include three pieces of content X, Y, Z; the first content is X … …; the second content is Y … …; the third is Z … …. The embodiment of the invention does not limit the speech recognition in detail, and can be set according to the actual situation.
Step 204, dividing the second audio data into a plurality of sub audio data according to the audio content.
In this embodiment, the second audio data is divided according to the audio content. For example, the second audio data is divided into four pieces of sub-audio data according to the audio content: the first sub-audio data contains "today's conference includes three contents X, Y, Z"; the second contains "the first content is X …"; the third contains "the second content is Y …"; and the fourth contains "the third content is Z …".
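Steps 203–204 can be sketched as a pass over the recognizer's timed word stream that starts a new segment at each ordinal marker. This simplistic heuristic and all names are assumptions standing in for real content analysis:

```python
def divide_by_content(timed_words):
    """Group (time_sec, word) pairs into sub-audio segments, starting a
    new segment whenever an ordinal marker appears (a simplistic,
    assumed heuristic standing in for real content analysis)."""
    markers = {"first", "second", "third", "fourth"}
    segments, current = [], []
    for t, w in timed_words:
        if w.lower().strip(",.") in markers and current:
            segments.append(current)   # close the previous segment
            current = []
        current.append((t, w))
    if current:
        segments.append(current)
    return segments

timed = [(0, "today's"), (1, "meeting"), (2, "covers"), (3, "X"), (4, "Y"), (5, "Z"),
         (10, "first"), (11, "content"), (12, "X"),
         (20, "second"), (21, "content"), (22, "Y"),
         (30, "third"), (31, "content"), (32, "Z")]
print(len(divide_by_content(timed)))  # → 4
```

Each resulting segment keeps its word timestamps, so the sub-audio data's start and end times fall out of the division for free.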
Step 205, searching for the sub audio data matched with the first key information from the plurality of sub audio data.
In this embodiment, searching for the sub audio data matched with the first key information from the plurality of sub audio data may specifically include the following steps:
and the first substep is to extract second key information for each sub audio data.
In this embodiment, after the second audio data is divided into a plurality of sub audio data, the second key information is extracted from each sub audio data. For example, second key information is extracted for the first sub audio data, the second key information being "three", "X", "Y", "Z"; extracting second key information from the second sub-audio data, wherein the second key information is 'first' and 'X'; extracting second key information from the third sub-audio data, wherein the second key information is 'second', 'Y'; and extracting second key information from the fourth sub audio data, wherein the second key information is 'third' and 'Z'.
And a second substep of performing word sense matching on the first key information and the second key information and determining the matching degree of the content of the first key information and the content of each piece of sub-audio data.
In this embodiment, word sense matching is performed between the first key information and the second key information. For example, the first key information "second" and "Y" is word-sense matched against the second key information "three", "X", "Y", "Z" of the first sub-audio data, against "first" and "X" of the second sub-audio data, against "second" and "Y" of the third sub-audio data, and against "third" and "Z" of the fourth sub-audio data. Optionally, word sense matching includes at least one of same-word matching, near-synonym matching, and multilingual matching; word banks of near-synonyms, similar words, and cross-language equivalents can be established to support it. The embodiment does not limit this in detail; it may be set according to the actual situation.
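Such a word bank could be sketched as sets of interchangeable phrases; two phrases match if they are identical or share a set. The entries below are illustrative assumptions only:

```python
# Hypothetical word bank: each set groups phrases that should be treated
# as matching (same word, near-synonyms, or cross-language equivalents).
WORD_BANK = [
    {"second", "2nd", "no. 2"},
    {"first", "1st", "no. 1"},
]

def word_sense_match(a: str, b: str) -> bool:
    """Two phrases match if equal or if some word-bank set holds both."""
    a, b = a.lower(), b.lower()
    return a == b or any(a in s and b in s for s in WORD_BANK)

print(word_sense_match("second", "2nd"))    # → True
print(word_sense_match("second", "third"))  # → False
```

A multilingual deployment would simply add cross-language equivalents to the same sets.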
After word sense matching, the matching degree between the content of the first key information and the content of each piece of sub-audio data is determined. For example, the second key information "Y" of the first sub-audio data matches the "Y" in the first key information, i.e. one of the two items of first key information, so the matching degree is determined to be 50%; the second key information of the second sub-audio data does not match the first key information, so the matching degree is 0; the second key information "second" and "Y" of the third sub-audio data matches the "second" and "Y" in the first key information, so the matching degree is 100%; and the second key information of the fourth sub-audio data does not match, so the matching degree is 0. As another example, if the second key information extracted from the third sub-audio data comprises ten occurrences of "second", eight occurrences of "Y", and one "A", and the match count after word sense matching against the first key information is 9, the matching degree is determined to be 90%. The embodiment does not limit how the matching degree is determined; it may be set according to the actual situation.
And a third substep of searching the sub-audio data with the matching degree meeting the preset condition.
In this embodiment, after the matching degree between the first key information and each piece of sub-audio data is determined, the sub-audio data whose matching degree meets a preset condition is selected. For example, if the preset condition is a matching degree greater than 80%, only the third sub-audio data is found; if the preset condition is a matching degree greater than or equal to 50%, the first and third sub-audio data are found. The embodiment does not limit the preset condition in detail; it may be set according to the actual situation.
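The second and third substeps together might be sketched as below, with exact lowercase comparison standing in for full word-sense matching; the function names, the per-segment list format, and the default threshold are all assumptions.

```python
def matching_degree(first_key_info, second_key_info):
    """Fraction of first-key-information items found among a sub-audio
    segment's second key information (exact lowercase comparison stands
    in for full word-sense matching)."""
    if not first_key_info:
        return 0.0
    second = {k.lower() for k in second_key_info}
    matched = sum(1 for k in first_key_info if k.lower() in second)
    return matched / len(first_key_info)

def find_matching_sub_audio(first_key_info, per_segment_key_info, threshold=0.8):
    """Indices of sub-audio segments whose matching degree exceeds the
    preset threshold."""
    return [i for i, keys in enumerate(per_segment_key_info)
            if matching_degree(first_key_info, keys) > threshold]

per_segment = [["three", "X", "Y", "Z"], ["first", "X"],
               ["second", "Y"], ["third", "Z"]]
print(find_matching_sub_audio(["second", "Y"], per_segment))  # → [2]
print(find_matching_sub_audio(["second", "Y"], per_segment, threshold=0.49))
# → [0, 2]
```

This reproduces the worked example in the text: degrees of 50%, 0, 100%, and 0 for the four sub-audio data, so a >80% condition finds only the third segment and a ≥50% condition finds the first and third.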
Step 206, adding corresponding feature marks to the found sub-audio data.
In this embodiment, a feature mark is added to each piece of found sub-audio data. For example, if the first and third sub-audio data are found, a red mark may be added to the first sub-audio data and a yellow mark to the third; alternatively, a mark containing "Y" may be added to each of them. The embodiment does not limit the feature marks in detail; they may be set according to the actual situation.
And step 207, receiving a click operation instruction for the feature tag.
In this embodiment, when the user finds the content to be listened to again according to the feature tag, the user clicks the feature tag, and the mobile terminal receives an operation instruction for clicking the feature tag. For example, the mobile terminal receives an operation instruction of clicking a yellow mark.
And step 208, playing the sub-audio data corresponding to the feature tag.
In this embodiment, after receiving an operation instruction for clicking the feature tag, the sub-audio data corresponding to the feature tag is played. For example, after receiving an operation instruction of clicking the yellow mark, the third sub-audio data "the second content is Y … …" corresponding to the yellow mark is played. Therefore, the user can quickly find the required content from the second audio data according to the feature marks, and the efficiency of listening to the recording again is improved.
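Steps 207–208 amount to a dispatch from the tapped feature mark to a playback call. A minimal sketch, with the mark list, callback, and names all assumed:

```python
def on_mark_clicked(marks, clicked_index, play):
    """Handle a tap on a feature mark by playing the corresponding
    sub-audio span. `marks` holds (start_sec, end_sec) pairs and `play`
    is the terminal's playback callback (both assumed interfaces)."""
    start_sec, end_sec = marks[clicked_index]
    play(start_sec, end_sec)

played = []
marks = [(0, 30), (120, 300)]  # e.g. the red mark and the yellow mark
on_mark_clicked(marks, 1, lambda s, e: played.append((s, e)))
print(played)  # → [(120, 300)]
```

Tapping the yellow mark thus seeks straight to the matched sub-audio span rather than requiring the user to scrub through the whole recording.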
In summary, in the embodiment of the present invention, the mobile terminal extracts the first key information from the first audio data, matches the second audio data with the first key information, and adds the feature tag to the sub-audio data in the second audio data that matches the first key information, so that the user can quickly find the required content from the second audio data according to the feature tag, thereby improving the efficiency of repeatedly listening to the recording and improving the user experience when the user repeatedly listens to the recording.
Example three
Fig. 3 shows a block diagram of an audio positioning apparatus according to an embodiment of the present invention. The audio positioning device is deployed on a mobile terminal, and comprises an audio data receiving module 301, a key information extracting module 302, a sub audio data searching module 303 and a feature tag adding module 304:
an audio data receiving module 301, configured to receive first audio data;
a key information extraction module 302, configured to extract first key information from the first audio data;
a sub audio data searching module 303, configured to search, from second audio data, sub audio data that matches the first key information;
a feature label adding module 304, configured to add a feature label to the sub audio data.
On the basis of fig. 3, optionally, the sub audio data searching module 303 includes an audio content identifying sub-module 3031, an audio data dividing sub-module 3032, and a sub audio data searching sub-module 3033, as shown in fig. 4:
an audio content identification submodule 3031, configured to identify audio content of the second audio data;
an audio data dividing sub-module 3032, configured to divide the second audio data into a plurality of sub-audio data according to the audio content;
the sub audio data searching sub-module 3033 is configured to search sub audio data matching the first key information from the plurality of sub audio data.
On the basis of fig. 4, optionally, the sub audio data searching sub-module 3033 includes a key information extracting unit, a matching degree determining unit, and a sub audio data searching unit:
a key information extraction unit for extracting second key information for each sub audio data, respectively;
a matching degree determining unit, configured to perform word sense matching on the first key information and the second key information, and determine a matching degree between the content of the first key information and the content of each piece of sub-audio data;
and the sub audio data searching unit is used for searching the sub audio data of which the matching degree meets the preset condition.
On the basis of fig. 4, optionally, the feature label adding module 304 is specifically configured to add corresponding feature labels to the multiple found sub audio data respectively.
On the basis of fig. 3, optionally, after the feature label adding module 304, the apparatus further includes an operation instruction receiving module 305 and a sub audio data playing module 306, see fig. 4:
an operation instruction receiving module 305, configured to receive a click operation instruction for the feature tag;
a sub audio data playing module 306, configured to play the sub audio data corresponding to the feature tag.
The audio positioning apparatus provided in the embodiment of the present invention can implement each process implemented by the method embodiments of fig. 1 to fig. 2, and is not described herein again to avoid repetition. According to the method and the device, the user can quickly find the required content from the second audio data according to the characteristic marks, the efficiency of repeatedly listening to the recording is improved, and the use experience of the user in repeatedly listening to the recording is improved.
Fig. 5 is a schematic diagram of a hardware structure of a mobile terminal for implementing various embodiments of the present invention, where the mobile terminal 400 includes, but is not limited to: radio frequency unit 401, network module 402, audio output unit 403, input unit 404, sensor 405, display unit 406, user input unit 407, interface unit 408, memory 409, processor 410, and power supply 411. Those skilled in the art will appreciate that the mobile terminal architecture shown in fig. 5 is not intended to be limiting of mobile terminals, and that a mobile terminal may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. In the embodiment of the present invention, the mobile terminal includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.
A processor 410 for receiving first audio data; extracting first key information from the first audio data; searching sub audio data matched with the first key information from second audio data; adding a feature label to the sub-audio data.
According to the method and the device, the user can quickly find the required content from the second audio data according to the feature marks, the efficiency of repeatedly listening to the recording is improved, and the use experience of the user in repeatedly listening to the recording is improved.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 401 may be used to receive and send signals during messaging or a call; specifically, it receives downlink data from a base station and forwards it to the processor 410 for processing, and it transmits uplink data to the base station. Typically, the radio frequency unit 401 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 401 can communicate with a network and other devices through a wireless communication system.
The mobile terminal provides the user with wireless broadband internet access through the network module 402, such as helping the user send and receive e-mails, browse web pages, and access streaming media.
The audio output unit 403 may convert audio data received by the radio frequency unit 401 or the network module 402 or stored in the memory 409 into an audio signal and output as sound. Also, the audio output unit 403 may also provide audio output related to a specific function performed by the mobile terminal 400 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 403 includes a speaker, a buzzer, a receiver, and the like.
The input unit 404 is used to receive audio or video signals. The input unit 404 may include a graphics processing unit (GPU) 4041 and a microphone 4042; the graphics processor 4041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The processed image frames may be displayed on the display unit 406, stored in the memory 409 (or another storage medium), or transmitted via the radio frequency unit 401 or the network module 402. The microphone 4042 can receive sound and process it into audio data. In phone call mode, the processed audio data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 401.
The mobile terminal 400 also includes at least one sensor 405, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 4061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 4061 and/or the backlight when the mobile terminal 400 is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of the mobile terminal (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), and vibration identification related functions (such as pedometer, tapping); the sensors 405 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, etc., which will not be described in detail herein.
The display unit 406 is used to display information input by the user or information provided to the user. The Display unit 406 may include a Display panel 4061, and the Display panel 4061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 407 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the user input unit 407 includes a touch panel 4071 and other input devices 4072. Touch panel 4071, also referred to as a touch screen, may collect touch operations by a user on or near it (e.g., operations by a user on or near touch panel 4071 using a finger, a stylus, or any suitable object or attachment). The touch panel 4071 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 410, receives a command from the processor 410, and executes the command. In addition, the touch panel 4071 can be implemented by using various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. In addition to the touch panel 4071, the user input unit 407 may include other input devices 4072. Specifically, the other input devices 4072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a track ball, a mouse, and a joystick, which are not described herein again.
Further, the touch panel 4071 can be overlaid on the display panel 4061. When the touch panel 4071 detects a touch operation on or near it, the operation is transmitted to the processor 410 to determine the type of the touch event, and the processor 410 then provides a corresponding visual output on the display panel 4061 according to that type. Although in Fig. 5 the touch panel 4071 and the display panel 4061 are shown as two separate components implementing the input and output functions of the mobile terminal, in some embodiments the touch panel 4071 and the display panel 4061 may be integrated to implement these functions; this is not limited herein.
The interface unit 408 is an interface through which an external device is connected to the mobile terminal 400. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 408 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the mobile terminal 400 or may be used to transmit data between the mobile terminal 400 and external devices.
The memory 409 may be used to store software programs as well as various data. The memory 409 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, application programs required by at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created according to the use of the mobile phone (such as audio data and a phonebook). Further, the memory 409 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The processor 410 is the control center of the mobile terminal. It connects the various parts of the entire mobile terminal through various interfaces and lines, and performs the various functions of the mobile terminal and processes data by running or executing software programs and/or modules stored in the memory 409 and invoking data stored in the memory 409, thereby performing overall monitoring of the mobile terminal. The processor 410 may include one or more processing units. Preferably, the processor 410 may integrate an application processor, which mainly handles the operating system, user interfaces, and application programs, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 410.
The mobile terminal 400 may further include a power supply 411 (e.g., a battery) for supplying power to various components, and preferably, the power supply 411 may be logically connected to the processor 410 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system.
In addition, the mobile terminal 400 includes some functional modules that are not shown, and thus, are not described in detail herein.
Preferably, an embodiment of the present invention further provides a mobile terminal, including a processor 410, a memory 409, and a computer program stored in the memory 409 and executable on the processor 410. When executed by the processor 410, the computer program implements each process of the above embodiment of the audio positioning method and can achieve the same technical effects; to avoid repetition, the details are not repeated here.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above embodiment of the audio positioning method and can achieve the same technical effects; to avoid repetition, the details are not repeated here. The computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solutions of the present invention may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk), including instructions that enable a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the methods described in the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (7)

1. An audio positioning method, applied to a mobile terminal, the method comprising:
receiving first audio data;
extracting first key information from the first audio data;
searching second audio data for sub-audio data matching the first key information; and
adding a feature mark to the sub-audio data;
wherein the searching second audio data for sub-audio data matching the first key information comprises:
identifying audio content of the second audio data;
dividing the second audio data into a plurality of pieces of sub-audio data according to the audio content; and
searching the plurality of pieces of sub-audio data for sub-audio data matching the first key information;
wherein the searching for sub-audio data matching the first key information comprises:
extracting second key information from each piece of sub-audio data;
performing word-sense matching between the first key information and the second key information, and determining a degree of matching between the content of the first key information and the content of each piece of sub-audio data; and
searching for the sub-audio data whose degree of matching satisfies a preset condition;
wherein the determining a degree of matching between the content of the first key information and the content of each piece of sub-audio data comprises:
calculating the proportion of the second key information in the first key information; and
determining, according to the proportion, the degree of matching between the content of the first key information and the content of each piece of sub-audio data.
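The matching procedure recited in claim 1 can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the claimed implementation: key information is approximated by simple token sets rather than real speech recognition, the claim's "proportion of the second key information in the first key information" is interpreted as the share of the first key information covered by the second, and the preset condition is taken to be a 0.5 threshold:

```python
def extract_key_info(text):
    """Toy keyword extraction: lowercase tokens minus stop words.
    (A real implementation would use speech recognition plus NLP.)"""
    stop = {"the", "a", "an", "of", "to", "and", "in", "is", "at", "about"}
    return {w for w in text.lower().split() if w not in stop}

def matching_degree(first_key, second_key):
    """Degree of matching, interpreted here as the share of the first key
    information that also appears in the second key information. The claim's
    wording is ambiguous about the denominator; this choice is an assumption."""
    if not first_key:
        return 0.0
    return len(first_key & second_key) / len(first_key)

def find_matching_sub_audio(first_text, sub_audio_texts, threshold=0.5):
    """Return indices of sub-audio segments whose matching degree
    satisfies the preset condition (here assumed to be >= threshold)."""
    first_key = extract_key_info(first_text)
    results = []
    for i, text in enumerate(sub_audio_texts):
        second_key = extract_key_info(text)
        if matching_degree(first_key, second_key) >= threshold:
            results.append(i)
    return results
```

With a query "play the birthday song recording" against two transcribed segments, only the segment sharing the "birthday song" keywords crosses the threshold and would receive a feature mark.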
2. The method according to claim 1, wherein the adding a feature mark to the sub-audio data comprises:
adding a corresponding feature mark to each of the plurality of pieces of found sub-audio data.
3. The method according to claim 1, wherein after the adding a feature mark to the sub-audio data, the method further comprises:
receiving a click operation instruction on the feature mark; and
playing the sub-audio data corresponding to the feature mark.
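Claims 2 and 3 describe per-segment marking and tap-to-play behavior. A minimal sketch of that bookkeeping follows; the segment boundaries in seconds, the `mark_id` field, and the `play` callback standing in for the terminal's audio player are all hypothetical:

```python
from dataclasses import dataclass

@dataclass
class FeatureMark:
    """A feature mark attached to one matched sub-audio segment
    (start/end are hypothetical positions in seconds)."""
    mark_id: int
    start: float
    end: float

def add_feature_marks(segments):
    """Give each found (start, end) segment its own feature mark (claim 2)."""
    return [FeatureMark(i, s, e) for i, (s, e) in enumerate(segments)]

def on_mark_clicked(marks, mark_id, play):
    """On a click operation instruction for a mark, play its sub-audio
    segment (claim 3). Returns False if no such mark exists."""
    for m in marks:
        if m.mark_id == mark_id:
            play(m.start, m.end)
            return True
    return False
```

Tapping the second mark of `add_feature_marks([(3.0, 7.5), (12.0, 15.0)])` would hand the (12.0, 15.0) range to the player callback.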
4. An audio positioning apparatus, disposed in a mobile terminal, the apparatus comprising:
an audio data receiving module, configured to receive first audio data;
a key information extraction module, configured to extract first key information from the first audio data;
a sub-audio data searching module, configured to search second audio data for sub-audio data matching the first key information; and
a feature mark adding module, configured to add a feature mark to the sub-audio data;
wherein the sub-audio data searching module comprises:
an audio content identification sub-module, configured to identify audio content of the second audio data;
an audio data dividing sub-module, configured to divide the second audio data into a plurality of pieces of sub-audio data according to the audio content; and
a sub-audio data searching sub-module, configured to search the plurality of pieces of sub-audio data for sub-audio data matching the first key information;
wherein the sub-audio data searching sub-module comprises:
a key information extraction unit, configured to extract second key information from each piece of sub-audio data;
a matching degree determination unit, configured to perform word-sense matching between the first key information and the second key information, and to determine a degree of matching between the content of the first key information and the content of each piece of sub-audio data; and
a sub-audio data searching unit, configured to search for the sub-audio data whose degree of matching satisfies a preset condition;
wherein the matching degree determination unit is further configured to:
calculate the proportion of the second key information in the first key information; and
determine, according to the proportion, the degree of matching between the content of the first key information and the content of each piece of sub-audio data.
5. The apparatus according to claim 4, wherein the feature mark adding module is specifically configured to add a corresponding feature mark to each of the plurality of pieces of found sub-audio data.
6. The apparatus according to claim 4, further comprising:
an operation instruction receiving module, configured to receive a click operation instruction on the feature mark; and
a sub-audio data playing module, configured to play the sub-audio data corresponding to the feature mark.
7. A mobile terminal, comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the audio positioning method according to any one of claims 1 to 3.
CN201711132035.0A 2017-11-15 2017-11-15 Audio positioning method and device and mobile terminal Active CN108093124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711132035.0A CN108093124B (en) 2017-11-15 2017-11-15 Audio positioning method and device and mobile terminal


Publications (2)

Publication Number Publication Date
CN108093124A CN108093124A (en) 2018-05-29
CN108093124B true CN108093124B (en) 2021-01-08

Family

ID=62172683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711132035.0A Active CN108093124B (en) 2017-11-15 2017-11-15 Audio positioning method and device and mobile terminal

Country Status (1)

Country Link
CN (1) CN108093124B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408717B (en) * 2018-10-23 2022-03-29 广东小天才科技有限公司 Content recommendation method and system

Citations (4)

Publication number Priority date Publication date Assignee Title
CN103414948A (en) * 2013-08-01 2013-11-27 王强 Method and device for playing video
CN104967907A (en) * 2014-06-09 2015-10-07 腾讯科技(深圳)有限公司 Video playing positioning method and system
CN106776890A (en) * 2016-11-29 2017-05-31 北京小米移动软件有限公司 The method of adjustment and device of video playback progress
CN107333185A (en) * 2017-07-27 2017-11-07 上海与德科技有限公司 A kind of player method and device

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
US6973256B1 (en) * 2000-10-30 2005-12-06 Koninklijke Philips Electronics N.V. System and method for detecting highlights in a video program using audio properties
CN102262890A (en) * 2010-05-31 2011-11-30 鸿富锦精密工业(深圳)有限公司 Electronic device and marking method thereof
JP5708445B2 (en) * 2011-10-31 2015-04-30 富士通株式会社 Registration method, registration program, and registration apparatus
CN103400592A (en) * 2013-07-30 2013-11-20 北京小米科技有限责任公司 Recording method, playing method, device, terminal and system
CN103647761B (en) * 2013-11-28 2017-04-12 小米科技有限责任公司 Method and device for marking audio record, and terminal, server and system
CN106131324A (en) * 2016-06-28 2016-11-16 广东欧珀移动通信有限公司 Audio data processing method, device and terminal
CN106571137A (en) * 2016-10-28 2017-04-19 努比亚技术有限公司 Terminal voice dotting control device and method




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant