CN112102854A - Recording filtering method and device and computer readable storage medium - Google Patents

Recording filtering method and device and computer readable storage medium

Info

Publication number
CN112102854A
CN112102854A (application CN202010999917.2A)
Authority
CN
China
Prior art keywords
recording
preset
voice
filtering
speaker
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010999917.2A
Other languages
Chinese (zh)
Inventor
严馨华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Hongxingfu Food Co ltd
Original Assignee
Fujian Hongxingfu Food Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Hongxingfu Food Co ltd filed Critical Fujian Hongxingfu Food Co ltd
Priority to CN202010999917.2A priority Critical patent/CN112102854A/en
Publication of CN112102854A publication Critical patent/CN112102854A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10Digital recording or reproducing
    • G11B20/10009Improvement or modification of read or write signals
    • G11B20/10046Improvement or modification of read or write signals filtering or equalising, e.g. setting the tap weights of an FIR filter
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10Digital recording or reproducing
    • G11B20/10527Audio or video recording; Data buffering arrangements
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10Digital recording or reproducing
    • G11B20/10527Audio or video recording; Data buffering arrangements
    • G11B2020/10537Audio or video recording
    • G11B2020/10546Audio or video recording specifically adapted for audio data

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Quality & Reliability (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The recording filtering method disclosed by the invention performs voice recognition analysis on a first recording and filters the first recording according to a preset rule to obtain a second recording. The preset rule comprises: retaining or filtering recordings of a preset voice type, the preset voice type comprising human voice, music and noise; or retaining or filtering recordings that meet a preset condition, the preset condition comprising at least one of a preset age range, a preset gender and a preset voiceprint characteristic parameter. The recording filtering method can therefore filter a recording according to the preset rule, remove invalid recordings and retain only valid recordings, which reduces the time spent manually playing back and identifying the recording and improves the efficiency of playback recognition.

Description

Recording filtering method and device and computer readable storage medium
Technical Field
The invention relates to the technical field of recording processing, in particular to a recording filtering method and device and a computer readable storage medium.
Background
With the continuous popularization of electronic products and the development of electronic technology, people usually record audio in scenes that require real-time documentation (such as meetings or monitoring scenarios), then manually play back the recording file, identify and screen the valid segments, and manually transcribe them into text.
Because a recording file is usually long and may contain many invalid segments, manually playing back and identifying the recording consumes considerable time, and the efficiency is low.
Disclosure of Invention
In view of the above, the present invention provides a recording filtering method, an apparatus and a computer-readable storage medium to solve the above technical problems.
Firstly, in order to achieve the above object, the present invention provides a recording filtering method, including:
performing voice recognition analysis on the first sound recording;
filtering the first recording according to a preset rule to obtain a second recording;
wherein the preset rule comprises:
preserving or filtering recordings of a preset voice type, the preset voice type comprising: human voice, music, noise;
or, preserving or filtering recordings that meet a preset condition, wherein the preset condition comprises at least one of a preset age range, a preset gender and a preset voiceprint characteristic parameter.
Optionally, the performing voice recognition analysis on the first audio recording includes:
performing voice classification on the first sound recording to obtain a voice type, wherein the voice type comprises: human voice, noise, music;
if the voice type is human voice, performing voiceprint recognition on the first recording to obtain the voiceprint characteristic parameters of the speaker, and/or performing gender judgment on the first recording to obtain the gender of the speaker, and/or performing age range judgment on the first recording to obtain the age range of the speaker.
Optionally, the preset rule includes retaining or filtering recordings of a preset voice type, and filtering the first recording according to the preset rule includes:
retaining recordings of a first preset voice type;
and/or filtering recordings of a second preset voice type.
Optionally, the first preset voice type includes a human voice, and/or the second preset voice type includes music and/or noise.
Optionally, the preset condition comprises the preset age range;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the age range of the speaker in the first recording falls into the preset age range included in the preset condition;
if the age range of the speaker in the first recording does not fall into the preset age range included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, the preset condition comprises the preset gender;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the gender of the speaker in the first recording is the same as the preset gender included in the preset condition;
if the gender of the speaker in the first recording is the same as the preset gender included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, the preset condition comprises the preset voiceprint characteristic parameter;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition;
if the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, in the process of performing voice classification on the first recording to obtain the voice type, when noise or music also contains human voice, the voice type is determined to be human voice.
Further, to achieve the above object, the present invention also provides a recording filter device, which includes a memory, at least one processor, and at least one program stored on the memory and executable on the at least one processor, wherein the at least one program, when executed by the at least one processor, implements the steps of the method.
Further, to achieve the above object, the present invention provides a computer-readable storage medium storing at least one program executable by a computer, the at least one program causing the computer to perform the steps of the method of any one of the above when the at least one program is executed by the computer.
Compared with the prior art, the recording filtering method provided by the invention performs voice recognition analysis on a first recording and filters the first recording according to a preset rule to obtain a second recording. The preset rule comprises: retaining or filtering recordings of a preset voice type, the preset voice type comprising human voice, music and noise; or retaining or filtering recordings that meet a preset condition, the preset condition comprising at least one of a preset age range, a preset gender and a preset voiceprint characteristic parameter. The recording filtering method provided by the invention can therefore filter a recording according to the preset rule, remove invalid recordings and retain only valid recordings, which reduces the time spent manually playing back and identifying the recording and improves the efficiency of playback recognition.
Drawings
Fig. 1 is a schematic structural diagram of a recording filter device according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a vehicle-mounted locator according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a server according to an embodiment of the present invention;
fig. 4 is a schematic flow chart of a recording filtering method according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the following description, suffixes such as "module", "component" or "unit" used to denote elements are used only to facilitate the explanation of the present invention and have no specific meaning in themselves. Thus, "module", "component" and "unit" may be used interchangeably.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a recording filter apparatus according to an embodiment of the present invention, as shown in fig. 1, the recording filter apparatus 100 includes a processor 101 and a memory 102, where the memory 102 is used to store related data, such as a program, of the recording filter apparatus 100, and the processor 101 is used to execute the program stored in the memory 102 and implement a corresponding function. In the embodiment of the present invention, the recording filter device 100 may be a vehicle-mounted locator or a server.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a vehicle-mounted locator according to an embodiment of the present invention. As shown in fig. 2, a vehicle-mounted locator 200 includes a processor 201 and a memory 202, where the memory 202 is used to store relevant data of the vehicle-mounted locator 200, for example the data and programs collected or used by the vehicle-mounted locator 200, and the processor 201 is used to execute the programs stored in the memory 202 and implement the corresponding functions.
The vehicle-mounted locator 200 further includes one or more of a positioning module 203, a recording module 204, a wireless communication module 205, a vibration sensor 206, a low-power detection module 207 and a battery module 208. The positioning module 203 is configured to position the vehicle-mounted locator 200 to obtain its position information; it may be a positioning chip such as a GPS or BeiDou chip that obtains the longitude and latitude of the vehicle, or it may be a WIFI positioning module, a Bluetooth positioning module or a base-station positioning module that positions the vehicle by obtaining, respectively, the address information of nearby WIFI devices, the address information of Bluetooth devices or the identification information of base stations.
The recording module 204 is configured to record sound around the vehicle-mounted locator 200. The wireless communication module 205 is configured to implement a wireless communication connection between the vehicle-mounted locator 200 and external devices, and may include one or more of a Bluetooth communication module, an infrared communication module, a WIFI communication module and a mobile cellular network communication module (e.g., a 2G, 3G, 4G or 5G communication module). It is understood that in some embodiments the vehicle-mounted locator 200 may include a wired communication module for implementing a wired communication connection between the vehicle-mounted locator 200 and a vehicle-mounted terminal, and further a communication connection with external devices through the vehicle-mounted terminal. The vibration sensor 206 is configured to detect vibration data of the vehicle, and the processor 201 may determine the driving state of the vehicle (e.g., a moving state or a stationary state) from the vibration data detected by the vibration sensor 206, as sketched below. The low-power detection module 207 is configured to detect the battery power of the vehicle-mounted locator 200 and report the battery power information to the processor 201, and the battery module 208 is configured to supply power to the vehicle-mounted locator 200.
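By way of a non-limiting illustration, the following minimal Python sketch shows one way the processor 201 might infer the driving state from the vibration data; the threshold value and the idea of a fixed recent window are assumptions made for illustration and are not specified by this disclosure.

```python
# Illustrative sketch only: one way the processor 201 might infer the vehicle's
# driving state from vibration-sensor samples. The threshold and window are
# assumptions, not part of this disclosure.
from statistics import mean

def infer_driving_state(vibration_samples, threshold=0.15):
    """Return 'moving' if the mean absolute vibration over the window exceeds
    the threshold, otherwise 'stationary'."""
    if not vibration_samples:
        return "stationary"
    return "moving" if mean(abs(s) for s in vibration_samples) > threshold else "stationary"
```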
Referring to fig. 3, fig. 3 is a schematic structural diagram of a server according to an embodiment of the present invention, as shown in fig. 3, a server 300 includes a processor 301 and a memory 302, where the memory 302 is used for storing relevant data, such as a program, of the server 300, and the processor 301 is used for executing the program stored in the memory 302 and implementing a corresponding function.
It should be noted that, when the recording filter apparatus 100 is the vehicle-mounted locator 200 shown in fig. 2, the vehicle-mounted locator 200 may implement a communication connection with a client through the server 300, or may directly establish a communication connection with the client without the server 300. When the recording filter device 100 is the server 300 shown in fig. 3, the server 300 acquires data collected by the vehicle-mounted locator 200, such as position information and sound information, by establishing a communication connection with the vehicle-mounted locator 200.
Based on the schematic structural diagram of the recording filter device 100, various embodiments of the method of the present invention are provided.
Referring to fig. 4, fig. 4 is a flowchart illustrating steps of a recording filtering method according to an embodiment of the present invention, where the method is applied to the recording filtering apparatus 100, and as shown in fig. 4, the method includes:
step 401, performing voice recognition analysis on the first sound recording.
In this step, the method performs voice recognition analysis on a first recording, which is a recording captured by a recording device, for example a recording made by a voice recorder during a meeting, or sound recorded by a vehicle-mounted locator installed in a vehicle. If the recording is too long, it can be split into multiple segments and the voice analysis is then performed segment by segment, as sketched below.
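As a non-limiting illustration, the following Python sketch (standard library only) splits a long WAV recording into fixed-length segments so that the analysis can be performed segment by segment; the 60-second segment length is an assumption chosen for illustration.

```python
# Standard-library sketch of splitting a long WAV recording into fixed-length
# segments for segment-by-segment analysis. The 60-second length is an
# illustrative assumption.
import wave

def split_wav(path, segment_seconds=60):
    """Yield (index, raw_frames, params) for consecutive segments of a WAV file."""
    with wave.open(path, "rb") as wf:
        params = wf.getparams()
        frames_per_segment = params.framerate * segment_seconds
        index = 0
        while True:
            frames = wf.readframes(frames_per_segment)
            if not frames:
                break
            yield index, frames, params
            index += 1
```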
Performing voice recognition analysis on the first recording may specifically include performing voice classification on the first recording to obtain a voice type, where the voice type includes human voice, noise and music; if the voice type is human voice, performing voiceprint recognition on the first recording to obtain the voiceprint characteristic parameters of the speaker, and/or performing gender judgment on the first recording to obtain the gender of the speaker, and/or performing age range judgment on the first recording to obtain the age range of the speaker.
It should be noted that a voice recognition device may be disposed inside the recording filtering device and the first recording analysed by that voice recognition device, or, instead of disposing a voice recognition device, the voice analysis of the first recording may be realized by calling an external voice recognition server.
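The per-segment analysis of step 401 could, for example, be organized as in the following Python sketch; the classifier and the voiceprint, gender and age estimators are passed in as placeholder callables, since this disclosure does not prescribe any particular model and they could equally be calls to an external voice recognition server.

```python
# Sketch of the per-segment voice recognition analysis of step 401. The
# classify / extract_voiceprint / estimate_gender / estimate_age callables are
# placeholders; this disclosure does not prescribe a particular model.
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class SegmentAnalysis:
    voice_type: str                           # "human_voice", "music" or "noise"
    voiceprint: Optional[Sequence[float]] = None
    gender: Optional[str] = None
    age_range: Optional[str] = None

def analyse_segment(samples, classify, extract_voiceprint, estimate_gender, estimate_age):
    voice_type = classify(samples)            # human voice takes precedence over noise/music
    result = SegmentAnalysis(voice_type=voice_type)
    if voice_type == "human_voice":           # speaker-level analysis only for human voice
        result.voiceprint = extract_voiceprint(samples)
        result.gender = estimate_gender(samples)
        result.age_range = estimate_age(samples)
    return result
```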
Step 402, filtering the first sound recording according to a preset rule to obtain a second sound recording; wherein the preset rule comprises: retaining or filtering recordings of a preset voice type, the preset voice type comprising human voice, music and noise; or retaining or filtering recordings that meet a preset condition, wherein the preset condition comprises at least one of a preset age range, a preset gender and a preset voiceprint characteristic parameter.
In this step, the method filters the first recording according to the preset rule to obtain the second recording. The preset rule may include filtering according to voice type, for example retaining or filtering recordings of a preset voice type, where the preset voice type includes human voice, music and noise; the preset rule may also include filtering according to speaker characteristics, for example retaining or filtering recordings that meet a preset condition, where the preset condition includes at least one of a preset age range, a preset gender and a preset voiceprint characteristic parameter.
For example, when the user only needs to recognize the voice, the preset rule may be set to keep the recording of the preset voice type, where the preset voice type is the voice. When the user only needs to recognize the voice of the female speaker, the preset rule may be set to retain the recording with gender of female, or filter the recording with gender of male. When the user only needs to recognize the voice of a specified speaker (e.g., a car owner, a driver, or a fixed passenger), the preset rule may be set to retain the recording of the preset voiceprint characteristic parameter, where the preset voiceprint characteristic parameter is the voiceprint characteristic parameter corresponding to the specified speaker. Conversely, when the user needs to recognize the voices of other speakers except the designated speaker, the preset rule may be set to filter the recording of the preset voiceprint characteristic parameter, where the preset voiceprint characteristic parameter is the voiceprint characteristic parameter corresponding to the designated speaker.
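A minimal Python sketch of applying such preset rules to the analysed segments is given below; the dictionary form of the rule and the attribute names on the analysis results follow the analysis sketch above and are assumptions for illustration, not a prescribed data format.

```python
# Sketch of step 402: applying a preset rule to the analysed segments to obtain
# the second recording. The rule dictionary and attribute names are assumptions.
def apply_preset_rule(segments, rule, voiceprint_match=None):
    """segments: iterable of (samples, analysis); rule examples:
    {"keep_voice_types": {"human_voice"}}, {"keep_gender": "female"},
    {"keep_voiceprint": reference_voiceprint}."""
    kept = []
    for samples, analysis in segments:
        if "keep_voice_types" in rule and analysis.voice_type not in rule["keep_voice_types"]:
            continue
        if "keep_gender" in rule and analysis.gender != rule["keep_gender"]:
            continue
        if "keep_age_range" in rule and analysis.age_range != rule["keep_age_range"]:
            continue
        if "keep_voiceprint" in rule and not (
            voiceprint_match and analysis.voiceprint is not None
            and voiceprint_match(analysis.voiceprint, rule["keep_voiceprint"])
        ):
            continue
        kept.append(samples)
    return kept  # concatenating the kept segments yields the second recording
```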
In some embodiments of the present invention, in the process of filtering the sound record and/or after the sound record is filtered, the method may further receive a modification operation for the preset rule, and update the preset rule according to the modification operation.
In this embodiment, the recording filtering method performs voice recognition analysis on the first recording and filters the first recording according to a preset rule to obtain a second recording, where the preset rule comprises: retaining or filtering recordings of a preset voice type, the preset voice type comprising human voice, music and noise; or retaining or filtering recordings that meet a preset condition, the preset condition comprising at least one of a preset age range, a preset gender and a preset voiceprint characteristic parameter. The recording filtering method provided by the invention can therefore filter the recording according to the preset rule, remove invalid recordings and retain only valid recordings, which reduces the time spent manually playing back and identifying the recording and improves the efficiency of playback recognition.
The following describes the method and process of the present invention in detail by taking the recording filter device as a server and taking the first recording as a recording recorded by a vehicle-mounted locator as an example.
When an administrator needs to perform playback recognition on the recording made in a vehicle, the administrator can start an application program on the client and send a recording filtering request to the server through the application program. The recording filtering request carries filtering parameters, which at least comprise the preset rule and may also comprise other information, such as at least one of a user account, a vehicle-mounted locator identifier, the passenger limit of the vehicle and user information (such as name, gender, age and contact information). The server receives the recording filtering request sent by the client, acquires and stores the filtering parameters carried in the request for the subsequent voice recognition analysis of the recording, and returns a recording-filtering-started response message to the client, indicating that the recording filtering request sent by the client has been received and the recording filtering function has been started. The server then sends the recording filtering request to the vehicle-mounted locator corresponding to the vehicle-mounted locator identifier, requesting the first recording collected by the vehicle-mounted locator, and carries out the subsequent recording filtering steps on the acquired sound information. It can be understood that, before sending the recording filtering request to the vehicle-mounted locator, the server may first judge whether the vehicle-mounted locator is online; if it is online, the server sends the recording filtering request to it directly, and if it is not online, the server waits until the vehicle-mounted locator comes online before sending the request. After receiving the recording filtering request sent by the server, the vehicle-mounted locator stores the filtering parameters in the request, returns a recording filtering response message to the server, and reports the collected first recording to the server.
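The server-side handling described above might look roughly like the following Python sketch; the request fields and the store and locator_gateway helpers are hypothetical names introduced only for illustration, as this disclosure defines no concrete protocol.

```python
# Rough sketch of the server-side handling described above. The request fields
# and the store / locator_gateway helpers are hypothetical.
def handle_recording_filter_request(request, store, locator_gateway):
    params = request["filter_params"]                 # must at least contain the preset rule
    store.save_filter_params(request["user_account"], params)
    locator_id = request["locator_id"]
    if locator_gateway.is_online(locator_id):
        locator_gateway.forward_request(locator_id, params)   # ask for the first recording
    else:
        locator_gateway.queue_until_online(locator_id, params)
    return {"status": "recording_filtering_started"}  # response returned to the client
```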
The method and process provided by the invention are described in detail below by taking the recording filter device as a vehicle-mounted locator and taking the first recording as the recording recorded by the vehicle-mounted locator as an example.
When an administrator needs to perform playback recognition on the recording made in a vehicle, the administrator can start an application program on the client and send a recording filtering request to the vehicle-mounted locator through the application program. The recording filtering request carries filtering parameters, which at least comprise the preset rule and may also comprise other information, such as at least one of a user account, a vehicle-mounted locator identifier, the passenger limit of the vehicle, preset voiceprint characteristic parameters and user information (such as name, gender, age and contact information). The client can establish a communication connection with the vehicle-mounted locator directly and send the recording filtering request to it, or send the recording filtering request to the vehicle-mounted locator through a server. After receiving the recording filtering request sent by the client, the vehicle-mounted locator acquires and stores the filtering parameters carried in the request for the subsequent voice analysis of the first recording, and returns a recording filtering response message to the client, indicating that the recording filtering request sent by the client has been received successfully and the recording filtering function has been started. The vehicle-mounted locator then obtains the collected first recording and carries out the subsequent recording filtering steps on it.
Optionally, the performing voice recognition analysis on the first audio recording includes:
performing voice classification on the first sound recording to obtain a voice type, wherein the voice type comprises: human voice, noise, music;
if the voice type is human voice, performing voiceprint recognition on the first recording to obtain the voiceprint characteristic parameters of the speaker, and/or performing gender judgment on the first recording to obtain the gender of the speaker, and/or performing age range judgment on the first recording to obtain the age range of the speaker.
Optionally, the preset rule includes retaining or filtering recordings of a preset voice type, and filtering the first recording according to the preset rule includes:
retaining recordings of a first preset voice type;
and/or filtering recordings of a second preset voice type.
Optionally, the first preset voice type includes a human voice, and/or the second preset voice type includes music and/or noise.
Optionally, the preset condition comprises the preset age range;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the age range of the speaker in the first recording falls into the preset age range included in the preset condition;
if the age range of the speaker in the first recording does not fall into the preset age range included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, the preset condition comprises the preset gender;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the gender of the speaker in the first recording is the same as the preset gender included in the preset condition;
if the gender of the speaker in the first recording is the same as the preset gender included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, the preset condition comprises the preset voiceprint characteristic parameter;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition;
if the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition, retaining or filtering the recording of that speaker.
For example, when the user only needs to play back the recording for identifying the designated speaker (e.g., the driver), the voiceprint feature parameter of the designated speaker may be preset as a preset voiceprint feature parameter, and if the voiceprint feature parameter of the speaker in the first recording matches with the preset voiceprint feature parameter included in the preset condition, the recording of the speaker is retained.
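Matching the speaker's voiceprint characteristic parameters against the preset voiceprint characteristic parameter could, assuming the parameters are fixed-length embedding vectors, be done as in the following minimal sketch; cosine similarity with a fixed threshold is an assumed criterion, not one mandated by this disclosure.

```python
# Minimal sketch of voiceprint matching, assuming the voiceprint characteristic
# parameters are fixed-length embedding vectors. Cosine similarity with a fixed
# threshold is an assumption.
import math

def voiceprints_match(a, b, threshold=0.75):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return norm > 0 and dot / norm >= threshold
```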
In some embodiments of the present invention, the method may further identify the voices of different speakers in the first recording and store each speaker's recordings together, that is, store the recordings having the same voiceprint characteristic parameters together. For example, assuming that the first recording includes the voices of three persons A, B and C, the method stores the content spoken by A in the first recording separately, stores the content spoken by B in the first recording separately, and stores the content spoken by C in the first recording separately.
Furthermore, each group of recordings stored together can be identified: for example, a passenger identification code can be assigned to each distinct voiceprint characteristic parameter, and the speech recordings of different passengers are identified with different passenger identification codes. When multiple speakers speak simultaneously, the section of the recording containing several speakers is identified with multiple passenger identification codes, indicating that the section includes the speech of several speakers. Alternatively, the gender and/or age range of each speaker's voice can be judged to determine the gender and/or age range of each speaker, and each speaker's speech recording is then identified according to the speaker's gender and/or age range, as sketched below.
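A minimal sketch of grouping segments by speaker and assigning passenger identification codes is given below; it reuses the hypothetical voiceprints_match() helper from the previous sketch, and the "P1", "P2", ... code format is an assumption for illustration.

```python
# Sketch of storing each speaker's segments together and assigning passenger
# identification codes. Reuses the hypothetical voiceprints_match() helper;
# the "P1", "P2", ... code format is an assumption.
def group_by_speaker(segments_with_voiceprints, match=None):
    """segments_with_voiceprints: iterable of (segment, voiceprint) pairs."""
    match = match or voiceprints_match
    speakers = []        # list of (passenger_code, reference_voiceprint, segments)
    for segment, vp in segments_with_voiceprints:
        for code, ref_vp, segs in speakers:
            if match(vp, ref_vp):
                segs.append(segment)
                break
        else:
            speakers.append(("P%d" % (len(speakers) + 1), vp, [segment]))
    return speakers
```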
In some embodiments of the present invention, the method stores the second sound recording obtained after filtering the first sound recording, identifies the second sound recording as a normal sound recording, and also stores the third sound recording filtered out, and identifies the third sound recording as a filtered sound recording. Therefore, when the user needs to perform playback recognition on the first recording, which recording file is the filtered normal recording can be determined according to the recording identifier, and the user can conveniently and accurately select the normal recording file to perform playback recognition. In some embodiments, the recording filtering device further performs speech-to-text processing on the second recording to obtain text content corresponding to the second recording.
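Storing and labelling the filtering result, together with the optional speech-to-text step, might be organized as in the following sketch; the file names, label strings and the transcribe() placeholder are illustrative assumptions rather than a prescribed format.

```python
# Sketch of labelling and storing the filtering result: the retained audio as
# the "normal" second recording, the removed audio as the "filtered" third
# recording, plus an optional transcript. Names and labels are assumptions.
import json, os

def store_results(out_dir, second_recording, third_recording, transcribe=None):
    os.makedirs(out_dir, exist_ok=True)
    labels = {"second_recording.wav": "normal recording",
              "third_recording.wav": "filtered recording"}
    with open(os.path.join(out_dir, "second_recording.wav"), "wb") as f:
        f.write(second_recording)
    with open(os.path.join(out_dir, "third_recording.wav"), "wb") as f:
        f.write(third_recording)
    if transcribe is not None:                         # optional speech-to-text step
        with open(os.path.join(out_dir, "transcript.txt"), "w", encoding="utf-8") as f:
            f.write(transcribe(second_recording))
        labels["transcript.txt"] = "text content of the second recording"
    with open(os.path.join(out_dir, "labels.json"), "w", encoding="utf-8") as f:
        json.dump(labels, f, ensure_ascii=False, indent=2)
```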
Optionally, in the process of performing voice classification on the first recording to obtain the voice type, when noise or music also contains human voice, the voice type is determined to be human voice.
Those skilled in the art will appreciate that all or part of the steps of the method of the above embodiments may be implemented by hardware associated with at least one program instruction, where the at least one program may be stored in the memory 102 of the recording filter apparatus 100 shown in fig. 1 and can be executed by the processor 101 of the recording filter apparatus 100, and when executed by the processor, the at least one program implements the following steps:
performing voice recognition analysis on the first sound recording;
filtering the first recording according to a preset rule to obtain a second recording;
wherein the preset rule comprises:
preserving or filtering recordings of a preset voice type, the preset voice type comprising: human voice, music, noise;
or, preserving or filtering recordings that meet a preset condition, wherein the preset condition comprises at least one of a preset age range, a preset gender and a preset voiceprint characteristic parameter.
Optionally, the performing voice recognition analysis on the first audio recording includes:
performing voice classification on the first sound recording to obtain a voice type, wherein the voice type comprises: human voice, noise, music;
if the voice type is human voice, performing voiceprint recognition on the first recording to obtain the voiceprint characteristic parameters of the speaker, and/or performing gender judgment on the first recording to obtain the gender of the speaker, and/or performing age range judgment on the first recording to obtain the age range of the speaker.
Optionally, the preset rule includes retaining or filtering recordings of a preset voice type, and filtering the first recording according to the preset rule includes:
retaining recordings of a first preset voice type;
and/or filtering recordings of a second preset voice type.
Optionally, the first preset voice type includes a human voice, and/or the second preset voice type includes music and/or noise.
Optionally, the preset condition comprises the preset age range;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the age range of the speaker in the first recording falls into the preset age range included in the preset condition;
if the age range of the speaker in the first recording does not fall into the preset age range included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, the preset condition comprises the preset gender;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the gender of the speaker in the first recording is the same as the preset gender included in the preset condition;
if the gender of the speaker in the first recording is the same as the preset gender included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, the preset condition comprises the preset voiceprint characteristic parameter;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition;
if the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, in the process of performing voice classification on the first recording to obtain the voice type, when noise or music also contains human voice, the voice type is determined to be human voice.
It will be understood by those skilled in the art that all or part of the steps of the method for implementing the above embodiments may be implemented by hardware associated with at least one program instruction, the at least one program may be stored in a computer readable storage medium, and when executed, the at least one program implements the steps of:
performing voice recognition analysis on the first sound recording;
filtering the first recording according to a preset rule to obtain a second recording;
wherein the preset rule comprises:
preserving or filtering recordings of a preset voice type, the preset voice type comprising: human voice, music, noise;
or, preserving or filtering recordings that meet a preset condition, wherein the preset condition comprises at least one of a preset age range, a preset gender and a preset voiceprint characteristic parameter.
Optionally, the performing voice recognition analysis on the first audio recording includes:
performing voice classification on the first sound recording to obtain a voice type, wherein the voice type comprises: human voice, noise, music;
if the voice type is human voice, performing voiceprint recognition on the first recording to obtain the voiceprint characteristic parameters of the speaker, and/or performing gender judgment on the first recording to obtain the gender of the speaker, and/or performing age range judgment on the first recording to obtain the age range of the speaker.
Optionally, the preset rule includes retaining or filtering recordings of a preset voice type, and filtering the first recording according to the preset rule includes:
retaining recordings of a first preset voice type;
and/or filtering recordings of a second preset voice type.
Optionally, the first preset voice type includes a human voice, and/or the second preset voice type includes music and/or noise.
Optionally, the preset condition comprises the preset age range;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the age range of the speaker in the first recording falls into the preset age range included in the preset condition;
if the age range of the speaker in the first recording does not fall into the preset age range included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, the preset condition comprises the preset gender;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the gender of the speaker in the first recording is the same as the preset gender included in the preset condition;
if the gender of the speaker in the first recording is the same as the preset gender included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, the preset condition comprises the preset voiceprint characteristic parameter;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition;
if the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition, retaining or filtering the recording of that speaker.
Optionally, in the process of performing voice classification on the first recording to obtain the voice type, when noise or music also contains human voice, the voice type is determined to be human voice.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method of filtering audio recordings, the method comprising:
performing voice recognition analysis on the first sound recording;
filtering the first recording according to a preset rule to obtain a second recording;
wherein the preset rule comprises:
preserving or filtering recordings of a preset voice type, the preset voice type comprising: human voice, music, noise;
or, preserving or filtering recordings that meet a preset condition, wherein the preset condition comprises at least one of a preset age range, a preset gender and a preset voiceprint characteristic parameter.
2. The recording filtering method according to claim 1, wherein the performing voice recognition analysis on the first recording comprises:
performing voice classification on the first recording to obtain a voice type, wherein the voice type comprises: human voice, noise and music;
if the voice type is human voice, performing voiceprint recognition on the first recording to obtain the voiceprint characteristic parameters of the speaker, and/or performing gender judgment on the first recording to obtain the gender of the speaker, and/or performing age range judgment on the first recording to obtain the age range of the speaker.
3. The recording filtering method according to claim 1, wherein the preset rule comprises retaining or filtering recordings of a preset voice type, and filtering the first recording according to the preset rule comprises:
retaining recordings of a first predetermined voice type;
and/or filtering recordings of a second predetermined voice type.
4. The recording filtering method according to claim 3, wherein the first predetermined voice type includes human voice, and/or the second predetermined voice type includes music and/or noise.
5. The recording filtering method according to claim 2, wherein the preset condition comprises the preset age range;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the age range of the speaker in the first recording falls into the preset age range included in the preset condition;
if the age range of the speaker in the first recording does not fall into the preset age range included in the preset condition, retaining or filtering the recording of that speaker.
6. The recording filtering method according to claim 2, wherein the preset condition comprises the preset gender;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the gender of the speaker in the first recording is the same as the preset gender included in the preset condition;
if the gender of the speaker in the first recording is the same as the preset gender included in the preset condition, retaining or filtering the recording of that speaker.
7. The recording filtering method according to claim 2, wherein the preset condition comprises the preset voiceprint characteristic parameter;
the retaining or filtering of recordings that meet the preset condition comprises:
judging whether the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition;
if the voiceprint characteristic parameters of the speaker in the first recording match the preset voiceprint characteristic parameter included in the preset condition, retaining or filtering the recording of that speaker.
8. The recording filtering method according to claim 2, wherein in the process of performing voice classification on the first recording to obtain the voice type, when noise or music also contains human voice, the voice type is determined to be human voice.
9. A sound recording filtering apparatus, characterized in that the sound recording filtering apparatus comprises a memory, at least one processor and at least one program stored on the memory and executable on the at least one processor, the at least one program implementing the steps of the method of any one of the preceding claims 1 to 8 when executed by the at least one processor.
10. A computer-readable storage medium storing at least one program executable by a computer, the at least one program, when executed by the computer, causing the computer to perform the steps of the method of any one of claims 1 to 8.
CN202010999917.2A 2020-09-22 2020-09-22 Recording filtering method and device and computer readable storage medium Pending CN112102854A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010999917.2A CN112102854A (en) 2020-09-22 2020-09-22 Recording filtering method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010999917.2A CN112102854A (en) 2020-09-22 2020-09-22 Recording filtering method and device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN112102854A true CN112102854A (en) 2020-12-18

Family

ID=73755742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010999917.2A Pending CN112102854A (en) 2020-09-22 2020-09-22 Recording filtering method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112102854A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103714817A (en) * 2013-12-31 2014-04-09 厦门天聪智能软件有限公司 Satisfaction survey cheating screening method based on voiceprint recognition technology
CN108831440A (en) * 2018-04-24 2018-11-16 中国地质大学(武汉) A kind of vocal print noise-reduction method and system based on machine learning and deep learning
CN108694954A (en) * 2018-06-13 2018-10-23 广州势必可赢网络科技有限公司 A kind of Sex, Age recognition methods, device, equipment and readable storage medium storing program for executing
CN109448756A (en) * 2018-11-14 2019-03-08 北京大生在线科技有限公司 A kind of voice age recognition methods and system
CN110473566A (en) * 2019-07-25 2019-11-19 深圳壹账通智能科技有限公司 Audio separation method, device, electronic equipment and computer readable storage medium
CN111246285A (en) * 2020-03-24 2020-06-05 北京奇艺世纪科技有限公司 Method for separating sound in comment video and method and device for adjusting volume
CN111640422A (en) * 2020-05-13 2020-09-08 广州国音智能科技有限公司 Voice and human voice separation method and device, terminal and storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113014844A (en) * 2021-02-08 2021-06-22 Oppo广东移动通信有限公司 Audio processing method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN106209138B (en) Vehicle cautious emergency response system and method
US9646427B2 (en) System for detecting the operational status of a vehicle using a handheld communication device
CN107613144B (en) Automatic calling method, device, storage medium and mobile terminal
US9420431B2 (en) Vehicle telematics communication for providing hands-free wireless communication
CN112086098B (en) Driver and passenger analysis method and device and computer readable storage medium
CN106816149A (en) The priorization content loading of vehicle automatic speech recognition system
WO2014137384A1 (en) Emergency handling system using informative alarm sound
CN112785837A (en) Method and device for recognizing emotion of user when driving vehicle, storage medium and terminal
CN105895132B (en) vehicle-mounted voice recording method, device and system
CN108597524B (en) Automobile voice recognition prompting device and method
CN111028834B (en) Voice message reminding method and device, server and voice message reminding equipment
CN112071309A (en) Network appointment car safety monitoring device and system
CN113094483B (en) Method and device for processing vehicle feedback information, terminal equipment and storage medium
CN106156036B (en) Vehicle-mounted audio processing method and vehicle-mounted equipment
CN112102854A (en) Recording filtering method and device and computer readable storage medium
WO2016165403A1 (en) Transportation assisting method and system
CN110826433B (en) Emotion analysis data processing method, device and equipment for test driving user and storage medium
CN113596247A (en) Alarm clock information processing method, device, vehicle, storage medium and program product
JP2006121270A (en) Hands-free speech unit
CN106296867B (en) Image recording apparatus and its image mark method
CN112116911B (en) Sound control method and device and computer readable storage medium
CN112118536B (en) Power saving method and device of device and computer readable storage medium
CN113306487A (en) Vehicle prompting method, device, electronic equipment, storage medium and program product
CN112261586A (en) Method for automatically identifying driver to limit driving range of driver by using vehicle-mounted robot
CN113392650A (en) Intelligent memo reminding method and system based on vehicle position

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201218