CN112509597A - Recording data identification method and device and recording equipment

Recording data identification method and device and recording equipment

Info

Publication number
CN112509597A
Authority
CN
China
Prior art keywords
data
sound
audio data
target
recording
Prior art date
Legal status
Pending
Application number
CN202011303903.9A
Other languages
Chinese (zh)
Inventor
钟兆彬
叶王建
任鑫鑫
曾安
Current Assignee
Gree Electric Appliances Inc of Zhuhai
Original Assignee
Gree Electric Appliances Inc of Zhuhai
Priority date
Filing date
Publication date
Application filed by Gree Electric Appliances Inc of Zhuhai filed Critical Gree Electric Appliances Inc of Zhuhai
Priority to CN202011303903.9A
Publication of CN112509597A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/26 Speech to text systems
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L17/00 Speaker identification or verification
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/16 Storage of analogue signals in digital stores using an arrangement comprising analogue/digital [A/D] converters, digital memories and digital/analogue [D/A] converters

Abstract

The invention provides a recording data identification method and device and a recording device, wherein the method comprises the following steps: acquiring sound data collected by each of a plurality of sound recording devices; separating audio data of a target object from the sound data collected by each sound recording device; performing definition recognition on the audio data of the target object separated from the sound data collected by each sound recording device, and selecting the audio data with the highest definition as the target audio data of the target object; and converting the target audio data into text data. This scheme solves the problem that the capture results of existing recording equipment are affected by distance and therefore have low accuracy, and achieves the technical effect of effectively improving accuracy.

Description

Recording data identification method and device and recording equipment
Technical Field
The invention relates to the technical field of audio processing, in particular to a recording data identification method and device and recording equipment.
Background
The recording pen (voice recorder) is a common tool for producing meeting minutes. Current recording pens can perform voiceprint recognition on several speakers talking at the same time, separating each person's speech and converting it into text.
However, the pickup range of a recording pen is short, so speech from nearby participants is recorded clearly while speech from distant participants is blurred. In addition, the microphone of a recording pen is directional, so in a multi-person conference the transcription it produces deviates from the actual content of the conference, which makes post-meeting review difficult.
No effective solution has yet been proposed for breaking through the distance limit of the recording pen and recording conference content accurately and stably in real time.
Disclosure of Invention
Embodiments of the invention provide a recording data identification method and device and a recording device, so as to achieve the technical effect of accurately and stably identifying recorded content.
In one aspect, a method for identifying recorded sound data is provided, including:
acquiring sound data acquired by each of a plurality of sound recording devices;
separating the audio data of the target object from the sound data collected by each sound recording device;
performing definition recognition on audio data of a target object separated from sound data acquired by each sound recording device, and selecting the audio data with the highest definition as the target audio data of the target object;
the target audio data is converted into text data.
In one embodiment, in a case where a plurality of target objects exist, separating audio data of the target objects from sound data collected by the respective sound recording apparatuses includes:
and separating the audio data of each target object from the sound data collected by each sound recording device through voiceprint recognition.
In one embodiment, after converting the target audio data into text data in a case where a plurality of target objects exist, the method further includes:
the text data converted from the target audio data of the plurality of target objects is displayed in chronological order.
In one embodiment, after acquiring the sound data collected by each of the plurality of sound recording apparatuses, the method further includes:
determining whether data with the intensity lower than a preset threshold exists in sound data collected by each of the plurality of sound recording devices;
taking the data with the intensity lower than a preset threshold value as environmental noise;
and carrying out noise reduction processing on the sound data acquired by each of the plurality of recording devices through the environmental noise.
In one embodiment, selecting the audio data with the highest definition as the target audio data of the target object, and converting the target audio data into text data further includes:
acquiring context data under the condition that the definition of audio data of a target object separated from sound data acquired by each sound recording device is lower than a preset definition threshold;
and performing fuzzy recognition on the target audio data according to the context data so as to convert the target audio data into text data.
In another aspect, an apparatus for recognizing recorded sound data is provided, including:
the acquisition module is used for acquiring sound data acquired by each of the plurality of sound recording devices;
the separation module is used for separating the audio data of the target object from the sound data collected by each sound recording device;
the recognition module is used for performing definition recognition on the audio data of the target object separated from the sound data collected by each sound recording device, and selecting the audio data with the highest definition as the target audio data of the target object;
and the conversion module is used for converting the target audio data into text data.
In one embodiment, the separation module is specifically configured to separate, in a case where a plurality of target objects exist, audio data of each target object from sound data collected by each sound recording apparatus through voiceprint recognition.
In one embodiment, the above apparatus further comprises:
and the display module is used for converting the target audio data into text data under the condition that a plurality of target objects exist, and then displaying the text data converted from the target audio data of the plurality of target objects according to the time sequence.
In still another aspect, an audio recording apparatus is provided, which includes the recording data identification device described above.
In yet another aspect, a network device is provided, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
In a further aspect, a non-transitory computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the above-mentioned method.
In the embodiment, the sound data are collected by arranging the plurality of sound recording devices, the audio data of the target object are identified from the sound data, and the group of audio data with the highest definition is selected as the target audio data to be converted into the text data, so that the text content of the voice of the target object is obtained.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of a method of recording data identification according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a recording scene according to an embodiment of the present invention;
FIG. 3 is a flowchart of an example of sound recording data identification according to an embodiment of the present invention;
fig. 4 is a block diagram of a recording data identification apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
To address the problem that a recording pen's sound reception is unstable because it is affected by distance, this embodiment has several recording pens pick up sound cooperatively, selects for each person the recording pen that best matches that person's voice, and integrates and transcribes the different voices with the strongest signals, thereby ensuring the accuracy of the conference summary. Based on this, this example provides a recorded sound data identification method which, as shown in fig. 1, may include the following steps:
step 101: acquiring sound data acquired by each of a plurality of sound recording devices;
step 102: separating the audio data of the target object from the sound data collected by each sound recording device;
step 103: performing definition recognition on audio data of a target object separated from sound data acquired by each sound recording device, and selecting the audio data with the highest definition as the target audio data of the target object;
step 104: the target audio data is converted into text data.
In the above embodiment, a plurality of recording devices are arranged to collect sound data, the audio data of the target object is identified from the sound data, and a group of audio data with the highest definition is selected as the target audio data to be converted into text data, so that the text content of the voice of the target object is obtained.
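As an illustration of how the highest-definition selection in step 103 might be scored, the sketch below rates each recording device's copy of one speaker's audio with a simple signal-to-noise proxy and keeps the best copy. The patent does not specify a definition metric, so the percentile-based score, the function names, and the synthetic test signals are assumptions made only for illustration.

```python
import numpy as np

def clarity_score(audio: np.ndarray, frame_len: int = 1024) -> float:
    """Rough clarity proxy: loud-frame level over quiet-frame level, in dB."""
    n_frames = len(audio) // frame_len
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    energies = np.sqrt((frames ** 2).mean(axis=1))       # per-frame RMS
    noise_floor = np.percentile(energies, 10) + 1e-12    # quietest frames
    signal_level = np.percentile(energies, 90)           # loudest frames
    return 20.0 * np.log10(signal_level / noise_floor)

def select_target_audio(per_device_audio: dict[str, np.ndarray]) -> tuple[str, np.ndarray]:
    """Return (device_id, audio) for the copy with the highest clarity score."""
    best = max(per_device_audio, key=lambda d: clarity_score(per_device_audio[d]))
    return best, per_device_audio[best]

# Synthetic check: speech occupies the second half of the clip, so quiet frames
# expose the noise floor; "pen_2" carries the louder, cleaner copy.
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(8000),
                        np.sin(2 * np.pi * 220 * np.linspace(0, 0.5, 8000))])
recordings = {
    "pen_1": 0.2 * clean + 0.10 * rng.standard_normal(16000),
    "pen_2": 1.0 * clean + 0.05 * rng.standard_normal(16000),
}
print(select_target_audio(recordings)[0])  # prints "pen_2" for this synthetic example
```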
In practice there are usually multiple target objects (i.e., speakers) to recognize, so the clearest recording-pen data must be identified for each of them. That is, when multiple target objects exist, the audio data of the target objects is separated from the sound data collected by each recording device; specifically, the audio data of each target object can be separated from the sound data collected by each recording device through voiceprint recognition.
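The following sketch illustrates, in principle, how such voiceprint-based separation could assign frames of a mixed recording to whichever enrolled speaker's reference they most resemble. The frame "embedding" here is a normalized magnitude spectrum standing in for a real voiceprint model, which the patent assumes but does not describe; all function names and parameters are illustrative assumptions.

```python
import numpy as np

def frame_embedding(frame: np.ndarray) -> np.ndarray:
    """Toy 'voiceprint' for one frame: a normalized magnitude spectrum."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    return spec / (np.linalg.norm(spec) + 1e-12)

def separate_speaker(mixed: np.ndarray, references: dict[str, np.ndarray],
                     target: str, frame_len: int = 1024) -> np.ndarray:
    """Keep the frames of `mixed` whose embedding is closest to the target's reference."""
    ref_embeds = {spk: frame_embedding(ref[:frame_len]) for spk, ref in references.items()}
    kept = []
    for start in range(0, len(mixed) - frame_len + 1, frame_len):
        frame = mixed[start:start + frame_len]
        emb = frame_embedding(frame)
        scores = {spk: float(emb @ ref) for spk, ref in ref_embeds.items()}  # cosine similarity
        if max(scores, key=scores.get) == target:
            kept.append(frame)
    return np.concatenate(kept) if kept else np.zeros(0)

# Toy usage: two pure tones stand in for two enrolled speakers taking turns.
t = np.linspace(0, 1, 16000)
refs = {"speaker_a": np.sin(2 * np.pi * 200 * t), "speaker_b": np.sin(2 * np.pi * 1200 * t)}
mixed = np.concatenate([refs["speaker_a"][:4096], refs["speaker_b"][:4096]])
print(len(separate_speaker(mixed, refs, target="speaker_a")))  # about half the samples kept
```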
When a conference is recorded for later review and several target objects exist, the text data converted from the target audio data of those target objects may be displayed in chronological order after conversion. That is, each optimized piece of voice data is transcribed into text and displayed on the screen in the order in which it was actually spoken.
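A minimal sketch of this chronological display step: each transcribed utterance carries the time at which it was spoken, and the transcript is sorted on that field. The Utterance fields and the formatting are assumptions, not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str
    start_time: float  # seconds from the start of the meeting
    text: str

def render_transcript(utterances: list[Utterance]) -> str:
    """Sort transcribed utterances by when they were spoken and format them."""
    ordered = sorted(utterances, key=lambda u: u.start_time)
    return "\n".join(f"[{u.start_time:7.1f}s] {u.speaker}: {u.text}" for u in ordered)

print(render_transcript([
    Utterance("Speaker B", 12.4, "I agree with the proposal."),
    Utterance("Speaker A", 3.0, "Let's review last week's action items."),
]))
```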
Considering that ambient noise is sometimes present during recording, noise reduction can be performed using an estimate of that ambient noise to improve the definition of the audio data. To this end, after acquiring the sound data collected by each of the plurality of recording devices, it may be determined whether any of that sound data has an intensity lower than a preset threshold; data whose intensity is lower than the preset threshold is taken as ambient noise; and noise reduction is then performed on the sound data collected by each of the plurality of recording devices using the ambient noise. That is, if the signal strength of a certain sound stays below the threshold, that sound can be treated as ambient noise.
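The sketch below shows one way this ambient-noise handling could be realized: frames whose intensity falls below a preset threshold are treated as ambient noise, and their average spectrum is subtracted from the rest of the signal. The patent only states that sub-threshold data is used for noise reduction; spectral subtraction, the threshold value (which assumes audio normalized to the range [-1, 1]), and the frame length are assumptions.

```python
import numpy as np

def denoise(audio: np.ndarray, noise_threshold: float = 0.02,
            frame_len: int = 1024) -> np.ndarray:
    """Treat sub-threshold frames as ambient noise and subtract their average spectrum."""
    n_frames = len(audio) // frame_len
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    noise_frames = frames[rms < noise_threshold]          # "intensity below preset threshold"
    if len(noise_frames) == 0:
        return audio                                       # nothing judged to be noise
    noise_spectrum = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)
    spectra = np.fft.rfft(frames, axis=1)
    magnitude = np.maximum(np.abs(spectra) - noise_spectrum, 0.0)  # spectral subtraction
    cleaned = np.fft.irfft(magnitude * np.exp(1j * np.angle(spectra)), n=frame_len, axis=1)
    return cleaned.reshape(-1)
```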
Some users may not be close to any recording pen when speaking, or may speak relatively softly; in such cases every group of that user's voice data may be unclear during recognition. Context data may therefore be obtained for fuzzy recognition: for example, when the definition of the target object's audio data separated from the sound data collected by every recording device is lower than a preset definition threshold, fuzzy recognition is performed on the target audio data according to the context data so as to convert it into text data. That is, the most appropriate content can be identified by matching against the context.
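As a sketch of this context-based fallback, the snippet below scores candidate transcriptions of an unclear utterance by word overlap with the surrounding context and keeps the best match. The scoring function and the example strings are illustrative assumptions; the patent only says the most suitable content is matched according to the context.

```python
def context_score(candidate: str, context: str) -> float:
    """Fraction of the candidate's words that also occur in the surrounding context."""
    cand_words = set(candidate.lower().split())
    ctx_words = set(context.lower().split())
    return len(cand_words & ctx_words) / max(len(cand_words), 1)

def fuzzy_pick(candidates: list[str], context: str) -> str:
    """Among alternative transcriptions of an unclear utterance, keep the best context fit."""
    return max(candidates, key=lambda c: context_score(c, context))

context = "the budget review for the new production line equipment"
print(fuzzy_pick(
    ["we should increase the production line budget",
     "we should increase the protection lion budget"],
    context,
))  # picks the candidate that better matches the surrounding discussion
```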
The above method is described below with reference to a specific example, however, it should be noted that the specific example is only for better describing the present application and is not to be construed as limiting the present application.
To overcome the distance limitation of recording pens and record conference content stably and accurately in real time, this example has several recording pens pick up the conference content in a coordinated manner. The server then estimates, from the signal strength values, how far each person's voice is from each recording pen, matches each person's voice to the optimal recording pen, and integrates and transcribes the different voices with the strongest signals, thereby producing a real-time, stable, and accurate conference summary.
Selecting the best data for each voice from among the several recording pens solves the problem of recordings that are unclear because the pickup range is insufficient, and also reduces the problem of inaccurate transcription caused by similar-sounding speech.
As shown in fig. 2, several recording pens (recording pen 1, recording pen 2, recording pen 3) are provided and cooperate to pick up sound; a server, a display screen, and the like are also provided. The voices of different speakers are received and separated by the several recording pens, enabling accurate transcription of the conference content. Specifically, as shown in fig. 3, the method includes the following steps:
s1: when a conference starts, the recording pens at different positions send matching information to the server through the wireless module, the server numbers the recording pens after receiving the matching information, the recording pens start to work, and collected sound data are transmitted to a recording storage of the server so as to be further processed or played back for use.
S2: the server analyzes the received sound data, if the signal intensity value of a certain type of sound is lower than a threshold value, the sound data is identified as the environmental noise, and noise reduction processing is carried out on the environmental noise.
S3: the sound of the speakers during the conference is transmitted to a plurality of recording pens, and one recording pen receives the sounds of a plurality of speakers at the same time, so that the recognized sounds need to be separated according to the different sound ripples of each speaker, and the server encodes the various sounds separated by different recording pens, for example, the sounds can be grouped according to different timbres.
S4: the server contrasts and analyzes the sound data from different recording pens in the same group, selects one with better definition and strength for further optimization, and performs the same processing on each group of data.
S5: and (4) transferring each optimized sound data into character contents, and sequentially displaying the character contents on a screen according to the actual speaking sequence.
Further, if the voice data of a speaker collected by all the recording pens is not clear enough in a period of time, the server performs fuzzy recognition on the data and matches the most suitable content according to the context.
In the above example, cooperative pickup by several recording pens expands the recording range, the best data for each voice is selected based on sound-wave energy, and fuzzy recognition against the context of the conference summary is applied to unclear speech, so the accuracy of speech recognition is enhanced.
Based on the same inventive concept, an embodiment of the present invention further provides a recorded sound data identification apparatus, as described in the following embodiments. Because the principle by which the recorded data identification apparatus solves the problem is similar to that of the recorded data identification method, the implementation of the apparatus can refer to the implementation of the method, and repeated details are not described again. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. Fig. 4 is a block diagram of a recording data identification apparatus according to an embodiment of the present invention. As shown in fig. 4, the apparatus may include: an acquisition module 401, a separation module 402, a recognition module 403 and a conversion module 404. The structure is explained below.
An obtaining module 401, configured to obtain sound data collected by each of the multiple sound recording devices;
a separation module 402, configured to separate audio data of a target object from sound data acquired by each of the sound recording devices;
the recognition module 403 is configured to perform definition recognition on the audio data of the target object separated from the sound data collected by each sound recording device, and select the audio data with the highest definition as the target audio data of the target object;
a conversion module 404, configured to convert the target audio data into text data.
In an embodiment, the separation module 402 may be specifically configured to, in a case where a plurality of target objects exist, separate audio data of each target object from sound data collected by each sound recording apparatus through voiceprint recognition.
In one embodiment, the recorded sound data identification device may further include: and the display module is used for converting the target audio data into text data under the condition that a plurality of target objects exist, and then displaying the text data converted from the target audio data of the plurality of target objects according to the time sequence.
In an embodiment, the recorded sound data identification apparatus may be further configured to determine, after acquiring sound data collected by each of a plurality of recording devices, whether there is data with intensity lower than a preset threshold in the sound data collected by each of the plurality of recording devices; taking the data with the intensity lower than a preset threshold value as environmental noise; and carrying out noise reduction processing on the sound data acquired by each of the plurality of recording devices through the environmental noise.
In an embodiment, the conversion module 404 may be further configured to obtain context data when the definitions of the audio data of the target object separated from the sound data collected by each sound recording device are all lower than a preset definition threshold; and performing fuzzy recognition on the target audio data according to the context data so as to convert the target audio data into text data.
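Purely as a structural sketch, the four modules of the apparatus could be wired together as follows; the callables are placeholders for the acquisition, separation, recognition (definition selection) and conversion logic described above, and none of the names come from the patent.

```python
from dataclasses import dataclass
from typing import Callable

AudioByDevice = dict[str, list[float]]   # device id -> audio samples

@dataclass
class RecordingDataRecognizer:
    acquire: Callable[[], AudioByDevice]                           # acquisition module
    separate: Callable[[AudioByDevice], dict[str, AudioByDevice]]  # separation module: speaker -> per-device copies
    select_clearest: Callable[[AudioByDevice], list[float]]        # recognition module: pick highest-definition copy
    to_text: Callable[[list[float]], str]                          # conversion module

    def run(self) -> dict[str, str]:
        """Run the four modules in order and return text per speaker."""
        per_device = self.acquire()
        per_speaker = self.separate(per_device)
        return {speaker: self.to_text(self.select_clearest(copies))
                for speaker, copies in per_speaker.items()}
```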
In another embodiment, software is provided for executing the technical solutions described in the above embodiments and preferred implementations.
In another embodiment, a storage medium is provided, in which the software is stored, and the storage medium includes but is not limited to: optical disks, floppy disks, hard disks, erasable memory, etc.
From the above description, it can be seen that the embodiments of the present invention achieve the following technical effects: sound data is collected by a plurality of recording devices, the audio data of the target object is identified from the sound data, the group of audio data with the highest definition is selected as the target audio data and converted into text data, and the text content of the target object's speech is thereby obtained. This solves the problem that the capture results of existing recording equipment are affected by distance and therefore have low accuracy, and achieves the technical effect of effectively improving accuracy.
Although various specific embodiments are described in this application, the application is not limited to the cases described in industry standards or in the examples; implementations that slightly modify those described in industry standards, in customary practice, or in the examples can achieve the same, equivalent, or similar effects, or the effects expected after such modification. Embodiments that employ such modified or transformed data acquisition, processing, output, or determination may still fall within the scope of alternative embodiments of this application.
Although the present application provides method steps as described in an embodiment or flowchart, more or fewer steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an apparatus or client product in practice executes, it may execute sequentially or in parallel (e.g., in a parallel processor or multithreaded processing environment, or even in a distributed data processing environment) according to the embodiments or methods shown in the figures. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, article, or apparatus that comprises the recited elements is not excluded.
The devices or modules and the like explained in the above embodiments may be specifically implemented by a computer chip or an entity, or implemented by a product with certain functions. For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. Of course, in implementing the present application, the functions of each module may be implemented in one or more pieces of software and/or hardware, or a module that implements the same function may be implemented by a combination of a plurality of sub-modules, and the like. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and other divisions may be realized in practice, for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed.
Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be considered as a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be regarded as being both a software module for performing the method and a structure within a hardware component.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, or the like, and includes several instructions for enabling a computer device (which may be a personal computer, a mobile terminal, a server, or a network device) to execute the method according to the embodiments or some parts of the embodiments of the present application.
The embodiments in the present specification are described in a progressive manner, and the same or similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
While the present application has been described by way of examples, those of ordinary skill in the art will appreciate that there are numerous variations and permutations of the present application that do not depart from its spirit, and it is intended that the appended claims cover such variations and permutations.

Claims (11)

1. A method for identifying recorded data, comprising:
acquiring sound data acquired by each of a plurality of sound recording devices;
separating the audio data of the target object from the sound data collected by each sound recording device;
performing definition recognition on audio data of a target object separated from sound data acquired by each sound recording device, and selecting the audio data with the highest definition as the target audio data of the target object;
the target audio data is converted into text data.
2. The method of claim 1, wherein in the case that a plurality of target objects exist, separating audio data of the target objects from sound data collected by the respective sound recording apparatuses comprises:
and separating the audio data of each target object from the sound data collected by each sound recording device through voiceprint recognition.
3. The method according to claim 1, wherein after converting the target audio data into text data in a case where there are a plurality of target objects, further comprising:
the text data converted from the target audio data of the plurality of target objects is displayed in chronological order.
4. The method of claim 1, wherein after obtaining the sound data collected by each of the plurality of sound recording devices, further comprising:
determining whether data with the intensity lower than a preset threshold exists in sound data collected by each of the plurality of sound recording devices;
taking the data with the intensity lower than a preset threshold value as environmental noise;
and carrying out noise reduction processing on the sound data acquired by each of the plurality of recording devices through the environmental noise.
5. The method of claim 1, wherein selecting the audio data with the highest definition as the target audio data of the target object, and converting the target audio data into text data further comprises:
acquiring context data under the condition that the definition of audio data of a target object separated from sound data acquired by each sound recording device is lower than a preset definition threshold;
and performing fuzzy recognition on the target audio data according to the context data so as to convert the target audio data into text data.
6. An apparatus for recognizing recorded sound data, comprising:
the acquisition module is used for acquiring sound data acquired by each of the plurality of sound recording devices;
the separation module is used for separating the audio data of the target object from the sound data collected by each sound recording device;
the recognition module is used for performing definition recognition on the audio data of the target object separated from the sound data collected by each sound recording device, and selecting the audio data with the highest definition as the target audio data of the target object;
and the conversion module is used for converting the target audio data into text data.
7. The apparatus according to claim 6, wherein the separation module is specifically configured to separate the audio data of each target object from the sound data collected by each sound recording device through voiceprint recognition in a case where a plurality of target objects exist.
8. The apparatus of claim 6, further comprising:
and the display module is used for converting the target audio data into text data under the condition that a plurality of target objects exist, and then displaying the text data converted from the target audio data of the plurality of target objects according to the time sequence.
9. An audio recording apparatus comprising: the recorded data identification apparatus of any one of claims 6 to 8.
10. A network device, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 5 when executing the computer program.
11. A non-transitory computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 5.
CN202011303903.9A 2020-11-19 2020-11-19 Recording data identification method and device and recording equipment Pending CN112509597A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011303903.9A CN112509597A (en) 2020-11-19 2020-11-19 Recording data identification method and device and recording equipment


Publications (1)

Publication Number Publication Date
CN112509597A 2021-03-16

Family

ID=74959922

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011303903.9A Pending CN112509597A (en) 2020-11-19 2020-11-19 Recording data identification method and device and recording equipment

Country Status (1)

Country Link
CN (1) CN112509597A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114141274A (en) * 2021-11-22 2022-03-04 珠海格力电器股份有限公司 Audio processing method, device, equipment and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103888703A (en) * 2014-03-28 2014-06-25 深圳市中兴移动通信有限公司 Shooting method and camera shooting device with recording enhanced
CN108769400A (en) * 2018-05-23 2018-11-06 宇龙计算机通信科技(深圳)有限公司 A kind of method and device of locating recordings
US20190318733A1 (en) * 2018-04-12 2019-10-17 Kaam Llc. Adaptive enhancement of speech signals
CN111078185A (en) * 2019-12-26 2020-04-28 珠海格力电器股份有限公司 Method and equipment for recording sound
CN111883168A (en) * 2020-08-04 2020-11-03 上海明略人工智能(集团)有限公司 Voice processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination