CN117440001A - Data synchronization method based on message - Google Patents

Data synchronization method based on message

Info

Publication number
CN117440001A
Authority
CN
China
Prior art keywords
voice
data
original
frame
voice message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311753182.5A
Other languages
Chinese (zh)
Other versions
CN117440001B (en)
Inventor
李盼盼
王子佩
甘海鹏
邓诗雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sdic Human Resources Service Co ltd
Original Assignee
Sdic Human Resources Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sdic Human Resources Service Co ltd filed Critical Sdic Human Resources Service Co ltd
Priority to CN202311753182.5A priority Critical patent/CN117440001B/en
Publication of CN117440001A publication Critical patent/CN117440001A/en
Application granted granted Critical
Publication of CN117440001B publication Critical patent/CN117440001B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a message-based data synchronization method, which belongs to the technical field of data processing and comprises the following steps: S1, acquiring original voice data, preprocessing the original voice data to generate denoised voice data, and inputting the original voice data and the denoised voice data into a constructed voice processing model to generate a voice message queue; S2, determining voice message weights according to the voice message queue and generating a voice message report; S3, synchronously transmitting the voice message report and the denoised voice data to the user terminal. The invention synchronizes the preprocessed voice data and the voice message report to a user terminal such as a mobile phone, so that the user can conveniently view the voice data in time and, from the voice message report, learn how the audio of the voice message changes, thereby quickly grasping the state of the audio.

Description

Data synchronization method based on message
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a message-based data synchronization method.
Background
With economic and technological progress and with the development and popularization of mobile internet technology, data transmission technology and intelligent terminals, audio transmission has advanced widely. People's requirements for audio quality are gradually rising, so research into improving audio quality is of great significance for enriching people's spiritual and daily life. At present, with the development of audio acquisition technology, how to synchronize the audio collected by an acquisition device (such as a microphone) and the specific information of its voice signal to a user terminal (such as a mobile phone) has become an urgent problem to be solved.
Disclosure of Invention
To solve this problem, the invention provides a message-based data synchronization method.
The technical scheme of the invention is as follows: a message-based data synchronization method comprising the steps of:
S1, acquiring original voice data, preprocessing the original voice data to generate denoised voice data, and inputting the original voice data and the denoised voice data into a constructed voice processing model to generate a voice message queue;
S2, determining voice message weights according to the voice message queue and generating a voice message report;
S3, synchronously transmitting the voice message report and the denoised voice data to the user terminal.
Further, S1 comprises the following sub-steps:
S11, collecting multiple frames of original voice data, denoising each frame of original voice data, and generating each frame of denoised voice data;
S12, constructing a voice processing model, inputting each frame of original voice data and each frame of denoised voice data into the voice processing model, and generating a voice message sequence;
S13, arranging all voice message values in the voice message sequence in time order to form the voice message queue.
The beneficial effects of the above further scheme are: in the invention, the voice processing model extracts the audio feature values of the original voice data and of the denoised voice data, performs frame-wise difference and absolute-value operations on these feature values, and fuses the results. This enriches the audio features, so that the voice message values output by the feature generation layer contain more voice characteristics once ordered in time.
Further, in S12, the speech processing model includes a first feature extraction layer, a second feature extraction layer, an operator A1, a feature fusion layer, and a feature generation layer;
the input end of the first feature extraction layer and the input end of the second feature extraction layer are both used as input ends of a voice processing model; the output end of the first characteristic extraction layer is connected with the first input end of the arithmetic unit A1; the output end of the second characteristic extraction layer is connected with the second input end of the arithmetic unit A1; the output end of the arithmetic unit A1 is connected with the input end of the feature fusion layer; the output end of the characteristic fusion layer is connected with the input end of the characteristic generation layer; the output of the feature generation layer serves as the output of the speech processing model.
Further, the expression of the first feature extraction layer is X = {x_1, x_2, ..., x_n, ..., x_N}, n = 1, 2, ..., N, where X denotes the output of the first feature extraction layer, N the total number of frames of the original voice data, x_n the audio feature value of the n-th frame of original voice data, s the pitch of the original voice data, f the sampling frequency of the original voice data, l_n the loudness of the n-th frame of original voice data, and e the exponential constant (the per-frame formula for x_n is not reproduced in this text);
the expression of the second feature extraction layer is Y = {y_1, y_2, ..., y_m, ..., y_M}, m = 1, 2, ..., M, where Y denotes the output of the second feature extraction layer, M the total number of frames of the denoised voice data, and y_m the audio feature value of the m-th frame of denoised voice data, computed from the pitch of the denoised voice data and the loudness l_m of the m-th frame of denoised voice data (formula likewise not reproduced).
Further, the expression of operator A1 is Z = {x_1 - y_1, x_2 - y_2, ..., x_n - y_m, ..., x_N - y_M} with N = M, where n = 1, 2, ..., N, m = 1, 2, ..., M, x_n denotes the audio feature value of the n-th frame of original voice data, y_m the audio feature value of the m-th frame of denoised voice data, N the total number of frames of the original voice data, and M the total number of frames of the denoised voice data.
Further, the feature fusion layer produces its output P from the output Z of operator A1, the output X of the first feature extraction layer and the output Y of the second feature extraction layer, where σ_k denotes the weight parameter of the k-th neuron in the feature fusion layer, b_k the bias parameter of the k-th neuron in the feature fusion layer, K the number of neurons in the feature fusion layer, and max(·) the maximum-value operation (the fusion formula is not reproduced in this text).
Further, the expression of the feature generation layer is R = {r_1, r_2, ..., r_n, ..., r_N}, n = 1, 2, ..., N, where N denotes the total number of frames of the original voice data, r_n the voice message value of the n-th frame of original voice data, x_n the audio feature value of the n-th frame of original voice data, and P the output of the feature fusion layer (the per-frame formula for r_n is not reproduced in this text).
Further, S2 comprises the following sub-steps:
S21, acquiring a blank voice message report;
S22, calculating the voice message weight of each frame of original voice data according to the voice message queue to generate a voice message weight set;
S23, performing de-duplication processing on the voice message weight set to generate the standard voice message weight of each frame of original voice data;
S24, generating a voice message report according to the standard voice message weight of each frame of original voice data.
The beneficial effects of the above further scheme are: in the invention, a weight operation is performed on the voice message queue, which contains multiple voice signal characteristics, and the voice message weights are filled into the blank report in descending order. A position with a larger weight indicates a larger or more important change in the audio signal, so a user viewing the voice message report can quickly obtain the important information in the voice data.
Further, in S22, the voice message weight ρ_n of the n-th frame of original voice data is calculated from the voice message value r_n of the n-th frame of original voice data, the voice message value r_{n+1} of the (n+1)-th frame of original voice data, a minimal constant ε, and the total number of frames N of the original voice data (the calculation formula is not reproduced in this text).
Further, in S24, the specific method for generating the voice message report is: filling the standard voice message weight of each frame of original voice data into the blank voice message report in descending order to generate the voice message report.
The beneficial effects of the invention are as follows: the invention discloses a message-based data synchronization method in which voice data is preprocessed to generate a voice message queue containing a large number of audio features; after a weight operation is performed on the voice message queue, a blank voice message report is filled in to produce the final voice message report. Finally, the invention synchronizes the preprocessed voice data and the voice message report to a user terminal, such as a mobile phone, so that the user can view the voice data in time and learn the audio changes of the voice message from the voice message report, quickly grasping the state of the audio.
Drawings
FIG. 1 is a flow chart of a message-based data synchronization method;
FIG. 2 is a schematic diagram of the speech processing model.
Detailed Description
Embodiments of the present invention are further described below with reference to the accompanying drawings.
As shown in fig. 1, the present invention provides a message-based data synchronization method, which includes the following steps:
S1, acquiring original voice data, preprocessing the original voice data to generate denoised voice data, and inputting the original voice data and the denoised voice data into a constructed voice processing model to generate a voice message queue;
S2, determining voice message weights according to the voice message queue and generating a voice message report;
S3, synchronously transmitting the voice message report and the denoised voice data to the user terminal.
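For illustration only, the following Python sketch shows one way step S3 could be realized, assuming the voice message report and the denoised voice data are pushed to the user terminal as a single JSON payload over HTTP; the endpoint URL, the payload layout, and the function name are assumptions made here and are not specified by the invention. Steps S1 and S2 are sketched similarly after their detailed sub-steps below.

```python
# Hypothetical sketch of S3: synchronizing the report and the denoised voice
# data to a user terminal (e.g. a mobile phone) over HTTP.  The URL and the
# payload layout are assumptions for illustration only.
import json
import urllib.request
from typing import Dict, List


def synchronize_to_terminal(report: List[Dict],
                            denoised_frames: List[List[float]],
                            terminal_url: str) -> None:
    payload = json.dumps({
        "voice_message_report": report,          # output of S2
        "denoised_voice_data": denoised_frames,  # output of the S1 preprocessing
    }).encode("utf-8")
    request = urllib.request.Request(
        terminal_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        response.read()  # the terminal's acknowledgement is ignored in this sketch
```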
In an embodiment of the present invention, S1 comprises the following sub-steps:
S11, collecting multiple frames of original voice data, denoising each frame of original voice data, and generating each frame of denoised voice data;
S12, constructing a voice processing model, inputting each frame of original voice data and each frame of denoised voice data into the voice processing model, and generating a voice message sequence;
S13, arranging all voice message values in the voice message sequence in time order to form the voice message queue.
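Purely as an illustrative sketch of S11–S13, the following code assumes a simple moving-average denoiser (the invention does not fix any particular denoising algorithm) and treats each frame as a list of samples; the speech processing model itself is passed in as a callable and is sketched after the model description below.

```python
# Sketch of S11-S13.  The moving-average denoiser is an assumption; the
# invention only requires that each frame of original voice data be denoised.
from typing import Callable, List


def denoise_frame(frame: List[float], window: int = 5) -> List[float]:
    """S11 (assumed denoiser): smooth each sample with a centred moving average."""
    half = window // 2
    out = []
    for i in range(len(frame)):
        lo, hi = max(0, i - half), min(len(frame), i + half + 1)
        out.append(sum(frame[lo:hi]) / (hi - lo))
    return out


def build_voice_message_queue(
    original_frames: List[List[float]],
    speech_model: Callable[[List[List[float]], List[List[float]]], List[float]],
) -> List[float]:
    """S12-S13 (assumed interface): feed original and denoised frames to the
    speech processing model and keep its voice message values in time order."""
    denoised_frames = [denoise_frame(f) for f in original_frames]            # S11
    voice_message_sequence = speech_model(original_frames, denoised_frames)  # S12
    # S13: the model is assumed to emit one value per frame, already in frame
    # order, so the queue is simply that sequence kept in time order.
    return list(voice_message_sequence)
```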
In the invention, the voice processing model extracts the audio feature values of the original voice data and of the denoised voice data, performs frame-wise difference and absolute-value operations on these feature values, and fuses the results. This enriches the audio features, so that the voice message values output by the feature generation layer contain more voice characteristics once ordered in time.
In the embodiment of the present invention, as shown in fig. 2, in S12, the speech processing model includes a first feature extraction layer, a second feature extraction layer, an operator A1, a feature fusion layer, and a feature generation layer;
the input end of the first feature extraction layer and the input end of the second feature extraction layer are both used as input ends of a voice processing model; the output end of the first characteristic extraction layer is connected with the first input end of the arithmetic unit A1; the output end of the second characteristic extraction layer is connected with the second input end of the arithmetic unit A1; the output end of the arithmetic unit A1 is connected with the input end of the feature fusion layer; the output end of the characteristic fusion layer is connected with the input end of the characteristic generation layer; the output of the feature generation layer serves as the output of the speech processing model.
In the embodiment of the present invention, the expression of the first feature extraction layer is X = {x_1, x_2, ..., x_n, ..., x_N}, n = 1, 2, ..., N, where X denotes the output of the first feature extraction layer, N the total number of frames of the original voice data, x_n the audio feature value of the n-th frame of original voice data, s the pitch of the original voice data, f the sampling frequency of the original voice data, l_n the loudness of the n-th frame of original voice data, and e the exponential constant (the per-frame formula for x_n is not reproduced in this text);
the expression of the second feature extraction layer is Y = {y_1, y_2, ..., y_m, ..., y_M}, m = 1, 2, ..., M, where Y denotes the output of the second feature extraction layer, M the total number of frames of the denoised voice data, and y_m the audio feature value of the m-th frame of denoised voice data, computed from the pitch of the denoised voice data and the loudness l_m of the m-th frame of denoised voice data (formula likewise not reproduced).
In the embodiment of the present invention, the expression of operator A1 is Z = {x_1 - y_1, x_2 - y_2, ..., x_n - y_m, ..., x_N - y_M} with N = M, where n = 1, 2, ..., N, m = 1, 2, ..., M, x_n denotes the audio feature value of the n-th frame of original voice data, y_m the audio feature value of the m-th frame of denoised voice data, N the total number of frames of the original voice data, and M the total number of frames of the denoised voice data.
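For concreteness, a tiny worked example of operator A1 on three frames follows; the absolute-value variant shown alongside it is an assumption based on the earlier description of the model ("difference operation and absolute value operation"), not on a reproduced formula.

```python
# Operator A1: frame-aligned difference between original and denoised audio
# feature values (N = M).  The absolute-value form is an assumed variant.
x = [0.82, 0.61, 0.47]   # audio feature values of the original frames
y = [0.80, 0.66, 0.45]   # audio feature values of the denoised frames

z = [round(xn - yn, 2) for xn, yn in zip(x, y)]   # [0.02, -0.05, 0.02]
z_abs = [abs(d) for d in z]                       # [0.02, 0.05, 0.02] (assumed variant)
print(z, z_abs)
```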
In the embodiment of the invention, the feature fusion layer produces its output P from the output Z of operator A1, the output X of the first feature extraction layer and the output Y of the second feature extraction layer, where σ_k denotes the weight parameter of the k-th neuron in the feature fusion layer, b_k the bias parameter of the k-th neuron in the feature fusion layer, K the number of neurons in the feature fusion layer, and max(·) the maximum-value operation (the fusion formula is not reproduced in this text).
In the embodiment of the present invention, the expression of the feature generation layer is R = {r_1, r_2, ..., r_n, ..., r_N}, n = 1, 2, ..., N, where N denotes the total number of frames of the original voice data, r_n the voice message value of the n-th frame of original voice data, x_n the audio feature value of the n-th frame of original voice data, and P the output of the feature fusion layer (the per-frame formula for r_n is not reproduced in this text).
In an embodiment of the present invention, S2 comprises the following sub-steps:
S21, acquiring a blank voice message report;
S22, calculating the voice message weight of each frame of original voice data according to the voice message queue to generate a voice message weight set;
S23, performing de-duplication processing on the voice message weight set to generate the standard voice message weight of each frame of original voice data;
S24, generating a voice message report according to the standard voice message weight of each frame of original voice data.
In the invention, a weight operation is performed on the voice message queue, which contains multiple voice signal characteristics, and the voice message weights are filled into the blank report in descending order. A position with a larger weight indicates a larger or more important change in the audio signal, so a user viewing the voice message report can quickly obtain the important information in the voice data.
In the embodiment of the present invention, in S22, the voice message weight ρ_n of the n-th frame of original voice data is calculated from the voice message value r_n of the n-th frame of original voice data, the voice message value r_{n+1} of the (n+1)-th frame of original voice data, a minimal constant ε, and the total number of frames N of the original voice data (the calculation formula is not reproduced in this text).
In the embodiment of the present invention, in S24, the specific method for generating the voice message report is: filling the standard voice message weight of each frame of original voice data into the blank voice message report in descending order to generate the voice message report.
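The following sketch covers S21–S24 under stated assumptions: the per-frame voice message weights are taken as already computed (the S22 formula is not reproduced in this text), the blank report is modelled as a simple ordered list of rows, and the de-duplication step is read as keeping a repeated weight only for its earliest frame.

```python
# Sketch of S21-S24.  The weights are taken as given; the report layout
# (list of rows) and the de-duplication reading are assumptions.
from typing import Dict, List


def build_voice_message_report(voice_message_weights: List[float]) -> List[Dict]:
    # S21: obtain a blank voice message report.
    report: List[Dict] = []
    # S23: de-duplicate the weight set; a repeated weight is kept only for the
    # earliest frame that carries it (assumed reading of the step).
    standard_weights: Dict[int, float] = {}
    for frame_index, weight in enumerate(voice_message_weights):
        if weight not in standard_weights.values():
            standard_weights[frame_index] = weight
    # S24: fill the blank report in descending order of standard weight.
    for frame_index, weight in sorted(standard_weights.items(),
                                      key=lambda item: item[1], reverse=True):
        report.append({"frame": frame_index, "standard_weight": weight})
    return report


# Example: frames 1 and 3 carry the same weight, so only frame 1 is kept.
print(build_voice_message_report([0.10, 0.35, 0.05, 0.35]))
# -> [{'frame': 1, 'standard_weight': 0.35}, {'frame': 0, 'standard_weight': 0.1},
#     {'frame': 2, 'standard_weight': 0.05}]
```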
Those of ordinary skill in the art will recognize that the embodiments described herein are intended to help the reader understand the principles of the present invention, and that the scope of the invention is not limited to these specific statements and embodiments. Those of ordinary skill in the art may make various other specific modifications and combinations based on the teachings of the present disclosure without departing from its spirit, and such modifications and combinations remain within the scope of the present disclosure.

Claims (10)

1. A method for message-based data synchronization, comprising the steps of:
S1, acquiring original voice data, preprocessing the original voice data to generate denoised voice data, and inputting the original voice data and the denoised voice data into a constructed voice processing model to generate a voice message queue;
S2, determining voice message weights according to the voice message queue and generating a voice message report;
S3, synchronously transmitting the voice message report and the denoised voice data to the user terminal.
2. The message-based data synchronization method according to claim 1, wherein S1 comprises the sub-steps of:
S11, collecting multiple frames of original voice data, denoising each frame of original voice data, and generating each frame of denoised voice data;
S12, constructing a voice processing model, inputting each frame of original voice data and each frame of denoised voice data into the voice processing model, and generating a voice message sequence;
S13, arranging all voice message values in the voice message sequence in time order to form the voice message queue.
3. The message-based data synchronization method according to claim 2, wherein in S12, the speech processing model includes a first feature extraction layer, a second feature extraction layer, an operator A1, a feature fusion layer, and a feature generation layer;
the input end of the first feature extraction layer and the input end of the second feature extraction layer both serve as input ends of the voice processing model; the output end of the first feature extraction layer is connected to the first input end of operator A1; the output end of the second feature extraction layer is connected to the second input end of operator A1; the output end of operator A1 is connected to the input end of the feature fusion layer; the output end of the feature fusion layer is connected to the input end of the feature generation layer; and the output end of the feature generation layer serves as the output end of the voice processing model.
4. A message-based data synchronization method according to claim 3, wherein the expression of the first feature extraction layer is X = {x_1, x_2, ..., x_n, ..., x_N}, n = 1, 2, ..., N, where X denotes the output of the first feature extraction layer, N the total number of frames of the original voice data, x_n the audio feature value of the n-th frame of original voice data, s the pitch of the original voice data, f the sampling frequency of the original voice data, l_n the loudness of the n-th frame of original voice data, and e the exponential constant (the per-frame formula for x_n is not reproduced in this text);
the expression of the second feature extraction layer is Y = {y_1, y_2, ..., y_m, ..., y_M}, m = 1, 2, ..., M, where Y denotes the output of the second feature extraction layer, M the total number of frames of the denoised voice data, and y_m the audio feature value of the m-th frame of denoised voice data, computed from the pitch of the denoised voice data and the loudness l_m of the m-th frame of denoised voice data (formula likewise not reproduced).
5. A message-based data synchronization method according to claim 3, wherein the expression of operator A1 is Z = {x_1 - y_1, x_2 - y_2, ..., x_n - y_m, ..., x_N - y_M} with N = M, where n = 1, 2, ..., N, m = 1, 2, ..., M, x_n denotes the audio feature value of the n-th frame of original voice data, y_m the audio feature value of the m-th frame of denoised voice data, N the total number of frames of the original voice data, and M the total number of frames of the denoised voice data.
6. A message-based data synchronization method according to claim 3, wherein the feature fusion layer produces its output P from the output Z of operator A1, the output X of the first feature extraction layer and the output Y of the second feature extraction layer, where σ_k denotes the weight parameter of the k-th neuron in the feature fusion layer, b_k the bias parameter of the k-th neuron in the feature fusion layer, K the number of neurons in the feature fusion layer, and max(·) the maximum-value operation (the fusion formula is not reproduced in this text).
7. A message-based data synchronization method according to claim 3, wherein the expression of the feature generation layer is R = {r_1, r_2, ..., r_n, ..., r_N}, n = 1, 2, ..., N, where N denotes the total number of frames of the original voice data, r_n the voice message value of the n-th frame of original voice data, x_n the audio feature value of the n-th frame of original voice data, and P the output of the feature fusion layer (the per-frame formula for r_n is not reproduced in this text).
8. The message-based data synchronization method according to claim 1, wherein S2 comprises the sub-steps of:
S21, acquiring a blank voice message report;
S22, calculating the voice message weight of each frame of original voice data according to the voice message queue to generate a voice message weight set;
S23, performing de-duplication processing on the voice message weight set to generate the standard voice message weight of each frame of original voice data;
S24, generating a voice message report according to the standard voice message weight of each frame of original voice data.
9. The method according to claim 8, wherein in S22, the voice message weight ρ_n of the n-th frame of original voice data is calculated from the voice message value r_n of the n-th frame of original voice data, the voice message value r_{n+1} of the (n+1)-th frame of original voice data, a minimal constant ε, and the total number of frames N of the original voice data (the calculation formula is not reproduced in this text).
10. The message-based data synchronization method according to claim 8, wherein in S24, the specific method for generating the voice message report is: filling the standard voice message weight of each frame of original voice data into the blank voice message report in descending order to generate the voice message report.
CN202311753182.5A 2023-12-20 2023-12-20 Data synchronization method based on message Active CN117440001B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311753182.5A CN117440001B (en) 2023-12-20 2023-12-20 Data synchronization method based on message

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311753182.5A CN117440001B (en) 2023-12-20 2023-12-20 Data synchronization method based on message

Publications (2)

Publication Number Publication Date
CN117440001A true CN117440001A (en) 2024-01-23
CN117440001B CN117440001B (en) 2024-02-27

Family

ID=89546565

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311753182.5A Active CN117440001B (en) 2023-12-20 2023-12-20 Data synchronization method based on message

Country Status (1)

Country Link
CN (1) CN117440001B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105489223A (en) * 2015-11-26 2016-04-13 成都微讯云通科技有限公司 Real-time voice transmission method
CN108388942A (en) * 2018-02-27 2018-08-10 四川云淞源科技有限公司 Information intelligent processing method based on big data
CN113782044A (en) * 2021-08-25 2021-12-10 慧言科技(天津)有限公司 Voice enhancement method and device
WO2023086311A1 (en) * 2021-11-09 2023-05-19 Dolby Laboratories Licensing Corporation Control of speech preservation in speech enhancement
CN114299981A (en) * 2021-12-29 2022-04-08 中国电信股份有限公司 Audio processing method, device, storage medium and equipment

Also Published As

Publication number Publication date
CN117440001B (en) 2024-02-27

Similar Documents

Publication Publication Date Title
CN105244042B (en) A kind of speech emotional interactive device and method based on finite-state automata
CN109256150A (en) Speech emotion recognition system and method based on machine learning
CN106409310A (en) Audio signal classification method and device
WO2022199215A1 (en) Crowd-information-fused speech emotion recognition method and system
CN111986661B (en) Deep neural network voice recognition method based on voice enhancement in complex environment
CN113129898B (en) Machine-assisted conference recording system and method
CN110070855A (en) A kind of speech recognition system and method based on migration neural network acoustic model
CN109063624A (en) Information processing method, system, electronic equipment and computer readable storage medium
CN110717410A (en) Voice emotion and facial expression bimodal recognition system
CN116665676B (en) Semantic recognition method for intelligent voice outbound system
CN117440001B (en) Data synchronization method based on message
CN115249479A (en) BRNN-based power grid dispatching complex speech recognition method, system and terminal
CN112420079B (en) Voice endpoint detection method and device, storage medium and electronic equipment
CN109545198A (en) A kind of Oral English Practice mother tongue degree judgment method based on convolutional neural networks
CN108010533A (en) The automatic identifying method and device of voice data code check
CN116611447A (en) Information extraction and semantic matching system and method based on deep learning method
CN115688868A (en) Model training method and computing device
CN107825433A (en) A kind of card machine people of children speech instruction identification
CN114550729A (en) Cry detection model training method and device, electronic equipment and storage medium
CN1062365C (en) A method of transmitting and receiving coded speech
CN113994427A (en) Source-specific separation of speech in audio recordings by predicting an isolated audio signal conditioned on speaker representation
CN212675912U (en) Automatic voice recognition system based on FPGA
CN110888986B (en) Information pushing method, device, electronic equipment and computer readable storage medium
CN117558264A (en) Dialect voice recognition training method and system based on self-knowledge distillation
US20220238095A1 (en) Text-to-speech dubbing system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant