CN109165533B - Anti-peeping method of short video based on cross-group mechanism - Google Patents

Anti-peeping method of short video based on cross-group mechanism

Info

Publication number
CN109165533B
Authority
CN
China
Prior art keywords
sound signal
social client
group
short video
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810880478.6A
Other languages
Chinese (zh)
Other versions
CN109165533A (en)
Inventor
向敏明
Current Assignee
Shenzhen Dr Ma Network Technology Co ltd
Original Assignee
Shenzhen Dr Ma Network Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Dr Ma Network Technology Co ltd filed Critical Shenzhen Dr Ma Network Technology Co ltd
Priority to CN201810880478.6A priority Critical patent/CN109165533B/en
Publication of CN109165533A publication Critical patent/CN109165533A/en
Application granted granted Critical
Publication of CN109165533B publication Critical patent/CN109165533B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/82Protecting input, output or interconnection devices
    • G06F21/84Protecting input, output or interconnection devices output devices, e.g. displays or monitors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04Real-time or near real-time messaging, e.g. instant messaging [IM]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04Real-time or near real-time messaging, e.g. instant messaging [IM]
    • H04L51/046Interoperability with other network applications or services
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/07User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail characterised by the inclusion of specific contents
    • H04L51/10Multimedia information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/52User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services

Abstract

A short video peep-proof method based on a cross-group mechanism comprises the following steps: after a social client detects that a short video published on a first group session interface of a first group has been viewed by a legal user of the social client, the social client prompts the user to select, from a second group associated with the first group, any two users who have published sound signals on the second group session interface of the second group, so as to form a first user pair; the social client associates the first user pair with the short video and covers the short video with a preset image on the first group session interface; the first user pair serves as a first basis for removing the preset image covering the short video. By implementing the embodiments of the invention, the risk that a short video published on the group session interface of a certain group is peeped at by others can be reduced through a cross-group mechanism.

Description

Anti-peeping method of short video based on cross-group mechanism
Technical Field
The invention relates to the technical field of social networking, and in particular to a short video peep-proof method based on a cross-group mechanism.
Background
At present, users commonly create multiple groups on social clients such as WeChat and QQ, and members of a group sometimes publish short videos on the group session interface. In practice, it has been found that when a short video is published on the group session interface of a certain group and the user device (such as a mobile phone) where the social client is located is not locked, the short video published on the group session interface is easily peeped at by others.
Disclosure of Invention
The embodiment of the invention discloses a peep-proof method of a short video based on a cross-group mechanism, which can reduce the risk that a short video published on the group session interface of a certain group is peeped at by others.
The anti-peeping method for the short video based on the cross-group mechanism comprises the following steps:
after the social client detects that a short video published on a first group session interface of the first group has been viewed by a legal user of the social client, the social client prompts the user to select, from the second group, any two users who have published sound signals on the second group session interface of the second group, so as to form a first user pair;
the social client associates the first user pair with the short video and covers the short video with a preset image on the first group session interface; the first user pair serves as a first basis for removing the preset image covering the short video.
As an optional implementation manner, in an embodiment of the present invention, after the social client overlays the short video with a preset image on the first group session interface, the method further includes:
the social client detects, on the first group session interface, a removal instruction for the preset image covering the short video;
the social client prompts, according to the removal instruction, selection of any two users from the second group, so as to form a second user pair;
the social client judges whether the selected second user pair is the same as the first user pair serving as the first basis for removing the preset image covering the short video;
if they are the same, the social client removes the preset image covering the short video on the first group session interface, so as to redisplay the short video.
As an optional implementation manner, in an embodiment of the present invention, after the social client associates the first user pair with the short video and before overlaying the short video with a preset image on the first group session interface, the method further includes:
the social client prompts to select a first sound signal issued by one user included in the first user pair from the second group session interface and prompts to select a second sound signal issued by the other user included in the first user pair from the second group session interface;
the social client synthesizes the selected first sound signal and the second sound signal to obtain a verification sound signal;
the social client associates the verification sound signal with the short video and executes the step of covering the short video with a preset image on the first group session interface; wherein the verification sound signal is used as a second basis for removing the preset image which covers the short video;
after the social client determines that the second user pair is the same as the first user pair serving as a first basis for removing the preset image that covers the short video, and before the social client removes the preset image that covers the short video on the first group session interface to redisplay the short video, the method further includes:
the social client prompts to select a third sound signal issued by one user included in the second user pair from the second group session interface and prompts to select a fourth sound signal issued by another user included in the second user pair from the second group session interface;
the social client synthesizes the selected third sound signal and the fourth sound signal to obtain a synthesized sound signal;
and the social client judges whether the synthesized sound signal matches the verification sound signal serving as the second basis for removing the preset image covering the short video; if so, the social client executes the step of removing the preset image covering the short video on the first group session interface, so as to redisplay the short video.
As an optional implementation manner, in an embodiment of the present invention, the synthesizing, by the social client, the selected first sound signal and the second sound signal to obtain a verification sound signal includes:
the social client side determines an alignment point between the selected first sound signal and the second sound signal; wherein the alignment point refers to a starting position of the synthesis of the first sound signal and the second sound signal;
and the social client synthesizes the first sound signal and the second sound signal into a verification sound signal according to the alignment point.
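The patent does not specify how the two signals are combined beyond synthesis "according to the alignment point". The following sketch assumes a simple overlap-add of raw float samples, with the alignment point taken as the sample offset at which the second signal starts; real audio would additionally need matching sample rates and channel layouts:

```python
def synthesize_from_alignment(first_signal, second_signal, align):
    """Overlap-add the two sound signals, starting the second signal
    at the alignment point (a sample index into the first signal)."""
    length = max(len(first_signal), align + len(second_signal))
    out = [0.0] * length
    for i, s in enumerate(first_signal):
        out[i] += s            # copy the first signal as-is
    for i, s in enumerate(second_signal):
        out[align + i] += s    # mix the second signal in from the alignment point
    return out
```

For example, synthesizing [1.0, 1.0] with [2.0, 2.0] at alignment point 1 yields [1.0, 3.0, 2.0].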
As an optional implementation manner, in an embodiment of the present invention, the determining, by the social client, of the alignment point between the selected first sound signal and the second sound signal includes:
the social client calculates a first duration of the selected first sound signal and a second duration of the second sound signal; wherein the first duration represents the duration of the sound of the first sound signal, and the second duration represents the duration of the sound of the second sound signal;
the social client calculates the difference between the first duration and the second duration;
and the social client judges whether the difference is less than or equal to a preset value; if so, the social client periodically scales either the first sound signal or the second sound signal so that the two signals have the same final duration, and then takes the first audio frame of the first sound signal and the first audio frame of the second sound signal, now of the same final duration, as the alignment point.
As an optional implementation manner, in an embodiment of the present invention, the periodic scaling, by the social client, of any one of the first sound signal and the second sound signal includes:
if the first duration of the first sound signal is shorter than the second duration of the second sound signal, the social client determines, according to the difference, the proportion X of the difference to the first duration of the first sound signal;
the social client side calculates the audio frame number Y of the first sound signal;
the social client calculates an amplification coefficient Z, where Z = X × (Y / (Y − 1));
and the social client amplifies, in equal proportion according to the amplification coefficient, all audio frames of the first sound signal except the first audio frame, so that the final duration of the amplified first sound signal is the same as the second duration of the second sound signal.
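The scaling described in these steps can be checked numerically. The sketch below is an illustrative Python rendering, under the assumption that a sound signal can be represented by a list of per-frame durations; it computes Z = X × (Y / (Y − 1)) and stretches every frame except the first by a factor of (1 + Z), which brings the first signal's total duration up to the second's:

```python
def amplification_coefficient(first_duration, second_duration, num_frames):
    """Z = X * (Y / (Y - 1)), where X is the proportion of the duration
    difference relative to the first (shorter) signal's duration and
    Y is the number of audio frames in the first signal."""
    x = (second_duration - first_duration) / first_duration
    return x * (num_frames / (num_frames - 1))

def periodically_scale(frame_durations, second_duration):
    """Stretch every frame except the first by (1 + Z) so the scaled
    signal's total duration equals second_duration."""
    first_duration = sum(frame_durations)
    z = amplification_coefficient(first_duration, second_duration,
                                  len(frame_durations))
    # first frame untouched; remaining frames stretched in equal proportion
    return [frame_durations[0]] + [d * (1.0 + z) for d in frame_durations[1:]]
```

For example, four 1-second frames stretched toward a 5-second target give X = 0.25, Z = 1/3, and each of the last three frames becomes 4/3 seconds, for a total of 5 seconds.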
As an optional implementation manner, in an embodiment of the present invention, if the difference is greater than the preset value, the method further includes:
the social client samples the first sound signal and the second sound signal, respectively, with the same default sampling frequency, to obtain a first sampling group and a second sampling group;
the social client generates a cross-correlation group according to the default sampling frequency, the first sampling group, the second sampling group and a cross-correlation weight; wherein the cross-correlation weight is positively correlated with the difference, and the cross-correlation group comprises a plurality of values;
the social client compares the plurality of values in the cross-correlation group to find the maximum value;
and the social client takes the audio frame position corresponding to the maximum value as the alignment point.
As an optional implementation manner, in an embodiment of the present invention, the generating, by the social client, a cross-correlation group according to the default sampling frequency, the first sampling group, the second sampling group, and a cross-correlation weight includes:
S_n[t] = Σ_m x[m] · y[m − t] · W_t
wherein S_n[t] represents the cross-correlation group, x[m] represents the m-th sample data in the first sampling group, y[m − t] represents the (m − t)-th sample data in the second sampling group, t represents the time offset (an integer taking values from 0 to m), W_t denotes a window function, and n = l · f, where l is the cross-correlation weight and f is the default sampling frequency.
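The cross-correlation group and the subsequent maximum search can be exercised with a direct, unoptimized implementation. This Python sketch is an interpretation rather than the patent's code: the lag range 0..n−1 and the unit window used in the example are assumptions, since the source only fixes n = l · f:

```python
def cross_correlation_group(x, y, window, n):
    """S_n[t] = sum over m of x[m] * y[m - t] * W_t, for t = 0 .. n-1.
    x, y: sampling groups (lists of floats); window: W_t values, len >= n."""
    group = []
    for t in range(n):
        total = 0.0
        # y[m - t] is defined only for t <= m < t + len(y)
        for m in range(t, min(len(x), t + len(y))):
            total += x[m] * y[m - t] * window[t]
        group.append(total)
    return group

def alignment_point(group):
    """The audio frame position of the maximum value in the group."""
    return max(range(len(group)), key=group.__getitem__)
```

For example, with x = y = [1, 2, 3], a unit window and n = 3, the group is [14, 8, 3], so the alignment point is offset 0, as expected for identical signals.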
As an optional implementation manner, in an embodiment of the present invention, the determining, by the social client, whether the synthesized sound signal matches the verification sound signal as a second basis for removing the preset image that covers the short video includes:
the social client determines whether the alignment point of the synthesized sound signal is the same as that of the verification sound signal serving as the second basis for removing the preset image covering the short video;
if the alignment points are the same, the social client judges whether a first multi-dimensional vector corresponding to the voiceprint features of the synthesized sound signal matches a second multi-dimensional vector corresponding to the voiceprint features of the verification sound signal; if the two multi-dimensional vectors match, it is determined that the synthesized sound signal matches the verification sound signal; if not, it is determined that the synthesized sound signal does not match the verification sound signal;
the first multi-dimensional vector corresponding to the voiceprint feature of the synthetic sound signal is composed of a mel-frequency cepstrum coefficient, a linear prediction cepstrum coefficient, a first order difference of the mel-frequency cepstrum coefficient, a first order difference of the linear prediction cepstrum coefficient, energy, a first order difference of the energy and a Gammatone filter cepstrum coefficient.
In the embodiment of the invention, after the social client detects that a short video published on the first group session interface of the first group has been watched by a legal user of the social client, the social client prompts the user to select, from the second group, any two users who have published sound signals on the second group session interface of the second group, so as to form a first user pair; the social client may associate the first user pair with the short video and cover the short video with a preset image on the first group session interface, the first user pair serving as a first basis for removing the preset image covering the short video. Therefore, by implementing the embodiment of the invention, even when the screen of the user device (such as a mobile phone) where the social client is located is not locked, the risk that a short video published on the group session interface of a certain group is peeped at by others can be reduced through the cross-group mechanism.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a short video peep-proof method based on a cross-group mechanism according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of another short video peep-proof method based on a cross-group mechanism according to an embodiment of the present invention;
fig. 3 is a schematic flowchart of yet another short video peep-proof method based on a cross-group mechanism according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, of embodiments of the present invention are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiment of the invention discloses a peep-proof method of a short video based on a cross-group mechanism, which reduces the risk that a short video published on the group session interface of a certain group is peeped at by others. The following detailed description is made with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a schematic flowchart of a short video peep-proof method based on a cross-group mechanism according to an embodiment of the present invention. In the method shown in fig. 1, a first group and a second group associated with the first group are created in a social client, which may include but is not limited to WeChat and QQ; the social client may be installed on a mobile phone, a tablet computer or other user equipment. As shown in fig. 1, the short video peep-proof method based on the cross-group mechanism may include the following steps:
101. A mapping relationship between the group identifier of the first group and the group identifier of the second group is established in advance on a group mapping setting interface provided by the social client, the mapping relationship being used to indicate that the first group is associated with the second group.
The group mapping setting interface may be presented to the user by the social client when detecting that the user device where the social client is located and the wearable device wirelessly connected to the user device simultaneously generate the same action event (e.g., the same flick action event).
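The mapping relationship of step 101 can be illustrated as a simple lookup table. The sketch below is a minimal Python rendering, under the assumption that a one-to-one mapping from first-group identifier to second-group identifier suffices; the function and variable names are invented for illustration:

```python
# group identifier of a first group -> group identifier of its associated second group
group_mapping = {}

def associate_groups(first_group_id, second_group_id):
    """Record, as on the group mapping setting interface, that the
    first group is associated with the second group."""
    group_mapping[first_group_id] = second_group_id

def associated_second_group(first_group_id):
    """Look up the second group associated with a first group, if any."""
    return group_mapping.get(first_group_id)
```

A client would consult this table whenever a short video in the first group needs to be covered, to know which group's session interface supplies the candidate users and sound signals.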
102. After the social client detects that a certain short video published on the first group session interface of the first group has been viewed by a legal user of the social client, the social client prompts the user to select, from the second group, any two users who have published sound signals on the second group session interface of the second group, so as to form a first user pair.
As an optional implementation manner, when a short video published by the social client on the first group session interface of the first group is watched by the current user, the social client may verify whether the current user is a legal user of the social client. If so, the social client may determine that the short video published on the first group session interface of the first group has been viewed by a legal user of the social client, and prompt selection, from the second group, of any two users who have published sound signals on the second group session interface of the second group, so as to form a first user pair. The verifying, by the social client, whether the current user is a legal user of the social client may include:
the method comprises the steps that a social client acquires color information of a face image of a current user;
the social client carries out binarization processing on the color information of the face image of the current user;
the social client divides the binarized face image of the current user into a plurality of pixel blocks, and performs an OR operation on the pixel values of all pixels in each pixel block to obtain an OR result for each pixel block, the results forming a down-sampled image of the current user's face image;
the social client divides the obtained down-sampled image into a plurality of pixel regions, and obtains the feature information of each pixel region of the current user's face image by summing the OR results of all pixel points in each pixel region;
and the social client judges, according to the feature information of each pixel region of the current user's face image, whether the face image matches a legal user's face image stored in advance in the social client; if so, the current user is determined to be a legal user of the social client. In this way, the subsequent covering of the short video is performed only after it has been accurately identified that the short video was watched by the legal user pre-stored by the social client, which prevents an illegal user from triggering the social client to cover a short video published on the first group session interface of the first group, thereby improving the legality and reliability of the covering of the short video.
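The binarization, OR-based down-sampling and region summation steps can be sketched over a small 0/1 matrix. This is an illustrative Python rendering; the block and region sizes are assumptions, as the source does not fix them:

```python
def downsample_by_or(binary_image, block=2):
    """Split a binarized image (a 0/1 matrix) into block x block pixel
    blocks and OR all pixels in each block (max of 0/1 values equals OR),
    yielding the down-sampled image."""
    h, w = len(binary_image), len(binary_image[0])
    return [[max(binary_image[i + di][j + dj]
                 for di in range(min(block, h - i))
                 for dj in range(min(block, w - j)))
             for j in range(0, w, block)]
            for i in range(0, h, block)]

def region_features(downsampled, region=2):
    """Sum the OR results inside each region x region area of the
    down-sampled image to get the per-region feature information."""
    h, w = len(downsampled), len(downsampled[0])
    return [sum(downsampled[i + di][j + dj]
                for di in range(min(region, h - i))
                for dj in range(min(region, w - j)))
            for i in range(0, h, region)
            for j in range(0, w, region)]
```

For instance, a 4 × 4 binary image down-sampled with 2 × 2 blocks yields a 2 × 2 image, whose OR results are then summed per region to form the feature information compared against the stored legal user's face image.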
In the embodiment of the invention, the social client may select, from the second group, each user who has published a sound signal on the second group session interface of the second group, and pop up a first user pair selection interface, the interface including the identifier of each such user; accordingly, the social client may detect the identifiers of any two users selected from the first user pair selection interface by a legal user of the social client, thereby selecting from the second group any two users who have published sound signals on the second group session interface of the second group to form the first user pair.
Here, a user who has published a sound signal on the second group session interface refers to a user whose published sound signal is visible on the second group session interface.
103. The social client associates the first user pair with the short video, and overlays the short video with a preset image on a first group session interface; the first user pair is used as a first basis for removing the preset image of the covered short video.
Associating the first user pair with the short video means that the social client associates the correspondence between the identifiers of the two users in the first user pair with the short video.
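Taken together with the removal flow of the later embodiment, the association of step 103 amounts to keying each covered short video by an unordered pair of user identifiers. The sketch below is a hypothetical Python rendering; a frozenset is used on the assumption that the order in which the two users are selected does not matter:

```python
# short video identifier -> the first user pair serving as the removal basis
video_locks = {}

def cover_video(video_id, user_a, user_b):
    """Associate the first user pair with the short video when the
    preset image is laid over it."""
    video_locks[video_id] = frozenset((user_a, user_b))

def may_remove_cover(video_id, user_a, user_b):
    """The preset image may be removed only if the second user pair
    equals the stored first user pair."""
    return video_locks.get(video_id) == frozenset((user_a, user_b))
```

A real client would combine this check with the verification sound signal (the second basis) before redisplaying the short video.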
In the method described in fig. 1, even when the user device (e.g., a mobile phone) where the social client is located is not locked, the risk that a short video published on the group session interface of a certain group is peeped at by others can be reduced through the cross-group mechanism. In addition, the method described in fig. 1 can prevent an illegal user from triggering the social client to cover a short video published on the first group session interface of the first group, thereby improving the legality and reliability of the covering of the short video.
Referring to fig. 2, fig. 2 is a schematic flow chart illustrating another short video peep-prevention method based on a cross-group mechanism according to an embodiment of the present invention. In the method shown in fig. 2, a mapping relationship between the group identifier of the first group and the group identifier of the second group may be established in advance on a group mapping setting interface provided by the social client, where the mapping relationship is used to indicate that the first group is associated with the second group. As shown in fig. 2, the anti-peeping method for short video based on cross-group mechanism may include the following steps:
201. After the social client detects that a certain short video published on the first group session interface of the first group has been viewed by a legal user of the social client, the social client prompts the user to select, from the second group, any two users who have published sound signals on the second group session interface of the second group, so as to form a first user pair.
As an optional implementation manner, when a certain short video published by the social client on the first group session interface of the first group is watched by the current user, the social client may verify whether the current user is a legal user of the social client. If so, the social client may determine whether the certain short video published on the first group session interface of the first group has been watched completely by a legal user of the social client, and if so, prompt selection, from the second group, of any two users who have published sound signals on the second group session interface of the second group, so as to form a first user pair. The verifying, by the social client, whether the current user is a legal user of the social client may include:
the method comprises the steps that a social client acquires color information of a face image of a current user;
the social client carries out binarization processing on the color information of the face image of the current user;
the social client divides the binarized face image of the current user into a plurality of pixel blocks, and performs an OR operation on the pixel values of all pixels in each pixel block to obtain an OR result for each pixel block, the results forming a down-sampled image of the current user's face image;
the social client divides the obtained down-sampled image into a plurality of pixel regions, and obtains the feature information of each pixel region of the current user's face image by summing the OR results of all pixel points in each pixel region;
and the social client judges, according to the feature information of each pixel region of the current user's face image, whether the face image matches a legal user's face image stored in advance in the social client; if so, the current user is determined to be a legal user of the social client. In this way, the subsequent covering of the short video is performed only after it has been accurately identified that the short video was watched by the legal user pre-stored by the social client, which prevents an illegal user from triggering the social client to cover a short video published on the first group session interface of the first group, thereby improving the legality and reliability of the covering of the short video.
In the embodiment of the invention, the social client may select, from the second group, each user who has published a sound signal on the second group session interface of the second group, and pop up a first user pair selection interface, the interface including the identifier of each such user; accordingly, the social client may detect the identifiers of any two users selected from the first user pair selection interface by a legal user of the social client, thereby selecting from the second group any two users who have published sound signals on the second group session interface of the second group to form the first user pair.
Here, a user who has published a sound signal on the second group session interface refers to a user whose published sound signal is visible on the second group session interface.
202. The social client associates the first user pair with the short video, and overlays the short video with a preset image on a first group session interface; the first user pair is used as a first basis for removing the preset image of the covered short video.
In the embodiment of the invention, the social client may select, from the second group, each user who has published a sound signal on the second group session interface of the second group, and pop up a first user pair selection interface, the interface including the identifier of each such user; accordingly, the social client may detect the identifiers of any two users selected from the first user pair selection interface by the current user of the social client, thereby selecting from the second group any two users who have published sound signals on the second group session interface of the second group to form the first user pair.
Here, a user who has published a sound signal on the second group session interface refers to a user whose published sound signal is visible on the second group session interface.
203. The social client detects, on the first group session interface, a removal instruction for the preset image covering the short video.
204. The social client prompts, according to the removal instruction, selection of any two users from the second group, so as to form a second user pair.
In the embodiment of the present invention, the social client may pop up a second user pair selection interface according to the removal instruction, the interface including the identifiers of all users in the second group; accordingly, the social client may detect the identifiers of any two users selected from the second user pair selection interface by a legal user of the social client, thereby selecting any two users from the second group to form the second user pair.
205. The social client judges whether the selected second user pair is the same as the first user pair serving as the first basis for removing the preset image covering the short video; if yes, go to step 206; if not, the process ends.
206. The social client removes the preset image covering the short video on the first group session interface, so as to redisplay the short video.
In the method depicted in fig. 2, the preset image covering the short video on the first group session interface is removed to redisplay the short video only when the second user pair formed by any two users selected from the second group is the same as the first user pair that is associated with the short video and whose members have posted sound signals on the second group session interface. Otherwise, if the two user pairs differ, the preset image is not removed and the short video remains covered. It can be seen that, by implementing the method described in fig. 2, even when the user equipment (e.g., a mobile phone) where the social client is located is not locked, the risk that a short video published on a group session interface of a certain group is peeped at by others can be reduced based on the cross-group mechanism. In addition, the method described in fig. 2 can prevent an illegitimate user from triggering the social client to cover the short video published on the first group session interface of the first group, thereby improving the validity and reliability of covering the short video.
Referring to fig. 3, fig. 3 is a schematic flow chart illustrating another short video peep prevention method based on a cross-group mechanism according to an embodiment of the present invention. In the method shown in fig. 3, a mapping relationship between the group identifier of the first group and the group identifier of the second group may be established in advance on a group mapping setting interface provided by the social client, where the mapping relationship is used to indicate that the first group is associated with the second group. As shown in fig. 3, the anti-peeping method for short video based on cross-group mechanism may include the following steps:
301. After the social client detects that a certain short video published on a first group session interface of a first group is viewed by a legitimate user of the social client, the social client prompts to select, from the second group, any two users who have posted sound signals on a second group session interface of the second group, so as to form a first user pair.
302. The social client associates the first user pair with the short video, prompts to select a first sound signal issued by one user included in the first user pair from the second group conversation interface, and prompts to select a second sound signal issued by another user included in the first user pair from the second group conversation interface.
The first sound signal posted by one user included in the first user pair may be a voice signal or an ambient sound signal posted by that user on the second group session interface, and the second sound signal posted by the other user included in the first user pair may be a voice signal or an ambient sound signal posted by that other user on the second group session interface.
303. And the social client synthesizes the selected first sound signal and the second sound signal to obtain a verification sound signal.
As an optional implementation manner, in an embodiment of the present invention, before performing step 303, the social client may first judge whether the first sound signal and the second sound signal are both voice signals, and perform step 303 only if both are voice signals.
For example, the social client may accurately determine whether the first sound signal is a voice signal by:
the social client performs fast Fourier transform on the first sound signal to obtain a frequency domain signal;
the social client side calculates a spectrum amplitude value according to the frequency domain signal;
the social client side calculates probability density according to the spectrum amplitude value;
the social client side calculates the spectral entropy of the first sound signal according to the probability density;
the social client determines whether the first sound signal is a voice signal according to the spectral entropy;
The social client determining whether the first sound signal is a voice signal according to the spectral entropy may include:
the social client calculates the energy of the first sound signal;
the social client determines whether the first sound signal is a voice signal according to the energy and the spectral entropy of the first sound signal. Specifically, the social client may calculate the product of the energy of the first sound signal and the spectral entropy of the first sound signal, and perform a square-root operation on the product to obtain the square-root value corresponding to the product; the social client may then judge whether the square-root value corresponding to the product is greater than a preset threshold value; if so, the first sound signal is determined to be a voice signal, and if not, the first sound signal is determined not to be a voice signal.
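The voice-detection chain above (FFT, spectrum amplitudes, probability density, spectral entropy, then comparing the square root of energy × entropy against a threshold) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the normalization used for the probability density and the threshold value are assumptions, since the patent leaves both unspecified.

```python
import numpy as np

def is_voice_signal(signal, threshold=1.0):
    """Sketch of the described voice test: FFT -> spectrum amplitudes ->
    probability density -> spectral entropy, then compare
    sqrt(energy * spectral_entropy) against a preset threshold."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal))       # spectrum amplitude values
    p = spectrum / (np.sum(spectrum) + 1e-12)    # probability density per frequency bin
    entropy = -np.sum(p * np.log2(p + 1e-12))    # spectral entropy
    energy = np.sum(signal ** 2)                 # energy of the sound signal
    score = np.sqrt(energy * entropy)            # square root of the product
    return bool(score > threshold)
```

A loud tonal signal easily clears a small threshold, while silence yields a score of zero and is rejected.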
304. The social client associates the verification sound signal with the short video, and covers the short video with a preset image on a first group session interface; the first user pair is used as a first basis for removing the preset image of the covered short video; and verifying the sound signal as a second basis for removing the preset image covered by the short video.
305. The social client detects a removal instruction for a preset image of the overlaid short video on the first group session interface.
306. And the social client prompts to select any two users from the second group according to the removing instruction so as to form a second user pair.
307. The social client side judges whether the second user pair is the same as the first user pair serving as a first basis for removing the preset image of the covered short video or not; if yes, go to step 308-step 310; if not, the process is ended.
308. The social client prompts to select, from the second group session interface, a third sound signal posted by one user included in the second user pair, and prompts to select, from the second group session interface, a fourth sound signal posted by the other user included in the second user pair.
309. And the social client synthesizes the selected third sound signal and the selected fourth sound signal to obtain a synthesized sound signal.
310. The social client judges whether the synthesized sound signal matches the verification sound signal serving as the second basis for removing the preset image that covers the short video; if so, step 311 is executed; if not, the process ends.
311. And the social client removes the preset image covered with the short video on the first group session interface so as to redisplay the short video.
As an alternative implementation, in step 303, the synthesizing, by the social client, the selected first sound signal and the second sound signal to obtain the verification sound signal includes:
the social client side determines an alignment point between the selected first sound signal and the second sound signal; wherein, the alignment point refers to a starting position of the synthesis of the first sound signal and the second sound signal; in other words, if the first sound signal and the second sound signal are to be synthesized, it is necessary to find out from which audio frame the synthesis starts, and this audio frame can be understood as the alignment point;
and the social client synthesizes the first sound signal and the second sound signal into a verification sound signal according to the alignment point.
As an alternative embodiment, the social client determines an alignment point between the selected first sound signal and the second sound signal, including:
the social client calculates a first time length of the selected first sound signal and a second time length of the second sound signal; wherein the first duration represents a time of sound duration of the first sound signal; the second time length represents the duration of the sound of the second sound signal;
the social client calculates a difference value between the first duration and the second duration;
and the social client judges whether the difference value is smaller than or equal to a preset value, if so, any one of the first sound signal and the second sound signal is subjected to periodic scaling to obtain the first sound signal and the second sound signal with the same final duration, and then the first audio frame of the first sound signal and the first audio frame of the second sound signal with the same final duration are used as an alignment point.
In the embodiment of the present invention, if the difference is smaller than or equal to the preset value, the two sound signals (i.e., the first sound signal and the second sound signal) differ only slightly in duration. In this case one of the sound signals (e.g., the first sound signal) may be periodically scaled: the longer sound signal may be periodically compressed (colloquially, "fast-forwarded") and/or the shorter sound signal may be periodically stretched (colloquially, "slowed down"), so that the final durations of the two sound signals are the same; the first audio frames of the two sound signals are then used as the alignment point.
Wherein, the value range of the preset value can be 0 to 0.1 second.
As an optional implementation, the social client performs periodic scaling on any one of the first sound signal and the second sound signal, including:
if the first time length of the first sound signal is shorter than the second time length of the second sound signal, the social client determines the proportion X of the difference value in the first time length of the first sound signal according to the difference value;
the social client calculates the audio frame number Y of the first sound signal;
the social client calculates a magnification factor Z, where Z = X × (Y/(Y−1));
and the social client side amplifies the audio frames except the first audio frame in the first sound signal in an equal proportion according to the amplification coefficient, so that the final duration of the amplified first sound signal is the same as the second duration of the second sound signal.
For example, if the first sound signal is 1 second long and contains 100 audio frames, each audio frame lasts 0.01 second; if the second sound signal is 1.1 seconds long, the first sound signal needs to be stretched to 1.1 seconds. The first frame is left unchanged and the subsequent 99 frames are stretched. First the magnification factor is determined: Z = 0.1 × (100/(100−1)) ≈ 0.101, i.e., 10.1%. Each of the subsequent 99 frames is then stretched by 10.1%, so each stretched frame lasts 0.01 × (1 + 10.1%) ≈ 0.01101 seconds, and the 99 stretched frames together last about 1.09 seconds. Adding the unchanged 0.01-second first frame, the final duration of the stretched first sound signal is 1.1 seconds, the same as the second duration of the second sound signal.
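The worked example can be checked numerically. A minimal sketch; the function names are illustrative, not taken from the patent:

```python
def magnification_factor(diff, first_duration, frame_count):
    """Z = X * (Y / (Y - 1)), where X is the share of the duration
    difference in the shorter signal's duration and Y is its frame count."""
    X = diff / first_duration
    return X * (frame_count / (frame_count - 1))

def stretched_duration(first_duration, frame_count, Z):
    """Leave the first audio frame unchanged and stretch each of the
    remaining Y - 1 frames by the factor (1 + Z)."""
    frame_len = first_duration / frame_count
    return frame_len + (frame_count - 1) * frame_len * (1 + Z)
```

For the 1-second, 100-frame signal matched against a 1.1-second signal, Z ≈ 0.101 and the stretched duration comes out to exactly 1.1 seconds.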
In the embodiment of the present invention, if the difference is greater than the preset value, the two sound signals (i.e., the first sound signal and the second sound signal) differ considerably in duration. Periodically scaling one of them would then cause relatively serious distortion and could break the subsequent verification, so a cross-correlation algorithm may be used to determine the alignment point instead. That is, when the difference is greater than the preset value, the method further includes:
the social client side uses the same default sampling frequency to respectively sample the first sound signal and the second sound signal to obtain a first sampling group and a second sampling group;
the social client generates a cross-correlation group according to a default sampling frequency (for example, 8000Hz to 10000Hz), the first sampling group, the second sampling group and the cross-correlation weight; wherein, the cross-correlation weight is positively correlated with the difference (for example, the cross-correlation weight may be 1.5 times of the difference), and the cross-correlation group includes a plurality of values;
the social client compares a plurality of values in the cross-correlation group to find out the maximum value;
the social client uses the audio frame position corresponding to the maximum value as the alignment point.
The social client generating a cross-correlation group according to the default sampling frequency, the first sampling group, the second sampling group and the cross-correlation weight may include:

S_n[t] = Σ_m W_t · x[m] · y[m − t]    (1)

where S_n[t] represents the cross-correlation group, x[m] represents the m-th sampled datum in the first sampling group, y[m − t] represents the (m − t)-th sampled datum in the second sampling group, t represents the time offset (t is an integer ranging from 0 to m), W_t represents a window function, and n = l · f, where l is the cross-correlation weight and f is the default sampling frequency.
The social client may use the audio frame position corresponding to the maximum numerical value as the alignment point as follows: after finding the maximum value, the social client can deduce from formula (1) which value of m, i.e., which sampled datum, produced it, determine which audio frame that sampled datum belongs to, and use that audio frame as the alignment point.
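Under one reading of the cross-correlation formula, the alignment search can be sketched as below. The rectangular window and the zero-padding of out-of-range samples are assumptions on my part; the patent does not fix the window function, and the cross-correlation weight only scales the scores, so it is omitted here since it does not change which offset wins.

```python
import numpy as np

def find_alignment_offset(x, y, window=None):
    """Evaluate S[t] = sum_m W_t * x[m] * y[m - t] for every offset t
    and return the t with the largest value as the alignment point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if window is None:
        window = np.ones(n)  # rectangular window (assumption)
    scores = np.empty(n)
    for t in range(n):
        # y[m - t]: shift y right by t samples, zero-padding the front
        shifted = np.concatenate([np.zeros(t), y[:n - t]])
        scores[t] = window[t] * np.dot(x, shifted)
    return int(np.argmax(scores))
```

If one sampling group is a delayed copy of the other, the returned offset recovers the delay, which is then mapped to the audio frame containing that sample.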
In the embodiment of the invention, after acquiring the first sound signal and the second sound signal, the social client does not verify the two sound signals one by one; instead, it synthesizes them into the verification sound signal and, when the preset image needs to be removed, matches the synthesized sound signal against the verification sound signal. Synthesizing the sound signals yields more parameters that can be verified (e.g., whether the alignment points are the same and whether the voiceprint features match).
As an alternative implementation, in step 310, the determining, by the social client, whether the synthesized sound signal matches the verification sound signal as a second basis for removing the preset image of the covered short video includes:
the social client judges whether the alignment point of the synthesized sound signal is the same as that of the verification sound signal serving as the second basis for removing the preset image that covers the short video; the alignment points being the same means that the mixed frame contents of the two mixed audio frames corresponding to the alignment points (one belonging to the synthesized sound signal, the other to the verification sound signal) are the same, and that the frame indices of the two mixed audio frames are also the same;
if the alignment points are the same, the social client judges whether the first multi-dimensional vector corresponding to the voiceprint features of the synthesized sound signal matches the second multi-dimensional vector corresponding to the voiceprint features of the verification sound signal; if they match, it is determined that the synthesized sound signal matches the verification sound signal; if not, it is determined that the synthesized sound signal does not match the verification sound signal;
the first multi-dimensional vector corresponding to the voiceprint features of the synthesized sound signal is composed of a Mel frequency cepstrum coefficient, a linear prediction cepstrum coefficient, a first order difference of the Mel frequency cepstrum coefficient, a first order difference of the linear prediction cepstrum coefficient, energy, a first order difference of the energy and a Gammatone filter cepstrum coefficient. In addition, the above embodiment can improve the accuracy of sound matching.
In the method described in fig. 3, even in a case where the user equipment (e.g., a mobile phone) where the social client is located is not locked, the risk that a short video posted on a group session interface of a certain group is peeped at by others can be reduced. In addition, the method described in fig. 3 can prevent an illegitimate user from triggering the social client to cover the short video published on the first group session interface of the first group, thereby improving the validity and reliability of covering the short video.
It will be understood by those skilled in the art that all or part of the steps in the methods of the embodiments described above may be implemented by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium. The storage medium includes a Read-Only Memory (ROM), a Random Access Memory (RAM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-Time Programmable Read-Only Memory (OTPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, a magnetic disk memory, a tape memory, or any other computer-readable medium that can be used to carry or store data.
The foregoing describes in detail a cross-group mechanism-based short video peeking prevention method disclosed in the embodiments of the present invention, and a specific example is applied in the present document to explain the principle and implementation manner of the present invention, and the description of the foregoing embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (4)

1. A method for anti-peeking short videos based on a cross-group mechanism, a first group and a second group being created in a social client, the method comprising:
after the social client detects that a certain short video published on a first group session interface of the first group is viewed by a legal user of the social client, the social client prompts to select any two users from the second group, wherein the two users have sound signals published on a second group session interface of the second group, so as to form a first user pair;
the social client associating the first user pair with the short video and overlaying the short video with a preset image on the first group session interface; wherein the first user pair is used as a first basis for removing the preset image covering the short video;
the method further comprises the following steps:
establishing a mapping relation between the group identifier of the first group and the group identifier of the second group on a group mapping setting interface provided by the social client in advance, wherein the mapping relation is used for representing that the first group is associated with the second group;
when the social client detects that the user equipment where the social client is located and wearable equipment wirelessly connected with the user equipment simultaneously generate the same action event, the user is presented with the group mapping setting interface;
after the social client overlays the short video with a preset image on the first group session interface, the method further comprises:
the social client detecting a removal instruction for the preset image on the first group session interface that covers the short video;
the social client prompts to select any two users from the second group according to the removing instruction so as to form a second user pair;
the social client side judges whether the selected second user pair is the same as the first user pair serving as a first basis for removing the preset image covering the short video or not;
if the short videos are the same, the social client removes the preset image which is covered by the short videos on the first group conversation interface so as to redisplay the short videos;
after the social client associates the first user pair with the short video and before overlaying the short video with a preset image on the first group session interface, the method further comprises:
the social client prompts to select a first sound signal issued by one user included in the first user pair from the second group session interface and prompts to select a second sound signal issued by the other user included in the first user pair from the second group session interface;
the social client synthesizes the selected first sound signal and the second sound signal to obtain a verification sound signal;
the social client associates the verification sound signal with the short video and executes the step of covering the short video with a preset image on the first group session interface; wherein the verification sound signal is used as a second basis for removing the preset image which covers the short video;
after the social client determines that the second user pair is the same as the first user pair serving as a first basis for removing the preset image covering the short video, and before the social client removes the preset image covering the short video on the first group session interface to redisplay the short video, the method further includes:
the social client prompts to select a third sound signal issued by one user included in the second user pair from the second group session interface and prompts to select a fourth sound signal issued by another user included in the second user pair from the second group session interface;
the social client synthesizes the selected third sound signal and the selected fourth sound signal to obtain a synthesized sound signal;
the social client side judges whether the synthesized sound signal is matched with the verification sound signal which is used as a second basis for removing the preset image covering the short video, and if the synthesized sound signal is matched with the verification sound signal, the step of removing the preset image covering the short video on the first group conversation interface is executed to redisplay the short video;
the social client synthesizing the selected first sound signal and the second sound signal to obtain a verification sound signal, including:
the social client side determines an alignment point between the selected first sound signal and the second sound signal; wherein the alignment point refers to a starting position of the synthesis of the first sound signal and the second sound signal;
the social client side synthesizes the first sound signal and the second sound signal into a verification sound signal according to the alignment point;
the social client determining an alignment point between the selected first sound signal and the second sound signal, including:
the social client side calculates a first time length of the selected first sound signal and a second time length of the second sound signal; wherein the first duration represents a duration of sound of the first sound signal; the second time duration represents a time duration of sound of the second sound signal;
the social client calculates a difference between the first duration and the second duration;
the social client side judges whether the difference value is smaller than or equal to a preset value, if yes, any one of the first sound signal and the second sound signal is subjected to periodic scaling to obtain the first sound signal and the second sound signal which are the same in final duration, and then the first audio frame of the first sound signal and the first audio frame of the second sound signal which are the same in final duration are used as an alignment point;
the social client periodically scaling either of the first sound signal and the second sound signal, comprising:
if the first time length of the first sound signal is shorter than the second time length of the second sound signal, the social client determines a ratio X of the difference value to the first time length of the first sound signal according to the difference value;
the social client calculates the audio frame number Y of the first sound signal;
the social client computing a magnification factor Z, where Z = X × (Y/(Y−1));
the social client amplifies other audio frames except the first audio frame in the first sound signal in an equal proportion according to the amplification coefficient, so that the final duration of the amplified first sound signal is the same as the second duration of the second sound signal;
before the social client synthesizes the first sound signal and the second sound signal into a verification sound signal according to the alignment point, the method further includes:
the social client side judges whether the first sound signal and the second sound signal are both voice signals, and if the first sound signal and the second sound signal are both voice signals, the step that the social client side synthesizes the first sound signal and the second sound signal into a verification sound signal according to the alignment point is triggered;
the social client determining whether the first sound signal is a voice signal comprises:
the social client performs fast Fourier transform on the first sound signal to obtain a frequency domain signal;
the social client side calculates a spectrum amplitude value according to the frequency domain signal;
the social client side calculates probability density according to the spectrum amplitude value;
the social client side calculates the spectral entropy of the first sound signal according to the probability density;
the social client determines whether the first sound signal is a voice signal according to the spectral entropy;
the social client determining whether the first sound signal is a speech signal according to spectral entropy comprises:
the social client calculating the energy of the first sound signal;
the social client calculates the product of the energy of the first sound signal and the spectral entropy of the first sound signal, and performs a square-root operation on the product to obtain the square-root value corresponding to the product;
and the social client judges whether the square-root value corresponding to the product is greater than a preset threshold value; if so, the first sound signal is determined to be a voice signal, and if not, the first sound signal is determined not to be a voice signal.
2. The method of claim 1, wherein if the difference is greater than the predetermined value, the method further comprises:
the social client side respectively samples the first sound signal and the second sound signal by using the same default sampling frequency to obtain a first sampling group and a second sampling group;
the social client generates a cross-correlation group according to the default sampling frequency, the first sampling group, the second sampling group and the cross-correlation weight; wherein the cross-correlation weight is positively correlated with the difference, and the cross-correlation group comprises a plurality of values;
the social client compares the plurality of values in the cross-correlation group to find out the maximum value;
and the social client uses the audio frame position corresponding to the maximum numerical value as an alignment point.
3. The method of claim 2, wherein the social client generates a cross-correlation set according to the default sampling frequency, the first sampling set, the second sampling set, and a cross-correlation weight, and comprises:
S_n[t] = Σ_m W_t · x[m] · y[m − t]

wherein S_n[t] represents the cross-correlation group, x[m] represents the m-th sampled datum in the first sampling group, y[m − t] represents the (m − t)-th sampled datum in the second sampling group, t represents the time offset (t is an integer ranging from 0 to m), W_t represents a window function, and n = l · f, where l is the cross-correlation weight and f is the default sampling frequency.
4. The method according to claim 3, wherein the determining, by the social client, whether the synthesized sound signal matches the verification sound signal as a second basis for removing the preset image that covers the short video comprises:
the social client determining whether the alignment point between the synthesized sound signal and the verification sound signal as a second basis for removing the preset image that has covered the short video is the same;
if the alignment points are the same, the social client judges whether the first multi-dimensional vector corresponding to the voiceprint feature of the synthesized sound signal matches the second multi-dimensional vector corresponding to the voiceprint feature of the verification sound signal; if they match, it is determined that the synthesized sound signal matches the verification sound signal; if not, it is determined that the synthesized sound signal does not match the verification sound signal;
the first multi-dimensional vector corresponding to the voiceprint feature of the synthetic sound signal is composed of a mel-frequency cepstrum coefficient, a linear prediction cepstrum coefficient, a first order difference of the mel-frequency cepstrum coefficient, a first order difference of the linear prediction cepstrum coefficient, energy, a first order difference of the energy and a Gammatone filter cepstrum coefficient.
CN201810880478.6A 2018-08-04 2018-08-04 Anti-peeping method of short video based on cross-group mechanism Active CN109165533B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810880478.6A CN109165533B (en) 2018-08-04 2018-08-04 Anti-peeping method of short video based on cross-group mechanism

Publications (2)

Publication Number Publication Date
CN109165533A CN109165533A (en) 2019-01-08
CN109165533B true CN109165533B (en) 2022-06-03




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220513

Address after: 518000 Room 601, East Tower, Nanshan Software Park, No. 10128, Shennan Avenue, Yuehai street, Nanshan District, Shenzhen, Guangdong Province

Applicant after: Shenzhen Dr. Ma Network Technology Co.,Ltd.

Address before: Room 403, No.35, Sanxiang, xiashou new village, Xicheng District, Dongguan City, Guangdong Province 523073

Applicant before: DONGGUAN HUARUI ELECTRONIC TECHNOLOGY Co.,Ltd.

GR01 Patent grant