CN110992951B - Method for protecting personal privacy based on countermeasure sample - Google Patents

Method for protecting personal privacy based on countermeasure sample

Info

Publication number
CN110992951B
Authority
CN
China
Prior art keywords
interference
voice
optimization function
module
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911228334.3A
Other languages
Chinese (zh)
Other versions
CN110992951A (en)
Inventor
付强
郭九麟
彭凝多
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Homwee Technology Co ltd
Original Assignee
Homwee Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Homwee Technology Co ltd
Priority to CN201911228334.3A
Publication of CN110992951A
Application granted
Publication of CN110992951B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Telephone Function (AREA)

Abstract

The invention discloses a method for protecting personal privacy based on adversarial examples. A smart device comprises a microphone, a loudspeaker, a voice recognition module and an interference generation module, and the method comprises the following steps: the user starts the interference generation module, which generates an interference signal and plays it through the loudspeaker; the microphone collects a voice signal, comprising the user's voice and the interference signal, and sends it to the voice recognition module for recognition; when the voice assistant of the smart device needs to be used normally, the interference generation module is turned off. The method determines the speech space to be interfered with from basic pronunciations, which is far smaller than the space of spoken content and reduces the time needed to generate adversarial examples; from an acoustic standpoint, sound signals outside the human hearing range are used as the search space, so the interference is not perceived by the user; and a universal interference m is found in the search space with a particle swarm optimization algorithm, yielding adversarial examples and protecting the user's privacy.

Description

Method for protecting personal privacy based on adversarial examples
Technical Field
The invention relates to the technical field of information security, and in particular to a method for protecting personal privacy based on adversarial examples.
Background
With the increasing maturity of artificial intelligence theory and technology, its fields of application keep expanding, greatly improving and simplifying daily life. Voice recognition is one of the directions enabled by artificial intelligence, and its accuracy keeps improving. The characteristics of voice interaction give it great potential in the Internet of Things. For example, setting an alarm on a smartphone takes only a voice instruction of a few seconds. The hands stay free, low-cost components such as the smartphone's microphone, processor and loudspeaker are used, and the task is completed with a simple voice instruction. Voice interaction is therefore simple, fast, cheap in equipment and capable of understanding context. These advantages give it a clear advantage in many scenarios such as the home, vehicles and hiking. While artificial intelligence brings convenience, it also brings problems. For example, even when the user has not turned on the voice function of a smart device, the device's voice assistant is always waiting for the user's wake-up word to activate the voice function; that is, it continuously listens to the user's speech and continuously recognizes it in the voice recognition system, and the recognized content is used for recommendations by other applications. This is obviously not what the user wants and infringes the user's privacy.
Disclosure of Invention
The invention aims to provide a method for protecting personal privacy based on adversarial examples, to solve the prior-art problem that a smart device recognizes the user's conversations and thereby compromises the user's privacy.
The invention prevents the user's voice content from being correctly recognized by the smart device through the following technical solution:
a method for protecting personal privacy based on adversarial examples, comprising a smart device that comprises a microphone, a loudspeaker and a voice recognition module, the smart device further comprising an interference generation module, the method comprising:
step S1: the user starts the interference generation module, which generates an interference signal and plays it through the loudspeaker;
step S2: the microphone collects a voice signal, comprising the user's voice and the interference signal, and sends it to the voice recognition module for recognition, so that the voice recognition module cannot recognize the original content of the user's voice;
step S3: when the voice assistant of the smart device needs to be used normally, the interference generation module is turned off.
In this method, a small interference (the interference signal) that is imperceptible to people is intentionally added to the input sample (the user's voice), so that the voice recognition module of the smart device gives an erroneous output with high confidence. In other words, adversarial examples are introduced into the field of voice recognition, and different goals can be achieved by placing different constraints on the interference signal.
Further, the interference generation module generates the interference signal as follows:
step A: the interference generation module obtains permission to call the voice recognition module;
step B: determining the speech space to be interfered with, denoted X;
step C: from an acoustic standpoint, taking sound signals outside the human audible range as the search space S;
step D: searching for a universal interference m with a particle swarm optimization algorithm, such that m satisfies the following: when the universal interference m is played together with any element of the speech space X, the voice recognition module makes an error; the universal interference m is the interference signal.
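Steps B to D can be stated compactly. As a sketch only, write $f$ for the voice recognition module (a symbol introduced here, not used in the original) and assume that playing $x$ and $m$ together amounts to the sum $x + m$ at the microphone. The universal interference sought in step D is then:

```latex
\text{find } m \in S \quad \text{such that} \quad f(x + m) \neq f(x) \quad \text{for all } x \in X
```

Example 1 relaxes "for all" in practice: the search stops once the number of members of X that are still recognized correctly falls below a preset value.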
Further, the particle swarm optimization algorithm in step D comprises:
step D1: initializing the optimal particle sequence and the optimal value of the optimization function, wherein the optimization function is defined by the errors in the voice recognition results after the universal interference m is added to every member of the speech space X;
step D2: randomly generating 99 random particle sequences, which together with the optimal particle sequence give 100 particles in total;
step D3: calculating the optimization function value of each of the 99 random particle sequences and of the optimal particle sequence, and comparing the smallest of these values with the optimal value of the optimization function; if the smallest value is less than the optimal value, setting the corresponding particle sequence as the optimal particle sequence and the smallest value as the optimal value of the optimization function;
step D4: judging whether the optimal value of the optimization function is less than a preset value; if so, taking the optimal particle sequence as the universal interference m and ending; otherwise, updating the positions and velocities of the 99 random particle sequences and returning to step D3.
Further, the interference generation module obtains permission to call the voice recognition module in step A either by access through a developer account with the manufacturer of the voice recognition module, or by directly using the voice recognition module of the smart device and obtaining the voice recognition interface by reverse engineering.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the method determines the speech space to be interfered with from basic pronunciations, which is far smaller than the space of spoken content and reduces the time needed to generate adversarial examples; from an acoustic standpoint, sound signals outside the human hearing range are taken as the search space, so the interference is not perceived by the user; and the interference m is found in the search space with a particle swarm optimization algorithm. The interference m targets the elements of the pronunciation space, and because a single interference causes many of those elements to be misrecognized, it is a universal interference; adversarial examples are thereby obtained and the user's privacy is protected.
Drawings
FIG. 1 is a functional block diagram of the present invention;
FIG. 2 is a threshold curve of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples, but the embodiments of the present invention are not limited thereto.
Example 1:
With reference to FIG. 1, a method for protecting personal privacy based on adversarial examples comprises a smart device, the smart device comprising a microphone, a loudspeaker, a voice recognition module and an interference generation module, and the method comprises:
step S1: the user starts the interference generation module, which generates an interference signal and plays it through the loudspeaker;
step S2: the microphone collects a voice signal, comprising the user's voice and the interference signal, and sends it to the voice recognition module for recognition, so that the voice recognition module cannot recognize the original content of the user's voice;
step S3: when the voice assistant of the smart device needs to be used normally, the interference generation module is turned off.
In this method, a small interference (the interference signal) that is imperceptible to people is intentionally added to the input sample (the user's voice), so that the voice recognition module of the smart device gives an erroneous output with high confidence. In other words, adversarial examples are introduced into the field of voice recognition, and different goals can be achieved by placing different constraints on the interference signal.
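The signal path of steps S1 to S3 can be sketched as follows. This is an illustration only: the function name `microphone_capture`, the sample rate, the additive mixing model and the placeholder sine waves are assumptions, not details given in the patent.

```python
import numpy as np

SAMPLE_RATE = 48_000  # Hz; assumed value, high enough to represent tones up to 24 kHz


def microphone_capture(user_voice, interference, jammer_on):
    """Model what the microphone collects in steps S1 to S3.

    With the interference generation module on (S1, S2), the microphone
    picks up the user voice plus the interference played by the loudspeaker.
    With the module off (S3), only the user voice is captured.
    """
    user_voice = np.asarray(user_voice, dtype=float)
    if not jammer_on:
        return user_voice.copy()
    interference = np.asarray(interference, dtype=float)
    # Repeat the interference so that it covers the whole utterance.
    reps = -(-len(user_voice) // len(interference))  # ceiling division
    tiled = np.tile(interference, reps)[: len(user_voice)]
    return user_voice + tiled  # assumed: simple additive superposition


t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
voice = 0.5 * np.sin(2 * np.pi * 220 * t)        # placeholder for real speech
m = 0.01 * np.sin(2 * np.pi * 19_000 * t[:480])  # placeholder interference
mixed = microphone_capture(voice, m, jammer_on=True)
clean = microphone_capture(voice, m, jammer_on=False)
```

A real device would mix the signals acoustically, so the loudspeaker and microphone responses would also shape what the voice recognition module receives; the sketch ignores that.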
Further, the interference generation module generates the interference signal as follows:
step A: the interference generation module obtains permission to call the voice recognition module; the module can be accessed through a developer account with the manufacturer of the voice recognition module, which only requires registering a developer account with the cloud service;
alternatively, the voice recognition module of the smart device is used directly: the device vendor calls it in the normal way, and the (third-party) voice recognition interface can be obtained by reverse engineering;
step B: determining the speech space to be interfered with, denoted X; compared with the space of everything a person might say, the space of basic pronunciations is much smaller. For example, the combinations of initials and finals in Chinese, excluding those that cannot be pronounced, total just over 400 syllables; these 400-odd syllables are collected to form the speech space to be interfered with, denoted X;
step C: from an acoustic standpoint, taking sound signals outside the human audible range as the search space S. For the interference not to be perceived by the user, acoustics must be taken into account: the human ear cannot sense sound at all frequencies and all intensities, only within a certain range of sound pressure and frequency. The frequency range of normal human hearing is 20 Hz to 20 kHz. Typically, young people can hear up to 20 kHz, while in older people the upper limit drops to 10 kHz. The audible intensity range of normal hearing is 0 to 120 dB SPL (sound pressure level). The pure-tone threshold, also called the absolute threshold or the threshold in quiet, is the minimum sound pressure level at which the human ear can just hear a sound in a quiet environment with no other sound interference; it is expressed in dB and depends on frequency, so each frequency has its own absolute hearing threshold. FIG. 2 shows the "absolute hearing threshold" curve obtained according to the formula. The region below the threshold curve is the search space.
step D: searching for a universal interference m with a particle swarm optimization algorithm, such that m satisfies the following: when the universal interference m is played together with any element of the speech space X, the voice recognition module makes an error; the universal interference m is the interference signal.
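Step C refers to a formula for the absolute hearing threshold curve without reproducing it. One widely used approximation is Terhardt's formula for the threshold in quiet; whether the inventors used this particular formula is an assumption. The sketch below evaluates it and tests whether a pure tone of a given level falls below the curve, that is, inside the search space S. The function names are hypothetical.

```python
import numpy as np


def absolute_threshold_db_spl(freq_hz):
    """Terhardt's approximation of the threshold in quiet, in dB SPL.

    Assumed stand-in for the unspecified formula behind FIG. 2.
    """
    khz = np.asarray(freq_hz, dtype=float) / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * np.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def in_search_space(freq_hz, level_db_spl):
    """True where a pure tone lies below the threshold curve (inaudible)."""
    threshold = absolute_threshold_db_spl(freq_hz)
    return np.asarray(level_db_spl, dtype=float) < threshold


freqs = np.array([100.0, 1000.0, 3300.0, 18000.0])
print(absolute_threshold_db_spl(freqs))
print(in_search_space(freqs, 10.0))
```

The curve is lowest around 3 to 4 kHz, where hearing is most sensitive, and rises steeply toward both ends of the audible range, so under this approximation most of the room for inaudible interference lies at very low and very high frequencies.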
The optimization function in the particle swarm optimization algorithm is defined by the errors in the voice recognition results after the universal interference m is added to every member of the speech space X. For example, a correct recognition (the recognition result is consistent with the collected syllable) scores 1, and an error (the recognition result is inconsistent with the collected syllable) scores 0. The search space of the optimization is S from step C. With this definition, if every member of the speech space X is still recognized correctly after m is added, the value of the optimization function is the total number of pronunciations, that is, the number of members of X; if every member is misrecognized after m is added, the value is 0. The m being sought causes most members of X to be misrecognized, that is, the minimum of the optimization function reaches the preset value.
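The optimization function described above can be sketched as a count of the members of X that are still recognized correctly once m is added. In this sketch the recognizer is passed in as a plain callable, `toy_recognize` is a deliberately trivial stand-in for a real voice recognition interface, and additive mixing with m at least as long as each recording is assumed; none of these names come from the patent.

```python
import numpy as np


def optimization_function(m, speech_space, recognize):
    """Count members of X that are still recognized correctly after adding m.

    speech_space: list of (audio, label) pairs, one per basic pronunciation.
    recognize:    callable mapping an audio array to a label (the ASR call).
    Returns len(speech_space) if m has no effect and 0 if every member fails.
    """
    correct = 0
    for audio, label in speech_space:
        mixed = audio + m[: len(audio)]  # assumed: len(m) >= len(audio)
        if recognize(mixed) == label:    # correct scores 1, error scores 0
            correct += 1
    return correct


def toy_recognize(audio):
    """Hypothetical recognizer: the label depends on the sign of the mean."""
    return "a" if audio.mean() > 0 else "b"


X = [(np.full(8, 0.1), "a"), (np.full(8, -0.1), "b")]
no_interference = optimization_function(np.zeros(8), X, toy_recognize)
with_interference = optimization_function(np.full(8, 0.5), X, toy_recognize)
```

Here `no_interference` counts both members as correct, while the constant offset in `with_interference` pushes the second member across the toy decision boundary, so the count drops by one.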
The solution proceeds as follows:
step D1: initializing the optimal particle sequence and the optimal value of the optimization function;
step D2: randomly generating 99 random particle sequences, which together with the optimal particle sequence give 100 particles in total;
step D3: calculating the optimization function value of each of the 99 random particle sequences and of the optimal particle sequence, and comparing the smallest of these values with the optimal value of the optimization function; if the smallest value is less than the optimal value, setting the corresponding particle sequence as the optimal particle sequence and the smallest value as the optimal value of the optimization function;
step D4: judging whether the optimal value of the optimization function is less than a preset value; if so, taking the optimal particle sequence as the universal interference m and ending; otherwise, updating the positions and velocities of the 99 random particle sequences and returning to step D3.
Through the above steps, a universal interference m is found, which is the desired result.
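Steps D1 to D4 can be sketched as the loop below. The patent does not give the position and velocity update rule, so the standard particle swarm update with inertia `w` and acceleration coefficients `c1`, `c2` is assumed here, as are the per-particle best positions, the box bounds `low` and `high`, and the iteration cap `max_iter`. A toy quadratic objective stands in for the recognition-based optimization function.

```python
import numpy as np


def pso_search(objective, dim, preset, low=-1.0, high=1.0,
               n_random=99, max_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` following steps D1 to D4 (update rule assumed)."""
    rng = np.random.default_rng(seed)
    # D1: initialise the optimal particle sequence and the optimal value.
    best_pos = rng.uniform(low, high, dim)
    best_val = objective(best_pos)
    # D2: 99 random particles; with the optimal particle this makes 100.
    pos = rng.uniform(low, high, (n_random, dim))
    vel = np.zeros((n_random, dim))
    pbest_pos = pos.copy()
    pbest_val = np.full(n_random, np.inf)
    for _ in range(max_iter):
        # D3: evaluate the particles and compare the smallest value with the
        # stored optimal value (the optimal particle's own value is best_val).
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest_pos[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        i = int(np.argmin(vals))
        if vals[i] < best_val:
            best_pos, best_val = pos[i].copy(), vals[i]
        # D4: stop once the optimal value is below the preset value.
        if best_val < preset:
            break
        # Otherwise update velocities and positions of the 99 particles.
        r1 = rng.random((n_random, dim))
        r2 = rng.random((n_random, dim))
        vel = w * vel + c1 * r1 * (pbest_pos - pos) + c2 * r2 * (best_pos - pos)
        pos = np.clip(pos + vel, low, high)
    return best_pos, best_val


def toy_objective(p):
    """Stand-in objective; a real run would count correct recognitions."""
    return float(np.sum(p ** 2))


m, value = pso_search(toy_objective, dim=4, preset=1e-3)
```

In the setting of the patent, each particle would encode a candidate interference restricted to the search space S, and every objective evaluation would call the voice recognition module once per member of X, which is why a small speech space matters for running time.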
The interference generation module obtains permission to call the voice recognition module in order to verify whether the added interference actually causes voice recognition errors.
Although the present invention has been described herein with reference to the illustrated embodiments thereof, which are intended to be preferred embodiments of the present invention, it is to be understood that the invention is not limited thereto, and that numerous other modifications and embodiments can be devised by those skilled in the art that will fall within the spirit and scope of the principles of this disclosure.

Claims (3)

1. A method for protecting personal privacy based on adversarial examples, comprising a smart device, the smart device comprising a microphone, a loudspeaker and a voice recognition module, wherein the smart device further comprises an interference generation module, the method comprising:
step S1: the user starts the interference generation module, which generates an interference signal and plays it through the loudspeaker; the interference generation module generates the interference signal as follows:
step A: the interference generation module obtains permission to call the voice recognition module;
step B: determining the speech space to be interfered with, denoted X;
step C: from an acoustic standpoint, taking sound signals outside the human audible range as the search space S;
step D: searching for a universal interference m with a particle swarm optimization algorithm, such that m satisfies the following: when the universal interference m is played together with any element of the speech space X, the voice recognition module makes an error; the universal interference m is the interference signal;
step S2: the microphone collects a voice signal and sends it to the voice recognition module for recognition, wherein the voice signal comprises the user's voice and the interference signal;
step S3: when the voice assistant of the smart device needs to be used normally, the interference generation module is turned off.
2. The method according to claim 1, wherein the particle swarm optimization algorithm in step D comprises:
step D1: initializing the optimal particle sequence and the optimal value of the optimization function, wherein the optimization function is defined by the errors in the voice recognition results after the universal interference m is added to every member of the speech space X;
step D2: randomly generating 99 random particle sequences, which together with the optimal particle sequence give 100 particles in total;
step D3: calculating the optimization function value of each of the 99 random particle sequences and of the optimal particle sequence, and comparing the smallest of these values with the optimal value of the optimization function; if the smallest value is less than the optimal value, setting the corresponding particle sequence as the optimal particle sequence and the smallest value as the optimal value of the optimization function;
step D4: judging whether the optimal value of the optimization function is less than a preset value; if so, taking the optimal particle sequence as the universal interference m and ending; otherwise, updating the positions and velocities of the 99 random particle sequences and returning to step D3.
3. The method for protecting personal privacy based on adversarial examples according to claim 1 or 2, wherein the interference generation module obtains permission to call the voice recognition module in step A by access through a developer account with the manufacturer of the voice recognition module, or by directly using the voice recognition module of the smart device and obtaining the voice recognition interface by reverse engineering.
CN201911228334.3A 2019-12-04 2019-12-04 Method for protecting personal privacy based on countermeasure sample Active CN110992951B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911228334.3A CN110992951B (en) 2019-12-04 2019-12-04 Method for protecting personal privacy based on countermeasure sample

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911228334.3A CN110992951B (en) 2019-12-04 2019-12-04 Method for protecting personal privacy based on countermeasure sample

Publications (2)

Publication Number Publication Date
CN110992951A CN110992951A (en) 2020-04-10
CN110992951B true CN110992951B (en) 2022-07-26

Family

ID=70089915

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911228334.3A Active CN110992951B (en) 2019-12-04 2019-12-04 Method for protecting personal privacy based on countermeasure sample

Country Status (1)

Country Link
CN (1) CN110992951B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113129875A (en) * 2021-03-12 2021-07-16 嘉兴职业技术学院 Voice data privacy protection method based on countermeasure sample

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040125922A1 (en) * 2002-09-12 2004-07-01 Specht Jeffrey L. Communications device with sound masking system
CN102543066B (en) * 2011-11-18 2014-04-02 中国科学院声学研究所 Target voice privacy protection method and system
US10282166B2 (en) * 2017-05-03 2019-05-07 The Reverie Group, Llc Enhanced control, customization, and/or security of a sound controlled device such as a voice controlled assistance device
US11404038B2 (en) * 2018-04-26 2022-08-02 Ronald J. Zenk Privacy sleeve for smart speakers
CN109036389A (en) * 2018-08-28 2018-12-18 出门问问信息科技有限公司 The generation method and device of a kind of pair of resisting sample
CN108831471B (en) * 2018-09-03 2020-10-23 重庆与展微电子有限公司 Voice safety protection method and device and routing terminal
CN109902705A (en) * 2018-10-30 2019-06-18 华为技术有限公司 A kind of object detection model to disturbance rejection generation method and device
CN109887496A (en) * 2019-01-22 2019-06-14 浙江大学 Orientation confrontation audio generation method and system under a kind of black box scene
CN110048797A (en) * 2019-04-10 2019-07-23 中国科学院声学研究所 A kind of acoustics protective device of prevention audio-frequency information leakage

Also Published As

Publication number Publication date
CN110992951A (en) 2020-04-10

Similar Documents

Publication Publication Date Title
US11823679B2 (en) Method and system of audio false keyphrase rejection using speaker recognition
US20220295194A1 (en) Interactive system for hearing devices
Principi et al. An integrated system for voice command recognition and emergency detection based on audio signals
US11521598B2 (en) Systems and methods for classifying sounds
JP2020016875A (en) Voice interaction method, device, equipment, computer storage medium, and computer program
US20190138603A1 (en) Coordinating Translation Request Metadata between Devices
CN112352441B (en) Enhanced environmental awareness system
WO2015023751A1 (en) Device for language processing enhancement in autism
WO2019228329A1 (en) Personal hearing device, external sound processing device, and related computer program product
Guo et al. Specpatch: Human-in-the-loop adversarial audio spectrogram patch attack on speech recognition
US20220122605A1 (en) Method and device for voice operated control
CN108476072A (en) Crowdsourcing database for voice recognition
JP2019028465A (en) Speaker verification method and speech recognition system
CN110992951B (en) Method for protecting personal privacy based on countermeasure sample
Liu et al. Defending against microphone-based attacks with personalized noise
CN111800700B (en) Method and device for prompting object in environment, earphone equipment and storage medium
JP2024510779A (en) Voice control method and device
WO2008075305A1 (en) Method and apparatus to address source of lombard speech
JP6918471B2 (en) Dialogue assist system control method, dialogue assist system, and program
Vovos et al. Speech operated smart-home control system for users with special needs.
US11275551B2 (en) System for voice-based alerting of person wearing an obstructive listening device
TWI824424B (en) Hearing aid calibration device for semantic evaluation and method thereof
JP2008286921A (en) Keyword extraction device, keyword extraction method, and program and recording medium therefor
Cheng Acoustic-channel attack and defence methods for personal voice assistants
CN110166863B (en) In-ear voice device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant