CN112565207A - Non-invasive intelligent sound box safety evidence obtaining system and method thereof - Google Patents

Non-invasive intelligent sound box safety evidence obtaining system and method thereof

Info

Publication number
CN112565207A
Authority
CN
China
Prior art keywords
user
sound box
intelligent sound
network
network flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011315413.0A
Other languages
Chinese (zh)
Other versions
CN112565207B (en)
Inventor
伏晓
林丽
骆斌
刘轩宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN202011315413.0A
Publication of CN112565207A
Application granted
Publication of CN112565207B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/09 Mapping addresses
    • H04L61/10 Mapping addresses of different types
    • H04L61/103 Mapping addresses of different types across network layers, e.g. resolution of network layer into physical layer addresses or address resolution protocol [ARP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025 Phonemes, fenemes or fenones being the recognition units
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

A non-invasive smart speaker security forensics system and a method thereof. The system comprises a network traffic analysis module, a user intention acquisition module and a forensics analysis module. The method verifies the one-to-one mapping between network traffic patterns and smart speaker behavior, and supports security monitoring of the smart speaker by combining network traffic analysis, user intention extraction and abnormal-traffic alarms. The method can monitor security risks, protect user privacy, enhance the security of the smart speaker, and report related risks or abnormal conditions to the user through an app.

Description

Non-invasive smart speaker security forensics system and method thereof
Technical Field
The invention relates to a non-invasive smart speaker security forensics system and a method thereof, and belongs to the technical field of computer forensics.
Background
In the past few years, the popularity of Internet of Things (IoT) consumer devices, especially smart speakers, has increased rapidly. Many smart speakers are on the market, including the Xiaomi smart speaker, Tmall Genie, Amazon Echo, Google Home, and the Huawei AI speaker. Smart speakers offer users real convenience, but they also raise security problems. A smart speaker typically stays connected to a cloud server. During this time, even if the user has not activated the speaker, it may eavesdrop on daily conversations and upload them to the cloud server, thereby violating user privacy. Sensitive information such as passwords, credit card numbers and home addresses is easily compromised.
Among existing approaches, digital forensics appears to be the most effective way to protect user privacy and enhance the security of smart speakers. However, existing smart speaker forensics methods are usually invasive: they either modify the internals of the speaker or require the support of the cloud server. Because smart speakers are not open source, a non-invasive method that can work independently is a better choice.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a non-invasive smart speaker security forensics system and method that implement a smart speaker alarm app and provide graphical and text alarms, so that evidence is obtained in real time when abnormal behavior occurs. The techniques involved are relatively mature and can be applied in a variety of situations without major technical challenges.
To achieve this purpose, the technical scheme of the invention is as follows: a non-invasive smart speaker security forensics system comprises a network traffic analysis module, a user intention acquisition module and a forensics analysis module; the network traffic analysis module and the user intention acquisition module jointly support the forensics analysis module.
The network traffic analysis module is used for monitoring the network traffic between the smart speaker and the cloud server and analyzing network traffic patterns;
the user intention acquisition module is used for determining the intention of the user, namely the operation that the user wants the smart speaker to perform;
the forensics analysis module combines the network traffic analysis module and the user intention acquisition module to monitor smart speaker events and collect the related forensic evidence. It verifies the one-to-one mapping between traffic patterns and smart speaker events, which helps protect user privacy and enhance the security of the smart speaker. It can also alert the user to risks or abnormal conditions related to the smart speaker.
As an improvement of the invention, the forensics analysis module comprises an instruction and user intention comparison module and an abnormality alarm sub-module.
(1) The instruction and user intention comparison module determines whether the smart speaker works as expected by comparing the user intention with the smart speaker behavior. If sensitive keywords from the user are detected, the behavior of the smart speaker at that moment is monitored to determine whether abnormal network traffic exists.
(2) The abnormality alarm sub-module sends an alarm and the abnormal-traffic details through the app to notify the user promptly when the network traffic of the smart speaker does not match the user intention, or when the smart speaker listens in on the user's sensitive information. If the network traffic pattern does not match the user's instruction, the abnormal traffic is marked in red in the network traffic graph. If the user mentions sensitive information and the traffic increases abnormally, the abnormal traffic is marked in yellow.
A forensics method using the non-invasive smart speaker security forensics system comprises the following steps:
step one, network traffic analysis stage: a man-in-the-middle method is adopted. A device running ARP spoofing software acts as the man in the middle and captures all network packets between the smart speaker and the cloud server. Network traffic patterns are extracted from the captured packets, and the current event of the smart speaker is inferred. The inferred event is then used to determine whether the network traffic is consistent with the user's instruction and whether the smart speaker is recording the user's voice in real time;
step two, user intention extraction stage: a recording device with a microphone records the conversation between the smart speaker and the user. The user's speech is converted into text in real time, and the user's intention is extracted from the text by keyword matching. Once the user's intention is obtained, it is checked against the behavior of the smart speaker;
step three, forensics analysis stage: network traffic analysis and the user intention are combined to monitor the behavior of the smart speaker and collect the relevant forensic information. The current behavior of the smart speaker is obtained from the result of the network traffic analysis, and the current user instruction to the smart speaker is obtained from the extracted user intention. If the behavior and the instruction are inconsistent, the behavior is considered abnormal. When an abnormal condition occurs, an alarm is sent to the user through the alarm app, and the user decides what to do after reviewing the detailed forensic information in the user interface.
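The comparison in step three can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Observation` type, the event names and the convention that `None` means "no activity" or "no instruction" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    # Behavior inferred from network traffic; None when the speaker appears idle (assumed convention).
    inferred_event: Optional[str]
    # Instruction extracted from the user's speech; None when nothing was asked (assumed convention).
    user_intent: Optional[str]


def is_abnormal(obs: Observation) -> bool:
    """The behavior is abnormal when it is inconsistent with the instruction."""
    return obs.inferred_event != obs.user_intent


def build_alert(obs: Observation) -> Optional[str]:
    """Return an alert message for the alarm app, or None when nothing is wrong."""
    if not is_abnormal(obs):
        return None
    return (
        f"Abnormal behavior: speaker did {obs.inferred_event or 'nothing'}, "
        f"user asked for {obs.user_intent or 'nothing'}"
    )


# Hypothetical event names, for illustration only.
print(build_alert(Observation("play_music", "play_music")))
print(build_alert(Observation("upload_audio", None)))
```

Treating "no instruction" and "no activity" as the same `None` value keeps the consistency check to a single comparison; a real system would likely need a richer mapping between intents and expected events.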
As an improvement of the invention, the network traffic analysis stage of step one comprises the following specific steps:
(1) ARP spoofing software is used to spoof the smart speaker and intercept its network traffic;
(2) network traffic patterns are extracted, and the one-to-one mapping between network traffic patterns and smart speaker behavior is analyzed. The behavior of the smart speaker is analyzed by observing the number of packets per period captured by the packet capture software;
(3) once the mapping between the various network traffic patterns and the behaviors of the smart speaker has been found, the network traffic analysis stage ends.
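As a rough sketch of sub-step (2), the number of packets per period can be computed from packet timestamps and compared against per-behavior signatures. The function names, the use of the peak packet count as the signature, and the threshold values below are all hypothetical; the source does not state how patterns are represented.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple


def packets_per_period(timestamps: Iterable[float], period: float = 1.0) -> List[int]:
    """Count captured packets in consecutive windows of `period` seconds."""
    ts = sorted(timestamps)
    if not ts:
        return []
    start = ts[0]
    counts = Counter(int((t - start) // period) for t in ts)
    return [counts.get(i, 0) for i in range(max(counts) + 1)]


def infer_behavior(
    counts: List[int], signatures: Dict[str, Tuple[int, int]]
) -> Optional[str]:
    """Map the peak packet rate to the behavior whose (low, high) range contains it."""
    if not counts:
        return None
    peak = max(counts)
    for behavior, (low, high) in signatures.items():
        if low <= peak <= high:
            return behavior
    return None


# Invented thresholds: real values would come from observing a specific speaker.
SIGNATURES = {"heartbeat": (1, 5), "voice_command": (6, 1000)}

counts = packets_per_period([0.0, 0.2, 0.9, 1.1, 3.5], period=1.0)
print(counts)                              # [3, 1, 0, 1]
print(infer_behavior(counts, SIGNATURES))  # heartbeat
```

For the mapping to be one-to-one, the signature ranges must not overlap; the sketch simply returns the first match and does not enforce that.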
In step two, a recording device with a microphone is used to record the conversation between the smart speaker and the user. By extracting the user intention, the operation that the user wants the smart speaker to perform is determined, and it is checked whether the smart speaker follows the user's instructions. This approach is relatively mature and can be applied in a variety of situations without major technical challenges.
In step two, an end-to-end automatic speech recognition (ASR) system implemented in TensorFlow is used to convert the monitored speech into text. The specific operation steps are as follows:
(1) First, speech enters the ASR system as audio input.
(2) Then, feature extraction converts it into speech feature vectors for subsequent processing and recognition.
(3) The speech feature vectors and labels are then provided to an acoustic model (the core module of the whole ASR system) for training. Two kinds of labels can be used: characters or phonemes. When the trained acoustic model makes predictions in the testing stage, it outputs results according to the labels used during training. If characters are used as labels, the model returns characters. If phonemes are used as labels, the model outputs phonemes.
(4) Finally, the result is output.
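The feature-extraction step (2) can be illustrated with a small, dependency-free sketch that cuts audio into overlapping frames and computes one feature vector per frame. The source does not say which features are used; log energy and zero-crossing rate are chosen here only as simple stand-ins, and the frame length and hop size (25 ms and 10 ms at an assumed 16 kHz sampling rate) are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def frame_features(
    samples: Sequence[float], frame_len: int = 400, hop: int = 160
) -> List[Tuple[float, float]]:
    """Return (log_energy, zero_crossing_rate) for each overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        log_energy = math.log(energy + 1e-10)  # small offset avoids log(0) on silence
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((log_energy, crossings / (frame_len - 1)))
    return features


# One second of a 440 Hz tone at the assumed 16 kHz sampling rate.
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
vectors = frame_features(tone)
print(len(vectors))  # 98 frames
```

The resulting sequence of vectors is what would be fed, together with character or phoneme labels, to the acoustic model described in step (3).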
As an improvement of the invention, the method for extracting the user's intention from the text by keyword matching is specifically as follows: keywords are matched using ToPMine, a topic-model-based approach. Adjacent words in the text are aggregated into phrases according to the result of the topic analysis, and high-frequency phrases are then selected as key phrases. First, a set of common user instructions is collected according to the functions provided by the smart speaker. The instruction text is segmented by phrase mining, and a topic model is run on it. Thereafter, each word in a phrase is associated with the same topic, and the phrases that are successfully matched to a topic are taken as keywords.
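The idea of aggregating adjacent words into phrases and keeping the frequent ones can be sketched as below. This is a deliberately simplified, frequency-only stand-in, not ToPMine itself: it mines only two-word phrases and omits the topic model entirely. The function names, the `min_support` parameter and the intent labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def mine_key_phrases(commands: List[str], min_support: int = 2) -> List[str]:
    """Collect adjacent word pairs that occur at least `min_support` times."""
    counts: Counter = Counter()
    for command in commands:
        words = command.lower().split()
        counts.update(" ".join(pair) for pair in zip(words, words[1:]))
    return sorted(p for p, c in counts.items() if c >= min_support)


def extract_intent(text: str, intent_phrases: Dict[str, List[str]]) -> Optional[str]:
    """Return the first intent whose key phrase appears in the transcript."""
    normalized = " " + " ".join(text.lower().split()) + " "
    for intent, phrases in intent_phrases.items():
        if any(f" {phrase} " in normalized for phrase in phrases):
            return intent
    return None


# A tiny, made-up corpus of common user instructions.
corpus = [
    "play music",
    "play music please",
    "set alarm",
    "set alarm for seven",
    "what time is it",
]
print(mine_key_phrases(corpus))  # ['play music', 'set alarm']

intents = {"play_music": ["play music"], "set_alarm": ["set alarm"]}
print(extract_intent("Please play music now", intents))  # play_music
```

Padding the transcript with spaces makes the match word-aligned, so a phrase such as "set alarm" does not fire inside "reset alarms".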
As an improvement of the invention, the abnormality alarm sub-module works in the following modes:
mode 1: the alarm app sends a short message to the user to notify them of the abnormal behavior;
mode 2: the alarm app displays a network traffic graph of the abnormal behavior. If the network traffic pattern does not match the user's instruction, the abnormal traffic is marked in red in the network traffic graph. If the user mentions sensitive information and the traffic increases abnormally, the abnormal traffic is marked in yellow.
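The red and yellow marking rules of mode 2 can be written as a small decision function. The sketch below is illustrative only: the keyword list, the definition of an "abnormal increase" as a multiple of the baseline mean, the factor of 3 and the choice to let red take precedence when both conditions hold are all assumptions not stated in the source.

```python
from statistics import mean
from typing import Optional, Sequence

# Example keywords based on the sensitive items named in the Background section.
SENSITIVE_KEYWORDS = ("password", "credit card", "home address")


def mentions_sensitive(text: str, keywords: Sequence[str] = SENSITIVE_KEYWORDS) -> bool:
    """Check whether the transcript contains any sensitive keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def is_spike(baseline: Sequence[int], current: int, factor: float = 3.0) -> bool:
    """Treat traffic as abnormally high when it exceeds `factor` times the baseline mean."""
    if not baseline:
        return False
    return current > factor * mean(baseline)


def alert_color(
    traffic_matches_intent: bool, sensitive_mentioned: bool, traffic_spike: bool
) -> Optional[str]:
    """Return 'red', 'yellow', or None following the two marking rules."""
    if not traffic_matches_intent:
        return "red"
    if sensitive_mentioned and traffic_spike:
        return "yellow"
    return None


spoken = "My password is on the sticky note"
print(alert_color(True, mentions_sensitive(spoken), is_spike([2, 3, 4], 10)))  # yellow
```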
In step one, the ARP man-in-the-middle technique is used. Two hosts in the same local area network communicate by MAC address. Two hosts that are not in the same subnet can also communicate: the data is sent to their respective gateway routers, and the IP addresses of the gateways are used to carry the communication. However, when a gateway communicates with hosts in its own local area network, it still relies on MAC addressing. Thus, if an attacker impersonates the gateway for spoofing purposes, the gateway's entry in the ARP cache of the target host must be changed to the attacker's MAC address.
The specific operation steps are as follows:
(1) The host is disguised as the gateway, so that all data sent from the smart speaker to the external network passes through the host and can be analyzed there.
(2) Wireshark is opened and network traffic is captured in real time. The information observed from the packets mainly includes packet length, packet direction, timestamp and any plaintext information.
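Once a capture has been exported, the per-packet fields named above (length, direction, timestamp) can be pulled out with a few lines of code. The sketch assumes the capture was exported from Wireshark as CSV with `Time`, `Source`, `Destination` and `Length` columns; the `PacketMeta` type, the `load_packets` helper and the IP addresses in the sample are hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class PacketMeta:
    timestamp: float  # seconds since the start of the capture
    length: int       # packet length in bytes
    direction: str    # "up" = speaker to cloud, "down" = cloud to speaker


def load_packets(csv_text: str, speaker_ip: str) -> List[PacketMeta]:
    """Keep only packets that involve the speaker and label their direction."""
    packets = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Source"] == speaker_ip:
            direction = "up"
        elif row["Destination"] == speaker_ip:
            direction = "down"
        else:
            continue  # unrelated traffic on the same network
        packets.append(PacketMeta(float(row["Time"]), int(row["Length"]), direction))
    return packets


# Made-up sample rows; addresses are placeholders.
SAMPLE = '''"No.","Time","Source","Destination","Protocol","Length","Info"
"1","0.000000","192.168.1.50","203.0.113.10","TLSv1.2","583","Application Data"
"2","0.120000","203.0.113.10","192.168.1.50","TCP","66","ACK"
"3","0.300000","192.168.1.7","192.168.1.1","DNS","74","query"
'''

for packet in load_packets(SAMPLE, speaker_ip="192.168.1.50"):
    print(packet)
```

Because traffic between the speaker and the cloud is typically encrypted, metadata such as these fields is often all that is available for pattern analysis.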
Compared with the prior art, the invention has the following beneficial effects:
(1) The method performs security forensics on the smart speaker non-invasively, which avoids the difficulties caused by the smart speaker being closed source.
(2) The method uses a recording device with a microphone to record the conversation between the smart speaker and the user, extracts the user's intention, determines the operation the user wants the smart speaker to perform, and checks whether the smart speaker follows the user's instructions. There are several alternative methods of extracting the user's intention:
1) Cached data that records the conversation between the smart speaker and the user is obtained from the smart speaker application. However, the cached data may be encrypted, and some applications may not retain the data at all. This method therefore has limitations and is not general.
2) The UI component of the message dialog is obtained from the smart speaker application, and the dialog between the smart speaker and the user is analyzed to extract the user intention. However, the data behind the UI is not always available, and the approach requires the application to keep running.
Therefore, the method analyzes the user's speech to obtain the content of the conversation: it records the conversation, converts it into text, matches keywords in the text and extracts the user intention. This approach is relatively mature and can be applied in a variety of situations without major technical challenges;
(3) The method verifies the one-to-one mapping between traffic patterns and smart speaker behavior, which helps protect user privacy and enhance the security of the smart speaker.
(4) The method not only sends timely short messages to the user when the traffic is abnormal, but also provides an alarm app. By checking the alarm prompts and the abnormal-traffic information, the user can clearly see whether the smart speaker is listening in on their daily activities and whether its behavior is consistent with their requests.
Drawings
FIG. 1 is a block diagram of the smart speaker security forensics method and its cooperative workflow;
FIG. 2 is a flow chart of the speech recognition principle;
FIG. 3 is a diagram of the user interface that provides forensic details.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Example 1: referring to FIG. 1, a non-invasive smart speaker security forensics system comprises a network traffic analysis module, a user intention acquisition module and a forensics analysis module; the network traffic analysis module and the user intention acquisition module jointly support the forensics analysis module.
The network traffic analysis module is used for monitoring the network traffic between the smart speaker and the cloud server and analyzing network traffic patterns;
the user intention acquisition module is used for determining the intention of the user, namely the operation that the user wants the smart speaker to perform;
the forensics analysis module combines the network traffic analysis module and the user intention acquisition module to monitor smart speaker events and collect the related forensic evidence.
The forensics analysis module comprises an instruction and user intention comparison module and an abnormality alarm sub-module.
(1) The instruction and user intention comparison module determines whether the smart speaker works as expected by comparing the user intention with the smart speaker behavior. If sensitive keywords from the user are detected, the behavior of the smart speaker at that moment is monitored to determine whether abnormal network traffic exists.
(2) The abnormality alarm sub-module sends an alarm and the abnormal-traffic details through the app to notify the user promptly when the network traffic of the smart speaker does not match the user intention, or when the smart speaker listens in on the user's sensitive information. If the network traffic pattern does not match the user's instruction, the abnormal traffic is marked in red in the network traffic graph. If the user mentions sensitive information and the traffic increases abnormally, the abnormal traffic is marked in yellow.
Example 2: referring to FIGS. 1-3, a forensics method using the non-invasive smart speaker security forensics system comprises the following steps:
step one, network traffic analysis stage: a man-in-the-middle method is adopted. A device running ARP spoofing software acts as the man in the middle and captures all network packets between the smart speaker and the cloud server. Network traffic patterns are extracted from the captured packets, and the current event of the smart speaker is inferred. The inferred event is then used to determine whether the network traffic is consistent with the user's instruction and whether the smart speaker is recording the user's voice in real time;
step two, user intention extraction stage: a recording device with a microphone records the conversation between the smart speaker and the user. The user's speech is converted into text in real time, and the user's intention is extracted from the text by keyword matching. Once the user's intention is obtained, it is checked against the behavior of the smart speaker;
step three, forensics analysis stage: network traffic analysis and the user intention are combined to monitor the behavior of the smart speaker and collect the relevant forensic information. The current behavior of the smart speaker is obtained from the result of the network traffic analysis, and the current user instruction to the smart speaker is obtained from the extracted user intention. If the behavior and the instruction are inconsistent, the behavior is considered abnormal. When an abnormal condition occurs, an alarm is sent to the user through the alarm app, and the detailed forensic information can be reviewed in the user interface.
The network traffic analysis stage of step one comprises the following specific steps:
(1) ARP spoofing software is used to spoof the smart speaker and intercept its network traffic;
(2) network traffic patterns are extracted, and the one-to-one mapping between network traffic patterns and smart speaker behavior is analyzed. The behavior of the smart speaker is analyzed by observing the number of packets per period captured by the packet capture software;
(3) once the mapping between the various network traffic patterns and the behaviors of the smart speaker has been found, the network traffic analysis stage ends.
In step two, a recording device with a microphone is used to record the conversation between the smart speaker and the user. By extracting the user intention, the operation that the user wants the smart speaker to perform is determined, and it is checked whether the smart speaker follows the user's instructions. This approach is relatively mature and can be applied in a variety of situations without major technical challenges.
In step two, an end-to-end automatic speech recognition (ASR) system implemented in TensorFlow is used to convert the monitored speech into text. The specific operation steps are as follows:
(1) First, speech enters the ASR system as audio input.
(2) Then, feature extraction converts it into speech feature vectors for subsequent processing and recognition.
(3) The speech feature vectors and labels are then provided to an acoustic model (the core module of the whole ASR system) for training. Two kinds of labels can be used: characters or phonemes. When the trained acoustic model makes predictions in the testing stage, it outputs results according to the labels used during training. If characters are used as labels, the model returns characters. If phonemes are used as labels, the model outputs phonemes.
(4) Finally, the result is output.
The method for extracting the user intention from the text by keyword matching is specifically as follows: keywords are matched using ToPMine, a topic-model-based approach. Adjacent words in the text are aggregated into phrases according to the result of the topic analysis, and high-frequency phrases are then selected as key phrases. First, a set of common user instructions is collected according to the functions provided by the smart speaker. The instruction text is segmented by phrase mining, and a topic model is run on it. Thereafter, each word in a phrase is associated with the same topic, and the phrases that are successfully matched to a topic are taken as keywords.
The abnormality alarm sub-module works in the following modes:
mode 1: the alarm app sends a short message to the user to notify them of the abnormal behavior;
mode 2: the alarm app displays a network traffic graph of the abnormal behavior. If the network traffic pattern does not match the user's instruction, the abnormal traffic is marked in red in the network traffic graph. If the user mentions sensitive information and the traffic increases abnormally, the abnormal traffic is marked in yellow.
In step one, the ARP man-in-the-middle technique is used. Two hosts in the same local area network communicate by MAC address. Two hosts that are not in the same subnet can also communicate: the data is sent to their respective gateway routers, and the IP addresses of the gateways are used to carry the communication. However, when a gateway communicates with hosts in its own local area network, it still relies on MAC addressing. Thus, if an attacker impersonates the gateway for spoofing purposes, the gateway's entry in the ARP cache of the target host must be changed to the attacker's MAC address.
The specific operation steps are as follows:
(1) The host is disguised as the gateway, so that all data sent from the smart speaker to the external network passes through the host and can be analyzed there.
(2) Wireshark is opened and network traffic is captured in real time. The information observed from the packets mainly includes packet length, packet direction, timestamp and any plaintext information.
Application example 1: the detailed operation flow of the method is shown in FIGS. 1-3.
FIG. 1 shows the workflow of the method.
In this embodiment, a non-invasive security forensics method is adopted: a device with a microphone is placed near the smart speaker to continuously monitor the conversation between the smart speaker and the user.
When the user issues an instruction, speech recognition converts what they say into text. The text is then analyzed with NLP techniques for keyword matching to determine what the user wants to do, and this information is aggregated into the user's intention.
Meanwhile, the intermediate device running ARP spoofing software continuously monitors the network traffic between the smart speaker and the cloud server, and infers the behavior of the smart speaker by analyzing the network traffic patterns of different behaviors.
Finally, the forensics analysis module collects and retains the inferred behavior of the smart speaker and the intention of the user. The concurrent behavior is then compared with the user intention in the comparison module of the forensics analysis module to find out whether the smart speaker is following the user's instructions. In particular, if the smart speaker is found to be transmitting data to the cloud server while the user is not interacting with it, the event is treated as an anomaly. When an abnormal event occurs, the alarm sub-module of the forensics analysis module sends an alarm to the user, who can review the detailed forensic information to decide what to do.
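The "data sent while the user is not interacting" rule can be sketched by checking upstream packets against the time windows in which the microphone heard a conversation. Everything specific here is an assumption made for illustration: the function names, the 5-second grace period after an interaction ends, and the byte threshold used to ignore small keep-alive traffic.

```python
from typing import List, Tuple


def bytes_outside_interactions(
    uploads: List[Tuple[float, int]],
    interactions: List[Tuple[float, float]],
    grace: float = 5.0,
) -> int:
    """Sum upstream bytes sent outside every (start, end + grace) interaction window."""
    total = 0
    for timestamp, size in uploads:
        in_window = any(
            start <= timestamp <= end + grace for start, end in interactions
        )
        if not in_window:
            total += size
    return total


def idle_upload_detected(
    uploads: List[Tuple[float, int]],
    interactions: List[Tuple[float, float]],
    threshold_bytes: int = 2000,
    grace: float = 5.0,
) -> bool:
    """Flag an anomaly when idle-time uploads exceed the (assumed) byte threshold."""
    return bytes_outside_interactions(uploads, interactions, grace) > threshold_bytes


# (timestamp in seconds, packet size in bytes) for speaker-to-cloud packets.
uploads = [(1.0, 500), (2.0, 800), (60.0, 1500), (61.0, 1500)]
# The user spoke to the speaker only between t=0 s and t=3 s.
print(idle_upload_detected(uploads, interactions=[(0.0, 3.0)]))  # True
```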
FIG. 2 is a schematic diagram illustrating the principle of the speech recognition technology. First, speech enters the ASR system as audio input. Then, feature extraction converts it into speech feature vectors for subsequent processing and recognition. The speech feature vectors and labels are then provided to an acoustic model (the core module of the whole ASR system) for training. After these steps are completed, the result is output.
FIG. 3 shows a schematic diagram of the alarm app. The app comprises a text part and a network traffic graph. The content presented to the user includes the abnormal traffic graph and a description of which behavior of the smart speaker is inconsistent with the user's instruction.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims (9)

1. A non-invasive smart speaker security forensics system, characterized in that: the system comprises a network traffic analysis module, a user intention acquisition module and a forensics analysis module;
the network traffic analysis module is used for monitoring the network traffic between the smart speaker and the cloud server and analyzing network traffic patterns;
the user intention acquisition module is used for determining the intention of the user, namely the operation that the user wants the smart speaker to perform;
the forensics analysis module combines the network traffic analysis module and the user intention acquisition module to monitor smart speaker events and collect the related forensic evidence.
2. The non-invasive smart speaker security forensics system according to claim 1, wherein: the forensics analysis module comprises an instruction and user intention comparison module and an abnormality alarm sub-module;
(1) the instruction and user intention comparison module determines whether the smart speaker works as expected by comparing the user intention with the smart speaker behavior, and if sensitive keywords from the user are detected, the behavior of the smart speaker at that moment is monitored to determine whether abnormal network traffic exists;
(2) the abnormality alarm sub-module sends an alarm and the abnormal-traffic details through the app to notify the user promptly when the network traffic of the smart speaker does not match the user intention or when the smart speaker listens in on the user's sensitive information, wherein the abnormal traffic is marked in red in the network traffic graph if the network traffic pattern does not match the user's instruction, and is marked in yellow if the user mentions sensitive information and the traffic increases abnormally.
3. A forensics method using the non-invasive smart speaker security forensics system of claim 1 or 2, wherein: the method comprises the following steps:
step one, a network traffic analysis stage: using a man-in-the-middle approach, a device running ARP spoofing software acts as the man in the middle and captures all network packets between the smart speaker and the cloud server; network patterns are extracted from the captured packets and the current event of the smart speaker is inferred; the inferred event is then used to determine whether the network traffic is consistent with the user's instruction and whether the smart speaker is recording the user's voice in real time;
step two, a user intention extraction stage: a recording device with a microphone records the dialogue between the smart speaker and the user, the user's speech is converted into text in real time by speech recognition, keywords are matched by NLP techniques, and the user's intention is extracted from the text; once the user's intention is obtained, it is verified whether that intention is consistent with the behavior of the smart speaker; meanwhile, the intermediate device with the ARP spoofing function continues to monitor the network traffic between the smart speaker and the cloud server and infers the behavior of the smart speaker by analyzing the network patterns under different events;
step three, a forensic analysis stage: network traffic analysis and user intention are combined to monitor the behavior of the smart speaker and collect the relevant forensic information; the current behavior of the smart speaker is obtained from the result of the network traffic analysis, and the user's current instruction to the smart speaker is obtained from the extracted user intention; if the behavior and the instruction are inconsistent, the behavior is considered abnormal; when an abnormal condition occurs, an alarm is sent to the user through the alarm APP, and the user decides what to do after reviewing the detailed forensic information through the user interface.
4. The non-invasive smart speaker security forensics method according to claim 3, wherein: the network traffic analysis of step one comprises the following specific steps:
(1) using the ARP spoofing software to spoof the smart speaker and intercept its network traffic;
(2) extracting network patterns and analyzing the one-to-one mapping between network traffic patterns and smart speaker behaviors, the behavior of the smart speaker being analyzed by observing the number of packets captured by the packet capture software in each period;
(3) ending the network traffic analysis stage once the mapping between the various network traffic patterns and the smart speaker behaviors has been found.
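Step (2) above, counting packets per period and mapping the count to a behavior, can be sketched as follows. This is a minimal illustration: the period length, the count ranges, and the behavior names in the pattern table are hypothetical placeholders, since the claim does not give concrete values.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical pattern table: (min_count, max_count, behavior) per period.
PatternTable = List[Tuple[int, int, str]]


def packets_per_period(timestamps: Iterable[float], period: float = 1.0) -> List[int]:
    """Count packets in consecutive periods, including empty periods in between."""
    counts = Counter(int(t // period) for t in timestamps)
    if not counts:
        return []
    first, last = min(counts), max(counts)
    return [counts.get(i, 0) for i in range(first, last + 1)]


def infer_behavior(count: int, table: PatternTable) -> str:
    """Map a per-period packet count to a behavior using the pattern table."""
    for low, high, behavior in table:
        if low <= count <= high:
            return behavior
    return "unknown"
```

The timestamps would come from the packet capture; in practice the table would be built during the analysis stage, once the mapping between traffic patterns and behaviors has been observed.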
5. The non-invasive smart speaker security forensics method according to claim 3, wherein: in step two, a recording device with a microphone is used to record the conversation between the smart speaker and the user, the user's intention is extracted to determine the operation that the user wants the smart speaker to perform, and it is checked whether the smart speaker follows the user's instruction.
6. The non-invasive smart speaker security forensics method according to claim 3, wherein: in step two, an end-to-end automatic speech recognition (ASR) system implemented in TensorFlow is used to convert the monitored speech into text, and the specific operation steps are as follows:
(1) first, the speech enters the ASR system as input in audio form;
(2) then, through feature extraction, the audio is converted into speech feature vectors for subsequent processing and recognition;
(3) the speech feature vectors and labels are then provided to the acoustic model (the core module of the whole ASR system) for training, with two kinds of labels available, the first being characters and the second being phonemes; when the trained acoustic model makes predictions in the testing stage, it outputs results in terms of the labels used during training: if characters were used as labels, the model returns characters; if phonemes were used as labels, the model returns phonemes;
(4) finally, the result is output.
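The point of step (3), that the label set used in training determines what the model returns, can be illustrated with a small decoding sketch. It assumes a CTC-style greedy decoding over per-frame scores, which the claim does not specify; the score values, the label lists, and the `blank` index are hypothetical.

```python
from typing import List, Sequence


def greedy_decode(frame_scores: Sequence[Sequence[float]],
                  labels: Sequence[str],
                  blank: int = 0) -> List[str]:
    """Pick the best label per frame, collapse repeats, then drop blanks."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    decoded: List[str] = []
    previous = None
    for index in best:
        # Emit a label only when it changes and is not the blank symbol.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return decoded


# The same scores decode to characters or phonemes depending on the label set.
scores = [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1], [0.9, 0.05, 0.05], [0.1, 0.1, 0.8]]
char_labels = ["<blank>", "h", "i"]
phoneme_labels = ["<blank>", "HH", "AY"]
```

In a real system the scores would come from the trained acoustic model, and a character-level model and a phoneme-level model would be separate networks rather than one set of scores read two ways.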
7. The non-invasive smart speaker security forensics method according to claim 3, wherein: in step two, the user's intention is extracted from the text by keyword matching, specifically: the topic-model-based method ToPMine is used to match keywords; adjacent words in the text are aggregated into phrases according to the result of the topic analysis, and high-frequency phrases are then selected as key phrases; first, a set of common user instructions is collected according to the functions provided by the smart speaker; the instruction text is segmented through phrase mining and a topic model is run on it; the same topic is then associated with each word in the text, and the topics successfully matched with phrases are determined to be the keywords.
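The general idea of mining high-frequency phrases from a command corpus and matching them in a transcript can be sketched as below. This is a deliberately simplified, frequency-only stand-in and not ToPMine itself: it counts adjacent word pairs and runs no topic model. The sample commands and the `min_count` threshold are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set


def mine_key_phrases(commands: Iterable[str], min_count: int = 2) -> Set[str]:
    """Collect adjacent word pairs that occur at least min_count times."""
    counts: Counter = Counter()
    for text in commands:
        words = text.lower().split()
        counts.update(zip(words, words[1:]))
    return {" ".join(pair) for pair, n in counts.items() if n >= min_count}


def match_keywords(transcript: str, key_phrases: Set[str]) -> List[str]:
    """Return the key phrases that appear as whole words in the transcript."""
    padded = " " + " ".join(transcript.lower().split()) + " "
    return sorted(p for p in key_phrases if f" {p} " in padded)


commands = [
    "play music",
    "play music loudly",
    "set alarm",
    "set alarm for seven",
    "what time is it",
]
key_phrases = mine_key_phrases(commands)
```

With this corpus, `match_keywords("Please play music now", key_phrases)` yields the matched phrase that stands for the user's intention, which would then be compared with the behavior inferred from network traffic.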
8. The non-invasive smart speaker security forensics method according to claim 3, wherein: the anomaly alarm sub-module alerts the user in the following modes:
mode 1: the alarm APP sends a short message to the user to notify the user of the abnormal behavior;
mode 2: the alarm APP displays a network traffic graph of the abnormal behavior; if the network traffic pattern does not match what the user described, the abnormal traffic is marked in red in the network traffic graph, and if the user mentions sensitive information and the traffic increases abnormally, the abnormal traffic is marked in yellow.
9. The non-invasive smart speaker security forensics method according to claim 3, wherein: in step one, ARP spoofing software is used to spoof the smart speaker and intercept network traffic, specifically using the ARP man-in-the-middle technique as follows: two hosts in the same LAN communicate by MAC addressing; if they are not in the same subnet, the data is sent to their respective router gateways when they communicate, and communication is achieved through the IP addresses of the gateways; when a gateway communicates with a host in its own LAN, it relies on MAC addressing;
the specific operation steps are as follows:
(1) the host disguises itself as the gateway, so that all data from the smart speaker to the external network passes through the host and can be analyzed;
(2) Wireshark is opened to capture network traffic in real time; the information observed from the traffic packets mainly includes the packet length, the packet direction, the timestamp, and plaintext information.
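Once traffic has been captured, the fields named in step (2) (packet length, direction, timestamp) can be summarized per direction. The sketch below works on already-exported packet records and does not perform any capture or spoofing; the `Packet` record layout and the IP addresses are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Packet(NamedTuple):
    timestamp: float  # capture time in seconds
    src: str          # source IP address
    dst: str          # destination IP address
    length: int       # packet length in bytes


def direction(pkt: Packet, speaker_ip: str) -> str:
    """Classify a packet relative to the smart speaker."""
    if pkt.src == speaker_ip:
        return "upstream"      # speaker -> cloud
    if pkt.dst == speaker_ip:
        return "downstream"    # cloud -> speaker
    return "other"


def summarize(packets: Iterable[Packet], speaker_ip: str) -> Dict[str, Tuple[int, int]]:
    """Return (packet count, total bytes) for each direction."""
    totals = {"upstream": [0, 0], "downstream": [0, 0], "other": [0, 0]}
    for pkt in packets:
        entry = totals[direction(pkt, speaker_ip)]
        entry[0] += 1
        entry[1] += pkt.length
    return {name: (count, size) for name, (count, size) in totals.items()}
```

A sustained rise in upstream bytes while no command is being spoken is the kind of signal the forensic analysis stage would compare against the user's intention.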
CN202011315413.0A 2020-11-20 2020-11-20 Non-invasive intelligent sound box safety evidence obtaining system and method thereof Active CN112565207B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011315413.0A CN112565207B (en) 2020-11-20 2020-11-20 Non-invasive intelligent sound box safety evidence obtaining system and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011315413.0A CN112565207B (en) 2020-11-20 2020-11-20 Non-invasive intelligent sound box safety evidence obtaining system and method thereof

Publications (2)

Publication Number Publication Date
CN112565207A true CN112565207A (en) 2021-03-26
CN112565207B CN112565207B (en) 2022-06-21

Family

ID=75044527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011315413.0A Active CN112565207B (en) 2020-11-20 2020-11-20 Non-invasive intelligent sound box safety evidence obtaining system and method thereof

Country Status (1)

Country Link
CN (1) CN112565207B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190132787A1 (en) * 2017-10-27 2019-05-02 LGS Innovations LLC Rogue base station router detection with statistical algorithms
CN109905374A (en) * 2019-01-29 2019-06-18 杭州电子科技大学 A kind of identity identifying method with secret protection characteristic towards wired home
CN110891047A (en) * 2019-10-08 2020-03-17 中国信息通信研究院 Intelligent sound box data stream processing method and system
CN210536678U (en) * 2019-10-08 2020-05-15 中国信息通信研究院 Intelligent sound box data stream acquisition device
CN111182385A (en) * 2019-11-19 2020-05-19 广东小天才科技有限公司 Voice interaction control method and intelligent sound box

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Jice et al.: "智能家居安全综述" (Survey of Smart Home Security), 《计算机研究与发展》 (Journal of Computer Research and Development) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569083A (en) * 2021-06-17 2021-10-29 南京大学 Intelligent sound box local end digital evidence obtaining system and method based on data traceability model
CN113569083B (en) * 2021-06-17 2023-11-03 南京大学 Intelligent sound box local digital evidence obtaining system and method based on data tracing model
CN113905012A (en) * 2021-09-08 2022-01-07 北京世纪互联宽带数据中心有限公司 Communication method, device, equipment and medium
CN117672227A (en) * 2024-01-25 2024-03-08 深圳市音随我动科技有限公司 Question-answer control method and device based on intelligent sound box, computer equipment and medium
CN117672227B (en) * 2024-01-25 2024-04-05 深圳市音随我动科技有限公司 Question-answer control method and device based on intelligent sound box, computer equipment and medium

Also Published As

Publication number Publication date
CN112565207B (en) 2022-06-21

Similar Documents

Publication Publication Date Title
CN112565207B (en) Non-invasive intelligent sound box safety evidence obtaining system and method thereof
CN109525558B (en) Data leakage detection method, system, device and storage medium
CN106909847B (en) Malicious code detection method, device and system
CN104966053B (en) Face identification method and identifying system
EP2806425B1 (en) System and method for speaker verification
WO2010031288A1 (en) Botnet inspection method and system
CN113691566B (en) Mail server secret stealing detection method based on space mapping and network flow statistics
CN105516073B (en) Network intrusion prevention method
CN111083117A (en) Botnet tracking and tracing system based on honeypots
CN110351237B (en) Honeypot method and device for numerical control machine tool
CN111970233B (en) Analysis and identification method for network violation external connection scene
CN111581621A (en) Data security processing method, device, system and storage medium
Lin et al. A non-intrusive method for smart speaker forensics
CN112217777A (en) Attack backtracking method and equipment
CN108494791A (en) A kind of DDOS attack detection method and device based on Netflow daily record datas
CN111917699A (en) Detection technology for identifying counterfeit dumb terminal of illegal equipment based on fingerprint
Razak et al. Network intrusion simulation using OPNET
CN112887303B (en) Series threat access control system and method
TW200924428A (en) An inside tracing method of the network attacking detection
KR101005093B1 (en) Method and device for identifying of client
US11436310B1 (en) Biometric keystroke attribution
CN108566380A (en) A kind of proxy surfing Activity recognition and detection method
Sulistya et al. Network Security Monitoring System on Snort with Bot Telegram as a Notification
CN109981602B (en) Internet of things security gateway protection method by using Internet of things security gateway system
CN113569083A (en) Intelligent sound box local end digital evidence obtaining system and method based on data traceability model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant