WO2018194282A1 - Server access control system for detecting abnormal user on basis of learning of inputted commands for security enhancement - Google Patents

Publication number
WO2018194282A1
WO2018194282A1 PCT/KR2018/003549 KR2018003549W WO2018194282A1 WO 2018194282 A1 WO2018194282 A1 WO 2018194282A1 KR 2018003549 W KR2018003549 W KR 2018003549W WO 2018194282 A1 WO2018194282 A1 WO 2018194282A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
command
statement
server
probability
Prior art date
Application number
PCT/KR2018/003549
Other languages
French (fr)
Korean (ko)
Inventor
김대옥
신호철
구제웅
정종균
염창주
Original Assignee
주식회사 넷앤드
Priority date
Filing date
Publication date
Application filed by 주식회사 넷앤드

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • The present invention relates to a server access control system that detects abnormal users by learning input commands. For the security management of the main servers operated by an institution, the system learns the commands that users enter when remotely accessing a server and extracts each user's behavior pattern through that learning.
  • When a command input event occurs, the system compares the input command with the learned behavior pattern to determine whether the user is abnormal and to control the access accordingly.
  • The present invention builds on the observation that a hacker or malicious user who is not the normal user behaves differently from the legitimate user's usual work pattern when carrying out an intrusion.
  • In general, a server access control system analyzes the packets passing through the access control gateway server to extract the command input by the user, and checks whether the extracted command is allowed. For example, the command is compared with a list of prohibited commands, applied to that user, that may threaten security; if the command is in the list, it is discarded without being sent to the server. This enhances the security of the server.
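The prohibited-command check described above can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the `PROHIBITED` table, its entries, and the function names `is_allowed` and `relay` are all hypothetical.

```python
import shlex

# Hypothetical per-user lists of prohibited commands (illustrative values only).
PROHIBITED = {"alice": {"rm", "shutdown"}, "bob": {"shutdown"}}


def is_allowed(user: str, statement: str) -> bool:
    """Return False when the statement's command name is prohibited for the user."""
    try:
        tokens = shlex.split(statement)
    except ValueError:
        # Unbalanced quotes: fall back to plain whitespace splitting.
        tokens = statement.split()
    if not tokens:
        return True  # an empty statement carries no command to block
    return tokens[0] not in PROHIBITED.get(user, set())


def relay(user: str, statement: str, send_to_server) -> bool:
    """Forward the statement only if it is allowed; otherwise discard it."""
    if is_allowed(user, statement):
        send_to_server(statement)
        return True
    return False
```

Here `send_to_server` stands in for whatever callable forwards the statement to the server; a discarded statement simply never reaches it.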
  • The server access control system also provides a security function that authenticates a user on first access and, after authentication, lets the user access equipment and work only within the granted authority.
  • An object of the present invention is to solve the problems described above by providing a server access control system that, for the security management of the main servers operated by an institution, learns the commands entered by users who remotely access a server and extracts each user's behavior pattern through that learning.
  • When a command input event occurs, the system compares the input command with the learned behavior pattern to determine whether the user is abnormal and controls the abnormality.
  • Another object of the present invention is to provide a server access control system in which, when a user accesses the institution's main server through the access control system, the command data used during the access is collected and a work behavior pattern is extracted for each user by machine learning techniques.
  • Another object of the present invention is to provide a server access control system that, when a user connects to a server through the access control system and enters commands, determines whether the user is abnormal based on the extracted work behavior pattern information and, if so, automatically applies user controls.
  • Another object of the present invention is to provide a server access control system that, in order to defend against infringement by hackers and intrusion by malicious users, builds a learning model from each user's server access and command input values in the access control system and uses the model to check whether the currently connected user is really that user.
  • The present invention provides a server access control system that detects abnormal users based on learning of input commands, in which a user terminal and a server are connected over a network and the system is installed as a gateway on the network between the user terminal and the server. The system comprises:
  • a relay module that extracts session information and statements from packets transmitted from the user terminal and relays statements and server results between the user terminal and the server; an abnormality detection unit that receives the session information and statements from the relay module, extracts commands, learns and generates a behavior model representing the user's command input pattern, and calculates the probability of an abnormal user by applying the user's current command to the behavior model; and
  • an abnormality determination unit that receives the abnormal-user probability from the abnormality detection unit and, according to a predetermined policy, uses it to decide on an administrator alert, a session block, or a user block.
  • The present invention also provides the server access control system, wherein the abnormality detection unit comprises: an event channel that receives statements from the relay module; a state channel that provides abnormal-user probability information to the abnormality determination unit; a behavior coordinator that extracts commands from the statements; a behavior model engine that generates a behavior model for each user through learning; and an operation unit that receives commands from the behavior coordinator and calculates, using the behavior models, the probability for each user with respect to the received command.
  • The present invention also provides the server access control system, wherein the behavior coordinator records the commands in a behavior log storage, where they accumulate, and the behavior model engine continuously learns and updates the behavior model using the command data stored in the behavior log storage.
  • The present invention also provides the server access control system, wherein the behavior model is composed of a Bayesian model and a deep learning model, and the operation unit calculates a first probability from the Bayesian model and a second probability from the deep learning model, then obtains the final probability by adjusting the ratio between the first probability and the second probability using weights.
  • The present invention also provides the server access control system, wherein the relay module receives statement characters from the user terminal. If a statement character is not an enter character, the relay module accumulates it to build a cumulative statement string; if it is an enter character, the relay module extracts from the cumulative string the final statement that will actually be executed. If the statement character is a control character, the relay module transmits the accumulated string and the control character to the server, receives from the server the string that reflects the control character, and accumulates the reflected string into the cumulative statement string.
  • The present invention also provides the server access control system, wherein the relay module transmits the cumulative string to the server as the argument of one or more of an echo command, a linked-command extraction (realpath) command, and a command-name extraction (basename) command, and extracts the final command using the result of that command received from the server.
  • The echo command returns the string after variable processing.
  • The linked-command extraction (realpath) command returns the actual command to which a symbolic link points.
  • The command-name extraction (basename) command returns the name of the command actually executed, excluding the path, when the command includes a path.
  • The present invention also provides the server access control system, wherein the command is extracted from the statement so that it consists only of the command name and the command options.
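A minimal sketch of reducing a statement to its command name and options might look like this. It assumes that an option is any token beginning with `-`, which is a simplification; the function name `extract_command` is hypothetical.

```python
import shlex


def extract_command(statement: str) -> str:
    """Keep only the command name and its options; drop all other arguments."""
    try:
        tokens = shlex.split(statement)
    except ValueError:
        # Unbalanced quotes: fall back to plain whitespace splitting.
        tokens = statement.split()
    if not tokens:
        return ""
    name = tokens[0]
    # Assumption: an option is a token that starts with "-" and is longer than "-".
    options = [t for t in tokens[1:] if t.startswith("-") and len(t) > 1]
    return " ".join([name, *options])
```

For instance, `extract_command("ls -al /var/log")` yields `"ls -al"`, so two statements that differ only in their target paths map to the same learned command.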
  • The present invention also provides the server access control system, wherein the behavior model engine learns the behavior model so that it reflects the similarity of the option strings and option input order of each command, the similarity of the usage frequency of each command, and the similarity of usage patterns according to the order in which commands are used.
  • The present invention also provides the server access control system, wherein the behavior model engine obtains the behavior model using [Equation 1], in which:
  • Training Count is the number of training data,
  • Training Data Length is the length of the training command sequence,
  • the pseudocount is a predetermined constant, and
  • A is the number of distinct commands.
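[Equation 1] itself does not appear in this text. One form that is consistent with the four listed quantities is a pseudocount-smoothed estimate, P(command | user) = (count + pseudocount) / (length + pseudocount × A). The sketch below implements that form purely as an assumption: reading Training Count as the number of times a command occurs in one user's training data, the default pseudocount of 0.01, and all function names are illustrative choices, not taken from the source.

```python
import math
from collections import Counter


def train(commands):
    """Build a per-user model: command counts and total training length."""
    return Counter(commands), len(commands)


def command_probability(model, command, num_distinct, pseudocount=0.01):
    """Smoothed estimate (count + pseudocount) / (length + pseudocount * A)."""
    counts, length = model
    return (counts[command] + pseudocount) / (length + pseudocount * num_distinct)


def log_likelihood(model, commands, num_distinct, pseudocount=0.01):
    """Sum of log-probabilities of a command sequence under one user's model."""
    return sum(
        math.log(command_probability(model, c, num_distinct, pseudocount))
        for c in commands
    )
```

The pseudocount keeps a command the user has never typed from getting probability zero, so a single unfamiliar command lowers the score without making it undefined.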
  • The present invention also provides the server access control system, wherein N behavior models are learned in advance, one for each of the N users.
  • The operation unit applies the commands input by a specific user to all N behavior models to obtain a probability for each of the N users.
  • The N probabilities are sorted in descending order and each user is given a rank. Whether the user is abnormal is determined from this rank: when, for the commands input by the specific user, the probability of that user falls below a predetermined rank, the user is determined to be an abnormal user.
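The rank-based decision above can be sketched as follows. The function names and the `max_rank` parameter are hypothetical, and the per-user probabilities are assumed to have already been computed by the N behavior models.

```python
def rank_of_user(scores: dict, user: str) -> int:
    """Return the 1-based rank of `user` when scores are sorted in descending order."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(user) + 1


def is_abnormal(scores: dict, user: str, max_rank: int) -> bool:
    """Flag the user when their own model ranks below the permitted rank."""
    return rank_of_user(scores, user) > max_rank
```

With `scores = {"alice": 0.6, "bob": 0.3, "carol": 0.1}` and `max_rank=2`, a session logged in as `carol` would be flagged, because two other users' models explain the commands better than carol's own.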
  • According to the present invention, the server access control system compares the behavior of the accessing party with the previously extracted command usage pattern of the legitimate user to estimate whether that party is the legitimate user. It can therefore defend against attacks in which a hacker or malicious user takes over an account and passes normal authentication.
  • FIG. 1 is a view showing a security vulnerability when the authentication information is stolen from the access control system according to the prior art.
  • FIG. 2 is a block diagram of an overall system for practicing the present invention.
  • FIG. 3 is a block diagram of a configuration of a server access control system for detecting an input instruction learning-based abnormal user for enhanced security according to an embodiment of the present invention.
  • FIG. 4 is a block diagram of a detailed configuration of the abnormality detection unit for real-time abnormal user detection according to an embodiment of the present invention.
  • FIG. 5 is a block diagram of a behavior model for detecting an abnormal user by machine learning according to an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart illustrating a process of extracting a statement according to an embodiment of the present invention.
  • FIG. 7 is a flowchart illustrating a process of extracting a statement according to an embodiment of the present invention.
  • FIG. 8 is an exemplary view of a statement checking process according to an embodiment of the present invention.
  • FIG. 9 is an exemplary view of a result of deriving a command from a statement according to an embodiment of the present invention.
  • FIG. 10 is an exemplary diagram of input data of a behavior model engine according to an embodiment of the present invention.
  • FIG. 11 is an exemplary diagram of output data of a behavior model engine according to an embodiment of the present invention.
  • FIG. 12 is a flowchart illustrating a method of determining an abnormal user according to an embodiment of the present invention.
  • FIG. 13 is a block diagram for learning the behavior model according to an embodiment of the present invention.
  • The entire system for implementing the present invention consists of a user terminal 10, a server 40, and an access control system 30 that serves as a gateway between the user terminal 10 and the server 40. The user terminal 10 and the server 40 are connected through a network (not shown).
  • the user terminal 10 is a computing terminal used by a user, such as a PC, a notebook, a smartphone, a tablet PC, and the like.
  • The user terminal 10 connects to the server 40 through a remote access protocol such as Telnet or SSH (Secure Shell) and performs work through a shell installed on the server 40.
  • A shell is an interactive command interpreter that interprets the commands entered by a user and passes them to the server's kernel for processing.
  • The shell uses a character-based command line interface (CLI).
  • various shells such as a Bourne shell, a Korn shell, a bash shell, a C shell, and a Tcsh shell may be applied.
  • The user uses the services of the server 40 through the user terminal 10. To do so, the user types a series of characters, that is, a command string, in the shell and presses Enter.
  • The input statement or string is then transmitted to the server 40.
  • The shell on the user terminal 10 receives the result of the command from the server 40 and displays it on the screen as text.
  • The input command string is called a statement.
  • The server 40 receives a statement from the user terminal 10 through a network (not shown), executes the command in that statement, and transmits the result to the user terminal 10.
  • A statement consists of a series of characters, that is, a string. These characters or strings are transmitted over the network: a session is formed between the user terminal 10 (or the shell) and the server 40, and the characters or strings are carried in data packets within that session.
  • The server 40 recognizes the string of characters entered by the user, preferably the characters entered up to the one preceding the enter character, as one statement (or command string).
  • The enter character is the character indicating that statement input is complete.
  • Hereinafter it is referred to as the enter character, the enter key character, the enter key, or the statement input completion character.
  • The server 40 may also accumulate the input characters until Enter is input, build the string (or statement) entered so far, parse that partial string, and return the result.
  • The returned result may be displayed in the shell of the user terminal 10. That is, each time a character is input, the server 40 returns the corresponding command string (or statement) as text to be displayed on the user terminal 10 or the shell.
  • In addition to general character keys such as letters, numbers, and symbols, the server 40 receives control keys (control characters) such as Tab, the arrow keys, and combinations with the Ctrl or Alt key.
  • A control character is interpreted and converted into the corresponding string of general characters.
  • General characters are characters that can be displayed as text, such as letters, numbers, and symbols.
  • Control characters are characters used for control, such as Tab, the arrow keys, control keys, and key combinations.
  • The server 40 runs a Linux or Unix-based operating system (OS) and provides remote access services over CLI (command-line interface) remote access protocols (Telnet, SSH).
  • the access control system 30 controls the statements entered in the service for such remote access.
  • The access control system 30 is a gateway installed on a network (not shown) between the user terminal 10 and the server 40, and relays or blocks traffic between the user terminal 10 and the server 40.
  • The access control system 30 receives the command character or string (or statement) from the user terminal 10 and transmits it to the server 40, and receives the result from the server 40 and passes it to the user terminal 10.
  • the access control system 30 analyzes the command character or string received from the user terminal 10, and determines whether to block the command contained in the command string (or statement). That is, according to the determination of blocking or not, the command string (or statement) is transmitted to the server 40 or blocked.
  • The present invention collects and learns the history of work performed on the server through the access control system, that is, the command information, and extracts a work behavior pattern for each user.
  • Based on the extracted patterns, the access control system compares the statements input by a user with the patterns in real time to determine whether the user is abnormal. In effect, the user is authenticated continuously through the pattern. If the user is determined to be abnormal, the user's session can be blocked to prevent potential threats.
  • The server access control system 30 comprises an input/response transmitter 31, a statement extractor 32, a packet relay 33, an equipment transmitter 34, a policy-based abnormality determination unit 35, and an abnormality detection unit 36.
  • The input/response transmitter 31 receives input from the user or the user terminal 10 and passes it to the relay module 38, and transmits responses received from the relay module 38 back to the user or the user terminal 10.
  • the relay module 38 is a module that connects the input response transmitter 31 and the equipment transmitter 34 to relay packets or data.
  • the relay module 38 includes a statement extractor 32 for extracting a statement from a packet, and a packet relay 33 for relaying a packet or data.
  • The statement extracting unit 32 analyzes the packets received from the user terminal 10 and extracts statements and the user's session information.
  • The session refers to the TCP/IP session established by the user terminal 10 to remotely access the server 40.
  • The user makes a remote connection to the server 40 by using Telnet, the Secure Shell (SSH) protocol, or the like.
  • Within a session, the user enters a number of commands to carry out the desired work.
  • the session information includes device connection session information of the user. That is, session information includes user identification information (user ID, etc.), access equipment identification number (access equipment ID, etc.), access protocol, access account, access start time, access end time, and the like.
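The session information listed above maps naturally onto a small record type. The following is only an illustrative sketch; the class name `SessionInfo`, the field names, and the `is_open` helper are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SessionInfo:
    """One equipment-connection session of a user (field names are illustrative)."""
    user_id: str                          # user identification information
    equipment_id: str                     # access equipment identification number
    protocol: str                         # access protocol, e.g. "ssh" or "telnet"
    account: str                          # account used on the target server
    start_time: datetime                  # access start time
    end_time: Optional[datetime] = None   # access end time; None while connected

    def is_open(self) -> bool:
        """A session without an end time is still in progress."""
        return self.end_time is None
```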
  • When the statement extracting unit 32 extracts the user's session information and statements from a packet, it transmits the session information and the statements to the abnormality detection unit 36. If neither the abnormality detection unit 36 nor the abnormality determination unit 35 detects an abnormality, the packet is relayed through the packet relay 33.
  • The abnormality detection unit 36 extracts a command from each statement, collects the series of commands belonging to one session, and analyzes the pattern of that series. It then compares the pattern with the previously learned behavior pattern (or command pattern) of the user. The analysis result is expressed as the difference from the learned behavior pattern, or as the abnormal-user probability (the probability that the user is abnormal).
  • The abnormality detection unit 36 collects the series of commands in the session, learns and generates a behavior model from those commands, and derives the abnormal-user probability using the learned behavior model.
  • The abnormality detection unit 36 transmits the derived abnormal-user probability to the abnormality determination unit 35.
  • The abnormality determination unit 35 receives the abnormal-user probability and determines whether the user is abnormal based on a predefined policy. That is, when the probability falls outside a predetermined range (for example, when it exceeds a predetermined threshold), the user is determined to be an abnormal user.
  • On judging a user to be abnormal, the abnormality determination unit 35 sanctions the user's usage according to the defined policy, for example by sending notifications to administrators or blocking the session. This enhances security.
  • The access control system then suspends transmission of commands to the managed server to which the user is actually connected. In this way, user authentication is performed continuously by the abnormality detection unit 36 and the abnormality determination unit 35.
  • The abnormality determination unit 35 or the abnormality detection unit 36 derives its result not as a true/false value but as a probability that the user is abnormal, and thus acts as a decision support system. If that probability exceeds a threshold set by policy, an automatic action is taken based on the defined policy.
  • FIG. 4 shows a detailed configuration of the abnormality detecting unit 36 and a series of processes for detecting an abnormal user based on a usage pattern based on the user's real-time work behavior in the access control system.
  • The abnormality detection unit 36 according to an embodiment of the present invention is largely divided into a data pipe part and a behavior engine.
  • The data pipe part is composed of an event channel 51 for delivering a user's work events and a state channel 52 for delivering the abnormal-user probability.
  • the event channel 51 receives information about a user's work event (User, Device Session, Command). That is, the gateway relay module 38 of the access control system transmits the corresponding event information to the event channel 51 when a user's work event (User, Device Session, Command) occurs.
  • the event channel 51 calls the behavior coordinator 61 to deliver the corresponding information.
  • A work event refers to a user's statement input. The user enters a series of characters and finally enters the enter character (Enter key) that completes the statement. In other words, when a statement is completed and entered, a work event is generated for that statement.
  • The state channel 52 receives the calculated abnormal-user probability from the operation unit 64 of the behavior engine. That is, the operation unit 64 passes the calculated abnormal-user probability to the state channel 52.
  • the state channel 52 calls the abnormality determination unit 35 to transmit corresponding information.
  • The behavior engine includes a behavior coordinator 61, a data storage unit 62, a behavior model engine 63, and an operation unit 64, together with a behavior log storage 71, a behavior model 72, and a model cache 73.
  • The behavior coordinator 61 receives a user's work event (User, Device Session, Command) from the event channel 51, instructs that a log be stored for model construction, and calls the operation unit 64 to analyze the work event in order to detect an abnormal user. That is, the behavior coordinator 61 passes the received event information to the data storage unit 62 for storage and calls the operation unit 64 to determine whether the user is abnormal.
  • The behavior coordinator 61 obtains the statement from the event channel 51, extracts the command from it, and stores the extracted command in the behavior log storage 71.
  • The data storage unit 62 stores the user's work event information received from the behavior coordinator 61 in the behavior log storage 71 so that the user's work behavior can be learned.
  • The behavior log storage 71 records and accumulates session information and the commands in each session. More preferably, the entire statement is logged together with the session information and commands.
  • The operation unit 64 compares the received behavior information with the learned behavior pattern model of each user and calculates the probability that the received behavior differs from the user's stored behavior pattern.
  • The behavior model engine 63 generates the behavior model 72, that is, the model of the user based on work behavior. When called, it rebuilds the behavior model from the user behavior information stored in the behavior log storage 71.
  • The behavior model uses a Bayesian model and a recurrent neural network (RNN) model. The behavior model engine 63 extracts the variables of the Bayesian model from the user behavior information, that is, the input commands, and trains the recurrent neural network model on the same input commands.
  • the behavior model engine 63 is periodically called to periodically reconstruct the user's work behavior based behavior model.
  • The operation unit 64 calculates the probability information of an abnormal user and passes it to the state channel 52. That is, the operation unit 64 calculates the abnormal-user probability by comparing the behavior model 72 with the user's behavior pattern.
  • The operation unit 64 calculates first probability information of the abnormal user from the Bayesian model and second probability information of the abnormal user from the recurrent neural network model (the deep learning model).
  • The operation unit 64 obtains the final probability information (the abnormal-user probability) by weighting and combining the first probability information and the second probability information.
  • The weights are determined by the amount of learning data, that is, the number of events or commands of each user. When the training data is relatively small (about 100 to 1,000 input items), the weight of the first probability information is made greater than the weight of the second probability information; when it is relatively large, the weight of the first probability information is made smaller than that of the second.
  • A deep learning model requires that enough data be accumulated for each user. Using the deep learning model before enough data has accumulated may not produce correct results; that is, a cold start problem occurs.
  • The Bayesian model can discern abnormal users even with relatively little input data. Therefore, while initial data is scarce, more weight is given to the Bayesian model, and as data accumulates the weight of the deep learning model is increased. Once enough data has accumulated, the ratio between the deep learning model and the Bayesian model can be adjusted to make the best decision.
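The shift of weight from the Bayesian model to the deep learning model can be sketched as a simple blend. The source only states the direction of the shift; the linear ramp, the end weights of 0.9 and 0.1, and the use of 100 and 1,000 events as ramp endpoints are assumptions made for illustration, as are the function names.

```python
def bayes_weight(num_events: int, low: int = 100, high: int = 1000) -> float:
    """Weight given to the Bayesian probability; shrinks as training data grows."""
    if num_events <= low:
        return 0.9   # assumed weight while data is scarce (cold start)
    if num_events >= high:
        return 0.1   # assumed weight once plenty of data has accumulated
    fraction = (num_events - low) / (high - low)
    return 0.9 - 0.8 * fraction  # assumed linear ramp between the two ends


def final_probability(p_bayes: float, p_deep: float, num_events: int) -> float:
    """Blend the first (Bayesian) and second (deep learning) probabilities."""
    w = bayes_weight(num_events)
    return w * p_bayes + (1.0 - w) * p_deep
```

Because the two weights always sum to 1, the blended value stays a valid probability whenever both inputs are.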
  • The model cache 73 is a medium (or cache) that temporarily holds the data needed to compare the behavior model with the user's behavior pattern. In particular, the behavior model to be compared is kept in the model cache and contrasted with the user's behavior pattern.
  • the model cache allocates and uses a certain amount of cache space for abnormal user detection in real time. This speeds up searches for commonly used models.
  • The abnormality determination unit 35 is a decision support system and determines whether a user is abnormal based on the defined policy. It receives the abnormal-user probability information from the state channel 52 and uses it to make that determination.
  • the abnormality determination unit 35 requests an appropriate sanction (warning, administrator notification, session blocking) when the applied threshold is exceeded.
  • the abnormality determination unit 35 requests the gateway server or the relay module 38 to block the session based on the defined policy.
  • The thresholds on the abnormal-user probability may be set as follows.
  • At a first threshold, a notification signal is sent to inform the administrator of the abnormal user. If the probability is 80% or more, the user's currently connected session is blocked to stop the work in progress. If the probability is more than 90%, the user's access is blocked. Each threshold is set in advance, and the abnormality determination unit 35 automatically performs the corresponding predetermined security tasks when the abnormal-user probability exceeds each threshold.
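A minimal sketch of such a tiered policy is shown below. The 80% and 90% thresholds follow the text; the notification threshold is not given in the source, so the `alert_at=0.7` default is a placeholder assumption, and the function name and returned labels are hypothetical.

```python
def decide_action(
    probability: float,
    alert_at: float = 0.7,          # assumption: the source does not state this value
    block_session_at: float = 0.8,  # "80% or more" blocks the current session
    block_user_at: float = 0.9,     # "more than 90%" blocks the user's access
) -> str:
    """Map an abnormal-user probability to the strongest applicable sanction."""
    if probability > block_user_at:
        return "block_user"
    if probability >= block_session_at:
        return "block_session"
    if probability >= alert_at:
        return "alert_admin"
    return "allow"
```

Checking the thresholds from strictest to mildest ensures that a single probability triggers exactly one action.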
  • The relay module 38 or the statement extracting unit 32 operates in an environment in which communication data is relayed between the user terminal 10 and the managed server 40.
  • An environment in which a user or a user terminal 10 performs remote access work, using a CLI protocol, on a managed server 40 for which access has been authorized through the access control system 30 is described below as an embodiment.
  • the user or user terminal 10 inputs a statement to work on the server 40.
  • the relay module 38 or the statement extracting unit 32 transmits the received statement to the server 40 and transmits the response of the server 40 to the user terminal 10 again.
  • the statement extracting unit 32 extracts the inputted characters and accumulates them in the internal memory until the Enter key input indicating execution of the command is received.
  • the statement extracting unit 32 or the relay module 38 suspends transmission of the corresponding key input to the server 40, determines whether the statement accumulated in memory may be executed under the permission policy assigned to the user, and decides accordingly whether to transmit it to the server 40.
  • the relay module 38 does not immediately determine whether execution is permitted; instead, through a command confirmation process, it extracts the final command or statement by communicating with the server 40 and analyzing the response data. For example, when the accumulated user command string is “/usr/bin/rm”, the relay module 38 extracts “rm”, the final command string that will actually be executed. It then extracts the actual statement from the extracted final command, options, and arguments.
  • a control character is a character used for control, such as a tab, an arrow key, a key combination with a control key (Ctrl, Alt), or a function key; that is, a character that is not a text character.
  • the input statement character is accumulated in the input statement cumulative string (S40).
  • a statement cumulative string is a string created by accumulating the input statement characters. The cumulative statement string is accumulated before the enter key character (or command completion character) is input.
  • the relay module 38 returns the cumulative statement string to the user terminal 10 and outputs the corresponding character string on the shell of the user terminal 10. That is, the user can see that the statement string he entered is displayed in the shell.
  • the accumulated character string and the corresponding input control character are transmitted to the server 40 (S31), and the statement string reflecting the control character is received from the server 40 (S32).
  • the reflected statement string is reflected in the input statement cumulative string (S40).
  • for example, suppose the accumulated string is "ren" and the user then enters a tab character.
  • when the character string "ren[Tab]" is transmitted to the server 40, the character string "rename" is received from the server 40. That is, "rename" is the string in which the control character [Tab] has been reflected in "ren".
  • the reflected cumulative string is returned to the user terminal 10, and the reflected cumulative string is displayed on the shell.
  • the user sees "rename" displayed on the screen (in a shell).
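The accumulation steps above (plain characters appended, control characters resolved by the server, Enter ending the statement) can be sketched as below. This is an illustrative sketch only: `StatementAccumulator`, `feed`, and the `ask_server` callback are hypothetical names, and `fake_server` is a toy stand-in that reproduces just the "ren" + Tab example.

```python
from typing import Callable, Optional

ENTER = "\r"
TAB = "\t"


class StatementAccumulator:
    """Accumulates typed characters into a cumulative statement string."""

    def __init__(self, ask_server: Callable[[str, str], str]):
        # ask_server(cumulative, control_char) stands in for the round trip
        # to the server (S31/S32) and returns the string with the control
        # character reflected. Hypothetical callback, not from the source.
        self._ask_server = ask_server
        self.cumulative = ""

    def feed(self, ch: str) -> Optional[str]:
        """Process one character; return the statement on Enter, else None."""
        if ch == ENTER:
            statement, self.cumulative = self.cumulative, ""
            return statement
        if ch.isprintable():
            self.cumulative += ch  # plain text character (S40)
        else:
            # Control character: let the server resolve it.
            self.cumulative = self._ask_server(self.cumulative, ch)
        return None


def fake_server(cumulative: str, control: str) -> str:
    """Toy stand-in for the server: Tab completes "ren" to "rename"."""
    if control == TAB and cumulative == "ren":
        return "rename"
    return cumulative
```

Feeding the characters `r`, `e`, `n`, Tab leaves `cumulative` equal to "rename", and a following Enter returns that string as the statement.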
  • the statement checking process S50 extracts, from the input cumulative string or statement string, the statement that will actually be executed. That is, the relay module 38 confirms the final statement intended by the user by having the actual server 40 evaluate it.
  • the statement checking process S50 sends requests to the server 40 and obtains the results using system commands such as an echo command, a linked command extraction (realpath) command, and a command name extraction (basename) command.
  • an executable checking (which) instruction may be additionally used.
  • an echo command is applied (S51).
  • the echo command converts a variable-processed command, a command containing wildcard characters, or a command containing a history reference into the command that will actually be executed.
  • the echo command replaces newlines, spaces, and so on.
  • when the user presses the Enter key to execute a statement, "echo" + [input command] is sent to the server 40 to check the statement characters accumulated in memory, and the response is received from the server 40. The echo command writes its arguments to standard output, separated by spaces and terminated by a newline.
  • the relay module 38 analyzes the received message and replaces variable-processed, wildcard, and history statements with the actual statements.
  • an executable checking (which) command may be performed.
  • the executable check (which) command is used to check whether a corresponding cumulative statement string is executable.
  • the command “which [command]” is sent to the server to check whether the replaced command is a command that can be executed on the server.
  • the linked command extraction (realpath) command is applied (S52).
  • the linked command extraction (realpath) command returns the actual command or statement linked by a symbolic link.
  • the command name extraction (basename) command returns the name of the command that is actually executed, excluding the path, if the command contains a path.
  • the final intended command can be extracted by sending the “basename” command to the server to obtain the command characters, excluding the path, from a string that includes the full path of the executable file.
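The echo, which, realpath, and basename checks described above can be chained as in the following sketch. It is an assumption-laden illustration rather than the patented procedure: `confirm_final_command` and the `run` callback (which executes one command line on the managed server and returns its standard output) are hypothetical, and `fake_run` only returns canned replies imitating the "ssf" example, with `/usr/local/bin/ssf` being an invented link location.

```python
from typing import Callable


def confirm_final_command(word: str, run: Callable[[str], str]) -> str:
    """Resolve the command word of a statement to its final command name.

    run(command_line) is a hypothetical callback that executes one command
    line on the managed server and returns its standard output.
    """
    expanded = run(f"echo {word}").strip()       # S51: variables, wildcards, history
    located = run(f"which {expanded}").strip()   # optional executable check
    if not located:
        raise LookupError(f"not executable on server: {expanded!r}")
    target = run(f"realpath {located}").strip()  # S52: follow symbolic links
    return run(f"basename {target}").strip()     # drop the directory part


def fake_run(command_line: str) -> str:
    """Canned replies imitating the "ssf" example (ssf links to /usr/bin/ssh)."""
    replies = {
        "echo ssf": "ssf\n",
        "which ssf": "/usr/local/bin/ssf\n",  # invented location for the link
        "realpath /usr/local/bin/ssf": "/usr/bin/ssh\n",
        "basename /usr/bin/ssh": "ssh\n",
    }
    return replies.get(command_line, "")
```

With the canned replies, `confirm_final_command("ssf", fake_run)` yields "ssh"; a real deployment would also need to quote the user's input before placing it on a command line, which this sketch does not attempt.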
  • the output is displayed as "rename" on the screen of the user terminal.
  • the cumulative statement string according to the prior art is "ren[TAB]", but the cumulative statement string according to the present invention is "rename".
  • the relay module 38 does not accumulate the control character itself; it transmits the character to the server, analyzes the response value, and accumulates the resulting statement.
  • FIG. 8B is an example of using a history command, in which a previously used statement is executed by referring to its statement number. In the example of FIG. 8B, the number "546" represents the statement "/usr/bin/ssh", and the command string "/usr/bin/ssh" is returned from the server 40.
  • FIG. 8C is an example of using a statement using wild chars (*,?). Instead of typing the entire statement, you can use wild chars to run similar executables that exist in that directory.
  • FIG. 8D shows an example of using a variable processed statement.
  • the variable a holds the statement "rm".
  • the server returns "rm -rf”.
  • the next step is a check using the linked command extraction (realpath) command.
  • "ssf" is a command linked to the command "/usr/bin/ssh". So when "ssf" is run, the linked statement "/usr/bin/ssh" is executed.
  • for the "ssf" statement string, the linked command "/usr/bin/ssh" is returned. The reply contains the location and the name of the actual executable.
  • an example of confirming the final command using command name extraction (basename) is shown in FIG. 8F.
  • when the command string "/usr/bin/ssh" is sent to the server with the command name extraction (basename) command, the command name "ssh" is returned.
  • the command includes the option part of the command. That is, a statement consists of a command and its arguments (or parameters), and only the command is extracted, without the argument values.
  • when a command includes a path, only the bare command is extracted, excluding the path. That is, when a statement is entered, the path (directory or folder) in which the command is located may be written together with the name of the command; this path is excluded from the command or statement.
  • the final command therefore consists only of the command (or command name) and command options. In this way, the learning noise can be reduced by reducing the statement to instructions.
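The reduction of a statement to a command (name plus options, without path or arguments) can be sketched as follows. The function name `statement_to_command` is hypothetical, and the rule that an option is any later token starting with "-" is a simplifying assumption; options that take a separate value (such as "-o file") are not handled.

```python
import posixpath
import shlex


def statement_to_command(statement: str) -> str:
    """Keep the command name (without path) and its options; drop arguments.

    Assumption: an option is any token after the command that starts
    with "-". Options that take a separate value are not handled.
    """
    tokens = shlex.split(statement)
    if not tokens:
        return ""
    name = posixpath.basename(tokens[0])
    options = [t for t in tokens[1:] if t.startswith("-") and t != "-"]
    return " ".join([name, *options])
```

For example, "/usr/bin/rm -rf /tmp/work" reduces to "rm -rf". The option order is preserved as typed, so "ls -al" and "ls -la" remain distinct, which matters for the habit-based comparison of option order.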
  • an example of deriving a command from a statement is shown in FIG. 9.
  • whether a user is an abnormal user with respect to a series of commands input by a user in one session is determined based on the following criteria.
  • whether the options of each command and the input order of the option strings are similar. That is, it is determined whether each command is used with similar options.
  • each user enters a command and its option string in a certain order according to the user's own habits. For example, a user who habitually types "ls -al" does not switch the option order to "ls -la".
  • each user has a series of instructions that he or she performs regularly to perform a particular task.
  • the task may be performed according to the execution of another series of commands, but in general, the user performs the task through a command task that is familiar to the user.
  • by analyzing the series of commands used to perform a specific task, it may be determined whether the behavior is similar to the user's.
  • the behavior model engine 63 receives session information and command data as input and builds a behavior model through learning.
  • the input data of the behavior model includes session information and command data, and includes user identification information (user ID, etc.) and session identification information (ID of a device access session).
  • the command input time may be further added.
  • an example of the input data of the behavior model is shown in FIG. 10.
  • “_id” represents an object unique value in a DB
  • “user_id” represents a user ID connected to equipment.
  • “command” represents a command input after connecting to the equipment.
  • “connection_id” represents the ID of the equipment connection session.
  • the equipment connection session ID can be used to look up related data such as equipment information and connection protocol information.
  • “datetime” represents the input time of the statement.
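The input record fields listed above can be modeled as a small typed structure. This is a sketch under assumptions: the class name `CommandRecord`, the renamed attributes `object_id` and `entered_at`, and the choice of an ISO 8601 string for the stored "datetime" value are all hypothetical; only the five field names come from the description.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CommandRecord:
    object_id: str        # "_id": unique object value in the DB
    user_id: str          # user ID connected to the equipment
    command: str          # command input after connecting
    connection_id: str    # ID of the equipment connection session
    entered_at: datetime  # "datetime": input time of the statement

    @classmethod
    def from_document(cls, doc: dict) -> "CommandRecord":
        """Build a record from a stored document.

        Assumes "datetime" is an ISO 8601 string; the source does not
        specify the storage format.
        """
        return cls(
            object_id=str(doc["_id"]),
            user_id=doc["user_id"],
            command=doc["command"],
            connection_id=doc["connection_id"],
            entered_at=datetime.fromisoformat(doc["datetime"]),
        )
```

A document such as `{"_id": "1", "user_id": "zz_user", "command": "ls -al", "connection_id": "c-1", "datetime": "2018-03-27T10:15:00"}` (values made up for illustration) converts directly with `CommandRecord.from_document`.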
  • an example of the output data of the behavior model is shown in FIG. 11.
  • as the output data (or result data) of the behavior model, a command usage pattern is extracted for each user.
  • “_id” is an object unique value in the DB
  • “num_distinct_commands” represents the number of unique (non-duplicated) commands.
  • “user_id” is a user ID and indicates which user's command learning model.
  • “unique_commands” represents the weight for each command modeled through the training data. The command weight is a weight value for determining whether there is an abnormal user.
  • the instruction is trained by the following [Equation 1] to build a model (or pattern) for each user.
  • Equation 1 shows which options are used with each command, and how often each command is used.
  • the probability at each command line c for a specific user u is calculated by the above equation.
  • α is a predetermined constant serving as a pseudocount.
  • A is the number of distinct commands.
  • the pseudocount α ensures that the numerator is never mathematically zero.
  • it can also be regarded as the sensitivity applied when a new command that is not in the training data is entered.
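The role of the pseudocount can be illustrated with a short sketch. Equation 1 itself is not reproduced in this text, so the code assumes it has the standard additive-smoothing form (count of the command plus α, divided by the number of training commands plus α times A); the function names and the default α = 1.0 are hypothetical.

```python
from collections import Counter
from typing import Iterable


def train_user_model(commands: Iterable[str]) -> Counter:
    """Count how often each command appears in one user's training data."""
    return Counter(commands)


def command_probability(model: Counter, command: str,
                        num_distinct: int, alpha: float = 1.0) -> float:
    """Smoothed probability of a command under one user's model.

    Assumed form: (count + alpha) / (total + alpha * num_distinct).
    The default alpha is illustrative; the source gives no value.
    """
    total = sum(model.values())
    return (model[command] + alpha) / (total + alpha * num_distinct)
```

With training data `["ls -al", "ls -al", "cd", "rm -rf"]`, A = 3, and α = 1, the command "ls -al" gets (2 + 1) / (4 + 3) = 3/7, while a command never seen in training still gets 1/7 rather than zero.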
  • the behavioral model for each user is referred to (S11).
  • the behavior model is generated in advance by the behavior model engine 63. That is, the behavior model engine 63 learns about N users, so that when a specific command is entered, it can be determined which user the command most likely came from.
  • N behavior models M, one per user, are learned in advance, and the calculation unit 64 loads them from a database or the like so that it can refer to them (S11).
  • the input command is the actual command extracted by the command extractor 32 or the behavior adjuster 61. That is, the entered command consists of the command name and options.
  • the calculation unit 64 applies the input command to the behavior model for each user, and obtains the probability P for each user (S13). That is, the form of the result consists of a combination of (user ID, probability).
  • as many per-user probabilities are calculated as there are users (e.g., N).
  • the probability is not an absolute value but a relative probability that the current user is each user.
  • the user-specific probability is calculated by the calculation unit 64.
  • ranks are assigned according to sizes of N user-specific probabilities (S14).
  • the probability of each user is ranked by sorting the probability of each user in descending order. That is, the probability for each user represents the probability that the current user is the corresponding user.
  • the calculated probability for each user is as follows.
  • the command “ls? F? L? L” used by the current user “zz_user” is evaluated against the learning models of all users. Examples of the results are as described above.
  • the resultant value as described above is transmitted to the abnormality determination unit 35.
  • the determination is performed by the abnormality determination unit 35.
  • the order of users sorted in descending order indicates the predicted value of the probability that the user is correct for the input command string.
  • the abnormality determination unit 35 determines that the user is an abnormal user when the probability of being the corresponding user is below a predetermined rank with respect to a command input by a specific user.
  • the abnormality determination unit 35 treats the user who entered the command as the correct user when that user (or user ID) is included in the top N users applied according to the policy.
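The rank-based decision described above can be sketched as follows. The function names `rank_users` and `is_abnormal`, the parameter `top_n`, and the example user IDs other than "zz_user" are hypothetical, and the probability values in the usage note are made up for illustration.

```python
from typing import Dict, List, Tuple


def rank_users(per_user_probability: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort (user ID, probability) pairs in descending order of probability (S14)."""
    return sorted(per_user_probability.items(), key=lambda kv: kv[1], reverse=True)


def is_abnormal(current_user: str,
                per_user_probability: Dict[str, float],
                top_n: int) -> bool:
    """Abnormal if the current user is not among the top_n ranked users."""
    ranked = rank_users(per_user_probability)
    top_users = [user for user, _ in ranked[:top_n]]
    return current_user not in top_users
```

Given `{"zz_user": 0.02, "aa_user": 0.40, "bb_user": 0.10}` and `top_n=2`, "zz_user" ranks third and is flagged as abnormal, while "aa_user" is accepted.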
  • FIG. 13 is a conceptual diagram of how, when a command is input, the probability that the user who entered it is the correct user is calculated.
  • the input command is evaluated against the trained models of all users using Equation 1, and the probability that the user is the correct user is produced as the result value.

Abstract

The present invention relates to a server access control system for detecting an abnormal user on the basis of learning of inputted commands, the system: learning commands used by a user when the user remotely accesses a server; extracting behavior patterns of the user through learning; and comparing an inputted command with a learned behavior pattern when a command input event of the user occurs, thereby determining whether abnormality occurs and controlling the same, and the system comprises: a relay module for extracting session information and a command statement from a packet transmitted from a user terminal, and relaying the command statement inputted between the user terminal and the server, or the results of the server; an abnormality detection unit, which extracts commands by receiving the session information and the command statement from the relay module, learns and generates a behavior model that exhibits command input patterns of the user, and applies a current command of the user to the behavior model so as to calculate the probability of an abnormal user; and an abnormality determination unit, which receives the probability of the abnormal user from the abnormality detection unit, and determines to warn a manager or disconnect a session or a user by using the probability of the abnormal user according to a policy determined in advance. According to the server access control system, whether a user is an authorized user is estimated by comparing whether behaviors of an accessing person are similar to the authorized user's command use patterns having been extracted in advance, such that attack behaviors by a hacker or a malicious user who seizes an account and proceeds through normal authentication can be defended against.

Description

Server access control system for detecting abnormal users based on learning of inputted commands for enhanced security
The present invention relates to a server access control system that detects abnormal users based on learning of inputted commands. For the security management of key servers managed by an institution, the system learns the commands that a user uses when remotely accessing a server, extracts the user's behavior pattern through this learning, and, when a command input event of the user occurs, compares the inputted command with the learned behavior pattern to determine whether an abnormality exists and to control it.
In addition, the present invention relates to a server access control system that detects abnormal users based on learning of inputted commands and that can prevent intrusions by building on the observation that a hacker or a user with malicious intent, as opposed to a normal user, performs tasks that differ from the work pattern previously exhibited by the legitimate user.
In general, a server access control system analyzes packets passing through an access control gateway server to extract the command input by a user, and checks whether the extracted command is permitted in order to control it. For example, the command is compared with a list of prohibited commands, applied to that user, that may pose a security threat; if the command is on the list, it is discarded without being sent to the server. This strengthens the security of the server.
In addition, a server access control system according to the prior art provides a security function that first performs user authentication on the connecting user and, after authentication, allows the user to access and work on equipment within the granted authority.
However, as shown in FIG. 1, if a hacker or malicious user steals administrator account information and then performs user authentication with that account information, the authentication succeeds normally. In this case, the malicious user can access the server with the administrator's authority and can perform malicious acts (information leakage, destruction, and so on) within the granted authority. That is, leakage of administrator account information can cause security problems that cannot be controlled.
Therefore, a technology is needed that goes beyond the existing primary, gateway-style authentication and controls access by determining, based on the user's work behavior, whether the behavior is the user's normal behavior.
An object of the present invention is to solve the problems described above by providing a server access control system that detects abnormal users based on learning of inputted commands: for the security management of key servers managed by an institution, the system learns the commands that a user uses when remotely accessing a server, extracts the user's behavior pattern through this learning, and, when a command input event of the user occurs, compares the inputted command with the learned behavior pattern to determine whether an abnormality exists and to control it.
In particular, an object of the present invention is to provide a server access control system that detects abnormal users based on learning of inputted commands and that, in conjunction with a server access control system that controls users' access to an institution's key servers, collects the command data used when users access the servers and extracts a work behavior pattern for each user using machine learning techniques.
Another object of the present invention is to provide a server access control system that detects abnormal users based on learning of inputted commands and that, when a user connects to a server via the access control system and inputs commands, determines whether the user is an abnormal user based on the extracted work behavior pattern information and, if so, automatically applies control to that user according to policy.
Another object of the present invention is to provide a server access control system that detects abnormal users based on learning of inputted commands and that, in order to defend against intrusions by hackers who have stolen account information and against intrusions by malicious users, builds a learning model from users' server connections and command input values in the access control system and uses the model to verify whether the currently connected user is really the user in question.
To achieve the above objects, the present invention relates to a server access control system that detects abnormal users based on learning of inputted commands, in which a user terminal and a server are connected by a network and the system is installed as a gateway on the network between the user terminal and the server, the system comprising: a relay module that extracts session information and statements from packets transmitted from the user terminal and relays the statements input between the user terminal and the server, or the results of the server; an abnormality detection unit that receives the session information and statements from the relay module, extracts commands, trains and generates a behavior model representing the user's command input pattern, and applies the user's current command to the behavior model to calculate the probability of an abnormal user; and an abnormality determination unit that receives the probability of the abnormal user from the abnormality detection unit and, according to a predetermined policy, uses that probability to decide on a warning to the administrator, session blocking, or user blocking.
In addition, in the server access control system that detects abnormal users based on learning of inputted commands, the abnormality detection unit comprises: an event channel that receives statements from the relay module; a status channel that provides the abnormal-user probability information to the abnormality determination unit; a behavior adjuster that extracts commands from the statements; a behavior model engine that generates a behavior model for each user through learning; and a calculation unit that receives a command from the behavior adjuster and calculates, using the behavior models, the probability for each user with respect to the received command.
In addition, in the server access control system that detects abnormal users based on learning of inputted commands, the behavior adjuster records and accumulates the commands in a behavior log storage, and the behavior model engine continuously trains and updates the behavior model using the command data accumulated in the behavior log storage.
In addition, in the server access control system that detects abnormal users based on learning of inputted commands, the behavior model consists of a Bayesian model and a deep learning model, and the calculation unit calculates a first probability from the Bayesian model and a second probability from the deep learning model, and extracts a final probability by adjusting the ratio between the first probability and the second probability using weights.
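One way to read the weighted ratio adjustment is as a convex combination of the two probabilities. The source does not give the formula, so the single weight w and the symbols P_1 (Bayesian model), P_2 (deep learning model), and P_final below are assumptions for illustration:

```latex
P_{\text{final}} = w \, P_{1} + (1 - w) \, P_{2}, \qquad 0 \le w \le 1
```

Under this reading, w = 1 relies on the Bayesian model alone and w = 0 relies on the deep learning model alone.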
In addition, in the server access control system that detects abnormal users based on learning of inputted commands, the relay module receives a statement character from the user terminal; if the statement character is not an Enter character, it accumulates the statement character to generate a cumulative statement string; if the statement character is an Enter character, it extracts from the cumulative string the final statement that will actually be executed; and if the statement character is a control character, it transmits the accumulated cumulative statement string and the control character to the server, receives the string in which the control character is reflected, and accumulates the reflected string to generate the cumulative string.
In addition, in the server access control system that detects abnormal users based on learning of inputted commands, the relay module transmits to the server one or more of an echo command, a linked command extraction (realpath) command, and a command name extraction (basename) command, with the cumulative string as the argument of that command, and extracts the final command using the result of that command received from the server; the echo command is a command that converts a variable-processed command, a command containing wildcard characters, or a command containing history into the command that will actually be executed and returns it; the linked command extraction (realpath) command is a command that returns the actual command linked by a symbolic link; and the command name extraction (basename) command is a command that, when a command includes a path, returns the name of the actual execution command excluding the path.
In addition, in the server access control system that detects abnormal users based on learning of inputted commands, the command is extracted from the statement so as to consist only of the command name and the command options.
In addition, in the server access control system that detects abnormal users based on learning of inputted commands, the behavior model engine trains the behavior model so as to reflect whether the options of each command and the input order of the option strings are similar, whether the usage frequency of each command is similar, and whether the usage pattern given by the order in which commands are used is similar.
In addition, in the server access control system that detects abnormal users based on learning of inputted commands, the behavior model engine obtains the behavior model using the following [Equation 1].
[Equation 1]
[Equation 1 appears in the original publication as image PCTKR2018003549-appb-I000001.]
Here, Pc,u denotes the probability of command c for user u, Training Count is the number of training data, Training Data Length is the length of the training commands, α is a predetermined constant serving as a pseudocount, and A is the number of distinct commands.
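The equation image is not reproduced in this text. A form consistent with the symbol definitions above, assuming Equation 1 is standard additive (pseudocount) smoothing and that Training Count is taken per command c and user u while Training Data Length is taken per user u, would be:

```latex
P_{c,u} = \frac{\text{Training Count}_{c,u} + \alpha}{\text{Training Data Length}_{u} + \alpha \cdot A}
```

This is a reconstruction under those assumptions, not a transcription of the published equation.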
In addition, in the server access control system that detects abnormal users based on learning of inputted commands, N behavior models, one per user, are trained and built in advance; the calculation unit applies the command input by a specific user to all N per-user behavior models to obtain N per-user probabilities and sorts the N per-user probabilities in descending order to rank them; and the abnormality determination unit determines whether the user is an abnormal user according to the rank of that user's probability for the input command, determining that the user is an abnormal user when, for a command input by a specific user, the probability of being that user falls below a predetermined rank.
As described above, the server access control system that detects abnormal users based on learning of inputted commands according to the present invention estimates whether the person connecting is the legitimate user by comparing whether that person's behavior resembles the legitimate user's command usage pattern extracted in advance, and can thereby defend even against attacks in which a hacker or malicious user steals an account and passes normal authentication.
FIG. 1 is a diagram showing the security vulnerability that arises when authentication information is stolen in an access control system according to the prior art.
FIG. 2 is a configuration diagram of the overall system for carrying out the present invention.
FIG. 3 is a block diagram of the configuration of a server access control system that detects abnormal users based on learning of inputted commands for enhanced security, according to an embodiment of the present invention.
FIG. 4 is a block diagram of the detailed configuration of the abnormality detection unit for real-time abnormal user detection according to an embodiment of the present invention.
FIG. 5 is a configuration diagram of a behavior model for detecting abnormal users by machine learning according to an embodiment of the present invention.
FIG. 6 is a schematic flowchart illustrating a process of extracting a statement according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating a process of extracting a statement according to an embodiment of the present invention.
FIG. 8 is an exemplary diagram of a statement checking process according to an embodiment of the present invention.
FIG. 9 is an exemplary diagram of the result of deriving a command from a statement according to an embodiment of the present invention.
FIG. 10 is an exemplary diagram of the input data of the behavior model engine according to an embodiment of the present invention.
FIG. 11 is an exemplary diagram of the output data of the behavior model engine according to an embodiment of the present invention.
FIG. 12 is a flowchart illustrating a method of determining an abnormal user according to an embodiment of the present invention.
FIG. 13 is a configuration diagram for training the behavior model according to an embodiment of the present invention.
Hereinafter, specific details for carrying out the present invention are described with reference to the drawings.
In describing the present invention, identical parts are given identical reference numerals, and repeated descriptions of them are omitted.
First, examples of the configuration of the overall system for carrying out the present invention are described with reference to FIG. 2.
As shown in FIG. 2, the overall system for carrying out the present invention consists of a user terminal (10), a server (40), and an access control system (30) that acts as a gateway between the user terminal (10) and the server (40). The user terminal (10) and the server (40) are connected through a network (not shown).
The user terminal (10) is a computing terminal used by the user, such as a PC, laptop, smartphone, or tablet PC. The user terminal (10) connects to the server (40) through a remote access protocol, Telnet (TELNET) or SSH (Secure Shell), and performs work through a shell installed on the server (40).
A shell is a command interpreter that translates the commands a user enters and passes them to the server's kernel. In other words, a shell is an interactive command interpreter that interprets the commands entered by the user so that the server kernel can process them. The shell uses a character-based user interface, specifically a command-line interface (CLI), through which the user terminal (10) and the server (40) interact over a text terminal. Various shells may be used, such as the Bourne shell, Korn shell, bash shell, C shell, and tcsh shell.
Accordingly, the user uses the services of the server (40) through the user terminal (10). To do so, the user types a series of characters, that is, a command string, into the shell and presses Enter, which passes the entered statement or string to the server (40) for execution. The shell on the user terminal (10) then receives the result of the entered command from the server (40) and displays it on the screen as text or as a string. Hereinafter, the entered command string is called a statement.
Next, the server (40) receives a statement from the user terminal (10) over the network (not shown), executes the command in that statement, and sends the result to the user terminal (10). A statement consists of a series of characters, that is, a string. These characters or strings are transmitted over the network: a session is established between the user terminal (10) or shell and the server (40), and characters or strings are carried in data packets within that session.
The server (40) receives the characters entered by the user and, preferably, when an Enter character arrives, recognizes the series of characters or the string entered up to that point as one statement (or command string). The Enter character is the character indicating that entry of a statement is complete. Hereinafter it is called the Enter character (or Enter key character, Enter key) or the statement-completion character.
Preferably, the server (40) accumulates the characters entered before Enter to build a string (or statement), and may also parse the partially built string and return the result. The returned result can be displayed in the shell on the user terminal (10). That is, each time a character is entered, the server (40) returns the command string (or statement) corresponding to the input as text so that it is displayed on the user terminal (10) or in the shell. When the entered character cannot be displayed in the shell, as with a control character, the server (40) returns the resulting string that corresponds to it for display.
More preferably, when the server (40) receives a control key (or control character), such as Tab, an arrow key, or a combination with a modifier key (Ctrl, Alt), rather than an ordinary character key such as a letter, digit, or symbol, it interprets the control character and converts it into the corresponding string of ordinary characters (a text string). Here, ordinary characters are defined as characters that can be displayed as text, such as letters, digits, and symbols, and control characters are defined as characters used for control, such as Tab, arrow keys, and modifier-key combinations.
Preferably, the server (40) runs a Linux or Unix-family operating system (OS) and provides remote access services over CLI (command-line interface) remote access protocols (TELNET, SSH). The access control system (30) controls the statements entered through these remote access services.
Next, the access control system (30) is a gateway installed on the network (not shown) between the user terminal (10) and the server (40); it relays or blocks traffic between the user terminal (10) and the server (40).
That is, the access control system (30) receives command characters or strings (statements) from the user terminal (10) and forwards them to the server (40), and receives results from the server (40) and forwards them to the user terminal (10).
In doing so, the access control system (30) analyzes the command characters or strings received from the user terminal (10) and decides whether to block the command contained in the command string (or statement). Depending on that decision, it either transmits the command string (or statement) to the server (40) or blocks it.
Next, a server access control system that detects abnormal users based on learning of input commands to enhance security, according to an embodiment of the present invention, is described with reference to FIG. 3.
The present invention collects and learns the history of work performed on servers via the access control system, that is, command information, and extracts a work behavior pattern for each user. When a user accesses a server and performs work, the access control system compares the statements the user enters against the extracted pattern in real time to determine whether the user is abnormal. In other words, the user is authenticated continuously by means of the pattern. If the user is determined to be abnormal, the user's session is blocked, which prevents potential threats.
As shown in FIG. 3, the server access control system (30) according to the present invention consists of an input/response transmission unit (31), a statement extraction unit (32), a packet relay unit (33), an equipment transmission unit (34), a policy determination unit (35), hereinafter also called the anomaly determination unit (35), and an anomaly detection unit (36).
Assume a situation in which a user accesses a specific server (40) via the server access control system (30) and performs work.
First, the input/response transmission unit (31) receives input from the user or user terminal (10) and passes it to the relay module (38), or sends a response received from the relay module (38) to the user or user terminal (10).
The relay module (38) connects the input/response transmission unit (31) and the equipment transmission unit (34) and relays packets or data. It consists of the statement extraction unit (32), which extracts statements from packets, and the packet relay unit (33), which relays packets or data.
In particular, the statement extraction unit (32) analyzes packets received from the user terminal (10) and extracts statements or the user's session information.
A session here means the TCP/IP session that the user terminal (10) establishes to connect remotely to the server (40). The user connects remotely to the server (40) using a protocol such as Telnet (TELNET) or Secure Shell (SSH). During one session, the user enters many commands to carry out the desired work.
Session information comprises information about the user's equipment connection session: user identification (such as a user ID), connected-equipment identification (such as an equipment ID), connection protocol, connection account, connection start time, connection end time, and so on.
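For illustration only, the session information listed above could be held in a record such as the following. The class name `SessionInfo` and its field names are assumptions chosen for this sketch; the specification names the items but not their representation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SessionInfo:
    """Hypothetical record of one equipment connection session."""
    user_id: str                          # user identification
    device_id: str                        # connected-equipment identification
    protocol: str                         # connection protocol, e.g. "SSH" or "TELNET"
    account: str                          # account used on the connected equipment
    started_at: datetime                  # connection start time
    ended_at: Optional[datetime] = None   # connection end time; None while open

    def is_open(self) -> bool:
        """A session with no end time is still in progress."""
        return self.ended_at is None
```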
When the statement extraction unit (32) extracts the user's session information and a statement from a packet, it sends the session information and the statement information to the anomaly detection unit (36). If neither the anomaly detection unit (36) nor the anomaly determination unit (35) detects an anomaly, the packet is relayed through the packet relay unit (33).
Next, the anomaly detection unit (36) extracts commands from statements, collects the series of commands belonging to one session, and analyzes the pattern of that series. It then compares the pattern with the user's previously learned behavior pattern (or command pattern). The result of the analysis is expressed as a degree of difference from the learned behavior pattern, or as an abnormal-user probability (the probability that the user is abnormal).
Preferably, the anomaly detection unit (36) collects the series of commands in a session, trains a behavior model on those commands, and uses the trained behavior model to derive the abnormal-user probability.
The anomaly detection unit (36) sends the derived abnormal-user probability to the anomaly determination unit (35).
The anomaly determination unit (35) receives the abnormal-user probability and determines, according to a predefined policy, whether the user is abnormal. That is, when the abnormal-user probability falls outside a predetermined threshold (for example, when it exceeds the threshold), the user is determined to be abnormal.
When the anomaly determination unit (35) determines that the user is abnormal, it imposes restrictions on the user's activity according to the defined policy, for example by notifying an administrator or blocking the session. This strengthens security.
That is, to check in real time whether the user is abnormal when a statement is entered, the access control system holds back transmission of the command to the managed server to which the user is actually connected, and user authentication is performed continuously by the anomaly detection unit (36) and the anomaly determination unit (35).
In particular, the anomaly determination unit (35) or the anomaly detection unit (36) does not produce a true/false verdict on the user's identity; it produces a probability that the user is the legitimate user. The decision support system (Decision Support System) then takes automatic action according to the defined policy when that probability crosses the threshold set by the policy.
Next, the detailed configuration of the anomaly detection unit (36) for real-time abnormal-user detection according to an embodiment of the present invention is described more specifically with reference to FIG. 4. FIG. 4 shows the detailed configuration of the anomaly detection unit (36) and the sequence of steps by which, inside the access control system, abnormal users are detected from usage patterns based on the user's real-time work behavior.
As shown in FIG. 4, the anomaly detection unit (36) according to an embodiment of the present invention is broadly divided into data pipes and a behavior engine.
First, the data pipe portion consists of an event channel (51), which carries the user's work events, and a status channel (52), which carries the abnormal-user probability.
The event channel (51) receives information about the user's work events (User, Device Session, Command). That is, when a user work event occurs, the gateway relay module (38) of the access control system passes the event information to the event channel (51). When an event arrives, the event channel (51) calls the behavior coordinator (61) and passes the information on.
Here, a work event means the entry of a statement by the user. The user types a series of characters and finally enters the Enter character (Enter key) that completes one statement. Once a statement has been completed and entered, a work event is generated for that statement.
The status channel (52) receives the calculated abnormal-user probability from the calculation unit (64) of the behavior engine. That is, the calculation unit (Calculator) (64) passes the calculated abnormal-user probability to the status channel (Status Channel) (52). When the probability arrives, the status channel (52) calls the anomaly determination unit (35) and passes the information on.
Next, the behavior engine includes the behavior coordinator (61), a data saver (62), a behavior model engine (63), and the calculation unit (64). For data storage, it also includes a behavior log storage (71), a behavior model (72), and a model cache (73).
First, the behavior coordinator (Behavior Coordinator) (61) receives the user's work event (User, Device Session, Command) from the event channel (51), directs that a log be stored for model construction, and calls the calculation unit (64), which detects abnormal users, to analyze the work event. That is, the behavior coordinator (61) passes the received event information to the data saver (Data Saver) (62) for storage and calls the calculation unit (Calculator) (64) for abnormal-user determination.
In particular, the behavior coordinator (61) takes the statement from the event channel (51), extracts the command from it, and stores the extracted command in the behavior log storage (71).
Next, the data saver (Data Saver) (62) stores the user's work event information received from the behavior coordinator (61) in the behavior log storage (71), so that it can be used to learn the user's work behavior.
Preferably, the behavior log storage (71) records and accumulates session information and the commands within each session. More preferably, the entire statement is logged together with the session information and commands.
Next, the calculation unit (Calculator) (64) compares the received behavior information with the behavior pattern model learned and extracted for each user, and calculates as a probability how far it differs from the user's stored behavior pattern (or behaviors).
The behavior model engine (63) is the engine that generates the behavior model (72); it builds a model based on the user's behavior. That is, when called, the behavior model engine (63) rebuilds the behavior model (a model based on the user's work behavior) from the user behavior information stored in the behavior log storage (71).
Preferably, as shown in FIG. 5, the behavior model uses a Bayesian model and a recurrent neural network (RNN) model. The behavior model engine (63) extracts the variables of the Bayesian model from the user behavior information, that is, the entered commands, and also trains the recurrent neural network model on the entered commands.
The behavior model engine (63) is called periodically, so the behavior model based on the user's work behavior is rebuilt periodically.
Next, the calculation unit (Calculator) (64) calculates the abnormal-user probability and passes it to the status channel (Status Channel). That is, the calculation unit (64) compares the behavior model (72) with the user's behavior patterns to compute the probability that the user is abnormal.
Preferably, the calculation unit (64) calculates a first probability of an abnormal user from the Bayesian model and a second probability from the recurrent neural network model (or deep learning model). It then combines the first and second probabilities in a weighted ratio to obtain the final probability (the abnormal-user probability).
Preferably, the weights are determined by the amount of training data, or by the number of events or commands for each user. When the training data is relatively small (100 to 1,000 inputs), the weight of the first probability is made larger than that of the second; when it is relatively large, the weight of the first probability is made smaller than that of the second.
A deep learning (Deep Learning) model requires that enough data has accumulated for each user. If the deep learning model is used before then, it may not produce sound results; this is the cold start problem (Cold Start Problem). A Bayesian model, by contrast, can distinguish abnormal users from relatively little input data. Therefore, while initial data is scarce, the Bayesian model is weighted more heavily, and the weight of the deep learning model is raised as data accumulates. Once enough data has accumulated, the ratio between the deep learning model and the Bayesian model can be adjusted to reach the best decision.
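The weighting described above can be sketched as follows, by way of example only. The function names, the weight values 0.8 and 0.2, the upper bound of 10,000 samples, and the linear transition between the two are all assumptions made for this sketch; the specification states only that the Bayesian model is weighted more heavily when the data is small (on the order of 100 to 1,000 inputs) and less heavily as data accumulates.

```python
def bayesian_weight(n_samples: int, small: int = 1000, large: int = 10000) -> float:
    """Weight given to the Bayesian (first) probability.

    Assumed schedule: 0.8 up to `small` samples, 0.2 from `large` samples on,
    and a linear transition in between.
    """
    if n_samples <= small:
        return 0.8
    if n_samples >= large:
        return 0.2
    fraction = (n_samples - small) / (large - small)
    return 0.8 - 0.6 * fraction


def combined_probability(p_bayes: float, p_rnn: float, n_samples: int) -> float:
    """Weighted combination of the first (Bayesian) and second (RNN) probabilities."""
    w = bayesian_weight(n_samples)
    return w * p_bayes + (1.0 - w) * p_rnn
```

With few samples the result stays close to the Bayesian estimate, which mitigates the cold start problem; with many samples it follows the recurrent neural network estimate.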
The model cache (73) is a medium (cache) that temporarily stores behavior models and user behavior patterns so that they can be compared. In particular, the behavior model to be compared is written to the model cache and compared there with the user's behavior pattern. The model cache is allocated a fixed amount of cache space for real-time abnormal-user detection, which speeds up lookup of frequently used models.
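One possible realization of such a fixed-capacity cache is a least-recently-used (LRU) cache, sketched below. The LRU eviction policy, the class name `ModelCache`, and the `loader` callback are assumptions for illustration; the specification states only that a fixed amount of cache space is allocated and that frequently used models are found faster.

```python
from collections import OrderedDict
from typing import Any, Callable


class ModelCache:
    """Hypothetical fixed-capacity cache of per-user behavior models (LRU eviction)."""

    def __init__(self, capacity: int, loader: Callable[[str], Any]):
        self._capacity = capacity
        self._loader = loader          # loads a model from slower storage on a miss
        self._items: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, user_id: str) -> Any:
        if user_id in self._items:
            # Cache hit: mark this model as most recently used.
            self._items.move_to_end(user_id)
            return self._items[user_id]
        # Cache miss: load the model, then evict the least recently used entry
        # if the cache has grown past its capacity.
        model = self._loader(user_id)
        self._items[user_id] = model
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)
        return model
```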
Meanwhile, the anomaly determination unit (35) is a decision support system (Decision Support System) that determines, according to the defined policy, whether the user is abnormal. It receives the abnormal-user probability from the status channel (Status Channel) (52) and uses it to make that determination.
In particular, when an applicable threshold is exceeded, the anomaly determination unit (35) requests an appropriate sanction (warning, administrator notification, session blocking).
When the user is judged abnormal and the applicable threshold is exceeded, the anomaly determination unit (35) requests, according to the defined policy, that the gateway server (Gateway Server) or relay module (38) block the session.
For example, thresholds on the abnormal-user probability can be set as follows.
When the abnormal-user probability is 70% or higher, a notification is sent to the administrator flagging the suspected abnormal user; when it is 80% or higher, the session through which the user is currently connected to perform work is blocked; and when it is 90% or higher, the user's access is blocked. With the thresholds set in advance in this way, the anomaly determination unit (35) automatically performs the predetermined series of (security) actions whenever the abnormal-user probability exceeds a threshold.
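The example thresholds above could be expressed as a policy table, sketched below. The action names and the function `actions_for` are hypothetical, and returning every action whose threshold is met (rather than only the most severe one) is an assumption of this sketch.

```python
from typing import List, Tuple

# Example policy from the description: (threshold, action), in ascending order.
# The action identifiers are hypothetical labels.
POLICY: List[Tuple[float, str]] = [
    (0.70, "notify_admin"),    # flag the suspected abnormal user to the administrator
    (0.80, "block_session"),   # block the session currently in use
    (0.90, "block_user"),      # block the user's access
]


def actions_for(abnormal_probability: float) -> List[str]:
    """Return every action whose threshold the probability meets or exceeds."""
    return [action for threshold, action in POLICY
            if abnormal_probability >= threshold]
```

For instance, a probability of 0.85 would yield the administrator notification and the session block, but not the user-level block.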
Next, the method by which the relay module (38) (or statement extraction unit) extracts statements, according to an embodiment of the present invention, is described with reference to FIGS. 6 and 7.
As shown in FIG. 6, the relay module (38) or statement extraction unit (32) operates in an environment where it relays communication data between the user terminal (10) and the managed server (40). The embodiment is described for an environment in which the user or user terminal (10) connects remotely, using a CLI protocol and via the access control system (30), to a managed server (40) for which the user holds privileges, and performs work there.
The user or user terminal (10) enters a statement to perform work on the server (40). The relay module (38) or statement extraction unit (32) forwards what it receives to the server (40) and passes the server's response back to the user terminal (10). Meanwhile, the statement extraction unit (32) extracts the characters entered until it receives the Enter key input that signals execution of the command, and accumulates them in internal memory. When the user presses Enter, the statement extraction unit (32) or relay module (38) holds back that key input from the server (40), determines whether the command accumulated in memory may be executed under the permission policy assigned to the user, and decides whether to transmit it to the server (40).
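The accumulate-then-hold behavior described above can be sketched as follows. This is a simplified illustration: the names `StatementBuffer`, `relay`, and `is_allowed` are hypothetical, control characters are not interpreted, and the permission check is reduced to a caller-supplied predicate.

```python
from typing import Callable, Iterable, List, Optional, Tuple

ENTER_CHARS = {"\r", "\n"}  # characters treated as the statement-completion character


class StatementBuffer:
    """Accumulates typed characters until Enter completes a statement."""

    def __init__(self) -> None:
        self._chars: List[str] = []

    def feed(self, ch: str) -> Optional[str]:
        """Add one character; return the completed statement on Enter, else None."""
        if ch in ENTER_CHARS:
            statement = "".join(self._chars)
            self._chars.clear()
            return statement
        self._chars.append(ch)
        return None


def relay(chars: Iterable[str],
          is_allowed: Callable[[str], bool]) -> List[Tuple[str, bool]]:
    """For each completed statement, record whether the held-back Enter
    would be forwarded to the server (True) or blocked (False)."""
    buffer = StatementBuffer()
    decisions: List[Tuple[str, bool]] = []
    for ch in chars:
        statement = buffer.feed(ch)
        if statement is not None:
            decisions.append((statement, is_allowed(statement)))
    return decisions
```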
즉, 사용자의 엔터(Enter) 키가 입력 시, 중계모듈(38)은 바로 인가 여부를 판단하는 것이 아니라 명령어 확인 프로세스를 통해 서버(40)와의 통신 및 응답 데이터의 분석을 통한 최종 명령어 또는 명령문을 추출한다. 예를 들어 누적된 사용자 명령어 문자열이 “/usr/bin/rm” 이란 문자열인 경우, 중계모듈(38)은 해당 명령어 문자열로부터 실제 실행될 최종 명령어 문자열인 “rm” 추출한다. 그리고 추출된 최종 명령어, 옵션, 인수들로 실제 명령문을 추출한다.That is, when the user's enter key is input, the relay module 38 does not determine whether the terminal is immediately authorized, but instead executes a final command or statement through communication with the server 40 and analysis of response data through a command confirmation process. Extract. For example, when the accumulated user command string is a string of “/ usr / bin / rm”, the relay module 38 extracts “rm”, which is the final command string to be actually executed from the command string. Then extract the actual statement with the extracted final commands, options, and arguments.
결국, 명령문의 실행 여부 판단 시 사용자가 입력한 명령문에 대한 분석이 필요하며, 이를 위하여 서버(40)와의 정의된 통신 및 분석을 통한 명령문 확인 작업(또는 명령어 확인 작업)을 수행하고, 이를 통해, 실제로 실행될 최종 명령문을 추출한다. 실제로 실행될 최종 명령문 문자열 필터링 작업을 수행함으로써, 명령어에 대한 학습을 보다 효과적으로 수행할 수 있다.As a result, when determining whether to execute the statement, analysis of the statement input by the user is required, and for this, a statement checking operation (or command checking operation) through defined communication and analysis with the server 40 is performed. Extract the final statement that will actually be executed. By filtering the final statement string that will actually be executed, we can learn more about the instruction.
Specifically, a command character is received from the user terminal 10 (S10), and it is determined whether the Enter key, which signals statement execution, has been entered (S20).
If the character is not the Enter key, it is checked whether it is a control character (S30). A control character is a character used for control rather than text, such as Tab, an arrow key, a key combination with Ctrl or Alt, or a function key.
If it is not a control character, the entered statement character is appended to the accumulated input statement string (S40). The accumulated statement string is built by accumulating the entered statement characters and keeps growing until the Enter character (or command-completion character) is entered. The relay module 38 returns the accumulated statement string to the user terminal 10 so that it is printed in the shell of the user terminal 10. The user therefore sees the typed statement string displayed in the shell.
If the entered statement character is a control character, the accumulated string and the control character are sent to the server 40 (S31), and a statement string with the control character applied is received from the server 40 (S32). The received string is then reflected in the accumulated input statement string (S40). For example, suppose the accumulated string is "ren" and the user enters a Tab character. The string "ren[Tab]" is sent to the server 40, and the string "rename" is received from the server 40. That is, "rename" is the string obtained by applying the control character [Tab] to "ren".
The updated accumulated string is returned to the user terminal 10 and displayed in the shell. Thus, when the user types "ren" followed by Tab, "rename" appears on the screen (in the shell).
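The accumulation loop of steps S10 to S40 can be sketched as follows. This is a minimal illustration, not the implementation of the relay module 38: the class name `StatementAccumulator`, the callback `apply_control`, and the toy `fake_server_completion` are hypothetical names introduced here, and only the Tab key is treated as a control character.

```python
from typing import Callable, Optional

ENTER = "\r"
# Assumption: only Tab is modeled; the text also lists arrow keys,
# Ctrl/Alt combinations, and function keys as control characters.
CONTROL_KEYS = {"\t"}


class StatementAccumulator:
    """Accumulates typed characters until Enter (steps S10 to S40)."""

    def __init__(self, apply_control: Callable[[str, str], str]) -> None:
        # apply_control(buffer, key) stands in for the round trip to the
        # server in steps S31 and S32 (hypothetical callback).
        self.apply_control = apply_control
        self.buffer = ""

    def feed(self, ch: str) -> Optional[str]:
        """Return the accumulated statement on Enter, otherwise None."""
        if ch == ENTER:  # S20: Enter ends the statement
            statement, self.buffer = self.buffer, ""
            return statement
        if ch in CONTROL_KEYS:  # S30: let the server apply the control key
            self.buffer = self.apply_control(self.buffer, ch)
        else:  # S40: ordinary character, append it
            self.buffer += ch
        return None


def fake_server_completion(buffer: str, key: str) -> str:
    """Toy stand-in for the server: Tab completes "ren" to "rename"."""
    completions = {"ren": "rename"}
    return completions.get(buffer, buffer) if key == "\t" else buffer


def type_keys(keys: str) -> Optional[str]:
    """Feed a whole key sequence and return the result of the last key."""
    acc = StatementAccumulator(fake_server_completion)
    result = None
    for ch in keys:
        result = acc.feed(ch)
    return result
```

With this sketch, `type_keys("ren\t\r")` yields the statement "rename" rather than "ren" plus a raw Tab character, which mirrors the example in the text.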
When an Enter character or statement-completion character is entered in step S20, the statement confirmation process is performed (S50). The statement confirmation process (S50) extracts the statement that will actually be executed for the accumulated string or statement string. That is, the relay module 38 confirms the final statement the user intends by running commands on the actual server 40.
The statement confirmation process (S50) uses system commands such as the echo command, the linked-command extraction (realpath) command, and the command-name extraction (basename) command: each is sent to the server 40 as a request, and its result is returned. In addition, the executability check (which) command may be used to improve performance. The detailed flow of the statement confirmation process (S50) is shown in FIG. 7.
First, the echo command is applied (S51). For a command containing variables, wildcard characters, or history references, the echo command returns the command that will actually be executed. The echo command also substitutes newline characters, whitespace characters, and the like.
That is, when the Enter key signaling statement execution is received from the user, the statement characters accumulated in memory so far are checked by sending "echo" + [input command] to the server and receiving the response of the server 40. The echo command writes its arguments to standard output, separated by spaces and terminated by a newline. The relay module 38 analyzes the received message, substitutes the variable-based statement characters, and replaces wildcard and history statements with the actual statements.
Next, although not shown in the figure, the executability check (which) command may be run. The which command checks whether the accumulated statement string is executable.
That is, to improve the performance of the access control system, the command "which [command]" is sent to the server to check whether the substituted statement is a command that can actually be executed on the server.
Next, the linked-command extraction (realpath) command is applied (S52). The realpath command returns the actual command or statement that a symbolic link points to.
That is, the response to "which [command]" is analyzed and, if the statement is executable, the "realpath" command, which returns the path of the actual executable file, is sent to the server to determine whether the statement runs through a symbolic link, and the actual executable path is obtained.
Next, the command-name extraction (basename) command is applied (S53). When a command includes a path, the basename command returns the name of the command that is actually executed, without the path.
To obtain the command without the path from a string containing the full path of the executable, the "basename" command is sent to the server, which yields the finally intended command.
Finally, the final (actual) statement is extracted through the above process (S60). In particular, the final statement is obtained by replacing the command name in the statement with the command name returned by "basename".
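The confirmation flow of steps S51 to S60 can be sketched as a chain of requests to the server. This is only an illustration under stated assumptions: `run_on_server` is a hypothetical callable that sends one command line to the managed server and returns its output, `fake_server` replaces it with canned responses, and the path "/usr/local/bin/ssf" is invented for the example.

```python
from typing import Callable, Dict


def confirm_statement(statement: str, run_on_server: Callable[[str], str]) -> str:
    """Resolve the statement that would actually run (S51 to S60)."""
    # S51: echo expands variables, wildcards, and history references.
    expanded = run_on_server(f"echo {statement}").strip()
    if not expanded:
        return expanded
    command, _, rest = expanded.partition(" ")
    # Optional step: which tells us whether the command is executable.
    located = run_on_server(f"which {command}").strip()
    if not located:
        return expanded  # not found on the server; keep the expanded text
    # S52: realpath follows symbolic links to the real executable.
    real = run_on_server(f"realpath {located}").strip()
    # S53: basename strips the directory part.
    name = run_on_server(f"basename {real}").strip()
    # S60: put the resolved command name back in front of the rest.
    return f"{name} {rest}".strip()


# Canned responses standing in for a real server (hypothetical data).
CANNED: Dict[str, str] = {
    "echo $a -rf": "rm -rf\n",
    "which rm": "/usr/bin/rm\n",
    "realpath /usr/bin/rm": "/usr/bin/rm\n",
    "basename /usr/bin/rm": "rm\n",
    "echo ssf host1": "ssf host1\n",
    "which ssf": "/usr/local/bin/ssf\n",
    "realpath /usr/local/bin/ssf": "/usr/bin/ssh\n",
    "basename /usr/bin/ssh": "ssh\n",
    "echo foo": "foo\n",
}


def fake_server(command_line: str) -> str:
    """Return the canned output for a command line, or an empty string."""
    return CANNED.get(command_line, "")
```

Here `confirm_statement("ssf host1", fake_server)` resolves the linked command to "ssh host1". The sketch passes the raw statement to echo; a real relay would need to ensure that doing so cannot itself execute anything on the server, for example through command substitution.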
Next, examples of the statement extraction method according to an embodiment of the present invention are described with reference to FIG. 8.
First, consider the case where the user enters "ren" + [Tab key].
In this case, the output on the screen of the user terminal is "rename".
In the prior art the accumulated statement string is "ren[TAB]", whereas in the present invention it is "rename". When the user enters a statement using a control character (control key) rather than an ordinary character, the relay module 38 does not accumulate that character itself; it sends the character to the server, analyzes the response, and accumulates the resulting statement.
Next is an example of confirming the final statement with the echo command. FIG. 8A shows the case where a newline character is entered. The newline is substituted or removed, and the statement string that will actually be executed, rm -rf, is returned.
Next, FIG. 8B is an example that uses a history command. Execution is confirmed through the number of a previously used statement. In the example of FIG. 8B, the number "546" stands for the statement "/usr/bin/ssh", and the statement string "/usr/bin/ssh" is returned from the server 40.
Next, FIG. 8C is an example of a statement that uses wildcard characters (*, ?). Instead of typing the whole statement, a wildcard can be used to run a matching executable that exists in the directory.
For example, entering "ss*" can run "ssh". As shown in FIG. 8C, the echo statement returns "./ssh" in this case as well.
Next, FIG. 8D shows an example of a statement that uses a variable. Here the variable a is defined to hold the statement "rm". With the echo statement, the server returns "rm -rf".
Next is the case of confirmation using the linked-command extraction (realpath) statement. As the bold text in FIG. 8E shows, "ssf" is a command linked to the command "/usr/bin/ssh". Running "ssf" therefore executes the linked statement "/usr/bin/ssh". When the realpath statement is sent to the server, the server returns "/usr/bin/ssh", the command to which the "ssf" statement string is linked. The reply contains both the location and the name of the actual executable.
Next, FIG. 8F shows an example of confirming the final command with the command-name extraction (basename) command. When the command string "/usr/bin/ssh" is submitted to the server with the basename command, the command name "ssh" is returned.
Next, a method by which the behavior coordinator 61 extracts a command from a statement according to an embodiment of the present invention is described with reference to FIG. 9.
First, to build a learning model for detecting abnormal users, only the command is extracted from each work statement the user entered after connecting to the server. Here the command includes its option part. A statement consists of a command and the argument values (arguments) for that command; the argument values are dropped and only the command is extracted.
In addition, if the command includes a path, the path is removed so that only the bare command remains. When entering a statement, a user sometimes writes not just the command name but also the path (directory or folder) where the command is located. In that case the path is removed from the command or statement.
The final command therefore consists only of the command (or command name) and its options. Reducing statements to commands in this way reduces noise in learning.
FIG. 9 shows an example of deriving a command from a statement.
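The reduction from statement to command described above can be sketched as follows. FIG. 9 is not reproduced in the text, so the rule used here is an assumption: a token counts as an option when it starts with "-", and every other token after the command name is treated as an argument and dropped. The function name `extract_command` is hypothetical.

```python
import posixpath
import shlex


def extract_command(statement: str) -> str:
    """Keep only the command name (without its path) and its options."""
    tokens = shlex.split(statement)
    if not tokens:
        return ""
    # Drop the directory part, e.g. "/usr/bin/rm" becomes "rm".
    name = posixpath.basename(tokens[0])
    # Assumption: options are the tokens that start with "-"; a lone "-"
    # (often standing for stdin) is treated as an argument.
    options = [t for t in tokens[1:] if t.startswith("-") and t != "-"]
    return " ".join([name, *options])
```

For instance, `extract_command("/usr/bin/rm -rf /tmp/build")` gives "rm -rf". Option order is preserved, which matters because the behavior model treats "ls -al" and "ls -la" as different habits.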
Next, a method by which the behavior model engine 63 builds a behavior model according to an embodiment of the present invention is described in detail.
Under the behavior model, whether the user is abnormal is judged for the series of commands the user entered in one session, based on the following criteria.
First, similarity is judged by each command's options and the order in which the option strings are entered, that is, by which options each command is used with. Users generally enter a command and its option string in a consistent order out of habit. For example, a user who types "ls -al" does not switch the option order and type "ls -la".
Next, similarity is judged by each user's per-command usage frequency. Each user generally performs routine work and therefore uses certain commands much more often than others; the frequently used commands are largely fixed. The similarity of a user's usage pattern can therefore be judged from the similarity of per-command usage frequencies.
Next, similarity of the usage pattern is judged by comparing the order of the commands that precede a particular command. Each user generally has a fixed series of commands that they run to accomplish a particular task. The task could also be accomplished with a different series of commands, but users generally do it with the command sequence they are used to. Analyzing the series of commands issued to perform a particular task therefore makes it possible to judge user similarity.
Specifically, the behavior model engine 63 takes session information and command data as input and builds the behavior model through learning. The input data of the behavior model consist of session information and command data, including user identification information (such as a user ID) and session identification information (the ID of the equipment connection session). Preferably, the command input time may also be included.
An example of the input data of the behavior model is shown in FIG. 10. In FIG. 10, "_id" is the unique value of the object in the DB, and "user_id" is the ID of the user who connected to the equipment. "command" is the command entered after connecting to the equipment, and "connection_id" is the ID of the equipment connection session. Data such as equipment information and connection protocol information can be looked up through the equipment connection session ID. "datetime" is the time the statement was entered.
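As a rough illustration of the input records listed for FIG. 10, the sketch below groups commands by session so that each session's command series can be evaluated as a unit. The field names follow the text, except that `object_id` stands for "_id"; the class `CommandRecord`, the function `commands_by_session`, the sample values, and the use of ISO 8601 strings for "datetime" are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class CommandRecord:
    object_id: str      # "_id": unique value of the object in the DB
    user_id: str        # ID of the user who connected to the equipment
    command: str        # command entered after connecting
    connection_id: str  # ID of the equipment connection session
    datetime: str       # input time; assumed ISO 8601 so strings sort by time


def commands_by_session(records: Iterable[CommandRecord]) -> Dict[str, List[str]]:
    """Return each session's commands in the order they were entered."""
    sessions: Dict[str, List[str]] = defaultdict(list)
    for record in sorted(records, key=lambda r: r.datetime):
        sessions[record.connection_id].append(record.command)
    return dict(sessions)


# Hypothetical sample data.
SAMPLE = [
    CommandRecord("2", "a_user", "ls -al", "c1", "2018-03-27T09:00:05"),
    CommandRecord("1", "a_user", "cd", "c1", "2018-03-27T09:00:01"),
    CommandRecord("3", "b_user", "vi", "c2", "2018-03-27T09:01:00"),
]
```

Calling `commands_by_session(SAMPLE)` returns the commands of session "c1" in input order, followed by those of session "c2".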
An example of the output data of the behavior model is shown in FIG. 11. The output data (or result data) of the behavior model is the per-user command usage pattern that has been extracted.
In FIG. 11, "_id" is the unique value of the object in the DB, and "num_distinct_commands" is the number of unique (non-duplicated) commands in "unique_commands". "user_id" is the user ID and indicates which user the command learning model belongs to. "unique_commands" holds the per-command weights modeled from the training data. A command weight is the weight value used to judge whether the user is abnormal.
Specifically, a model (or pattern) is built for each user by learning from the commands according to [Equation 1] below. The equation reflects which options each command was used with and how frequently each command was used.
[Equation 1]
Figure PCTKR2018003549-appb-I000002
That is, the probability of each command (command line) c for a specific user u is calculated with the formula above. Here α is a predetermined constant, the pseudocount, and A is the number of distinct commands.
Mathematically, the pseudocount α keeps the numerator from becoming zero. Conceptually, it can be viewed as the sensitivity to a new command that does not appear in the training data.
In the Naive Bayesian model, all commands are assumed to be independent trials, so the probability at the time the n-th command is entered can be computed as the product of the probabilities of all commands up to the n-th.
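Equation 1 appears only as an image and is not reproduced here. The sketch below therefore uses standard additive (pseudocount) smoothing, which matches the stated roles of α and A but is an assumption about the exact form: the probability of command c for user u is taken as (count of c for u + α) divided by (total commands of u + α·A). The function names are hypothetical, and the sequence probability is accumulated as a sum of logarithms rather than a raw product to avoid numerical underflow.

```python
import math
from collections import Counter
from typing import Iterable


def train_user_model(commands: Iterable[str]) -> Counter:
    """Count how often one user entered each command."""
    return Counter(commands)


def command_probability(model: Counter, command: str,
                        alpha: float, num_distinct: int) -> float:
    """Smoothed probability of one command (assumed additive smoothing)."""
    total = sum(model.values())
    # alpha keeps the numerator above zero for commands never seen in training.
    return (model[command] + alpha) / (total + alpha * num_distinct)


def sequence_log_probability(model: Counter, commands: Iterable[str],
                             alpha: float, num_distinct: int) -> float:
    """Log of the product of per-command probabilities (independence assumed)."""
    return sum(math.log(command_probability(model, c, alpha, num_distinct))
               for c in commands)
```

For a user whose history is two "ls -al", one "cd", and one "vi", with α = 1 and A = 4, the sketch gives "ls -al" a probability of 3/8 and an unseen command a probability of 1/8. A larger α raises the probability assigned to unseen commands, which is one way to read the "sensitivity" described above.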
Next, a method by which the calculation unit 64 and the abnormality determination unit 35 compute the abnormal-user probability according to an embodiment of the present invention is described with reference to FIG. 12.
As shown in FIG. 12, the per-user behavior models are first loaded for reference (S11).
The behavior models are generated beforehand by the behavior model engine 63, which trains one for each of the N users. When a particular command is entered, the models can identify which user most likely entered it.
Specifically, N behavior models M, one per user, have been trained in advance, and the calculation unit 64 retrieves them from a database or the like so that they can be referenced (S11).
Next, the command entered by the current user is received (S12). This is the actual command already extracted by the statement extraction unit 32 or the behavior coordinator 61; that is, it consists of the command name and options.
Next, the calculation unit 64 applies the entered command to each user's behavior model and obtains a probability P for each user (S13). The result takes the form of (user ID, probability) pairs.
One probability is computed per user, so there are as many probabilities as users (for example, N). Each is not an absolute value but the relative probability that the current user is that user.
The per-user probabilities are computed by the calculation unit 64.
Next, the N per-user probabilities are ranked by magnitude (S14). Preferably, they are sorted in descending order and each user's probability is assigned a rank. Each per-user probability represents the probability that the current user is that user.
For example, the computed per-user probabilities are as follows.
[Example result]
- User: zz_user
- Input command: ls -F -l -l
- Result data
1. [ a_user, 90.43 ]
2. [ b_user, 88.34 ]
3. [ c_user, 85.32 ]
4. [ d_user, 82.12 ]
5. [ e_user, 79.14 ]
6. [ f_user, 77.23 ]
...
100. [ zz_user, 10.23 ]
As the example result shows, the command "ls -F -l -l" used by the current user "zz_user" is evaluated against the learned models of all users, producing the result listed above.
These result values are passed to the abnormality determination unit 35.
Next, whether the user is abnormal is determined from the rank of that user's probability for the entered command (S15). Preferably, this determination is performed by the abnormality determination unit 35.
The order of the users sorted in descending order represents the predicted likelihood that each user is the one who entered the command sequence.
Accordingly, the abnormality determination unit 35 determines that a user is abnormal when, for a command entered by that user, the probability of being that user ranks below a certain position.
In general, the abnormality determination unit 35 assumes, according to policy, that the user who entered the command is genuine if that user (or user ID) is included among the top N users specified by the policy.
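The ranking and decision of steps S14 and S15 can be sketched as follows, using the scores from the example result. The function names `rank_users` and `is_abnormal`, the threshold `top_n = 3`, and the choice to treat a user with no score as abnormal are assumptions made for illustration, not details given in the text.

```python
from typing import Dict, List, Tuple


def rank_users(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """S14: sort (user ID, probability) pairs in descending order."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def is_abnormal(current_user: str, scores: Dict[str, float], top_n: int) -> bool:
    """S15: abnormal if the current user does not rank within the top N."""
    ranking = [user for user, _ in rank_users(scores)]
    if current_user not in ranking:
        return True  # assumption: a user with no model is treated as abnormal
    return ranking.index(current_user) + 1 > top_n


# Scores taken from the example result in the text (ranks 7 to 99 omitted).
EXAMPLE_SCORES: Dict[str, float] = {
    "a_user": 90.43,
    "b_user": 88.34,
    "c_user": 85.32,
    "d_user": 82.12,
    "e_user": 79.14,
    "f_user": 77.23,
    "zz_user": 10.23,
}
```

With these scores and `top_n = 3`, `is_abnormal("zz_user", EXAMPLE_SCORES, 3)` is true: the session claims to be zz_user, but the entered command matches other users' habits far better.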
Meanwhile, a method of training the deep learning model or recurrent neural network model is shown in FIG. 13. FIG. 13 is a conceptual diagram of how the probability that the user who entered a command is genuine is computed when a command is entered.
The entered command is evaluated against the learned models of all users using Equation 1, and the probability that the user is genuine is produced as the result.
The invention made by the present inventors has been described in detail above with reference to the embodiments; however, the present invention is not limited to those embodiments and may of course be modified in various ways without departing from its gist.

Claims (10)

  1. A server access control system for detecting an abnormal user based on learning of entered commands, in which a user terminal and a server are connected over a network and which is installed as a gateway on the network between the user terminal and the server, the system comprising:
    a relay module that extracts session information and statements from packets transmitted from the user terminal and relays the entered statements, or the results from the server, between the user terminal and the server;
    an abnormality detection unit that receives the session information and statements from the relay module, extracts commands, generates through learning a behavior model representing the user's command input pattern, and computes an abnormal-user probability by applying the user's current command to the behavior model; and
    an abnormality determination unit that receives the abnormal-user probability from the abnormality detection unit and, according to a predetermined policy, uses the abnormal-user probability to decide on alerting an administrator, blocking the session, or blocking the user.
  2. The system of claim 1, wherein the abnormality detection unit comprises:
    an event channel that receives statements from the relay module;
    a status channel that provides abnormal-user probability information to the abnormality determination unit;
    a behavior coordinator that extracts commands from the statements;
    a behavior model engine that generates a behavior model for each user through learning; and
    a calculation unit that receives a command from the behavior coordinator and computes, using the behavior models, a probability for each user with respect to the received command.
  3. The system of claim 2,
    wherein the behavior coordinator records and accumulates the commands in a behavior log storage, and the behavior model engine continuously trains and updates the behavior model using the command data accumulated in the behavior log storage.
  4. The system of claim 2,
    wherein the behavior model comprises a Bayesian model and a deep learning model, and
    the calculation unit computes a first probability from the Bayesian model and a second probability from the deep learning model, and derives a final probability by weighting the first probability and the second probability to adjust their ratio.
  5. The system of claim 2,
    wherein the relay module receives a statement character from the user terminal; if the statement character is not an Enter character, accumulates the statement character to generate an accumulated statement string; and if the statement character is an Enter character, extracts from the accumulated string the final statement that will actually be executed; and wherein, if the statement character is a control character, the relay module sends the accumulated statement string and the control character to the server, receives a string with the control character applied, and accumulates that string to generate the accumulated string.
  6. The server access control system of claim 5,
    wherein the relay module transmits to the server at least one of an echo command, a linked-command extraction (realpath) command, and a command-name extraction (basename) command, together with the accumulated string as an argument of that command, and extracts the final statement using the result of that command received from the server,
    the echo command is a command that converts a command containing variables, a command containing wildcard characters (Wild Char), or a command containing history into the command that will actually be executed, and returns it,
    the linked-command extraction (realpath) command is a command that returns the actual command linked by a symbolic link, and
    the command-name extraction (basename) command is a command that, when a command includes a path, returns the name of the actually executed command with the path removed.
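The effect of the realpath and basename steps in claim 6 can be approximated locally with Python's standard library. In the claim these commands run on the target server; the sketch below runs on the local machine only, assumes a POSIX system where symbolic links can be created, and omits the echo step because that step depends on the server's shell. The function name `executed_command_name` is hypothetical.

```python
import os
import tempfile


def executed_command_name(command_path: str) -> str:
    """Return the name of the program a path really refers to.

    os.path.realpath follows symbolic links (like the `realpath` command) and
    os.path.basename drops the directory part (like the `basename` command).
    """
    real_path = os.path.realpath(command_path)
    return os.path.basename(real_path)


# Demonstration with a throwaway symbolic link: alias_tool -> real_tool.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "real_tool")
    alias = os.path.join(tmp, "alias_tool")
    open(target, "w").close()
    os.symlink(target, alias)
    resolved = executed_command_name(alias)
```

Resolving the link before taking the base name matters because a user could otherwise hide a sensitive command behind an innocuous alias.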
  7. The server access control system of claim 2,
    wherein a command consisting only of a command name and command options is extracted from the statement.
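A minimal sketch of the extraction in claim 7 follows. The claim does not define what counts as an option; treating every argument that begins with a hyphen as an option is an assumption, and the function name `extract_command` is hypothetical.

```python
import shlex
from typing import List


def extract_command(statement: str) -> List[str]:
    """Keep only the command name and its options, dropping other arguments."""
    tokens = shlex.split(statement)  # shell-like tokenization, honours quotes
    if not tokens:
        return []
    name, args = tokens[0], tokens[1:]
    # Assumption: an option is any argument starting with "-" (but not a lone "-").
    options = [tok for tok in args if tok.startswith("-") and tok != "-"]
    return [name] + options


command = extract_command("ls -al /var/log --color=auto")
```

Dropping file names and other free-form arguments keeps the learned vocabulary small, since two statements that differ only in their target path map to the same command.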
  8. The server access control system of claim 2,
    wherein the behavior model engine trains the behavior model by reflecting whether the input order of each command's options and option strings is similar, whether the usage frequency of each command is similar, and whether the usage pattern given by the order in which commands are used is similar.
  9. The server access control system of claim 8,
    wherein the behavior model engine obtains the behavior model using [Equation 1] below.
    [Equation 1]
    Figure PCTKR2018003549-appb-I000003
    where P_{c,u} denotes the probability of command c for user u, Training Count is the number of training data items, Training Data Length is the length of the training commands, α is a predetermined constant serving as a pseudocount, and A is the number of distinct commands.
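Equation 1 survives here only as an image reference, so its exact form cannot be confirmed from this text. One formula consistent with the listed symbols is additive (pseudocount) smoothing, P = (Training Count + α) / (Training Data Length + α·A); the sketch below implements that assumed form. The function name, the default value of `alpha`, and the optional `vocabulary` argument are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional


def smoothed_command_probabilities(
    training_commands: List[str],
    alpha: float = 0.01,
    vocabulary: Optional[Iterable[str]] = None,
) -> Dict[str, float]:
    """Estimate per-command probabilities for one user with a pseudocount alpha.

    Assumed form: (count of c + alpha) / (number of training commands + alpha * A),
    where A is the number of distinct commands.
    """
    counts = Counter(training_commands)
    vocab = set(counts)
    if vocabulary is not None:
        vocab |= set(vocabulary)
    distinct = len(vocab)
    if distinct == 0:
        return {}
    denominator = len(training_commands) + alpha * distinct
    return {c: (counts[c] + alpha) / denominator for c in vocab}


probs = smoothed_command_probabilities(["ls", "ls", "cd", "cat"], alpha=1.0)
```

The pseudocount gives commands that never appeared in a user's training data a small non-zero probability instead of zero.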
  10. The server access control system of claim 2,
    wherein N behavior models, one for each user, are trained and built in advance,
    the calculating unit applies a command input by a specific user to all N per-user behavior models to obtain N per-user probabilities, sorts the N per-user probabilities in descending order, and assigns a rank to each per-user probability, and
    the abnormality determination unit determines whether the user is abnormal according to the rank of that user's probability for the input command, judging the user to be abnormal if, for a command input by a specific user, the probability of being that user falls below a certain rank.
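The rank-based decision in claim 10 can be sketched as follows. The per-user probabilities are assumed to have been computed already by the N behavior models; the function names, the example user names and scores, and the cut-off parameter `max_rank` are hypothetical.

```python
from typing import Dict


def rank_of_user(probabilities: Dict[str, float], user: str) -> int:
    """Return the 1-based rank of `user` after sorting probabilities in descending order."""
    ordered = sorted(probabilities, key=probabilities.get, reverse=True)
    return ordered.index(user) + 1


def is_abnormal(probabilities: Dict[str, float], user: str, max_rank: int = 3) -> bool:
    """Flag the session when the claimed user's own model ranks below the cut-off."""
    return rank_of_user(probabilities, user) > max_rank


# Probability of the typed command under each user's behavior model (illustrative values).
scores = {"alice": 0.02, "bob": 0.30, "carol": 0.15, "dave": 0.08}
alice_rank = rank_of_user(scores, "alice")
alice_flagged = is_abnormal(scores, "alice", max_rank=2)
```

Comparing ranks rather than raw probabilities makes the decision relative to the other users, so a command that is rare for everyone does not by itself trigger a detection.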
PCT/KR2018/003549 2017-04-17 2018-03-26 Server access control system for detecting abnormal user on basis of learning of inputted commands for security enhancement WO2018194282A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2017-0049242 2017-04-17
KR1020170049242A KR101796205B1 (en) 2017-04-17 2017-04-17 A server access control system of detecting abnormal users by using command learning for enhancing security

Publications (1)

Publication Number Publication Date
WO2018194282A1 true WO2018194282A1 (en) 2018-10-25

Family

ID=60386034

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2018/003549 WO2018194282A1 (en) 2017-04-17 2018-03-26 Server access control system for detecting abnormal user on basis of learning of inputted commands for security enhancement

Country Status (2)

Country Link
KR (1) KR101796205B1 (en)
WO (1) WO2018194282A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4071640A1 (en) * 2021-04-07 2022-10-12 SSH Communications Security Oyj Controlling command execution in a computer network
CN116809652A (en) * 2023-03-28 2023-09-29 材谷金带(佛山)金属复合材料有限公司 Abnormality analysis method and system for hot rolling mill control system
CN116809652B (en) * 2023-03-28 2024-04-26 材谷金带(佛山)金属复合材料有限公司 Abnormality analysis method and system for hot rolling mill control system

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102024142B1 (en) * 2018-06-21 2019-09-23 주식회사 넷앤드 A access control system for detecting and controlling abnormal users by users’ pattern of server access
KR101992963B1 (en) * 2018-11-20 2019-06-26 주식회사 넷앤드 An automatic generation system for the whitelist command policy using machine learning
WO2020235716A1 (en) * 2019-05-22 2020-11-26 엘지전자 주식회사 Intelligent electronic device and authentication method using message transmitted to intelligent electronic device
KR102118380B1 (en) * 2019-11-08 2020-06-04 주식회사 넷앤드 An access control system of controlling server jobs by users
CN111259412B (en) * 2020-01-09 2023-12-05 远景智能国际私人投资有限公司 Authority control method, authority control device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120049107A (en) * 2010-11-04 2012-05-16 중앙대학교 산학협력단 Log data clustering analysis system and the method for learning-based home network error recognition system
KR101388090B1 (en) * 2013-10-15 2014-04-22 펜타시큐리티시스템 주식회사 Apparatus for detecting cyber attack based on analysis of event and method thereof
KR20140055762A (en) * 2012-11-01 2014-05-09 주식회사 윈스 Network session behavioral pattern modeling detection method and modeling detection system
WO2015103514A1 (en) * 2014-01-06 2015-07-09 Cisco Technology, Inc. Distributed training of a machine learning model used to detect network attacks
KR20170024777A (en) * 2015-08-26 2017-03-08 주식회사 케이티 Apparatus and method for detecting smishing message

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4071640A1 (en) * 2021-04-07 2022-10-12 SSH Communications Security Oyj Controlling command execution in a computer network
GB2606137A (en) * 2021-04-07 2022-11-02 Ssh Communications Security Oyj Controlling command execution in a computer network
CN116809652A (en) * 2023-03-28 2023-09-29 材谷金带(佛山)金属复合材料有限公司 Abnormality analysis method and system for hot rolling mill control system
CN116809652B (en) * 2023-03-28 2024-04-26 材谷金带(佛山)金属复合材料有限公司 Abnormality analysis method and system for hot rolling mill control system

Also Published As

Publication number Publication date
KR101796205B1 (en) 2017-11-13

Similar Documents

Publication Publication Date Title
WO2018194282A1 (en) Server access control system for detecting abnormal user on basis of learning of inputted commands for security enhancement
US11711438B2 (en) Systems and methods for controlling data exposure using artificial-intelligence-based periodic modeling
WO2013048111A2 (en) Method and apparatus for detecting an intrusion on a cloud computing service
WO2019245107A1 (en) Malicious code detection device and method
WO2010062063A2 (en) Method and system for preventing browser-based abuse
WO2017069348A1 (en) Method and device for automatically verifying security event
WO2019112326A1 (en) Security enhancement method and electronic device therefor
WO2018107811A1 (en) Joint defence method and apparatus for network security, and server and storage medium
WO2013168951A1 (en) Apparatus and method for checking malicious file
WO2013168913A1 (en) Apparatus and method for checking non-executable files
WO2015069018A1 (en) System for secure login, and method and apparatus for same
WO2018174486A1 (en) Unauthorized command control method of access control system for server security enhancement
WO2017034072A1 (en) Network security system and security method
WO2012023657A1 (en) Network-based harmful-program detection method using a virtual machine, and a system comprising the same
KR101964148B1 (en) Wire and wireless access point for analyzing abnormal action based on machine learning and method thereof
WO2018101565A1 (en) Structure for managing security in network virtualization environment
WO2017171188A1 (en) Security device using transaction information collected from web application server or web server
WO2018164503A1 (en) Context awareness-based ransomware detection
WO2023090864A1 (en) Apparatus and method for automatically analyzing malicious event log
CN106790149A (en) The method and system that a kind of defence IoT equipment is invaded
CN103236932A (en) Webpage tamper-proofing device and method based on access control and directory protection
WO2016159496A1 (en) Method for distributing application having security function added thereto, and operation method of same application
WO2015026083A1 (en) Text message security system and method for preventing illegal use of user authentication by mobile phone and preventing smishing
WO2019177265A1 (en) Data processing method against ransomware, program for executing same, and computer-readable recording medium with program recorded thereon
WO2021095926A1 (en) Complex iot device and sharing service providing method using same, and method for recognizing external information through blockchain application and providing information

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18788014

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18788014

Country of ref document: EP

Kind code of ref document: A1