KR101754580B1 - Method and apparatus for supporting full text search in embedded environment and computer program stored on computer-readable medium - Google Patents

Info

Publication number
KR101754580B1
KR101754580B1 KR1020150125194A KR20150125194A KR101754580B1 KR 101754580 B1 KR101754580 B1 KR 101754580B1 KR 1020150125194 A KR1020150125194 A KR 1020150125194A KR 20150125194 A KR20150125194 A KR 20150125194A KR 101754580 B1 KR101754580 B1 KR 101754580B1
Authority
KR
South Korea
Prior art keywords
search
key
units
text
embedded environment
Prior art date
Application number
KR1020150125194A
Other languages
Korean (ko)
Other versions
KR20170028514A (en)
Inventor
이원우
Original Assignee
주식회사 셀바스에이아이
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 셀바스에이아이
Priority to KR1020150125194A
Publication of KR20170028514A
Application granted
Publication of KR101754580B1

Classifications

    • G06F17/30628
    • G06F17/2705
    • G06F17/30616

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method for supporting full text search in an embedded environment is disclosed in accordance with an embodiment of the present invention. The method includes the steps of generating a plurality of units by dividing a stored original text into predetermined units, converting the plurality of units into a key composed of a bit string by using a Bloom filter, and mapping the key to at least one of the original text and the plurality of units and storing them in an index table.

Description

METHOD AND APPARATUS FOR SUPPORTING FULL TEXT SEARCH IN EMBEDDED ENVIRONMENT AND COMPUTER PROGRAM STORED ON COMPUTER READABLE MEDIUM

The present invention relates to an embedded system, and more particularly, to supporting full-text search in an embedded system.

An embedded system is a system in which a microprocessor, acting as the brain of a machine or electronic device, is built into the device so that the device can be controlled effectively. Rather than reading the software that runs the device from a disk as a computer does, an embedded system carries that software on a chip built into the device. Most electronic devices, such as automobiles, mobile phones, household appliances, factory automation equipment, and electronic dictionaries, contain embedded systems. For example, the DMB function that adds TV reception to a mobile phone can be regarded as an embedded system, as can the search function installed in an electronic dictionary. In other words, an embedded system is a system built into a larger system.

An electronic dictionary is a dictionary that can be used on a computer or an electronic device, built by storing a general dictionary or encyclopedia originally produced as a book on an electronic recording medium such as a hard disk or a memory. Electronic dictionaries use multimedia technology to present the contents of the dictionary, which enhances the learning effect and also lets the user reach the desired information directly and interactively. An electronic dictionary can be implemented as an embedded system.

An electronic dictionary or the like can store a key for searching and an original text corresponding to the key. When the user inputs a key, the matching original text can be retrieved. To provide such a search function, the key and the original text must be stored in matched form, which requires a large amount of storage space. Supporting full text search lengthens the index further, which requires still more storage space.

Thus, in an embedded environment, there may be a need in the art to reduce the capacity required to store an index for retrieval.

The present invention has been devised in response to the above-described background art, and is intended to reduce the capacity required for storing an index for searching in an embedded environment.

The present invention is intended to reduce the capacity of an index by using a Bloom filter to generate the index for retrieval, thereby reducing the capacity of each index key.

A method for supporting full text search in an embedded environment is disclosed in accordance with an embodiment of the present invention for solving the above-mentioned problem. The method includes the steps of generating a plurality of units by dividing a stored original text into predetermined units, converting the plurality of units into a key composed of a bit string by using a Bloom filter, and mapping the key to at least one of the original text and the plurality of units and storing them in an index table.

Also disclosed is a computer program stored on a computer-readable medium, comprising a plurality of instructions executed by one or more processors for supporting full text search in an embedded environment, in accordance with an embodiment of the present invention. The computer program includes instructions for generating a plurality of units by dividing a stored original text into predetermined units, instructions for converting the plurality of units into a key composed of a bit string by using a Bloom filter, and instructions for mapping the key to at least one of the original text and the plurality of units and storing them in an index table.

Further, in accordance with an embodiment of the present invention, an apparatus for supporting full text search in an embedded environment is disclosed. The apparatus includes a unit generation module for generating a plurality of units by dividing a stored original text into predetermined units, a key conversion module for converting the plurality of units into a key composed of a bit string by using a Bloom filter, and a memory for mapping the key to at least one of the original text and the plurality of units and storing them in an index table.

In accordance with another embodiment of the present invention, a method for supporting full text search in an embedded environment is disclosed. The method includes receiving text from a user, generating one or more search units by dividing the text into predetermined units, converting the one or more search units into a search key composed of a bit string through a Bloom filter, and determining one or more keys corresponding to the search key among one or more previously stored keys.

Also disclosed is a computer program stored on a computer-readable medium, comprising a plurality of instructions executed by one or more processors for supporting full text search in an embedded environment, in accordance with another embodiment of the present invention. The computer program includes instructions for receiving text from a user, instructions for generating one or more search units by dividing the text into predetermined units, instructions for converting the one or more search units into a search key composed of a bit string through a Bloom filter, and instructions for determining one or more keys corresponding to the search key among one or more previously stored keys.

An apparatus for supporting full text search in an embedded environment is disclosed in accordance with another embodiment of the present invention. The apparatus may include a user input module for receiving text from a user, a unit generation module for generating one or more search units by dividing the text into predetermined units, a key conversion module for converting the one or more search units into a search key composed of a bit string through a Bloom filter, a search module for determining one or more keys corresponding to the search key among one or more previously stored keys, and a memory for storing the one or more keys and the text mapped to the one or more keys.

The present invention can reduce the capacity required to store an index for retrieval in an embedded environment.

The present invention can reduce the capacity of the index by using a Bloom filter to generate the index for searching, thereby reducing the capacity of each index key.

FIG. 1 is a block diagram of an apparatus for supporting full text search in an embedded environment according to an embodiment of the present invention.
FIG. 2 is a view for explaining a Bloom filter according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating a process of storing an original text in an index table according to an embodiment of the present invention.
FIG. 4 is a flowchart of a method for supporting full text search in an embedded environment according to an embodiment of the present invention.
FIG. 5 is a flowchart of a method for supporting full text search in an embedded environment according to another embodiment of the present invention.
FIG. 6 is a block diagram of a computer that operates to execute a computer program for providing full-text search according to an embodiment of the present invention.
FIG. 7 is a schematic block diagram of an exemplary computing environment for executing a program for providing full text search in accordance with an embodiment of the present invention.

Various embodiments are now described with reference to the drawings, wherein like reference numerals are used throughout the drawings to refer to like elements. In this specification, various explanations are given in order to provide an understanding of the present invention. It will be apparent, however, that such embodiments may be practiced without these specific details. In other instances, well-known structures and devices are provided in block diagram form in order to facilitate describing the embodiments.

The terms "component," "module," "system," and the like, as used herein, refer to a computer-related entity: hardware, firmware, software, a combination of software and hardware, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, a thread of execution, a program, and/or a computer. For example, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer or distributed between two or more computers. Further, such components may execute from various computer readable media having various data structures stored therein. The components may communicate via local and/or remote processes, for example in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system or a distributed system, and/or data transmitted to other systems over a network such as the Internet by way of the signal).

The description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features presented herein.

FIG. 1 is a block diagram of an apparatus for supporting full text search in an embedded environment according to an embodiment of the present invention.

The apparatus 100 according to an embodiment of the present invention may include a user input module 110, a unit generation module 120, a key conversion module 130, a search module 140, a memory 150, and a display 160.

The device 100 according to an embodiment of the present invention may include, but is not limited to, an electronic dictionary, a smart phone, a feature phone, a tablet PC, and the like.

The user input module 110 may receive text from a user. The user input module 110 may include a key pad, a dome switch, a touch pad (resistive/capacitive), a jog wheel, a jog switch, a microphone, and the like. The text may include words, phrases, syllables, and the like that the user desires to search for. The user can freely input text to be searched through the user input module 110. The user may also input a whole sentence for full-text search according to an embodiment of the present invention.

The unit generation module 120 may divide the stored original text into predetermined units to generate a plurality of units. The original text may include one or more words, sentences, paragraphs, phrases, E-book contents, song lyrics, and the like; these are only examples, and the original text may include any text. The original text may include a sentence to be searched by the user in an electronic dictionary or the like, the content of an E-book, and the like. The predetermined unit may include at least one of a character, a word, a phrase, a paragraph, and a fingerprint. The plurality of units can be converted through a Bloom filter into index keys for searching the original text. For example, the unit generation module 120 may divide the original text "A wonderful serenity has taken possession of my entire soul, like these sweet mornings of spring which I enjoy with my whole heart." into words to generate a plurality of units including "A", "wonderful", "serenity", "has", "taken", "possession", "of", "my", "entire", "soul", "like", "these", "sweet", "mornings", "of", "spring", "which", "I", "enjoy", "with", "my", "whole", and "heart". The above-described original text, unit, and plurality of units are merely examples, and the unit generation module 120 according to an embodiment of the present invention can divide any original text into arbitrary units to generate an arbitrary plurality of units.

The unit generation module 120 may generate one or more search units by dividing the text received from the user into predetermined units. The text input by the user may include not only words but also sentences and phrases. The predetermined unit may include at least one of a character, a word, a phrase, a paragraph, and a fingerprint. The one or more search units may be converted through a Bloom filter into a search key for searching. For example, to retrieve the original text "A wonderful serenity has taken possession of my entire soul, like these sweet mornings of spring which I enjoy with my whole heart.", the user may input "these sweet mornings of spring", and the unit generation module 120 may divide the text input by the user into words to generate the search units "these", "sweet", "mornings", "of", and "spring". The text, unit, and search units described above are merely examples, and the unit generation module 120 according to an embodiment of the present invention can generate arbitrary search units by dividing arbitrary user input text into arbitrary units.

The unit generation module 120 may be implemented with software or hardware for dividing the original text into predetermined units, or a combination thereof. The unit generation module 120 may be implemented with instructions that can be executed by one or more processors to perform the operations.
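As a minimal sketch of the word-level division described above, the following Python function splits a text into word units. The function name `split_into_units` and the rule that a word is a run of letters, digits, or apostrophes are assumptions made for illustration; the source does not specify how punctuation is handled or how other unit types are produced.

```python
import re


def split_into_units(text: str) -> list[str]:
    """Split a text into word units (hypothetical helper, not from the source)."""
    # Assumption: a word is a run of letters, digits, or apostrophes,
    # so commas and full stops are dropped rather than kept as units.
    return re.findall(r"[A-Za-z0-9']+", text)


original_text = (
    "A wonderful serenity has taken possession of my entire soul, "
    "like these sweet mornings of spring which I enjoy with my whole heart."
)
units = split_into_units(original_text)
search_units = split_into_units("these sweet mornings of spring")
print(units)
print(search_units)
```

The same function serves both the stored original text and the text entered by the user, which matches the description of one module producing both the plurality of units and the search units.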

The key conversion module 130 may convert the plurality of units into a key composed of a bit string by using a Bloom filter. The Bloom filter has the same bit size as the key and may include one or more hash functions. The key may be a bit string representing the values obtained by computing the plurality of units with the one or more hash functions. For example, when the key is composed of 16 bits, each of the plurality of units may be computed with the hash function and the result recorded at the corresponding position of the 16-bit bit array, which is then inserted into the index.
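The conversion of units into a 16-bit key can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the names `bit_positions` and `make_key`, the choice of two hash functions, and the use of salted SHA-256 as the hash are all hypothetical, since the source does not name a hash function.

```python
import hashlib

KEY_BITS = 16    # key width from the 16-bit example in the text
NUM_HASHES = 2   # assumed number of hash functions


def bit_positions(unit: str, bits: int = KEY_BITS, k: int = NUM_HASHES) -> list[int]:
    """Return k bit positions in [0, bits) for one unit."""
    positions = []
    for i in range(k):
        # Assumption: salting SHA-256 with the index i stands in for
        # k independent hash functions.
        digest = hashlib.sha256(f"{i}:{unit}".encode("utf-8")).digest()
        positions.append(int.from_bytes(digest[:4], "big") % bits)
    return positions


def make_key(units: list[str], bits: int = KEY_BITS, k: int = NUM_HASHES) -> int:
    """OR the bit positions of every unit into one integer key."""
    key = 0
    for unit in units:
        for pos in bit_positions(unit, bits, k):
            key |= 1 << pos
    return key


key = make_key(["these", "sweet", "mornings", "of", "spring"])
print(f"{key:016b}")
```

A 16-bit key fills up quickly when many units are inserted, so a wider key is likely to be needed for long texts; the width is a parameter here for that reason.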

The key conversion module 130 may convert one or more search units into a search key composed of a bit string through a Bloom filter. The search key may be a bit string representing the values obtained by computing the one or more search units with one or more hash functions. For example, when the search key is composed of 16 bits, each of the search units may be computed with the hash function and the result recorded at the corresponding position of the 16-bit bit array. Hereinafter, the Bloom filter will be described in more detail.

The Bloom filter is a probabilistic data structure used to check whether an element is a member of a set. It acts as an indicator function that determines probabilistically whether a given element belongs to the set. A Bloom filter is useful when comparing two sets of which one is small and the other is large and whose intersection is small. Because a Bloom filter is made up of bits, it can be very small compared to the actual data.

When a membership query asking whether an element x belongs to a set A is performed through a Bloom filter, x is not necessarily in A even if the result is true (a false positive). False negatives, however, never occur. In other words, the result of a membership query using a Bloom filter is either "probably an element of the set" or "definitely not an element of the set". The main purpose of the Bloom filter is to reduce the resources wasted during a search on detecting that a key does not exist.

The Bloom filter is a data structure that uses a single bit array V of m bits and k mutually independent, uniformly distributed hash functions. Each hash function returns a value from 0 to m-1 for each input value.

Figure 112015086033538-pat00001

When inserting an element into the Bloom filter, each result value of k hash functions with respect to the input element is used as an index to the array V, and the corresponding position is set to 1.

Figure 112015086033538-pat00002
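The effect of insertion on the false positive rate can be estimated with the standard Bloom filter analysis. The symbols n (the number of inserted elements), j (a bit position), and p (the false positive probability) are introduced here for illustration and do not appear in the source; the estimate assumes the k hash functions are independent and uniform over the m positions.

```latex
\Pr\bigl[\,V[j] = 0 \text{ after } n \text{ insertions}\,\bigr]
  = \left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m},
\qquad
p \approx \left(1 - e^{-kn/m}\right)^{k}.
```

For a fixed m, inserting more elements drives p toward 1, which is why the key width has to be chosen with the number of units per original text in mind.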

To check whether an element x is contained in the Bloom filter, all k hash results for x are used as array indexes into V. If every corresponding array value is 1, x is probably an element of the set. If any corresponding array value is not 1, it can be determined that x is definitely not an element of the set.

Figure 112015086033538-pat00003
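The insert and query operations described above can be sketched as a small Python class. The class name `BloomFilter`, the method names, and the use of salted SHA-256 digests as the k hash functions are assumptions for illustration; only the bit array V, its size m, and the count k come from the text.

```python
import hashlib


class BloomFilter:
    """Bit array V of m bits with k hash functions (illustrative sketch)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.V = [0] * m

    def _indexes(self, element: str) -> list[int]:
        # Assumption: salting SHA-256 with i stands in for k independent
        # hash functions, each returning a value in 0..m-1.
        indexes = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{element}".encode("utf-8")).digest()
            indexes.append(int.from_bytes(digest[:8], "big") % self.m)
        return indexes

    def insert(self, element: str) -> None:
        # Set the position given by each hash result to 1.
        for idx in self._indexes(element):
            self.V[idx] = 1

    def might_contain(self, element: str) -> bool:
        # True means "probably in the set"; False means "definitely not".
        return all(self.V[idx] == 1 for idx in self._indexes(element))


bf = BloomFilter(m=64, k=3)
for element in ("x", "y", "z"):
    bf.insert(element)
print(bf.might_contain("x"))
print(bf.might_contain("w"))
```

An inserted element always answers true, whereas an element that was never inserted usually answers false but may occasionally answer true when its positions happen to be set by other elements.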

FIG. 2A is an exemplary diagram illustrating how a probabilistic membership query is performed using a Bloom filter. When each hash function is applied to x, y, and z (three hash functions in the example of FIG. 2A), the array values at the indexes given by the hash results are all 1. In this case, x, y, and z can be presumed to be elements of the set. On the other hand, the array values of V at the indexes given by the hash results for w are not all 1 but include one 0, so it can be determined that w is definitely not an element of the set.

FIG. 2B is an exemplary diagram showing a specific application to facilitate understanding. Assume a Bloom filter with a 10-bit bit array and one hash function, where the hash function divides its input by 10 and returns the remainder. If the reference set contains only the element 34, the remainder after dividing by 10 is 4, so the fourth bit is changed to 1 as shown in FIG. 2B. If the first element to be compared with the reference set is 56, applying the hash function to 56 gives 6, and the sixth bit is 0. It is therefore certain that 56 is not an element of the reference set, without any need to compare it with the original text. If the second element to be compared with the reference set is 54, applying the hash function gives 4, and the fourth bit is 1. Therefore 54 must still be checked against the original text. As with 56, when the Bloom filter returns false, the element is never actually in the reference set; that is, a false negative (the Bloom filter judges that an element is not in the reference set although it actually is) does not occur. As with 54, however, the Bloom filter may return true although the element is not actually in the reference set; that is, a false positive (the Bloom filter judges that an element is in the reference set although it actually is not) can occur. As the number of false positives increases, the performance of the Bloom filter deteriorates, so false positives must be reduced to improve performance. False positives can be reduced by increasing the number of bits or by increasing the number of hash functions. However, increasing the number of bits in the array increases memory usage, and increasing the number of hash functions increases CPU usage, so it is important to keep an appropriate balance.
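The FIG. 2B example can be reproduced in a few lines of Python. The names `hash_mod10`, `bits`, and `probably_in_set` are hypothetical; the 10-bit array, the remainder-by-10 hash, and the values 34, 56, and 54 are taken from the text. Bit positions are counted from 0 here, so the "fourth bit" of the text is index 4.

```python
M = 10  # number of bits in the array


def hash_mod10(value: int) -> int:
    """The single hash function of the example: remainder after division by 10."""
    return value % M


bits = [0] * M
bits[hash_mod10(34)] = 1  # the reference set contains only 34


def probably_in_set(value: int) -> bool:
    """True: must still be checked against the original; False: definitely absent."""
    return bits[hash_mod10(value)] == 1


print(bits)                 # [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
print(probably_in_set(56))  # False: bit 6 is 0, so 56 is certainly not in the set
print(probably_in_set(54))  # True: bit 4 is 1, a false positive because 54 != 34
```

The result for 54 shows why a positive answer only narrows the search: the candidate still has to be compared with the stored original.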

Accordingly, when the plurality of units are converted into a key by the Bloom filter and stored, whether the text entered by the user is included in the original text can be determined by comparison with the search key. For example, if the user enters "these sweet mornings of spring" to search for the original text "A wonderful serenity has taken possession of my entire soul, like these sweet mornings of spring", a search key is generated from the search text through the Bloom filter and compared with the converted key of the original text. Because each word of the search text is included in the original text, the apparatus 100 can display the original text on the display 160. The user can select and use an appropriate one of the one or more original texts returned for the search text. In the above example, if the user enters "these sweet mornings of spring that" as the search text, "that" is not an element of the original text, so the above-described original text will not be returned. Accordingly, the user can perform a more accurate search by using more accurate search terms. Because the Bloom filter can produce false positives, unwanted original texts may be returned by the user's search, but the Bloom filter does not produce false negatives. If the user lengthens the search text to perform a full text search, the search text can include words that do not appear in original texts other than the one being sought, so the search becomes more precise.

The search module 140 may determine one or more keys corresponding to the search key among one or more previously stored keys. The previously stored keys may be generated by the key conversion module 130 according to an embodiment of the present invention and may be keys that are matched with, and stored together with, the original text and the plurality of units. The search module 140 may compare the previously stored keys with the search key bit by bit to determine whether they correspond. That is, the search module 140 may compare each bit of the search key with each bit of a stored key to determine a corresponding key. The previously stored keys and the search key may have the same bit size. The previously stored keys may be generated through the Bloom filter by the key conversion module 130 from the divided original text. Likewise, the search key may be generated through the Bloom filter by the key conversion module 130 from the divided search text input by the user. The previously stored keys may be stored in the memory 150. For example, when the user inputs "these sweet mornings of spring" as the search text, the search module 140 may determine the search key for the search text and, among the stored keys, the keys that correspond to that search key. The search key described above is only an example, and the search module 140 can determine one or more corresponding keys for any search key.
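One way to read the bit-by-bit comparison is as a subset test: a stored key corresponds to the search key when every bit set in the search key is also set in the stored key. That reading is an assumption, as are the function names and the literal key values below, which are made up for illustration and are not produced by any real hash.

```python
def key_corresponds(stored_key: int, search_key: int) -> bool:
    """True if every 1 bit of the search key is also 1 in the stored key."""
    return (stored_key & search_key) == search_key


def find_corresponding(stored_keys: list[int], search_key: int) -> list[int]:
    """Return the row numbers of all stored keys that correspond to the search key."""
    return [row for row, key in enumerate(stored_keys) if key_corresponds(key, search_key)]


# Hypothetical 16-bit keys, written with digit separators for readability.
stored_keys = [
    0b1011_0010_0100_1101,
    0b0000_1111_0000_1111,
    0b1111_0000_1111_0000,
]
search_key = 0b0000_0010_0000_1001

print(find_corresponding(stored_keys, search_key))  # [0, 1]
```

Each comparison is a single AND and an equality check on an integer, which keeps the per-key cost small on a constrained processor.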

The memory 150 may map the key to at least one of the original text and the plurality of units and store them in an index table. The index table may be a predetermined space within the memory 150. In addition, the memory 150 may store instructions that enable each module to perform operations in accordance with an embodiment of the present invention, and may temporarily store input/output data.

The memory 150 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card type memory (e.g., SD or XD memory), a RAM (Random Access Memory), an SRAM (Static Random Access Memory), a ROM (Read Only Memory), an EEPROM (Electrically Erasable Programmable Read-Only Memory), a PROM (Programmable Read-Only Memory), a magnetic memory, a magnetic disk, and/or an optical disk.

The display 160 may display search results, search windows, search text, original text, and the like. The display 160 may display text matched with one or more keys corresponding to the search key. The text may include the original text derived from the search result. The display 160 may include at least one of a liquid crystal display (LCD), a thin film transistor liquid crystal display (TFT LCD), an organic light-emitting diode (OLED), a flexible display, and a three-dimensional (3D) display.

FIG. 3 is a diagram illustrating a process of storing an original text in an index table according to an embodiment of the present invention.

FIG. 3 illustrates a process in which the original text "A wonderful serenity has taken possession of my entire soul, like these sweet mornings of spring which I enjoy with my whole heart." is matched with a key and stored.

The unit generation module 120 divides the original text into words, the predetermined unit, and generates a plurality of units including "A", "wonderful", "serenity", "has", "taken", "possession", "of", "my", "entire", "soul", "like", "these", "sweet", "mornings", "of", "spring", "which", "I", "enjoy", "with", "my", "whole", and "heart". At this time, the capacity of the original text may be 52 KB. The above-described original text is only an example, and the unit generation module 120 according to an embodiment of the present invention can divide any original text into arbitrary predetermined units to generate a plurality of units.

The key conversion module 130 may convert the plurality of units into a key composed of a bit string through a Bloom filter (see the Bloom filter step of FIG. 3). The key conversion module 130 may convert the plurality of units into a bit string by computing each of the plurality of units with a hash function and recording the computed value at the predetermined bits of the bit string. The capacity that the converted key occupies can be 46 KB smaller than the capacity of the original text. The above-mentioned capacity is only an example, and the key conversion module 130 according to the embodiment of the present invention can produce a key of arbitrary length.

FIG. 4 is a flowchart of a method for supporting full text search in an embedded environment according to an embodiment of the present invention.

The method shown in FIG. 4 may be performed in the apparatus 100 according to an embodiment of the present invention.

In step 210, the unit generation module 120 may generate a plurality of units by dividing the stored original text into predetermined units. For example, the original text "A wonderful serenity has taken possession of my entire soul, like these sweet mornings of spring which I enjoy with my whole heart." may be divided into words to generate a plurality of units including "A", "wonderful", "serenity", "has", "taken", "possession", "of", "my", "entire", "soul", "like", "these", "sweet", "mornings", "of", "spring", "which", "I", "enjoy", "with", "my", "whole", and "heart". The above-described original text, division unit, and plurality of units are merely examples, and the unit generation module 120 according to the embodiment of the present invention can divide any original text into arbitrary units and generate an arbitrary plurality of units.

In step 230, the key conversion module 130 may convert the plurality of units into a key composed of a bit string by using a Bloom filter. The key conversion module 130 may compute the plurality of units with one or more hash functions and insert the results into the bit string. For example, when the key is composed of 16 bits, the key conversion module 130 may compute each of the plurality of units with a hash function and record the resulting value at the corresponding one of the bits of the bit string.

At step 250, the device 100 may map the key to at least one of the original text and the plurality of units and store it in an index table. The index table may be any storage space located in the memory 150. The index table can be referred to when retrieving data.
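A compact index table for step 250 might look like the following sketch, which assumes 16-bit keys and keeps the keys in a packed `array` so that each key costs only as many bytes as the platform's unsigned short (two on common platforms). The class name `IndexTable`, its methods, and the example key value `0xB24D` are hypothetical; the source only states that keys are mapped to the original text and stored in a region of memory.

```python
from array import array


class IndexTable:
    """Maps fixed-width keys to original texts, row by row (illustrative sketch)."""

    def __init__(self) -> None:
        self.keys = array("H")       # packed unsigned shorts, one per stored key
        self.texts: list[str] = []   # texts[row] is the original text for keys[row]

    def add(self, key: int, original_text: str) -> int:
        """Store one (key, text) pair and return its row number."""
        self.keys.append(key)        # raises OverflowError if the key is too wide
        self.texts.append(original_text)
        return len(self.keys) - 1

    def lookup(self, row: int) -> tuple[int, str]:
        """Return the key and the original text stored at a row."""
        return self.keys[row], self.texts[row]


table = IndexTable()
row = table.add(0xB24D, "A wonderful serenity has taken possession of my entire soul")
print(table.lookup(row))
print(len(table.keys) * table.keys.itemsize, "bytes used for keys")
```

Keeping keys and texts in parallel rows lets a search scan the small key array first and touch the larger texts only for rows whose keys correspond.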

FIG. 5 is a flowchart of a method for supporting full text search in an embedded environment according to another embodiment of the present invention.

At step 310, the device 100 may receive a textual input from the user via the user input module 110.

In step 330, the unit generation module 120 may divide the input text into predetermined units to generate one or more search units. For example, to retrieve the original text "A wonderful serenity has taken possession of my entire soul, like these sweet mornings of spring which I enjoy with my whole heart.", the user may input "these sweet mornings of spring", and the unit generation module 120 may divide the text input by the user into words to generate the search units "these", "sweet", "mornings", "of", and "spring". The text, unit, and search units described above are merely examples, and the unit generation module 120 according to an embodiment of the present invention can generate arbitrary search units by dividing arbitrary user input text into arbitrary units.

In step 350, the key conversion module 130 may convert the one or more search units into a search key composed of a bit string through a Bloom filter. The key conversion module 130 may compute the search units with one or more hash functions and insert the results into the bit string. For example, when the key is composed of 16 bits, the key conversion module 130 may compute each of the search units with a hash function and record the result at the corresponding one of the bits of the bit string.

In step 370, the search module 140 may determine one or more keys corresponding to the search key among the one or more previously stored keys. The previously stored keys may be generated by the key conversion module 130 according to an embodiment of the present invention and may be keys that are matched with, and stored together with, the original text and the plurality of units. The search module 140 may compare the previously stored keys with the search key bit by bit to determine whether they correspond.
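Steps 330 to 370 can be put together in a short end-to-end sketch. Several choices here are assumptions rather than details from the source: the 64-bit key width (wider than the 16-bit example, to keep false positives low), lower-casing of words, salted SHA-256 as the hash, and the final verification pass that compares candidate texts word by word to discard false positives. All function names are hypothetical.

```python
import hashlib
import re

KEY_BITS = 64    # assumed key width; the text's own example uses 16 bits
NUM_HASHES = 2   # assumed number of hash functions


def split_into_units(text: str) -> list[str]:
    # Step 330: divide text into word units (lower-casing is an assumption).
    return re.findall(r"[a-z0-9']+", text.lower())


def make_key(units: list[str]) -> int:
    # Step 350: set the bits chosen by each hash of each unit.
    key = 0
    for unit in units:
        for i in range(NUM_HASHES):
            digest = hashlib.sha256(f"{i}:{unit}".encode("utf-8")).digest()
            key |= 1 << (int.from_bytes(digest[:4], "big") % KEY_BITS)
    return key


def build_index(texts: list[str]) -> list[tuple[int, str]]:
    return [(make_key(split_into_units(text)), text) for text in texts]


def candidate_texts(index: list[tuple[int, str]], query: str) -> list[str]:
    # Step 370: keep texts whose key has every bit of the search key set.
    search_key = make_key(split_into_units(query))
    return [text for key, text in index if (key & search_key) == search_key]


def search(index: list[tuple[int, str]], query: str) -> list[str]:
    # Extra verification (not a step in the source): drop false positives.
    wanted = set(split_into_units(query))
    return [t for t in candidate_texts(index, query) if wanted <= set(split_into_units(t))]


texts = [
    "A wonderful serenity has taken possession of my entire soul, "
    "like these sweet mornings of spring which I enjoy with my whole heart.",
    "Sweet mornings are rare in winter.",
    "I enjoy the spring with my whole heart.",
]
index = build_index(texts)
print(search(index, "these sweet mornings of spring"))
print(search(index, "these sweet mornings of spring that"))
```

The key comparison never rejects a text that really contains every search word, so the verification pass can only remove results and never needs to add any.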

FIG. 6 is a block diagram of a computer that operates to execute a computer program for providing full-text search according to an embodiment of the present invention.

Referring to FIG. 6, a brief general description of a suitable computing environment in which various aspects of an embodiment of the invention may be implemented may be provided.

Although the present invention has been described above generally in terms of computer-executable instructions that can be executed on one or more computers, those skilled in the art will appreciate that the present invention may also be implemented in combination with other program modules and/or as a combination of hardware and software.

Generally, program modules include routines, programs, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the inventive methods may be practiced with other computer system configurations, including single-processor or multi-processor computer systems, minicomputers, and mainframe computers, as well as personal computers, handheld computing devices, microprocessor-based or programmable consumer electronics, and the like (each of which may operate in conjunction with one or more associated devices).

Illustrative aspects of the invention may also be practiced in distributed computing environments where certain tasks are performed by remote processing devices connected through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

Computers typically include a variety of computer readable media. Any medium accessible by a computer can be a computer-readable medium, which includes both volatile and non-volatile media, both removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital video disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, Or any other medium which can be accessed by a computer and used to store the desired information.

Communication media typically embody computer readable instructions, data structures, program modules or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and include any information delivery media. The term modulated data signal refers to a signal that has one or more of its characteristics set or changed to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, or other wireless media. Combinations of any of the above described media are also intended to be included within the scope of computer readable media.

An exemplary environment 1100 that implements various aspects of the present invention is shown. It includes a computer 1102, and the computer 1102 includes a processing unit 1104, a system memory 1106, and a system bus 1108. The system bus 1108 couples system components, including but not limited to the system memory 1106, to the processing unit 1104. The processing unit 1104 may be any of a variety of commercially available processors. Dual processors and other multiprocessor architectures may also be used as the processing unit 1104.

The system bus 1108 may be any of several types of bus structures that may further interconnect to a memory bus, a peripheral bus, and a local bus using any of a variety of commercial bus architectures. The system memory 1106 includes read only memory (ROM) 1110 and random access memory (RAM) 1112. A basic input/output system (BIOS) is stored in a non-volatile memory 1110, such as a ROM, EPROM, or EEPROM; the BIOS contains the basic routines that help to transfer information between components within the computer 1102. The RAM 1112 may also include a high speed RAM such as static RAM for caching data.

The computer 1102 also includes an internal hard disk drive (HDD) 1114 (e.g., EIDE, SATA), which may also be configured for external use in a suitable chassis; a magnetic floppy disk drive (FDD) 1116 (e.g., for reading from or writing to a removable diskette 1118); and an optical disk drive 1120 (e.g., for reading a CD-ROM disc 1122, or for reading from or writing to other high capacity optical media such as a DVD). The hard disk drive 1114, magnetic disk drive 1116 and optical disk drive 1120 are connected to the system bus 1108 by a hard disk drive interface 1124, a magnetic disk drive interface 1126 and an optical drive interface 1128, respectively. The interface 1124 for external drive implementations includes at least one or both of USB (Universal Serial Bus) and IEEE 1394 interface technologies.

These drives and their associated computer-readable media provide non-volatile storage of data, data structures, computer-executable instructions, and the like. In the case of the computer 1102, the drives and media accommodate the storage of any data in a suitable digital format. While the above description of computer readable media refers to HDDs, removable magnetic disks, and removable optical media such as CDs or DVDs, those skilled in the art will appreciate that other types of computer-readable media, such as zip drives, magnetic cassettes, flash memory cards, or the like, may also be used in the exemplary operating environment, and that any such media may contain computer-executable instructions for carrying out the methods of the present invention.

A number of program modules may be stored in the drives and RAM 1112, including an operating system 1130, one or more application programs 1132, other program modules 1134, and program data 1136. All or a portion of the operating system, applications, modules, and/or data may also be cached in the RAM 1112. It will be appreciated that the present invention may be implemented with a variety of commercially available operating systems or combinations of operating systems.

A user may enter commands and information into the computer 1102 via one or more wired/wireless input devices, such as a keyboard 1138 and a pointing device such as a mouse 1140. Other input devices (not shown) may include a microphone, an IR remote control, a joystick, a game pad, a stylus pen, a touch screen, and the like. These and other input devices are often connected to the processing unit 1104 via an input device interface 1142 that is coupled to the system bus 1108, but may also be connected by other interfaces, such as a parallel port, an IEEE 1394 serial port, a game port, a USB port, an IR interface, and so forth.

A monitor 1144 or other type of display device is also connected to the system bus 1108 via an interface, such as a video adapter 1146. In addition to the monitor 1144, the computer typically includes other peripheral output devices (not shown), such as speakers, printers, and the like.

The computer 1102 may operate in a networked environment using logical connections to one or more remote computers, such as remote computer(s) 1148, via wired and/or wireless communication. The remote computer(s) 1148 can be a workstation, a server computer, a router, a personal computer, a portable computer, a microprocessor-based entertainment device, a peer device or other conventional network node, and typically include many or all of the elements described for the computer 1102, although, for simplicity, only a memory storage device 1150 is shown. The logical connections depicted include a wired/wireless connection to a local area network (LAN) 1152 and/or a larger network, e.g., a wide area network (WAN) 1154. Such LAN and WAN networking environments are commonplace in offices and companies and facilitate enterprise-wide computer networks such as intranets, all of which can be connected to a worldwide computer network, for example the Internet.

When used in a LAN networking environment, the computer 1102 is connected to the local network 1152 via a wired and/or wireless communication network interface or adapter 1156. The adapter 1156 may facilitate wired or wireless communication to the LAN 1152, and the LAN 1152 may also include a wireless access point installed therein for communicating with the wireless adapter 1156. When used in a WAN networking environment, the computer 1102 may include a modem 1158, may be connected to a communications server on the WAN 1154, or may have other means for establishing communications over the WAN 1154, such as via the Internet. The modem 1158, which may be an internal or external and a wired or wireless device, is coupled to the system bus 1108 via the serial port interface 1142. In a networked environment, program modules described for the computer 1102, or portions thereof, may be stored in the remote memory/storage device 1150. It will be appreciated that the network connections shown are exemplary and that other means of establishing a communication link between the computers may be used.

The computer 1102 is operable to communicate with any wireless device or entity that is deployed and operable in wireless communication, such as a printer, a scanner, a desktop and/or portable computer, a portable data assistant (PDA), any equipment or location associated with a wirelessly detectable tag (e.g., a kiosk, a newsstand, a restroom), and a telephone. This includes at least Wi-Fi and Bluetooth™ wireless technologies. Thus, the communication may be a predefined structure, as in a conventional network, or simply an ad hoc communication between at least two devices.

Wi-Fi (Wireless Fidelity) allows connection to the Internet from a couch at home, a bed in a hotel room, or a conference room at work, without wires. Wi-Fi is a wireless technology, similar to that used in a cell phone, that enables devices such as computers to transmit and receive data indoors and outdoors, i.e., anywhere within the coverage area of a base station. Wi-Fi networks use a wireless technology called IEEE 802.11 (a, b, g, etc.) to provide a secure, reliable, and high-speed wireless connection. Wi-Fi can be used to connect computers to each other, to the Internet, and to wired networks (which use IEEE 802.3 or Ethernet). Wi-Fi networks operate in the unlicensed 2.4 and 5 GHz wireless bands, for example at data rates of 11 Mbps (802.11b) or 54 Mbps (802.11a), or in products containing both bands (dual band). Thus, such networks can provide real-world performance similar to the basic 10BaseT wired Ethernet networks used in many offices.

Figure 7 shows a schematic block diagram of an exemplary computing environment for executing a program for providing full text search in accordance with an embodiment of the present invention.

Referring to FIG. 7, system 1200 includes one or more client(s) 1202. The client(s) 1202 may be hardware and/or software (e.g., threads, processes, computing devices). The client(s) 1202 may, for example, store cookie(s) and/or associated contextual information by using the present invention.

System 1200 also includes one or more server(s) 1204. The server(s) 1204 may also be hardware and/or software (e.g., threads, processes, computing devices). The server(s) 1204 may, for example, house threads that perform transformations by using the present invention. One possible communication between a client 1202 and a server 1204 may be in the form of a data packet that is configured to be transmitted between two or more computer processes. The data packet may include, for example, a cookie and/or associated contextual information. System 1200 includes a communications framework 1206 (e.g., a worldwide communications network such as the Internet) that can be utilized to facilitate communications between the client(s) 1202 and the server(s) 1204.

Communications may be facilitated via wired (including optical fiber) and/or wireless technology. The client(s) 1202 are connected to one or more client data store(s) 1208 that may be used to store information local to the client(s) 1202 (e.g., cookie(s) and/or associated contextual information). Similarly, the server(s) 1204 are connected to one or more server data store(s) 1210 that may be used to store information local to the server(s) 1204.

Those of ordinary skill in the art will understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced in the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.

Those skilled in the art will appreciate that the various illustrative logical blocks, modules, processors, means, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, as various forms of program or design code (which may be referred to herein as "software"), or as a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the design constraints imposed on the particular application and the overall system. Those skilled in the art may implement the described functionality in various ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The various embodiments presented herein may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques. The term "article of manufacture" includes a computer program, carrier, or media accessible from any computer-readable device. For example, computer-readable media include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips, etc.), optical disks (e.g., CD, DVD, etc.), smart cards, and flash memory devices (e.g., EEPROM, cards, sticks, key drives, etc.). The various storage media presented herein also include one or more devices and/or other machine-readable media for storing information. The term "machine-readable medium" includes, but is not limited to, a wireless channel and various other media capable of storing, holding, and/or transferring instruction(s) and/or data.

It will be appreciated that the particular order or hierarchy of steps in the presented processes is one illustration of exemplary approaches. Based on design priorities, the particular order or hierarchy of steps in the processes may be rearranged within the scope of the present invention. The appended method claims present elements of the various steps in a sample order, but are not meant to be limited to the specific order or hierarchy presented.

The description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features presented herein.

Claims (12)

1. A method for supporting full text search in an embedded environment, the method comprising:
dividing a stored original text into predetermined units to generate a plurality of units;
converting the plurality of units into a key composed of a bit array through a Bloom filter; and
mapping at least one of the original text and the plurality of units to the key and storing the key in an index table.
2. The method according to claim 1,
wherein the predetermined unit is a unit of at least one of a character, a word, a phrase, a paragraph, and a fingerprint.
3. The method according to claim 1,
wherein the Bloom filter has a bit size equal to the bit size of the key and includes one or more hash functions.
4. The method according to claim 1,
wherein the key comprises a bit string representing values obtained by computing the plurality of units with one or more hash functions.
5. A computer program stored in a computer-readable storage medium and comprising a plurality of instructions executed by one or more processors, the computer program comprising:
instructions for generating a plurality of units by dividing a stored original text into predetermined units;
instructions for converting the plurality of units into a key composed of a bit array via a Bloom filter; and
instructions for mapping the key to at least one of the original text and the plurality of units and storing the key in an index table.
6. An apparatus for supporting full text search in an embedded environment, the apparatus comprising:
a unit generation module for generating a plurality of units by dividing a stored original text into predetermined units;
a key conversion module for converting the plurality of units into a key composed of a bit array through a Bloom filter; and
a memory for storing, in an index table, the key and at least one of the original text and the plurality of units.
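The indexing steps recited in claims 1 to 6 can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the 64-bit key size, the three hash functions derived by salting SHA-256, the choice of whitespace-separated words as the unit, and the helper names `split_into_units`, `bit_positions`, `make_key`, and `build_index`.

```python
import hashlib

KEY_BITS = 64    # assumed key size in bits; the claims do not fix a value
NUM_HASHES = 3   # assumed number of hash functions


def split_into_units(text):
    """Divide text into units; here a unit is a lowercased word (an assumption)."""
    return text.lower().split()


def bit_positions(unit, key_bits=KEY_BITS, num_hashes=NUM_HASHES):
    """Derive one bit position per hash function by salting SHA-256 with an index."""
    positions = []
    for i in range(num_hashes):
        digest = hashlib.sha256(f"{i}:{unit}".encode("utf-8")).digest()
        positions.append(int.from_bytes(digest[:8], "big") % key_bits)
    return positions


def make_key(units, key_bits=KEY_BITS, num_hashes=NUM_HASHES):
    """Fold all units into one Bloom-filter key, held as an int used as a bit array."""
    key = 0
    for unit in units:
        for pos in bit_positions(unit, key_bits, num_hashes):
            key |= 1 << pos
    return key


def build_index(texts):
    """Return a hypothetical index table: a list of (key, original_text) pairs."""
    return [(make_key(split_into_units(text)), text) for text in texts]


index_table = build_index(["call mom tomorrow", "buy milk and eggs"])
for key, text in index_table:
    # Show each key as a fixed-width bit string next to the text it is mapped to.
    print(f"{key:0{KEY_BITS}b}  {text}")
```

Holding the key in a single integer keeps the per-text storage at a fixed number of bits, which is one plausible reason such a scheme suits a memory-constrained device; a real implementation would tune the key size and hash count to the expected number of units per text.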
7. A method for supporting full text search in an embedded environment, the method comprising:
receiving text from a user;
dividing the text into predetermined units to generate one or more search units;
converting the one or more search units into a search key composed of a bit string through a Bloom filter; and
determining, among one or more previously stored keys, one or more keys corresponding to the search key.
8. The method of claim 7, further comprising:
displaying text mapped to the one or more keys corresponding to the search key.
9. The method of claim 7,
wherein determining the one or more keys corresponding to the search key among the one or more previously stored keys comprises:
determining, bit by bit, whether each of the one or more previously stored keys corresponds to the search key.
10. The method of claim 7,
wherein the one or more previously stored keys and the search key have the same bit size, and
wherein the search key comprises a bit string representing values obtained by computing the one or more search units with one or more hash functions.
11. A computer program stored in a computer-readable storage medium and comprising a plurality of instructions executed by one or more processors, the computer program comprising:
instructions for receiving text from a user;
instructions for dividing the text into predetermined units to generate one or more search units;
instructions for converting the one or more search units into a search key composed of a bit string via a Bloom filter; and
instructions for determining, among one or more previously stored keys, one or more keys corresponding to the search key.
12. An apparatus for supporting full text search in an embedded environment, the apparatus comprising:
a user input module for receiving text from a user;
a unit generation module for dividing the text into predetermined units to generate one or more search units;
a key conversion module for converting the one or more search units into a search key composed of a bit string through a Bloom filter;
a search module for determining, among one or more previously stored keys, one or more keys corresponding to the search key; and
a memory storing the one or more keys and text mapped to each of the one or more keys.
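The search steps recited in claims 7 to 12 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the 64-bit key size, the three salted SHA-256 hash functions, word-level units, and the names `make_key` and `search` are all hypothetical. The bit-by-bit comparison of claim 9 is read here as the usual Bloom-filter test that every bit set in the search key is also set in the stored key. The final confirmation against the stored text is an addition of this sketch, included because a Bloom filter can report false positives.

```python
import hashlib

KEY_BITS = 64    # assumed key size; stored keys and search keys share it
NUM_HASHES = 3   # assumed number of hash functions


def make_key(text, key_bits=KEY_BITS, num_hashes=NUM_HASHES):
    """Split text into lowercased words and fold them into one Bloom-filter key."""
    key = 0
    for unit in text.lower().split():
        for i in range(num_hashes):
            digest = hashlib.sha256(f"{i}:{unit}".encode("utf-8")).digest()
            key |= 1 << (int.from_bytes(digest[:8], "big") % key_bits)
    return key


def search(query, index_table):
    """Return stored texts whose key covers the search key, then confirm them."""
    search_key = make_key(query)
    # Bitwise test: a stored key corresponds if it has every bit of the search key.
    candidates = [text for key, text in index_table
                  if key & search_key == search_key]
    # Assumed extra step: rule out Bloom-filter false positives on the real text.
    query_units = set(query.lower().split())
    return [text for text in candidates
            if query_units <= set(text.lower().split())]


texts = ["call mom tomorrow", "buy milk and eggs"]
index_table = [(make_key(text), text) for text in texts]

print(search("milk", index_table))      # ['buy milk and eggs']
print(search("mom call", index_table))  # ['call mom tomorrow']
print(search("zebra", index_table))     # []
```

The bitwise test touches only one fixed-size key per stored text, so most non-matching texts are rejected without reading the text itself; only the surviving candidates are compared against their stored text.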




KR1020150125194A 2015-09-04 2015-09-04 Method and apprapatus for supporting full text search in embedded environment and computer program stored on computer-readable medium KR101754580B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150125194A KR101754580B1 (en) 2015-09-04 2015-09-04 Method and apprapatus for supporting full text search in embedded environment and computer program stored on computer-readable medium

Publications (2)

Publication Number Publication Date
KR20170028514A KR20170028514A (en) 2017-03-14
KR101754580B1 true KR101754580B1 (en) 2017-07-06

Family

ID=58460175

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150125194A KR101754580B1 (en) 2015-09-04 2015-09-04 Method and apprapatus for supporting full text search in embedded environment and computer program stored on computer-readable medium

Country Status (1)

Country Link
KR (1) KR101754580B1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102223151B1 (en) * 2019-08-29 2021-03-03 김영수 Apparatus for Searching Semiconductor Parts

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080111718A1 (en) 2006-11-15 2008-05-15 Po-Ching Lin String Matching System and Method Using Bloom Filters to Achieve Sub-Linear Computation Time

Also Published As

Publication number Publication date
KR20170028514A (en) 2017-03-14

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal