WO2007088576A1

WO2007088576A1 - File search program, method, and device

Info

Publication number: WO2007088576A1
Application number: PCT/JP2006/301517
Authority: WO
Inventors: Takahiro Matsuda; Shigefumi Yamada; Takashi Morihara
Original assignee: Fujitsu Limited
Priority date: 2006-01-31
Filing date: 2006-01-31
Publication date: 2007-08-09
Also published as: JP4973503B2; JPWO2007088576A1

Abstract

A target file is searched for in a file storage unit in which attribute information extracted from each file is registered in association with that file. A search condition setting unit sets attribute information as a file search condition. For each file in the file storage unit, a search unit calculates a similarity using weights from the attribute information of the search condition and the registered attribute information, and searches for files accordingly. A search result display unit displays the file search result and generates a display state corresponding to a browsing operation on the search result. A browsing operation extraction unit extracts the history of browsing operations performed on the search result display unit. A weight calculation unit calculates, from the browsing operation history, a value indicating the search viewpoint, that is, the feature of the file the user is looking for, as the weight to be used in calculating the similarity. The calculated weight is set for the similarity calculation when the search is performed again.

Description

Specification

File search program, method and apparatus

Technical field

[0001] The present invention relates to a file search program, method, and apparatus for searching for a target file among files stored in a personal computer, a file server, and the like, and in particular to a file search program, method, and apparatus that search for a target file using file attribute information.

Background art

[0002] In recent years, with the spread of computer systems and networks, a large amount of electronic data is exchanged through networks and storage media and stored on computers. Under such circumstances, a large amount of information can be obtained, but at the same time it has become difficult to find the target information.

[0003] As a method of searching for a target file from a file stored in a personal computer, a file server, or the like, a full-text search for words and phrases stored in the file is generally used.

[0004] Also, if there are a large number of search targets or the target file is ambiguous, AND and OR searches using multiple keywords are performed, and keywords are added step by step to narrow down the search results.

[0005] There is also a method in which the user judges the suitability of each search result file, that is, whether or not it is similar to the target file, and feeds that judgment back to the search.

Patent Document 1: International Publication No. W04Z0319566

Patent Document 2: Japanese Patent Laid-Open No. 2001-209660

Disclosure of the invention

Problems to be solved by the invention

[0006] However, in such conventional file search methods, when a general full-text search is applied to a large number of search target files or when the target file is ambiguous, it is difficult to set keywords. If simple keywords are set in order to avoid search omissions, the number of search results becomes large and the burden of judgment work on the user increases.

[0007] Furthermore, even when the user judges the suitability of the search results, if there are many search results, the work of judging whether or not each file is similar to the target becomes burdensome.

[0008] For this reason, under the present circumstances, keywords are changed by repeated trial and error and the target file is picked out of a large number of search results over a long time, which is a heavy burden on the user.

[0009] An object of the present invention is to provide a file search program, method, and apparatus that reduce the burden on the user and make it easy to find a target file.

Means for solving the problem

[0010] (Program)

The present invention provides a file search program. The file search program according to the present invention causes a computer having a file storage unit, in which attribute information extracted for each file is registered in association with that file, to execute:

A search condition setting step for setting attribute information as a file search condition;

A search step for searching for files by calculating, for each registered file, a similarity using weights from the attribute information of the search condition and the registered attribute information;

A search result display step for displaying a search result of the files by the search step and generating a display state corresponding to a browsing operation on the search result;

A browsing operation extraction step for extracting a browsing operation history in the search result display step; and

A weight calculation step for calculating, from the browsing operation history, the weights used to calculate the similarity in the search step, and setting them in the search step when a search is performed again.

[0011] Here, the attribute information includes a plurality of attribute items, and the search step calculates the similarity of a file as the sum of the similarity of each attribute item multiplied by the weight of that attribute item.

[0012] In the search result display step, a preview of a file in the search result can be displayed by a browsing operation.

The browsing operation extraction step calculates the frequency of each attribute item of the one or more previewed files and extracts it as the browsing operation history.

The weight calculation step calculates the weight for each attribute item according to the frequency of each attribute item in the browsing operation history.

[0013] Specifically, the browsing operation extraction step counts the common attribute values included in each attribute item of the plurality of previewed files and extracts the maximum frequency of an attribute value for each attribute item, and the weight calculation step calculates the weight of each attribute item as the value obtained by dividing the maximum frequency for that attribute item by the number of previews.

[0014] In the search result display step, in response to an operation selecting any one of a plurality of attribute items, the search result files can be rearranged and displayed in ascending or descending order according to the similarity of the selected attribute item.

The browsing operation extraction step calculates a value according to the rank of each attribute item selected for sorting and extracts it as a browsing operation history.

In the weight calculation step, the weight for each attribute item may be calculated according to the rank of each attribute item in the browsing operation history.

[0015] Specifically, the browsing operation extraction step calculates, from the order in which the attribute items were selected for sorting, a score that becomes higher according to the order of the sorting operations, and stores it as a browsing operation history. The weight calculation step calculates the weight of each attribute item as a normalized value obtained by dividing the score for that attribute item by the number of sorting operations.
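
The score-based weighting described above can be sketched in Python as follows. This is only an illustration under stated assumptions: this passage does not give the actual point values, so the rule that the i-th of n sorting operations earns n - i points is hypothetical, as are the function name `sort_order_weights` and the attribute item names.

```python
from fractions import Fraction


def sort_order_weights(sort_history, items):
    """Turn a sequence of sort selections into normalized per-item weights.

    Assumption (not from the source): the i-th selection (0-based) out of
    n sorting operations earns n - i points.
    """
    n = len(sort_history)
    if n == 0:
        # No sorting was performed: fall back to equal weights.
        return {item: Fraction(1, len(items)) for item in items}
    score = {item: 0 for item in items}
    for position, item in enumerate(sort_history):
        score[item] += n - position
    # Divide each score by the number of sorting operations ...
    raw = {item: Fraction(s, n) for item, s in score.items()}
    # ... and normalize so that the weights sum to 1.
    total = sum(raw.values())
    return {item: r / total for item, r in raw.items()}


ITEMS = ["type", "owner", "datetime", "phrase"]
sort_weights = sort_order_weights(["type", "owner", "type"], ITEMS)
```

With this hypothetical point rule, an attribute item that is never used for sorting receives a weight of zero; a real implementation might prefer a small floor value so that such items still contribute to the similarity.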

[0016] Further, the search result display step can display a preview of a file in the search result by a browsing operation.

In the browsing operation extraction step, the frequency and browsing time of each attribute item of the previewed file or files are calculated and extracted as browsing operation history,

In the weight calculation step, the weight for each attribute item may be calculated according to the frequency and browsing time of each attribute item in the browsing operation history.

[0017] Specifically, the browsing operation extraction step counts the common attribute values included in each attribute item of the plurality of previewed files and extracts the maximum frequency of an attribute value for each attribute item, and the weight calculation step calculates the weight of each attribute item as a normalized value obtained by dividing the maximum frequency for that attribute item by the browsing time.

[0018] The search result display step can display a preview of a file in the search result by a browsing operation, and can rearrange and display the search result files in ascending or descending order according to the similarity of the attribute item selected from among a plurality of attribute items.

The browsing operation extraction step may include at least two of the following:

A first extraction step of calculating the frequency of each attribute item of the one or more previewed files and extracting it as a browsing operation history;

A second extraction step of calculating a value according to the rank of each attribute item selected for sorting and extracting it as a browsing operation history;

A third extraction step of calculating the frequency and browsing time of each attribute item of the one or more previewed files and extracting them as a browsing operation history.

The weight calculation step, corresponding to the browsing operation extraction step, may include at least two of the following:

A first weight calculation step for calculating a weight for each attribute item according to the frequency of each attribute item in the browsing operation history extracted by the first extraction step;

A second weight calculation step for calculating a weight for each attribute item according to the rank of each attribute item in the browsing operation history extracted by the second extraction step;

A third weight calculation step for calculating a weight for each attribute item according to the frequency and browsing time of each attribute item in the browsing operation history extracted by the third extraction step.

[0019] In this case, when at least two of the first to third weight calculation steps are combined, the weight calculation step calculates, for example, the average of the weights calculated for each attribute item and sets it in the search step.
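
As a minimal sketch of the averaging mentioned above, the following Python snippet combines per-item weights produced by two calculation methods. The function name `average_weights` and the numeric weights are hypothetical and chosen only for illustration.

```python
def average_weights(*weight_sets):
    """Average the per-item weights produced by two or more calculation methods."""
    if len(weight_sets) < 2:
        raise ValueError("at least two weight sets are required")
    items = weight_sets[0].keys()
    return {
        item: sum(ws[item] for ws in weight_sets) / len(weight_sets)
        for item in items
    }


# Hypothetical weights from a preview-based and a sort-based calculation.
preview_weights = {"type": 0.5, "owner": 0.25, "datetime": 0.125, "phrase": 0.125}
sort_based_weights = {"type": 0.25, "owner": 0.25, "datetime": 0.25, "phrase": 0.25}
combined = average_weights(preview_weights, sort_based_weights)
```

If each input set sums to 1, the averaged set also sums to 1, so no further normalization is needed in this sketch.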

[0020] In the search condition setting step, attribute values of a plurality of attribute items input using the search work screen are set as search conditions.

[0021] The search condition setting step may also extract the attribute values of the attribute items from a file specified using the search work screen and set them as search conditions.

[0022] The attribute information includes, as attribute items, type, owner, date / time of creation or update, and word / phrase.

[0023] (Method)

The present invention provides a file management method. The file management method of the present invention searches for a target file in a file storage unit in which attribute information extracted for each file is registered in association with that file, and comprises:

A search condition setting step for setting attribute information as a file search condition; and

A search step for searching for files by calculating, for each file in the file storage unit, a similarity using weights from the attribute information of the search condition and the registered attribute information.

[0024] (Device)

The present invention provides a file management apparatus. The file management apparatus of the present invention comprises:

A file storage unit in which attribute information extracted for each file is registered in association with that file;

A search condition setting unit that sets attribute information as a file search condition;

A search unit that searches for files by calculating, for each registered file, a similarity using weights from the attribute information of the search condition and the registered attribute information;

A search result display unit that displays a search result of the files by the search unit and generates a display state according to a browsing operation on the search result;

A browsing operation extraction unit that extracts a browsing operation history from the search result display unit; and

A weight calculation unit that calculates, from the browsing operation history, the weights used for the similarity calculation of the search unit and sets them in the search unit when a search is performed again.

Effect of the invention

[0025] According to the present invention, file attribute information (including attribute items such as update date, owner, and file type) is used to calculate the similarity of the files being searched. In addition, from the history of the user's browsing operations on the search result display screen, the attribute information the user is interested in is extracted as the search viewpoint and fed back to the next search, so that a search reflecting the user's intention can be executed.

[0026] To extract the user's search viewpoint, the following functions provided on the search result screen can be used:

(1) Preview screen

(2) Sorting function

[0027] The preview screen is a function for displaying a summary image of any search result selected from the search results displayed in a list. A file whose preview the user views can be regarded as a file the user is paying attention to. When multiple files are previewed, the items common to those files, such as words and phrases, file type, owner, and update date and time, are detected and reflected in the next search, enabling a search that takes the user's search viewpoint into account.

[0028] Furthermore, in combination with the sorting function, if previews are executed intensively on files with an attribute item whose display priority has been raised, for example a specific file type, the user's search viewpoint can be extracted more reliably.

[0029] In the present invention, it is assumed that the search is performed recursively, and the search viewpoint can be extracted in the same manner between the first and second searches.

[0030] As a result, the characteristics of the file the user is looking for are extracted automatically from the user's operations of browsing the search results, so the burden on the user is reduced and a highly accurate search reflecting the user's search viewpoint becomes possible.

Brief Description of Drawings

[0031] [FIG. 1] Block diagram of a hardware environment in which the file search program of the present invention is executed.

[FIG. 2] Block diagram of another hardware configuration in which the file search program of the present invention is executed.

[FIG. 3] Block diagram of another hardware configuration in which the file search program of the present invention is executed.

[FIG. 4] Functional block diagram of the file search device according to the present invention.

[FIG. 5] Block diagram showing a detailed functional configuration of the file registration processing unit in FIG. 4.

[FIG. 6] Explanatory diagram of the attribute information storage unit of FIG. 4.

[FIG. 7] Block diagram showing a detailed functional configuration of the file search processing unit in FIG. 4.

[FIG. 8] Explanatory diagram of the search screen used in this embodiment.

[FIG. 9] Explanatory diagram of specific processing contents of the file search processing unit in FIG.

[FIG. 10] Explanatory diagram of the attribute information file corresponding to the search results in FIG. 9.

[FIG. 11] Explanatory diagram of the weight table including the operation history generated based on previews of the search results in FIG. 10.

[FIG. 12] Flowchart of search processing that calculates the weight to be used in the next search using previews as the browsing operation history.

[FIG. 13] Explanatory diagram of the attribute information file corresponding to search results sorted by the similarity of the attribute item "type".

[FIG. 14] Explanatory diagram of the weight table including the operation history generated based on the sorting of the search results in FIG.

[FIG. 15] Explanatory diagram of the ordering score table that defines the points to be set for the attribute items based on the order of sorting in FIG. 14.

[FIG. 16] Flowchart of search processing that calculates the weight to be used in the next search using sorting as the browsing operation history.

[FIG. 17] Explanatory diagram of the weight table including the operation history generated based on the browsing time of the search result previews in FIG.

[FIG. 18] Flowchart of search processing that calculates the weight to be used in the next search using the preview time as the browsing operation history.

[FIG. 19] Flowchart of search processing that calculates the weight to be used in the next search using the preview count, sort count, and preview time as the browsing operation history.

Best mode for carrying out the invention

[0032] FIG. 1 is a block diagram of a hardware configuration in which the file search program of the present invention is executed. In the present embodiment shown in FIG. 1, the file search device 10 includes a CPU 12, a memory 14, an input/output unit 16, and a storage device 18. The file search program 20 and the file 22 are stored in the storage device 18.

[0033] The file search program 20 is executed by the CPU 12 and searches for a target file from the file 22. The input/output unit 16 includes devices such as a keyboard, a mouse, and a display, and the storage device 18 is, for example, a hard disk drive.

FIG. 2 is a block diagram of another hardware configuration in which the file search program of the present invention is executed. In the embodiment of FIG. 2, the file search device 10 is connected to the file management device 24 via the network 11. The file search device 10 includes a CPU 12, a memory 14, an input / output unit 16, and a storage device 18, and a file search program 20 is stored in the storage device 18.

On the other hand, the file management device 24 includes a CPU 26, a memory 28, an input / output unit 30, and a storage device 32, and the file 22 is stored in the storage device 32. Therefore, the file search device 10 executes the file search program 20 in the storage device 18 by the CPU 12 and searches for the target file from the file 22 in the storage device 32 of the file management device 24.

[0036] FIG. 3 is a block diagram of another hardware configuration in which the file search program of the present invention is executed. As in the embodiment of FIG. 2, the file search device 10 is connected to the file management device 24 via the network 11. However, both the file search program 20 and the file 22 are stored in the storage device 32 of the file management device 24.

[0037] Therefore, the file search device 10 reads the file search program 20 from the storage device 32 of the file management device 24, executes it by the CPU 12, and searches for the target file from the file 22 stored in the storage device 32 of the file management device 24.

FIG. 4 is a block diagram of a functional configuration of the file search apparatus 10 according to the present invention. In FIG. 4, the file search apparatus 10 includes a file registration processing unit 34, a file search processing unit 36, a file storage unit 38, and an attribute information storage unit 40.

[0039] FIG. 5 shows a functional configuration of the file registration processing unit 34 of FIG. 4. The file registration processing unit 34 is provided with a file processing detection unit 42, a file registration unit 44, and an attribute information extraction unit 46.

[0040] The file processing detection unit 42 detects a file storage process from, for example, a file write event in the file system, and notifies the file registration unit 44 and the attribute information extraction unit 46 of the detection. The file registration unit 44 stores the file itself in the file storage unit 38, which is arranged in a storage device such as a hard disk, using the file writing function of the file system.

[0041] The attribute information extraction unit 46 extracts attribute information attached to the file, for example, attribute information such as a file name, a file type, an owner, a storage location, a creation or update date, and a document phrase.

[0042] Of the attribute items extracted by the attribute information extraction unit 46, those other than in-file information can be acquired from a general file system. Attribute items inside the file can be read from the file itself. The attribute items listed here are only examples; any information that can be acquired from a file is included in the attribute information used in the search processing of the present invention.

[0043] The attribute information extracted by the attribute information extraction unit 46 is stored in the attribute information storage unit 40 in association with the file.
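
As a rough illustration of the registration flow in paragraphs [0041] to [0043], the following Python sketch reads file-system attributes with the standard library and stores them keyed by a file ID. The record layout, the use of the absolute path as the file ID, and the function names `extract_attributes` and `register` are assumptions made for this example; extraction of words and phrases from the file body is omitted.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def extract_attributes(path):
    """Collect file-system attributes for one file (hypothetical record layout)."""
    p = Path(path)
    st = p.stat()
    return {
        "name": p.name,
        "type": p.suffix.lstrip(".").lower(),  # file type taken from the extension
        "owner": st.st_uid,                    # numeric owner ID (0 on Windows)
        "location": str(p.parent.resolve()),
        "updated": datetime.fromtimestamp(st.st_mtime),
    }


def register(path, attribute_store):
    """Store the extracted attributes keyed by a file ID (here: the absolute path)."""
    file_id = str(Path(path).resolve())
    attribute_store[file_id] = extract_attributes(path)
    return file_id


# Usage: register a temporary sample file and look up its record.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "report.TXT"
    sample.write_text("apple orange")
    store = {}
    fid = register(sample, store)
    record = store[fid]
```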

[0044] FIG. 6 is an explanatory diagram of the attribute information storage unit 40 of FIG. 4. In this example, four attribute items "attribute a", "attribute b", "attribute c", and "attribute d" are defined in correspondence with the "file ID" that associates them with a file. For file IDs #1 to #4, the values "a1, b1, c1, ..., d4" are stored under "attribute a" to "attribute d"; each represents the attribute value of the corresponding attribute a to d.

[0045] FIG. 7 is a block diagram showing a detailed functional configuration of the file search processing unit 36 of FIG. 4. In FIG. 7, the file search processing unit 36 includes a search condition setting unit 48, a search unit 50, a search result output unit 54, a browsing operation extraction unit 56, an operation history information storage unit 58, a weight calculation unit 60, and a weight storage unit 62.

[0046] The search condition setting unit 48 sets attribute information as a search condition for a target file. The attribute information set as the search condition consists of attribute values corresponding to the attribute items of the attribute information storage unit 40 in FIG. 6, which is created at the time of file registration by the file registration processing unit 34 in FIG. 4.

[0047] The search condition setting unit 48 lets the user choose between two ways of setting attribute information as a search condition: a method in which the user directly inputs the values of attribute items on the search operation screen, and a method in which the user specifies a file close to the target file and the attribute information is automatically extracted from the specified file and set as the search condition.

[0048] For each file stored in the file storage unit 38, the search unit 50 calculates the similarity using weights from the attribute information registered for that file in the attribute information storage unit 40 and the attribute information of the search condition, and searches for files with high similarity.

[0049] The similarity calculation unit 52 provided in the search unit 50 calculates the similarity as the sum, over the attribute items set as the search condition, of the similarity of each attribute item multiplied by the weight of that attribute item. The search result output unit 54 displays the search result of the files by the search unit 50 and sets the display state according to browsing operations such as preview or rearrangement of the search result.

[0050] The browsing operation extraction unit 56 extracts the history of browsing operations such as preview and rearrangement in the search result output unit 54 and stores it in the operation history information storage unit 58. The weight calculation unit 60 calculates, from the browsing operation history stored in the operation history information storage unit 58, the weights used by the similarity calculation unit 52 of the search unit 50, and sets them for the similarity calculation of the search unit 50 when a search is performed again.

FIG. 8 is an explanatory diagram of a search screen used in this embodiment. In FIG. 8, on the search screen 64, a search condition setting operation unit 66, a rearrangement operation unit 68, and a search result display unit 72 are arranged from the upper side.

[0052] The search condition setting operation section 66 is provided with a type input frame 74, an owner input frame 76, a date and time input frame 78, and a phrase input frame 80 for setting the attribute items used as search conditions for the target file. It is also provided with a path input frame 82 for specifying a search condition file from which attribute items are automatically extracted and set as search conditions, and a reference button 84 for referring to the path.

[0053] Further, on the right side of the search condition setting operation section 66, there are provided a search button 86 for starting the search once the search conditions have been entered by item input or path designation, and an initialization button 88 for clearing already entered items or the path back to the initial state at the time of a re-search.

In this example, the search result display section 72 displays five pieces of search file information 98-1 to 98-5 arranged in descending order of the similarity calculated by the similarity calculation at the time of search.

[0055] Taking search file information 98-1 as an example of the search file information 98-1 to 98-5, "File #1" is displayed as the file ID, and "abc", "user U1", "2005/11/01 17:25", and "Summary of file #1" are displayed as the attribute values corresponding to the attribute items type, owner, date/time, and phrase.

[0056] Further, preview buttons 100-1 to 100-5 are provided on the right side of the search file information 98-1 to 98-5, and by operating any of the preview buttons 100-1 to 100-5, a preview of the corresponding file can be displayed in front of the search screen 64.

[0057] The operator therefore checks the search file information 98-1 to 98-5 in the search result display section 72 and, to confirm the contents of a file, selects the corresponding preview button 100-1 to 100-5. By browsing previews as needed, the operator can determine whether the target file has been found.

[0058] Further, a sort operation section 68 is provided above the search result display section 72. To sort according to the priority of an attribute item, a type sort button 90, an owner sort button 92, a date/time sort button 94, and a phrase sort button 96 are arranged.

[0059] When any sort button in the sort operation section 68 is selected, the search file information 98-1 to 98-5 is sorted according to the similarity of the selected attribute item. In this state, the user can view previews, paying attention to the files with higher similarity, and judge whether the target file has been found.
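
The per-item sorting described above can be sketched as follows. The result records, the `item_similarity` field, and the function name `sort_by_item` are hypothetical and serve only to illustrate reordering by the similarity of one selected attribute item.

```python
def sort_by_item(results, item, descending=True):
    """Reorder search results by the similarity of one selected attribute item."""
    return sorted(
        results,
        key=lambda r: r["item_similarity"][item],
        reverse=descending,
    )


# Hypothetical search results with per-item similarities.
results = [
    {"file": "#1", "item_similarity": {"type": 1, "owner": 0}},
    {"file": "#2", "item_similarity": {"type": 0, "owner": 1}},
    {"file": "#3", "item_similarity": {"type": 1, "owner": 1}},
]
by_owner = [r["file"] for r in sort_by_item(results, "owner")]
by_type = [r["file"] for r in sort_by_item(results, "type")]
```

Because Python's sort is stable, files with the same per-item similarity keep their previous relative order, which here would be the overall similarity order of the initial result list.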

[0060] Although five search results are displayed in the search result display section 72 of FIG. 8, further search results can be displayed in the search result display section 72 by a scroll operation or a screen switching operation.

[0061] FIG. 9 is an explanatory diagram of specific processing contents of the file search processing unit 36 of FIG. In this example, the search condition setting unit 48 specifies the attribute value "a2" or "a3" for attribute a, the attribute value "b2" or "b3" for attribute b, the attribute value "c2" for attribute c, and the attribute value "d1" for attribute d.

[0062] Based on the attribute values specified for each attribute item by the search condition setting unit 48, the search unit 50 compares the specified attribute values a2, a3, b2, b3, c2, and d1 with the attribute values of each file in the attribute information storage unit 40 and calculates the similarities Sa, Sb, Sc, and Sd for the attribute items a to d.

[0063] In this example, the similarity of each attribute item is 1 point when the attribute value of the specified condition matches the attribute value in the attribute information storage unit 40. For example, file #1 has one matching attribute value and scores 1 point, file #2 scores 3 points because the attribute values a2, b2, and c2 match, file #3 scores 2 points because the attribute values a3 and b3 match, and file #4 scores 0 points because no attribute value matches.

[0064] On the other hand, a weight storage unit 62 is provided for the search unit 50, and weights Wa, Wb, Wc, and Wd are set for the attributes a to d. The search unit 50 calculates the weighted similarity S from the similarities Sa, Sb, Sc, and Sd of the attribute items a to d and the weights Wa, Wb, Wc, and Wd:

Similarity S = Wa × Sa + Wb × Sb + Wc × Sc + Wd × Sd   (1)

[0065] Here, assuming that the initial value of each of the weights Wa to Wd in the weight storage unit 62 is "1", the similarity of file #1 is 1 point, the similarity of file #2 is 3 points, the similarity of file #3 is 2 points, and the similarity of file #4 is 0 points. The search result output unit 54 outputs and displays the search results arranged in descending order of similarity.

[0066] It should be noted that the actual similarity S is calculated using weights obtained by normalizing the appearance probabilities of the attribute items so that the sum of the weights of all the attribute items is "1". In the case of FIG. 9, the normalized initial values of the four weights Wa, Wb, Wc, and Wd are “0.25”.
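
The weighted-sum similarity of equation (1) and the FIG. 9 example can be sketched in Python as follows. The contents of the attribute table (values a1 to d4, one row per file) are an assumption based on the description of FIG. 6, and the function names `item_similarity`, `similarity`, and `search` are hypothetical.

```python
# Illustrative sketch only: the table contents and all names are assumptions.
ATTRIBUTE_TABLE = {
    "#1": {"a": "a1", "b": "b1", "c": "c1", "d": "d1"},
    "#2": {"a": "a2", "b": "b2", "c": "c2", "d": "d2"},
    "#3": {"a": "a3", "b": "b3", "c": "c3", "d": "d3"},
    "#4": {"a": "a4", "b": "b4", "c": "c4", "d": "d4"},
}

# Search condition of the FIG. 9 example: acceptable values per attribute item.
CONDITION = {"a": {"a2", "a3"}, "b": {"b2", "b3"}, "c": {"c2"}, "d": {"d1"}}


def item_similarity(value, accepted):
    """One point when the registered value matches a specified value."""
    return 1 if value in accepted else 0


def similarity(attrs, condition, weights):
    """Equation (1): weighted sum of the per-item similarities."""
    return sum(
        weights[item] * item_similarity(attrs[item], accepted)
        for item, accepted in condition.items()
    )


def search(table, condition, weights):
    """Return (file ID, similarity) pairs in descending order of similarity."""
    scored = [
        (file_id, similarity(attrs, condition, weights))
        for file_id, attrs in table.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


initial_weights = {item: 1 for item in CONDITION}        # initial value "1"
normalized_weights = {item: 0.25 for item in CONDITION}  # normalized to sum to 1
ranking = search(ATTRIBUTE_TABLE, CONDITION, initial_weights)
ranking_normalized = search(ATTRIBUTE_TABLE, CONDITION, normalized_weights)
```

With equal weights the normalization only rescales the scores, so both rankings list the files in the same order; the order changes only once the weights for individual attribute items differ.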

[0067] Therefore, in the search processing of the present invention, the search viewpoint, that is, the feature of the file the user is looking for, is extracted from the history of the user's browsing operations such as previewing and rearranging the search results, and this search viewpoint is reflected by weight calculation in the weights used for the similarity calculation of the next search. This enables a highly accurate search that reflects the user's search intention without imposing a burden on the user.

[0068] That is, on the search result display section 72 of the search screen 64 in FIG. 8, the user operates the preview buttons 100-1 to 100-5 of the search file information 98-1 to 98-5 to view previews, or selects one of the attribute items type, owner, date/time, or phrase in the sort operation section 68 to sort the results according to the similarity of the selected attribute item, and through such browsing operations judges whether the target file has been reached.

[0069] From these browsing operations on the search results, the attribute items common to the files the user deliberately previewed and the order of the attribute items selected for sorting are extracted. By adjusting the weight calculation so as to increase the weights in equation (1) for the attribute items the user pays attention to, the next search can reflect the user's search viewpoint.

[0070] In the present invention, any of the following methods can be used to automatically extract the user's search viewpoint and adjust the weights:

(1) A method of calculating the weights by extracting the user's search viewpoint from preview browsing operations;

(2) A method of calculating the weights by extracting the user's search viewpoint from the user's rearrangement operations;

(3) A method of calculating the weights by extracting the user's search viewpoint from the browsing time of preview browsing operations;

(4) A method of calculating the weights by combining at least two of (1), (2), and (3).

First, the method of calculating the weights by extracting the user's search viewpoint from preview browsing operations will be described.

FIG. 10 shows the search result list table 101 of the search result display section 72 on the search screen 64 of FIG. 8. It lists type 104, owner 106, date/time 108, and phrase 110 as the attribute items corresponding to the file ID 102. Assume that previews are executed for file #1, file #3, and file #5, as indicated by arrows 115-1, 115-2, and 115-3 on the left side of the search result list table 101.

[0072] For the previewed files #1, #3, and #5, the common attribute values are counted for each of the attribute items type 104, owner 106, date/time 108, and phrase 110, and the maximum frequency of an attribute value is obtained for each item. The value obtained by dividing the maximum frequency by the number of previews represents the probability that the attribute value appeared. The larger this probability, the more attention the user is paying to that attribute value.

[0073] If the value obtained by dividing the maximum frequency by the number of previews is set as the weight of the equation (1), a search reflecting the search viewpoint, which is the attribute focused on by the user in the next search, can be realized.

FIG. 11 is an explanatory diagram of the weight table 112, which holds the browsing operation history generated from the previews executed for files #1, #3, and #5 in the search result list table 101 of FIG. 10.

The weight table 112 consists of the search count 114, the preview count 116, and the weight 118 for each of the attribute items type, owner, date, and phrase. Here, the weight 118 is divided into two rows, an upper row and a lower row. The upper row is the appearance probability of each attribute item, and the lower row is the weight obtained by normalizing the appearance probability of each attribute item.

The search result list table 101 in FIG. 10 is used for weight calculation when the search count 114 in the weight table 112 in FIG. 11 is “first” and the preview count 116 is “3”.

[0077] For this weight calculation, the appearance probability of each attribute item is calculated as

$$\text{Appearance probability} = \frac{\text{maximum frequency of the common attribute values}}{\text{number of previews}} \tag{2}$$

For example, for type 104 among the attribute items of FIG. 10, the maximum frequency of the common attribute values is the frequency “2” of the attribute value “abc”. Applying equation (2), the appearance probability Pa of type 104 based on the browsing operation history in this case is obtained as

Pa = 2/3

[0079] Next, for owner 106, the maximum frequency of the common attribute values is the frequency “2” of “user U1”, so the appearance probability Pb of the owner is

Pb = 2/3

For date and time 108, all attribute values are different, so the maximum frequency of the common attribute values is “1”, and the appearance probability Pc is

Pc = 1/3

[0080] Furthermore, for the attribute item word 110, three words are registered in each of the previewed files #1, #3, and #5, giving nine words in total. The maximum frequency of a common attribute value is the frequency “2” of the word “apple” among the words of files #1, #3, and #5, so the appearance probability Pd is

Pd = 2/9

[0081] The appearance probabilities 2/3, 2/3, 1/3, and 2/9 calculated in this way for each attribute item from the preview operation history are normalized so that their sum becomes “1”, which gives the weights Wa, Wb, Wc, and Wd of the attribute items.

[0082] Normalization is performed by finding the sum 17/9 of the appearance probabilities 2/3, 2/3, 1/3, and 2/9 of the attribute items and multiplying each appearance probability by the reciprocal of the sum (9/17). The result is as follows.

Type weight Wa = 6/17

Owner weight Wb = 6/17

Date/time weight Wc = 3/17

Word weight Wd = 2/17

Expressed as decimals, these are 0.35, 0.35, 0.18, and 0.12, as shown in FIG. 11, and their total is “1”.
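The first-search calculation above can be sketched in Python as follows. This is only an illustrative sketch: the contents of FIG. 10 are not reproduced in the text, so the attribute values in `previewed` are hypothetical and chosen only so that the counts match the description (type “abc” twice, owner “U1” twice, all dates different, and “apple” twice among nine words). The function names are likewise assumptions.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical attribute values for the previewed files #1, #3, #5.
# Only the resulting counts are taken from the text.
previewed = [
    {"type": "abc", "owner": "U1", "date": "d1", "words": ["apple", "pear", "fig"]},
    {"type": "abc", "owner": "U2", "date": "d3", "words": ["apple", "plum", "kiwi"]},
    {"type": "xyz", "owner": "U1", "date": "d5", "words": ["lime", "grape", "melon"]},
]


def appearance_probabilities(files):
    """Equation (2): maximum frequency of a common value / number of previews."""
    n = len(files)
    probs = {}
    for item in ("type", "owner", "date"):
        max_freq = max(Counter(f[item] for f in files).values())
        probs[item] = Fraction(max_freq, n)
    # For words, the text divides by the total number of registered words (9).
    all_words = [w for f in files for w in f["words"]]
    probs["phrase"] = Fraction(max(Counter(all_words).values()), len(all_words))
    return probs


def normalize(probs):
    """Scale the appearance probabilities so that the weights sum to 1."""
    total = sum(probs.values())
    return {item: p / total for item, p in probs.items()}


probs = appearance_probabilities(previewed)
weights = normalize(probs)
print({item: round(float(w), 2) for item, w in weights.items()})
# {'type': 0.35, 'owner': 0.35, 'date': 0.18, 'phrase': 0.12}
```

Using `Fraction` keeps the intermediate values exact, so the weights 6/17, 6/17, 3/17, and 2/17 can be compared directly with the values in the text before rounding.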

[0083] The weight table 112 in FIG. 11 stores the case where the search is repeated from the first to the fifth time as the search count 114, and shows that the preview was executed 3, 5, 2, 4, and 6 times in the first to fifth searches, respectively. For the second and subsequent searches as well, the weight for word 110 is calculated on the assumption that three words are registered in each of files #1 to #5.

[0084] For the second and subsequent searches, the appearance probability of each attribute item is obtained from the following formula, in which the maximum frequency of the current search is added to the maximum frequencies of the previous searches and the result is divided by the total number of previews up to and including the current search. Here, n is the current number of searches.

[0085] [Equation 1]

$$\text{Appearance probability} = \frac{\sum_{i=1}^{n} (\text{maximum frequency of the common attribute values in search } i)}{\sum_{i=1}^{n} (\text{number of previews in search } i)} \tag{3}$$

[0086] For example, consider the appearance probability of type for the “second” search of the search count 114. The maximum frequency of the common attribute values in the 5 previews of the second search is “4”. Adding the maximum frequency “2” of the first search gives “2 + 4 = 6”. The total number of previews is “3 + 5 = 8”, the sum of the first and second preview counts. Therefore, the appearance probability Pa of type for the second search is

Pa = 6/8
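The cumulative calculation of equation (3) can be illustrated with a short Python sketch. The function name is an assumption; the inputs are the per-search maximum frequencies and preview counts given in the text for the attribute item type.

```python
from fractions import Fraction


def cumulative_appearance_probability(max_freqs, preview_counts):
    """Equation (3): summed maximum frequencies / summed preview counts.

    Both lists hold one entry per search, from the first search up to the
    current (n-th) search.
    """
    if len(max_freqs) != len(preview_counts):
        raise ValueError("one maximum frequency and one preview count per search")
    return Fraction(sum(max_freqs), sum(preview_counts))


# Type attribute after the second search: (2 + 4) / (3 + 5)
pa_second = cumulative_appearance_probability([2, 4], [3, 5])
print(pa_second)  # 3/4, i.e. 6/8 in lowest terms
```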

[0087] Looking at the weights finally obtained for each attribute item after the search and the previews have been repeated from the first to the fifth time, the highest normalized weight is “0.47” for type. The user's previews thus show that the user pays close attention to the attribute item “type”, and the weight calculation reflects this in the search results.

FIG. 12 is a flowchart of search processing according to the present invention that employs the method of extracting the search viewpoint of the user and calculating the weight by the preview browsing operation shown in FIGS. 10 and 11.

In FIG. 12, an attribute item is set as a search condition in step S1, and then in step S2 the similarity of equation (1) is calculated for all files. For the first search, the initial weight is the same for every attribute item.

Subsequently, in step S3, the search result is displayed according to the similarity as shown in the search screen of FIG. Subsequently, in step S4, whether or not the preview is executed is checked. If the preview is performed, the process proceeds to step S5 and the common attribute values of the preview file are counted.

[0091] If no preview is executed in step S4, it is checked in step S6, based on the user's input instruction, whether the target file has been found. If it has not, the process proceeds to step S7 to determine whether a recursive search is to be executed, for example from the presence or absence of an operation of the initialization button 88 provided on the search condition setting operation unit 66 in the search screen 64 of FIG. 8.

[0092] If the initialization button 88 is operated, it is determined that a recursive search is to be executed, and the process proceeds to step S8, where the weight of each attribute item is calculated as shown in the weight table 112 of FIG. 11. The process then returns to step S1 and repeats the search, and the weights calculated in step S8 from the preview operation history are used to calculate the similarity in step S2.
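The effect of feeding updated weights back into step S2 can be sketched as follows. The similarity is taken as the sum over attribute items of weight times attribute similarity, as stated in claim 2. Everything else is an assumption for illustration only: the exact-match `attribute_similarity`, the two sample files, the query, and the `updated` weights are hypothetical and do not come from the figures.

```python
ATTRS = ("type", "owner", "date", "phrase")


def attribute_similarity(query_value, file_value):
    # Hypothetical per-attribute similarity: 1 for an exact match, else 0.
    return 1.0 if query_value == file_value else 0.0


def similarity(query, file_attrs, weights):
    """Weighted sum of the per-attribute similarities."""
    return sum(weights[a] * attribute_similarity(query[a], file_attrs[a]) for a in ATTRS)


def search(query, files, weights):
    """Return file IDs ordered from highest to lowest similarity."""
    scored = [(similarity(query, attrs, weights), file_id) for file_id, attrs in files.items()]
    return [file_id for _, file_id in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical registered attribute information and search condition.
files = {
    "#1": {"type": "abc", "owner": "U2", "date": "d1", "phrase": "pear"},
    "#2": {"type": "xyz", "owner": "U1", "date": "d2", "phrase": "apple"},
}
query = {"type": "abc", "owner": "U1", "date": "d0", "phrase": "apple"}

initial = {a: 0.25 for a in ATTRS}  # first search: equal weights
updated = {"type": 0.6, "owner": 0.2, "date": 0.1, "phrase": 0.1}  # hypothetical step S8 result

first_ranking = search(query, files, initial)
second_ranking = search(query, files, updated)
print(first_ranking, second_ranking)
```

With equal weights, file #2 ranks first because it matches two attribute items. Once the weight of type dominates, file #1 moves to the top, which is the kind of reordering the recursive search is intended to produce.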

Next, a method for calculating a weight by extracting a user's search viewpoint by a rearrangement operation will be described.

[0094] Here, the sort operation is an operation in which one of the type sort button 90, the owner sort button 92, the date sort button 94, and the phrase sort button 96 in the switching operation unit 68 provided on the search screen 64 in FIG. 8 is selected, and the search file information 98-1 to 98-5 in the search result display area is rearranged in ascending or descending order according to the similarity of the selected attribute item. This makes the search results easier to see.

[0095] For example, the search result display unit 72 in the search screen 64 of FIG. 8 shows the search result list table 101 of FIG. 10 before the rearrangement. When the attribute item “type” is selected and the results are rearranged, the search result list table 101-1 of FIG. 13 is displayed.

The sorting operation can also be executed by selecting a plurality of attribute items. For example, when sorting is performed in the order of “type” and “owner”, the results are first sorted according to the similarity of “owner”, and the results with the same owner are then sorted by “type”.

FIG. 14 shows a weight table 120 in which the user's search viewpoint is extracted from the rearrangement operation that produced the search result list table 101-1 of FIG. 13 and the weight is calculated. It consists of the search count 122, the rearrangement count 124, and the weight 126 for each of type, owner, date, and phrase.

The browsing operation history of rearrangement in the weight table 120 of FIG. 14 shows that, in the “first” search of the search count 122, rearrangement was executed “4 times” as the rearrangement count 124. These four sorts were performed by selecting attribute items in the following order.

First time: Word

Second time: Type

Third time: Owner

Fourth time: Date and time

Here, among the sorts performed on one search result, a sort that comes later in the order of execution is given a higher score.

FIG. 15 is an explanatory diagram of the sorting order score file 128, which sets scores according to the rank within the sorts performed on one search result. For the first search in FIG. 14, the number of sorts is four, so scores are assigned to the first to fourth sorts according to their order on the time axis: 4 points for the most recent, fourth sort, 3 points for the third sort before it, 2 points for the second sort, and 1 point for the oldest, first sort. The scores according to this sort order are shown as numbers in parentheses on the right side of the upper row of the weight in FIG. 14.

[0100] Therefore, for the four sorts on the first search result, the scores of the attribute items “Type”, “Owner”, “Date/Time”, and “Phrase” relative to the total score of 10 are “2/10”, “3/10”, “4/10”, and “1/10”. Normalizing so that the sum of the weights becomes 1 gives the weights “0.20”, “0.30”, “0.40”, and “0.10” for the respective attribute items.
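The score assignment for one search result can be sketched in Python. FIG. 15 itself is not reproduced in the text, so the scoring rule below is an assumption inferred from the two cases that are described: four sorts score 1, 2, 3, 4, and two sorts score 3, 4, so the most recent sort always receives 4 points and each earlier sort one point less. The function names are hypothetical.

```python
from fractions import Fraction

ATTRS = ("type", "owner", "date", "phrase")
MAX_SCORE = 4  # assumption: the most recent sort always scores 4 points


def sort_scores(sort_sequence):
    """Score each sorted attribute item by its position in the sort order."""
    k = len(sort_sequence)
    if k > MAX_SCORE:
        raise ValueError("the text does not describe more than 4 sorts per search")
    scores = {a: 0 for a in ATTRS}
    for position, attr in enumerate(sort_sequence, start=1):
        scores[attr] += MAX_SCORE - (k - position)
    return scores


def weights_from_scores(scores):
    """Divide each score by the total score so that the weights sum to 1."""
    total = sum(scores.values())
    return {a: Fraction(s, total) for a, s in scores.items()}


# First search: sorted by word, type, owner, date/time in that order.
first_scores = sort_scores(["phrase", "type", "owner", "date"])
weights = weights_from_scores(first_scores)
print({a: float(w) for a, w in weights.items()})
# {'type': 0.2, 'owner': 0.3, 'date': 0.4, 'phrase': 0.1}
```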

[0101] In the case of FIG. 14, the number of rearrangements in the first to fifth searches is 4, 2, 1, 2, and 3, respectively. For the second and subsequent searches, the weight of each attribute item is obtained by the following equation.

[0102] [Equation 2]

$$\text{Weight} = \frac{\sum_{i=1}^{n} (\text{score for the rank in search } i)}{\sum_{i=1}^{n} (\text{total score of search } i)} \tag{4}$$

[0103] For the second search, the number of sorts is 2, performed in the following order.

First time: Type

Second time: Owner

Therefore, referring to the sorting order score file 128 in FIG. 15 for this number of sorts, 3 points are set for the first sort and 4 points for the second sort.

[0104] The value obtained by dividing the score obtained from the rearrangement by the total score is accumulated each time the search is repeated, and the resulting normalized weight is used as the weight in equation (1). In this way, a search according to the extracted search viewpoint can be performed.

In the case of FIG. 14, the weight of the attribute item “owner” calculated from the rearrangements up to the fifth search result is “0.49”, which shows that the user is ultimately most interested in the attribute item “owner”.
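The accumulation over repeated searches can be sketched as below. This sketch assumes that equation (4) divides the scores summed over all searches so far by the total scores summed over the same searches; the equation is garbled in the text, so this reading is an assumption. Only the first and second searches are used, because the text gives the sort order only for those two; the per-search scores are the ones stated in the description.

```python
from fractions import Fraction

ATTRS = ("type", "owner", "date", "phrase")


def cumulative_sort_weights(per_search_scores):
    """Weight = summed rank scores / summed total scores, per attribute item."""
    summed = {a: sum(scores.get(a, 0) for scores in per_search_scores) for a in ATTRS}
    grand_total = sum(summed.values())
    return {a: Fraction(s, grand_total) for a, s in summed.items()}


# Scores stated in the text for the first two searches.
first = {"phrase": 1, "type": 2, "owner": 3, "date": 4}  # four sorts, total 10
second = {"type": 3, "owner": 4}                         # two sorts, total 7

weights = cumulative_sort_weights([first, second])
print(weights)
```

Under this reading, “owner” already has the largest weight (7/17) after the second search, which is consistent in direction with the final value reported for the fifth search.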

FIG. 16 is a flowchart of a search process using a process of extracting a user's search viewpoint and calculating a weight by a sorting operation on the search results shown in FIGS.

In FIG. 16, attribute items are set as search conditions in step S1, the similarity is calculated for all files in accordance with the above equation (1) in step S2, and the search results are displayed according to the similarity in step S3.

[0108] In step S4, it is determined whether a rearrangement has been executed on the search result. If it has, in step S5 the attribute item selected for the rearrangement, specifically the score corresponding to the order of rearrangement, is counted.

[0109] If no rearrangement has been performed, it is determined in step S6, from the user's operation, whether the target file has been found. When execution of a recursive search is determined from the user's operation, the process proceeds to step S8, where the weights are calculated from the rank scores obtained from the rearrangements, as shown in the weight table 120 of FIG. 14. The weight calculated in step S8 is reflected in the calculation of the similarity in step S2 of the next search.

Next, a method for calculating the weight by extracting the user's search viewpoint from the preview time associated with the preview browsing operation for the search result will be described.

[0111] The search viewpoint extraction using the preview operation shown in FIGS. 10 to 12 calculates the weight from the number of previews. As another method, the user's search viewpoint can also be extracted and the weight calculated by using the preview time. That is, the longer the user previews a file, the closer that file can be judged to be to the search result the user seeks.

[0112] The weight calculation based on this preview time is as shown in the weight table 130 of FIG. 17. The weight table 130 of FIG. 17 is composed of the search count 132, the preview time 134, and the weight 136 for each attribute item of type, owner, date, and phrase.

[0113] The weight table 130 of the method for extracting the user's search viewpoint using the preview time corresponds to the table based on the preview count shown in FIG. 11, with the preview count replaced by the preview time.

[0114] For example, in the first search, “30 seconds” is obtained as the preview time, and the maximum frequency of the common attribute values in the previewed files is divided by this preview time. In other words, the appearance frequency per unit preview time of each attribute item is given by the following equation.

[0115] [Equation 3]

$$\text{Appearance probability} = \frac{\sum_{i=1}^{n} (\text{maximum frequency of the common attribute values in search } i)}{\sum_{i=1}^{n} (\text{preview time of search } i)} \tag{5}$$

[0116] The value obtained by normalizing the appearance frequency per unit preview time of each attribute item is used as the weight.
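A minimal Python sketch of the preview-time calculation follows. The contents of FIG. 17 are not given in the text apart from the 30-second preview time of the first search, so the maximum frequencies used here (2, 2, 1, 2) are an assumption borrowed from the FIG. 10 example, and the function name is hypothetical.

```python
from fractions import Fraction


def time_based_weights(max_freqs_per_search, preview_times):
    """Equation (5) followed by normalization.

    max_freqs_per_search: one dict per search, mapping each attribute item
    to the maximum frequency of its common attribute values.
    preview_times: preview time in seconds for each search.
    """
    total_time = sum(preview_times)
    attrs = max_freqs_per_search[0].keys()
    rates = {
        a: Fraction(sum(freqs[a] for freqs in max_freqs_per_search), total_time)
        for a in attrs
    }
    total = sum(rates.values())
    weights = {a: r / total for a, r in rates.items()}
    return rates, weights


# First search: 30 seconds of preview (from the text); frequencies assumed.
first_search = {"type": 2, "owner": 2, "date": 1, "phrase": 2}
rates, weights = time_based_weights([first_search], [30])
print(rates["type"], weights["type"])
```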

FIG. 18 is a flowchart of search processing using a method of extracting the user's search viewpoint from the preview time and calculating the weight. In FIG. 18, the attribute item as a search condition is set in step S1, the similarity is calculated for all the files in step S2, and the search result is displayed according to the similarity in step S3.

[0118] When execution of a preview is determined in step S4, the common attributes of the previewed files and the browsing time are counted in step S5. If no preview is executed, it is checked in step S6 whether the target file has been found. If it has not, it is determined in step S7 whether a recursive search is to be executed.

If a recursive search is to be executed, the process proceeds to step S8, where the attribute weights are calculated based on the preview time as shown in the weight table 130 of FIG. 17. The next search is then performed from step S1, and the attribute weights calculated in step S8 from the preview time are reflected in the calculation of the similarity in step S2 of that search.

FIG. 19 is a flowchart of the search process of the present invention using the method of calculating the weight for the next search from the preview count, the sort count, and the preview time as the browsing history.

[0121] In this search processing, attribute items are set as search conditions in step S1, the similarity is calculated for all files with the above-described equation (1) in step S2, and the search results are displayed according to the similarity in step S3.

[0122] When a browsing operation such as a preview or a rearrangement is executed on this search result in step S4, the common attribute values of the previewed files are counted in step S5, the common attributes and browsing time of the previewed files are counted in step S6, and the score corresponding to the order of rearrangement is counted for each attribute item in step S7.

[0123] If no browsing operation is executed in step S4, it is checked in step S8 whether the target file has been found. If it has not, it is checked in step S9 whether a recursive search is to be executed. In step S10, a combination of weight calculations is selected.

[0124] For this weight calculation, a combination of at least two of the following is selected.

(1) Weight calculation reflecting the number of previews

(2) Weight calculation reflecting preview time

(3) Weight calculation reflecting the number of rearrangements

[0125] Next, in step S11, the weight of each attribute item is calculated from the respective operation history information according to the selected combination. Because each weight calculation yields its own weight for the same attribute item, the average of these weights is calculated and set as the weight used to calculate the similarity in the next search, and the next search process is restarted from step S1.
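The averaging of step S11 can be sketched as follows. The function name is an assumption. The two input weight sets are the first-search decimal weights given earlier in the description for the preview-count method and the rearrangement method; combining exactly these two is only an example.

```python
def combine_weights(weight_sets):
    """Average the weights of each attribute item over the selected methods."""
    if len(weight_sets) < 2:
        raise ValueError("at least two weight calculations must be combined")
    attrs = weight_sets[0].keys()
    return {a: sum(ws[a] for ws in weight_sets) / len(weight_sets) for a in attrs}


# First-search weights from the text for two of the three methods.
by_preview_count = {"type": 0.35, "owner": 0.35, "date": 0.18, "phrase": 0.12}
by_sort_order = {"type": 0.20, "owner": 0.30, "date": 0.40, "phrase": 0.10}

combined = combine_weights([by_preview_count, by_sort_order])
print({a: round(w, 3) for a, w in combined.items()})
```

Because each input set sums to 1, the averaged weights also sum to 1 and can be used in equation (1) without a further normalization step.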

In the flowchart of FIG. 19, the weight calculation is performed by selecting a combination of at least two of (1) to (3). Alternatively, the weight calculation may use a fixed combination of (1) and (2), of (2) and (3), or of (1), (2), and (3).

[0127] The present invention also provides a file search program to be executed by a computer. The file search program of the present invention has the processing contents shown in the flowchart of FIG. 12, FIG. 16, FIG. 18, or FIG. 19.

[0128] The present invention also provides a computer-readable storage medium storing the file search program. This storage medium includes portable storage media such as a CD-ROM, floppy (R) disk, DVD disk, magneto-optical disk, or IC card and other card-type storage media, and storage devices such as a hard disk installed inside or outside the computer system. In addition, it includes a database that holds the program via a line, other computer systems and their databases, and a transmission medium on the line.

The present invention includes appropriate modifications that do not impair the object and advantages thereof, and is not limited by the numerical values shown in the above embodiments.

Claims

The scope of the claims

[1] A file search program characterized by causing a computer having a file storage unit, in which attribute information extracted from each file is registered in association with that file, to execute:

A search condition setting step of setting attribute information as a file search condition;

A search step of searching for a file by calculating, for each file in the file storage unit, a similarity using weights from the attribute information of the search condition and the registered attribute information;

A search result display step of displaying the search result of the file obtained by the search step and generating a display state corresponding to a browsing operation on the search result;

A browsing operation extraction step of extracting a browsing operation history from the search result display step; and

A weight calculation step of calculating, from the browsing operation history, a weight used for calculating the similarity in the search step, and setting the calculated weight in the search step when a search is performed again.

[2] The file search program according to claim 1, wherein the attribute information includes a plurality of attribute items, and the search step calculates the similarity of a file as the sum of the similarity of each attribute item multiplied by its weight.

[3] In the file search program according to claim 2,

The search result display step can display a file preview as a search result by a browsing operation,

The browsing operation extraction step calculates the frequency of each attribute item of one or more previewed files and extracts it as a browsing operation history,

The file search program characterized in that the weight calculation step calculates a weight for each attribute item according to a frequency of each attribute item in the browsing operation history.

[4] In the file search program according to claim 3,

In the browsing operation extraction step, the common attribute value included in each attribute item of the plurality of previewed files is counted, and the maximum frequency of the attribute value is extracted for each attribute item.

The file search program characterized in that the weight calculation step calculates the weight of each attribute item as a value obtained by dividing the maximum frequency for each attribute item by the number of previews and normalizing the result.

[5] In the file search program according to claim 2,

In the search result display step, the search result file can be rearranged and displayed in ascending or descending order according to the similarity of the selected attribute item with respect to any one of the plurality of attribute items.

The browsing operation extraction step calculates a value according to the rank of each attribute item selected for sorting and extracts it as a browsing operation history,

The file search program characterized in that the weight calculation step calculates a weight for each attribute item according to a rank of each attribute item in the browsing operation history.

[6] In the file search program according to claim 5,

In the browsing operation extraction step, for the rank of each attribute item selected for sorting, a score that becomes higher and higher in order of the sorting operation is calculated and extracted as a browsing operation history.

The weight calculation step calculates the weight of each attribute item as a normalized value obtained by dividing the score for each attribute item by the number of times of sorting.

[7] In the file search program according to claim 2,

The search result display step can display a preview of the file as a search result by a browsing operation,

The browsing operation extraction step calculates the frequency and browsing time of each attribute item of one or more previewed files and extracts it as a browsing operation history,

The weight calculation step calculates a weight for each attribute item according to the frequency and browsing time of each attribute item in the browsing operation history.

[8] In the file search program according to claim 7,

The file search program characterized in that the weight calculation step calculates the weight of each attribute item as a value obtained by dividing the maximum frequency for each attribute item by the browsing time and normalizing the result.

[9] In the file search program according to claim 2,

In the search result display step, a preview of the file can be displayed as a search result by a browsing operation, and the search result files can be sorted and displayed in ascending or descending order according to the similarity of an attribute item selected from among the plurality of attribute items,

The browsing operation extraction step includes:

The browsing operation extraction step includes at least one of the following two: a third extraction step that calculates the frequency and browsing time of each attribute item of one or more previewed files and extracts the browsing operation history.

A first weight calculating step of calculating a weight for each attribute item according to the frequency of each attribute item of the browsing operation history by the first extracting step;

The weight calculating step by the second extracting step includes a second weight calculating step for calculating a weight for each attribute item according to a rank of each attribute item in the browsing operation history, and the browsing by the third extracting step. A third weight calculating step for calculating a weight for each attribute item according to the frequency and browsing time of each attribute item in the operation history;

A file search program comprising at least two of the above.

[10] The file search program according to claim 9, wherein the weight calculation step calculates the average of the weights for each attribute item calculated by the combination of at least two of the first to third weight calculation steps, and sets the average in the search step.

[11] The file search program according to claim 1, wherein the search condition setting step sets, as search conditions, attribute values of a plurality of attribute items input using a search screen.

[12] The file search program according to claim 1, wherein the search condition setting step sets the search condition by extracting attribute values of attribute items from a file specified using the search screen.

[13] The file search program according to any one of claims 1 to 12, wherein the attribute information includes, as attribute items, a type, an owner, a date of creation or update, and a phrase.

[14] A file search method for searching for a target file from a file storage unit in which attribute information extracted from each file is registered in association with that file, the file search method comprising:

A search condition setting step of setting attribute information as a file search condition;

A search step of searching for a file by calculating, for each file in the file storage unit, a similarity using weights from the attribute information of the search condition and the registered attribute information;

A search result display step of displaying the search result of the file obtained by the search step and generating a display state corresponding to a browsing operation on the search result; and

A weight calculation step of calculating, from the browsing operation history, a weight used for calculating the similarity in the search step, and setting the calculated weight in the search step when a search is performed again.

[15] The file search method according to claim 14, wherein the attribute information includes a plurality of attribute items, and the search step calculates the similarity of a file as the sum of the similarity of each attribute item multiplied by its weight.

[16] The file search method according to claim 15,

wherein the browsing operation extraction step calculates the frequency of each attribute item of one or more previewed files and extracts it as a browsing operation history, and the weight calculation step calculates a weight for each attribute item according to the frequency of each attribute item in the browsing operation history.

[17] The file search method according to claim 15,

wherein the weight calculation step calculates a weight for each attribute item according to the rank of each attribute item in the browsing operation history.

[18] The file search method according to claim 15,

wherein the weight calculation step calculates a weight for each attribute item according to the frequency and browsing time of each attribute item in the browsing operation history.

[19] The file search method according to claim 15,

In the search result display step, a preview of the file can be displayed as a search result by a browsing operation, and the search result files can be sorted and displayed in ascending or descending order according to the similarity of an attribute item selected from among the plurality of attribute items,

The browsing operation extraction step includes:

In the browsing operation extraction step, each attribute item of the previewed one or more files A third extraction step for calculating the frequency and browsing time of the file and extracting it as a browsing operation history, and at least one of the following two powers:

A file search method comprising at least two of the above.

A file search apparatus comprising:

A file storage unit in which attribute information extracted from each file is registered in association with that file;

A search condition setting unit that sets attribute information as a file search condition;

A search unit that searches for a file by calculating, for each file in the file storage unit, a similarity using weights from the attribute information of the search condition and the registered attribute information;

A browsing operation extraction unit that extracts a browsing operation history from the search result display unit; and

A weight calculation unit that calculates, from the browsing operation history, a weight used for calculating the similarity in the search unit, and sets the calculated weight in the search unit when a search is performed again.