WO2009096190A1 - Citation determination support device - Google Patents

Citation determination support device

Publication number
WO2009096190A1
Authority
WO
WIPO (PCT)
Prior art keywords
determination
citation
range
data
comparison
Application number
PCT/JP2009/000360
Other languages
English (en)
Japanese (ja)
Inventor
Kazunari Sugimitsu
Keiki Onishi
Original Assignee
Kanazawa Institute Of Technology
Application filed by Kanazawa Institute Of Technology
Publication of WO2009096190A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Definitions

  • The present invention relates to a citation determination support apparatus and a citation determination support program that support determining whether document data is cited in determination target data.
  • Patent Document 1 discloses a technique that investigates copyright infringement of content in a web page based on copyright information received from a server and, when infringement is found, transmits a notification to that effect to the server.
  • Patent Document 2 discloses a technique for judging the degree of similarity between technical documents and visually displaying the relationship between the two documents.
  • The present invention has been made in view of the above. Its purpose is to provide a citation determination support apparatus and a citation determination support program that use a general-purpose determination algorithm and can thereby improve the accuracy of the determination while preventing increases in the development process and the manufacturing cost.
  • The citation determination support apparatus of claim 1 supports determining whether document data is cited in determination target data, and is characterized by comprising the following means.
  • Determination range specifying means for specifying, from the determination target data, a determination range to be checked for citation of the document data.
  • Comparison range specifying means for specifying, from the document data, a comparison range to be compared with the determination target data.
  • Similarity calculation means for searching the comparison range specified by the comparison range specifying means for the description content of the determination range specified by the determination range specifying means, and calculating the similarity between the description content of the determination range and the description content of the comparison range.
  • Document citation determination means for determining that the determination range cites the comparison range when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined threshold.
  • Output means for outputting the determination range of the determination target data that cites the comparison range of the document data.
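The flow of claim 1 can be sketched as follows. This is only an illustration under stated assumptions: the source does not name a similarity measure, so Python's `difflib.SequenceMatcher` stands in for the "general-purpose determination algorithm", and the function names and the default threshold of 0.8 are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(determination_range: str, comparison_range: str) -> float:
    """Return a score in [0.0, 1.0]; SequenceMatcher is an assumed stand-in measure."""
    return SequenceMatcher(None, determination_range, comparison_range).ratio()


def cites(determination_range: str, comparison_range: str, threshold: float = 0.8) -> bool:
    """Judge a citation when the similarity is equal to or greater than the threshold."""
    return similarity(determination_range, comparison_range) >= threshold


def citing_ranges(determination_ranges, comparison_ranges, threshold=0.8):
    """Collect (determination range, comparison range, score) triples judged as citations."""
    hits = []
    for d in determination_ranges:
        for c in comparison_ranges:
            score = similarity(d, c)
            if score >= threshold:
                hits.append((d, c, score))
    return hits
```

The triples returned by `citing_ranges` correspond to what the output means would present: each determination range together with the comparison range it was judged to cite.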
  • The citation determination support apparatus of claim 2 is the apparatus of claim 1, further comprising legality determination means for determining, when the document citation determination means determines that the determination range cites the comparison range, whether the citation is a legitimate citation based on the cited part of the comparison range in the determination range and its vicinity.
  • The citation determination support apparatus of claim 3 is the apparatus of claim 2, wherein the legality determination means determines whether citation source information identifying the document data that is the source of the cited part is included in the determination target data.
  • The citation determination support apparatus of claim 4 is the apparatus of claim 2 or 3, wherein the legality determination means determines whether a determination range whose similarity is equal to or greater than the predetermined threshold conforms to a predetermined citation form and, based on the result, determines whether the citation of the comparison range in the determination range is a legitimate citation.
  • In a further aspect, the apparatus may comprise citation form storage means that stores the type of the determination target data in association with the predetermined citation form. The legality determination means then identifies the type of the determination target data, acquires the citation form corresponding to the identified type from the citation form storage means, and determines whether the citation of the comparison range in the determination range matches the acquired citation form.
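The legality checks described above might look like the sketch below. Everything concrete in it is an assumption made for illustration: the source does not define any citation form, so the two regular expressions, the document type names, and the idea of detecting citation source information by the source title are all hypothetical.

```python
import re

# Hypothetical citation forms keyed by document type (stand-in for the citation form storage).
CITATION_FORMS = {
    "thesis": re.compile(r'"[^"]+"\s*\[\d+\]'),                 # "quoted text" [1]
    "report": re.compile(r'"[^"]+"\s*\([^()]+,\s*\d{4}\)'),     # "quoted text" (Author, 2008)
}


def has_source_info(target_text: str, source_title: str) -> bool:
    """Assumed check: the title of the cited document appears in the target data."""
    return source_title in target_text


def conforms_to_form(document_type: str, passage: str) -> bool:
    """Look up the form for the document type and test the cited passage against it."""
    form = CITATION_FORMS.get(document_type)
    return form is not None and form.search(passage) is not None


def is_legitimate(document_type: str, target_text: str, passage: str, source_title: str) -> bool:
    """Treat a citation as legitimate only if both checks pass (an assumed combination)."""
    return has_source_info(target_text, source_title) and conforms_to_form(document_type, passage)
```

Keeping the forms in a lookup table keyed by type mirrors the stored association between document type and citation form, so a new type only needs a new table entry.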
  • The citation determination support apparatus of claim 6 is the apparatus of any one of claims 1 to 5, further comprising reference information acquisition means for acquiring, based on the document data, reference information for referring to the document data that includes the comparison range when the document citation determination means determines that the determination range cites the comparison range. The output means outputs the reference information acquired by the reference information acquisition means in addition to the determination range of the determination target data that cites the comparison range.
  • The citation determination support apparatus of claim 7 is the apparatus of any one of claims 1 to 6, wherein the determination range specifying means specifies, as the determination range, a predetermined component among the components that constitute the determination target data.
  • The citation determination support apparatus of claim 8 is the apparatus of any one of claims 1 to 7, further comprising history storage means that stores creator identification information, which uniquely identifies the creator of determination target data generated in the past, in association with information indicating that improper citation occurred or with a score of the creator. The determination range specifying means acquires from the history storage means the creator identification information associated with information indicating that improper citation occurred, or the creator identification information associated with a score lower than a predetermined value, and selects the determination target data created by the identified creator as the determination target from among a plurality of determination target data.
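A minimal sketch of this history-based selection follows. The record layout, the field names, and the score cutoff of 60 are assumptions for illustration; the source only says that creators are selected when a past improper citation is recorded or their score is below a predetermined value.

```python
from dataclasses import dataclass


@dataclass
class HistoryRecord:
    creator_id: str          # e.g. a student register number
    improper_citation: bool  # True if improper citation was recorded in the past
    score: float             # creator's score (hypothetical scale)


def select_targets(history, submissions, min_score=60.0):
    """Return the (creator_id, text) submissions whose creator is flagged by the history."""
    flagged = {
        record.creator_id
        for record in history
        if record.improper_citation or record.score < min_score
    }
    return [(creator_id, text) for creator_id, text in submissions if creator_id in flagged]
```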
  • The citation determination support apparatus of claim 9 is the apparatus of any one of claims 1 to 8, further comprising dictionary storage means that stores words that can be included in the document data in association with words that can be used to alter them, and word conversion means for converting words included in the determination target data into the words stored in the dictionary storage means. The determination range specifying means treats the determination target data converted by the word conversion means as the determination target.
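The word conversion step can be illustrated as a normalization pass that runs before similarity is calculated. The dictionary contents and the function name are hypothetical, and the word-boundary matching assumes space-delimited text such as English; the source does not describe how words are segmented.

```python
import re


def convert_words(text: str, dictionary: dict) -> str:
    """Replace each altered word with the word it is associated with in the dictionary.

    `dictionary` maps a possible substitute word to the word expected in the document data.
    """
    if not dictionary:
        return text
    # Try longer entries first so that multi-word entries are not split by shorter ones.
    alternatives = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(w) for w in alternatives) + r")\b")
    return pattern.sub(lambda match: dictionary[match.group(0)], text)
```

After conversion, a passage that was reworded with synonyms has the same vocabulary as its source, so an unmodified similarity measure can still detect it.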
  • The citation determination support apparatus of claim 10 is the apparatus of any one of claims 1 to 9, further comprising input means for receiving an operation input to the apparatus, wherein the determination range specifying means specifies, as the determination range, a range designated through the input means from the determination target data.
  • The citation determination support apparatus of claim 11 is the apparatus of any one of claims 1 to 10, further comprising determination target data storage means that stores a plurality of determination target data generated in the past. The similarity calculation means further calculates the similarity among the plurality of determination target data stored in the determination target data storage means. The document citation determination means further determines that the plurality of determination target data cite one another when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined second threshold. The comparison range specifying means specifies, as the comparison range, the determination target data determined to cite one another.
  • The citation determination support apparatus of claim 12 is the apparatus of any one of claims 1 to 11, further comprising task extraction means for extracting, based on the description content of the determination target data, task information indicating the task of the determination target data. The comparison range specifying means searches the document data using the task information extracted by the task extraction means as a search key and specifies the retrieved document data as the comparison range.
  • The citation determination support apparatus of claim 13 is the apparatus of any one of claims 1 to 12, further comprising document data storage means for storing the document data that the document citation determination means has determined to be cited in the determination range. The comparison range specifying means determines whether citation source information specifying the document data cited in the determination target data is included in the determination target data. If it is included, the comparison range specifying means determines whether the document data specified by the citation source information is stored in the document data storage means, and if it is stored, specifies that document data as the comparison range.
  • The citation determination support apparatus of claim 14 is the apparatus of any one of claims 1 to 13, wherein the similarity calculation means searches the comparison range specified by the comparison range specifying means using the description content of the determination range specified by the determination range specifying means as a search key. When the number of characters of the search key exceeds a predetermined character limit, the similarity calculation means sequentially designates character strings within the limit from the determination range as search keys and searches the comparison range a plurality of times. Among the plurality of search results, those whose appearance frequency is greater than a predetermined value are set as the comparison range used to calculate the similarity with the description content of the determination range.
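The repeated search with length-limited keys can be sketched as below. The fixed-width, non-overlapping split, the substring match, and the parameter names are assumptions; the source only states that keys within the character limit are designated sequentially and that results appearing more often than a predetermined value are kept.

```python
from collections import Counter


def split_search_keys(text: str, limit: int) -> list:
    """Cut the determination range into consecutive keys of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def frequent_hits(determination_range: str, documents: dict, limit: int = 32,
                  frequency_threshold: int = 2) -> list:
    """Search `documents` (id -> text) once per key; keep ids hit more than the threshold."""
    if len(determination_range) <= limit:
        keys = [determination_range]  # short enough to be used as a single search key
    else:
        keys = split_search_keys(determination_range, limit)
    counts = Counter()
    for key in keys:
        for doc_id, body in documents.items():
            if key in body:
                counts[doc_id] += 1
    return [doc_id for doc_id, hits in counts.items() if hits > frequency_threshold]
```

Only the documents returned here would then go through the more expensive similarity calculation against the determination range.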
  • The citation determination support apparatus of claim 15 is the apparatus of any one of claims 1 to 14, wherein the similarity calculation means analyzes the determination range to obtain words that appear a predetermined number of times or more, searches the comparison range specified by the comparison range specifying means a plurality of times using each such word as a search key, and sets the search results whose appearance frequency is greater than a predetermined value as the comparison range used to calculate the similarity with the description content of the determination range.
  • The citation determination support apparatus of claim 16 is the apparatus of any one of claims 1 to 15, further comprising input means for receiving an input of the predetermined threshold, wherein the document citation determination means determines that the determination range cites the comparison range when the similarity is equal to or greater than the threshold input through the input means.
  • The apparatus may further comprise citation ratio calculation means for calculating a citation ratio, that is, the proportion of the description content of the determination range that is occupied by content cited from the comparison range, and the output means may output the citation ratio.
  • The citation determination support apparatus of claim 18 is the apparatus of claim 17, wherein the citation ratio calculation means calculates the citation ratio for a plurality of the determination target data, and the output means outputs determination target data information, which uniquely identifies each of the plurality of determination target data, in an order based on the citation ratio calculated for each of them.
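The citation ratio and the ordered output can be illustrated as follows. Measuring the ratio by character count and listing the highest ratio first are assumptions; the source says only that the ratio is the share of the determination range occupied by cited content and that the order is based on the ratio.

```python
def citation_ratio(passages) -> float:
    """Compute cited characters / total characters.

    `passages` is a list of (text, is_cited) pairs covering the determination range.
    """
    total = sum(len(text) for text, _ in passages)
    if total == 0:
        return 0.0
    cited = sum(len(text) for text, is_cited in passages if is_cited)
    return cited / total


def rank_by_citation_ratio(targets: dict) -> list:
    """Return [(target_id, ratio)] sorted with the highest citation ratio first."""
    ratios = {target_id: citation_ratio(passages) for target_id, passages in targets.items()}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)
```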
  • The citation determination support apparatus of claim 19 is the apparatus of any one of claims 1 to 18, further comprising output mode information storage means that stores the similarity of the determination range in association with an output mode used by the output means. The output means acquires from the output mode information storage means the output mode corresponding to the similarity calculated by the similarity calculation means and outputs the determination range in the acquired output mode.
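One way to picture the association between similarity and output mode is a small lookup table. The bands and the highlight colors below are invented for illustration; the source does not say which output modes are stored.

```python
# Hypothetical output mode table: (lower bound of similarity, output mode), highest band first.
OUTPUT_MODES = [
    (0.9, "red"),
    (0.7, "orange"),
    (0.5, "yellow"),
]


def output_mode(similarity: float) -> str:
    """Return the mode of the first band whose lower bound the similarity reaches."""
    for lower_bound, mode in OUTPUT_MODES:
        if similarity >= lower_bound:
            return mode
    return "plain"  # assumed default when no band applies
```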
  • The citation determination support program supports determining whether document data is cited in determination target data, and causes a computer to function as the following means.
  • Determination range specifying means for specifying, from the determination target data, a determination range to be checked for citation of the document data.
  • Comparison range specifying means for specifying, from the document data, a comparison range to be compared with the determination target data.
  • Similarity calculation means for searching the comparison range specified by the comparison range specifying means for the description content of the determination range specified by the determination range specifying means, and calculating the similarity between the description content of the determination range and the description content of the comparison range.
  • Document citation determination means for determining that the determination range cites the comparison range when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined threshold.
  • Output means for outputting the determination range of the determination target data that cites the comparison range of the document data.
  • According to the citation determination support apparatus of claim 1, the similarity is determined after the determination range and the comparison range have been automatically narrowed down, so a general-purpose determination algorithm can be used and the accuracy of the determination can be improved while preventing increases in the development process and the manufacturing cost.
  • According to the citation determination support apparatus of claim 2, it can easily be determined whether a citation is a legitimate citation as prescribed by copyright law.
  • According to the citation determination support apparatus of claim 3, it is determined whether citation source information specifying the document data that is the source of the cited part is included in the determination target data, so material for judging the legitimacy of the citation can be obtained based on the presence or absence of that information.
  • According to the citation determination support apparatus of claim 4, it is determined whether the determination range conforms to a predetermined citation form, and whether the citation of the comparison range in the determination range is legitimate is determined based on the result, so the legitimacy of the citation can easily be determined on the basis of a preset citation form.
  • Further, the citation form corresponding to the type of the determination target data is acquired from the citation form storage means, and it is determined whether the citation matches the acquired citation form, so the legitimacy of the citation can be determined based on a citation form that differs for each type of determination target data.
  • Further, the source of a citation is automatically identified, and the citation determination is performed with the cited document added to the determination range of the determination target data, so improper citation of a document can be detected easily.
  • According to the citation determination support apparatus of claim 7, a component of the determination target data that is likely to contain unauthorized citation can be set as the determination range, which further improves the accuracy of the determination.
  • Further, the determination target data of a person with a high probability of improper citation can automatically be set as the determination target, so the determination takes into account the possibility that the misconduct recurs, and the accuracy of the determination can be further improved.
  • According to the citation determination support apparatus of claim 9, even when the document data is cited improperly in altered form rather than verbatim, it can be determined whether it is cited, and the accuracy of the determination can be further improved while preventing increases in the development process and the manufacturing cost.
  • Further, the target of the citation determination can be limited, which reduces the load of the determination process.
  • Further, document data that is highly likely to cite another person's document data can automatically be set as the comparison range, so the accuracy of the determination can be further improved while preventing increases in the development process and the manufacturing cost.
  • Further, an appropriate comparison range can be set automatically in accordance with the description content of the determination target data, so the accuracy of the determination can be further improved while preventing increases in the development process and the manufacturing cost.
  • Further, the document data storage means stores the document data that the document citation determination means has determined to be cited in the determination range. When citation source information specifying the document data is included in the determination target data and the document data specified by that information is stored in the document data storage means, the search can be executed on that document data.
  • Further, the entire thesis data can substantially be included in the search range while each part of the thesis data is targeted in turn, so the accuracy of the citation determination can be improved.
  • Further, a search result with a high appearance frequency is automatically identified and set as the comparison range used to calculate the similarity, so a comparison range that matches the determination range can be extracted automatically for the citation determination, and the accuracy of the citation determination can be further improved.
  • Further, the determination range is determined to cite the comparison range based on a threshold received as input, so an optimal threshold can be set according to the purpose of the determination and the determination can be performed based on that threshold.
  • Further, material for judging the legitimacy of the citation can be presented.
  • Further, the citation ratio is calculated for a plurality of determination target data, and the determination target data information is output in an order based on the citation ratio of each, so material for comparing the legitimacy of citation across the plurality of determination target data can be presented based on the citation ratio.
  • Further, the output mode corresponding to the similarity calculated by the similarity calculation means is acquired from the output mode information storage means, and the determination range is output in the acquired output mode, so the determination range can be output in a manner that lets the user easily grasp the degree of similarity.
  • FIG. 1 is a block diagram conceptually showing a system configuration including the citation determination support apparatus according to the first embodiment.
  • FIG. 2 is a flowchart showing the procedure of the citation determination support process of the first embodiment.
  • FIG. 3 is a schematic diagram showing an example of a citation determination screen.
  • FIG. 4 is an explanatory view showing an example of a text portion specified as the determination range in thesis data.
  • FIG. 5 is a schematic diagram showing an example of the citation determination screen on which the content of the determination range is displayed.
  • The remaining drawings are as follows.
  • A flowchart showing the procedure of the comparison range specifying process.
  • A schematic diagram showing the citation determination screen with the cited place highlighted within the determination range.
  • A block diagram showing the functional configuration of the citation determination support apparatus according to the second embodiment.
  • An explanatory view showing an example of history data.
  • A flowchart showing the procedure of the citation determination support process of the second embodiment.
  • A flowchart showing the procedure of the determination target specifying process of the second embodiment.
  • A block diagram showing the functional configuration of the citation determination support apparatus according to the third embodiment.
  • An explanatory view showing an example of a specialized dictionary.
  • A flowchart showing the procedure for specifying the determination target in the third embodiment.
  • A block diagram showing the functional configuration of the citation determination support apparatus according to the fourth embodiment.
  • A flowchart showing the procedure of the comparison and determination process of the fourth embodiment.
  • A block diagram showing the functional configuration of the citation determination support apparatus according to the fifth embodiment.
  • A flowchart showing the procedure of the comparison range specifying process of the fifth embodiment.
  • A block diagram showing the functional configuration of the citation determination support apparatus according to the sixth embodiment.
  • A flowchart showing the procedure of the search process in the similarity calculation of the sixth embodiment.
  • A flowchart showing the procedure of the similarity calculation process of Modification 1.
  • A flowchart showing the procedure of the comparison range specifying process of Modification 2.
  • A block diagram showing the functional configuration of the citation determination support apparatus according to the seventh embodiment.
  • A table showing the information stored in the citation form DB.
  • A flowchart showing the procedure of the citation determination support process of the seventh embodiment.
  • A flowchart showing the procedure of the citation form setting process.
  • A diagram illustrating the citation form setting input screen.
  • A flowchart showing the procedure of the legality determination process.
  • A block diagram showing the functional configuration of the citation determination support apparatus according to the eighth embodiment.
  • A table illustrating the information stored in the citation ratio DB.
  • A flowchart showing the procedure of the citation determination support process of the eighth embodiment.
  • A diagram illustrating the citation determination screen when the citation ratio is output and displayed.
  • A flowchart showing the procedure of the list display process.
  • A diagram showing the determination result screen that displays the list.
  • A block diagram showing the functional configuration of the citation determination support apparatus according to the ninth embodiment.
  • A table illustrating the information stored in the output mode DB.
  • A flowchart showing the procedure of the citation determination support process of the ninth embodiment.
  • A diagram showing thesis data displayed on the citation determination screen on the display device.
  • A flowchart showing the procedure of the citation determination support process of the tenth embodiment.
  • A flowchart showing the procedure of the comparison and determination process of the tenth embodiment.
  • This embodiment automatically selects the component of the thesis data that is most likely to cite a third party's document and uses it as the determination range.
  • FIG. 1 is a block diagram conceptually showing the system configuration including the citation determination support apparatus according to the first embodiment.
  • As shown in FIG. 1, the citation determination support apparatus 100 is communicably connected to the WEB site 131 and the file server 133 via an arbitrary network such as the Internet 130.
  • The WEB site 131 and the file server 133 can be configured in the same manner as in the prior art, so a detailed description of them is omitted.
  • As shown in FIG. 1, the citation determination support apparatus 100 is configured by connecting a storage unit 101 and a control unit 102 via a bus, and includes an input device 103 and a display device 104.
  • The storage unit 101 is storage means that stores the various programs and data necessary for controlling the citation determination support apparatus 100, and consists of a storage medium such as a hard disk drive (HDD) or a memory.
  • A citation determination support program, stored on a recording medium (not shown) and read by a reading device (not shown), is installed in the storage unit 101.
  • The storage unit 101 includes a document data storage unit 101a, a document list storage unit 101b, and a thesis data storage unit 101c.
  • The document data storage unit 101a stores document data that can be a source of citation.
  • Document data is stored in the document data storage unit 101a of the citation determination support apparatus 100 and is also stored on the WEB site 131 and the file server 133.
  • The document list storage unit 101b stores a list of bibliographic information for the document data recorded in the document data storage unit 101a and for document data on the Internet, such as the document name, the storage location (for example, a URL (Uniform Resource Locator) or folder name), the file name, the creator, and the creation date.
  • The thesis data storage unit 101c stores thesis data to be subjected to the citation determination in association with the student register number that identifies the student who created the thesis data.
  • The thesis data storage unit 101c receives thesis data subject to the citation determination from the student's terminal and stores it. All thesis data submitted in the past and subjected to the citation determination is also stored in association with the student register number of its creator.
  • The control unit 102 is control means that controls the citation determination support apparatus 100. Conceptually, it includes a determination range specifying unit 102a, a comparison range specifying unit 102b, a similarity calculation unit 102c, a document citation determination unit 102d, a legality determination unit 102e, a reference information acquisition unit 102f, an input control unit 102g, and an output control unit 102h.
  • The specific configuration of the control unit 102 is arbitrary. For example, it comprises a control program such as an OS (Operating System), embedded programs that define various processing procedures, an internal memory for storing required data, and a CPU (Central Processing Unit) that executes these programs.
  • the determination range specifying unit 102 a is a determination range specifying unit that specifies a determination range of presence / absence of citation of document data from the article data stored in the article data storage unit 101 c.
  • the comparison range specifying unit 102 b is a comparison range specifying unit that specifies document data or the like to be a comparison range with the determination range of the article data.
  • the similarity calculation unit 102 c uses the description content of the determination range specified by the determination range specification unit 102 a as a search key, and the document data specified in the comparison range specification unit 102 b or the past paper data (hereinafter referred to as “document data etc.” ) Is a similarity calculation means for searching for the comparison range and calculating the mutual similarity.
  • the document quotation determination unit 102d determines that the document data judgment range refers to a comparison data or the like. It is a judgment means.
  • the legality determination unit 102e refers to the cited place such as the document data or the like in the determination range It is a legitimacy judging means to judge whether the citation is a legal citation based on the place.
  • the reference information acquisition unit 102 f uses the name or title of the document data or the like as reference information for referring to the document data or the like.
  • URL and folder name, etc. are reference information acquisition means for acquiring from attributes such as document data.
  • the input control unit 102 g is an input control unit that receives an event caused by an operation input from the input device 103 and performs input control of the operation input.
  • the output control unit 102 h is an output control unit that performs display control of various screens on the display device 104.
  • the output control unit 102h displays a judgment range display, a judgment range of article data citing a comparison range such as document data, and a quotation judgment screen (described later) showing the reference information on the display device 104.
  • the input device 103 is an input unit such as a keyboard or a pointing device such as a mouse.
  • the display device 104 is an output means such as a monitor.
  • FIG. 2 is a flowchart showing the procedure of the quotation determination support process of the first embodiment.
  • FIG. 3 is a schematic view showing an example of the quotation determination screen.
  • When the "Simple" button on the citation judgment screen is clicked, the determination range is specified. When the "Details" button is clicked, a screen (not shown) is displayed for making various settings for searching the search database that manages the document data etc. of the document data storage unit 101a, such as the language, the creation period of the cited document data, keywords, and the creator.
  • The determination range identification unit 102a reads the created article data from the article data storage unit 101c (step S11). The determination range specifying unit 102a then analyzes the structure of the article data by a known method (step S12) and obtains the component parts constituting the article, such as the introductory part ("Introduction" and the like), the main text part, the concluding part, and the "Acknowledgement" part. Because the main text part is the principal portion of the article data and is the component part most likely to cite a third party's document, the determination range specifying unit 102a specifies the main text part, from among the component parts obtained by the structural analysis, as the determination range (step S13).
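  • Steps S12 and S13 can be illustrated with a minimal sketch. The source only refers to "a known method" of structural analysis, so everything below is an assumption for illustration: headings are taken to be lines of the form `# Heading`, and the names `NON_BODY_HEADINGS`, `split_into_parts`, and `specify_determination_range` are hypothetical.

```python
import re

# Hypothetical heading names for the non-body component parts.
NON_BODY_HEADINGS = {"introduction", "conclusion", "acknowledgement"}


def split_into_parts(article_text):
    """Step S12 (sketch): split the article into (heading, content) parts.

    Assumes a heading is a line of the form '# Heading'.
    """
    parts = []
    heading, lines = None, []
    for line in article_text.splitlines():
        match = re.match(r"#\s+(.*)", line)
        if match:
            parts.append((heading, "\n".join(lines).strip()))
            heading, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    parts.append((heading, "\n".join(lines).strip()))
    # Drop empty parts and any text that precedes the first heading.
    return [(h, c) for h, c in parts if h is not None and c]


def specify_determination_range(article_text):
    """Step S13 (sketch): keep only the main text parts."""
    body = [content for heading, content in split_into_parts(article_text)
            if heading.lower() not in NON_BODY_HEADINGS]
    return "\n".join(body)
```

  • With this sketch, an article with the headings "Introduction", "Method", "Results", and "Acknowledgement" yields only the "Method" and "Results" content as the determination range.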
  • FIG. 4 is an explanatory view showing an example of a text portion specified as a judgment range in article data.
  • The content described in the answer column is the component corresponding to the main text, so the determination range specification unit 102a specifies the content of this answer column as the determination range.
  • the output control unit 102h displays the content of the specified determination range in the determination range column of the quotation determination screen, as shown in FIG.
  • the input control unit 102g receives the event, and the comparison range specifying unit 102b performs a process of specifying the comparison range (step S14).
  • FIG. 6 is a flowchart showing the procedure of the process of specifying the comparison range.
  • the comparison range specifying unit 102b first reads all the thesis data submitted in the past stored in the thesis data storage unit 101c (step S21).
  • the comparison range specifying unit 102b reads all the document data described in the document list stored in the document list storage unit 101b from the document data storage unit 101a and the Internet 130 (step S22).
  • The comparison range specifying unit 102b specifies all the read article data and all the acquired document data (the document data etc.) as the comparison range (step S23).
  • the similarity calculation unit 102c searches the data of the comparison range specified using the description content of the specified determination range as the search key (step S15).
  • the similarity of the description content is calculated (step S16).
  • The similarity calculation unit 102c performs the search by designating the search key to a search program or search engine that uses a known search technology and instructing it to execute the search.
  • As the logic for calculating the degree of similarity, a known logic is used, for example syntactic analysis of the description content of the determination range of the article data and of the description content of the document data.
  • The document quoting determination unit 102d determines whether the determination range cites the document of the comparison range by determining whether the calculated similarity is equal to or greater than a predetermined threshold (step S17).
  • the predetermined threshold can be arbitrarily determined in accordance with the accuracy required for the quotation determination.
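  • The threshold test of steps S16 and S17 can be sketched as follows. The source leaves the similarity logic to "a known logic", so this sketch substitutes the character-level ratio from Python's standard `difflib` module purely as a stand-in; the function names and the default threshold of 0.8 are assumptions.

```python
from difflib import SequenceMatcher


def similarity(determination_range, document_text):
    """Step S16 (sketch): a similarity in [0, 1].

    SequenceMatcher.ratio() is only a stand-in for the 'known logic'
    that the source does not specify.
    """
    return SequenceMatcher(None, determination_range, document_text).ratio()


def is_cited(determination_range, document_text, threshold=0.8):
    """Step S17: a citation is assumed when similarity >= threshold."""
    return similarity(determination_range, document_text) >= threshold
```

  • Raising `threshold` makes the quotation determination stricter (fewer passages are flagged), and lowering it makes the determination more sensitive, which corresponds to choosing the threshold according to the required accuracy.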
  • the legality determination unit 102e determines whether this citation is a legitimate one (step S18).
  • The term "legitimate" is a concept that includes the citation being legal under copyright law, or satisfying requirements set in advance by the user. Specifically, the legality determination unit 102e determines that the citation is legitimately made under copyright law when the book name is displayed near the lower part of the place in the determination range where the document data etc. is cited, when quotation brackets indicating a citation are displayed immediately before and after the cited place, or when the cited place is displayed in a font different from that of the other parts to indicate that it is a citation.
  • When a predetermined indication (for example, the author's name or the publisher's name) is displayed, it may also be judged that the cited part has been properly cited.
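  • A minimal sketch of the legitimacy check of step S18 on plain text is given below. It covers only two of the cues described above (quotation brackets immediately around the cited place, and a source indication in its vicinity); font differences cannot be seen in plain text. Treating either cue alone as sufficient, the bracket pairs in `QUOTE_PAIRS`, the `window` size, and the function name are all assumptions for illustration.

```python
# Hypothetical set of opening -> closing quotation characters.
QUOTE_PAIRS = {"「": "」", "“": "”", '"': '"'}


def is_legitimate(text, start, end, source_names, window=40):
    """Sketch of step S18: judge from the cited place text[start:end]
    and its vicinity whether the citation is marked as a citation."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    # Cue 1: quotation brackets immediately before and after the cited place.
    quoted = before in QUOTE_PAIRS and QUOTE_PAIRS[before] == after
    # Cue 2: an indication of the source (book name, author, publisher)
    # within `window` characters after the cited place.
    vicinity = text[end:end + window]
    attributed = any(name in vicinity for name in source_names)
    return quoted or attributed
```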
  • When it is determined that the citation in the determination range is a legitimate citation (step S18, Yes), the processing ends.
  • When it is determined that the citation is not legitimate (step S18, No), the reference information acquisition unit 102f acquires the reference information for referring to the cited document data etc. (cited document data or cited paper data), such as the file name and title, URL, and folder name of the document data etc., from the attributes of the document data or the attributes of the cited article data (step S19). Then, the output control unit 102h clearly indicates, on the citation determination screen, the location within the determination range at which the document data etc. is cited, and displays the reference information (step S20).
  • steps S15 to S20 are repeatedly executed for all the data in the specified comparison range (step S20a, No). If the processes in steps S15 to S20 have been executed for all the data in the specified comparison range (step S20a, Yes), the process ends.
  • FIG. 7 is a schematic view showing a state in which a cited place is highlighted within the judgment range on the quotation judgment screen. Note that the bold and underlined portions in FIG. 7 are the highlighted portions, that is, the portions of the citation.
  • The instruction is accepted by the input control unit 102g, and the output control unit 102h is configured to display the reference information at the instructed location.
  • FIG. 8 is a schematic view showing a state in which reference information is displayed on the quotation determination screen.
  • the example of FIG. 8 shows the case where the document data on the Internet 130 is cited, and the URL of the document data is displayed as the reference information.
  • The output control unit 102h is configured to access the web page indicated by the URL and display the reference data etc. of the citation source.
  • a professor or the like who makes a citation judgment on article data can easily acquire reference data of the citation source.
  • In the present embodiment, the similarity judgment is performed after automatically limiting the determination range of the article data and the comparison range of the document data etc., so that the citation judgment can be performed using a general-purpose determination algorithm such as similarity calculation. Therefore, according to the present embodiment, it is possible to improve the accuracy of the determination while preventing an increase in development man-hours and manufacturing cost.
  • Because the determination range identification unit 102a specifies, from among the component parts that constitute the article data, the main text, which is the part most easily cited without permission, as the determination range, the accuracy of the determination can be further improved.
  • Because the legality determination unit 102e determines whether the citation is a legitimate citation based on the place in the determination range where the comparison range is cited and on the vicinity of that place, it can easily be judged whether a citation of document data etc. is a legal citation as prescribed by copyright law, and the accuracy of the determination can be improved.
  • Because the reference information acquisition unit 102f acquires, from the document data, the reference information for referring to the document data that includes the comparison range, and the acquired reference information is output together with the determination range of the judgment target data that cites the comparison range of the document data, the document data can be referred to easily.
  • In the second embodiment, thesis data of a student who has made an illegal citation in the past, or of a student whose grade is low, is selected as the judgment target.
  • Except where specifically noted, the configuration and processing of the second embodiment are the same as those of the first embodiment, and description of the common configuration and processing is omitted.
  • FIG. 9 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the second embodiment.
  • the quote determination support device 900 differs from the quote determination support device 100 according to the first embodiment in that the storage unit 101 includes the history data storage unit 101 d and the control unit 102 includes the determination range specification unit 102 i.
  • the history data storage unit 101 d is a storage medium, such as a memory or an HDD, which stores history data related to article data generated in the past.
  • FIG. 10 is an explanatory diagram of an example of history data.
  • The history data associates, for every thesis data generated in the past, the creation date of the thesis data, the student register number that uniquely identifies the student who created the thesis data, the presence or absence of illegal citation in the thesis data, and the student's grade (A, B, C, or D, where A is the best, followed by B, C, and D in that order).
  • "Presence" is set as the presence or absence of illegal citation for thesis data of a student who made an illegal citation in the past.
  • past thesis data itself is stored in the thesis data storage unit 101c in association with the student register number.
  • The determination range specification unit 102i refers to this history data and, treating the corresponding students as persons with a high probability of making an illegal citation, acquires the student register numbers for which the presence or absence of illegal citation is "presence" and the student register numbers for which the grade is C or lower (that is, C or D). It then selects, from among the plurality of submitted thesis data (stored in the thesis data storage unit 101c), the thesis data submitted by the students with the acquired student register numbers as the judgment target.
  • the determination range specifying unit 102i specifies the body part as the determination range from among the component parts of the article data selected as the determination target.
  • FIG. 11 is a flowchart of the quotation determination support process according to the second embodiment.
  • The determination range identification unit 102i first performs the process of specifying the determination target (step S31). Details of this process will be described later.
  • For the thesis data of the students specified as the determination target, the determination range is specified as in the first embodiment (steps S32 and S33), and the quotation determination is performed by the same processing as in the first embodiment (steps S34 to S40a).
  • FIG. 12 is a flowchart illustrating the procedure of the process of identifying a determination target according to the second embodiment.
  • the determination range identification unit 102i reads out the created article data and the student register number corresponding to the article data from the article data storage unit 101c (step S41).
  • the determination range specifying unit 102i refers to the history data stored in the history data storage unit 101d, and reads out the presence / absence of the illegal citation and the grade corresponding to the read student registry number (step S42).
  • The determination range specifying unit 102i determines whether the presence or absence of illegal citation read from the history data is "presence" (step S43). If it is "presence" (step S43, Yes), the thesis data created by the student with that student register number, that is, the thesis data read in step S41, is specified as the determination target (step S45).
  • If it is determined in step S43 that the presence or absence of illegal citation is "absence" (step S43, No), the determination range specifying unit 102i further determines whether the grade read from the history data is C or lower, that is, C or D (step S44).
  • If the grade is C or lower (step S44, Yes), the thesis data created by the student with that student register number is specified as the determination target (step S45).
  • If the grade is higher than C in step S44, that is, A or B (step S44, No), the article data read out in step S41 is not specified as a determination target.
  • The processing from step S41 to step S45 is performed on each of the plurality of article data to specify the article data to be determined.
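  • The selection logic of steps S41 to S45 can be sketched as below. The record layout follows the history data described above (student register number, presence or absence of illegal citation, grade); the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HistoryRecord:
    student_id: str          # student register number
    illegal_citation: bool   # True = "presence", False = "absence"
    grade: str               # "A", "B", "C" or "D"


LOW_GRADES = {"C", "D"}      # "C or lower"


def is_determination_target(record):
    """Steps S43 and S44: past illegal citation, or a grade of C or lower."""
    if record.illegal_citation:          # step S43, Yes
        return True
    return record.grade in LOW_GRADES    # step S44


def select_targets(theses, history):
    """Steps S41 to S45 over all submitted theses.

    theses:  dict mapping student_id -> thesis text
    history: dict mapping student_id -> HistoryRecord
    """
    return {sid: text for sid, text in theses.items()
            if sid in history and is_determination_target(history[sid])}
```

  • Students with no history record are skipped in this sketch; the source does not say how such cases are handled, so that choice is an assumption.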
  • As described above, in the present embodiment, the student register numbers for which the presence or absence of illegal citation is "presence" (indicating an illegal citation act in the past) and the student register numbers whose grade is at or below a predetermined value (C) are obtained from the history data, and the thesis data created by those students is selected as the judgment target from among the plurality of thesis data. Because the thesis data of persons with a high probability of making an illegal citation becomes the judgment target, the determination accuracy can be further improved. In addition, because the judgment target is limited to thesis data with a high probability of illegal citation, the determination processing load can be reduced and the determination efficiency improved.
  • In the present embodiment, the determination range specifying unit 102i makes both determinations from the history data: whether there was an illegal citation in the past, and whether the grade is at or below the predetermined value. However, the paper data to be judged may be specified by only one of these determinations.
  • The third embodiment performs the similarity determination after converting altered words back into the words before correction, as a countermeasure against cases where the wording of the citation-source document is corrected when it is cited.
  • Except where specifically noted, the configuration and processing of the third embodiment are the same as those of the second embodiment, and description of the common configuration and processing is omitted.
  • FIG. 13 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the third embodiment.
  • This quotation determination support device differs from the device 900 of the second embodiment in that the storage unit 101 includes the dictionary storage unit 101e and the control unit 102 includes the determination range specification unit 102j and the word conversion unit 102k.
  • the dictionary storage unit 101e is a storage medium such as an HDD or a memory that stores a specialized dictionary in which technical terms in the technical field of the article data and one or more terms that can be used in association with the technical terms are associated and registered.
  • FIG. 14 is an explanatory view showing an example of the specialized dictionary.
  • terms that can be used in association such as a first candidate term and a second candidate term, are associated with terms that can be included in document data and the like. This specialized dictionary is used when correcting the words in the article data by the word conversion unit 102k described later.
  • the determination range specification unit 102j selects the article data converted by the word conversion unit 102k as a determination target. Further, as in the first embodiment, the determination range specifying unit 102j specifies the text portion as the determination range from among the component parts of the article data selected as the determination target.
  • the word conversion unit 102k converts a word included in article data into a first candidate term, a second candidate term, and the like of the corresponding term in the specialized dictionary.
  • FIG. 15 is a flow chart showing a specific procedure of the determination target in the third embodiment.
  • the determination range identification unit 102j reads out the created article data from the article data storage unit 101c (step S51). Then, the determination range specifying unit 102 j performs morphological analysis on the contents of the read article data according to a known method, and divides the contents into morphemes (step S 52).
  • The word conversion unit 102k searches the specialized dictionary using each word obtained as a morpheme as a search key and, for a word registered as a term in the specialized dictionary, converts the word into the first candidate term corresponding to that term (step S53).
  • In the second and subsequent passes, conversion into the n-th candidate term (n is an integer of 2 or more) is performed.
  • It is then determined whether the word conversion process has been completed for all the words in the article data (step S54); if it has not (step S54, No), the word conversion process in step S53 is repeated.
  • When the conversion has been completed for all the words (step S54, Yes), the word conversion unit 102k stores the article data with the converted words in the article data storage unit 101c as corrected-version article data (step S55).
  • The word conversion unit 102k determines whether conversion has been performed for all the candidate terms in the specialized dictionary (step S56). If conversion into all candidate terms has not been performed (step S56, No), the word conversion unit 102k selects the next candidate term (the (n + 1)-th candidate term) of the specialized dictionary (step S57), and steps S53 to S55 are repeated. As a result, a plurality of corrected versions of the article data, in which the words of the article data are converted into the respective candidate terms, are obtained and stored in the article data storage unit 101c.
  • step S56 If it is determined in step S56 that all the candidate terms in the specialized dictionary have been converted (Yes in step S56), the determination range specifying unit 102j specifies a plurality of obtained corrected version paper data as a determination target ( Step S58).
  • the quoting determination support process is performed on a plurality of corrected version paper data specified as the determination target in this way.
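  • The conversion loop of steps S52 to S58 can be sketched as follows. Morphological analysis is replaced here by simple whitespace splitting, and the specialized dictionary is modeled as a mapping from a term to its ordered list of candidate terms; both are simplifying assumptions, and the function names are hypothetical.

```python
def convert_words(words, dictionary, n):
    """Steps S53 and S54: replace every registered word by its n-th
    candidate term (n is 0-based here).

    A word with no n-th candidate is left unchanged (an assumption).
    """
    converted = []
    for word in words:
        candidates = dictionary.get(word, [])
        converted.append(candidates[n] if n < len(candidates) else word)
    return converted


def corrected_versions(article_text, dictionary):
    """Steps S52 to S58: one corrected version per candidate rank."""
    words = article_text.split()  # stand-in for morphological analysis
    max_rank = max((len(c) for c in dictionary.values()), default=0)
    return [" ".join(convert_words(words, dictionary, n))
            for n in range(max_rank)]
```

  • Each returned string corresponds to one corrected-version article data, and all of them would be passed to the quotation determination support process as determination targets.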
  • As described above, in the present embodiment, the words included in the article data are converted into the terms registered in the specialized dictionary, and the converted article data is used as the judgment target. Therefore, even if document data etc. is cited illegally with its wording altered rather than used as it is, it can be judged whether it is a citation, and the accuracy of the judgment can be further improved.
  • In the fourth embodiment, the degree of similarity is calculated between students' past thesis data.
  • Except where specifically noted, the configuration and processing of the fourth embodiment are the same as those of the first embodiment, and description of the common configuration and processing is omitted.
  • FIG. 16 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the fourth embodiment.
  • This quotation judgment support device 1600 differs from the quotation judgment support device 100 according to the first embodiment in that the control unit 102 includes a comparison range specification unit 102l, a similarity calculation unit 102m, and a document quotation judgment unit 102n.
  • the similarity calculation unit 102m calculates the similarity between the past paper data of the students stored in the paper data storage unit 101c, in addition to the same function as that of the first embodiment.
  • the calculation of the degree of similarity uses a known method as in the first embodiment.
  • The document quotation judgment unit 102n determines that there is citation among a plurality of past paper data when the similarity calculated by the similarity calculation unit 102m is equal to or higher than a predetermined second threshold.
  • the second threshold can be arbitrarily set, and may be the same value as the above-described threshold or any different value.
  • The comparison range specifying unit 102l specifies, as the comparison range, the plurality of past article data determined by the document citation determining unit 102n to have a citation among them.
  • FIG. 17 is a flowchart showing the procedure of the comparison / determination process of the fourth embodiment.
  • The comparison range specification unit 102l first extracts two thesis data from all the thesis data submitted in the past and stored in the thesis data storage unit 101c (step S61).
  • the similarity calculation unit 102m calculates the similarity of the description content of the two extracted article data (step S62).
  • In calculating the degree of similarity, the description of a partial range in one of the two article data is first compared with the description content of the other article data. The comparison is then repeated while changing the partial range in the one article data, and the similarity of each part is calculated. The average or the like of these partial similarities can then be taken as the degree of similarity between the two article data as a whole.
  • the method of calculating the degree of similarity is not limited to this.
  • The comparison range specifying unit 102l determines whether the similarity calculation has been performed for all the past paper data (step S63); if it has not (step S63, No), the processes of steps S61 and S62 are repeated.
  • When the similarity calculation has been performed for all the past paper data (step S63, Yes), the document quoting determination unit 102n determines, if there are a plurality of thesis data whose similarity is equal to or greater than the predetermined second threshold, that there are citations among those thesis data, and selects them (step S64). Then, the comparison range specifying unit 102l specifies the selected plurality of thesis data as the comparison range (step S65). The past paper data that cite one another thus become the comparison range, and the citation judgment of the paper data to be judged is performed against it.
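  • Steps S61 to S65 can be sketched as follows. The per-part score used here (length of the longest common run found with `difflib.SequenceMatcher`, divided by the part length) is only a stand-in for the unspecified "known method"; the part length, the default second threshold of 0.8, and the function names are assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pair_similarity(a, b, part_len=20):
    """Step S62 (sketch): compare successive partial ranges of `a`
    against the whole of `b` and average the per-part scores."""
    if not a:
        return 0.0
    scores = []
    for i in range(0, len(a), part_len):
        part = a[i:i + part_len]
        matcher = SequenceMatcher(None, part, b, autojunk=False)
        match = matcher.find_longest_match(0, len(part), 0, len(b))
        scores.append(match.size / len(part))
    return sum(scores) / len(scores)


def select_mutually_citing(past_theses, second_threshold=0.8):
    """Steps S61 to S65: ids of past theses that cite one another.

    past_theses: dict mapping thesis id -> text
    """
    selected = set()
    for (id_a, a), (id_b, b) in combinations(past_theses.items(), 2):
        if pair_similarity(a, b) >= second_threshold:   # step S64
            selected.update((id_a, id_b))
    return selected                                     # comparison range
```

  • Note that this score is not symmetric (it measures how much of `a` reappears in `b`); a symmetric variant could take the larger of the two directions.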
  • As described above, in the present embodiment, the citation determination support apparatus 1600 performs the citation determination on the thesis data of the determination target using, as the comparison range, the past thesis data among which citation exists.
  • By limiting the comparison range in this way, it is possible to improve the accuracy of the judgment while preventing an increase in development man-hours and manufacturing cost, and, because the comparison is limited to thesis data with a high probability of illegal citation, the determination processing load can be reduced and the determination efficiency improved.
  • It may also be configured so that the legality determination unit 102e further determines whether the citation among the past paper data is legitimate, and the past paper data are specified as the comparison range only when the citation is not legitimate.
  • The fifth embodiment automatically extracts a judgment target using the task sentence of a paper as a keyword.
  • Except where specifically noted, the configuration and processing of the fifth embodiment are the same as those of the first embodiment, and description of the common configuration and processing is omitted.
  • FIG. 18 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the fifth embodiment.
  • the quote determination support device 1800 differs from the quote determination support device 100 according to the first embodiment in that the control unit 102 includes the task extraction unit 102 p and the comparison range specification unit 102 q.
  • the task extraction unit 102p analyzes the structure of the article data as the determination target, and extracts the task sentence of the article from the description content of the article data. Specifically, the task extraction unit 102p identifies and extracts task sentences based on the heading, structure, and the like of the paper obtained as a result of structural analysis.
  • The comparison range specifying unit 102q searches for the corresponding web pages on the WEB site 131 or the file server 133 on the Internet 130 using the task sentence extracted by the task extracting unit 102p as a search key, and specifies the document data identified by the URLs and the like output as the search result as the comparison range.
  • a known search engine or the like can be used for the search.
  • For example, the comparison range specifying unit 102q may be configured to transmit a search request command or the like specifying the search key to a search engine web site using a known search engine API (Application Programming Interface), and to receive the search result.
  • FIG. 19 is a flowchart showing the procedure of comparison range identification processing of the fifth embodiment.
  • the task extracting unit 102p analyzes the structure of the article data as the determination target to extract task sentences (step S81).
  • the comparison range specifying unit 102 q searches for a corresponding WEB page from the WEB site 131, the file server 133, and the like on the Internet 130 using the extracted task sentence as a search key (Step S82).
  • the comparison range specifying unit 102 q specifies the cited document data specified by the URL of the searched WEB page as the search result as the comparison range (step S 83).
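  • Steps S81 to S83 can be sketched as below. The source does not name a particular search engine API, so the search engine is passed in as a plain callable `search(key)` that returns a list of URLs; that interface, the assumption that the task sentence is a line beginning with "Task:", and the function names are all hypothetical.

```python
def extract_task_sentence(article_text, heading="Task:"):
    """Step S81 (sketch): return the task sentence, or None if absent.

    Assumes the task sentence is a line beginning with `heading`.
    """
    for line in article_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(heading):
            return stripped[len(heading):].strip()
    return None


def specify_comparison_range_by_task(article_text, search):
    """Steps S82 and S83: search with the task sentence as the key and
    return the URLs of the found pages as the comparison range."""
    task = extract_task_sentence(article_text)
    if task is None:
        return []
    urls, seen = [], set()
    for url in search(task):       # `search` stands in for the engine
        if url not in seen:        # drop duplicate hits
            seen.add(url)
            urls.append(url)
    return urls
```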
  • As described above, in the present embodiment, the comparison range of cited documents is determined based on the task sentence in the thesis data, so that the accuracy of the determination can be further improved while preventing an increase in development man-hours and manufacturing cost.
  • The sixth embodiment includes logic for handling the case where the number of characters to be compared in the article exceeds the number of characters accepted by the search logic.
  • Except where specifically noted, the configuration and processing of the sixth embodiment are the same as those of the first embodiment, and description of the common configuration and processing is omitted.
  • FIG. 20 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the sixth embodiment.
  • the quote determination support device 2000 differs from the quote determination support device 100 according to the first embodiment in that the control unit 102 includes the similarity calculation unit 102r.
  • The similarity calculation unit 102r uses a well-known search technology (a search engine or the like) to search the comparison range specified by the comparison range specification unit 102b, using the description content of the determination range specified by the determination range specification unit 102a as a search key. At this time, if the number of characters of the search key exceeds a predetermined limited number of characters (for example, 32 characters), the search engine or the like returns an error notification that contains the limited number of characters and states that the search key exceeds it.
  • In that case, the similarity calculation unit 102r designates the character string within the limited number of characters from the head of the determination range as the search key, searches the comparison range, and stores the search result in a memory or the like. The similarity calculation unit 102r then searches the comparison range in the same way, using the character string for the next limited number of characters in the determination range as the search key.
  • In this way, the similarity calculation unit 102r sequentially designates search keys while shifting through the character string of the description content of the determination range by the limited number of characters, performs a plurality of searches, and stores the search results. The similarity calculation unit 102r then takes the search result with the highest appearance frequency among the plurality of search results as the comparison range to be the target of similarity calculation, and calculates its similarity with the determination range.
  • search results having a predetermined number or more of appearance frequencies may be configured as targets for similarity calculation.
  • FIG. 21 is a flow chart showing the procedure of search processing in similarity calculation in the sixth embodiment.
  • The similarity calculation unit 102r searches the data in the comparison range using the description content of the determination range as the search key (step S91). Then, the similarity calculation unit 102r determines whether an error notification indicating that the search key exceeds the limited number of characters has been received (step S92).
  • If no error notification is received (step S92, No), the similarity calculation unit 102r selects the search result (step S100), takes the comparison range in that search result as the target of similarity calculation, and calculates the similarity with the determination range as in the first embodiment.
  • If an error notification indicating that the search key exceeds the limited number of characters is received in step S92 (step S92, Yes), the similarity calculation unit 102r acquires the limited number of characters from the received error notification (step S93).
  • the similarity calculation unit 102r designates a character string within the limited number of characters as a search key from the head of the determination range (step S94), and searches data in the comparison range with this search key (step S95).
  • the similarity calculation unit 102r stores the search result in the memory (step S96).
  • The similarity calculation unit 102r determines whether the final character string of the determination range has been reached as the search key (step S97); if it has not (step S97, No), the character string for the next limited number of characters in the determination range is designated as the search key (step S98), and the processes of steps S95 and S96 are repeated.
  • For example, when the limited number of characters is 32, the character string from the 1st character to the 32nd character is used as the search key the first time, the character string from the 33rd character to the 64th character is used the second time, and the following character strings are designated as search keys in the same manner.
  • When the final character string of the determination range has been reached as the search key in step S97 (step S97, Yes), the similarity calculation unit 102r selects the search result with the highest appearance frequency from the search results stored in the memory (step S99), takes the selected comparison range as the target of similarity calculation, and calculates the similarity with the determination range.
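  • The search procedure of steps S91 to S100 can be sketched as follows. The search engine is modeled as a callable that raises a hypothetical `SearchKeyTooLong` exception carrying the limited number of characters, standing in for the error notification; `make_search` is a toy substring search used only to exercise the sketch.

```python
from collections import Counter


class SearchKeyTooLong(Exception):
    """Stand-in for the error notification; carries the character limit."""

    def __init__(self, limit):
        super().__init__(f"search key exceeds {limit} characters")
        self.limit = limit


def make_search(docs, limit=32):
    """Toy search engine over docs (id -> text) with a key-length limit."""
    def search(key):
        if len(key) > limit:
            raise SearchKeyTooLong(limit)
        return [doc_id for doc_id, text in docs.items() if key in text]
    return search


def chunk_keys(text, limit):
    """Steps S94 and S98: successive keys of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def find_comparison_target(determination_range, search):
    try:
        results = search(determination_range)             # step S91
    except SearchKeyTooLong as err:                       # step S92, Yes
        stored = []
        for key in chunk_keys(determination_range, err.limit):  # S93, S94, S98
            stored.extend(search(key))                    # steps S95, S96
        if not stored:
            return None
        return Counter(stored).most_common(1)[0][0]       # step S99
    return results[0] if results else None                # step S100
```

  • With a 70-character determination range and a limit of 32, three searches are issued (characters 1 to 32, 33 to 64, and 65 to 70), and the document returned most often across them becomes the target of similarity calculation.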
  • Thus, when the search key exceeds the limited number of characters, a character string of the limited number of characters within the determination range is specified as the search key, and the search is performed multiple times while shifting that character string through the determination range. The accuracy of the quotation determination can therefore be improved regardless of the character limit on the search key.
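  • The repeated search of steps S93 to S99 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the apparatus: the `search` callable stands in for whatever search engine is used, the names `split_into_keys` and `pick_comparison_range` are hypothetical, and each search is assumed to return identifiers of matching comparison ranges, counted once per search key.

```python
from collections import Counter
from typing import Callable, Iterable, List, Optional


def split_into_keys(text: str, limit: int) -> List[str]:
    """Cut the determination range into consecutive keys of at most `limit` characters."""
    if limit <= 0:
        raise ValueError("limit must be a positive number of characters")
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def pick_comparison_range(
    text: str,
    limit: int,
    search: Callable[[str], Iterable[str]],  # hypothetical stand-in for the search engine
) -> Optional[str]:
    """Search once per key and return the identifier that was hit most often."""
    hits: Counter = Counter()
    for key in split_into_keys(text, limit):   # steps S94 and S98: next key within the limit
        hits.update(set(search(key)))          # steps S95 and S96: search and remember results
    if not hits:
        return None                            # no search returned anything
    return hits.most_common(1)[0][0]           # step S99: most frequently appearing result
```

  • With a limit of 32, `split_into_keys` yields the 1st to 32nd characters, then the 33rd to 64th, and so on. How ties between equally frequent results should be broken is not stated in the description, so the sketch simply keeps the one encountered first.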
  • In this embodiment, the legality determination means determines whether the determination range conforms to a predetermined citation format and, based on the result, determines whether the citation of the comparison range in the determination range is a legitimate citation.
  • The configuration and processing of the seventh embodiment are the same as those of the first embodiment except where specifically described, and descriptions of the configuration and processing shared with the first embodiment are omitted.
  • FIG. 24 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the seventh embodiment.
  • The citation determination support apparatus 100 includes a citation format setting unit 102s in the control unit 102 and a citation format database 101f (hereinafter, "database" is abbreviated as "DB") in the storage unit 101.
  • the citation format setting unit 102 s is a citation format setting unit that sets a citation format as a reference when judging the legitimacy of citation.
  • the citation format DB 101 f is a citation format storage unit that associates and stores a type for classifying article data and a predetermined citation format.
  • FIG. 25 is a table showing information stored in the citation format DB 101 f.
  • The citation format DB 101f includes "type", "citation format", and "legal document storage location" as data items, and the information corresponding to these items is stored in association with each other.
  • The information stored for the item "type" specifies the type of article data; as exemplified in FIG. 25, a field corresponding to the theme of the article, such as "law" or "engineering", can be stored.
  • The information stored for the item "citation format" specifies a legal citation format; as illustrated in FIG. 25, "" "," “", etc. can be stored.
  • The information stored for the item "legal document storage location" specifies where documents regarded as legal are stored; as shown in FIG. 25, for example, "Z: ⁇ quotaion ⁇ law ⁇" or "Z: ⁇ quotaion ⁇ eng ⁇" can be stored as the name of the folder in which the documents are stored.
  • A "document regarded as legal" is, for example, a document whose citation is itself regarded as legal.
  • The method and timing of storing information in the citation format DB 101f are arbitrary; for example, the information can be stored in advance via the input device 103, or stored during the citation format setting process described later.
  • FIG. 26 is a flowchart showing the procedure of the quotation determination support process of the seventh embodiment.
  • The processes of steps SA1 to SA13, except steps SA2 and SA9, are the same as the processes of steps S11 to S20a described in the first embodiment, and a detailed description is omitted.
  • After reading the article data (step SA1), the citation format setting unit 102s executes the citation format setting process (step SA2).
  • the citation format setting process is a process for setting a citation format, which is a standard when judging the legitimacy of citation in the article data.
  • FIG. 27 is a flowchart of the citation format setting process.
  • FIG. 28 is a diagram exemplifying a citation format setting input screen.
  • The citation format setting input screen displays, for example, a "type" menu for selecting the type of article data, a "citation format" box for inputting a legal citation format, a "legal document storage location" box for specifying the storage location of documents regarded as legal, a confirmation button for confirming the content entered on the citation format setting input screen, and an end button for ending citation format setting.
  • When an instruction to end the citation format setting process is given by pressing the end button via the input device 103 (step SB2, Yes), the citation format setting unit 102s ends the citation format setting process and returns to the main routine.
  • Otherwise, the citation format setting unit 102s waits until the type of article data (for example, "law" or "engineering") is selected from the "type" menu via the input device 103 (step SB3, No); when a type is selected (step SB3, Yes), it temporarily stores the selected type in RAM or the like (step SB4).
  • The citation format setting unit 102s then waits until an instruction to confirm the input content is given by pressing the confirmation button via the input device 103 (step SB5, No). When the instruction is received (step SB5, Yes), it acquires the citation format currently entered in the "citation format" box (for example, "" “or” “”) and the storage location specified in the "legal document storage location" box (for example, "Z: ⁇ quotaion ⁇ law ⁇"), and stores them in the citation format DB 101f in association with the type temporarily stored in RAM or the like in step SB4 (step SB6). Thereafter, the process returns to step SB2, and it is determined whether an end instruction has been issued.
  • When the similarity calculated by the similarity calculation unit 102c in step SA7 is equal to or higher than the predetermined threshold in step SA8 (step SA8, Yes), it is judged that the determination range cites the document data or the like of the comparison range, and the legality determination unit 102e executes a legality determination process for determining whether the citation is a legal citation (step SA9).
  • FIG. 29 is a flow chart showing the procedure of legality determination processing.
  • the legality determination unit 102e specifies the type of the article data to be determined (step SC1).
  • the type input screen (not shown) can be output and displayed on the display device 104, and the input of the type of article data to be determined can be received via the input device 103.
  • The citation format DB 101f is then referred to based on the type specified in step SC1, and the legal citation format corresponding to that type and the storage location of documents regarded as legal are acquired from the citation format DB 101f (step SC2).
  • Next, it is determined whether each citation judged in step SA8 to cite the document data or the like of the comparison range conforms to the legal citation format acquired in step SC2 (step SC3).
  • For example, a citation is judged to conform to the legal citation format if the legal citation form “” ” is used before or after the cited part, if a reference number pointing to reference information that indicates the citation source is attached to the cited part itself or immediately after it, or if the cited part is quoted from a document stored in the storage location of documents regarded as legal.
  • When the citation is determined not to conform to the legal citation format (step SC3, No), the legality determination unit 102e causes the display device 104 to indicate that the cited part is inappropriate (step SC4). For example, on the quotation determination screen shown in FIG. 7, the cited part is displayed in black-and-white reverse video.
  • When the citation is determined to conform to the legal citation format (step SC3, Yes), or after the process of step SC4, the legality determination unit 102e determines whether the legality determination has been made for all parts judged to cite the document data or the like of the comparison range (step SC5).
  • When the legality determination has not been made for all of the cited parts (step SC5, No), the legality determination unit 102e determines, for the remaining cited parts, whether the citation conforms to the legal citation format (step SC3). When the legality determination has been made for all of the cited parts (step SC5, Yes), the legality determination unit 102e ends the legality determination process and returns to the main routine.
  • Thus, since the citation format corresponding to the type of the article data is acquired from the citation format DB 101f and it is determined whether the citation matches the acquired citation format, the legitimacy of a citation can be determined on the basis of a citation format that differs for each type of article data.
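  • The check of step SC3 can be illustrated with a short Python sketch. Everything concrete here is an assumption made for the example: the quotation marks, the `[n]` notation for reference numbers, the folder names `legal/law` and `legal/eng`, and the names `CitationFormat`, `FORMAT_DB`, and `is_legal_citation` are hypothetical and do not come from FIG. 25. The sketch also requires the marks on both sides of the cited part, which is only one possible reading of the criterion.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class CitationFormat:
    open_mark: str             # mark expected just before the cited part (assumed)
    close_mark: str            # mark expected just after the cited part (assumed)
    legal_dir: PurePosixPath   # folder holding documents regarded as legal (assumed)


# Hypothetical counterpart of the citation format DB: type -> format record.
FORMAT_DB = {
    "law": CitationFormat('"', '"', PurePosixPath("legal/law")),
    "engineering": CitationFormat("'", "'", PurePosixPath("legal/eng")),
}

# Assumed notation for a reference number placed right after the cited part.
REF_NUMBER = re.compile(r"\s*\[\d+\]")


def is_legal_citation(
    text: str,
    start: int,
    end: int,
    fmt: CitationFormat,
    source: Optional[PurePosixPath] = None,
) -> bool:
    """Return True if text[start:end] meets any of the three example criteria."""
    before, after = text[:start], text[end:]
    if before.endswith(fmt.open_mark) and after.startswith(fmt.close_mark):
        return True                                  # enclosed in the citation marks
    if REF_NUMBER.match(after):
        return True                                  # reference number follows directly
    if source is not None and fmt.legal_dir in source.parents:
        return True                                  # quoted from a "legal" document
    return False
```

  • A caller would look up the record by the type chosen in step SC1, for example `FORMAT_DB["law"]`, and run the check once per cited part until step SC5 reports that all parts have been examined.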
  • This embodiment calculates the citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range.
  • The configuration and processing of the eighth embodiment are the same as those of the first embodiment except where specifically described, and descriptions of the configuration and processing shared with the first embodiment are omitted.
  • FIG. 30 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the eighth embodiment.
  • the citation determination support apparatus 100 includes a citation ratio calculation unit 102t in the control unit 102 and a citation ratio DB 101g in the storage unit 101.
  • the citation ratio calculation unit 102t is a citation ratio calculation unit that calculates the citation ratio occupied by the description content cited from the comparison range among the description content of the determination range.
  • the quotation ratio DB 101 g is a quotation ratio storage unit that stores determination target data information that uniquely identifies determination target data and the quotation ratio calculated by the quotation ratio calculation unit 102 t in association with each other.
  • FIG. 31 is a table exemplifying information stored in the citation ratio DB 101 g.
  • the citation ratio DB 101 g includes “thesis data information”, “document data information”, and “citation ratio” as data items, and information corresponding to these is stored in association with each other.
  • The information stored for the item "thesis data information" is determination target data information that uniquely identifies the article data to be determined; as shown in FIG. 31, for example, an identification number containing the student register number of the author and the date the article was created is stored.
  • The information stored for the item "document data information" is document data information that uniquely identifies the document data that is the citation source; as shown in FIG. 31, for example, the document information of the document data is stored.
  • The information stored for the item "citation ratio" specifies the citation ratio calculated by the citation ratio calculation unit 102t; as shown in FIG. 31, for example, the individual citation ratio of each document data in the article data and the total of those individual citation ratios are stored as percentages. The specific content of the citation ratio will be described later.
  • FIG. 32 is a flowchart showing the procedure of the quotation determination support process of the eighth embodiment.
  • the processes in steps SD1 to SD11 are the same as the processes in steps S11 to S20a described with reference to FIG. 2 in the first embodiment, and thus detailed description will be omitted.
  • When it is determined in step SD11 that the processing of steps SD5 to SD10 has been completed for all the data in the specified comparison range (step SD11, Yes), the citation ratio calculation unit 102t calculates the citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range (step SD12).
  • The specific content of the citation ratio is arbitrary; for example, the percentage of the number of characters of the cited part relative to the number of characters of the determination range is calculated as the citation ratio.
  • The output control unit 102h causes the display device 104 to output and display the citation ratio calculated by the citation ratio calculation unit 102t, and stores the calculated citation ratio in the citation ratio DB 101g in association with the article data information specifying the article data to be determined (step SD13). When there are citations from a plurality of document data, the individual citation ratio of the content cited from each document data and the total of those individual citation ratios are calculated and stored in the citation ratio DB 101g.
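  • As a minimal sketch of the character-count example in step SD12, the following Python function computes an individual citation ratio per document and their total. The function name `citation_ratios` and the representation of cited parts as `(start, end)` character offsets are assumptions made for illustration; the sketch also assumes that cited parts do not overlap.

```python
from typing import Dict, List, Tuple


def citation_ratios(
    range_length: int,
    cited_spans: Dict[str, List[Tuple[int, int]]],
) -> Tuple[Dict[str, float], float]:
    """Return (individual ratio per document, total ratio), both in percent.

    range_length: number of characters in the determination range.
    cited_spans:  document id -> list of (start, end) offsets of cited parts.
    """
    if range_length <= 0:
        raise ValueError("the determination range must contain characters")
    individual = {
        doc: 100.0 * sum(end - start for start, end in spans) / range_length
        for doc, spans in cited_spans.items()
    }
    # The total is the sum of the individual ratios, as stored in the ratio DB.
    return individual, sum(individual.values())
```

  • For a 200-character determination range with 50 characters cited from one document and 10 from another, this gives individual ratios of 25% and 5% and a total of 30%.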
  • FIG. 33 is a diagram exemplifying a quoting determination screen when the quoting ratio is displayed.
  • the citation ratio calculated as a percentage of the number of characters of the quoted portion to the number of characters of the determination range is displayed in the upper right portion of the citation determination screen.
  • When the description content of the determination range is cited from a plurality of document data, the total of the citation ratios from all document data may be displayed together with the individual citation ratio from each document data, as shown in FIG. 33, or only the total may be displayed.
  • This list display process is a process of outputting article data information in the order based on the citation ratio of each article data.
  • FIG. 34 is a flowchart showing the procedure of the list display process.
  • the execution timing of the list display process is arbitrary, and is started, for example, when an instruction to execute the list display process is input via the input device 103.
  • FIG. 35 is a diagram showing a determination result screen displaying a list of article data information in descending order of the total value of the citation ratio. As shown in FIG. 35, thesis data information is displayed on the screen in descending order of the citation ratio. At this time, an individual citation ratio for each document data may be displayed together for each article data information.
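  • The ordering used for this list can be sketched in a few lines of Python. The in-memory dictionary standing in for the citation ratio DB, the identifiers in the usage note, and the name `list_by_citation_ratio` are hypothetical.

```python
from typing import Dict, List, Tuple


def list_by_citation_ratio(
    ratio_db: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float]]:
    """Return (thesis id, total citation ratio) rows, largest total first.

    ratio_db: thesis id -> {document id: individual citation ratio in percent}.
    """
    totals = [
        (thesis_id, sum(per_document.values(), 0.0))
        for thesis_id, per_document in ratio_db.items()
    ]
    return sorted(totals, key=lambda row: row[1], reverse=True)
```

  • Called with `{"T-001": {"A": 10.0, "B": 5.0}, "T-002": {"A": 40.0}}`, the function lists `T-002` before `T-001`; the individual ratios remain available in `ratio_db` if they are to be shown alongside each row.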
  • Thus, the quotation determination support apparatus 100 of the eighth embodiment calculates and outputs the proportion of the description content of the determination range that is cited from the comparison range, and can therefore present material for judging the legitimacy of the citation.
  • In addition, citation ratios are calculated for a plurality of article data and the article data information is output in an order based on the citation ratio of each article data, so material can be presented for comparing the legitimacy of citations across a plurality of article data on the basis of their citation ratios.
  • A ninth embodiment will now be described.
  • This embodiment determines whether citation source information specifying the document data that is the citation source of a cited part is included in the determination target data.
  • The configuration and processing of the ninth embodiment are the same as those of the first embodiment except where specifically described, and descriptions of the configuration and processing shared with the first embodiment are omitted.
  • FIG. 36 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the ninth embodiment.
  • the quotation determination support apparatus 100 includes an output mode DB 101 h in the storage unit 101.
  • the output mode DB 101 h is an output mode information storage unit that stores the degree of similarity of the determination range and the output mode by the display device 104 in association with each other.
  • FIG. 37 is a table exemplifying information stored in the output mode DB 101 h. As shown in FIG. 37, the output mode DB 101 h includes “similarity S [%]” and “output mode” as data items, and information corresponding to these is stored in association with each other.
  • The information stored for the item "Similarity S [%]" specifies the similarity of the determination range; the ranges of similarity that serve as references for the quotation determination are stored.
  • The information stored for the item "output mode" specifies the output mode used by the display device 104; the mode to be output according to the degree of similarity is stored.
  • In the illustrated example, when the similarity is less than 20%, the character output mode is "normal" because the possibility of citation is considered low; when the similarity is 20% or more and less than 80%, the output mode is "bold" because citation is possible; and when the similarity is 80% or more, the output mode is "reverse" because the possibility of citation is high.
  • the item “output mode” may store color information specifying a character color or a background color of a character, font information specifying a font of a character, and the like.
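  • The lookup from similarity to output mode can be sketched as a small table search. The 20% and 80% boundaries and the three mode names follow the example above; the names `BOUNDS`, `MODES`, and `output_mode`, and the use of Python's `bisect` module, are choices made for this illustration only.

```python
from bisect import bisect_right

# Lower bounds (in percent) at which the output mode changes, per the example above.
BOUNDS = [20.0, 80.0]
# MODES[i] applies to similarities below BOUNDS[i]; the last entry applies from 80% upward.
MODES = ["normal", "bold", "reverse"]


def output_mode(similarity: float) -> str:
    """Map a similarity in percent (0 to 100) to the output mode for the display."""
    if not 0.0 <= similarity <= 100.0:
        raise ValueError("similarity must be between 0 and 100 percent")
    # bisect_right counts how many bounds are <= similarity, so a value that is
    # exactly 20 or 80 falls into the higher band ("20% or more", "80% or more").
    return MODES[bisect_right(BOUNDS, similarity)]
```

  • Storing colors or fonts instead of mode names would only change the contents of `MODES`; the lookup itself stays the same.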
  • The method and timing of storing information in the output mode DB 101h are arbitrary; for example, the information can be stored in advance via the input device 103.
  • FIG. 38 is a flowchart of the quotation determination support process of the ninth embodiment.
  • Steps SF1, SF5, SF8, SF9, SF12, and SF15 to SF17 are the same as steps S11, S14, S15, S16, S18, and S19 to S20a, respectively, described with reference to FIG. 2 in the first embodiment, and thus a detailed description is omitted.
  • FIG. 39 is a diagram showing article data displayed on the citation determination screen on the display device 104.
  • the output control unit 102h causes the paper data display area 105, the range setting slider 106, the entire view 107, and the document data display area 108 to be displayed on the quotation determination screen.
  • the paper data display area 105 is an area for displaying paper data to be judged.
  • the range setting slider 106 sets a determination range in the article data to be determined, and an area sandwiched between the upper range setting slider 106 a and the lower range setting slider 106 b is set as the determination range.
  • the whole view 107 is an area for displaying the display range of the paper data display area 105, the judgment range, and the approximate position of the cited part in the whole of the paper data to be judged.
  • the document data display area 108 is an area for displaying the content of the cited document data. As shown in FIG. 39, in step SF2, the contents of the article data are displayed in the article data display area 105, and the display range of the article data display area 105 is displayed in the entire view 107 as a rectangular frame.
  • The determination range specifying unit 102a determines whether an instruction specifying the determination range has been input through the input device 103 (step SF3); when it determines that such an instruction has been input (step SF3, Yes), it specifies the instructed range as the determination range (step SF4).
  • the range between the upper and lower range setting sliders 106 among the article data to be determined is specified as the determination range.
  • the output control unit 102h causes the region outside the determination range to be displayed by hatching in the entire view 107.
  • The document reference determination unit 102d determines whether a similarity threshold has been input through the input device 103 (step SF6); when it determines that a threshold has been input (step SF6, Yes), it sets the input value as the similarity threshold for the reference determination (step SF7).
  • The method of inputting the threshold is arbitrary; for example, a threshold input box can be displayed on the quotation determination screen (illustration omitted), and the numerical value entered in the input box via the input device 103 can be set as the threshold.
  • a threshold setting slider may be displayed on the quotation determination screen (not shown), and a value corresponding to the position of the setting slider whose position has been changed via the input device 103 may be set as the threshold.
  • The output control unit 102h acquires the output mode corresponding to the calculated similarity from the output mode DB 101h (step SF10), and outputs and displays the determination range on the display device 104 in the acquired output mode (step SF11).
  • The characters in the article data display area 105 are displayed in the output mode acquired from the output mode DB 101h illustrated in FIG. 37 for the calculated similarity.
  • a portion with a similarity of less than 20% is displayed normally, a portion with a similarity of 20% or more and less than 80% is displayed in bold, and a portion with a similarity of 80% or more is displayed in reverse.
  • a portion having a similarity of 20% or more is hatched by cross lines in the entire view 107. This makes it possible for the user to roughly grasp the range occupied by the part that can be cited in the entire article data.
  • When the legality determination unit 102e determines that the citation is not a legal citation, it is determined whether citation source information specifying the document data that is the citation source of the cited part is included in the article data to be determined (step SF13).
  • the specific content of the citation source information is optional. For example, information such as the author name of the citation source document, year of publication, title, published magazine, number of volumes, location page, etc. can be used as the citation source information.
  • The criterion for determining whether citation source information is included is also arbitrary; for example, the determination can be based on whether citation source information is given immediately after the cited part, or on whether citation source information corresponding to a note number given immediately after the cited part appears at the end of the article data.
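  • The two example criteria can be sketched in Python as follows. The notations are assumptions chosen for the illustration, not formats prescribed by the description: an inline source is taken to be a parenthesized note containing a four-digit year, a note number is written `[n]`, and the matching entry is a later line beginning with the same `[n]`. The name `find_citation_source` is hypothetical.

```python
import re
from typing import Optional

# Assumed inline form: "(... 2001 ...)" directly after the cited part.
INLINE_SOURCE = re.compile(r"\s*\(([^()]*\d{4}[^()]*)\)")
# Assumed note-number form: "[3]" directly after the cited part.
NOTE_NUMBER = re.compile(r"\s*\[(\d+)\]")


def find_citation_source(article: str, cited_end: int) -> Optional[str]:
    """Return the citation source text for the part ending at cited_end, if any."""
    tail = article[cited_end:]

    inline = INLINE_SOURCE.match(tail)
    if inline:
        return inline.group(1).strip()        # source given right after the cited part

    note = NOTE_NUMBER.match(tail)
    if note:
        # Look for a later line that starts with the same note number.
        rest = article[cited_end + note.end():]
        entry = re.search(
            r"^\[" + note.group(1) + r"\]\s*(.+)$", rest, flags=re.MULTILINE
        )
        if entry:
            return entry.group(1).strip()     # source listed at the end of the article

    return None                               # no citation source information found
```

  • A return value of `None` corresponds to the "No" branch of step SF13; any other value is the text that step SF14 would highlight.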
  • When citation source information is included (step SF13, Yes), the output control unit 102h causes the display device 104 to output and display the citation source information (step SF14).
  • The method and procedure for displaying the citation source information on the display device 104 are arbitrary; for example, when an instruction to display the citation source information for a cited part determined not to be a legal citation is input via the input device 103, the citation source information corresponding to that cited part is displayed.
  • FIG. 40 is a diagram showing a quotation determination screen on which quotation source information is displayed.
  • The citation source information corresponding to the designated cited part, located at the bottom of the displayed paper data (the “ ⁇ , ⁇ ⁇ , “ ⁇ ”, ⁇ magazine, Volume O, ⁇ page ⁇ ⁇ page” portion), is highlighted (shown in reverse video in FIG. 40).
  • After the citation source information is displayed in this way (step SF14), or when it is determined in step SF13 that citation source information is not included in the article data (step SF13, No), the reference information acquisition unit 102f acquires the reference information (step SF15).
  • Thus, since the range designated via the input device 103 within the article data is specified as the determination range, the targets of the quotation determination can be limited and the load of the determination process reduced.
  • Since it is determined that the determination range cites the comparison range when the similarity is equal to or greater than the predetermined threshold input through the input device 103, an optimum threshold can be set according to the purpose of the determination, and the determination can be performed based on that threshold.
  • Since the output mode corresponding to the similarity calculated by the similarity calculation unit 102c is acquired from the output mode DB 101h and the determination range is output in the acquired output mode, the determination range can be output in a mode in which the user can easily grasp the similarity.
  • FIG. 41 is a flowchart of the quotation determination support process of the tenth embodiment.
  • The processes of steps SG1 to SG12, excluding step SG11, are the same as the processes of steps S11 to S20a described with reference to FIG. 2 in the first embodiment, and thus a detailed description is omitted.
  • After the reference information is displayed with the document data or the like explicitly indicated as cited in the determination range (step SG10), the comparison range specification unit 102b stores the cited document data in the document data storage unit 101a, and stores the bibliographic information of that document data (for example, author name, publication year, title, publication magazine, URL, etc.) and its storage location (for example, folder name) in association with each other in the document list storage unit 101b (step SG11). Thereafter, it is determined whether the processing from step SG5 to step SG11 has been executed for all the data in the specified comparison range (step SG12).
  • FIG. 42 is a flowchart showing the procedure of the comparison and determination process of the tenth embodiment.
  • the processes in steps SH5 to SH7 are the same as the processes in steps S21 to S23 described with reference to FIG. 6 in the first embodiment, and thus detailed description will be omitted.
  • Based on the result of the structural analysis of the article data performed in step SG2 of the quotation determination support process, the comparison range specification unit 102b determines whether citation source information identifying the document data cited in the article data is included in the article data (step SH1).
  • When citation source information is included in the article data (step SH1, Yes), the comparison range specification unit 102b refers to the document list storage unit 101b and determines whether bibliographic information corresponding to the citation source information is stored there (step SH2). When the bibliographic information is stored in the document list storage unit 101b (step SH2, Yes), the comparison range specification unit 102b reads, from the document data storage unit 101a, the document data stored at the storage location associated with the bibliographic information (step SH3), and specifies the read document data as the comparison range (step SH4).
  • When it is determined in step SH1 that citation source information is not included in the article data (step SH1, No), or in step SH2 that bibliographic information corresponding to the citation source information is not stored in the document list storage unit 101b (step SH2, No), the comparison range specification unit 102b reads all the previously submitted thesis data stored in the thesis data storage unit 101c (step SH5).
  • After the process of step SH4, the comparison range specifying unit 102b ends the comparison range specifying process and returns to the main routine.
  • Thus, the document data storage unit 101a stores the document data determined by the document quotation determination unit 102d to be cited in the determination range.
  • When citation source information that specifies document data is included in the article data and the specified document data is stored in the document data storage unit 101a, that document data is specified as the comparison range.
  • As a result, the comparison range can be limited to the document data already stored in the document data storage unit 101a, and the load of searching the data of the comparison range for the content of the determination range can be reduced.
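  • The branch structure of steps SH1 to SH5 can be summarized in a short Python sketch. The exact-match dictionary lookup is a simplification assumed for the example (matching citation source information against stored bibliographic information would in practice need normalization), and the names `specify_comparison_range`, `document_list`, and `load_document` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence


def specify_comparison_range(
    citation_source: Optional[str],
    document_list: Dict[str, str],          # bibliographic information -> storage location
    load_document: Callable[[str], str],    # reads document data from a storage location
    past_theses: Sequence[str],             # previously submitted thesis data
) -> List[str]:
    """Return the data to compare against the determination range."""
    if citation_source is not None:                     # step SH1: source info present?
        location = document_list.get(citation_source)   # step SH2: known document?
        if location is not None:
            return [load_document(location)]            # steps SH3 and SH4: stored copy only
    return list(past_theses)                            # step SH5: fall back to all theses
```

  • The early return is what limits the comparison range: when the cited document is already stored, only that one document is searched instead of every past thesis.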
  • The similarity calculation unit 102r described above performs its processing when the character string of the search key has exceeded the limited number of characters, but it can also be configured so that the character string designated as the search key never exceeds the limit in the first place.
  • For example, the determination range is analyzed by text mining processing using morphological analysis or the like and divided into words whose character strings are shorter than the limited number of characters; each word appearing a predetermined number of times or more is designated as a search key, and the comparison range is searched once for each such word.
  • The similarity calculation unit 102r may then be configured to select, from the results of the multiple searches, the comparison range of a search result whose appearance frequency is larger than a predetermined value as the comparison range whose similarity to the description content of the determination range is calculated.
  • FIG. 22 is a flowchart illustrating the procedure of the similarity calculation process of the first modification.
  • the similarity calculation unit 102r performs text mining processing such as morphological analysis on data of the description content of the determination range, and divides the data into words having the number of characters within the limited number of characters (step S111). Then, the similarity calculation unit 102r calculates the appearance frequency of each word (step S112), and sorts the words in descending order of appearance frequency (step S113). Then, the similarity calculation unit 102r designates a word with the highest appearance frequency as a search key (step S114).
  • the similarity calculation unit 102r searches the comparison range with the designated search key (step S115), and stores the search result in the memory (step S116).
  • The similarity calculation unit 102r determines whether the search process has been performed for all the words whose appearance frequency is a predetermined number or more (step S117). When it determines that the search process has not yet been performed for all such words (step S117, No), it designates the word with the next highest appearance frequency as the search key (step S118) and repeats the search process of steps S115 and S116.
  • When it is determined in step S117 that the search process has been completed for all the words whose appearance frequency is a predetermined number or more (step S117, Yes), the similarity calculation unit 102r selects, from the plurality of search results stored in the memory, the comparison range of the search result with the highest frequency of appearance (step S119). The selected comparison range thereby becomes the target of similarity calculation, and its similarity to the determination range is calculated.
  • As described above, in this modification a search result with a high appearance frequency is identified automatically and is automatically set as the comparison range used for the similarity calculation. As a result, a comparison range that matches the determination range can be extracted automatically for citation determination, and the accuracy of the citation determination can be further improved.
  • The similarity calculation unit 102r may be configured to perform the process of this modification only when an error notification indicating that the number of characters of the search key exceeds the limited number of characters is received from a search engine or the like.
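The search-key procedure of the first modification (steps S111 to S119) can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the function name `select_comparison_range`, the `search` callback, and the parameters `max_key_len` and `min_count` are hypothetical, and a simple regular-expression tokenizer stands in for the morphological analysis mentioned in the text.

```python
import re
from collections import Counter
from typing import Callable, Hashable, Iterable, Optional


def select_comparison_range(
    determination_text: str,
    search: Callable[[str], Iterable[Hashable]],
    max_key_len: int,
    min_count: int,
) -> Optional[Hashable]:
    """Pick the comparison range returned most often across word searches.

    `search` is an assumed callback that maps a search key to identifiers
    of matching comparison ranges (for example, document names or URLs).
    """
    # S111: split into words no longer than the limited number of characters.
    # Assumption: a regex tokenizer replaces real morphological analysis.
    words = [
        w for w in re.findall(r"\w+", determination_text.lower())
        if len(w) <= max_key_len
    ]

    # S112-S113: count appearances and order words by descending frequency.
    frequency = Counter(words)
    keys = [w for w, count in frequency.most_common() if count >= min_count]

    # S114-S118: search with each sufficiently frequent word in turn and
    # remember every result (the Counter plays the role of "the memory").
    hits: Counter = Counter()
    for key in keys:
        hits.update(search(key))

    # S119: select the search result that appeared most often.
    return hits.most_common(1)[0][0] if hits else None
```

The result identifies the range that would then be passed to the similarity calculation; when no word reaches `min_count` or no search returns a hit, the sketch returns `None`.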
  • When the comparison range specifying unit 102q receives an error notification indicating that the search key exceeds the limited number of characters, it may, like the similarity calculation unit 102r of the sixth embodiment, designate as the search key a character string of the limited number of characters taken from the extracted task sentence, and perform the search multiple times while shifting the portion of the task sentence used as the search key. The comparison range specifying unit 102q may then be configured to determine, as the comparison range, the cited document data specified by the URL with the highest appearance frequency among the plurality of URLs output as search results.
  • FIG. 23 is a flowchart illustrating the procedure of comparison range identification processing of the second modification.
  • First, the task extracting unit 102p analyzes the structure of the article data that is the determination target and extracts task sentences (step S131).
  • Next, the comparison range specifying unit 102q searches the WEB site 131 on the Internet 130, the file server 133, and the like for the corresponding WEB page, using the extracted task sentence as the search key (step S132).
  • The comparison range specifying unit 102q then determines whether an error notification indicating that the search key has exceeded the limited number of characters has been received (step S133).
  • If no such error notification has been received (No in step S133), the comparison range specifying unit 102q selects the URL of the search result (step S141) and, similar to the fifth aspect, specifies the cited reference data specified by the URL of the search result as the comparison range.
  • On the other hand, if an error notification indicating that the search key has exceeded the limited number of characters has been received (Yes in step S133), the comparison range specifying unit 102q acquires the limited number of characters from the received error notification (step S134).
  • The comparison range specifying unit 102q then designates, as the search key, the character string spanning the limited number of characters from the beginning of the task sentence (step S135), and searches for the WEB page using this search key (step S136).
  • The comparison range specifying unit 102q stores the URL obtained as the search result in the memory (step S137).
  • The comparison range specifying unit 102q then determines whether the search key has reached the final character string of the task sentence (step S138). If it has not (No in step S138), the unit designates the character string for the next limited number of characters in the task sentence as the search key (step S139) and repeats the processes of steps S136 and S137.
  • If the search key has reached the final character string of the task sentence (Yes in step S138), the comparison range specifying unit 102q selects, from the URLs of the search results stored in the memory, the URL with the highest appearance frequency (step S140), and specifies the cited reference data specified by the URL of the selected WEB page as the comparison range.
  • As described above, in this modification, when the search key exceeds the limited number of characters, a character string of the limited number of characters in the task sentence is designated as the search key, and the search is performed several times while shifting the portion of the task sentence used as the search key. It is therefore possible to specify an appropriate comparison range of cited reference data according to the content of the thesis, regardless of the limit on the number of characters of the search key, and the accuracy of the citation determination can be improved.
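The comparison range identification of the second modification (steps S132 to S141) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the actual implementation: `KeyTooLongError`, `specify_comparison_url`, and the `search` callback are hypothetical names, the error notification is modeled as an exception carrying the character limit, and "shifting" is assumed to mean moving forward by one limit-sized chunk at a time without overlap.

```python
from collections import Counter
from typing import Callable, List, Optional


class KeyTooLongError(Exception):
    """Hypothetical stand-in for the search engine's error notification."""

    def __init__(self, limit: int) -> None:
        super().__init__(f"search key exceeds {limit} characters")
        self.limit = limit


def specify_comparison_url(
    task_sentence: str,
    search: Callable[[str], List[str]],
) -> Optional[str]:
    """Return the URL whose page is to be used as the comparison range."""
    try:
        # S132: first try the whole task sentence as the search key.
        urls = search(task_sentence)
        # S141: no error was raised, so take the URL of the search result.
        return urls[0] if urls else None
    except KeyTooLongError as err:
        # S133 (Yes) and S134: read the limit from the error notification.
        limit = err.limit

    if limit <= 0:
        raise ValueError("character limit must be positive")

    # S135-S139: search chunk by chunk, storing every URL that comes back.
    votes: Counter = Counter()
    for start in range(0, len(task_sentence), limit):
        key = task_sentence[start:start + limit]
        votes.update(search(key))  # S136-S137

    # S140: choose the URL that appeared most often across all searches.
    return votes.most_common(1)[0][0] if votes else None
```

Counting URLs across all chunk searches means a page that matches many parts of the task sentence outranks a page that matches only one fragment, which is the effect the modification describes.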
  • In the above embodiments, the legitimacy of a citation is determined for a portion whose similarity, as calculated by the similarity calculation unit 102c, is determined to be equal to or higher than a predetermined threshold, but the legitimacy determination is not limited to this. For example, if the article data to be judged contains a citation symbol (for example, "", "", etc.) corresponding to the type of the article data, the apparatus may be configured to determine that the citation in the article data is legitimate.
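One way to picture the symbol-based legitimacy check is the sketch below. It is only an illustration: the text does not say which citation symbols are used or how they map to article types, so the `QUOTE_PAIRS` table and the function name `is_marked_citation` are assumptions, as is the choice to require the symbols to enclose the matched portion directly.

```python
from typing import Sequence, Tuple

# Assumed citation symbol pairs; the actual symbols per article type
# are not specified in the text.
QUOTE_PAIRS: Tuple[Tuple[str, str], ...] = (
    ('"', '"'),
    ("\u201c", "\u201d"),  # curly double quotation marks
    ("\u300c", "\u300d"),  # corner brackets
)


def is_marked_citation(
    text: str,
    start: int,
    end: int,
    pairs: Sequence[Tuple[str, str]] = QUOTE_PAIRS,
) -> bool:
    """Return True if text[start:end] is enclosed by a citation symbol pair.

    `start` and `end` delimit a portion already judged to be similar to a
    comparison range (similarity at or above the threshold).
    """
    before = text[:start].rstrip()  # text leading up to the portion
    after = text[end:].lstrip()     # text following the portion
    return any(
        before.endswith(opening) and after.startswith(closing)
        for opening, closing in pairs
    )
```

A portion for which this returns `True` would be treated as a legitimate citation in this sketch, while an unmarked matching portion would remain a candidate for an inappropriate citation.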
  • The file name of the thesis data may be output and displayed on the display device 104 in an output mode based on the determination result. For example, the file name of article data whose citation is determined to be inappropriate may be displayed with black and white inverted so that it can be distinguished from the file name of article data whose citation is determined to be appropriate.
  • An automatic reference function may be incorporated in the determination range specifying units 102a, 102i, and 102j of the citation determination support apparatus according to each of the above embodiments, so that at startup the paper data storage unit 101c is referenced automatically, the user selects the desired paper data, and the selected paper data is read.
  • The comparison range specifying units 102b, 102l, and 102q of the citation determination support apparatus according to the above embodiments are not limited to using a single storage unit or WEB site as the comparison range; they may be configured to specify the comparison range across any combination of WEB sites, library search databases, and local servers.
  • Dissertation data created by a student has been described as the determination target data, but the present invention is not limited to this, and any data in which sentences are described can be used as the determination target data.
  • The output control unit 102h may be configured to output and display each cited portion on the display device 104 in an output mode (for example, a different color, font, etc.) that differs for each document data. In addition, each cited portion may be displayed in an output mode (for example, a different color, font, etc.) that differs according to the citation ratio from each document data.
  • The problems to be solved by the invention and the effects of the invention are not limited to those described above. The present invention may solve problems not described above or exhibit effects not described above, or it may solve only some of the described problems or exhibit only some of the described effects.
  • The citation determination support program executed by the citation determination support apparatus is provided as a file in an installable or executable format, recorded on a computer-readable recording medium such as a CD-ROM, a flexible disk (FD), a CD-R, or a DVD (Digital Versatile Disk).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a citation determination support device (100) comprising: a determination range specifying unit (102a) that specifies, within thesis data, a range to be determined for the presence or absence of citation of document data; a comparison range specifying unit (102b) that specifies, within the document data, a range to be compared with the thesis data; a similarity calculation unit (102c) that searches the specified comparison range for the content described in the specified determination range and calculates the similarity between the content described in the determination range and that of the comparison range; a document citation determination unit (102d) that determines that the determination range cites the comparison range when the calculated similarity is not lower than a predetermined threshold; and an output control unit (102h) that outputs, to a display device (104), the determination range of the thesis data that cites the comparison range of the document data.
PCT/JP2009/000360 2008-02-01 2009-01-30 Dispositif de support de détermination de cotation WO2009096190A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008023234 2008-02-01
JP2008-023234 2008-02-01

Publications (1)

Publication Number Publication Date
WO2009096190A1 true WO2009096190A1 (fr) 2009-08-06

Family

ID=40912544

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/000360 WO2009096190A1 (fr) 2008-02-01 2009-01-30 Dispositif de support de détermination de cotation

Country Status (2)

Country Link
JP (2) JP5510912B2 (fr)
WO (1) WO2009096190A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110383264A (zh) * 2016-12-16 2019-10-25 三菱电机株式会社 检索系统

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5207402B2 (ja) * 2009-09-30 2013-06-12 キヤノンマーケティングジャパン株式会社 情報処理装置、情報処理方法及びプログラム
KR101033611B1 (ko) * 2010-07-09 2011-05-11 한국과학기술정보연구원 참고 문헌 적합성 판정 시스템 및 방법
US9218344B2 (en) * 2012-06-29 2015-12-22 Thomson Reuters Global Resources Systems, methods, and software for processing, presenting, and recommending citations
JP5459422B2 (ja) * 2013-02-14 2014-04-02 キヤノンマーケティングジャパン株式会社 情報処理装置、制御方法及びプログラム
JP6052816B2 (ja) 2014-10-27 2016-12-27 インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation 電子著作物のコンテンツの二次利用を支援する方法、並びに、電子著作物のコンテンツの二次利用を支援する為のサーバ・コンピュータ、及びそのサーバ・コンピュータ用プログラム
US20170270625A1 (en) * 2016-03-21 2017-09-21 Facebook, Inc. Systems and methods for identifying matching content
JP6691581B2 (ja) * 2018-07-26 2020-04-28 楽天株式会社 情報処理装置、情報処理方法、プログラム、記憶媒体
JP6695538B1 (ja) * 2019-07-30 2020-05-20 株式会社ウェブサークル 類似文章検索装置およびプログラム
JP2022072383A (ja) * 2020-10-29 2022-05-17 株式会社Ipsign 侵害情報抽出システム、方法及びプログラム

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1021239A (ja) * 1996-06-28 1998-01-23 Toshiba Corp 機械翻訳装置及び翻訳処理方法

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08263512A (ja) * 1995-03-24 1996-10-11 Sumitomo Electric Ind Ltd 文書検索装置
JPH09198409A (ja) * 1996-01-19 1997-07-31 Hitachi Ltd 酷似文書抽出方法
JP3625054B2 (ja) * 2000-11-29 2005-03-02 松下電器産業株式会社 技術文書検索装置
JP2006155556A (ja) * 2004-10-27 2006-06-15 Hitachi Software Eng Co Ltd テキストマイニング方法及びテキストマイニングサーバ
US20070294610A1 (en) * 2006-06-02 2007-12-20 Ching Phillip W System and method for identifying similar portions in documents
JP2008015774A (ja) * 2006-07-05 2008-01-24 Nagaoka Univ Of Technology 模倣文書検出システム及びプログラム

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MURATA T. ET AL.: "Gakusei Report no n-gram ni yoru Ruijido Hyoka no Kento", FIT2002 FORUM ON INFORMATION TECHNOLOGY IPPAN KOEN RONBUNSHU, vol. 2, 13 September 2002 (2002-09-13), pages 101 - 102 *
SHURUTSU KISU: "System ni Shinshoku suru Aratana Kyoi, Spyware o Gekitai Seyo!", COMPUTERWORLD GET TECHNOLOGY RIGHT, vol. 2, no. 1, 1 January 2005 (2005-01-01), pages 86 - 91 *
TAKAHASHI I. ET AL.: "Web kara no Hyosetsu Report Kenshutsu Shuho no Jisso to Hyoka", DAI 46 KAI ADVANCED LEARNING SCIENCE AND TECHNOLOGY SHIRYO (SIG-ALST-A503), 13 March 2006 (2006-03-13), pages 01 - 06 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110383264A (zh) * 2016-12-16 2019-10-25 三菱电机株式会社 检索系统
CN110383264B (zh) * 2016-12-16 2022-12-30 三菱电机株式会社 检索系统

Also Published As

Publication number Publication date
JP5737772B2 (ja) 2015-06-17
JP2014149848A (ja) 2014-08-21
JP2009205674A (ja) 2009-09-10
JP5510912B2 (ja) 2014-06-04

Similar Documents

Publication Publication Date Title
WO2009096190A1 (fr) Dispositif de support de détermination de cotation
JP4654776B2 (ja) 質問応答システム、およびデータ検索方法、並びにコンピュータ・プログラム
US9015175B2 (en) Method and system for filtering an information resource displayed with an electronic device
US7318021B2 (en) Machine translation system, method and program
US10552467B2 (en) System and method for language sensitive contextual searching
US20060173682A1 (en) Information retrieval system, method, and program
JP2005128873A (ja) 質問応答型文書検索システム及び質問応答型文書検索プログラム
JP2017504105A (ja) インメモリデータベースサーチのためのシステム及び方法
JP2006099428A (ja) 文書要約作成システム、方法、及びプログラム
JP2007065745A (ja) 文書検索方法および文書検索装置、プログラム
JP2006343925A (ja) 関連語辞書作成装置、および関連語辞書作成方法、並びにコンピュータ・プログラム
JP2007172260A (ja) 文書ルール作成支援装置および文書ルール作成支援方法並びに文書ルール作成支援プログラム
US20080071593A1 (en) Business process editor, business process editing method, and computer product
JP6305671B1 (ja) テンプレート生成装置、テンプレート生成プログラム及びテンプレート生成方法
JP2009157620A (ja) 情報検索支援装置
CN114357961A (zh) 一种项目可行性研究报告生成方法、装置、设备及存储介质
US6122650A (en) Method and apparatus for updating time related data in a modified document
Pirzadeh et al. Resilient user interface level tests
JP2008250893A (ja) 情報検索装置、情報検索方法およびそのプログラム
JP2009169761A (ja) 電子辞書システム、電子辞書の表示制御方法、コンピュータプログラムおよびデータ記憶媒体
JP2006155653A (ja) 情報表示制御装置及びプログラム
JP2005115457A (ja) 文書ファイル検索方法
JPH0668137A (ja) 操作指示対象情報生成システムおよび操作指示対象認識システム
JP2011095802A (ja) 機械翻訳装置及びプログラム
JP2010002830A (ja) 音声認識装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09706575

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09706575

Country of ref document: EP

Kind code of ref document: A1