WO2009096190A1 - Quotation judgment supporting device - Google Patents

Quotation judgment supporting device

Info

Publication number
WO2009096190A1
Authority
WO
WIPO (PCT)
Prior art keywords
determination
citation
range
data
comparison
Prior art date
Application number
PCT/JP2009/000360
Other languages
French (fr)
Japanese (ja)
Inventor
Kazunari Sugimitsu
Keiki Onishi
Original Assignee
Kanazawa Institute Of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kanazawa Institute Of Technology
Publication of WO2009096190A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Definitions

  • The present invention relates to a citation determination support apparatus and a citation determination support program that support the determination of whether document data is cited in determination target data.
  • Patent Document 1 discloses a technique that investigates copyright infringement of content in a web page based on copyright information received from a server and, when copyright infringement is found, transmits a notification to that effect to the server.
  • Patent Document 2 discloses a technique for judging the degree of similarity between technical documents and visually displaying the relationship between the two documents.
  • The present invention has been made in view of the above, and its purpose is to provide a citation determination support apparatus and a citation determination support program that can improve the accuracy of the determination while preventing an increase in development steps and manufacturing cost by using a general-purpose determination algorithm.
  • The citation determination support apparatus of claim 1 supports the determination of whether document data is cited in determination target data, and is characterized by comprising: determination range specifying means for specifying, within the determination target data, a determination range in which the presence or absence of citation of the document data is determined; comparison range specifying means for specifying, within the document data, a comparison range to be compared with the determination target data; similarity calculation means for searching the comparison range specified by the comparison range specifying means for the description content of the determination range specified by the determination range specifying means, and calculating the similarity between the description content of the determination range and the description content of the comparison range; document citation determination means for determining that the determination range cites the comparison range when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined threshold; and output means for outputting the determination range of the determination target data that cites the comparison range of the document data.
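The flow of claim 1 can be pictured with a minimal Python sketch. The function names, the sample documents, the 0.8 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all illustrative assumptions; the text above does not prescribe a particular similarity algorithm.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; SequenceMatcher is an assumed stand-in measure."""
    return SequenceMatcher(None, a, b).ratio()

def find_citations(determination_range: str,
                   comparison_ranges: dict[str, str],
                   threshold: float = 0.8) -> list[tuple[str, float]]:
    """Return (document id, similarity) for each comparison range whose
    similarity to the determination range is at or above the threshold."""
    hits = []
    for doc_id, text in comparison_ranges.items():
        score = similarity(determination_range, text)
        if score >= threshold:  # "equal to or greater than a predetermined threshold"
            hits.append((doc_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical comparison ranges keyed by a document identifier.
documents = {
    "doc-1": "The quick brown fox jumps over the lazy dog.",
    "doc-2": "Completely unrelated sentence about patents.",
}
print(find_citations("The quick brown fox jumped over the lazy dog.", documents))
```

Only the near-identical sentence in `doc-1` is reported; lowering `threshold` widens what counts as a citation.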
  • The citation determination support apparatus of claim 2 is the apparatus according to claim 1, further comprising legality determination means that, when the document citation determination means determines that the determination range cites the comparison range, determines whether the citation is a legitimate citation based on the cited part of the comparison range within the determination range and its vicinity.
  • In the citation determination support apparatus according to claim 2, the legality determination means determines whether citation source information identifying the document data that is the source of the cited part is included in the determination target data.
  • In the citation determination support apparatus according to claim 2 or 3, the legality determination means determines, for a determination range whose similarity is equal to or greater than the predetermined threshold, whether the determination range conforms to a predetermined citation form and, based on the result, determines whether the citation of the comparison range in the determination range is a legitimate citation.
  • Citation form storage means that stores the type of the determination target data in association with the predetermined citation form may further be provided. In that case, the legality determination means identifies the type of the determination target data, acquires the citation form corresponding to the identified type from the citation form storage means, and determines whether the citation of the comparison range in the determination range matches the acquired citation form.
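A type-dependent citation-form check of this kind might be sketched as follows. The two data types, the regular expressions, and the function name are hypothetical; the text does not say how a citation form is represented.

```python
import re

# Hypothetical citation-form table: data type -> pattern a cited passage must match.
CITATION_FORMS = {
    # Quoted text followed by a bracketed reference number, e.g. "..." [3]
    "thesis": re.compile(r'"[^"]+"\s*\[\d+\]'),
    # Quoted text followed by (Author, year), e.g. "..." (Author, 2009)
    "report": re.compile(r'"[^"]+"\s*\([^()]+, \d{4}\)'),
}

def conforms_to_citation_form(data_type: str, passage: str) -> bool:
    """Look up the form for the data type and test the passage against it.
    Unknown data types are treated as non-conforming (an assumption)."""
    form = CITATION_FORMS.get(data_type)
    return bool(form and form.search(passage))
```

For example, `conforms_to_citation_form("thesis", 'It states "similarity is computed" [3].')` is true, while the same sentence without quotation marks or a reference number is not.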
  • In the citation determination support apparatus according to any one of claims 1 to 5, reference information acquisition means is further provided that, when the document citation determination means determines that the determination range cites the comparison range, acquires, based on the document data, reference information for referring to the document data that includes the comparison range, and the output means outputs the reference information acquired by the reference information acquisition means in addition to the determination range of the determination target data that cites the comparison range.
  • In the citation determination support apparatus according to any one of claims 1 to 6, the determination range specifying means specifies, as the determination range, a predetermined component among the components that constitute the determination target data.
  • In the citation determination support apparatus according to any one of claims 1 to 7, history storage means is further provided that stores creator identification information uniquely identifying the creator of determination target data generated in the past, in association with information indicating that an illegal citation occurred or with a score of the creator. The determination range specifying means acquires from the history storage means the creator identification information that corresponds to the information indicating that an illegal citation occurred, or that corresponds to a creator score lower than a predetermined value, and selects, from among a plurality of determination target data, the determination target data created by the creator identified by the acquired creator identification information as the determination target.
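The history-based selection can be sketched as below. The record fields, the score scale, and the default cut-off of 60 are assumptions made for illustration only.

```python
from dataclasses import dataclass

@dataclass
class HistoryRecord:
    creator_id: str         # creator identification information (hypothetical format)
    illegal_citation: bool  # an illegal citation was recorded in the past
    score: float            # creator score; the scale is an assumption

def select_targets(history: list[HistoryRecord],
                   submissions: list[tuple[str, str]],
                   min_score: float = 60.0) -> list[str]:
    """Pick the submissions whose creator has a recorded illegal citation
    or a score below min_score. submissions holds (creator_id, document)."""
    flagged = {record.creator_id for record in history
               if record.illegal_citation or record.score < min_score}
    return [document for creator_id, document in submissions
            if creator_id in flagged]
```

Submissions from creators without either flag are simply left out of the automatic selection in this sketch.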
  • In the citation determination support apparatus according to any one of claims 1 to 8, there are further provided dictionary storage means that stores words that can be included in the document data in association with words that can be used to correct them, and word conversion means that converts words included in the determination target data into the words stored in the dictionary storage means, and the determination range specifying means uses the determination target data converted by the word conversion means as the determination target.
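The word conversion step might look like the following sketch. The dictionary entries and the function name are hypothetical, and the conversion here is a plain word-for-word substitution on alphabetic tokens.

```python
import re

# Hypothetical dictionary: variant word -> word as used in the document data.
DICTIONARY = {"automobile": "car", "utilise": "use", "utilize": "use"}

def convert_words(text: str, dictionary: dict[str, str] = DICTIONARY) -> str:
    """Replace every alphabetic token that has a dictionary entry,
    leaving all other text unchanged."""
    def replace(match: re.Match) -> str:
        word = match.group(0)
        return dictionary.get(word.lower(), word)
    return re.sub(r"[A-Za-z]+", replace, text)
```

Normalizing both texts this way before comparison lets a passage whose wording was altered still score as similar to its source.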
  • In the citation determination support apparatus according to any one of claims 1 to 9, input means for receiving an operation input to the citation determination support apparatus is further provided, and the determination range specifying means specifies, as the determination range, a range within the determination target data that is designated through the input means.
  • In the citation determination support apparatus according to any one of claims 1 to 10, determination target data storage means that stores a plurality of determination target data generated in the past is further provided. The similarity calculation means further calculates the mutual similarity among the plurality of determination target data stored in the determination target data storage means; the document citation determination means further determines that the plurality of determination target data cite one another when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined second threshold; and the comparison range specifying means specifies, as the comparison range, the determination target data determined to cite one another.
  • In the citation determination support apparatus according to any one of claims 1 to 11, task extraction means is further provided that extracts, from the determination target data and based on its description content, task information indicating the task of the determination target data, and the comparison range specifying means searches the document data using the task information extracted by the task extraction means as a search key and specifies the retrieved document data as the comparison target.
  • In the citation determination support apparatus according to any one of claims 1 to 12, document data storage means is further provided that stores the document data that the document citation determination means has determined to be cited in the determination range. The comparison range specifying means determines whether citation source information identifying the document data cited in the determination target data is included in the determination target data; when it is included, determines whether the document data identified by the citation source information is stored in the document data storage means; and when it is stored, specifies that document data as the comparison range.
  • In the citation determination support apparatus according to any one of claims 1 to 13, when the similarity calculation means searches the comparison range specified by the comparison range specifying means using the description content of the determination range specified by the determination range specifying means as a search key, and the number of characters of the search key exceeds a predetermined character limit, the similarity calculation means sequentially designates character strings within the character limit from the determination range as search keys and searches the comparison range a plurality of times. Search results whose appearance frequency among the plurality of search results is larger than a predetermined value are set as the comparison range for which the mutual similarity with the description content of the determination range is calculated.
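The repeated search with length-limited keys can be sketched as follows. Exact substring matching stands in for the unspecified search engine, and the parameter names and default values are assumptions.

```python
from collections import Counter

def chunked_search(determination_range: str,
                   comparison_ranges: dict[str, str],
                   limit: int = 32,
                   min_frequency: int = 2) -> list[str]:
    """Split the determination range into search keys of at most `limit`
    characters, search every comparison range with each key, and keep the
    documents that were hit more than `min_frequency` times."""
    keys = [determination_range[i:i + limit]
            for i in range(0, len(determination_range), limit)]
    hits = Counter()
    for key in keys:
        for doc_id, text in comparison_ranges.items():
            if key in text:  # assumed search: exact substring match
                hits[doc_id] += 1
    return [doc_id for doc_id, count in hits.items() if count > min_frequency]
```

A document that matches only one key is dropped, so the later similarity calculation runs only against documents that recur across the searches.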
  • In the citation determination support apparatus according to any one of claims 1 to 14, the similarity calculation means analyzes the determination range to obtain the words that appear a predetermined number of times or more, searches the comparison range specified by the comparison range specifying means a plurality of times, once per word, using each word as a search key, and sets search results whose appearance frequency among the plurality of search results is larger than a predetermined value as the comparison range for which the mutual similarity with the description content of the determination range is calculated.
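A word-frequency variant of the search can be sketched in the same way. Tokenizing with `\w+`, case-insensitive substring matching, and the default counts are assumptions; a real implementation would likely use a proper morphological analyzer and search index.

```python
import re
from collections import Counter

def frequent_word_search(determination_range: str,
                         comparison_ranges: dict[str, str],
                         min_word_count: int = 2,
                         min_frequency: int = 1) -> list[str]:
    """Use each word that appears at least `min_word_count` times in the
    determination range as a search key, and keep the documents that were
    hit by more than `min_frequency` of those keys."""
    words = Counter(re.findall(r"\w+", determination_range.lower()))
    keys = [word for word, count in words.items() if count >= min_word_count]
    hits = Counter()
    for key in keys:
        for doc_id, text in comparison_ranges.items():
            if key in text.lower():  # assumed search: substring match
                hits[doc_id] += 1
    return [doc_id for doc_id, count in hits.items() if count > min_frequency]
```

Because the match is on substrings, short keys can hit inside longer words; whole-word matching would be a stricter alternative.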
  • In the citation determination support apparatus according to any one of claims 1 to 15, input means for receiving an input of the predetermined threshold is further provided, and the document citation determination means determines that the determination range cites the comparison range when the similarity is equal to or greater than the predetermined threshold input through the input means.
  • The apparatus may further include citation ratio calculation means that calculates a citation ratio, that is, the proportion of the description content of the determination range occupied by content cited from the comparison range, and the output means may output the citation ratio.
  • In the citation determination support apparatus according to claim 17, the citation ratio calculation means calculates the citation ratio for each of a plurality of determination target data, and the output means outputs determination target data information that uniquely identifies each of the plurality of determination target data, in an order based on the citation ratio calculated by the citation ratio calculation means for each determination target data.
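A citation ratio and a ratio-ordered listing could be computed as in this sketch. Measuring the ratio in characters, representing cited parts as offset pairs, and sorting in descending order are all assumptions; the text only says the order is "based on" the ratio.

```python
def citation_ratio(range_length: int,
                   cited_spans: list[tuple[int, int]]) -> float:
    """Share of the determination range covered by cited spans.
    Spans are (start, end) character offsets with end exclusive;
    overlapping spans are counted once."""
    covered = set()
    for start, end in cited_spans:
        covered.update(range(max(0, start), min(end, range_length)))
    return len(covered) / range_length if range_length else 0.0

def rank_by_ratio(ratios: dict[str, float]) -> list[str]:
    """Identifiers of the determination target data, highest ratio first."""
    return sorted(ratios, key=ratios.get, reverse=True)
```

For a 100-character range with cited spans `(0, 30)` and `(20, 50)`, the overlap is counted once and the ratio is 0.5.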
  • In the citation determination support apparatus according to any one of claims 1 to 18, output mode information storage means is further provided that stores the similarity of the determination range in association with the output mode used by the output means, and the output means acquires from the output mode information storage means the output mode corresponding to the similarity calculated by the similarity calculation means and outputs the determination range in the acquired output mode.
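One simple way to hold such an association is a table of similarity bands, sketched below. The band boundaries and the display modes are invented for illustration and do not come from the text.

```python
# Hypothetical output-mode table: (lower bound of similarity, display mode),
# ordered from the highest band to the lowest.
OUTPUT_MODES = [
    (0.9, "red highlight"),
    (0.7, "yellow highlight"),
    (0.0, "plain"),
]

def output_mode(similarity: float) -> str:
    """Return the display mode of the first band the similarity falls into."""
    for lower_bound, mode in OUTPUT_MODES:
        if similarity >= lower_bound:
            return mode
    return "plain"
```

With this table a determination range scored 0.95 would be shown with a red highlight and one scored 0.75 with a yellow highlight.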
  • The citation determination support program supports the determination of whether document data is cited in determination target data, and causes a computer to function as: determination range specifying means for specifying, within the determination target data, a determination range in which the presence or absence of citation of the document data is determined; comparison range specifying means for specifying, within the document data, a comparison range to be compared with the determination target data; similarity calculation means for searching the comparison range specified by the comparison range specifying means for the description content of the determination range specified by the determination range specifying means, and calculating the similarity between the description content of the determination range and the description content of the comparison range; document citation determination means for determining that the determination range cites the comparison range when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined threshold; and output means for outputting the determination range of the determination target data that cites the comparison range of the document data.
  • According to the citation determination support apparatus of claim 1, the similarity is determined after the determination range and the comparison range have been limited automatically, so the accuracy of the determination can be improved while an increase in development steps and manufacturing cost is prevented by using a general-purpose determination algorithm.
  • According to the citation determination support apparatus of claim 2, whether the citation is a legitimate citation as prescribed by copyright law can be determined easily.
  • According to the citation determination support apparatus of claim 3, it is determined whether citation source information identifying the document data that is the source of the cited part is included in the determination target data, so material for judging the legitimacy of the citation can be obtained based on the presence or absence of the citation source information.
  • According to the citation determination support apparatus of claim 4, it is determined whether the determination range conforms to a predetermined citation form, and whether the citation of the comparison range in the determination range is legitimate is determined based on that result, so the legitimacy of the citation can be determined easily on the basis of a preset citation form.
  • the citation form corresponding to the type of the determination target data is acquired from the citation form storage means, and it is determined whether the citation matches the acquired citation form. Therefore, the legitimacy of the citation can be determined based on a citation format that differs for each type of determination target data.
  • The cited document data is identified automatically, and the citation determination is performed after the cited document has been added to the determination range of the determination target data, so a document that has been cited illegally can be detected easily.
  • According to the citation determination support apparatus of claim 7, a component of the determination target data that is likely to contain unauthorized citation can be set as the determination range, so the accuracy of the determination can be further improved.
  • The determination target data of a person with a high probability of illegal citation can be set automatically as the determination target, the determination can take into account the possibility that the misconduct recurs, and the accuracy of the determination can be further improved.
  • According to the citation determination support apparatus of claim 9, even when document data is cited improperly after being modified rather than used as is, whether it is a citation can be determined, so the accuracy of the determination can be further improved while preventing an increase in development steps and manufacturing cost.
  • The target of the citation determination can be limited, which reduces the load of the determination process.
  • Document data that is highly likely to cite another person's document data can be set automatically as the comparison range, so the accuracy of the determination can be further improved while preventing an increase in development steps and manufacturing cost.
  • An appropriate comparison range can be set automatically in accordance with the description content of the determination target data, so the accuracy of the determination can be further improved while preventing an increase in development steps and manufacturing cost.
  • The document data storage means stores the document data that the document citation determination means has determined to be cited in the determination range. When citation source information identifying document data is included in the determination target data and the document data identified by that citation source information is stored in the document data storage means, the search can be executed against that document data.
  • Because each part of the thesis data is targeted in turn, substantially the entire thesis data can be included in the search range, so the accuracy of the citation determination can be improved.
  • A search result with a high appearance frequency is identified automatically and set as the comparison range used for calculating the similarity, so a comparison range that matches the determination range can be extracted automatically for the citation determination, and the accuracy of the citation determination can be further improved.
  • Whether the determination range cites the comparison range is determined using the threshold received through the input means, so an optimal threshold can be set according to the purpose of the determination and the determination can be made based on that threshold.
  • Material for judging the legitimacy of the citation can be presented.
  • The citation ratio is calculated for a plurality of determination target data, and the determination target data information is output in an order based on the citation ratio of each determination target data, so material for comparing the legitimacy of citation across the plurality of determination target data can be presented based on the citation ratio.
  • the output mode corresponding to the similarity calculated by the similarity calculation means is acquired from the output manner information storage means, and the determination is made based on the acquired output manner. Since the range is output, the determination range can be output in such a manner that the user can easily grasp the degree of similarity.
  • FIG. 1 is a block diagram conceptually showing a system configuration including a quotation determination support device according to a first embodiment.
  • FIG. 2 is a flowchart showing the procedure of the quotation determination support process of the first embodiment.
  • FIG. 3 is a schematic diagram showing an example of a quotation determination screen.
  • FIG. 4 is an explanatory view showing an example of a main-text portion specified as the determination range in thesis data.
  • A schematic diagram showing an example of the quotation determination screen on which the content of the determination range is displayed.
  • A flowchart showing the procedure of the comparison range specifying process.
  • A schematic diagram showing a state in which the cited part is highlighted within the determination range on the quotation determination screen.
  • A block diagram showing the functional configuration of a quotation determination support device according to a second embodiment.
  • An explanatory view showing an example of history data.
  • A flowchart showing the procedure of the quotation determination support process of the second embodiment.
  • A flowchart showing the procedure of the determination target specifying process of the second embodiment.
  • A block diagram showing the functional configuration of a quotation determination support device according to a third embodiment.
  • An explanatory view showing an example of a specialized dictionary.
  • A flowchart showing the procedure for specifying the determination target in the third embodiment.
  • A block diagram showing the functional configuration of a quotation determination support device according to a fourth embodiment.
  • A flowchart showing the procedure of the comparison and determination process of the fourth embodiment.
  • A block diagram showing the functional configuration of a quotation determination support device according to a fifth embodiment.
  • A flowchart showing the procedure of the comparison range specifying process of the fifth embodiment.
  • A block diagram showing the functional configuration of a quotation determination support device according to a sixth embodiment.
  • A flowchart showing the procedure of the search process in the similarity calculation of the sixth embodiment.
  • A flowchart showing the procedure of the similarity calculation process of Modification 1.
  • A flowchart showing the procedure of the comparison range specifying process of Modification 2.
  • A block diagram showing the functional configuration of a quotation determination support device according to a seventh embodiment.
  • A table showing the information stored in the citation form DB.
  • A flowchart showing the procedure of the quotation determination support process of the seventh embodiment.
  • A flowchart showing the procedure of the citation form setting process.
  • A diagram illustrating the citation form setting input screen.
  • A flowchart showing the procedure of the legality determination process.
  • A block diagram showing the functional configuration of a quotation determination support device according to an eighth embodiment.
  • A table illustrating the information stored in the citation ratio DB.
  • A flowchart showing the procedure of the quotation determination support process of the eighth embodiment.
  • A diagram illustrating the quotation determination screen when the citation ratio is output and displayed.
  • A flowchart showing the procedure of the list display process.
  • A diagram showing the determination result screen that displays the list.
  • A block diagram showing the functional configuration of a quotation determination support device according to a ninth embodiment.
  • A table illustrating the information stored in the output mode DB.
  • A flowchart showing the procedure of the quotation determination support process of the ninth embodiment.
  • A diagram showing thesis data displayed on the quotation determination screen of the display device.
  • A flowchart showing the procedure of the quotation determination support process of the tenth embodiment.
  • A flowchart showing the procedure of the comparison and determination process of the tenth embodiment.
  • In this embodiment, the component of the thesis data that is most likely to cite a third party's document is selected automatically and set as the determination range.
  • FIG. 1 is a block diagram conceptually showing the system configuration including the citation judging support device according to the first embodiment.
  • the quotation determination support device 100 is communicably connected to the WEB site 131 and the file server 133 via an arbitrary network such as the Internet 130 as shown in FIG.
  • the WEB site 131 and the file server 133 can be configured in the same manner as in the prior art, and thus the detailed description thereof is omitted.
  • the quotation determination support device 100 is configured by connecting a storage unit 101 and a control unit 102 by a bus as shown in FIG. 1, and includes an input device 103 and a display device 104.
  • the storage unit 101 is a storage unit that stores various programs and data necessary for controlling the quotation determination support apparatus 100, and is configured of a storage medium such as a hard disk drive (HDD) or a memory.
  • the storage unit 101 is installed with a quotation determination support program stored in a recording medium (not shown) and read by a reading device (not shown).
  • A document data storage unit 101a, a document list storage unit 101b, and a thesis data storage unit 101c are provided in the storage unit 101.
  • The document data storage unit 101a stores document data that can be a source of citation.
  • The document data is stored in the document data storage unit 101a provided in the quotation determination support device 100 and is also stored in the WEB site 131 and the file server 133.
  • The document list storage unit 101b stores a list of bibliographic information for the document data recorded in the document data storage unit 101a and for document data on the Internet, such as the document name, the storage location (a URL (Uniform Resource Locator) or a folder name), the file name, the creator, and the creation date.
  • The thesis data storage unit 101c stores the thesis data to be subjected to the citation determination in association with the student register number that identifies the student who created it.
  • The thesis data storage unit 101c receives the thesis data to be subjected to the citation determination from the student's terminal and stores it. All thesis data submitted in the past and subjected to the citation determination is also stored in association with the student register number of its creator.
  • The control unit 102 is control means that controls the citation determination support apparatus 100, and conceptually includes a determination range specifying unit 102a, a comparison range specifying unit 102b, a similarity calculation unit 102c, a document citation determination unit 102d, a legality determination unit 102e, a reference information acquisition unit 102f, an input control unit 102g, and an output control unit 102h.
  • The specific configuration of the control unit 102 is arbitrary. For example, it comprises a control program such as an OS (Operating System), embedded programs that define various processing procedures, an internal memory for storing required data, and a CPU (Central Processing Unit) that executes these programs.
  • The determination range specifying unit 102a is determination range specifying means that specifies, within the thesis data stored in the thesis data storage unit 101c, the determination range in which the presence or absence of citation of document data is determined.
  • The comparison range specifying unit 102b is comparison range specifying means that specifies the document data or the like that serves as the comparison range for the determination range of the thesis data.
  • The similarity calculation unit 102c is similarity calculation means that uses the description content of the determination range specified by the determination range specifying unit 102a as a search key, searches the comparison range of the document data or past thesis data (hereinafter "document data etc.") specified by the comparison range specifying unit 102b, and calculates the mutual similarity.
  • The document citation determination unit 102d is document citation determination means that determines that the determination range of the thesis data cites the comparison range of the document data etc. when the similarity calculated by the similarity calculation unit 102c is equal to or greater than a predetermined threshold.
  • The legality determination unit 102e is legality determination means that determines whether the citation is a legitimate citation based on the part of the determination range that cites the document data etc. and the vicinity of that part.
  • The reference information acquisition unit 102f is reference information acquisition means that acquires, from the attributes of the document data etc., the name or title, the URL, the folder name, and the like of the document data etc. as reference information for referring to it.
  • The input control unit 102g is input control means that receives an event caused by an operation input from the input device 103 and performs input control of the operation input.
  • The output control unit 102h is output control means that performs display control of various screens on the display device 104.
  • The output control unit 102h displays on the display device 104 a quotation determination screen (described later) that shows the determination range, the determination range of the thesis data that cites a comparison range of the document data etc., and the reference information.
  • the input device 103 is an input unit such as a keyboard or a pointing device such as a mouse.
  • the display device 104 is an output means such as a monitor.
  • FIG. 2 is a flowchart showing the procedure of the quotation determination support process of the first embodiment.
  • FIG. 3 is a schematic view showing an example of the quotation determination screen.
  • When the "Simple" button is clicked on the citation judgment screen, the determination range is specified. When the "Details" button is clicked, a screen (not shown) is displayed for making various settings for searching the search database that manages the document data etc. of the document data storage unit 101a, such as the language, the creation period of the cited document data, keywords, and the creator.
  • The determination range identification unit 102a reads the created article data from the article data storage unit 101c (step S11). Then, the determination range specifying unit 102a performs structural analysis of the article data by a known method (step S12) and obtains the component parts that make up the article, such as the introductory part ("Introduction" and the like), the main text part, the concluding part, and the "Acknowledgements" part. Since the main text part is the principal part of the article data and is the component part most likely to cite a third party's document, the determination range specifying unit 102a specifies the main text part, from among the component parts obtained by the structural analysis, as the determination range (step S13).
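  • Steps S12 and S13 can be sketched as follows. This is only an illustration under stated assumptions: the source says the structural analysis uses "a known method" without naming one, so the "# " heading marker, the heading names in NON_BODY_HEADINGS, and both function names are hypothetical.

```python
# Hypothetical heading names treated as "not main text"; the source does not list them.
NON_BODY_HEADINGS = {"introduction", "conclusion", "acknowledgements"}


def split_into_sections(article_text):
    """Step S12 (sketch): split the text into {heading: body} at heading lines.

    Assumption: a heading is any line that starts with "# ".
    """
    sections = {}
    current = None
    for line in article_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("# "):
            current = stripped[2:].strip()
            sections[current] = []
        elif current is not None and stripped:
            sections[current].append(stripped)
    return {heading: " ".join(lines) for heading, lines in sections.items()}


def specify_determination_range(article_text):
    """Step S13 (sketch): keep only the component parts that are main text."""
    sections = split_into_sections(article_text)
    body = [text for heading, text in sections.items()
            if heading.lower() not in NON_BODY_HEADINGS]
    return " ".join(body)
```

  • In this sketch, everything that is not an introduction, conclusion, or acknowledgement is treated as main text; a real structural analysis would likely need richer rules.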
  • FIG. 4 is an explanatory view showing an example of a text portion specified as a judgment range in article data.
  • The content described in the answer column is the component part corresponding to the main text, so the determination range specification unit 102a specifies the content of this answer column as the determination range.
  • the output control unit 102h displays the content of the specified determination range in the determination range column of the quotation determination screen, as shown in FIG.
  • the input control unit 102g receives the event, and the comparison range specifying unit 102b performs a process of specifying the comparison range (step S14).
  • FIG. 6 is a flowchart showing the procedure of the process of specifying the comparison range.
  • the comparison range specifying unit 102b first reads all the thesis data submitted in the past stored in the thesis data storage unit 101c (step S21).
  • the comparison range specifying unit 102b reads all the document data described in the document list stored in the document list storage unit 101b from the document data storage unit 101a and the Internet 130 (step S22).
  • The comparison range specifying unit 102b specifies all the read article data and the acquired document data (that is, the document data etc.) as the comparison range (step S23).
  • The similarity calculation unit 102c searches the data of the specified comparison range using the description content of the specified determination range as the search key (step S15), and calculates the similarity of the description contents (step S16).
  • For the search, the similarity calculation unit 102c specifies the search key to a search program or search engine that uses a known search technology, and instructs that search program or search engine to execute the search.
  • As the logic for calculating the degree of similarity, a known logic is used, for example one based on syntactic analysis of the description content of the determination range of the article data and the description content of the document data.
  • The document quoting determination unit 102d determines whether the determination range cites the document of the comparison range by determining whether the calculated similarity is equal to or more than a predetermined threshold (step S17).
  • the predetermined threshold can be arbitrarily determined in accordance with the accuracy required for the quotation determination.
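  • The threshold comparison of steps S16 and S17 can be sketched as below. The source only says that "a known logic" is used to calculate similarity, so Python's difflib.SequenceMatcher is swapped in here purely as a stand-in; the value of CITATION_THRESHOLD and the function names are assumptions.

```python
from difflib import SequenceMatcher

# Assumed value; the source leaves the threshold to the required accuracy.
CITATION_THRESHOLD = 0.8


def similarity(determination_range, comparison_text):
    """Step S16 (sketch): character-level similarity in the range 0.0 to 1.0."""
    return SequenceMatcher(None, determination_range, comparison_text).ratio()


def is_citing(determination_range, comparison_text, threshold=CITATION_THRESHOLD):
    """Step S17 (sketch): treat the range as a citation at or above the threshold."""
    return similarity(determination_range, comparison_text) >= threshold
```

  • Lowering the threshold flags more passages as possible citations at the cost of more false positives, which is the accuracy trade-off the text refers to.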
  • the legality determination unit 102e determines whether this citation is a legitimate one (step S18).
  • The term "legitimate" is a concept that includes the citation being lawful under copyright law, or the citation satisfying requirements set in advance by the user. Specifically, the legality determination unit 102e determines that the citation is a legitimate citation under copyright law when the book name is displayed near the bottom of the place in the determination range where the document data etc. is cited and quotation marks indicating a citation are displayed immediately before and after the cited place, or when the cited place is displayed in a font different from that of the other parts to indicate that it is a citation.
  • Alternatively, when a predetermined indication (for example, the author's name or the publisher's name) is displayed near the cited place, it may be determined that the cited place has been properly cited.
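  • One of the checks in step S18 can be sketched as follows: quotation marks immediately before and after the cited place, plus a source indication shortly after it. The quotation mark pairs, the VICINITY window, and the function name are all assumptions for illustration; the font-based check mentioned in the text is not modeled.

```python
# Assumed opening -> closing quotation mark pairs.
QUOTE_PAIRS = {'"': '"', "\u201c": "\u201d", "\u300c": "\u300d"}
# Assumed number of characters after the cited place that count as "nearby".
VICINITY = 80


def is_legitimate_citation(text, start, end, source_title):
    """Step S18 (sketch): check quotation marks and a nearby source indication.

    text[start:end] is the cited place inside the determination range.
    """
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    quoted = before in QUOTE_PAIRS and QUOTE_PAIRS[before] == after
    attributed = source_title in text[end:end + VICINITY]
    return quoted and attributed
```

  • For example, a passage wrapped in quotation marks and followed by "(Hamlet)" would pass this check for the source title "Hamlet", while the same passage without quotation marks would not.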
  • When it is determined that the citation in the determination range is a legitimate citation (step S18, Yes), the processing ends.
  • On the other hand, when the citation is not determined to be legitimate (step S18, No), the reference information acquisition unit 102f acquires reference information (the file name, title, URL, folder name, and the like) for referring to the document data etc. (the cited document data or cited paper data) from the attributes of the cited document data or the cited article data (step S19). Then, the output control unit 102h clearly indicates, on the citation determination screen, the place within the determination range at which the document data etc. is cited, and displays the reference information (step S20).
  • steps S15 to S20 are repeatedly executed for all the data in the specified comparison range (step S20a, No). If the processes in steps S15 to S20 have been executed for all the data in the specified comparison range (step S20a, Yes), the process ends.
  • FIG. 7 is a schematic view showing a state in which a cited place is highlighted within the judgment range on the quotation judgment screen. Note that the bold and underlined portions in FIG. 7 are the highlighted portions, that is, the portions of the citation.
  • The instruction is accepted by the input control unit 102g, and the output control unit 102h is controlled so as to display the reference information at the indicated location.
  • FIG. 8 is a schematic view showing a state in which reference information is displayed on the quotation determination screen.
  • the example of FIG. 8 shows the case where the document data on the Internet 130 is cited, and the URL of the document data is displayed as the reference information.
  • The output control unit 102h is configured to access the WEB page indicated by the URL and display the document data etc. of the citation source.
  • a professor or the like who makes a citation judgment on article data can easily acquire reference data of the citation source.
  • As described above, in the present embodiment, the similarity judgment is performed after automatically limiting both the determination range of the article data and the comparison range of the document data etc., so that the citation judgment can be performed using a general-purpose judgment algorithm such as similarity calculation. Therefore, according to the present embodiment, the accuracy of the determination can be improved while preventing an increase in development effort and manufacturing cost.
  • Further, the determination range identification unit 102a identifies, from among the component parts that constitute the article data, the main text part, which is easily cited without permission, as the determination range, so the accuracy of the determination can be further improved.
  • Further, since the legality determination unit 102e determines whether the citation is a legitimate citation based on the place in the determination range where the comparison range is cited and the vicinity of that place, it can easily be judged whether a citation of document data etc. is a legitimate citation as prescribed by copyright law, and the accuracy of the determination can be improved.
  • Further, the reference information acquisition unit 102f acquires, based on the document data, reference information for referring to the document data that includes the comparison range, and the acquired reference information is output together with the determination range of the determination target data that cites the comparison range of the document data, so the document data can be easily referred to.
  • In this embodiment, thesis data of a student who has made an illegal citation in the past, or of a student whose grade is low, is selected as the judgment target.
  • Except where specifically noted, the configuration and processing of the second embodiment are the same as those of the first embodiment, and description of the shared configuration and processing is omitted.
  • FIG. 9 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the second embodiment.
  • the quote determination support device 900 differs from the quote determination support device 100 according to the first embodiment in that the storage unit 101 includes the history data storage unit 101 d and the control unit 102 includes the determination range specification unit 102 i.
  • the history data storage unit 101 d is a storage medium, such as a memory or an HDD, which stores history data related to article data generated in the past.
  • FIG. 10 is an explanatory diagram of an example of history data.
  • The history data associates, for all the thesis data created in the past, the creation date of the thesis data, the student register number that uniquely identifies the student who created the thesis data, the presence or absence of illegal citation in the thesis data, and the student's grade (A, B, C, or D, where A is the best, followed by B, C, and D in descending order).
  • "Presence" is set as the presence or absence of illegal citation for thesis data of a student who made an illegal citation in the past.
  • past thesis data itself is stored in the thesis data storage unit 101c in association with the student register number.
  • The determination range specification unit 102i refers to this history data and acquires, as persons with a high probability of making an illegal citation, the student register numbers for which the presence or absence of illegal citation is "presence" and the student register numbers for which the grade is C or lower (that is, C or D). It then selects, from among the plurality of submitted thesis data (stored in the thesis data storage unit 101c), the thesis data submitted by the students with the acquired student register numbers as the determination target.
  • the determination range specifying unit 102i specifies the body part as the determination range from among the component parts of the article data selected as the determination target.
  • FIG. 11 is a flowchart of the quotation determination support process according to the second embodiment.
  • The determination range identification unit 102i first performs the process of identifying the determination target (step S31). Details of this process will be described later.
  • Then, the determination range is specified for the thesis data of the student identified as the determination target, as in the first embodiment (steps S32 and S33), and citation determination is performed by the same processing as in the first embodiment (steps S34 to S40a).
  • FIG. 12 is a flowchart illustrating the procedure of the process of identifying a determination target according to the second embodiment.
  • the determination range identification unit 102i reads out the created article data and the student register number corresponding to the article data from the article data storage unit 101c (step S41).
  • the determination range specifying unit 102i refers to the history data stored in the history data storage unit 101d, and reads out the presence / absence of the illegal citation and the grade corresponding to the read student registry number (step S42).
  • The determination range specifying unit 102i determines whether the presence or absence of illegal citation read from the history data is "presence" (step S43). If it is "presence" (step S43, Yes), the thesis data created by the student with that student register number, that is, the thesis data read in step S41, is specified as the determination target (step S45).
  • On the other hand, if the presence or absence of illegal citation is "absence" in step S43 (step S43, No), the determination range specifying unit 102i further determines whether the grade read from the history data is C or lower, that is, C or D (step S44).
  • If the grade is C or lower (step S44, Yes), the thesis data created by the student with that student register number is specified as the determination target (step S45).
  • If the grade is higher than C in step S44, that is, A or B (step S44, No), the article data read in step S41 is not specified as the determination target.
  • The processing from step S41 to step S45 is performed on each of the plurality of article data to specify the article data to be determined.
  • As described above, in the present embodiment, the thesis data created by students whose student register number corresponds to "presence" of illegal citation in the history data, indicating an illegal citation in the past, and by students whose grade is at or below a predetermined value (C), is selected from among the plurality of thesis data as the determination target. Because the thesis data of persons with a high probability of making an illegal citation is thus made the determination target, the determination accuracy can be further improved. In addition, because the determination target is limited to thesis data with a high probability of illegal citation, the determination processing load can be reduced and the determination efficiency improved.
  • In the present embodiment, the determination range specifying unit 102i makes both determinations, namely whether the history data indicates an illegal citation in the past and whether the grade is at or below a predetermined value, but the paper data to be judged may instead be specified using only one of these determinations.
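  • The selection logic of steps S41 to S45 can be sketched as below. The record layout, field names, and function names are hypothetical; only the two criteria (past illegal citation, or a grade of C or lower) come from the text.

```python
from dataclasses import dataclass

# Grades treated as "C or lower" in the text.
LOW_GRADES = {"C", "D"}


@dataclass
class HistoryRecord:
    """One row of the history data (field names are assumptions)."""
    student_id: str
    illegal_citation: bool  # True corresponds to "presence"
    grade: str              # "A", "B", "C", or "D"


def is_determination_target(record):
    """Steps S43 and S44 (sketch): past illegal citation, or grade C or lower."""
    if record.illegal_citation:          # step S43, Yes
        return True
    return record.grade in LOW_GRADES    # step S44


def select_targets(theses, history):
    """Step S45 (sketch): keep theses whose author meets either criterion.

    theses maps student register numbers to thesis text.
    """
    by_id = {record.student_id: record for record in history}
    return {student_id: text for student_id, text in theses.items()
            if student_id in by_id and is_determination_target(by_id[student_id])}
```

  • Students with no history record are skipped in this sketch; the source does not say how such cases are handled.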
  • In this embodiment, as a countermeasure for the case where words of the citation source document have been altered when it was cited, similarity determination is performed after converting those words back into the words before alteration.
  • Except where specifically noted, the configuration and processing of the third embodiment are the same as those of the second embodiment, and description of the shared configuration and processing is omitted.
  • FIG. 13 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the third embodiment.
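  • This quotation determination support device differs from the quotation determination support device 900 according to the second embodiment in that the storage unit 101 includes the dictionary storage unit 101e and the control unit 102 includes the determination range specification unit 102j and the word conversion unit 102k.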
  • the dictionary storage unit 101e is a storage medium such as an HDD or a memory that stores a specialized dictionary in which technical terms in the technical field of the article data and one or more terms that can be used in association with the technical terms are associated and registered.
  • FIG. 14 is an explanatory view showing an example of the specialized dictionary.
  • In the specialized dictionary, each term that may be included in document data etc. is associated with terms that can be used in association with it, such as a first candidate term and a second candidate term. This specialized dictionary is used when the word conversion unit 102k, described later, converts the words in the article data.
  • the determination range specification unit 102j selects the article data converted by the word conversion unit 102k as a determination target. Further, as in the first embodiment, the determination range specifying unit 102j specifies the text portion as the determination range from among the component parts of the article data selected as the determination target.
  • the word conversion unit 102k converts a word included in article data into a first candidate term, a second candidate term, and the like of the corresponding term in the specialized dictionary.
  • FIG. 15 is a flow chart showing a specific procedure of the determination target in the third embodiment.
  • the determination range identification unit 102j reads out the created article data from the article data storage unit 101c (step S51). Then, the determination range specifying unit 102 j performs morphological analysis on the contents of the read article data according to a known method, and divides the contents into morphemes (step S 52).
  • The word conversion unit 102k searches the specialized dictionary using each word obtained as a morpheme as a search key and, for each word that is registered as a term in the specialized dictionary, converts the word into the first candidate term corresponding to that term (step S53).
  • In the second and subsequent passes, conversion to the nth candidate term (n is an integer of 2 or more) is performed.
  • It is then determined whether the word conversion process has been completed for all the words in the article data (step S54); if not (step S54, No), the word conversion process of step S53 is repeated.
  • When the conversion has been completed for all the words (step S54, Yes), the word conversion unit 102k stores the article data with the converted words in the article data storage unit 101c as corrected-version article data (step S55).
  • The word conversion unit 102k determines whether conversion into all the candidate terms in the specialized dictionary has been performed (step S56). If conversion into all the candidate terms has not been performed (step S56, No), the word conversion unit 102k selects the next candidate term (the (n + 1)th candidate term) for the terms of the specialized dictionary (step S57), and steps S53 to S55 are repeated. As a result, a plurality of corrected versions of the article data, in which the words of the article data have been converted into the respective candidate terms, are obtained and stored in the article data storage unit 101c.
  • If it is determined in step S56 that conversion into all the candidate terms in the specialized dictionary has been performed (step S56, Yes), the determination range specifying unit 102j specifies the plurality of corrected-version paper data thus obtained as the determination target (step S58).
  • the quoting determination support process is performed on a plurality of corrected version paper data specified as the determination target in this way.
  • As described above, in the present embodiment, the words included in the article data are converted into the terms registered in the specialized dictionary, and the converted article data is used as the determination target. Therefore, even when document data etc. has been cited illegally with its wording altered rather than used as is, it can be determined whether the passage is a citation, and the accuracy of the determination can be further improved.
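  • The conversion loop of steps S52 to S57 can be sketched as follows. The dictionary entries are invented examples, whitespace splitting stands in for the morphological analysis that the source performs "according to a known method", and leaving a word unchanged when it has no nth candidate is an assumption.

```python
# Hypothetical specialized dictionary: term -> [first candidate, second candidate, ...].
SPECIALIZED_DICTIONARY = {
    "automobile": ["car", "vehicle"],
    "purchase": ["buy"],
}


def convert_words(words, dictionary, n):
    """Step S53 (sketch): replace each registered word by its n-th candidate term.

    n is 1-based. Words without an n-th candidate are left unchanged (assumption).
    """
    converted = []
    for word in words:
        candidates = dictionary.get(word, [])
        converted.append(candidates[n - 1] if n <= len(candidates) else word)
    return converted


def corrected_versions(text, dictionary):
    """Steps S52 to S57 (sketch): one corrected version per candidate rank."""
    words = text.split()  # stand-in for morphological analysis
    max_rank = max((len(c) for c in dictionary.values()), default=0)
    return [" ".join(convert_words(words, dictionary, n))
            for n in range(1, max_rank + 1)]
```

  • Each returned string corresponds to one corrected version of the article data, and each would then be passed through the citation determination process as a determination target.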
  • In this embodiment, the degree of similarity is calculated between students' past thesis data.
  • Except where specifically noted, the configuration and processing of the fourth embodiment are the same as those of the first embodiment, and description of the shared configuration and processing is omitted.
  • FIG. 16 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the fourth embodiment.
  • This quotation judgment support device 1600 differs from the quotation judgment support device 100 according to the first embodiment in that the control unit 102 includes a comparison range specification unit 102l, a similarity calculation unit 102m, and a document quotation judgment unit 102n.
  • the similarity calculation unit 102m calculates the similarity between the past paper data of the students stored in the paper data storage unit 101c, in addition to the same function as that of the first embodiment.
  • the calculation of the degree of similarity uses a known method as in the first embodiment.
  • The document quotation judging unit 102n determines that there are mutual citations among a plurality of past paper data when the similarity calculated by the similarity calculation unit 102m is equal to or higher than a predetermined second threshold.
  • the second threshold can be arbitrarily set, and may be the same value as the above-described threshold or any different value.
  • The comparison range specifying unit 102l specifies, as the comparison range, the plurality of past article data that the document citation determining unit 102n has determined to contain citations.
  • FIG. 17 is a flowchart showing the procedure of the comparison / determination process of the fourth embodiment.
  • The comparison range specification unit 102l first extracts two thesis data from all the thesis data submitted in the past and stored in the thesis data storage unit 101c (step S61).
  • the similarity calculation unit 102m calculates the similarity of the description content of the two extracted article data (step S62).
  • To calculate the degree of similarity, the description of a partial range of one of the two article data is first compared with the description content of the other article data. This comparison is then repeated while changing the partial range within the one article data, and a similarity is calculated for each part. The average of these partial similarities, or the like, can then be taken as the degree of similarity between the two article data as a whole.
  • the method of calculating the degree of similarity is not limited to this.
  • The comparison range specifying unit 102l determines whether the similarity calculation has been performed for all the past paper data (step S63); if it has not (step S63, No), the processes of steps S61 and S62 are repeated.
  • When the similarity calculation has been performed for all the past paper data (step S63, Yes), the document quoting determination unit 102n determines, if there are a plurality of thesis data whose similarity is equal to or more than the predetermined second threshold, that there are citations among those thesis data, and selects them (step S64). Then, the comparison range specifying unit 102l specifies the selected plurality of thesis data as the comparison range (step S65). As a result, the past paper data that cite one another become the comparison range, and the citation judgment of the paper data to be judged is performed against it.
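  • Steps S61 to S65 can be sketched as below. As before, difflib stands in for the unspecified "known method"; the partial-range length, the value of SECOND_THRESHOLD, and the use of the longest common substring as the per-part similarity are all assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations

SECOND_THRESHOLD = 0.6  # assumed value; the source leaves it arbitrary
PART_LENGTH = 40        # assumed size (in characters) of each partial range


def part_similarity(part, other):
    """Share of `part` covered by its longest common substring with `other`."""
    matcher = SequenceMatcher(None, part, other, autojunk=False)
    match = matcher.find_longest_match(0, len(part), 0, len(other))
    return match.size / len(part)


def paper_similarity(one, other, part_length=PART_LENGTH):
    """Step S62 (sketch): average the similarities of successive partial ranges."""
    parts = [one[i:i + part_length] for i in range(0, len(one), part_length)]
    if not parts:
        return 0.0
    return sum(part_similarity(p, other) for p in parts) / len(parts)


def specify_comparison_range(papers, threshold=SECOND_THRESHOLD):
    """Steps S61 to S65 (sketch): keep papers that are similar to another paper.

    papers maps an identifier to the paper text.
    """
    selected = set()
    for (id_a, text_a), (id_b, text_b) in combinations(papers.items(), 2):
        if paper_similarity(text_a, text_b) >= threshold:
            selected.update((id_a, id_b))
    return sorted(selected)
```

  • Note that comparing every pair grows quadratically with the number of past papers, so a real system would probably need indexing or sampling.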
  • As described above, the citation determination support apparatus 1600 performs citation determination on the thesis data of the determination target using, as the comparison range, past thesis data among which citations exist. By limiting the comparison range in this way, the accuracy of the determination can be improved while preventing an increase in development effort and manufacturing cost. In addition, since the comparison is limited to thesis data with a high probability of illegal citation, the determination processing load can be reduced and the determination efficiency improved.
  • The apparatus may also be configured so that the legality determination unit 102e further judges whether the citation is legitimate, and the thesis data is specified as the comparison range only when the citation is not legitimate.
  • In this embodiment, the target of judgment is automatically extracted using the task sentence of a paper as a keyword.
  • Except where specifically noted, the configuration and processing of the fifth embodiment are the same as those of the first embodiment, and description of the shared configuration and processing is omitted.
  • FIG. 18 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the fifth embodiment.
  • the quote determination support device 1800 differs from the quote determination support device 100 according to the first embodiment in that the control unit 102 includes the task extraction unit 102 p and the comparison range specification unit 102 q.
  • the task extraction unit 102p analyzes the structure of the article data as the determination target, and extracts the task sentence of the article from the description content of the article data. Specifically, the task extraction unit 102p identifies and extracts task sentences based on the heading, structure, and the like of the paper obtained as a result of structural analysis.
  • The comparison range specifying unit 102q searches the WEB site 131 or the file server 133 on the Internet 130 for corresponding WEB pages using the task sentence extracted by the task extraction unit 102p as a search key, and specifies the document data identified by the URLs etc. output as the search result as the comparison range.
  • a known search engine or the like can be used for the search.
  • The comparison range specifying unit 102q may be configured to transmit a search request command or the like specifying the search key to a search engine WEB site using a known search engine API (Application Programming Interface), and to receive the search result.
  • FIG. 19 is a flowchart showing the procedure of comparison range identification processing of the fifth embodiment.
  • the task extracting unit 102p analyzes the structure of the article data as the determination target to extract task sentences (step S81).
  • the comparison range specifying unit 102 q searches for a corresponding WEB page from the WEB site 131, the file server 133, and the like on the Internet 130 using the extracted task sentence as a search key (Step S82).
  • the comparison range specifying unit 102 q specifies the cited document data specified by the URL of the searched WEB page as the search result as the comparison range (step S 83).
  • As described above, in the present embodiment, the comparison range of cited documents is determined based on the task sentence in the thesis data, so the accuracy of the determination can be further improved while preventing an increase in development effort and manufacturing cost.
  • This embodiment includes logic for handling the case where the number of characters to be compared in the article exceeds the character limit of the search logic.
  • Except where specifically noted, the configuration and processing of the sixth embodiment are the same as those of the first embodiment, and description of the shared configuration and processing is omitted.
  • FIG. 20 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the sixth embodiment.
  • the quote determination support device 2000 differs from the quote determination support device 100 according to the first embodiment in that the control unit 102 includes the similarity calculation unit 102r.
  • The similarity calculation unit 102r uses a well-known search technology (a search engine etc.) to search the comparison range specified by the comparison range specification unit 102b, using the description content of the determination range specified by the determination range specification unit 102a as a search key. At this time, if the number of characters of the search key exceeds a predetermined character limit (for example, 32 characters), the search engine etc. returns an error notification that states that the search key exceeds the character limit and that includes the character limit.
  • In that case, the similarity calculation unit 102r specifies a character string within the character limit, taken from the top of the determination range, as the search key, searches the comparison range, and stores the search result in a memory or the like. Then, the similarity calculation unit 102r similarly searches the comparison range using the character string for the next limited number of characters in the determination range as the search key.
  • In this way, the similarity calculation unit 102r sequentially specifies search keys while moving through the character string of the determination range by the limited number of characters, performs a plurality of searches, and stores the search results. The similarity calculation unit 102r then takes the search result with the highest appearance frequency among the plurality of search results as the comparison range that is the target of similarity calculation, and calculates the similarity with the determination range.
  • Alternatively, search results whose appearance frequency is at or above a predetermined number may be used as targets for similarity calculation.
  • FIG. 21 is a flow chart showing the procedure of search processing in similarity calculation in the sixth embodiment.
  • The similarity calculation unit 102r searches the data in the comparison range using the description content of the determination range as the search key (step S91). Then, the similarity calculation unit 102r determines whether an error notification that the search key has exceeded the character limit has been received (step S92).
  • If no such error notification has been received (step S92, No), the similarity calculation unit 102r selects the search result (step S100), takes the comparison range of that search result as the target of similarity calculation, and calculates the similarity to the determination range as in the first embodiment.
  • On the other hand, if an error notification that the search key has exceeded the character limit is received in step S92 (step S92, Yes), the similarity calculation unit 102r acquires the character limit from the received error notification (step S93).
  • the similarity calculation unit 102r designates a character string within the limited number of characters as a search key from the head of the determination range (step S94), and searches data in the comparison range with this search key (step S95).
  • the similarity calculation unit 102r stores the search result in the memory (step S96).
  • The similarity calculation unit 102r determines whether the final character string of the determination range has been reached as the search key (step S97). If it has not yet been reached (step S97, No), the character string for the next limited number of characters in the determination range is specified as the search key (step S98), and the processes of steps S95 and S96 are repeated.
  • For example, when the character limit is 32 characters, the character string from the 1st character to the 32nd character is used as the search key the first time, the character string from the 33rd character to the 64th character is used the second time, and the following character strings are specified as search keys in the same manner.
  • the first time the character string from the first character to the 32nd character is the search key
  • When the final character string of the determination range is reached as the search key in step S97 (step S97, Yes), the search result having the highest frequency of appearance is selected from the search results stored in the memory (step S99), the selected comparison range is taken as the target of similarity calculation, and the similarity to the determination range is calculated.
  • When the search key exceeds the limited number of characters, a character string of the limited number of characters in the determination range is designated as the search key, and the search is performed a plurality of times while shifting the character string within the determination range. The accuracy of the quotation determination can therefore be improved regardless of the limited number of characters of the search key.
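  • The chunked search of steps S94 to S99 can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: the function names, and the `search` callable that stands in for the search engine and returns a list of result identifiers, are hypothetical and do not appear in the specification.

```python
from collections import Counter


def chunk_search_keys(text, limit):
    """Split text into consecutive search keys of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def select_comparison_range(determination_range, limit, search):
    """Search once per key and return the result that appears most often.

    `search` is a hypothetical callable that maps a search key to a list of
    result identifiers (for example URLs). It stands in for the search engine.
    """
    hits = Counter()
    for key in chunk_search_keys(determination_range, limit):
        # Corresponds to steps S95 and S96: search, then remember the results.
        hits.update(search(key))
    if not hits:
        return None
    # Corresponds to step S99: pick the result with the highest frequency.
    return hits.most_common(1)[0][0]
```

  • With a limit of 32 characters, `chunk_search_keys` yields the 1st to 32nd characters, then the 33rd to 64th characters, and so on, which matches the example given above.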
  • In this embodiment, the legality determination means determines whether the determination range conforms to a predetermined citation format and, based on the result, determines whether the citation of the comparison range in the determination range is a legitimate citation.
  • Except where specifically noted, the configuration and processing of the seventh embodiment are the same as those of the first embodiment, and description of the same configuration and processing is omitted.
  • FIG. 24 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the seventh embodiment.
  • the citation determination support apparatus 100 includes a citation format setting unit 102 s in the control unit 102 and a citation format database (hereinafter, “database” is abbreviated as “DB”) 101 f in the storage unit 101.
  • the citation format setting unit 102 s is a citation format setting unit that sets a citation format as a reference when judging the legitimacy of citation.
  • the citation format DB 101 f is a citation format storage unit that associates and stores a type for classifying article data and a predetermined citation format.
  • FIG. 25 is a table showing information stored in the citation format DB 101 f.
  • the citation format DB 101 f includes “type”, “citation format”, and “location of legal document storage” as data items, and information corresponding to these is stored in association with each other.
  • The information stored for the item "type" specifies the type of thesis data; as exemplified in FIG. 25, a field corresponding to the theme of the article, such as "law" or "engineering", can be stored.
  • The information stored for the item "citation format" specifies a legal citation format; as illustrated in FIG. 25, "" "," “", etc. can be stored.
  • The information stored for the item "legal document storage location" specifies the storage location of documents regarded as legal; as shown in FIG. 25, for example, the name of a folder in which such documents are stored, such as "Z:\quotaion\law\" or "Z:\quotaion\eng\", can be stored.
  • A "document regarded as legal" is, for example, a document whose citation is regarded as legal whenever it is cited.
  • The method and timing of storing information in the citation format DB 101f are arbitrary; for example, the information can be stored in advance via the input device 103, or stored in the citation format setting process described later.
  • FIG. 26 is a flowchart showing the procedure of the quotation determination support process of the seventh embodiment.
  • The processes of steps SA1 to SA13, except steps SA2 and SA9, are the same as the processes of steps S11 to S20a described in the first embodiment, and their description is omitted.
  • After the article data is read (step SA1), the citation format setting unit 102s executes the citation format setting process (step SA2).
  • the citation format setting process is a process for setting a citation format, which is a standard when judging the legitimacy of citation in the article data.
  • FIG. 27 is a flowchart of the citation format setting process.
  • FIG. 28 is a diagram exemplifying a citation format setting input screen.
  • On the citation format setting input screen, for example, a "type" menu for selecting the type of article data, a "citation format" box for inputting a legal citation format, a "legal document storage location" box for specifying the storage location of documents regarded as legal, a confirm button for confirming the input content, and an end button for ending citation format setting are displayed.
  • When an instruction to end the citation format setting process is given by pressing the end button via the input device 103 (step SB2, Yes), the citation format setting unit 102s ends the citation format setting process and returns to the main routine.
  • The citation format setting unit 102s waits until the type of article data (for example, "law" or "engineering") is selected from the "type" menu via the input device 103 (step SB3, No). When the type is selected (step SB3, Yes), the selected type is temporarily stored in RAM or the like (step SB4).
  • The citation format setting unit 102s then waits until an instruction to confirm the input content is given by pressing the confirm button via the input device 103 (step SB5, No). When the instruction is received (step SB5, Yes), it acquires the citation format currently entered in the "citation format" box (for example, "" “or” “” etc.) and the storage location specified in the "legal document storage location" box (for example, "Z:\quotaion\law\"), and stores them in the citation format DB 101f in association with the type temporarily stored in RAM or the like in step SB4 (step SB6). The process then returns to step SB2, where it is determined whether an end instruction has been issued.
  • When the similarity calculated by the similarity calculation unit 102c in step SA7 is equal to or higher than the predetermined threshold (step SA8, Yes), it is judged that the determination range cites the document data or the like of the comparison range, and the legality determination unit 102e executes a legality determination process for determining whether the citation is a legitimate citation (step SA9).
  • FIG. 29 is a flow chart showing the procedure of legality determination processing.
  • the legality determination unit 102e specifies the type of the article data to be determined (step SC1).
  • the type input screen (not shown) can be output and displayed on the display device 104, and the input of the type of article data to be determined can be received via the input device 103.
  • the citation form DB 101 f is referred to based on the type specified in step SC 1, and a legal citation form corresponding to the type and a storage location of a document considered to be legal are acquired from the citation form DB 101 f (step SC 2).
  • It is then determined whether the citation that was judged in step SA8 to cite the document data or the like of the comparison range conforms to the legal citation format acquired in step SC2 (step SC3).
  • For example, a citation is judged to conform to the legal citation format when the legal citation marks “” ” are used before and after the cited part, when a reference number pointing to reference information that indicates the citation source is attached to the cited part itself or immediately after it, or when the cited part is cited from a document stored in the storage location of documents regarded as legal.
  • When the citation is determined not to conform to the legal citation format (step SC3, No), the legality determination unit 102e causes the display device 104 to display an indication that the cited part is inappropriate (step SC4). For example, on the quotation determination screen shown in FIG. 7, the cited part is displayed with black and white inverted.
  • When the citation is determined to conform to the legal citation format (step SC3, Yes), or after the process of step SC4, the legality determination unit 102e determines whether the legality determination has been made for all the parts judged to cite the document data or the like of the comparison range (step SC5).
  • When the legality determination has not been made for all the cited parts (step SC5, No), the legality determination unit 102e determines whether another cited part that has not yet been examined conforms to the legal citation format (step SC3). When the legality determination has been made for all the cited parts (step SC5, Yes), the legality determination unit 102e ends the legality determination process and returns to the main routine.
  • Since the citation format corresponding to the type of the article data is acquired from the citation format DB 101f and it is determined whether the citation matches the acquired format, the legitimacy of a citation can be determined based on a citation format that differs for each type of article data.
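  • As a rough illustration of the check in step SC3, the sketch below tests the three conditions described for a cited part: enclosing citation marks, a reference number immediately after the part, or a source stored in the folder of documents regarded as legal. The class, function, and field names, the reference number pattern, and the folder name `legal_docs/` are all hypothetical assumptions made for this example; the specification does not define them.

```python
import re
from dataclasses import dataclass

# Assumed reference number styles such as "[1]" or "(1)"; not from the source.
REFERENCE_NUMBER = re.compile(r"\s*[\[(]\d+[\])]")


@dataclass
class CitationFormat:
    open_mark: str      # mark expected just before the cited part
    close_mark: str     # mark expected just after the cited part
    legal_folder: str   # storage location of documents regarded as legal


def conforms_to_format(text, start, end, fmt, source_path=None):
    """Return True if the cited part text[start:end] meets any condition."""
    before = text[max(0, start - len(fmt.open_mark)):start]
    after = text[end:end + len(fmt.close_mark)]
    enclosed = before == fmt.open_mark and after == fmt.close_mark
    # Look for a reference number right after the part (and its closing mark).
    ref_pos = end + len(fmt.close_mark) if enclosed else end
    has_reference = REFERENCE_NUMBER.match(text, ref_pos) is not None
    from_legal_doc = (source_path is not None
                      and source_path.startswith(fmt.legal_folder))
    return enclosed or has_reference or from_legal_doc
```

  • A cited part for which this function returns False would correspond to the case handled in step SC4, where the part is flagged on the display.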
  • This embodiment calculates the citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range.
  • Except where specifically noted, the configuration and processing of the eighth embodiment are the same as those of the first embodiment, and description of the same configuration and processing is omitted.
  • FIG. 30 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the eighth embodiment.
  • the citation determination support apparatus 100 includes a citation ratio calculation unit 102t in the control unit 102 and a citation ratio DB 101g in the storage unit 101.
  • the citation ratio calculation unit 102t is a citation ratio calculation unit that calculates the citation ratio occupied by the description content cited from the comparison range among the description content of the determination range.
  • the quotation ratio DB 101 g is a quotation ratio storage unit that stores determination target data information that uniquely identifies determination target data and the quotation ratio calculated by the quotation ratio calculation unit 102 t in association with each other.
  • FIG. 31 is a table exemplifying information stored in the citation ratio DB 101 g.
  • the citation ratio DB 101 g includes “thesis data information”, “document data information”, and “citation ratio” as data items, and information corresponding to these is stored in association with each other.
  • The information stored in the item “thesis data information” is determination target data information that uniquely identifies the thesis data to be determined; as shown in FIG. 31, for example, an identification number including the student registration number of the thesis author and the date of thesis creation is stored.
  • the information stored in the item "document data information” is document data information that uniquely identifies the document data that is the citation source, and as shown in FIG. 31, for example, the document information of the document data is stored.
  • The information stored in the item "citation ratio" specifies the citation ratio calculated by the citation ratio calculation unit 102t. As shown in FIG. 31, for example, the individual citation ratio of each document data in the article data and the total of those individual citation ratios are stored as percentages. The specific content of the citation ratio will be described later.
  • FIG. 32 is a flowchart showing the procedure of the quotation determination support process of the eighth embodiment.
  • the processes in steps SD1 to SD11 are the same as the processes in steps S11 to S20a described with reference to FIG. 2 in the first embodiment, and thus detailed description will be omitted.
  • When it is determined in step SD11 that the processing of steps SD5 to SD10 has been completed for all the data in the specified comparison range (step SD11, Yes), the citation ratio calculation unit 102t calculates the citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range (step SD12).
  • the specific content of the citation ratio is arbitrary, and for example, the percentage of the number of characters of the citation part to the number of characters of the determination range is calculated as the citation ratio.
  • The output control unit 102h causes the display device 104 to output and display the citation ratio calculated by the citation ratio calculation unit 102t, and stores the calculated citation ratio in the citation ratio DB 101g in association with the article data information specifying the article data to be determined (step SD13). When there are citations from a plurality of document data, the individual citation ratio of the description content cited from each document data and the total of the individual citation ratios are calculated and stored in the citation ratio DB 101g.
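  • The character-count definition of the citation ratio mentioned above can be sketched as follows. This is a minimal sketch, assuming the cited parts have already been identified and do not overlap; the function name and the shape of `cited_parts` are hypothetical.

```python
def citation_ratios(determination_range, cited_parts):
    """Return (individual ratios per document, total ratio), in percent.

    `cited_parts` is a hypothetical mapping from a document identifier to the
    list of strings in the determination range judged to be cited from it.
    Overlapping cited parts are not handled in this sketch.
    """
    total_chars = len(determination_range)
    if total_chars == 0:
        return {}, 0.0
    individual = {
        doc: 100.0 * sum(len(part) for part in parts) / total_chars
        for doc, parts in cited_parts.items()
    }
    return individual, sum(individual.values())
```

  • For a determination range of 100 characters in which 15 characters are cited from one document and 25 from another, the individual ratios are 15% and 25% and the total is 40%.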
  • FIG. 33 is a diagram exemplifying a quoting determination screen when the quoting ratio is displayed.
  • the citation ratio calculated as a percentage of the number of characters of the quoted portion to the number of characters of the determination range is displayed in the upper right portion of the citation determination screen.
  • When the description content of the determination range is cited from a plurality of document data, the total citation ratio and the individual citation ratio from each document data may be displayed together as shown in FIG. 33, or only the total citation ratio may be displayed.
  • This list display process is a process of outputting article data information in the order based on the citation ratio of each article data.
  • FIG. 34 is a flowchart showing the procedure of the list display process.
  • the execution timing of the list display process is arbitrary, and is started, for example, when an instruction to execute the list display process is input via the input device 103.
  • FIG. 35 is a diagram showing a determination result screen displaying a list of article data information in descending order of the total value of the citation ratio. As shown in FIG. 35, thesis data information is displayed on the screen in descending order of the citation ratio. At this time, an individual citation ratio for each document data may be displayed together for each article data information.
  • Since the quotation determination support apparatus 100 of the eighth embodiment calculates and outputs the citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range, material for judging the legitimacy of a citation can be presented.
  • In addition, citation ratios are calculated for a plurality of article data and the article data information is output in an order based on the citation ratio of each article data, so material for comparing the legitimacy of citations across a plurality of article data on the basis of citation ratio can be presented.
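  • The list display in descending order of total citation ratio could look roughly like the sketch below. The function name and the layout of `citation_ratio_db` (thesis data information mapped to individual citation ratios per cited document) are assumptions made only for illustration.

```python
def list_by_citation_ratio(citation_ratio_db):
    """Return (thesis data information, total ratio) pairs, highest first.

    `citation_ratio_db` is a hypothetical mapping from thesis data information
    to a dict of individual citation ratios (percent) per cited document.
    """
    totals = {
        thesis: sum(per_document.values())
        for thesis, per_document in citation_ratio_db.items()
    }
    # Sort by total citation ratio in descending order.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```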
  • A ninth embodiment will now be described.
  • This embodiment determines whether citation source information specifying the document data that is the citation source of a cited part is included in the determination target data.
  • Except where specifically noted, the configuration and processing of the ninth embodiment are the same as those of the first embodiment, and description of the same configuration and processing is omitted.
  • FIG. 36 is a block diagram showing a functional configuration of the quotation decision support apparatus according to the ninth embodiment.
  • the quotation determination support apparatus 100 includes an output mode DB 101 h in the storage unit 101.
  • the output mode DB 101 h is an output mode information storage unit that stores the degree of similarity of the determination range and the output mode by the display device 104 in association with each other.
  • FIG. 37 is a table exemplifying information stored in the output mode DB 101 h. As shown in FIG. 37, the output mode DB 101 h includes “similarity S [%]” and “output mode” as data items, and information corresponding to these is stored in association with each other.
  • The information stored for the item "Similarity S [%]" specifies the similarity of the determination range; information specifying the range of similarity that serves as the reference for the quotation determination is stored.
  • The information stored for the item “output mode” specifies the output mode used by the display device 104; information specifying the mode to be output according to the similarity is stored. In the example shown in the figure, when the similarity is less than 20%, the possibility of citation is considered low and the character output mode is "normal"; when the similarity is 20% or more and less than 80%, citation is possible and the character output mode is "bold"; and when the similarity is 80% or more, the possibility of citation is high and the character output mode is "reverse".
  • the item “output mode” may store color information specifying a character color or a background color of a character, font information specifying a font of a character, and the like.
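  • A lookup of the output mode from the similarity, using the 20% and 80% boundaries of the example above, might be sketched as follows. The table layout and the function name are hypothetical; only the boundaries and the mode names "normal", "bold", and "reverse" come from the description.

```python
# Lower bound of each similarity range (percent) and its output mode,
# listed from the highest range to the lowest.
OUTPUT_MODE_DB = [
    (80.0, "reverse"),  # 80% or more: citation is highly likely
    (20.0, "bold"),     # 20% or more and less than 80%: citation is possible
    (0.0, "normal"),    # less than 20%: citation is unlikely
]


def output_mode(similarity):
    """Return the output mode for a similarity given in percent."""
    for lower_bound, mode in OUTPUT_MODE_DB:
        if similarity >= lower_bound:
            return mode
    return "normal"
```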
  • The method and timing of storing information in the output mode DB 101h are arbitrary; for example, the information can be stored in advance via the input device 103.
  • FIG. 38 is a flowchart of the quotation determination support process of the ninth embodiment.
  • Steps SF1, SF5, SF8, SF9, SF12, and SF15 to SF17 are the same as steps S11, S14, S15, S16, S18, and S19 to S20a, respectively, described with reference to FIG. 2 in the first embodiment, and thus detailed description is omitted.
  • FIG. 39 is a diagram showing article data displayed on the citation determination screen on the display device 104.
  • the output control unit 102h causes the paper data display area 105, the range setting slider 106, the entire view 107, and the document data display area 108 to be displayed on the quotation determination screen.
  • the paper data display area 105 is an area for displaying paper data to be judged.
  • the range setting slider 106 sets a determination range in the article data to be determined, and an area sandwiched between the upper range setting slider 106 a and the lower range setting slider 106 b is set as the determination range.
  • the whole view 107 is an area for displaying the display range of the paper data display area 105, the judgment range, and the approximate position of the cited part in the whole of the paper data to be judged.
  • the document data display area 108 is an area for displaying the content of the cited document data. As shown in FIG. 39, in step SF2, the contents of the article data are displayed in the article data display area 105, and the display range of the article data display area 105 is displayed in the entire view 107 as a rectangular frame.
  • The determination range specifying unit 102a determines whether an instruction specifying the determination range has been input via the input device 103 (step SF3). When such an instruction has been input (step SF3, Yes), the range indicated by the instruction is specified as the determination range (step SF4).
  • the range between the upper and lower range setting sliders 106 among the article data to be determined is specified as the determination range.
  • the output control unit 102h causes the region outside the determination range to be displayed by hatching in the entire view 107.
  • The document reference determination unit 102d determines whether a similarity threshold has been input via the input device 103 (step SF6). When a threshold has been input (step SF6, Yes), the document reference determination unit 102d sets the input value as the similarity threshold used in the reference determination (step SF7).
  • The method of inputting the threshold is arbitrary. For example, a threshold input box can be displayed on the quotation determination screen (not shown), and the numerical value entered in the box via the input device 103 can be set as the threshold.
  • a threshold setting slider may be displayed on the quotation determination screen (not shown), and a value corresponding to the position of the setting slider whose position has been changed via the input device 103 may be set as the threshold.
  • The output control unit 102h acquires the output mode corresponding to the calculated similarity from the output mode DB 101h (step SF10) and causes the display device 104 to output and display the determination range in the acquired output mode (step SF11).
  • the characters in the article data display area 105 are displayed based on the output aspect acquired from the output aspect DB 101 h illustrated in FIG. 37 corresponding to the calculated similarity.
  • a portion with a similarity of less than 20% is displayed normally, a portion with a similarity of 20% or more and less than 80% is displayed in bold, and a portion with a similarity of 80% or more is displayed in reverse.
  • a portion having a similarity of 20% or more is hatched by cross lines in the entire view 107. This makes it possible for the user to roughly grasp the range occupied by the part that can be cited in the entire article data.
  • For a cited part that the legality determination unit 102e has determined not to be a legal citation, it is determined whether citation source information specifying the document data that is the citation source of that part is included in the article data to be determined (step SF13).
  • the specific content of the citation source information is optional. For example, information such as the author name of the citation source document, year of publication, title, published magazine, number of volumes, location page, etc. can be used as the citation source information.
  • The criterion for determining whether citation source information is included is arbitrary. For example, the determination can be based on whether citation source information is described immediately after the cited part, or on whether citation source information corresponding to a note number described immediately after the cited part is given at the end of the article data.
  • When it is determined that the citation source information is included in the article data (step SF13, Yes), the output control unit 102h causes the display device 104 to output and display the citation source information (step SF14).
  • the method and procedure for causing the display device 104 to output and display the quotation source information are arbitrary, and for example, an instruction input to display the quotation source information corresponding to the citation portion determined not to be a legal quotation is input via the input device 103 If it does, the citation source information corresponding to the citation part is displayed.
  • FIG. 40 is a diagram showing a quotation determination screen on which quotation source information is displayed.
  • The citation source information corresponding to the designated cited part (the entry at the bottom of the displayed article data giving items such as the title, magazine, volume, and pages) is highlighted (displayed inverted in FIG. 40).
  • After the citation source information is displayed in this way (step SF14), or when it is determined in step SF13 that the citation source information is not included in the article data (step SF13, No), the reference information acquisition unit 102f acquires the reference information (step SF15).
  • the range designated via the input device 103 is specified as the determination range from the article data, it is possible to limit the target for which the quotation determination is performed, and to reduce the load associated with the determination process.
  • Since the determination range is judged to refer to the comparison range when the similarity is equal to or greater than the predetermined threshold input through the input device 103, the determination can be made with a threshold suited to the purpose of the determination.
  • Since the output mode corresponding to the similarity calculated by the similarity calculation unit 102c is acquired from the output mode DB 101h and the determination range is output in the acquired output mode, the determination range can be output in a form that lets the user easily grasp the similarity.
  • FIG. 41 is a flowchart of the quotation determination support process of the tenth embodiment.
  • The processes of steps SG1 to SG12, excluding step SG11, are the same as the processes of steps S11 to S20a described with reference to FIG. 2 in the first embodiment, and thus detailed description is omitted.
  • After the reference information is displayed for the document data or the like determined to be cited in the determination range (step SG10), the comparison range specification unit 102b stores the bibliographic information of the cited document data (for example, author name, publication year, title, publication magazine, URL, etc.) and the storage location of that document data (for example, folder name) in association with each other in the document list storage unit 101b (step SG11). Thereafter, it is determined whether the processing from step SG5 to step SG11 has been executed for all the data in the specified comparison range (step SG12).
  • FIG. 42 is a flowchart showing the procedure of the comparison and determination process of the tenth embodiment.
  • the processes in steps SH5 to SH7 are the same as the processes in steps S21 to S23 described with reference to FIG. 6 in the first embodiment, and thus detailed description will be omitted.
  • Based on the result of the structural analysis of the article data performed in step SG2 of the quotation determination support process, the comparison range specification unit 102b determines whether citation source information specifying the document data cited in the article data is included in the article data (step SH1).
  • When citation source information is included (step SH1, Yes), the comparison range specification unit 102b refers to the document list storage unit 101b and determines whether bibliographic information corresponding to the citation source information is stored there (step SH2). When the bibliographic information is stored (step SH2, Yes), the comparison range specification unit 102b reads, from the document data storage unit 101a, the document data held at the storage location stored in association with the bibliographic information (step SH3), and specifies the read document data as the comparison range (step SH4).
  • On the other hand, when it is determined in step SH1 that citation source information is not included in the article data (step SH1, No), or when it is determined in step SH2 that bibliographic information corresponding to the citation source information is not stored in the document list storage unit 101b (step SH2, No), the comparison range specification unit 102b reads all the previously submitted thesis data stored in the thesis data storage unit 101c (step SH5).
  • After step SH4, the comparison range specification unit 102b ends the comparison range specifying process and returns to the main routine.
  • In the tenth embodiment, the document data storage unit 101a stores the document data determined by the document quotation determination unit 102d to be cited in the determination range. When citation source information specifying document data is included in the article data and the specified document data is stored in the document data storage unit 101a, that document data is specified as the comparison range. The comparison range can thus be limited to the document data already stored in the document data storage unit 101a, reducing the load of searching the data of the comparison range for the content of the determination range.
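  • The branching of steps SH1 to SH5 can be sketched as follows. This is only an illustrative sketch: the function and parameter names are hypothetical, `document_list` stands in for the document list storage unit 101b as a mapping from citation source information to a storage location, and the two reader callables stand in for access to the document data and to past thesis data.

```python
def specify_comparison_range(citation_source, document_list,
                             read_document, read_past_theses):
    """Return the list of data to use as the comparison range.

    citation_source: citation source information found in the article data,
        or None when none is included (step SH1).
    document_list: hypothetical mapping from citation source information to
        the storage location of the corresponding document data (step SH2).
    read_document / read_past_theses: hypothetical reader callables.
    """
    if citation_source is not None:                      # step SH1, Yes
        location = document_list.get(citation_source)    # step SH2
        if location is not None:
            return [read_document(location)]             # steps SH3 and SH4
    # Step SH1, No or step SH2, No: fall back to past thesis data (step SH5).
    return read_past_theses()
```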
  • The similarity calculation unit 102r performs the above processing when the character string of the search key exceeds the limited number of characters, but processing can also be performed in advance so that the character string designated as the search key does not exceed the limited number of characters.
  • For example, the determination range is analyzed by text mining processing using morphological analysis or the like and divided into words whose character strings are within the limited number of characters; each word that appears a predetermined number of times or more is designated as a search key, and the comparison range is searched a plurality of times, once for each word.
  • The similarity calculation unit 102r may then be configured to determine, as the comparison range whose similarity to the description content of the determination range is to be calculated, the comparison range of a search result whose appearance frequency among the plurality of search results is larger than a predetermined value.
  • FIG. 22 is a flowchart illustrating the procedure of the similarity calculation process of the first modification.
  • the similarity calculation unit 102r performs text mining processing such as morphological analysis on data of the description content of the determination range, and divides the data into words having the number of characters within the limited number of characters (step S111). Then, the similarity calculation unit 102r calculates the appearance frequency of each word (step S112), and sorts the words in descending order of appearance frequency (step S113). Then, the similarity calculation unit 102r designates a word with the highest appearance frequency as a search key (step S114).
  • the similarity calculation unit 102r searches the comparison range with the designated search key (step S115), and stores the search result in the memory (step S116).
  • The similarity calculation unit 102r determines whether the search process has been performed for all the words whose appearance frequency is a predetermined number or more (step S117). When it determines that the search process has not been performed for all such words (step S117, No), the word with the next highest appearance frequency is designated as the search key (step S118), and the search process of steps S115 and S116 is repeated.
  • When the similarity calculation unit 102r determines in step S117 that the search process has been completed for all such words (step S117, Yes), it selects the comparison range of the search result with the highest appearance frequency among the plurality of search results stored in the memory (step S119). The selected comparison range becomes the target of similarity calculation, and its similarity to the determination range is calculated.
  • In this way, a search result having a high appearance frequency is automatically identified and set as the comparison range used for similarity calculation.
  • Consequently, a matching comparison range can be extracted automatically for citation determination, and the accuracy of the citation determination can be further improved.
  • The similarity calculation unit 102r may be configured to perform the process of this modification only when an error notification indicating that the number of characters of the search key exceeds the limited number of characters is received from the search engine or the like.
  • When the comparison range specifying unit 102q receives an error notification indicating that the search key exceeds the limited number of characters, it designates, like the similarity calculation unit 102r of the sixth embodiment, a character string of the limited number of characters from the extracted task sentence as the search key, and performs the search multiple times while shifting the character string of the task sentence used as the search key. The comparison range specifying unit 102q may then be configured to specify, as the comparison range, the cited document data identified by the URL that appears most frequently among the plurality of URLs output as search results.
  • FIG. 23 is a flowchart illustrating the procedure of comparison range identification processing of the second modification.
  • The task extracting unit 102p analyzes the structure of the thesis data to be judged and extracts task sentences (step S131).
  • The comparison range specifying unit 102q searches the WEB site 131 on the Internet 130, the file server 133, and the like for the corresponding WEB page, using the extracted task sentence as the search key (step S132).
  • The comparison range specifying unit 102q determines whether an error notification indicating that the search key has exceeded the limited number of characters has been received (step S133).
  • When no such error notification has been received (No at step S133), the comparison range specifying unit 102q selects the URL of the search result (step S141) and, as in the fifth embodiment, specifies the cited document data identified by that URL as the comparison range.
  • When an error notification indicating that the search key has exceeded the limited number of characters has been received (Yes at step S133), the comparison range specifying unit 102q acquires the limited number of characters from the received error notification (step S134).
  • The comparison range specifying unit 102q designates, as the search key, a character string of the limited number of characters starting from the beginning of the task sentence (step S135), and searches for the WEB page using this search key (step S136).
  • The comparison range specifying unit 102q stores the URL obtained as the search result in the memory (step S137).
  • The comparison range specifying unit 102q determines whether the final character string of the task sentence has been used as the search key (step S138). If it has not (No at step S138), the unit designates the next character string of the limited number of characters in the task sentence as the search key (step S139) and repeats the processes of steps S136 and S137.
  • When the final character string of the task sentence has been used as the search key (Yes at step S138), the comparison range specifying unit 102q selects, from the URLs of the search results stored in the memory, the URL that appears most frequently (step S140), and specifies the cited document data identified by the URL of the selected WEB page as the comparison range.
  • When the search key exceeds the limited number of characters, a character string of the limited number of characters in the task sentence is designated as the search key, and the search is performed multiple times while shifting the character string of the task sentence used as the search key. It is therefore possible to specify an appropriate comparison range of cited document data according to the content of the thesis regardless of the limit on the number of characters of the search key, and the accuracy of the determination can be improved.
  • In the above embodiments, the legality of a citation is determined for a portion where the similarity calculated by the similarity calculation unit 102c is equal to or higher than a predetermined threshold, but the legality determination may be configured differently. For example, if the thesis data to be judged contains a reference symbol (for example, "", "", etc.) corresponding to the type of the thesis data, the apparatus may be configured to determine that the citation in the thesis data is lawful.
  • The file name of the thesis data may be output and displayed on the display device 104 in an output mode based on the determination result. For example, the file name of thesis data whose citation is determined to be inappropriate may be displayed in black-and-white reverse video so that it can be distinguished from the file names of thesis data determined to be appropriate.
  • An automatic reference function may be incorporated in the determination range specifying units 102a, 102i, and 102j of the citation determination support apparatus according to each of the above embodiments so that, at startup, the user selects the desired thesis data from the thesis data storage unit 101c and the selected thesis data is read automatically.
  • The comparison range specifying units 102b, 102l, and 102q of the citation determination support apparatus according to the above embodiments are not limited to a single storage unit or WEB site as the comparison range; they may be configured to specify the comparison range across any combination of a WEB site, a library search database, a local server, and the like.
  • Thesis data created by a student has been described as the determination target data, but the present invention is not limited to this, and any data in which sentences are described can be used as the determination target data.
  • The output control unit 102h may be configured to output and display each cited portion on the display device 104 in an output mode (for example, a different color or font) that differs for each document data. In addition, each cited portion may be displayed in a different display manner depending on the citation ratio from each document data.
  • The problems to be solved by the invention and the effects of the invention are not limited to those described above. The present invention may solve problems not described above or achieve effects not described above, or may solve only some of the described problems or achieve only some of the described effects.
  • The citation determination support program executed by the citation determination support apparatus is provided as a file in an installable or executable format, recorded on a computer-readable recording medium such as a CD-ROM, a flexible disk (FD), a CD-R, or a DVD (Digital Versatile Disk).

Abstract

A quotation judgment supporting device (100) is provided with a judgment range identifying section (102a) which identifies a range to judge the presence/absence of quotation of document data out of dissertation data as the target of judgment, a comparison range identifying section (102b) which identifies a range to be compared with the dissertation data out of the document data, a similarity calculation section (102c) which searches for described content in the identified judgment range out of the identified comparison range and calculates similarity between the described content in the judgment range and that in the comparison range, a document quotation judgment section (102d) which judges that the comparison range is quoted by the judgment range when the calculated similarity is not smaller than a predetermined threshold, and an output control section (102h) which outputs the judgment range of the dissertation data which quotes the comparison range of the document data, to a display device (104).

Description

[Title of invention determined by ISA based on Rule 37.2] Quotation judgment supporting device
The present invention relates to a citation determination support apparatus and a citation determination support program that support the determination of whether document data is cited in determination target data.
In papers written by students, researchers, and others, the works of other people are sometimes quoted without permission. Particularly in recent years, with the development of the Internet, it has become easy to search for another person's work on a WEB page or the like and to create a document by quoting the retrieved work without permission, and the problem of unauthorized use has become serious.
It is therefore necessary to determine whether works are cited in the content of a document, but determining the presence or absence of citations manually is extremely cumbersome. For this reason, various kinds of software that automatically determine whether another person's work is cited have been proposed.
For example, Patent Document 1 discloses a technique that investigates copyright infringement of content in a web page based on copyright information received from a server and, when copyright infringement is found, transmits a notification to that effect to the server. Patent Document 2 discloses a technique that determines the degree of similarity between technical documents and visually displays the relationship between the two documents.
JP 2002-366531 A
JP 2000-363384 A
In such determination software, it is desirable to improve the accuracy of determining citations of works. To improve the determination accuracy, it is preferable to develop an advanced determination algorithm, but in that case the development process of the determination software grows and, as a result, its manufacturing cost also increases.
The present invention has been made in view of the above, and an object thereof is to provide a citation determination support apparatus and a citation determination support program that can improve the accuracy of determination while preventing increases in the development process and manufacturing cost by using a general-purpose determination algorithm.
To solve the problems described above and achieve the object, the citation determination support apparatus according to claim 1 is a citation determination support apparatus for supporting the determination of whether document data is cited in determination target data, the apparatus comprising: determination range specifying means for specifying, from the determination target data, a determination range in which the presence or absence of citation of the document data is to be determined; comparison range specifying means for specifying, from the document data, a comparison range to be compared with the determination target data; similarity calculation means for searching the comparison range specified by the comparison range specifying means for the description content of the determination range specified by the determination range specifying means, and calculating the mutual similarity between the description content of the determination range and the description content of the comparison range; document citation determination means for determining that the determination range cites the comparison range when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined threshold; and output means for outputting the determination range of the determination target data that cites the comparison range of the document data.
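The threshold test of claim 1 can be illustrated with a minimal Python sketch. The claim does not name a similarity measure, so the use of `difflib.SequenceMatcher` here is an assumption standing in for any general-purpose algorithm; the function names and the default threshold of 0.8 are likewise hypothetical.

```python
from difflib import SequenceMatcher


def similarity(judged: str, compared: str) -> float:
    """Return a similarity score in [0, 1] between two passages.

    SequenceMatcher is only a stand-in; the source does not specify a measure.
    """
    return SequenceMatcher(None, judged, compared).ratio()


def cited_ranges(judged_ranges, comparison_ranges, threshold=0.8):
    """Return (judged, compared, score) for every pair whose score meets the threshold."""
    hits = []
    for judged in judged_ranges:
        for compared in comparison_ranges:
            score = similarity(judged, compared)
            if score >= threshold:  # "equal to or greater than a predetermined threshold"
                hits.append((judged, compared, score))
    return hits
```

Each returned triple corresponds to a determination range that would be output as citing a comparison range.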
The citation determination support apparatus according to claim 2 is the citation determination support apparatus according to claim 1, further comprising legality determination means for determining, when the document citation determination means determines that the determination range cites the comparison range, whether the citation is a lawful citation based on the cited portion of the comparison range in the determination range and its vicinity.
The citation determination support apparatus according to claim 3 is the citation determination support apparatus according to claim 2, wherein the legality determination means determines whether citation source information identifying the document data that is the source of the cited portion is included in the determination target data.
The citation determination support apparatus according to claim 4 is the citation determination support apparatus according to claim 2 or 3, wherein, when the similarity in the determination range is equal to or greater than a predetermined threshold, the legality determination means determines whether the determination range conforms to a predetermined citation format and, based on the result, determines whether the citation of the comparison range in the determination range is a lawful citation.
The citation determination support apparatus according to claim 5 is the citation determination support apparatus according to claim 4, further comprising citation format storage means for storing the type of the determination target data and the predetermined citation format in association with each other, wherein the legality determination means identifies the type of the determination target data, acquires the citation format corresponding to the identified type from the citation format storage means, and determines whether the citation of the comparison range in the determination range conforms to the acquired citation format.
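As an illustration of the format lookup in claims 4 and 5, the following Python sketch stores one citation format per data type and checks a cited passage against it. The type names, the regular expressions, and the convention that a citation is a quoted passage followed by a marker are all assumptions made for this example; the source does not define concrete formats.

```python
import re

# Hypothetical citation formats keyed by data type (names and patterns are assumptions).
# Each pattern describes the marker expected right after the quoted passage.
CITATION_FORMATS = {
    "thesis": r"\s*\[\d+\]",                      # "quoted text" [3]
    "report": r"\s*\([A-Z][A-Za-z]+,\s*\d{4}\)",  # "quoted text" (Smith, 2008)
}


def conforms_to_format(data_type: str, text: str, cited_passage: str) -> bool:
    """Check whether the cited passage appears in the format stored for the data type."""
    suffix = CITATION_FORMATS.get(data_type)
    if suffix is None:
        return False  # no format stored for this type
    pattern = '"' + re.escape(cited_passage) + '"' + suffix
    return re.search(pattern, text) is not None
```

A dictionary keyed by type mirrors the association held by the citation format storage means; a real system would need formats agreed for each kind of document.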
The citation determination support apparatus according to claim 6 is the citation determination support apparatus according to any one of claims 1 to 5, further comprising reference information acquisition means for acquiring, based on the document data, reference information for referring to the document data including the comparison range when the document citation determination means determines that the determination range cites the comparison range, wherein the output means outputs the reference information acquired by the reference information acquisition means in addition to the determination range of the determination target data that cites the comparison range of the document data.
The citation determination support apparatus according to claim 7 is the citation determination support apparatus according to any one of claims 1 to 6, wherein the determination range specifying means specifies a predetermined constituent part, from among the constituent parts of the determination target data, as the determination range.
The citation determination support apparatus according to claim 8 is the citation determination support apparatus according to any one of claims 1 to 7, further comprising history storage means for storing, in association with creator identification information for uniquely identifying the creator of determination target data generated in the past, information indicating the presence or absence of improper citation in the determination target data or the grades of the creator, wherein, when there are a plurality of determination target data that can be determination targets, the determination range specifying means acquires from the history storage means the creator identification information associated with information indicating that improper citation occurred, or the creator identification information associated with creator grades lower than a predetermined value, and selects, from the plurality of determination target data, the determination target data created by the creator identified by the acquired creator identification information as the determination target.
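The history-based selection of claim 8 might be sketched as follows in Python. The record layout, the field names, and the choice to apply both criteria (past improper citation and low grades) at once are assumptions; the claim allows either criterion on its own.

```python
from dataclasses import dataclass


@dataclass
class HistoryRecord:
    """One hypothetical row of the history storage means."""
    creator_id: str
    had_improper_citation: bool
    grade: float


def select_targets(papers, history, grade_cutoff):
    """Pick the papers whose creator has a past violation or a grade below the cutoff.

    papers: dict mapping a file name to the creator_id of its author.
    """
    flagged = {
        record.creator_id
        for record in history
        if record.had_improper_citation or record.grade < grade_cutoff
    }
    return [name for name, creator in papers.items() if creator in flagged]
```

Only the selected files would then be passed on to determination range specification.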
The citation determination support apparatus according to claim 9 is the citation determination support apparatus according to any one of claims 1 to 8, further comprising: dictionary storage means for storing, in association with a word that can be included in the document data, a word that can be used when modifying that word; and word conversion means for converting a word included in the determination target data into a word stored in the dictionary storage means, wherein the determination range specifying means uses the determination target data converted by the word conversion means as the determination target.
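A minimal Python sketch of the word conversion in claim 9 is shown below. The dictionary contents and the whitespace-based word splitting are assumptions for illustration; Japanese text would need morphological analysis to split words.

```python
# Hypothetical dictionary: a word likely used as a substitute -> the word in the document data.
SUBSTITUTIONS = {
    "utilize": "use",
    "commence": "begin",
}


def convert_words(text: str) -> str:
    """Replace substitute words with the dictionary word before comparison."""
    return " ".join(SUBSTITUTIONS.get(word, word) for word in text.split())
```

Running the conversion before the similarity calculation lets a passage whose wording was lightly altered still match the original.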
The citation determination support apparatus according to claim 10 is the citation determination support apparatus according to any one of claims 1 to 9, further comprising input means for receiving operation input to the citation determination support apparatus, wherein the determination range specifying means specifies, as the determination range, a range of the determination target data designated via the input means.
The citation determination support apparatus according to claim 11 is the citation determination support apparatus according to any one of claims 1 to 10, further comprising determination target data storage means for storing a plurality of determination target data generated in the past, wherein the similarity calculation means further calculates the similarity among the plurality of determination target data stored in the determination target data storage means, the document citation determination means further determines that citation exists among the plurality of determination target data when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined second threshold, and the comparison range specifying means specifies, as the comparison range, the plurality of determination target data determined to contain citations among one another.
The citation determination support apparatus according to claim 12 is the citation determination support apparatus according to any one of claims 1 to 11, further comprising task extraction means for extracting, from the determination target data and based on its description content, task information indicating the task of the determination target data, wherein the comparison range specifying means searches the document data using the task information extracted by the task extraction means as a search key and specifies the retrieved document data as the comparison target.
The citation determination support apparatus according to claim 13 is the citation determination support apparatus according to any one of claims 1 to 12, further comprising document data storage means for storing document data that the document citation determination means has determined to be cited in the determination range, wherein the comparison range specifying means determines whether citation source information identifying the document data cited in the determination target data is included in the determination target data; when it determines that the citation source information is included, determines whether the document data identified based on the citation source information is stored in the document data storage means; and, when it determines that the document data is stored in the document data storage means, specifies that document data as the comparison range.
The citation determination support apparatus according to claim 14 is the citation determination support apparatus according to any one of claims 1 to 13, wherein, when the similarity calculation means searches the comparison range specified by the comparison range specifying means using the description content of the determination range specified by the determination range specifying means as a search key, and the number of characters of the search key exceeds a predetermined limited number of characters, the similarity calculation means sequentially designates characters within the limited number of characters from the determination range as the search key, searches the comparison range a plurality of times, and takes the search results whose appearance frequency among the plurality of search results is greater than a predetermined value as the comparison range for which the mutual similarity with the description content of the determination range is calculated.
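The repeated search of claim 14 can be sketched in Python as below. The `search` argument is a hypothetical callable that takes a key and returns identifiers (for example URLs) of matching comparison ranges; it stands in for whatever search engine is used and is not an API named in the source. Splitting the text into consecutive, non-overlapping chunks is also an assumption about how keys are "sequentially designated".

```python
from collections import Counter


def chunked_search(text: str, limit: int, search) -> Counter:
    """Search with consecutive keys of at most `limit` characters and count the hits."""
    counts = Counter()
    for start in range(0, len(text), limit):
        key = text[start:start + limit]
        counts.update(set(search(key)))  # count each result once per search
    return counts


def frequent_results(counts: Counter, min_count: int):
    """Return results that appeared more than `min_count` times, most frequent first."""
    return [result for result, n in counts.most_common() if n > min_count]
```

The results returned by `frequent_results` would then serve as the comparison ranges for the similarity calculation.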
The citation determination support apparatus according to claim 15 is the citation determination support apparatus according to any one of claims 1 to 14, wherein the similarity calculation means analyzes the determination range, uses each word that appears a predetermined number of times or more as a search key, searches the comparison range specified by the comparison range specifying means a plurality of times, once for each word, and takes the search results whose appearance frequency among the plurality of search results is greater than a predetermined value as the comparison range for which the mutual similarity with the description content of the determination range is calculated.
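The word-frequency variant of claim 15 might look like the following Python sketch. A simple regular-expression tokenizer replaces morphological analysis here, which is an assumption that only suits space-delimited text, and `search` is again a hypothetical callable returning identifiers of matching comparison ranges.

```python
import re
from collections import Counter


def frequent_words(text: str, min_count: int, max_len: int):
    """Return words of at most `max_len` characters that occur at least `min_count` times."""
    words = [w for w in re.findall(r"\w+", text.lower()) if len(w) <= max_len]
    freq = Counter(words)
    return [word for word, n in freq.most_common() if n >= min_count]


def rank_comparison_ranges(text: str, min_count: int, max_len: int, search):
    """Search once per frequent word and rank the results by how often they were returned."""
    hits = Counter()
    for word in frequent_words(text, min_count, max_len):
        hits.update(set(search(word)))
    return hits.most_common()
```

The top-ranked entries correspond to the search results with the highest appearance frequency, which become the object of similarity calculation.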
The citation determination support apparatus according to claim 16 is the citation determination support apparatus according to any one of claims 1 to 15, further comprising input means for receiving input of the predetermined threshold, wherein the document citation determination means determines that the determination range cites the comparison range when the similarity is equal to or greater than the predetermined threshold input via the input means.
The citation determination support apparatus according to claim 17 is the citation determination support apparatus according to any one of claims 1 to 16, further comprising citation ratio calculation means for calculating a citation ratio, that is, the proportion of the description content of the determination range that is occupied by description content cited from the comparison range, wherein the output means outputs the citation ratio.
The citation determination support apparatus according to claim 18 is the citation determination support apparatus according to claim 17, wherein the citation ratio calculation means calculates the citation ratio for each of a plurality of the determination target data, and the output means outputs determination target data information uniquely identifying the plurality of determination target data in an order based on the citation ratio calculated for each determination target data by the citation ratio calculation means.
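The citation ratio of claims 17 and 18 can be illustrated with a short Python sketch. Measuring the ratio by character count and listing the data in descending order of ratio are assumptions; the claims only require a ratio and an order based on it.

```python
def citation_ratio(ranges, cited) -> float:
    """Return the share of characters that belong to ranges judged as cited.

    ranges: list of passages in the determination range.
    cited:  parallel list of booleans, True where the passage was judged cited.
    """
    total = sum(len(passage) for passage in ranges)
    if total == 0:
        return 0.0
    quoted = sum(len(passage) for passage, flag in zip(ranges, cited) if flag)
    return quoted / total


def rank_by_ratio(ratios):
    """Return data identifiers ordered from the highest citation ratio to the lowest."""
    return sorted(ratios, key=ratios.get, reverse=True)
```

With ratios computed per file, `rank_by_ratio` gives the order in which file names could be listed on the display.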
The citation determination support apparatus according to claim 19 is the citation determination support apparatus according to any one of claims 1 to 18, further comprising output mode information storage means for storing the similarity of the determination range and the output mode used by the output means in association with each other, wherein the output means acquires the output mode corresponding to the similarity calculated by the similarity calculation means from the output mode information storage means and outputs the determination range in the acquired output mode.
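The similarity-to-output-mode association of claim 19 might be held as a small lookup table, as in this Python sketch. The similarity bands and the colors are purely hypothetical values chosen for the example.

```python
# Hypothetical output mode table: (minimum similarity, highlight color), highest band first.
OUTPUT_MODES = [
    (0.9, "red"),
    (0.7, "orange"),
    (0.5, "yellow"),
]


def output_mode(similarity: float):
    """Return the highlight color for a similarity score, or None below every band."""
    for floor, color in OUTPUT_MODES:
        if similarity >= floor:
            return color
    return None
```

Keeping the bands in a table rather than in code lets the display style be changed without touching the determination logic.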
The citation determination support program according to claim 20 is a citation determination support program for supporting the determination of whether document data is cited in determination target data, the program causing a computer to function as: determination range specifying means for specifying, from the determination target data, a determination range in which the presence or absence of citation of the document data is to be determined; comparison range specifying means for specifying, from the document data, a comparison range to be compared with the determination target data; similarity calculation means for searching the comparison range specified by the comparison range specifying means for the description content of the determination range specified by the determination range specifying means, and calculating the mutual similarity between the description content of the determination range and the description content of the comparison range; document citation determination means for determining that the determination range cites the comparison range when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined threshold; and output means for outputting the determination range of the determination target data that cites the comparison range of the document data.
 この請求項1に記載の引用判定支援装置によれば、判定範囲と比較範囲を自動的に限定した上で類似度の判定を行なうので、汎用的な判定アルゴリズムを利用して開発工程および製造コストの増大を防止しつつ、判定の精度を向上させることができるという効果を奏する。 According to the citation judging support device of the first aspect, since the judgment of the similarity is performed after the judgment range and the comparison range are automatically limited, the development process and the manufacturing cost can be performed using a general-purpose judgment algorithm. There is an effect that the accuracy of the determination can be improved while preventing the increase of
 また、請求項2に記載の引用判定支援装置によれば、引用が著作権法で規定する適法な引用か否かを容易に判断することができ、判定の適法性を容易に判別することができる。 Further, according to the citation judging support device described in claim 2, it can be easily judged whether the citation is a legitimate citation prescribed by the copyright law, and it can be easily judged the legality of the judgment. it can.
 また、請求項3に記載の引用判定支援装置によれば、引用箇所の引用元である文献データを特定する引用元情報が判定対象データに含まれているか否かを判定するので、引用元情報の有無に基づいて引用の適法性を判定する際の判断材料を取得できる。 Further, according to the quotation judgment support device of claim 3, it is judged whether the quotation source information specifying the document data which is the quotation source of the cited part is included in the judgment object data or not. It is possible to obtain the judgment material when judging the legitimacy of citation based on the presence or absence of
 また、請求項4に記載の引用判定支援装置によれば、判定範囲が所定の引用形式に合致するか否かを判定し、当該判定結果に基づいて、当該判定範囲における比較範囲の引用が適法な引用であるか否かを判定するので、予め設定した引用形式に基づき、引用の適法性を容易に判定することができる。 Further, according to the quoting determination support device of claim 4, it is determined whether or not the determination range conforms to a predetermined citation form, and quoting of the comparison range in the determination range is appropriate based on the determination result. It is possible to easily determine the legitimacy of the citation on the basis of a preset citation format, because it is determined whether or not the citation is a citation.
 According to the citation determination support device of claim 5, the citation format corresponding to the type of the determination target data is acquired from the citation format storage means, and it is determined whether the citation conforms to the acquired format. The legality of a citation can therefore be determined using a citation format that differs for each type of determination target data.
 According to the citation determination support device of claim 6, the documents cited by the document data are automatically identified, and citation determination is performed after those cited documents have been added to the determination range of the determination target data. Improper citation of a cited document can therefore be easily detected.
 According to the citation determination support device of claim 7, the portions of the determination target data that are most likely to contain unauthorized citations can be set as the determination range, which further improves determination accuracy.
 According to the citation determination support device of claim 8, the determination target data of a person with a high probability of making improper citations can be automatically set as the determination target. Determination can thus take into account the possibility that the misconduct will recur, which further improves determination accuracy.
 According to the citation determination support device of claim 9, a citation can be detected even when the document data has been modified rather than used verbatim before being improperly cited. Determination accuracy can therefore be further improved while increases in development effort and manufacturing cost are avoided.
 According to the citation determination support device of claim 10, the range designated via the input means is specified as the determination range within the determination target data. The target of citation determination can thus be limited, which reduces the load of the determination process.
 According to the citation determination support device of claim 11, document data that is likely to cite another person's document data can be automatically set as the comparison range. Determination accuracy can therefore be further improved while increases in development effort and manufacturing cost are avoided.
 According to the citation determination support device of claim 12, an appropriate comparison range matching the description content of the determination target data can be set automatically. Determination accuracy can therefore be further improved while increases in development effort and manufacturing cost are avoided.
 According to the citation determination support device of claim 13, document data that the document citation determination means has determined to be cited in a determination range is stored in the document data storage means. In addition, when the determination target data contains citation source information identifying document data, and the document data identified by that citation source information is determined to be stored in the document data storage means, that document data is specified as the comparison range. The comparison range can thus be limited to document data already held in the document data storage means, which reduces the load of searching the comparison range for the content of the determination range.
 According to the citation determination support device of claim 14, a search can be executed even when the paper data to be examined exceeds the character limit of the search key. Moreover, each part of the paper data is searched in turn, so the entire paper data is ultimately covered by the search, which improves the accuracy of citation determination.
 According to the citation determination support device of claim 15, search results that appear with high frequency are automatically identified and automatically set as the comparison range used for similarity calculation. A comparison range matching the determination range can thus be extracted automatically for citation determination, which further improves its accuracy.
 According to the citation determination support device of claim 16, the determination range is determined to cite the comparison range when the similarity is equal to or greater than a predetermined threshold entered via the input means. A threshold suited to the purpose of the determination can therefore be set and used.
 According to the citation determination support device of claim 17, the citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range, is calculated and output. Material for judging the legality of the citation can thus be presented.
 According to the citation determination support device of claim 18, the citation ratio is calculated for a plurality of items of determination target data, and the determination target data information is output in an order based on the citation ratio of each item. Material for comparing the legality of citations across the items, based on their citation ratios, can thus be presented.
 According to the citation determination support device of claim 19, the output mode corresponding to the similarity calculated by the similarity calculation means is acquired from the output mode information storage means, and the determination range is output in that mode. The determination range can thus be output in a form that lets the user readily grasp the similarity.
 According to the citation determination support program of claim 20, similarity is determined after the determination range and the comparison range have been automatically narrowed. Determination accuracy can therefore be improved while a general-purpose determination algorithm is used, which avoids increases in development effort and manufacturing cost.
A block diagram functionally and conceptually showing a system configuration including the citation determination support device according to the first embodiment.
A flowchart showing the procedure of the citation determination support process of the first embodiment.
A schematic diagram showing an example of the citation determination screen.
An explanatory diagram showing an example of the body portion specified as the determination range in paper data.
A schematic diagram showing an example of the citation determination screen on which the content of the determination range is displayed.
A flowchart showing the procedure of the comparison range specifying process.
A schematic diagram showing the citation determination screen with cited passages highlighted within the determination range.
A schematic diagram showing the citation determination screen with reference information displayed.
A block diagram showing the functional configuration of the citation determination support device according to the second embodiment.
An explanatory diagram showing an example of history data.
A flowchart showing the procedure of the citation determination support process of the second embodiment.
A flowchart showing the procedure of the determination target specifying process of the second embodiment.
A block diagram showing the functional configuration of the citation determination support device according to the third embodiment.
An explanatory diagram showing an example of a specialized dictionary.
A flowchart showing the procedure for specifying the determination target in the third embodiment.
A block diagram showing the functional configuration of the citation determination support device according to the fourth embodiment.
A flowchart showing the procedure of the comparison determination process of the fourth embodiment.
A block diagram showing the functional configuration of the citation determination support device according to the fifth embodiment.
A flowchart showing the procedure of the comparison range specifying process of the fifth embodiment.
A block diagram showing the functional configuration of the citation determination support device according to the sixth embodiment.
A flowchart showing the procedure of the search process in the similarity calculation of the sixth embodiment.
A flowchart showing the procedure of the similarity calculation process of Modification 1.
A flowchart showing the procedure of the comparison range specifying process of Modification 2.
A block diagram showing the functional configuration of the citation determination support device according to the seventh embodiment.
A table showing the information stored in the citation format DB.
A flowchart showing the procedure of the citation determination support process of the seventh embodiment.
A flowchart showing the procedure of the citation format setting process.
A diagram illustrating the citation format setting input screen.
A flowchart showing the procedure of the legality determination process.
A block diagram showing the functional configuration of the citation determination support device according to the eighth embodiment.
A table illustrating the information stored in the citation ratio DB.
A flowchart showing the procedure of the citation determination support process of the eighth embodiment.
A diagram illustrating the citation determination screen when the citation ratio is output and displayed.
A flowchart showing the procedure of the list display process.
A diagram showing the determination result screen that lists paper data information in descending order of total citation ratio.
A block diagram showing the functional configuration of the citation determination support device according to the ninth embodiment.
A table illustrating the information stored in the output mode DB.
A flowchart showing the procedure of the citation determination support process of the ninth embodiment.
A diagram showing paper data displayed on the citation determination screen on the display device.
A diagram showing the citation determination screen on which citation source information is displayed.
A flowchart showing the procedure of the citation determination support process of the tenth embodiment.
A flowchart showing the procedure of the comparison determination process of the tenth embodiment.
Explanation of Reference Signs
100, 900, 1300, 1600, 1800, 2000 Citation determination support device
101 Storage unit
101a Document data storage unit
101b Document list storage unit
101c Paper data storage unit
101d History data storage unit
101e Dictionary storage unit
101f Citation format DB
101g Citation ratio DB
101h Output mode DB
102 Control unit
102a, 102i, 102j Determination range specifying unit
102b, 102l, 102q Comparison range specifying unit
102c, 102m, 102r Similarity calculation unit
102d, 102n Document citation determination unit
102e Legality determination unit
102f Reference information acquisition unit
102g Input control unit
102h Output control unit
102k Word conversion unit
102p Problem extraction unit
102s Citation format setting unit
102t Citation ratio calculation unit
103 Input device
104 Display device
105 Paper data display area
106 Range setting slider
106a Upper range setting slider
106b Lower range setting slider
107 Overall view
108 Document data display area
130 Internet
131 WEB site
133 File server
 Embodiments of the citation determination support device and the citation determination support program according to the present invention are described in detail below with reference to the accompanying drawings. The configuration of the embodiment is described first, followed by its processing, and finally modifications of the embodiment. The present invention is not, however, limited by these embodiments.
First Embodiment
 The first embodiment is described first. In this embodiment, the component of the paper data that is most likely to cite third-party documents is automatically selected as the determination range.
 FIG. 1 is a block diagram functionally and conceptually showing a system configuration including the citation determination support device according to the first embodiment. As shown in FIG. 1, the citation determination support device 100 is communicably connected to a WEB site 131 and a file server 133 via an arbitrary network such as the Internet 130. The WEB site 131 and the file server 133 can be configured in a conventional manner, so a detailed description of them is omitted.
(Configuration)
 As shown in FIG. 1, the citation determination support device 100 comprises a storage unit 101 and a control unit 102 connected by a bus, and further includes an input device 103 and a display device 104.
 The storage unit 101 is storage means that holds the various programs and data needed to control the citation determination support device 100, and consists of a storage medium such as a hard disk drive (HDD) or memory. In particular, the citation determination support program, stored on a recording medium (not shown) and read by a reading device (not shown), is installed in the storage unit 101. The storage unit 101 is provided with a document data storage unit 101a, a document list storage unit 101b, and a paper data storage unit 101c.
 The document data storage unit 101a stores document data that may be the source of citations in paper data. In this embodiment, document data is stored in the document data storage unit 101a of the citation determination support device 100 and is also held on the WEB site 131 and the file server 133.
 The document list storage unit 101b stores a list of bibliographic information for the document data recorded in the document data storage unit 101a and for document data on the Internet. The bibliographic information includes the document name, the storage location such as a URL (Uniform Resource Locator) or folder name, the file name, the creator, and the creation date.
 The paper data storage unit 101c stores the paper data subject to citation determination in association with the student register number that identifies the student who created it. The paper data storage unit 101c holds the paper data currently subject to citation determination, received from the student's terminal, as well as all paper data submitted and examined in the past, each associated with the student register number of its creator.
 The control unit 102 is control means that controls the citation determination support device 100. Functionally and conceptually, it comprises a determination range specifying unit 102a, a comparison range specifying unit 102b, a similarity calculation unit 102c, a document citation determination unit 102d, a legality determination unit 102e, a reference information acquisition unit 102f, an input control unit 102g, and an output control unit 102h. The specific configuration of the control unit 102 is arbitrary. For example, it may comprise a control program such as an OS (Operating System), embedded programs defining various processing procedures, an internal memory for storing required data, and a CPU (Central Processing Unit) that executes these programs.
 The determination range specifying unit 102a is determination range specifying means that specifies, within the paper data stored in the paper data storage unit 101c, the determination range to be examined for citation of document data.
 The comparison range specifying unit 102b is comparison range specifying means that specifies the document data and the like to serve as the comparison range against the determination range of the paper data.
 The similarity calculation unit 102c is similarity calculation means that uses the description content of the determination range specified by the determination range specifying unit 102a as a search key, searches the comparison range of the document data and past paper data specified by the comparison range specifying unit 102b (hereinafter, "document data etc."), and calculates the similarity between the two.
 The document citation determination unit 102d is document citation determination means that determines that the determination range of the paper data cites the document data etc. of the comparison range when the similarity calculated by the similarity calculation unit 102c is equal to or greater than a predetermined threshold.
 The legality determination unit 102e is legality determination means. When the document citation determination unit 102d has determined that the determination range of the paper data cites the document data etc. of the comparison range, it determines whether the citation is lawful based on the passage in the determination range that cites the document data etc. and on the text near that passage.
 The reference information acquisition unit 102f is reference information acquisition means. When the determination range of the paper data has been determined to cite the document data etc. of the comparison range, it acquires reference information for consulting the document data etc., such as its name or title, URL, or folder name, from the attributes of the document data etc.
 The input control unit 102g is input control means that accepts events generated by operation input from the input device 103 and controls that operation input.
 The output control unit 102h is output control means that controls the display of various screens on the display device 104. The output control unit 102h displays on the display device 104 a citation determination screen (described later) that shows the determination range, the determination range of the paper data that cites the comparison range of the document data etc., and the reference information described above.
 The input device 103 is input means such as a keyboard or a pointing device such as a mouse.
 The display device 104 is output means such as a monitor.
(Processing)
 Next, the citation determination support process executed by the citation determination support device 100 of the first embodiment, configured as described above, is explained. FIG. 2 is a flowchart showing the procedure of the citation determination support process of the first embodiment.
 When the user instructs execution of the citation determination support process by a predetermined method via the input device 103, the output control unit 102h first displays the citation determination screen on the display device 104. FIG. 3 is a schematic diagram showing an example of the citation determination screen. Clicking the "Simple" button on this screen causes the determination range to be specified. Clicking the "Details" button displays a screen (not shown) for configuring various search settings, such as the search database that manages the document data etc. in the document data storage unit 101a, the language, the period in which the cited document data was created, keywords, and the creator.
 To specify the determination range, the determination range specifying unit 102a reads the created paper data from the paper data storage unit 101c (step S11). The determination range specifying unit 102a then analyzes the structure of the paper data by a known method (step S12) and obtains the components that make up the paper: the introductory portion (for example, an "Introduction" section), the body portion, and the closing portion (for example, "Conclusion" or "Acknowledgements" sections). Because the body portion is the main part of the paper data and is the component most likely to cite third-party documents, the determination range specifying unit 102a specifies the body portion, among the components obtained by the structural analysis, as the determination range (step S13).
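The selection of the body portion in steps S12 and S13 could be sketched in Python as follows. This is only an illustration under stated assumptions: the source says merely that a known structural analysis method is used, so the `# ` heading marker, the `EXCLUDED_HEADINGS` set, and the function names `split_sections` and `determination_range` are all hypothetical.

```python
# Hypothetical: headings treated as introductory or closing portions.
EXCLUDED_HEADINGS = {"introduction", "conclusion", "acknowledgements"}


def split_sections(paper: str) -> list[tuple[str, str]]:
    """Split a paper into (heading, body) pairs.

    Assumes, for illustration only, that each heading is a line
    starting with "# ".
    """
    sections: list[tuple[str, str]] = []
    heading, lines = "", []
    for line in paper.splitlines():
        if line.startswith("# "):
            if heading or lines:
                sections.append((heading, "\n".join(lines).strip()))
            heading, lines = line[2:].strip(), []
        else:
            lines.append(line)
    if heading or lines:
        sections.append((heading, "\n".join(lines).strip()))
    return sections


def determination_range(paper: str) -> str:
    """Return the body portion: every section that is not excluded."""
    return "\n".join(
        body
        for heading, body in split_sections(paper)
        if heading.lower() not in EXCLUDED_HEADINGS and body
    )
```

A real structural analysis would depend on the file format of the submitted paper data; the point of the sketch is only that the determination range is derived from the document structure rather than chosen by hand.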
 FIG. 4 is an explanatory diagram showing an example of the body portion specified as the determination range in paper data. In a report such as the one shown in FIG. 4, the content entered in the answer field is the component corresponding to the body, so the determination range specifying unit 102a specifies the content of the answer field as the determination range. Once the determination range has been specified, the output control unit 102h displays its content in the determination range field of the citation determination screen, as shown in FIG. 5.
 Returning to FIG. 2, once the determination range has been specified, the user clicks the "Execute Search" button on the citation determination screen shown in FIG. 5. When this button is clicked, the input control unit 102g receives the event, and the comparison range specifying unit 102b then performs the comparison range specifying process (step S14).
 FIG. 6 is a flowchart showing the procedure of the comparison range specifying process. The comparison range specifying unit 102b first reads all previously submitted paper data stored in the paper data storage unit 101c (step S21). Next, it reads all document data listed in the document list stored in the document list storage unit 101b, from the document data storage unit 101a and from the Internet 130 (step S22). The comparison range specifying unit 102b then specifies all of the paper data read and the document data acquired (the document data etc.) as the comparison range (step S23).
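Steps S21 to S23 amount to assembling one collection from two sources. A minimal Python sketch, assuming that documents are plain strings, that the document list holds storage locations, and that remote retrieval is supplied by the caller (the names `build_comparison_range`, `local_documents`, and `fetch_remote` are hypothetical and not taken from the source):

```python
from typing import Callable


def build_comparison_range(
    past_papers: list[str],
    document_list: list[str],
    local_documents: dict[str, str],
    fetch_remote: Callable[[str], str],
) -> list[str]:
    """Combine past paper data with every document named in the list.

    past_papers      -- previously submitted paper data (step S21)
    document_list    -- storage locations from the document list
    local_documents  -- location -> text for locally stored documents
    fetch_remote     -- hypothetical callable that retrieves a document
                        not held locally (for example, from a web site)
    """
    documents = [
        local_documents[loc] if loc in local_documents else fetch_remote(loc)
        for loc in document_list
    ]  # step S22
    return list(past_papers) + documents  # step S23
```

Passing the retrieval function in as a parameter keeps the sketch self-contained and avoids assuming any particular network library.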
 Returning to FIG. 2, once the comparison range has been specified, the similarity calculation unit 102c searches the data of the specified comparison range using the description content of the specified determination range as the search key (step S15), and calculates the similarity between that content and the description content of the comparison range (step S16). Specifically, the similarity calculation unit 102c is a search program or search engine based on known search technology, or it passes the search key to such a search program or search engine and instructs it to run the search. The similarity is calculated with known logic, for example by parsing the description content of the determination range of the paper data and the description content of the document data and quantifying the degree of match between their words and phrases. The document citation determination unit 102d then determines whether the determination range cites a document in the comparison range by checking whether the calculated similarity is equal to or greater than a predetermined threshold (step S17).
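The source leaves the similarity logic to known techniques, so the following Python sketch is only one possible instance of steps S16 and S17. It assumes a simple word-overlap measure: the fraction of distinct words in the determination range that also appear in the comparison text. The functions `tokenize`, `similarity`, and `cites`, and the default threshold of 0.8, are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def similarity(range_text: str, comparison_text: str) -> float:
    """Fraction of distinct words in range_text found in comparison_text."""
    range_words = set(tokenize(range_text))
    if not range_words:
        return 0.0
    shared = range_words & set(tokenize(comparison_text))
    return len(shared) / len(range_words)


def cites(range_text: str, comparison_text: str, threshold: float = 0.8) -> bool:
    """Step S17: a citation is assumed when similarity >= threshold."""
    return similarity(range_text, comparison_text) >= threshold
```

Word-level tokenization with `\w+` suits space-delimited languages; Japanese text would need a morphological analyzer in place of `tokenize`.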
 If the calculated similarity is smaller than the predetermined threshold (step S17, No), it is determined that the determination range does not cite the document data etc. of the comparison range, and the process ends. The predetermined threshold can be set arbitrarily according to the accuracy required of the citation determination.
On the other hand, if the calculated similarity is equal to or greater than the predetermined threshold (step S17, Yes), the determination range is judged to quote the document data, etc. in the comparison range, and the legality determination unit 102e then determines whether the quotation is lawful (step S18). Here, a "lawful" quotation is a concept that includes a quotation being lawful under copyright law, or satisfying requirements set in advance by the user. Specifically, the legality determination unit 102e determines that a quoted portion has been lawfully quoted under copyright law in any of the following cases: the title of the book is shown just below the quoted portion of the document data, etc. within the determination range; quotation brackets 「」 appear immediately before and after the quoted portion; or the quoted portion is displayed in a font different from that of the other portions to mark it as a quotation. In addition, the quoted portion may be judged to be lawfully quoted when a predetermined indication that supports the lawfulness of the quotation (for example, a creator's name, an author's name, or a publisher's name) appears just below the quoted portion.
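The checks of step S18 might be sketched as follows in Python. This is an illustration under stated assumptions, not the implementation of the legality determination unit 102e: the function name, the `window` of 80 characters standing in for "just below", and the `attribution_markers` list (book titles, author names, publisher names supplied by the caller) are all hypothetical. The font check is omitted because plain text carries no font information, and any single condition is treated as sufficient, which is one reading of the text.

```python
def is_lawful_quotation(text: str, start: int, end: int,
                        attribution_markers: list[str],
                        window: int = 80,
                        open_mark: str = "「",
                        close_mark: str = "」") -> bool:
    """Return True if the quoted span text[start:end] looks properly marked.

    Condition 1: quotation brackets sit immediately before and after the span.
    Condition 2: an attribution marker appears within `window` characters
                 after the span (a rough stand-in for "just below").
    """
    has_brackets = (
        start > 0
        and text[start - 1] == open_mark
        and end < len(text)
        and text[end] == close_mark
    )
    following = text[end:end + window]
    has_attribution = any(marker in following for marker in attribution_markers)
    return has_brackets or has_attribution
```

A span that satisfies neither condition would proceed to step S19, where reference information for the source is collected.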
If the quotation in the determination range is determined to be lawful (step S18, Yes), the processing ends.
On the other hand, if the quotation in the determination range is determined not to be lawful (step S18, No), the reference information acquisition unit 102f acquires reference information for referring to the document data, etc. (the quoted document data or the quoted paper data), such as its file name, title, URL, or folder name, from the attributes of the document data or of the quoted paper data (step S19). The output control unit 102h then marks, on the quotation determination screen, the places within the determination range that quote the document data, etc., and displays the reference information (step S20).
The processing of steps S15 to S20 is repeated for all the data in the specified comparison range (step S20a, No). Once it has been executed for all the data in the specified comparison range (step S20a, Yes), the processing ends.
In the present embodiment, the output control unit 102h first highlights the quoted places within the determination range by changing their color, inverting them, or the like. FIG. 7 is a schematic diagram showing the quotation determination screen with the quoted places highlighted within the determination range. In FIG. 7, the bold, underlined portions are the highlighted portions, that is, the quoted places.
When the user points to a quoted place via the input device 103, the input control unit 102g accepts the instruction, and the output control unit 102h performs control so that the reference information is displayed at the indicated place.
FIG. 8 is a schematic diagram showing the quotation determination screen with reference information displayed. The example in FIG. 8 shows a case in which document data on the Internet 130 has been quoted, and the URL of the document data is displayed as the reference information. In the present embodiment, when the user clicks this URL with the pointing device of the input device 103, the output control unit 102h accesses the web page indicated by the URL and displays the source document data, etc. This allows a professor or other person who judges quotations in paper data to obtain the source document data easily.
(Effects)
As described above, the quotation determination support apparatus 100 of the first embodiment automatically limits both the determination range of the paper data and the document data, etc. of the comparison range before judging similarity, so quotation determination can be performed with a general-purpose determination algorithm such as generic similarity calculation. The present embodiment therefore improves determination accuracy while preventing increases in development effort and manufacturing cost.
In addition, in the quotation determination support apparatus 100 of the first embodiment, the determination range specifying unit 102a specifies, from among the component parts of the paper data, the body text, which is prone to being quoted without permission, as the determination range. This further improves determination accuracy.
In addition, in the quotation determination support apparatus 100 of the first embodiment, when the determination range is determined to quote the comparison range, the legality determination unit 102e determines whether the quotation is lawful based on the place in the determination range where the comparison range is quoted and on its vicinity. This makes it easy to judge whether a quotation of document data, etc. is a lawful quotation as defined by copyright law, and improves determination accuracy.
In addition, in the quotation determination support apparatus 100 of the first embodiment, the reference information acquisition unit 102f acquires, based on the document data, reference information for referring to the document data that contains the comparison range. The acquired reference information is output together with the determination range of the determination target data that quotes the comparison range of the document data, so the document data can be referred to easily.
Second Embodiment
Next, the second embodiment is described. In this embodiment, paper data of students who have quoted improperly in the past, or whose grades are low, is selected as the determination target. Except where otherwise noted, the configuration and processing of the second embodiment are the same as those of the first embodiment; for identical configuration and processing, the same names or reference numerals as in the first embodiment are used as needed and their description is omitted.
(Configuration)
FIG. 9 is a block diagram showing the functional configuration of the quotation determination support apparatus according to the second embodiment. This quotation determination support apparatus 900 differs from the quotation determination support apparatus 100 of the first embodiment in that the storage unit 101 includes a history data storage unit 101d and the control unit 102 includes a determination range specifying unit 102i.
The history data storage unit 101d is a storage medium, such as a memory or an HDD, that stores history data on paper data created in the past. FIG. 10 is an explanatory diagram showing an example of the history data. For every piece of paper data created in the past, the history data associates the creation date of the paper data, the student register number that uniquely identifies the student who wrote it, an improper-quotation field indicating whether the paper data contained improper quotation, and the student's grade (A, B, C, or D, where A is the best, followed by B, C, and D in that order). For the paper data of a student who quoted improperly in the past, the improper-quotation field is set to "yes". As in the first embodiment, the past paper data itself is stored in the paper data storage unit 101c in association with the student register number.
The determination range specifying unit 102i refers to this history data and, treating the students concerned as persons with a high probability of quoting improperly, acquires the student register numbers whose improper-quotation field is "yes" and the student register numbers whose grade is C or lower (that is, C or D). From among the multiple pieces of submitted paper data (stored in the paper data storage unit 101c), it selects as determination targets the paper data submitted by the students with the acquired student register numbers. As in the first embodiment, the determination range specifying unit 102i also specifies the body text, from among the component parts of the paper data selected as the determination target, as the determination range.
(Processing)
Next, the quotation determination support processing performed by the quotation determination support apparatus 900 of the second embodiment, configured as described above, is explained. FIG. 11 is a flowchart showing the procedure of the quotation determination support processing of the second embodiment.
When the user clicks the Simple button on the quotation determination screen shown in FIG. 3, which is displayed on the display device 104 as in the first embodiment, the determination range specifying unit 102i first performs the process of specifying the determination target (step S31). The details of this process are described later.
When the process of specifying the determination target is complete, the determination range is specified for the paper data of the targeted students in the same way as in the first embodiment (steps S32 and S33), and quotation determination then proceeds as in the first embodiment (steps S34 to S40a).
Next, the process of specifying the determination target in step S31 is described in detail. FIG. 12 is a flowchart showing the procedure for specifying the determination target in the second embodiment.
First, the determination range specifying unit 102i reads the created paper data and the student register number corresponding to it from the paper data storage unit 101c (step S41). Next, referring to the history data stored in the history data storage unit 101d, it reads the improper-quotation field and the grade that correspond to the student register number it has read (step S42).
The determination range specifying unit 102i then determines whether the improper-quotation field read from the history data is "yes" (step S43). If it is "yes" (step S43, Yes), the paper data created by the student with this student register number, that is, the paper data read in step S41, is specified as a determination target (step S45).
On the other hand, if the improper-quotation field is "no" in step S43 (step S43, No), the determination range specifying unit 102i further determines whether the grade read from the history data is C or lower, that is, C or D (step S44).
If the grade is C or D (step S44, Yes), the paper data created by the student with this student register number is specified as a determination target (step S45).
On the other hand, if the grade in step S44 is higher than C, that is, A or B (step S44, No), the paper data read in step S41 is not treated as a determination target.
When there are multiple pieces of created paper data to be judged for quotation, steps S41 to S45 are performed for each of them to specify the paper data that will be determination targets.
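The selection of steps S41 to S45 can be sketched in Python as follows. The record layout follows the history data described for FIG. 10, but the class name, field names, and function name are hypothetical. How the apparatus treats a student with no history record is not stated; this sketch simply skips such students, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class HistoryRecord:
    student_id: str               # student register number
    had_improper_quotation: bool  # improper-quotation field ("yes" -> True)
    grade: str                    # "A", "B", "C", or "D"


LOW_GRADES = {"C", "D"}  # "C or lower" in the text


def select_targets(papers: dict[str, str],
                   history: dict[str, HistoryRecord]) -> list[str]:
    """Return the student IDs whose papers become determination targets."""
    targets = []
    for student_id in papers:                  # S41: read each paper and its ID
        record = history.get(student_id)       # S42: look up the history data
        if record is None:
            continue                           # assumption: no history, no check
        if record.had_improper_quotation:      # S43
            targets.append(student_id)         # S45
        elif record.grade in LOW_GRADES:       # S44
            targets.append(student_id)         # S45
    return targets
```

Dropping either branch of the `if`/`elif` gives the variant in which only one of the two checks is used.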
(Effects)
As described above, the quotation determination support apparatus 900 of the second embodiment selects, from among multiple pieces of paper data, the paper data created by students whose student register numbers correspond in the history data to an improper-quotation field of "yes", indicating improper quotation in the past, or to a grade of C (the predetermined value) or lower. The paper data of persons with a high probability of quoting improperly can thus be made the determination target, which further improves determination accuracy. Limiting the determination targets to paper data with a high probability of improper quotation also reduces the processing load and raises determination efficiency.
In the present embodiment, the determination range specifying unit 102i checks both whether the history data shows improper quotation in the past and whether the grade is at or below the predetermined value. It may instead be configured to specify the paper data to be judged based on only one of these checks.
Third Embodiment
Next, the third embodiment is described. This embodiment addresses the case in which a source document is quoted improperly in a paper after some of its words have been altered: the altered words are converted back to the words used before alteration, and similarity is then judged. Except where otherwise noted, the configuration and processing of the third embodiment are the same as those of the second embodiment; for identical configuration and processing, the same names or reference numerals as in the second embodiment are used as needed and their description is omitted.
(Configuration)
FIG. 13 is a block diagram showing the functional configuration of the quotation determination support apparatus according to the third embodiment. This quotation determination support apparatus 1300 differs from the quotation determination support apparatus 900 of the second embodiment in that the storage unit 101 includes a dictionary storage unit 101e and the control unit 102 includes a determination range specifying unit 102j and a word conversion unit 102k.
The dictionary storage unit 101e is a storage medium, such as an HDD or a memory, that stores a specialized dictionary in which technical terms in the technical field of the paper data are registered in association with one or more terms that may be used in connection with them. FIG. 14 is an explanatory diagram showing an example of the specialized dictionary. As shown in FIG. 14, the specialized dictionary associates each term that may appear in the document data, etc. with related terms that may be used, such as a first candidate term and a second candidate term. This specialized dictionary is used when the word conversion unit 102k, described later, modifies words in the paper data.
The determination range specifying unit 102j selects, as the determination target, paper data that has been converted by the word conversion unit 102k. As in the first embodiment, the determination range specifying unit 102j also specifies the body text, from among the component parts of the paper data selected as the determination target, as the determination range.
The word conversion unit 102k converts words contained in the paper data into the first candidate term, second candidate term, and so on of the corresponding term in the specialized dictionary.
(Processing)
Next, the quotation determination support processing performed by the quotation determination support apparatus 1300 of the third embodiment, configured as described above, is explained. The overall quotation determination processing of this embodiment is performed in the same way as the quotation determination support processing of the second embodiment described with reference to FIG. 11. This embodiment differs from the second embodiment in the process of specifying the determination target in step S31 of FIG. 11.
FIG. 15 is a flowchart showing the procedure for specifying the determination target in the third embodiment. First, the determination range specifying unit 102j reads the created paper data from the paper data storage unit 101c (step S51). It then performs morphological analysis on the content of the paper data it has read, using a known method, and splits the content into morphemes (step S52).
Next, the word conversion unit 102k searches the specialized dictionary using each obtained morpheme as a search key and, for a word registered as a term in the specialized dictionary, converts the word into the first candidate term corresponding to that term (step S53). In the second and subsequent conversion passes, the word is converted into the n-th candidate term (n is an integer of 2 or more).
It is then determined whether word conversion has been completed for all the words in the paper data (step S54). If it has not (step S54, No), the word conversion of step S53 is repeated.
On the other hand, when word conversion has been completed for all the words in the paper data (step S54, Yes), the word conversion unit 102k saves the paper data with the converted words in the paper data storage unit 101c as revised paper data (step S55).
The word conversion unit 102k then determines whether conversion has been performed for all the candidate terms in the specialized dictionary (step S56). If it has not (step S56, No), the word conversion unit 102k selects the next candidate term (the (n+1)-th candidate term) in the specialized dictionary (step S57) and repeats steps S53 to S55. As a result, for each word of the paper data, multiple pieces of revised paper data converted into the multiple candidate terms are obtained and stored in the paper data storage unit 101c.
If it is determined in step S56 that conversion has been performed for all the candidate terms in the specialized dictionary (step S56, Yes), the determination range specifying unit 102j specifies the resulting multiple pieces of revised paper data as the determination targets (step S58).
The quotation determination support processing is then performed on the multiple pieces of revised paper data specified as determination targets in this way.
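The loop of steps S52 to S57 might look like the following Python sketch, which produces one revised version of the paper per candidate rank. Several points are assumptions made for illustration: whitespace splitting stands in for the morphological analysis of step S52, the dictionary is modeled as a mapping from a term to its ordered candidate list, the function names are hypothetical, and a word whose term has fewer than n candidates is left unchanged in the n-th pass because the text does not say how that case is handled.

```python
def convert_words(tokens: list[str],
                  dictionary: dict[str, list[str]],
                  rank: int) -> list[str]:
    """S53/S54: replace each registered word with its candidate at `rank`
    (0 = first candidate term); other words pass through unchanged."""
    converted = []
    for token in tokens:
        candidates = dictionary.get(token, [])
        converted.append(candidates[rank] if rank < len(candidates) else token)
    return converted


def build_revised_versions(text: str,
                           dictionary: dict[str, list[str]]) -> list[str]:
    """S55 to S57: one revised text per candidate rank present in the dictionary."""
    tokens = text.split()  # stand-in for morphological analysis (S52)
    max_rank = max((len(c) for c in dictionary.values()), default=0)
    return [" ".join(convert_words(tokens, dictionary, rank))
            for rank in range(max_rank)]
```

Each string returned by `build_revised_versions` would then be judged as a separate determination target, so the cost of the later similarity search grows with the number of candidate ranks.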
(Effects)
As described above, the quotation determination support apparatus 1300 of the third embodiment converts words contained in the paper data into terms registered in the specialized dictionary and uses the converted paper data as the determination target. Even when document data has been quoted improperly after being modified rather than used as is, the apparatus can determine whether it is a quotation, which further improves determination accuracy.
Fourth Embodiment
Next, the fourth embodiment is described. In this embodiment, similarity is calculated among the students' past paper data. Except where otherwise noted, the configuration and processing of the fourth embodiment are the same as those of the first embodiment; for identical configuration and processing, the same names or reference numerals as in the first embodiment are used as needed and their description is omitted.
(Configuration)
FIG. 16 is a block diagram showing the functional configuration of the quotation determination support apparatus according to the fourth embodiment. This quotation determination support apparatus 1600 differs from the quotation determination support apparatus 100 of the first embodiment in that the control unit 102 includes a comparison range specifying unit 102l, a similarity calculation unit 102m, and a document quotation determination unit 102n.
In addition to the functions described in the first embodiment, the similarity calculation unit 102m calculates the similarity among the students' past paper data stored in the paper data storage unit 101c. As in the first embodiment, a known method is used to calculate the similarity.
In addition to the functions described in the first embodiment, the document quotation determination unit 102n determines that quotation exists among multiple pieces of past paper data when the similarity calculated by the similarity calculation unit 102m is equal to or greater than a predetermined second threshold. The second threshold can be set freely and may be the same as or different from the threshold described above.
The comparison range specifying unit 102l specifies, as the comparison range, the multiple pieces of past paper data that the document quotation determination unit 102n has determined to quote one another.
(Processing)
Next, the quotation determination support processing performed by the quotation determination support apparatus 1600 of the fourth embodiment, configured as described above, is explained. The overall quotation determination processing of this embodiment follows the same procedure as the quotation determination support processing of the first embodiment described with reference to FIG. 2. This embodiment differs from the first embodiment in the procedure of the comparison determination processing (step S14) in FIG. 2.
FIG. 17 is a flowchart showing the procedure of the comparison determination processing of the fourth embodiment. The comparison range specifying unit 102l first extracts two pieces of paper data from all the previously submitted paper data stored in the paper data storage unit 101c (step S61). Next, the similarity calculation unit 102m calculates the similarity between the description contents of the two extracted pieces of paper data (step S62). The similarity may be calculated as follows. First, the description in a partial range of one of the two pieces of paper data is compared with the description content of the other. The comparison is then repeated while the partial range is changed, yielding a similarity for each part. Finally, a value such as the average of these partial similarities is taken as the similarity between the two pieces of paper data as a whole. The method of calculating the similarity is not limited to this, however.
The comparison range specifying unit 102l then determines whether this similarity calculation has been performed for all the past paper data (step S63). If it has not (step S63, No), steps S61 and S62 are repeated.
On the other hand, when the similarity calculation has been completed for all the past paper data (step S63, Yes), and there are multiple pieces of paper data whose similarity is equal to or greater than the predetermined second threshold, the document quotation determination unit 102n judges that these pieces of paper data contain quotations from one another and selects them (step S64). The comparison range specifying unit 102l then specifies the selected pieces of paper data as the comparison range (step S65). Past paper data that quote one another thus become the comparison range against which the quotation determination for the target paper data is performed.
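Steps S61 to S65 can be sketched in Python as follows. The windowed averaging follows the calculation outlined for step S62, but the per-window score (the fraction of a window's words that occur anywhere in the other paper), the window size of 5 words, the second threshold of 0.8, and all function names are assumptions chosen for illustration. The score is computed in one direction only, from the first paper of each pair to the second.

```python
from itertools import combinations


def paper_similarity(paper_a: str, paper_b: str, window: int = 5) -> float:
    """Average, over consecutive word windows of paper_a, of the fraction
    of window words that also occur in paper_b (assumed partial score)."""
    words_a = paper_a.lower().split()
    vocab_b = set(paper_b.lower().split())
    if not words_a:
        return 0.0
    scores = []
    for i in range(0, len(words_a), window):   # shift the partial range
        chunk = words_a[i:i + window]
        scores.append(sum(w in vocab_b for w in chunk) / len(chunk))
    return sum(scores) / len(scores)           # average of partial similarities


def select_comparison_range(papers: dict[str, str],
                            second_threshold: float = 0.8) -> set[str]:
    """S61 to S65: IDs of past papers that appear to quote one another."""
    selected: set[str] = set()
    for (id_a, text_a), (id_b, text_b) in combinations(papers.items(), 2):
        if paper_similarity(text_a, text_b) >= second_threshold:
            selected.update((id_a, id_b))
    return selected
```

Because every pair of past papers is compared, the work grows quadratically with the number of papers, so in practice this selection would be run ahead of time rather than at each determination.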
(Effects)
As described above, the quotation determination support apparatus 1600 of the fourth embodiment uses past paper data that quote one another as the comparison range when judging quotation in the target paper data, so material that is likely to have been quoted can be made the comparison range. This further improves determination accuracy while preventing increases in development effort and manufacturing cost. Limiting the determination targets to paper data with a high probability of improper quotation also reduces the processing load and raises determination efficiency.
In the present embodiment, past paper data whose similarity is equal to or greater than the second threshold are judged to cite one another. The apparatus may additionally be configured so that the legality determination unit 102e determines whether such a citation is lawful, and the paper data are specified as the comparison range only when the citation is unlawful.
Fifth Embodiment
Next, the fifth embodiment will be described. In this embodiment, the targets of determination are extracted automatically using the problem statement of the paper as a keyword. Except where specifically noted, the configuration and processing of the fifth embodiment are the same as those of the first embodiment. For identical configurations and processes, the same names or reference numerals as in the first embodiment are used as needed, and their description is omitted.
(Configuration)
FIG. 18 is a block diagram showing the functional configuration of the citation determination support apparatus according to the fifth embodiment. This citation determination support apparatus 1800 differs from the citation determination support apparatus 100 according to the first embodiment in that the control unit 102 includes a problem extraction unit 102p and a comparison range specifying unit 102q.
The problem extraction unit 102p performs structural analysis of the paper data to be determined and extracts the problem statement of the paper from its content. Specifically, the problem extraction unit 102p identifies and extracts the problem statement based on the headings, structure, and the like of the paper obtained from the structural analysis.
The comparison range specifying unit 102q uses the problem statement extracted by the problem extraction unit 102p as a search key to search for matching web pages on the WEB site 131, the file server 133, and the like on the Internet 130, and specifies the document data designated by the URLs and the like returned as search results as the comparison range. A known search engine or the like can be used for the search. In that case, the comparison range specifying unit 102q may be configured to use the API (Application Programming Interface) of the search engine to send a search request command or the like specifying the search key to the search engine's WEB site and to receive the search results.
(Processing)
Next, the citation determination support process performed by the citation determination support apparatus 1800 of the fifth embodiment configured as described above will be described. The overall citation determination process of the present embodiment follows the same procedure as the citation determination support process of the first embodiment described with reference to FIG. 2. In the present embodiment, the procedure of the comparison determination process (step S14) in FIG. 2 differs from that of the first embodiment.
FIG. 19 is a flowchart showing the procedure of the comparison range specifying process of the fifth embodiment. First, the problem extraction unit 102p performs structural analysis on the paper data to be determined and extracts the problem statement (step S81). Next, the comparison range specifying unit 102q uses the extracted problem statement as a search key to search for matching web pages on the WEB site 131, the file server 133, and the like on the Internet 130 (step S82). The comparison range specifying unit 102q then specifies, as the comparison range, the cited document data designated by the URLs of the web pages returned as search results (step S83).
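The flow of steps S81 to S83 can be sketched as below. This is only an illustration under stated assumptions: the "structural analysis" is reduced to recognizing numbered headings, the heading keywords are guesses, and the search engine is represented by a caller-supplied `search` function because the embodiment does not name a specific API. All names and the example URL are hypothetical.

```python
import re

# Assumed keywords that mark the heading of a problem section.
PROBLEM_HEADINGS = ("problem", "objective")


def extract_problem_statement(paper_text):
    """Step S81 (sketch): return the text under the first problem-like heading."""
    collecting = False
    body = []
    for line in paper_text.splitlines():
        stripped = line.strip()
        # A numbered heading such as "2. Problem to be solved" (assumed layout).
        if re.match(r"^\d+\.\s+\S", stripped):
            if collecting:
                break  # the next heading ends the problem section
            collecting = any(k in stripped.lower() for k in PROBLEM_HEADINGS)
            continue
        if collecting and stripped:
            body.append(stripped)
    return " ".join(body)


def specify_comparison_range(paper_text, search):
    """Steps S82-S83 (sketch): search with the problem statement as the key
    and return the resulting URLs as the comparison range."""
    key = extract_problem_statement(paper_text)
    return list(search(key)) if key else []


paper = (
    "1. Introduction\nSome intro.\n"
    "2. Problem to be solved\nDetecting reused text is slow.\n"
    "3. Method\nWe index."
)


def fake_search(key):
    # Stand-in for a real search engine call.
    return ["http://example.com/a"] if "reused" in key else []


urls = specify_comparison_range(paper, fake_search)
```

Passing the search function in as a parameter keeps the sketch independent of any particular search engine.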
(Effect)
As described above, the citation determination support apparatus 1800 of the fifth embodiment determines the comparison range of cited documents based on the problem statement in the paper data. An appropriate comparison range of cited documents can therefore be set in line with the content of the paper, and determination accuracy can be further improved while preventing increases in development effort and manufacturing cost.
Sixth Embodiment
Next, the sixth embodiment will be described. This embodiment includes logic for handling the case where the number of characters to be compared in a paper exceeds the character limit of the search logic. Except where specifically noted, the configuration and processing of the sixth embodiment are the same as those of the first embodiment. For identical configurations and processes, the same names or reference numerals as in the first embodiment are used as needed, and their description is omitted.
(Configuration)
FIG. 20 is a block diagram showing the functional configuration of the citation determination support apparatus according to the sixth embodiment. This citation determination support apparatus 2000 differs from the citation determination support apparatus 100 according to the first embodiment in that the control unit 102 includes a similarity calculation unit 102r.
The similarity calculation unit 102r uses a known search technique (a search engine or the like) to search the comparison range specified by the comparison range specifying unit 102b, using the description content of the determination range specified by the determination range specifying unit 102a as the search key. If the number of characters in the search key exceeds a predetermined character limit (for example, 32 characters), the search engine or the like returns an error message that contains the character limit and states that the search key exceeds it. In that case, the similarity calculation unit 102r designates as the search key a character string within the character limit, taken from the beginning of the determination range, searches the comparison range, and saves the search result in memory or the like. It then uses the next character string of the character-limit length in the determination range as the search key and searches the comparison range in the same way. In this manner, the similarity calculation unit 102r designates search keys one after another while shifting through the description content of the determination range by the character limit, performs multiple searches, and saves the search results in memory or the like. From the multiple search results, the similarity calculation unit 102r takes the search result with the highest frequency of appearance as the comparison range subject to similarity calculation, and calculates its similarity to the determination range. The apparatus may also be configured so that search results whose frequency of appearance is equal to or greater than a predetermined number are subject to similarity calculation.
(Processing)
Next, the citation determination support process performed by the citation determination support apparatus 2000 of the sixth embodiment configured as described above will be described. The overall citation determination process of the present embodiment follows the same procedure as the citation determination support process of the first embodiment described with reference to FIG. 2. In the present embodiment, the procedure of the search process performed by the similarity calculation unit (step S15) in FIG. 2 differs from that of the first embodiment.
FIG. 21 is a flowchart showing the procedure of the search process in the similarity calculation of the sixth embodiment. First, the similarity calculation unit 102r searches the data of the comparison range using the description content of the determination range as the search key (step S91). The similarity calculation unit 102r then determines whether it has received an error notification stating that the search key exceeded the character limit (step S92).
If no such error notification has been received (step S92, No), the similarity calculation unit 102r selects the search result (step S100). The comparison range in this search result becomes the target of similarity calculation, and its similarity to the determination range is calculated as in the first embodiment.
On the other hand, if an error notification stating that the search key exceeded the character limit has been received in step S92 (step S92, Yes), the similarity calculation unit 102r obtains the character limit from the received error notification (step S93).
The similarity calculation unit 102r then designates as the search key the character string that spans the character limit from the beginning of the determination range (step S94), and searches the data of the comparison range with this search key (step S95). The similarity calculation unit 102r stores the search result in memory (step S96).
Then, the similarity calculation unit 102r determines whether the search key has reached the final character string of the determination range (step S97). If it has not (step S97, No), the unit designates the next character string of the character-limit length in the determination range as the search key (step S98) and repeats the processes of steps S95 and S96. The specific way of designating character strings of the character-limit length is arbitrary. Examples include: shifting by the character limit as one unit (for example, with a character limit of 32, the first search key is characters 1 to 32, the second is characters 33 to 64, and so on); shifting one character at a time (for example, with a character limit of 32, the first search key is characters 1 to 32, the second is characters 2 to 33, and so on); and shifting by an arbitrary number of characters (for example, with a character limit of 32 and a shift of 10 characters, the first search key is characters 1 to 32, the second is characters 11 to 42, and so on).
On the other hand, when the search key has reached the final character string of the determination range in step S97 (step S97, Yes), the search result with the highest frequency of appearance is selected from the search results saved in memory (step S99). The selected comparison range becomes the target of similarity calculation, and its similarity to the determination range is calculated.
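The loop of steps S94 to S99 can be sketched as follows. This is a minimal illustration under stated assumptions: the character limit is passed in directly rather than parsed from an error notification, a plain substring match over an in-memory dictionary stands in for the search engine, and all function names are hypothetical.

```python
from collections import Counter


def windowed_keys(text, limit, stride=None):
    """Split `text` into search keys of at most `limit` characters.

    stride=None shifts by the limit itself (characters 1-32, 33-64, ...);
    stride=1 shifts one character at a time; any other positive value
    shifts by that many characters.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    stride = limit if stride is None else stride
    if stride <= 0:
        raise ValueError("stride must be positive")
    keys = []
    start = 0
    while start < len(text):
        keys.append(text[start:start + limit])
        if start + limit >= len(text):
            break  # the key reached the final character string (S97, Yes)
        start += stride
    return keys


def most_frequent_result(text, documents, limit, stride=None):
    """Search with every key and return the document ID hit most often (S99)."""
    hits = Counter()
    for key in windowed_keys(text, limit, stride):
        for doc_id, body in documents.items():  # stand-in for steps S95-S96
            if key in body:
                hits[doc_id] += 1
    return hits.most_common(1)[0][0] if hits else None


docs = {"d1": "xx abcd yy efgh", "d2": "ij only"}
best = most_frequent_result("abcdefghij", docs, limit=4)
```

With a limit of 4, the keys are `abcd`, `efgh`, and `ij`; two of them hit `d1` and one hits `d2`, so `d1` is chosen. A smaller stride produces more overlapping keys and therefore more searches.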
(Effect)
As described above, when the search key exceeds the character limit, the citation determination support apparatus 2000 of the sixth embodiment designates search keys of the character-limit length within the determination range and performs multiple searches while shifting the character string used as the search key. The accuracy of citation determination can therefore be improved regardless of the character limit on the search key.
Seventh Embodiment
Next, the seventh embodiment will be described. In this embodiment, the legality determination means determines whether the determination range conforms to a predetermined citation format and, based on that result, determines whether the citation of the comparison range in the determination range is a lawful citation. Except where specifically noted, the configuration and processing of the seventh embodiment are the same as those of the first embodiment. For identical configurations and processes, the same names or reference numerals as in the first embodiment are used as needed, and their description is omitted.
(Configuration)
FIG. 24 is a block diagram showing the functional configuration of the citation determination support apparatus according to the seventh embodiment. This citation determination support apparatus 100 includes a citation format setting unit 102s in the control unit 102 and a citation format database (hereinafter, "database" is abbreviated as "DB") 101f in the storage unit 101.
The citation format setting unit 102s is citation format setting means that sets the citation format used as the criterion when the lawfulness of a citation is determined.
The citation format DB 101f is citation format storage means that stores types for classifying paper data and predetermined citation formats in association with each other. FIG. 25 is a table showing the information stored in the citation format DB 101f. As shown in FIG. 25, the citation format DB 101f has the data items "type", "citation format", and "lawful document storage location", and the information corresponding to these items is stored in association with each other. The information stored under "type" identifies the type of paper data; as illustrated in FIG. 25, it can be information identifying a field corresponding to the theme of the paper, such as "law" or "engineering". The information stored under "citation format" identifies a lawful citation format; as illustrated in FIG. 25, values such as "『』" or "””" can be stored. The information stored under "lawful document storage location" identifies where documents regarded as lawful are stored; as shown in FIG. 25, for example, the name of a folder in which the documents are stored, such as "Z:¥quotaion¥law¥" or "Z:¥quotaion¥eng¥", can be stored. A "document regarded as lawful" is, for example, a document whose citation is regarded as lawful. The method and timing of storing information in the citation format DB 101f are arbitrary; for example, the information can be stored in advance via the input device 103, or it can be stored during the citation format setting process described later.
(Processing: citation determination support process)
Next, the citation determination support process executed by the citation determination support apparatus 100 of the seventh embodiment configured as described above will be described. FIG. 26 is a flowchart showing the procedure of the citation determination support process of the seventh embodiment. Except for steps SA2 and SA9, the processes of steps SA1 to SA13 are the same as the processes of steps S11 to S20a described with reference to FIG. 2 in the first embodiment, so a detailed description is omitted.
After the paper data is read in step SA1, the citation format setting unit 102s sets the citation format (step SA2).
The citation format setting process is described here. It is the process for setting the citation format used as the criterion when the lawfulness of citations in the paper data is determined. FIG. 27 is a flowchart showing the procedure of the citation format setting process.
As shown in FIG. 27, when the citation format setting process is started, the output control unit 102h causes the display device 104 to display a citation format setting input screen (step SB1). FIG. 28 illustrates the citation format setting input screen. As shown in FIG. 28, the screen displays, for example, a "type" menu for selecting the type of paper data, a "citation format" box for entering a lawful citation format, a "lawful document storage location" box for designating where documents regarded as lawful are stored, a confirm button for confirming the entries on the screen, and an end button for ending citation format setting.
If the end button is pressed via the input device 103 to instruct the end of the citation format setting process (step SB2, Yes), the citation format setting unit 102s ends the process and returns to the main routine. If the end button is not pressed and no end instruction is given (step SB2, No), the citation format setting unit 102s waits until a type of paper data (for example, "law" or "engineering") is selected from the "type" menu via the input device 103 (step SB3, No). When a type is selected (step SB3, Yes), it temporarily stores the selected type in RAM or the like (step SB4).
Next, the citation format setting unit 102s waits until the confirm button is pressed via the input device 103 to confirm the entries (step SB5, No). When the entries are confirmed (step SB5, Yes), it obtains the citation format entered in the "citation format" box at that point (for example, "『』" or "””") and the document storage location designated in the "lawful document storage location" box (for example, "Z:¥quotaion¥law¥"), and stores them in the citation format DB 101f in association with the type temporarily stored in RAM or the like in step SB4 (step SB6). The process then returns to step SB2 to determine whether an end instruction has been given (step SB2).
Returning to the citation determination support process shown in FIG. 26: in step SA8, if the similarity calculated by the similarity calculation unit 102c in step SA7 is equal to or greater than a predetermined threshold (step SA8, Yes), the determination range is judged to cite the document data or the like in the comparison range, and the legality determination unit 102e executes a legality determination process to determine whether the citation is lawful (step SA9).
(Processing: legality determination process)
The legality determination process is described here. FIG. 29 is a flowchart showing the procedure of the legality determination process. When the legality determination process is started, the legality determination unit 102e identifies the type of the paper data to be determined (step SC1). For example, a type input screen (not shown) can be displayed on the display device 104, and the type of the paper data to be determined can be accepted via the input device 103.
Next, the legality determination unit 102e refers to the citation format DB 101f based on the type identified in step SC1, and obtains from it the lawful citation format corresponding to that type and the storage location of documents regarded as lawful (step SC2).
It then determines whether the citation that was judged in step SA8 to cite the document data or the like in the comparison range conforms to the lawful citation format obtained in step SC2 (step SC3). For example, the citation is judged to conform to a lawful citation format when the lawful citation format "『』" is used before and after the cited part, when a reference number pointing to the bibliographic information that identifies the source is attached to the cited part itself or immediately after it, or when the cited part is a citation from a document stored in the storage location for documents regarded as lawful.
If the citation is judged not to conform to a lawful citation format (step SC3, No), the legality determination unit 102e causes the display device 104 to indicate that the cited part is unlawful (step SC4). For example, on the citation determination screen shown in FIG. 7, the cited part is displayed in reverse video (black and white inverted).
If the citation is judged to conform to a lawful citation format (step SC3, Yes), or after the process of step SC4, the legality determination unit 102e determines whether legality determination has been performed for all of the parts judged to cite the document data or the like in the comparison range (step SC5).
If legality determination has not yet been performed for all of the cited parts (step SC5, No), the legality determination unit 102e determines, for another cited part that has not yet been checked, whether the citation conforms to a lawful citation format (step SC3). If legality determination has been performed for all of the cited parts (step SC5, Yes), the legality determination unit 102e ends the legality determination process and returns to the main routine.
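The check in step SC3 can be sketched as a single predicate over one cited part. The sketch rests on several assumptions that are not in the embodiment: the cited part is given as character offsets, a reference number is written in the form `[n]`, and the lawful storage location is compared as a simple path prefix. The function and parameter names are hypothetical.

```python
import re


def conforms_to_citation_format(text, start, end, open_mark, close_mark,
                                source_path=None, lawful_dir=None):
    """Return True if text[start:end] meets any of the three conditions of step SC3."""
    # 1. The cited part is enclosed in the registered citation marks.
    before = text[max(start - len(open_mark), 0):start]
    after = text[end:end + len(close_mark)]
    enclosed = before == open_mark and after == close_mark

    # 2. A reference number follows the cited part (the "[n]" style is assumed).
    tail = text[end:]
    if tail.startswith(close_mark):
        tail = tail[len(close_mark):]
    has_reference = re.match(r"\s*\[\d+\]", tail) is not None

    # 3. The source document lies in the lawful document storage location.
    from_lawful_dir = (source_path is not None and lawful_dir is not None
                       and source_path.startswith(lawful_dir))

    return enclosed or has_reference or from_lawful_dir


quote = "knowledge is power"


def check(text, **kwargs):
    # Locate the quoted phrase and test it against the 『』 format.
    start = text.index(quote)
    return conforms_to_citation_format(text, start, start + len(quote),
                                       "『", "』", **kwargs)
```

A cited part for which this predicate returns False would be the one highlighted in step SC4.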
(Effect)
As described above, the citation determination support apparatus 100 of the seventh embodiment determines whether the determination range conforms to a predetermined citation format and, based on that result, determines whether the citation of the comparison range in the determination range is a lawful citation. The lawfulness of a citation can therefore be determined easily based on a citation format set in advance.
In addition, the citation format corresponding to the type of the paper data is obtained from the citation format DB 101f, and the citation is checked against that format, so the lawfulness of a citation can be determined based on a citation format that differs for each type of paper data.
Eighth Embodiment
Next, the eighth embodiment will be described. In this embodiment, the citation ratio is calculated, that is, the proportion of the description content of the determination range that is cited from the comparison range. Except where specifically noted, the configuration and processing of the eighth embodiment are the same as those of the first embodiment. For identical configurations and processes, the same names or reference numerals as in the first embodiment are used as needed, and their description is omitted.
(Configuration)
FIG. 30 is a block diagram showing the functional configuration of the citation determination support apparatus according to the eighth embodiment. This citation determination support apparatus 100 includes a citation ratio calculation unit 102t in the control unit 102 and a citation ratio DB 101g in the storage unit 101.
The citation ratio calculation unit 102t is citation ratio calculation means that calculates the citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range.
The citation ratio DB 101g is citation ratio storage means that stores determination target data information, which uniquely identifies the determination target data, and the citation ratio calculated by the citation ratio calculation unit 102t in association with each other. FIG. 31 is a table illustrating the information stored in the citation ratio DB 101g. As shown in FIG. 31, the citation ratio DB 101g has the data items "paper data information", "document data information", and "citation ratio", and the information corresponding to these items is stored in association with each other. The information stored under "paper data information" is determination target data information that uniquely identifies the paper data to be determined; as shown in FIG. 31, for example, an identification number containing the author's student ID number and the date the paper was created is stored. The information stored under "document data information" uniquely identifies the document data that is the source of the citation; as shown in FIG. 31, for example, the bibliographic information of the document data is stored. The information stored under "citation ratio" identifies the citation ratio calculated by the citation ratio calculation unit 102t; as shown in FIG. 31, for example, the individual citation ratio from each piece of document data in the paper data and the sum of those individual citation ratios are stored as percentages. The specific content of the citation ratio is described later. This information is stored in the citation ratio DB 101g during the citation determination support process described next.
(Processing)
Next, the processing executed by the quotation determination support apparatus 100 of the eighth embodiment configured as described above will be described. This processing is broadly divided into a quotation determination support process and a list display process.
(Processing - quotation determination support process)
First, the quotation determination support process will be described. FIG. 32 is a flowchart showing the procedure of the quotation determination support process of the eighth embodiment. Steps SD1 to SD11 are the same as steps S11 to S20a described with reference to FIG. 2 in the first embodiment, so their detailed description is omitted.
When it is determined in step SD11 that the processing of steps SD5 to SD10 has been completed for all data in the specified comparison range (step SD11, Yes), the citation ratio calculation unit 102t calculates the citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range (step SD12). The specific definition of the citation ratio is arbitrary; for example, the number of characters in the cited portions, expressed as a percentage of the number of characters in the determination range, is calculated as the citation ratio.
The output control unit 102h then causes the display device 104 to display the citation ratio calculated by the citation ratio calculation unit 102t, and stores the calculated citation ratio in the citation ratio DB 101g in association with the thesis data information specifying the thesis data to be determined (step SD13). When there are citations from a plurality of document data, the individual citation ratio accounted for by the description content cited from each document data and the total of those individual citation ratios are calculated and stored in the citation ratio DB 101g.
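The calculation in steps SD12 and SD13 can be sketched as follows. This is a minimal illustration, not the apparatus's actual implementation: the function name `citation_ratios` and its arguments are hypothetical, and it assumes that the character counts of the cited portions have already been obtained per document data and that the cited portions do not overlap.

```python
def citation_ratios(range_chars, cited_chars_by_document):
    """Return (individual ratios in percent, total ratio in percent).

    range_chars: number of characters in the determination range.
    cited_chars_by_document: hypothetical mapping from a document
        identifier to the number of characters cited from that document.
    """
    if range_chars <= 0:
        raise ValueError("determination range must contain at least one character")
    # Individual citation ratio: cited characters as a percentage of the range.
    individual = {
        doc: 100.0 * cited / range_chars
        for doc, cited in cited_chars_by_document.items()
    }
    # Total citation ratio: sum of the individual ratios
    # (assumes cited portions from different documents do not overlap).
    total = sum(individual.values())
    return individual, total


individual, total = citation_ratios(200, {"document A": 30, "document B": 20})
print(individual, total)
```

For a determination range of 200 characters with 30 and 20 characters cited from two documents, the individual ratios are 15% and 10% and the total is 25%.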
FIG. 33 illustrates the quotation determination screen when the citation ratio is displayed. In the example of FIG. 33, the citation ratio, calculated as the number of characters in the cited portions as a percentage of the number of characters in the determination range, is displayed in the upper right part of the quotation determination screen. When the description content of the determination range is cited from a plurality of document data, both the total of the citation ratios from the respective document data and the individual citation ratio from each document data may be displayed, as shown in FIG. 33, or only the total may be displayed.
(Processing - list display process)
Next, the list display process will be described. The list display process outputs thesis data information in an order based on the citation ratio of each thesis data. FIG. 34 is a flowchart showing the procedure of the list display process. The execution timing of the list display process is arbitrary; for example, it is started when an instruction to execute the list display process is input via the input device 103.
When the list display process is started, the output control unit 102h acquires all thesis data information and the corresponding total citation ratios from the citation ratio DB 101g (step SE1). The output control unit 102h then sorts the acquired thesis data information in descending order of the corresponding total citation ratio and causes the display device 104 to display it (step SE2). FIG. 35 shows a determination result screen that lists the thesis data information in descending order of total citation ratio. As shown in FIG. 35, the thesis data information is displayed on the screen in descending order of citation ratio. The individual citation ratio for each document data may also be displayed for each item of thesis data information.
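Steps SE1 and SE2 amount to a sort on the stored total citation ratio. The sketch below assumes, purely for illustration, that each record read from the citation ratio DB 101g is a dictionary with the hypothetical keys `thesis_id` and `total_ratio`.

```python
def sort_for_list_display(records):
    """Return the records in descending order of total citation ratio (step SE2)."""
    return sorted(records, key=lambda record: record["total_ratio"], reverse=True)


# Hypothetical records, standing in for the result of step SE1.
records = [
    {"thesis_id": "thesis-001", "total_ratio": 12.5},
    {"thesis_id": "thesis-002", "total_ratio": 64.0},
    {"thesis_id": "thesis-003", "total_ratio": 30.0},
]
for record in sort_for_list_display(records):
    print(record["thesis_id"], record["total_ratio"])
```

Because `sorted` is stable, theses with equal total ratios keep the order in which they were read.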
(Effect)
As described above, the quotation determination support apparatus 100 of the eighth embodiment calculates and outputs the citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range, and can therefore present material for judging the legality of the citation.
In addition, the citation ratio is calculated for a plurality of thesis data and the thesis data information is output in an order based on the citation ratio of each thesis data, so material can be presented for comparing, on the basis of the citation ratio, the legality of the citations in the plurality of thesis data.
[Embodiment 9]
Next, a ninth embodiment will be described. In this embodiment, it is determined whether citation source information, which specifies the document data that is the source of a cited passage, is included in the determination target data. Unless otherwise noted, the configuration and processing of the ninth embodiment are the same as those of the first embodiment; for the same configuration and processing, the names and reference numerals used in the first embodiment are used as necessary and their description is omitted.
(Configuration)
FIG. 36 is a block diagram showing the functional configuration of the quotation determination support apparatus according to the ninth embodiment. This quotation determination support apparatus 100 includes an output mode DB 101h in the storage unit 101.
The output mode DB 101h is output mode information storage means that stores the similarity of the determination range and the output mode used by the display device 104 in association with each other. FIG. 37 is a table illustrating the information stored in the output mode DB 101h. As shown in FIG. 37, the output mode DB 101h has the data items "similarity S [%]" and "output mode", and the information corresponding to these items is stored in association with each other. The item "similarity S [%]" holds information specifying the similarity of the determination range, namely the ranges of similarity that serve as criteria for the quotation determination ("0 ≤ S < 20", "20 ≤ S < 80", and so on in FIG. 37). Although FIG. 37 divides the similarity into three levels, two levels or four or more levels may be used. The item "output mode" holds information specifying the output mode used by the display device 104, that is, the mode in which the text is to be output according to the similarity. In the example of FIG. 37, the character output mode is "normal" when the similarity is less than 20%, because a citation is considered unlikely; "bold" when the similarity is 20% or more and less than 80%, because a citation is possible; and "reverse" when the similarity is 80% or more, because a citation is highly likely. The item "output mode" may also hold color information specifying the character color or the character background color, font information specifying the character font, and the like. The method and timing of storing information in the output mode DB 101h are arbitrary; for example, the information can be stored in the output mode DB 101h in advance via the input device 103.
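The lookup against the output mode DB 101h can be sketched as a search over range boundaries. The three-level table below mirrors the example of FIG. 37; the names `BOUNDARIES`, `MODES`, and `output_mode` are hypothetical, and an actual table could hold a different number of levels.

```python
from bisect import bisect_right

# Lower bounds of the second and third levels, in percent (FIG. 37 example).
BOUNDARIES = [20.0, 80.0]
# Output mode for S < 20, 20 <= S < 80, and 80 <= S respectively.
MODES = ["normal", "bold", "reverse"]


def output_mode(similarity):
    """Return the output mode for a similarity S given in percent."""
    if not 0.0 <= similarity <= 100.0:
        raise ValueError("similarity must be between 0 and 100 percent")
    # bisect_right counts the boundaries that are <= similarity,
    # so a value exactly on a boundary falls into the higher level.
    return MODES[bisect_right(BOUNDARIES, similarity)]


for s in (5.0, 20.0, 79.9, 80.0):
    print(s, output_mode(s))
```

Adding a level only requires one more boundary and one more mode, which matches the remark that two levels or four or more levels may be used.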
(Processing)
Next, the quotation determination support process executed by the quotation determination support apparatus 100 of the ninth embodiment configured as described above will be described. FIG. 38 is a flowchart showing the procedure of the quotation determination support process of the ninth embodiment. Steps SF1, SF5, SF8, SF9, SF12, and SF15 to SF17 are the same as steps S11, S14, S15, S16, S18, and S19 to S20a, respectively, described with reference to FIG. 2 in the first embodiment, so their detailed description is omitted.
After the thesis data is read in step SF1, the output control unit 102h causes the display device 104 to display the thesis data (step SF2). FIG. 39 shows thesis data displayed on the quotation determination screen of the display device 104. In the example shown in FIG. 39, the output control unit 102h displays a thesis data display area 105, range setting sliders 106, an overall view 107, and a document data display area 108 on the quotation determination screen. The thesis data display area 105 is an area that displays the thesis data to be determined. The range setting sliders 106 set the determination range in the thesis data to be determined; the area between the upper range setting slider 106a and the lower range setting slider 106b is set as the determination range. The overall view 107 is an area that shows, within the whole of the thesis data to be determined, the display range of the thesis data display area 105, the determination range, and the approximate positions of the cited portions. The document data display area 108 is an area that displays the content of the cited document data. As shown in FIG. 39, in step SF2 the content of the thesis data is displayed in the thesis data display area 105, and the display range of the thesis data display area 105 is shown in the overall view 107 as a rectangular frame.
Returning to FIG. 38, the determination range specifying unit 102a determines whether an instruction specifying the determination range has been input via the input device 103 (step SF3). When it determines that such an instruction has been input (step SF3, Yes), it specifies the range indicated by the instruction as the determination range (step SF4). In the example shown in FIG. 39, the range of the thesis data to be determined that lies between the upper and lower range setting sliders 106 is specified as the determination range. As illustrated in FIG. 39, the output control unit 102h also displays the area outside the determination range with diagonal hatching in the overall view 107.
Returning to FIG. 38, after the comparison range is specified in step SF5, the document citation determination unit 102d determines whether a similarity threshold has been input via the input device 103 (step SF6). When it determines that a threshold has been input (step SF6, Yes), the document citation determination unit 102d sets the input value as the similarity threshold for the quotation determination (step SF7). The method of inputting the threshold is arbitrary. For example, a threshold input box (not shown) can be displayed on the quotation determination screen, and the numerical value entered in the input box via the input device 103 can be set as the threshold. Alternatively, a threshold setting slider (not shown) can be displayed on the quotation determination screen, and the value corresponding to the position to which the slider has been moved via the input device 103 can be set as the threshold.
After the similarity calculation unit 102c calculates the similarity in step SF9, the output control unit 102h acquires the output mode corresponding to the calculated similarity from the output mode DB 101h (step SF10) and causes the display device 104 to display the determination range in the acquired output mode (step SF11). In the example of FIG. 39, the characters in the thesis data display area 105 are displayed in the output modes acquired, according to the calculated similarity, from the output mode DB 101h illustrated in FIG. 37. That is, portions with a similarity of less than 20% are displayed normally, portions with a similarity of 20% or more and less than 80% are displayed in bold, and portions with a similarity of 80% or more are displayed in reverse video. Portions with a similarity of 20% or more are also shown with cross-hatching in the overall view 107. This allows the user to grasp roughly how much of the entire thesis data is occupied by portions that may be citations.
Returning to FIG. 38, when the legality determination unit 102e determines in step SF12 that the citation in the determination range is not a legitimate citation (step SF12, No), the legality determination unit 102e determines whether citation source information, which specifies the document data that is the source of the cited passage judged not to be legitimate, is included in the thesis data to be determined (step SF13). The specific content of the citation source information is arbitrary; for example, information such as the author name, year of publication, title, journal, volume number, and page location of the source document can be used as citation source information. The criterion for determining whether citation source information is included is also arbitrary. For example, the determination can be based on whether citation source information is given immediately after the cited passage, or on whether citation source information is given at the end of the thesis data in correspondence with a note number given immediately after the cited passage.
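The two example criteria for step SF13 can be sketched with regular expressions. This is only an illustration under strong assumptions that are not in the source: the inline form is taken to be a parenthesised reference containing a four-digit year, the note number is taken to be written as `[n]`, and the matching entry is taken to start a later line with the same `[n]`. The function name `has_source_info` and the `window` parameter are hypothetical.

```python
import re


def has_source_info(thesis_text, quote_end, window=40):
    """Guess whether citation source information follows a cited passage.

    quote_end: index in thesis_text just past the cited passage.
    window: how many following characters count as "immediately after".
    """
    tail = thesis_text[quote_end:quote_end + window]
    # Criterion 1 (assumed format): "(Author, 2005)" right after the passage.
    if re.match(r"\s*\([^()]*\d{4}[^()]*\)", tail):
        return True
    # Criterion 2 (assumed format): a note number "[n]" right after the
    # passage, with an entry starting with the same "[n]" on a later line.
    note = re.match(r"\s*\[(\d+)\]", tail)
    if note:
        entry = re.compile(r"^\[" + note.group(1) + r"\]\s*\S", re.MULTILINE)
        return entry.search(thesis_text, quote_end + note.end()) is not None
    return False


text = 'Intro. "cited passage"[1] continues.\n\nNotes\n[1] Author, Title, 2001.\n'
print(has_source_info(text, text.index('"[1]') + 1))
```

A real determination would need to follow the citation conventions actually used in the thesis data; the patterns here only stand in for that rule set.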
When it is determined that citation source information is included in the thesis data to be determined (step SF13, Yes), the output control unit 102h causes the display device 104 to display the citation source information (step SF14). The method and procedure for displaying the citation source information are arbitrary. For example, when an instruction to display the citation source information corresponding to a cited portion judged not to be a legitimate citation is input via the input device 103, the citation source information corresponding to that cited portion is displayed. When there are a plurality of cited portions judged not to be legitimate citations, the citation source information corresponding to the cited portion selected from among them by an instruction input via the input device 103 is displayed. FIG. 40 shows the quotation determination screen with citation source information displayed. As shown in FIG. 40, the citation source information corresponding to the designated cited portion (the portion "○○○○, △△△△, "××××××", □□ Journal, Vol. ○, pp. △-□" at the bottom of the displayed thesis data) is highlighted in the thesis data display area 105 of the quotation determination screen (in reverse video in FIG. 40).
After the citation source information is displayed in this way (step SF14), or when it is determined in step SF13 that citation source information is not included in the thesis data (step SF13, No), the reference information acquisition unit 102f acquires the reference information (step SF15).
(Effect)
As described above, the quotation determination support apparatus 100 of the ninth embodiment determines whether citation source information, which specifies the document data that is the source of a cited passage, is included in the thesis data, and can therefore obtain material for judging the legality of the citation on the basis of the presence or absence of citation source information.
In addition, because the range designated via the input device 103 is specified as the determination range within the thesis data, the target of the quotation determination can be limited and the load of the determination process can be reduced.
In addition, because the determination range is judged to cite the comparison range when the similarity is equal to or greater than the threshold input via the input device 103, a threshold suited to the purpose of the determination can be set and the determination can be made on the basis of that threshold.
In addition, because the output mode corresponding to the similarity calculated by the similarity calculation unit 102c is acquired from the output mode DB 101h and the determination range is output in the acquired output mode, the determination range can be output in a form that makes the similarity easy for the user to grasp.
[Embodiment 10]
Next, a tenth embodiment will be described. In this embodiment, when citation source information specifying document data is included in the determination target data and the document data specified by that citation source information is determined to be stored in the document data storage means, that document data is specified as the comparison range. Unless otherwise noted, the configuration and processing of the tenth embodiment are the same as those of the first embodiment; for the same configuration and processing, the names and reference numerals used in the first embodiment are used as necessary and their description is omitted.
(Processing - quotation determination support process)
The quotation determination support process executed by the quotation determination support apparatus 100 of the tenth embodiment will be described. FIG. 41 is a flowchart showing the procedure of the quotation determination support process of the tenth embodiment. Except for step SG11, steps SG1 to SG12 are the same as steps S11 to S20a described with reference to FIG. 2 in the first embodiment, so their detailed description is omitted.
In step SG10, the locations in the determination range where document data or the like is cited are indicated on the quotation determination screen and the reference information is displayed. The comparison range specifying unit 102b then stores the cited document data in the document data storage unit 101a, and stores the bibliographic information of that document data (for example, author name, year of publication, title, journal, and URL) and its storage location (for example, a folder name) in the document list storage unit 101b (step SG11). It is then determined whether the processing of steps SG5 to SG11 has been executed for all data in the specified comparison range (step SG12).
(Processing - comparison range specifying process)
Here, the comparison range specifying process executed in step SG4 of the quotation determination support process will be described. FIG. 42 is a flowchart showing the procedure of the comparison range specifying process of the tenth embodiment. Steps SH5 to SH7 are the same as steps S21 to S23 described with reference to FIG. 6 in the first embodiment, so their detailed description is omitted.
When the comparison range specifying process is started, the comparison range specifying unit 102b determines, on the basis of the result of the structural analysis of the thesis data performed in step SG2 of the quotation determination support process, whether citation source information specifying the document data cited in the thesis data is included in the thesis data (step SH1).
When citation source information is included (step SH1, Yes), the comparison range specifying unit 102b refers to the document list storage unit 101b and determines whether bibliographic information corresponding to the citation source information is stored there (step SH2). When the bibliographic information is stored in the document list storage unit 101b (step SH2, Yes), the comparison range specifying unit 102b reads from the document data storage unit 101a the document data saved at the storage location stored in association with that bibliographic information (step SH3), and specifies the read document data as the comparison range (step SH4).
On the other hand, when it is determined in step SH1 that citation source information is not included in the thesis data (step SH1, No), or when it is determined in step SH2 that bibliographic information corresponding to the citation source information is not stored in the document list storage unit 101b (step SH2, No), the comparison range specifying unit 102b reads all previously submitted thesis data stored in the thesis data storage unit 101c (step SH5).
After step SH4 or step SH7, the comparison range specifying unit 102b ends the comparison range specifying process and returns to the main routine.
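The branch structure of steps SH1 to SH5 can be sketched as follows. The data shapes are assumptions made for illustration: the document list storage unit 101b is modelled as a dictionary from citation source information to a storage location, the document data storage unit 101a as a dictionary from storage location to document text, and steps SH5 to SH7 (described in the first embodiment) are represented only by a caller-supplied function. All names are hypothetical.

```python
def specify_comparison_range(source_info, document_list, document_store,
                             fallback_range):
    """Return the comparison range as a list of document texts.

    source_info: citation source information found in the thesis, or None.
    document_list: stand-in for the document list storage unit 101b.
    document_store: stand-in for the document data storage unit 101a.
    fallback_range: callable standing in for steps SH5 to SH7.
    """
    if source_info is not None:                      # step SH1
        location = document_list.get(source_info)    # step SH2
        if location is not None:
            # Steps SH3 and SH4: read the stored document and use it
            # as the comparison range.
            return [document_store[location]]
    # Step SH1 "No" or step SH2 "No": fall back to the wider range.
    return fallback_range()


document_list = {"Author 2001": "folder/a"}
document_store = {"folder/a": "text of the stored document"}
past_theses = lambda: ["past thesis 1", "past thesis 2"]
print(specify_comparison_range("Author 2001", document_list,
                               document_store, past_theses))
```

Narrowing the range to one stored document is what reduces the search load described in the effect of this embodiment.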
(Effect)
As described above, the quotation determination support apparatus 100 of the tenth embodiment stores, in the document data storage unit 101a, the document data that the document citation determination unit 102d has determined to be cited in the determination range. In addition, when citation source information specifying document data is included in the thesis data and the document data specified by that citation source information is determined to be stored in the document data storage unit 101a, that document data is specified as the comparison range. This limits the comparison range to document data already stored in the document data storage unit 101a and reduces the load of searching the comparison range data for the content of the determination range.
[Modification 1 to Embodiment 6]
Although the embodiments of the present invention have been described above, the specific configuration and means of the present invention can be modified and improved as desired within the scope of the technical idea of each invention described in the claims. Such modifications are described below.
In the sixth embodiment, the similarity calculation unit 102r performs processing for the case where the character string of the search key exceeds the character limit. Alternatively, processing can be performed in advance so that the character string designated as the search key does not exceed the character limit.
That is, the similarity calculation unit 102r analyzes the determination range by text mining, using morphological analysis or the like, and divides it into words whose character strings are no longer than the character limit. It designates the words that appear a predetermined number of times or more as search keys and searches the comparison range once for each word. The similarity calculation unit 102r may then be configured to select, from the results of these searches, the comparison ranges whose appearance frequency exceeds a predetermined value as the comparison ranges for which the similarity to the description content of the determination range is calculated.
The similarity calculation process in Modification 1 will now be described. FIG. 22 is a flowchart showing the procedure of the similarity calculation process of Modification 1.
First, the similarity calculation unit 102r applies text mining, such as morphological analysis, to the data of the description content of the determination range and divides it into words whose length is within the character limit (step S111). The similarity calculation unit 102r then calculates the appearance frequency of each word (step S112) and sorts the words in descending order of appearance frequency (step S113). It then designates the word with the highest appearance frequency as the search key (step S114).
Next, the similarity calculation unit 102r searches the comparison range with the designated search key (step S115) and stores the search result in memory (step S116).
Next, the similarity calculation unit 102r determines whether the search has been performed for every word whose appearance frequency is at or above the predetermined number (step S117). If it has not (step S117, No), the unit designates the word with the next highest appearance frequency as the search key (step S118) and repeats the search of steps S115 and S116.
If, on the other hand, the similarity calculation unit 102r determines in step S117 that the search has been completed for every word whose appearance frequency is at or above the predetermined number (step S117, Yes), it selects the comparison range that appears most frequently among the search results stored in memory (step S119). The selected comparison range becomes the target of similarity calculation, and its similarity to the determination range is calculated.
Thus, according to Modification 1 of Embodiment 6, a search result with a high appearance frequency is identified automatically and set as the comparison range used for similarity calculation, so a comparison range that matches the determination range is extracted automatically for citation determination, further improving its accuracy. Because the search can be run with words taken from the determination range that are already within the character limit, the accuracy of citation determination can be improved regardless of the character limit on search keys.
As in Embodiment 6, the similarity calculation unit 102r may be configured to perform the processing of this modification only when it receives, from a search engine or the like, an error notification indicating that the number of characters in the search key exceeds the character limit.
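The flow of steps S111 to S119 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the embodiment: the regular-expression `tokenize` stands in for morphological analysis, `search` is a hypothetical substring search over an in-memory dictionary of comparison ranges, and the parameter names `max_chars` and `min_count` are invented here for the character limit and the predetermined number of occurrences.

```python
import re
from collections import Counter


def tokenize(text, max_chars):
    """Stand-in for morphological analysis (assumption): split on non-word
    characters and keep only words within the search-key character limit."""
    return [w for w in re.findall(r"\w+", text.lower()) if len(w) <= max_chars]


def search(key, comparison_ranges):
    """Hypothetical search step: ids of the comparison ranges that contain key."""
    return [rid for rid, body in comparison_ranges.items() if key in body.lower()]


def select_comparison_range(determination_range, comparison_ranges,
                            max_chars=10, min_count=2):
    # S111-S113: split into words, count occurrences, order by frequency.
    freq = Counter(tokenize(determination_range, max_chars))
    keys = [word for word, n in freq.most_common() if n >= min_count]
    # S114-S118: search with each key in turn and remember every hit.
    hits = Counter()
    for key in keys:
        hits.update(search(key, comparison_ranges))
    # S119: choose the comparison range that appears most often in the results.
    return hits.most_common(1)[0][0] if hits else None
```

A real system would replace `tokenize` with a morphological analyzer and `search` with a query to the search engine or database that holds the comparison ranges.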
[Modification 2 to Embodiments 5 and 6]
In Embodiment 6, the similarity calculation unit 102r handles the case in which the search key exceeds the character limit when calculating the similarity between the determination range and the comparison range. The same processing can also be applied to the comparison range specifying unit 102q of Embodiment 5.
That is, WEB pages are searched using the task sentence extracted from the paper data as the search key, but when the task sentence is long enough to exceed the character limit, the search engine or the like returns the error notification described in Embodiment 6. When the comparison range specifying unit 102q receives this error notification, it therefore designates, as the similarity calculation unit 102r of Embodiment 6 does, a character string of the limit length within the extracted task sentence as the search key and searches multiple times while shifting that string through the task sentence. The comparison range specifying unit 102q may then be configured to determine, as the comparison range, the cited document data designated by the URL that appears most frequently among the URLs returned as search results.
FIG. 23 is a flowchart showing the procedure of the comparison range specifying process of Modification 2. First, the task extraction unit 102p performs structural analysis on the paper data to be judged and extracts the task sentence (step S131). Next, the comparison range specifying unit 102q uses the extracted task sentence as the search key to search for matching WEB pages on the WEB site 131, the file server 133, and the like on the Internet 130 (step S132).
The comparison range specifying unit 102q then determines whether it has received an error notification indicating that the search key exceeded the character limit (step S133).
If no such error notification has been received (step S133, No), the comparison range specifying unit 102q selects the URL of the search result (step S141), and, as in Embodiment 5, the cited document data designated by that URL is specified as the comparison range.
If, on the other hand, the error notification is received in step S133 (step S133, Yes), the comparison range specifying unit 102q obtains the character limit from the received error notification (step S134).
The comparison range specifying unit 102q then designates the character string spanning the limit length from the beginning of the task sentence as the search key (step S135) and searches the WEB pages with it (step S136). It stores the resulting URLs in memory (step S137).
The comparison range specifying unit 102q then determines whether the search key has reached the final character string of the task sentence (step S138). If it has not (step S138, No), the unit designates the next limit-length character string in the task sentence as the search key (step S139) and repeats steps S136 and S137.
If, on the other hand, the final character string has been reached in step S138 (step S138, Yes), the unit selects the most frequently appearing URL among the search-result URLs stored in memory (step S140), and the cited document data designated by the URL of the selected WEB page is specified as the comparison range.
Thus, according to Modification 2, when the search key exceeds the character limit, a limit-length character string within the task sentence is designated as the search key and the search is repeated while shifting that string through the task sentence. A comparison range of cited document data appropriate to the content of the paper data can therefore be specified regardless of the character limit on search keys, further improving the accuracy of citation determination.
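The fallback of steps S132 to S141 can be sketched in Python as follows. This is a minimal sketch under assumptions that are not in the specification: the `KeyTooLongError` exception is a hypothetical stand-in for the search engine's error notification, the caller supplies a `search(key)` function that returns a list of URLs, the task sentence is cut into consecutive non-overlapping chunks, and the first URL is taken when the whole sentence fits within the limit.

```python
from collections import Counter


class KeyTooLongError(Exception):
    """Hypothetical error notification that carries the character limit."""

    def __init__(self, limit):
        super().__init__(f"search key exceeds {limit} characters")
        self.limit = limit


def find_comparison_url(task_sentence, search):
    """search(key) returns a list of URLs or raises KeyTooLongError."""
    try:
        # S132, S133 (No), S141: the whole task sentence fits, use its result.
        urls = search(task_sentence)
        return urls[0] if urls else None
    except KeyTooLongError as err:
        limit = err.limit  # S134: read the limit from the error notification
    hits = Counter()
    # S135-S139: search with consecutive chunks of at most `limit` characters.
    for start in range(0, len(task_sentence), limit):
        hits.update(search(task_sentence[start:start + limit]))
    # S140: choose the URL that was returned most often.
    return hits.most_common(1)[0][0] if hits else None
```

Overlapping windows (a step smaller than the limit) would also fit the description of shifting the string through the task sentence; the non-overlapping step is chosen here only to keep the number of queries small.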
[Modification 3 to Embodiments 5 and 6]
Furthermore, the comparison range specifying unit 102q of Embodiment 5 may be configured to perform the processing described in Modification 1, that is, to split the task sentence into words in advance and search with each word as the search key, so that the character string designated as the search key never exceeds the character limit.
[Modification to Embodiment 7]
In Embodiment 7 described above, the lawfulness of a citation is determined for the portions whose similarity, as calculated by the similarity calculation unit 102c, is judged to be at or above the predetermined threshold, but the determination may instead be made for the entire paper data to be judged. For example, when the paper data to be judged contains the citation-format symbols corresponding to its type (for example, "『』" or "””"), the citations in the paper data may be judged lawful. The file name of the paper data may also be displayed on the display device 104 in an output mode based on the determination result. For example, the file name of paper data whose citations are judged unlawful may be shown in reverse video so that it can be distinguished from the file names of paper data judged lawful.
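The whole-document check described in this modification can be illustrated with a short Python sketch. The mapping `CITATION_FORMATS`, its type names, and the regular expressions are assumptions made for illustration; the specification only states that citation-format symbols are associated with the type of the paper data.

```python
import re

# Hypothetical mapping from document type to the quotation-mark pattern
# that counts as a citation format for that type (assumption).
CITATION_FORMATS = {
    "japanese_paper": re.compile(r"「[^「」]+」|『[^『』]+』"),
    "english_paper": re.compile(r'"[^"]+"|“[^“”]+”'),
}


def citation_looks_lawful(text, doc_type):
    """Return True when the text contains at least one quoted span in the
    citation format registered for doc_type, and False otherwise."""
    pattern = CITATION_FORMATS.get(doc_type)
    return bool(pattern and pattern.search(text))
```

A check of this kind only detects the presence of the expected symbols; it does not confirm that every borrowed passage is enclosed by them.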
[Other Modifications to Embodiments 1 to 10]
An automatic reference function may be incorporated into the determination range specifying units 102a, 102i, and 102j of the citation determination support device of each of the above embodiments so that, at startup, the user is automatically prompted to select the desired paper data from the paper data storage unit 101c and the selected paper data is read.
The comparison range specifying units 102b, 102l, and 102q of the citation determination support device of the above embodiments need not be limited to a single storage unit, WEB site, or the like as the comparison range; they may be configured to specify the comparison range across WEB sites, library search databases, local servers, or any combination of these.
In the embodiments described above, paper data written by students was used as the example of determination target data, but the invention is not limited to this; any data in which text is written can be used as determination target data.
When the description content of the determination range is cited from multiple pieces of document data, the output control unit 102h may be configured to display each cited portion on the display device 104 in an output mode that differs for each piece of document data (for example, a different color or typeface). Each cited portion may also be displayed in a display mode that differs according to the proportion cited from each piece of document data.
The problems to be solved by the invention and the effects of the invention are not limited to those described above. The invention may solve problems not described above or achieve effects not described above, and it may solve only some of the described problems or achieve only some of the described effects.
The citation determination support program executed by the citation determination support device of Embodiments 1 to 10 and the above modifications is provided as a file in an installable or executable format, recorded on a computer-readable recording medium such as a CD-ROM, flexible disk (FD), CD-R, or DVD (Digital Versatile Disk).

Claims (20)

  1.  A citation determination support device for determining whether document data is cited in determination target data to be judged, comprising:
     determination range specifying means for specifying, within the determination target data, a determination range in which the presence or absence of citation of the document data is to be determined;
     comparison range specifying means for specifying, within the document data, a comparison range to be compared with the determination target data;
     similarity calculation means for searching the comparison range specified by the comparison range specifying means for the description content of the determination range specified by the determination range specifying means, and calculating the mutual similarity between the description content of the determination range and the description content of the comparison range;
     document citation determination means for determining that the determination range cites the comparison range when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined threshold; and
     output means for outputting the determination range of the determination target data that cites the comparison range of the document data.
  2.  The citation determination support device according to claim 1, further comprising lawfulness determination means for determining, when the document citation determination means determines that the determination range cites the comparison range, whether the citation is a lawful citation based on the location in the determination range where the comparison range is cited and the vicinity of that location.
  3.  The citation determination support device according to claim 2, wherein the lawfulness determination means determines whether citation source information identifying the document data that is the source of the cited location is included in the determination target data.
  4.  The citation determination support device according to claim 2 or 3, wherein, when the similarity in the determination range is equal to or greater than a predetermined threshold, the lawfulness determination means determines whether the determination range conforms to a predetermined citation format and, based on that result, determines whether the citation of the comparison range in the determination range is a lawful citation.
  5.  The citation determination support device according to claim 4, further comprising citation format storage means for storing the type of the determination target data and the predetermined citation format in association with each other,
     wherein the lawfulness determination means identifies the type of the determination target data, acquires the citation format corresponding to the identified type from the citation format storage means, and determines whether the citation of the comparison range in the determination range conforms to the acquired citation format.
  6.  The citation determination support device according to any one of claims 1 to 5, further comprising reference information acquisition means for acquiring, based on the document data, reference information for referring to the document data containing the comparison range when the document citation determination means determines that the determination range cites the comparison range,
     wherein the output means outputs the reference information acquired by the reference information acquisition means in addition to the determination range of the determination target data that cites the comparison range of the document data.
  7.  The citation determination support device according to any one of claims 1 to 6, wherein the determination range specifying means specifies a predetermined component, from among the components that make up the determination target data, as the determination range.
  8.  The citation determination support device according to any one of claims 1 to 7, further comprising history storage means for storing, in association with creator identification information that uniquely identifies the creator of determination target data generated in the past, either information indicating whether an improper act of citation occurred in that determination target data or the creator's grades,
     wherein, when there are multiple pieces of determination target data that could be judged, the determination range specifying means acquires from the history storage means the creator identification information associated with information indicating that an improper act of citation occurred, or the creator identification information associated with grades lower than a predetermined value, and selects, from the multiple pieces of determination target data, the determination target data created by the creator identified by the acquired creator identification information as the data to be judged.
  9.  The citation determination support device according to any one of claims 1 to 8, further comprising:
     dictionary storage means for storing, in association with each word that may be included in the document data, a word that may be used when modifying that word; and
     word conversion means for converting words included in the determination target data into the words stored in the dictionary storage means,
     wherein the determination range specifying means treats the determination target data converted by the word conversion means as the data to be judged.
  10.  The citation determination support device according to any one of claims 1 to 9, further comprising input means for receiving operation input to the citation determination support device,
     wherein the determination range specifying means specifies, as the determination range, a range of the determination target data designated through the input means.
  11.  The citation determination support device according to any one of claims 1 to 10, further comprising determination target data storage means for storing multiple pieces of determination target data generated in the past,
     wherein the similarity calculation means further calculates the similarity among the multiple pieces of determination target data stored in the determination target data storage means,
     the document citation determination means further determines that citation has occurred among the multiple pieces of determination target data when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined second threshold, and
     the comparison range specifying means specifies, as the comparison range, the pieces of determination target data determined to involve citation among one another.
  12.  The citation determination support device according to any one of claims 1 to 11, further comprising task extraction means for extracting, from the determination target data and based on its description content, task information indicating the task addressed by the determination target data,
     wherein the comparison range specifying means searches the document data using the task information extracted by the task extraction means as a search key and specifies the retrieved document data as the comparison target.
  13.  The citation determination support device according to any one of claims 1 to 12, further comprising document data storage means for storing the document data that the document citation determination means has determined to be cited in the determination range,
     wherein the comparison range specifying means determines whether citation source information identifying the document data cited in the determination target data is included in the determination target data; when it is included, determines whether the document data identified by the citation source information is stored in the document data storage means; and, when it is stored, specifies that document data as the comparison range.
  14.  The citation determination support device according to any one of claims 1 to 13, wherein, when the similarity calculation means searches the comparison range specified by the comparison range specifying means using the description content of the determination range specified by the determination range specifying means as a search key and the number of characters in the search key exceeds a predetermined character limit, the similarity calculation means sequentially designates character strings within the character limit from the determination range as the search key, searches the comparison range multiple times, and takes the search results whose appearance frequency among the multiple search results exceeds a predetermined value as the comparison ranges for which mutual similarity to the description content of the determination range is calculated.
  15.  The citation determination support device according to any one of claims 1 to 14, wherein the similarity calculation means analyzes the determination range, uses each word that appears at least a predetermined number of times as a search key, searches the comparison range specified by the comparison range specifying means once per word, and takes the search results whose appearance frequency among the multiple search results exceeds a predetermined value as the comparison ranges for which mutual similarity to the description content of the determination range is calculated.
  16.  The citation determination support device according to any one of claims 1 to 15, further comprising input means for receiving input of the predetermined threshold,
     wherein the document citation determination means determines that the determination range cites the comparison range when the similarity is equal to or greater than the predetermined threshold input through the input means.
  17.  The citation determination support device according to any one of claims 1 to 16, further comprising citation ratio calculation means for calculating a citation ratio, that is, the proportion of the description content of the determination range that is cited from the comparison range,
     wherein the output means outputs the citation ratio.
  18.  The citation determination support device according to claim 17, wherein the citation ratio calculation means calculates the citation ratio for multiple pieces of determination target data, and
     the output means outputs determination target data information that uniquely identifies each of the multiple pieces of determination target data, in an order based on the citation ratio calculated for each piece by the citation ratio calculation means.
  19.  The citation determination support device according to any one of claims 1 to 18, further comprising output mode information storage means for storing the similarity of the determination range and the output mode of the output means in association with each other,
     wherein the output means acquires, from the output mode information storage means, the output mode corresponding to the similarity calculated by the similarity calculation means and outputs the determination range in the acquired output mode.
  20.  A citation determination support program for determining whether document data is cited in determination target data to be judged, the program causing a computer to function as:
     determination range specifying means for specifying, within the determination target data, a determination range in which the presence or absence of citation of the document data is to be determined;
     comparison range specifying means for specifying, within the document data, a comparison range to be compared with the determination target data;
     similarity calculation means for searching the comparison range specified by the comparison range specifying means for the description content of the determination range specified by the determination range specifying means, and calculating the mutual similarity between the description content of the determination range and the description content of the comparison range;
     document citation determination means for determining that the determination range cites the comparison range when the similarity calculated by the similarity calculation means is equal to or greater than a predetermined threshold; and
     output means for outputting the determination range of the determination target data that cites the comparison range of the document data.
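The threshold test at the core of claims 1 and 20 can be illustrated with a short Python sketch. It rests on assumptions that the claims do not fix: similarity is approximated here with `difflib.SequenceMatcher.ratio()` from the Python standard library, the determination and comparison ranges are plain strings, and the default threshold of 0.8 is arbitrary.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity measure (assumption): a ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def judge_citations(determination_ranges, comparison_ranges, threshold=0.8):
    """Return (determination range, best similarity) for every determination
    range whose best match among the comparison ranges reaches the threshold."""
    cited = []
    for d in determination_ranges:
        best = max((similarity(d, c) for c in comparison_ranges), default=0.0)
        if best >= threshold:
            cited.append((d, best))
    return cited
```

The returned list corresponds to what the output means would present: the determination ranges judged to cite a comparison range, together with the similarity that led to that judgment.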
PCT/JP2009/000360 2008-02-01 2009-01-30 Quotation judgment supporting device WO2009096190A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008-023234 2008-02-01

Publications (1)

Publication Number Publication Date
WO2009096190A1 (en) 2009-08-06

Family

ID=40912544

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/000360 WO2009096190A1 (en) 2008-02-01 2009-01-30 Quotation judgment supporting device

Country Status (2)

Country Link
JP (2) JP5510912B2 (en)
WO (1) WO2009096190A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110383264A (en) * 2016-12-16 2019-10-25 三菱电机株式会社 Searching system

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5207402B2 (en) * 2009-09-30 2013-06-12 キヤノンマーケティングジャパン株式会社 Information processing apparatus, information processing method, and program
KR101033611B1 (en) * 2010-07-09 2011-05-11 한국과학기술정보연구원 System and method for evaluating the suitability of reference
US9218344B2 (en) * 2012-06-29 2015-12-22 Thomson Reuters Global Resources Systems, methods, and software for processing, presenting, and recommending citations
JP5459422B2 (en) * 2013-02-14 2014-04-02 キヤノンマーケティングジャパン株式会社 Information processing apparatus, control method, and program
JP6052816B2 (en) 2014-10-27 2016-12-27 インターナショナル・ビジネス・マシーンズ・コーポレーション (International Business Machines Corporation) Method for supporting secondary use of contents of electronic work, server computer for supporting secondary use of contents of electronic work, and program for server computer
US20170270625A1 (en) * 2016-03-21 2017-09-21 Facebook, Inc. Systems and methods for identifying matching content
JP6691581B2 (en) * 2018-07-26 2020-04-28 楽天株式会社 Information processing apparatus, information processing method, program, storage medium
JP6695538B1 (en) * 2019-07-30 2020-05-20 株式会社ウェブサークル Similar sentence retrieval device and program
JP2022072383A (en) * 2020-10-29 2022-05-17 株式会社Ipsign System, method, and program for extracting infringement information

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1021239A (en) * 1996-06-28 1998-01-23 Toshiba Corp Machine translation system and its method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08263512A (en) * 1995-03-24 1996-10-11 Sumitomo Electric Ind Ltd Document retrieval device
JPH09198409A (en) * 1996-01-19 1997-07-31 Hitachi Ltd Extremely similar document extraction method
JP3625054B2 (en) * 2000-11-29 2005-03-02 松下電器産業株式会社 Technical document retrieval device
JP2006155556A (en) * 2004-10-27 2006-06-15 Hitachi Software Eng Co Ltd Text mining method and text mining server
US20070294610A1 (en) * 2006-06-02 2007-12-20 Ching Phillip W System and method for identifying similar portions in documents
JP2008015774A (en) * 2006-07-05 2008-01-24 Nagaoka Univ Of Technology Imitation document detection system and program

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MURATA T. ET AL.: "Gakusei Report no n-gram ni yoru Ruijido Hyoka no Kento", FIT2002 FORUM ON INFORMATION TECHNOLOGY IPPAN KOEN RONBUNSHU, vol. 2, 13 September 2002 (2002-09-13), pages 101 - 102 *
SHURUTSU KISU: "System ni Shinshoku suru Aratana Kyoi, Spyware o Gekitai Seyo!", COMPUTERWORLD GET TECHNOLOGY RIGHT, vol. 2, no. 1, 1 January 2005 (2005-01-01), pages 86 - 91 *
TAKAHASHI I. ET AL.: "Web kara no Hyosetsu Report Kenshutsu Shuho no Jisso to Hyoka", DAI 46 KAI ADVANCED LEARNING SCIENCE AND TECHNOLOGY SHIRYO (SIG-ALST-A503), 13 March 2006 (2006-03-13), pages 01 - 06 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110383264A (en) * 2016-12-16 2019-10-25 三菱电机株式会社 Searching system
CN110383264B (en) * 2016-12-16 2022-12-30 三菱电机株式会社 Retrieval system

Also Published As

Publication number Publication date
JP2009205674A (en) 2009-09-10
JP5510912B2 (en) 2014-06-04
JP2014149848A (en) 2014-08-21
JP5737772B2 (en) 2015-06-17

Similar Documents

Publication Publication Date Title
WO2009096190A1 (en) Quotation judgment supporting device
US7296019B1 (en) System and methods for providing runtime spelling analysis and correction
JP4654776B2 (en) Question answering system, data retrieval method, and computer program
US9015175B2 (en) Method and system for filtering an information resource displayed with an electronic device
US7318021B2 (en) Machine translation system, method and program
US10552467B2 (en) System and method for language sensitive contextual searching
US20080172220A1 (en) Incorrect Hyperlink Detecting Apparatus and Method
JP2005128873A (en) Question/answer type document retrieval system and question/answer type document retrieval program
JP2017504105A (en) System and method for in-memory database search
JP2016099741A (en) Information extraction support apparatus, method and program
JP2006343925A (en) Related-word dictionary creating device, related-word dictionary creating method, and computer program
JP2007172260A (en) Document rule preparation support apparatus, document rule preparation support method and document rule preparation support program
US20080071593A1 (en) Business process editor, business process editing method, and computer product
JP6305671B1 (en) Template generating apparatus, template generating program, and template generating method
JP2009157620A (en) Information search support device
CN114357961A (en) Project feasibility research report generation method, device, equipment and storage medium
US6122650A (en) Method and apparatus for updating time related data in a modified document
Pirzadeh et al. Resilient user interface level tests
JP2008250893A (en) Information retrieval device, information retrieval method and its program
JP2009169761A (en) Electronic dictionary system, display control method of electronic dictionary, computer program, and data storage medium
JP2006155653A (en) Information display control device and program
JP2005115457A (en) Method of retrieving document file
JP2000200279A (en) Information retrieving device
JPH0668137A (en) Operation command object information generating system and operation command object recognition system
Suchomel et al. Source retrieval for plagiarism detection

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 09706575; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 Ep: PCT application non-entry in European phase
Ref document number: 09706575; Country of ref document: EP; Kind code of ref document: A1