WO2023074457A1 - Matching system, matching method, program, and trained model - Google Patents

Matching system, matching method, program, and trained model

Info

Publication number
WO2023074457A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
matching
seeds
needs
degree
Prior art date
Application number
PCT/JP2022/038679
Other languages
French (fr)
Japanese (ja)
Inventor
拓己 石渡
恵美子 寄▲崎▼
Original Assignee
コニカミノルタ株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by コニカミノルタ株式会社
Publication of WO2023074457A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9032Query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/907Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/908Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0637Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals

Definitions

  • This disclosure relates to matching systems, matching methods, programs, and trained models.
  • Patent Document 1 discloses using seed data, such as in-house technology stored in memory, together with needs data from the public and customers, and applying a SWOT analysis to identify business fields in which the retained seeds meet public needs.
  • The purpose of this disclosure is to provide a matching system, matching method, program, and trained model that can more easily and objectively evaluate, in quantitative terms, the match between seeds and needs.
  • The matching system comprises: an acquisition unit that acquires first data related to needs and second data related to seeds; an analysis unit that quantifies and analyzes the acquired first data and second data; and an output unit that outputs matching information indicating the degree of matching between the first data and the second data using the analysis result from the analysis unit.
  • the invention according to claim 2 is the matching system according to claim 1,
  • the analysis unit converts the first data and the second data into vectors representing a multidimensional space in the digitization.
  • the analysis unit calculates the degree of conformity based on a degree of similarity between a first vector obtained from the first data and a second vector obtained from the second data.
  • the analysis unit has a trained model that uses as inputs a first vector obtained from the first data and a second vector obtained from the second data and outputs the degree of fitness.
  • the invention according to claim 5 is the matching system according to claim 4,
  • the trained model is based on a pattern recognition algorithm.
  • the invention according to claim 6 is the matching system according to any one of claims 1 to 5,
  • The matching system includes an input unit for acquiring input data, and the acquisition unit acquires the first data or the second data from the input data.
  • the invention according to claim 7 is the matching system according to any one of claims 1 to 6,
  • the first data includes at least one predetermined element related to needs,
  • the second data includes at least one predetermined element related to seeds.
  • the element of the first data includes at least one of the purpose, content, field and name of the needs
  • the elements of the second data include at least one of features, functions, competing technologies and names of seeds.
  • the invention according to claim 9 is the matching system according to claim 7 or 8,
  • The matching system includes an input unit for acquiring input data, and the acquisition unit extracts the elements when acquiring the first data or the second data from the input data.
  • each of the first data and the second data includes at least a noun or a verb indicating a function.
  • the invention according to claim 11 is the matching system according to claim 10,
  • the first data and the second data each include an object for a noun or verb indicating the function.
  • A matching method performed by a control unit of a computer includes: an acquisition step of acquiring first data related to needs and second data related to seeds; an analysis step of quantifying and analyzing the acquired first data and second data, respectively; and an output step of outputting matching information indicating the degree of matching between the first data and the second data using the analysis result of the analysis step.
  • The invention according to claim 13 is a program that causes a computer to function as: acquisition means for acquiring first data related to needs and second data related to seeds; analysis means for quantifying and analyzing the acquired first data and second data, respectively; and output means for outputting matching information indicating the degree of matching between the first data and the second data using the analysis result from the analysis means.
  • The trained model takes as inputs a first vector obtained from the first data related to needs and a second vector obtained from the second data related to seeds, and outputs the degree of conformity between the first data and the second data.
  • FIG. 1 is an overall configuration diagram of a matching system according to an embodiment.
  • FIG. 2A is a diagram explaining the content of seeds/needs data.
  • FIG. 2B is a flowchart showing a control procedure of database generation processing.
  • FIG. 3 is a flowchart showing a control procedure of correspondence search control processing.
  • FIG. 4 is a flowchart showing a processing procedure of matching degree calculation processing called in the correspondence search control processing.
  • FIG. 5A is a diagram explaining the setting of the degree of conformity in a second example.
  • FIG. 5B is a flowchart showing a control procedure of learned model generation processing.
  • FIG. 6 is a flowchart showing a control procedure of a second example of a degree-of-fit calculation process using a trained model.
  • FIG. 4 is a flowchart showing a control procedure of matrix generation processing related to collaborative filtering
  • FIG. 11 is a flow chart showing a control procedure of a matching degree calculation process of a third example using a technique of collaborative filtering
  • FIG. 10 is a diagram showing an example of output results
  • FIG. 1 is an overall configuration diagram of a matching system 100 of this embodiment.
  • a matching system 100 of this embodiment includes an information processing device 1 , a database device 2 , and a terminal device 3 .
  • the information processing device 1 is a computer that performs processing related to matching in this embodiment, and may be, for example, an ordinary PC (Personal Computer).
  • the information processing device 1 is connected to the database device 2, refers to the held data stored in the database device 2, and writes and adds new data.
  • the database device 2 may not be directly connected to the information processing device 1, and may be accessible through the network N.
  • the information processing device 1 includes a control unit 11 (acquisition unit, analysis unit, output unit), a storage unit 12, a communication unit 13, and the like.
  • the control unit 11 has a hardware processor such as a CPU that performs arithmetic processing and controls the operation of the information processing apparatus 1 and a memory such as a RAM.
  • the storage unit 12 has a nonvolatile memory and stores a program 121, setting data, and the like.
  • the program 121 includes a control program related to the correspondence search control process of this embodiment.
  • the program 121 may include a part or all of the database generation processing, learned model generation processing, and matrix generation processing described later.
  • the program 121 also includes the generated trained model 1210 .
  • the non-volatile memory may be, for example, flash memory or HDD (Hard Disk Drive).
  • the communication unit 13 controls transmission and reception of data with external devices via the network N.
  • the terminal device 3 is included in the external device.
  • a communication standard for data transmission/reception may be, for example, a LAN (Local Area Network) standard (TCP/IP, etc.).
  • the information processing device 1 may also include a display unit, an operation reception unit, and the like.
  • the database device 2 has a storage unit 21 .
  • The storage unit 21 stores and holds the needs/seeds data 211 acquired in advance, as well as learning data and learned models for converting each term, phrase, sentence, and the like into a multidimensional vector (semantic vector).
  • the terminal device 3 may be an ordinary PC or a mobile terminal (smartphone, etc.), and performs operations such as inputting data to be matched and displaying matching results.
  • the terminal device 3 includes a control unit 31, a communication unit 32, a display unit 33, an operation reception unit 34 (input unit), and the like.
  • the control unit 31 includes a hardware processor and a memory that perform arithmetic processing and centrally control the operation of the terminal device 3 .
  • the communication unit 32 controls transmission and reception of data with an external device via the network N.
  • The external device includes the information processing apparatus 1 described above.
  • the communication standard related to data transmission/reception is the same as that of the information processing apparatus 1, that is, the standard related to LAN (TCP/IP, etc.) is included.
  • the display unit 33 has a display screen on which characters can be displayed, and performs display operations on the display screen under the control of the control unit 31 .
  • the display screen is not particularly limited, but is, for example, a liquid crystal display screen (LCD).
  • the operation reception unit 34 receives an input operation from the user of the terminal device 3 and outputs the content of the input operation to the control unit 31 as an operation signal.
  • the operation reception unit 34 has, for example, a keyboard and a pointing device.
  • the pointing device includes a mouse and the like.
  • the operation reception unit 34 may have a touch panel or the like that overlaps the display screen. Alternatively, these may be externally attached peripheral devices.
  • In the matching system 100, seeds, such as knowledge related to technology and intellectual property in general owned by an entity itself (for example, a corporation such as one's own company), that is, things including intangibles (here, things do not include people), are matched with needs such as the demands and wishes of customers and society.
  • the degree of suitability is an index that indicates whether the seeds can satisfy the needs/whether the needs can be satisfied by the seeds. In the case of a problem, it can be an index that shows whether the seeds can be the solution.
  • Input of data is received by the operation receiving unit 34 of the terminal device 3 or the like.
  • the operation accepting unit 34 may directly accept a data input operation, or may accept an input operation specifying a file name and, if necessary, a path where the file is located.
  • the control unit 31 transmits the input data received by the operation receiving unit 34 through the communication unit 32 to the information processing device 1 via the network N.
  • Each of the elements of seeds and the elements of needs is digitized by being represented (converted) as a multidimensional vector (a vector representing a multidimensional space), which is an array of numerical values representing the magnitudes of multiple semantic components. The degree of conformity is then quantitatively evaluated according to the degree of matching (distance) between the multidimensional vector of seeds (second vector) and the multidimensional vector of needs (first vector).
  • FIG. 2A is a diagram for explaining the content of seeds/needs data.
  • the seeds and needs element concisely presents the required information.
  • Here, four elements are defined for needs: the purpose (Why), the content (What), the field, and the name of the request. Four elements are likewise defined for seeds: the features, functions, competing technologies, and name of the technology.
  • These eight elements are known as the Elevator Pitch syntax, a short, to-the-point business speech syntax.
  • The four elements related to needs and the four elements related to seeds may be extracted from different data sets.
  • the four elements of needs can be acquired from information collected by sales representatives, mass media information, Internet information, and the like.
  • the four elements related to the seeds can be obtained from technical documents within the company.
  • Technical documentation may be internal documentation only, or may include contractual documents, publicly available press releases and patent documents, and the like.
  • the number of elements may be narrowed down and acquired and used within a range in which the elements related to needs and the elements related to seeds correspond. For example, a total of four elements may be obtained: two elements of the purpose and content of the request for the needs and two elements of the features and functions of the technology for the seeds.
  • The above eight elements need not necessarily be extracted according to the elevator pitch syntax, and other items may be set so that the correspondence between needs and seeds can be appropriately evaluated quantitatively as described later.
  • For example, items (elements) related to needs may be determined in advance from business factors such as management status and market size, and items (elements) related to seeds, such as the status of joint research and the status of disclosure, may be determined in advance.
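The element sets described above can be pictured as simple records. The following is a minimal sketch only: the class names, field names, and sample values are illustrative assumptions and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeedRecord:
    """The four need elements named in the text (field names are assumed)."""
    purpose: str   # Why
    content: str   # What
    field: str
    name: str


@dataclass(frozen=True)
class SeedRecord:
    """The four seed elements named in the text (field names are assumed)."""
    features: str
    functions: str
    competing_technologies: str
    name: str


def reduced_view(need: NeedRecord, seed: SeedRecord) -> dict:
    """Narrow each side to two elements, as in the four-element example."""
    return {
        "need": (need.purpose, need.content),
        "seed": (seed.features, seed.functions),
    }


# Hypothetical sample values, for illustration only.
need = NeedRecord("keep drinks warm", "insulated container",
                  "consumer goods", "warm-drink request")
seed = SeedRecord("low heat loss", "maintain temperature",
                  "vacuum flask", "thermal coating")
view = reduced_view(need, seed)
```

Keeping the reduced view as a separate function leaves the full eight-element records intact while allowing the narrower comparison the text mentions.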
  • the process of extracting these elements from the original document may be done manually by the person in charge. Alternatively, part or all of the processing may be performed by the information processing device 1 or another terminal device based on an input, instruction, or the like from the terminal device 3 .
  • When elements are extracted automatically, for example, the input data is decomposed into morphemes such as words using morphological analysis, and then syntactic analysis is used to determine dependencies between morphemes, co-occurrence relationships, and the like.
  • Predicates (that is, verbs and nouns used for functional expressions, such as nouns that can be combined with "do" to form verbs) and objects are used as the minimum units, and elements are extracted by adding modifiers to them according to necessity and conditions.
  • For example, the extracted terms may be "hot water" (included in modifiers or objects), "temperature" (object), and "maintain" (predicate).
  • the gist of the content may be determined based on, for example, the name of the document file or the title of the text document. Through the morphological analysis, syntactic analysis, etc., the main content, purpose, function, feature, etc. corresponding to the name can be automatically extracted.
  • FIG. 2B is a flowchart showing a control procedure by the control unit 11 of the database generation process executed by the information processing device 1 or the like. For example, one or a large number of text data are prepared in advance so as to be readable, and this processing related to the generation of comparison target data is executed by a predetermined input operation or at execution timing such as periodic processing. Note that this database generation process may be executed so that the setting as to whether the text data to be read corresponds to needs or seeds can be acquired.
  • The control unit 11 selects and acquires one text data from the prepared text data (step S301; acquisition unit, acquisition step, acquisition means).
  • the control unit 11 determines whether the input data is data related to needs or data related to seeds, and extracts four elements corresponding to the determination result (step S302).
  • four elements may be extracted for each.
  • the control unit 11 organizes the contents of the four extracted elements into verbs (predicates), objects, and modifiers (step S303).
  • the control unit 11 determines whether or not the sorted content overlaps with the content already stored in the storage unit 21 of the database device 2 (step S304).
  • the control unit 11 stores the sorted data in the storage unit 21 of the database device 2 after newly adding or partially updating the data according to the presence or absence of duplication (step S305). It should be noted that if the new organization data completely overlaps with the existing content, there is no need to update it, so the control section 11 may omit the process of step S305.
  • the control unit 11 determines whether or not all the input data to be processed have been acquired (step S306). When it is determined that all the input data have been obtained (“YES” in step S306), the control unit 11 ends the database generation process. If it is determined that the acquisition of input data has not ended (“NO” in step S306), the process of the control unit 11 returns to step S301.
  • each input data is read in order to extract and organize four elements. Extraction and arrangement of four elements may be performed.
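The database generation loop (steps S301 to S306) can be sketched roughly as follows. This is a hedged illustration, not the disclosed implementation: `extract` is a hypothetical callable standing in for the element extraction of step S302, `organize` is a placeholder for step S303, and only complete duplicates are detected (partial updates are not modeled).

```python
def organize(elements):
    """Placeholder for S303 (assumption): normalize each extracted element.

    A real system would split each element into predicate, object, and
    modifiers; here the text is only stripped and lowercased.
    """
    return tuple(e.strip().lower() for e in elements)


def generate_database(texts, extract, store=None):
    """Sketch of steps S301-S306.

    texts   : iterable of (kind, text), where kind is "needs" or "seeds"
    extract : hypothetical callable (kind, text) -> sequence of elements
    store   : dict mapping kind -> list of organized records
    """
    store = {"needs": [], "seeds": []} if store is None else store
    for kind, text in texts:              # S301, repeated until S306 is YES
        elements = extract(kind, text)    # S302
        record = organize(elements)       # S303
        if record in store[kind]:         # S304: complete duplicate found
            continue                      # S305 may be omitted in this case
        store[kind].append(record)        # S305: add the new record
    return store


# Toy input: elements separated by "|" (an assumed format for this sketch).
texts = [("needs", "a|b|c|d"), ("needs", "A|b|c|d "), ("seeds", "w|x|y|z")]
db = generate_database(texts, lambda kind, text: text.split("|"))
```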
  • Using the needs/seeds data 211, which is a database obtained in advance manually or automatically by the database generation process, processing is performed to search for seeds that match a separately input need, or for needs that match input seeds.
  • FIG. 3 is a flowchart showing the control procedure by the control unit 11 of the correspondence search control process executed by the information processing device 1.
  • This correspondence search control process is started, for example, when the control unit 11 acquires a search execution command together with the input data of the needs or seeds input by the terminal device 3.
  • the input data has necessary elements determined in advance as described above.
  • The control unit 11 acquires input data from the terminal device 3 (step S101; acquisition unit, acquisition step, acquisition means). The control unit 11 acquires a setting as to whether the input data is data related to needs or seeds (step S102).
  • The control unit 11 determines whether the input is data related to needs (first data) (step S103). If the input is determined to be data related to needs ("YES" in step S103), the control unit 11 sets the held data related to seeds (second data) as the search target (step S104), and the processing proceeds to step S106. If the input is determined not to be data related to needs, that is, it is data related to seeds (second data) ("NO" in step S103), the control unit 11 sets the held data related to needs (first data) as the search target (step S105), and the processing proceeds to step S106.
  • The control unit 11 executes a matching degree calculation process, which will be described later, to calculate the matching degree of each search target (step S106).
  • the control unit 11 extracts the data of seeds or needs to be searched whose matching degree satisfies the criteria (step S107).
  • the control unit 11 appropriately processes the extracted data as necessary and outputs it to the terminal device 3 as matching information in an easy-to-read form (step S108; output unit, output step, output means). Then, the control unit 11 terminates the correspondence search control process.
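The branching in steps S101 to S108 can be sketched as below. The function name, the `score` callable, and the threshold criterion are assumptions made for illustration; the word-overlap score in the usage lines is a toy stand-in for the matching degree calculation, not the method of the embodiment.

```python
def search_matches(input_item, input_kind, database, score, threshold):
    """Sketch of steps S101-S108.

    database  : dict with keys "needs" and "seeds", each a list of items
    score     : hypothetical callable (input_item, candidate) -> number,
                where a higher value means a better match
    threshold : assumed form of the criterion applied in S107
    """
    if input_kind == "needs":                  # S103: YES
        targets = database["seeds"]            # S104
    else:                                      # S103: NO
        targets = database["needs"]            # S105
    scored = [(score(input_item, t), t) for t in targets]    # S106
    hits = [(s, t) for s, t in scored if s >= threshold]     # S107
    hits.sort(key=lambda pair: pair[0], reverse=True)        # S108: ordered output
    return hits


# Toy usage with a word-overlap score (illustrative only).
db = {"needs": ["keep water hot"],
      "seeds": ["heater keeps water hot", "paper feeder"]}
overlap = lambda a, b: len(set(a.split()) & set(b.split()))
result = search_matches("keep water hot", "needs", db, overlap, 2)
```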
  • numerical evaluation is performed by converting the contents extracted and organized in the form of verbs (predicates) and objects into multidimensional vectors.
  • the number of dimensions of the multidimensional vector is not particularly limited, but is, for example, 50 to 200 dimensions in total. Alternatively, they may be converted element by element into multidimensional vectors and these may simply be combined.
  • the request of the need and the content of the need may each be represented by a 50-dimensional vector, and the need may be represented by a combined 100-dimensional vector.
  • a seed feature and a seed function may each be represented by a 50-dimensional vector, and the seed may be represented by a combined 100-dimensional vector.
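The per-element conversion and combination described above amounts to concatenating vectors. A minimal sketch, with placeholder component values rather than real embeddings:

```python
def combine(*element_vectors):
    """Concatenate per-element vectors into one combined vector."""
    combined = []
    for vector in element_vectors:
        combined.extend(vector)
    return combined


# Placeholder 50-dimensional vectors (values are arbitrary, for illustration).
purpose_vec = [0.1] * 50
content_vec = [0.2] * 50
need_vec = combine(purpose_vec, content_vec)   # 100-dimensional need vector
```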
  • Word2vec, doc2vec derived therefrom, BERT (Bidirectional Encoder Representations from Transformers), and the like are known for conversion from natural language expression to multidimensional vector expression, although not particularly limited.
  • Machine learning for these conversions may be executed within the matching system 100 of the present embodiment, an already-trained model may be acquired from outside and used, or an external server may be accessed to use these programs.
  • The degree of adaptation is represented by the distance (an example of similarity) between a first vector representing needs and a second vector representing seeds.
  • Here, for example, cosine similarity is used as the distance.
  • Cosine similarity is the inner product of two vectors divided by the product of their respective magnitudes. If the two vectors are unit vectors, the cosine similarity is simply the inner product of the two vectors.
  • the distance may be represented by other indices such as the Euclidean distance.
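The two measures named above can be written out directly from their definitions. This is a small standard-library sketch; the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Inner product of a and b divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Note that the two measures run in opposite directions: a larger cosine similarity means the vectors point more nearly the same way, whereas a smaller Euclidean distance means the vectors are closer.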
  • FIG. 4 is a flow chart showing the processing procedure of the degree-of-match calculation process called in the above correspondence search control process.
  • This matching degree calculation process constitutes an analysis step in the matching method of this embodiment, and also constitutes an analysis means in the program 121 .
  • the control unit 11 converts the content of the input data into a multidimensional vector as described above (step S151).
  • the control unit 11 acquires one search target data from the needs/seeds data 211 (step S152).
  • the control unit 11 converts the acquired search target data into a multidimensional vector (step S153).
  • the control unit 11 calculates the distance (for example, cosine similarity) between the multidimensional vector related to the input data and the multidimensional vector related to the search target data (step S154).
  • the control unit 11 determines whether or not all search target data has been acquired (step S155). If it is determined that all search target data has been acquired ("YES" in step S155), the control unit 11 terminates the matching degree calculation process and returns the process to the corresponding search control process.
  • If it is determined that not all search target data has been acquired (there is search target data that has not been acquired) ("NO" in step S155), the processing of the control unit 11 returns to step S152.
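The loop of steps S151 to S155 can be sketched as follows. Both `embed` (text to vector) and `similarity` are hypothetical callables supplied by the caller; the vowel-count embedding and plain inner product in the usage lines are toy placeholders, not a semantic model such as Word2vec or BERT.

```python
def matching_degrees(input_text, targets, embed, similarity):
    """Sketch of steps S151-S155.

    embed      : hypothetical callable text -> list of floats
    similarity : hypothetical callable (vector, vector) -> float
    """
    query = embed(input_text)                               # S151
    results = []
    for target in targets:                                  # S152, until S155 is YES
        vector = embed(target)                              # S153
        results.append((target, similarity(query, vector))) # S154
    return results


# Toy placeholders (assumptions for illustration only).
toy_embed = lambda text: [float(text.count(c)) for c in "aeiou"]
inner = lambda a, b: sum(x * y for x, y in zip(a, b))
degrees = matching_degrees("aa", ["a", "e"], toy_embed, inner)
```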
  • a machine learning model related to image recognition may be used as another example (second example) of the degree of conformity.
  • A 50 × 2 pixel matrix, in which the component values of the multidimensional vectors for the purpose and the content of the need (request) are each arranged in one column, and a 50 × 2 pixel matrix, in which the component values of the multidimensional vectors for the features and the functions of the seed (technology) are each arranged in one column, are combined to generate a 50 × 4 pixel matrix (200 pixels, with each component value corresponding to a tone value).
  • the degree of compatibility is obtained from the degree of similarity of this matrix pattern to the tendency of the matrix pattern when the seeds and needs are matched.
  • FIG. 5A is a diagram explaining the setting of the degree of conformity in this second example.
  • the 200-pixel matrix pattern is determined.
  • a matrix pattern of 4 rows and 50 columns is used here, it is not limited to this. It may be a simple vector in which 200 elements are arranged in a line, or may be a matrix pattern with other numbers of rows and columns, such as 2 rows and 100 columns.
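The layouts mentioned above (4 rows by 50 columns, 2 rows by 100 columns, or a single line of 200 elements) are different arrangements of the same 200 values. A minimal sketch, assuming four 50-dimensional element vectors; the function names and the row-major ordering are illustrative choices.

```python
def build_matrix(purpose, content, features, functions):
    """Stack the four element vectors as rows (4 rows x 50 columns here)."""
    rows = [list(purpose), list(content), list(features), list(functions)]
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all element vectors must have the same length")
    return rows


def reshape(matrix, n_rows):
    """Rearrange the same values, in row-major order, into n_rows rows."""
    flat = [value for row in matrix for value in row]
    if len(flat) % n_rows != 0:
        raise ValueError("number of values is not divisible by n_rows")
    n_cols = len(flat) // n_rows
    return [flat[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]


# Placeholder values standing in for vector components / tone values.
matrix = build_matrix([1] * 50, [2] * 50, [3] * 50, [4] * 50)
two_by_hundred = reshape(matrix, 2)
single_line = reshape(matrix, 1)[0]
```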
  • The degree of conformity of the matrix pattern is obtained by inputting this 200-pixel matrix (that is, the first vector and the second vector) to the trained model 1210, which is a machine learning model trained in advance.
  • the learned model 1210 may be generated between each data related to seeds and each data related to needs stored in the database device 2 as described above.
  • the distance (cosine similarity) between the second vector related to the data of a certain seed and the first vector related to the data of a certain need is obtained as described above.
  • the above matrix data is generated and used as learning data for those having a high matching degree according to the distance (a small value in the case of cosine similarity) and satisfying the matching condition.
  • a machine learning model is learned by associating this learning data with a numerical value representing high matching (“Good”) as teacher data.
  • a plurality of items are extracted that have a small matching degree according to the distance obtained above (the value is large for cosine similarity) and satisfy the non-matching condition.
  • Learning data is generated by further randomly rearranging vector components related to each element of seeds and each element of needs in the plurality of sets of extracted data.
  • a machine learning model is learned by associating this learning data with numerical values representing low adaptation (“Bad”) as teacher data.
  • a trained model 1210 is obtained in which the rate of high matching with respect to the input matrix pattern (first data and second data) is output as a numerical value or the like as the degree of matching.
  • As the algorithm of the machine learning model, for example, a supervised pattern recognition algorithm may be used, such as a support vector machine or a neural network, in particular deep learning.
  • The trained model that outputs a need matching an input multidimensional vector for a certain seed and the trained model that outputs a seed matching an input multidimensional vector for a certain need may be common, or may be generated separately.
  • FIG. 5B is a flowchart showing a control procedure by the control unit 11 for the learned model generation process. For this process, a machine learning model whose algorithm has been determined in advance is prepared in an unlearned state, and the process may be started in response to a predetermined input operation from the terminal device 3, or automatically when the stored needs and seeds data in the database device 2 is updated.
  • the control unit 11 converts the content of a certain seed or need into a multidimensional vector (step S201).
  • the control unit 11 acquires one comparison target data (needs data if the input is seeds, and seeds data if the input is needs) (step S202).
  • the control unit 11 converts the obtained comparison target data into a multidimensional vector (step S203).
  • the control unit 11 calculates the distance between the two obtained multidimensional vectors (step S204).
  • the control unit 11 determines whether the calculated distance is within the lower reference (less than the lower reference value) (step S205). If it is determined to be within the lower standard ("YES" in step S205), the control unit 11 generates matrix data combining two multidimensional vectors (step S206).
  • the generated matrix data may be, for example, 4 rows and 50 columns, although it is not particularly limited, as described above.
  • the control unit 11 inputs the generated matrix data to the machine learning model.
  • the control unit 11 optimizes the parameters of the machine learning model by setting highly compatible "Good” as teacher data for this matrix data and back propagating deviations (errors) from the output results. (step S207). Then, the processing of the control unit 11 proceeds to step S210.
  • If it is determined in step S205 that the calculated distance is not within the lower reference (is greater than or equal to the lower reference value) ("NO" in step S205), the control unit 11 determines whether or not the calculated distance is within the upper reference (greater than the upper reference value) (step S208). A distance within the upper reference is greater than one within the lower reference. Between the upper reference and the lower reference, there may be a distance range that belongs to neither. If it is determined that the distance is within the upper reference ("YES" in step S208), the control unit 11 stores the set of original needs data and seeds data for the two multidimensional vectors from which the distance was obtained (step S209). Then, the processing of the control unit 11 proceeds to step S210.
  • The control unit 11 determines whether or not all data to be compared has been acquired (step S210). If it is determined that not all the data to be compared has been acquired (there is data to be compared that has not been acquired) ("NO" in step S210), the processing of the control unit 11 returns to step S202.
  • The control unit 11 determines whether or not all data of needs or seeds to be input has been input (step S211). If it is determined that not all the data of the needs or seeds to be input has been input (there is data that has not been input) ("NO" in step S211), the processing of the control unit 11 returns to step S201. At this time, all acquisition information of the comparison target data is initialized.
  • For the sets of needs data and seeds data stored in step S209, the control unit 11 replaces a part of the elements of either the needs data or the seeds data in one set with the corresponding elements in another stored set, generates each multidimensional vector again, and generates matrix data combining them (step S212). The replacement data may be chosen so that the distance between the needs and the seeds does not become close as a result of the replacement.
  • the control unit 11 inputs the matrix data to the machine learning model.
  • control unit 11 sets low-matching “Bad” as teacher data, and optimizes the parameters of the machine learning model by, for example, backpropagating the difference (error) between the output result and the teacher data (step S213). .
  • the trained model 1210 is stored in the storage unit 12, and the control unit 11 performs a trained model generation process. exit.
  • FIG. 6 is a flow chart showing the control procedure of the degree-of-fit calculation process of the second example using the trained model 1210 generated in this way.
  • This conformity level calculation process includes steps S161 and S162 in place of the process of step S154 in the conformity level calculation process of the first example.
  • Other processes are the same, and the same processing contents are assigned the same reference numerals, and detailed description thereof will be omitted.
  • When the content of the search target data has been converted into a multidimensional vector in step S153, the control unit 11 combines the multidimensional vector of the input content with the multidimensional vector of the search target data to generate matrix data (step S161). The control unit 11 inputs this matrix data to the trained model 1210 and performs the arithmetic processing of the trained model 1210. The control unit 11 acquires the value of the degree of adaptation (the output of the trained model 1210) obtained as a result of the processing (step S162). Then, the processing of the control unit 11 proceeds to step S155.
  • Collaborative filtering defines correspondences between two parameters (here, needs and seeds). When one parameter is input, it finds items of that parameter whose tendency of correspondence (here, selection) with the other parameter is similar, and selects and outputs items of the other parameter that have been selected for those similar items but not yet for the input item.
  • FIG. 7A is a diagram illustrating the correspondence between needs and seeds related to this collaborative filtering.
  • Needs and seeds are arranged in a matrix, and "1" is entered for each correspondence that has been selected.
  • In this example, the seeds selected for need 03 and for need NM show similar tendencies (seeds 02, 05, 06, etc.), so their correspondences are close.
  • When need 03 is newly input, seed 01, which has been selected for need NM but not for need 03, is output.
  • In the matching system 100 of the present embodiment, which uses this collaborative filtering technique, when needs data or seeds data (one) is input, held needs or seeds (one) whose multidimensional vector is close to the multidimensional vector of the input content are selected, and the seeds or needs (other) corresponding to the selected needs or seeds (one) are then selected. That is, since the input needs or seeds (one) are not necessarily identical to anything already held, close ones are used instead. With this collaborative filtering, appropriate output cannot be produced unless a certain number of selections have been made and the similarity tendencies have been established.
  • the selection related to the correspondence relationship is not limited to the selection operation in the correspondence search control process.
  • The selection may also include correspondences established through the development and sale of actual products. Specifically, instead of the database generation process that extracts seeds and needs from separate text data as described above, needs data described in association with seeds data in sources such as product development information and sales information may be obtained and defined as being in a selected correspondence.
  • FIG. 7B is a flowchart showing a control procedure by the control unit 11 for matrix generation processing related to collaborative filtering.
  • the control unit 11 acquires the contents of each seed and need (one to four elements each) stored from the database device 2, and converts them into multidimensional vectors (step S251).
  • the control unit 11 allocates each seed and need as a component of each row/column of the two-dimensional matrix together with the obtained multidimensional vector (step S252).
  • For each cell indicating a combination of seeds and needs, the control unit 11 sets "1" in the cells that have been selected (step S253). Note that, for example, each cell may be set to "0" as an initial value (initialized), and only the selected cells changed from "0" to "1". Then, the control unit 11 terminates the matrix generation process.
  • FIG. 8 is a flow chart showing the control procedure by the control unit 11 of the matching degree calculation process of the third example using this collaborative filtering technique. In this matching level calculation process, only step S151 in the matching level calculation process of the first example shown in FIG.
  • control unit 11 calculates the distance between the input data and the data of the same classification (seeds or needs) set in the matrix (step S171).
  • The control unit 11 extracts a reference (predetermined) number of data items in order of closeness, starting from the smallest calculated distance (step S172).
  • the control unit 11 selects other classified data that has been selected corresponding to the extracted same classified data (step S173).
  • the control unit 11 calculates a score for each of the selected other classification data (step S174).
  • the score may be determined based on the absolute value of the cosine similarity, the reciprocal of the Euclidean distance, or the like so that the larger the distance (the smaller the degree of similarity), the smaller the score.
  • the control unit 11 outputs each selection data together with the score (step S175). Then, the control unit 11 terminates the matching degree calculation process and returns the process to the correspondence search control process.
  • FIG. 9 is a diagram showing an example of output results. Here, there is shown a bar graph in which some of the requirements (needs) held in advance for the technology (seed) D are listed in percent.
  • The matching system 100 of this embodiment includes the control unit 11 of the information processing device 1.
  • As an acquisition unit, the control unit 11 acquires first data related to needs and second data related to seeds; as an analysis unit, it quantifies and analyzes the acquired first data and second data; and as an output unit, it uses the analysis result to output matching information indicating the degree of matching between the first data and the second data.
  • With this matching system 100, the match between seeds and needs can be quantitatively evaluated more easily and objectively.
  • In quantifying seeds and needs, the control unit 11 converts the first data and the second data into vectors representing a multidimensional space. Natural language expressions of seeds and needs still contain a large amount of information even when summarized concisely. Expressing them as multidimensional vectors allows numerical values corresponding to their meaning to be obtained more accurately.
  • The control unit 11 calculates the degree of conformity based on the degree of similarity (for example, cosine similarity) between the first vector obtained from the first data and the second vector obtained from the second data. With such processing, a numerical similarity can be obtained from the first vector and the second vector by simple calculation.
  • As the analysis unit, the control unit 11 uses the trained model 1210, which takes the first vector and the second vector as inputs and outputs the degree of adaptation.
  • By properly training the machine learning model so that it outputs the degree of conformity of the first vector and the second vector, the matching system 100 can obtain an objective and quantitative evaluation.
  • The trained model 1210 is based on a pattern recognition algorithm. By performing pattern recognition processing with this trained model 1210 on the matrix pattern in which the values of the direction components of the multidimensional vectors are arranged, the matching system 100 can appropriately and quantitatively evaluate the overall degree of similarity of the multidimensional vectors.
  • The terminal device 3 also includes an operation reception unit 34 as an input unit that acquires input data.
  • As an acquisition unit, the control unit 11 acquires the first data or the second data from the input data.
  • The degree of compatibility with the other is calculated for most of the data held in the database device 2 (here, in a round-robin manner). Therefore, when a user of the matching system 100 wishes to obtain other information that matches one need or seed, the desired information can be obtained easily and appropriately.
  • The first data includes at least one predetermined element related to needs, and the second data includes at least one predetermined element related to seeds.
  • The elements of the first data include at least one of the purpose, content, field, and name of the needs, and the elements of the second data include at least one of the features, functions, competing technologies, and names of the seeds.
  • When the control unit 11, as an acquisition unit, acquires the input data accepted by the operation reception unit 34, it extracts at least one of the eight elements of the elevator pitch syntax from the input data. In this way, elements may be automatically extracted from input document data. This eliminates the need to organize and generate input information manually in advance, thereby reducing labor.
  • The first data and the second data each include at least a noun or a verb indicating a function. This makes it possible to obtain the degree of compatibility with high accuracy.
  • The first data and the second data each include an object for the noun or verb indicating a function.
  • Because the target of the operation is also included in the first data and the second data, input data that expresses the operation content or the request content more accurately and concisely can be obtained. The accuracy of the degree of compatibility obtained from the quantified data can also be improved.
  • The matching method of the present embodiment includes an acquisition step of acquiring first data related to needs and second data related to seeds, an analysis step of quantifying and analyzing the acquired first data and second data, and an output step of outputting matching information indicating the degree of matching between the first data and the second data using the analysis result of the analysis step.
  • The trained model 1210 of the present embodiment takes as inputs the first vector obtained from the first data related to needs and the second vector obtained from the second data related to seeds, and outputs the degree of matching between the first data and the second data. With such a trained model 1210, the overall degree of matching between the seeds data and the needs data can be obtained as a more appropriate and objective value.
  • one data of needs or seeds (one data) is input, the degree of conformity with respect to a plurality of other data held for this is calculated, and the one with a large degree of conformity
  • a large number of data related to needs and data related to seeds may be combined in a round-robin manner to detect omissions in implementation.
  • the needs/seeds data 211 held in the storage unit 21 of the database device 2 is also acquired by the matching degree calculation process and then converted into a multidimensional vector.
  • A multidimensional vector may instead be held in advance in the needs/seeds data 211.
  • the multidimensional vector data in the needs/seeds data 211 may be updated as needed.
  • the digitization is explained as multi-dimensional vectorization, but the range of information that can be expressed with scalar values may be expressed only with scalar values.
  • the matching degree is obtained based on image recognition technology. For example, by approximating the waveforms of the one-dimensional arrays of the first vector and the second vector, the matching degree may be obtained based on the similarity of the waveforms.
  • image data including graphs and the like is generated and output as output data, but the present invention is not limited to this.
  • Text data or the like may simply be output, a predetermined dedicated output format may be defined in a structured language, or data may be output according to the standard data format of spreadsheet software.
  • the output may be sent to a printer or the like instead of being sent back to the terminal device 3 to form an image.
  • the input data may be concisely arranged according to other criteria, or may be simply extracted from the data expressed in natural language in units of necessary sentences and clauses without rearrangement.
  • the information processing device 1, the database device 2, and the terminal device 3 are described as separate configurations, but all processing may be performed by a single computer (matching device).
  • the corresponding search control process may be distributed by the controllers of a plurality of server devices.
  • the terminal device 3 is not limited to one specific device, and a plurality of devices may exist.
  • The storage unit 12, made up of a non-volatile memory such as an HDD or a flash memory, has been described as an example of a computer-readable medium storing the program 121 for control such as the calculation of the degree of conformity of the present invention, but the medium is not limited to this. Other non-volatile memories such as MRAM and portable recording media such as CD-ROMs and DVD discs are also applicable as computer-readable media.
  • a carrier wave is also applicable to the present invention as a medium for providing program data according to the present invention via a communication line.
  • the specific configurations, contents and procedures of processing operations, etc. shown in the above embodiments can be changed as appropriate without departing from the scope of the present invention.
  • the scope of the present invention includes the scope of the invention described in the claims and the scope of equivalents thereof.
  • This invention can be used for matching systems, matching methods, programs, and trained models.
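The third-example procedure described above (steps S171 to S175) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names, the toy three-dimensional vectors, the use of cosine similarity as the closeness measure, and the rule of keeping the best score when several neighbours share a selection are all assumptions made here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

def recommend(input_vector, same_class_vectors, selections, reference_number=2):
    """Return (other-class item, score) pairs, best score first."""
    # S171: closeness between the input and each held item of the same classification
    sims = {name: cosine_similarity(input_vector, vec)
            for name, vec in same_class_vectors.items()}
    # S172: keep the reference number of closest items
    nearest = sorted(sims, key=sims.get, reverse=True)[:reference_number]
    # S173 and S174: gather the other-class items selected for those neighbours
    # and score them (assumption: keep the highest absolute similarity)
    scores = {}
    for name in nearest:
        for other in selections.get(name, set()):
            scores[other] = max(scores.get(other, 0.0), abs(sims[name]))
    # S175: output the selections together with their scores
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical held needs and the seeds already selected for them (cells set to 1).
needs = {
    "need_01": [1.0, 0.0, 0.0],
    "need_03": [0.0, 1.0, 1.0],
    "need_NM": [0.0, 1.0, 0.9],
}
selections = {
    "need_01": {"seed_04"},
    "need_03": {"seed_02", "seed_05", "seed_06"},
    "need_NM": {"seed_01", "seed_02", "seed_05", "seed_06"},
}
result = recommend([0.0, 1.0, 1.0], needs, selections, reference_number=2)
```

With this toy data the input is closest to need_03 and need_NM, so seed_01 is suggested through need_NM even though it was never selected for need_03, while seed_04 is not suggested at all.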

Abstract

Provided are a matching system, a matching method, a program, and a trained model that are capable of quantitatively evaluating matching of seeds for needs more easily and objectively. This matching system comprises: an acquisition unit that acquires first data on needs and second data on seeds; an analysis unit that quantifies and analyzes the acquired first data and second data; and an output unit that uses the result of analysis by the analysis unit to output matching information indicating the degree of matching of the second data for the first data.

Description

Matching system, matching method, program, and trained model
This disclosure relates to a matching system, a matching method, a program, and a trained model.
Conventionally, in business development and the like, it is important to sort out the relationship between seeds, such as the technology, knowledge, equipment, and human resources that a corporation or the like itself owns, and the needs of the world. However, researchers and developers often do not understand the needs of the world, and strategic planners often do not recognize their own seeds, so a company frequently fails to make proper use of its own technology for market entry or market expansion.
In recent years, large-scale data has increasingly been used for decisions on business selection, efficiency improvement, new development, and the like. Patent Document 1 discloses using stored seeds data, such as in-house technology, together with needs data of the public and customers to identify, through SWOT analysis, business fields in which the held seeds meet public needs.
Japanese Patent Application Laid-Open No. 2009-20712
However, with conventional technology, the criteria for evaluating the match between seeds and needs have not been established, making objective quantitative evaluation difficult.
The purpose of this disclosure is to provide a matching system, a matching method, a program, and a trained model that can quantitatively evaluate the match between seeds and needs more easily and objectively.
In order to achieve the above object, the invention according to claim 1 is a matching system comprising:
an acquisition unit that acquires first data related to needs and second data related to seeds;
an analysis unit that quantifies and analyzes each of the acquired first data and second data; and
an output unit that outputs matching information indicating the degree of matching between the first data and the second data using the result of analysis by the analysis unit.
Further, the invention according to claim 2 is the matching system according to claim 1, wherein
the analysis unit, in the quantification, converts each of the first data and the second data into a vector representing a multidimensional space.
Further, the invention according to claim 3 is the matching system according to claim 2, wherein
the analysis unit calculates the degree of matching based on the degree of similarity between a first vector obtained from the first data and a second vector obtained from the second data.
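As one illustration of such a similarity, the cosine similarity that the embodiment mentions as an example can be written as follows. The symbols $\mathbf{v}_1$ and $\mathbf{v}_2$ for the first and second vectors and $d$ for their dimension are notation assumed here, not taken from the source.

```latex
\[
\operatorname{sim}(\mathbf{v}_1,\mathbf{v}_2)
  = \frac{\mathbf{v}_1\cdot\mathbf{v}_2}{\lVert\mathbf{v}_1\rVert\,\lVert\mathbf{v}_2\rVert}
  = \frac{\sum_{i=1}^{d} v_{1,i}\,v_{2,i}}
         {\sqrt{\sum_{i=1}^{d} v_{1,i}^{2}}\;\sqrt{\sum_{i=1}^{d} v_{2,i}^{2}}}
\]
```

The value lies between -1 and 1 and depends only on the directions of the two vectors, not on their lengths. How it is mapped to the degree of matching is not specified in the claim.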
Further, the invention according to claim 4 is the matching system according to claim 2, wherein
the analysis unit has a trained model that takes as inputs a first vector obtained from the first data and a second vector obtained from the second data and outputs the degree of matching.
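The data flow of this claim, in which two vectors are combined and passed to a model that returns a degree of matching, might be sketched as below. This is only an illustration under assumptions: stacking the vectors as the two rows of a matrix is one possible reading of the "matrix data" of step S161, and `toy_model` is a hypothetical placeholder, not a trained model and not the pattern recognition model of the embodiment.

```python
def build_matrix(first_vector, second_vector):
    """Stack the two vectors as the rows of a 2 x d matrix (assumed layout)."""
    if len(first_vector) != len(second_vector):
        raise ValueError("both vectors must have the same dimension")
    return [list(first_vector), list(second_vector)]

def degree_of_matching(model, first_vector, second_vector):
    """Combine the vectors into matrix data and return the model output."""
    return model(build_matrix(first_vector, second_vector))

def toy_model(matrix):
    """Hypothetical stand-in for a trained model: 1.0 for identical rows,
    falling toward 0 as the mean absolute difference between rows grows."""
    first_row, second_row = matrix
    diffs = [abs(x - y) for x, y in zip(first_row, second_row)]
    return 1.0 / (1.0 + sum(diffs) / len(diffs))

same = degree_of_matching(toy_model, [1.0, 2.0], [1.0, 2.0])
apart = degree_of_matching(toy_model, [0.0, 0.0], [1.0, 1.0])
```

Passing the model in as a callable keeps the combining step separate from whatever model is actually trained, so the placeholder can be swapped out without touching the rest.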
Further, the invention according to claim 5 is the matching system according to claim 4, wherein
the trained model is based on a pattern recognition algorithm.
Further, the invention according to claim 6 is the matching system according to any one of claims 1 to 5,
comprising an input unit that acquires input data, wherein
the acquisition unit acquires the first data or the second data from the input data.
Further, the invention according to claim 7 is the matching system according to any one of claims 1 to 6, wherein
the first data includes at least one predetermined element related to needs, and
the second data includes at least one predetermined element related to seeds.
Further, the invention according to claim 8 is the matching system according to claim 7, wherein
the elements of the first data include at least one of the purpose, content, field, and name of the needs, and
the elements of the second data include at least one of the features, functions, competing technologies, and name of the seeds.
Further, the invention according to claim 9 is the matching system according to claim 7 or 8,
comprising an input unit that acquires input data, wherein
the acquisition unit extracts the elements when acquiring the first data or the second data from the input data.
Further, the invention according to claim 10 is the matching system according to any one of claims 1 to 9, wherein
each of the first data and the second data includes at least a noun or a verb indicating a function.
Further, the invention according to claim 11 is the matching system according to claim 10, wherein
each of the first data and the second data includes an object for the noun or verb indicating the function.
Further, the invention according to claim 12 is a matching method performed by a control unit of a computer, the method including:
an acquisition step of acquiring first data related to needs and second data related to seeds;
an analysis step of quantifying and analyzing each of the acquired first data and second data; and
an output step of outputting matching information indicating the degree of matching between the first data and the second data using the analysis result of the analysis step.
Further, the invention according to claim 13 is a program that causes a computer to function as:
acquisition means for acquiring first data related to needs and second data related to seeds;
analysis means for quantifying and analyzing each of the acquired first data and second data; and
output means for outputting matching information indicating the degree of matching between the first data and the second data using the result of analysis by the analysis means.
Further, the invention according to claim 14 is a trained model that takes as inputs a first vector obtained from first data related to needs and a second vector obtained from second data related to seeds, and outputs the degree of matching between the first data and the second data.
According to this disclosure, the match between seeds and needs can be quantitatively evaluated more easily and objectively.
An overall configuration diagram of the matching system of this embodiment.
A diagram explaining the content of the seeds/needs data.
A flowchart showing the control procedure of the database generation process.
A flowchart showing the control procedure of the correspondence search control process.
A flowchart showing the processing procedure of the matching degree calculation process called in the correspondence search control process.
A diagram explaining the setting of the degree of matching in the second example.
A flowchart showing the control procedure of the trained model generation process.
A flowchart showing the control procedure of the matching degree calculation process of the second example using the trained model.
A diagram explaining the correspondence between needs and seeds in collaborative filtering.
A flowchart showing the control procedure of the matrix generation process for collaborative filtering.
A flowchart showing the control procedure of the matching degree calculation process of the third example using the collaborative filtering technique.
A diagram showing an example of output results.
Embodiments will be described below with reference to the drawings.
FIG. 1 is an overall configuration diagram of a matching system 100 of this embodiment.
The matching system 100 of this embodiment includes an information processing device 1, a database device 2, and a terminal device 3.
The information processing device 1 is a computer that performs the matching-related processing of this embodiment and may be, for example, an ordinary PC (personal computer). The information processing device 1 is connected to the database device 2, refers to the held data stored in the database device 2, and writes new data to it. The database device 2 need not be directly connected to the information processing device 1 and may instead be accessible through the network N.
The information processing device 1 includes a control unit 11 (acquisition unit, analysis unit, output unit), a storage unit 12, a communication unit 13, and the like.
The control unit 11 has a hardware processor, such as a CPU, that performs arithmetic processing and centrally controls the operation of the information processing device 1, and a memory such as a RAM.
The storage unit 12 has a non-volatile memory and stores a program 121, setting data, and the like. The program 121 includes a control program for the correspondence search control process of this embodiment. The program 121 may also include some or all of the database generation process, the trained model generation process, and the matrix generation process described later; when the trained model generation process is included, the program 121 further includes the generated trained model 1210.
The non-volatile memory may be, for example, a flash memory or an HDD (hard disk drive).
The communication unit 13 controls the transmission and reception of data to and from external devices via the network N. The external devices include the terminal device 3. The communication standard for data transmission and reception may be, for example, one for a LAN (local area network), such as TCP/IP.
The information processing device 1 may additionally include a display unit, an operation reception unit, and the like.
The database device 2 includes a storage unit 21. The storage unit 21 stores needs/seeds data 211 acquired in advance, as well as learning data, trained models, and the like for converting terms, phrases, sentences, and so on into multidimensional vectors (semantic vectors).
The terminal device 3 may be an ordinary PC, a mobile terminal (such as a smartphone), or the like, and is used to input data to be matched and to display matching results.
The terminal device 3 includes a control unit 31, a communication unit 32, a display unit 33, an operation reception unit 34 (input unit), and the like. The control unit 31 has a hardware processor, a memory, and the like that perform arithmetic processing and centrally control the operation of the terminal device 3. The communication unit 32 controls the transmission and reception of data to and from external devices via the network N. The external devices include the information processing device 1 described above. The communication standards for data transmission and reception include the same one used by the information processing device 1, that is, a LAN standard (such as TCP/IP).
The display unit 33 has a display screen capable of displaying characters and performs display operations on that screen under the control of the control unit 31. The display screen is not particularly limited and is, for example, a liquid crystal display (LCD).
The operation reception unit 34 receives input operations from the user of the terminal device 3 and outputs the content of each input operation to the control unit 31 as an operation signal. The operation reception unit 34 has, for example, a keyboard and a pointing device such as a mouse. In addition to or instead of these, the operation reception unit 34 may have a touch panel or the like positioned over the display screen. These may also be externally attached peripheral devices.
Next, matching in the matching system 100 of this embodiment will be described.
The matching system 100 matches seeds, such as technology and knowledge relating to intellectual property in general that an entity itself (for example, a corporation such as one's own company) possesses (that is, things, including intangible ones; here, things do not include people), with needs, such as the demands and wishes of customers and society.
In this matching system 100, in response to the input of data related to seeds (second data) or data related to needs (first data) (that is, one of the first data and the second data), an analysis is performed using that one data and the other, opposite data held by the matching system 100, and other data with a high degree of matching to the one data is output. Here, the degree of matching is an index indicating whether a seed can satisfy a need, or whether a need can be satisfied by a seed; for example, it may indicate whether the gist of the need and that of the seed are similar, or, when the need is a problem, whether the seed can be a means of solving it. Data input is received by, for example, the operation reception unit 34 of the terminal device 3. The operation reception unit 34 may accept direct data input, or may accept an input operation that specifies a file name and, as necessary, the path where the file is located. The control unit 31 causes the communication unit 32 to transmit the input data received by the operation reception unit 34 to the information processing device 1 via the network N.
In the analysis for matching, the elements of a seed and the elements of a need are each quantified by representing (converting) them as a multidimensional vector (a vector in a multidimensional space), that is, an array of numerical values representing the magnitudes of a plurality of semantic components. The degree of matching is then evaluated quantitatively according to the degree of agreement (distance) between the multidimensional vector for the seed (second vector) and the multidimensional vector for the need (first vector).
FIG. 2A is a diagram for explaining the content of the seeds/needs data.
The elements of seeds and needs preferably present the necessary information concisely. Here, for example, four elements are defined for needs: the purpose of the request (Why), its content (What), the field, and the name. Four elements are likewise defined for seeds: the features of the technology or the like, its functions, competing technologies, and the name. These eight elements are known as the elevator pitch syntax, a structure for concise, to-the-point business speech.
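The eight elements can be pictured as two small record types. The following is only an illustrative sketch: the class names, field names, and sample values are assumptions and do not appear in the source.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class NeedRecord:
    """Four elements describing a need (all names are hypothetical)."""
    purpose: str  # purpose of the request (Why)
    content: str  # content of the request (What)
    domain: str   # field
    name: str     # name


@dataclass(frozen=True)
class SeedRecord:
    """Four elements describing a seed (all names are hypothetical)."""
    features: str   # features of the technology or the like
    functions: str  # functions
    competing: str  # competing technologies
    name: str       # name


# Made-up sample values, for illustration only.
need = NeedRecord("keep drinks warm outdoors",
                  "maintain the temperature of hot water",
                  "outdoor goods", "thermal bottle request")
seed = SeedRecord("vacuum double-wall structure",
                  "maintain the temperature of liquids",
                  "foam insulation", "vacuum insulation panel")
```

Keeping the two record types parallel (four fields each) reflects the requirement that the need elements and the seed elements correspond to each other.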
In the data stored in the database device 2 as the needs/seeds data 211, which serves as the comparison target for input data, the four elements relating to needs and the four elements relating to seeds may be extracted from separate data sets. For example, the four elements relating to needs can be obtained from information collected by sales representatives, mass media information, Internet information, and the like. The four elements relating to seeds can be obtained, for example, from technical documents within the company. The technical documents may be internal documents only, or may include contract documents, published press releases, patent documents, and the like.
Note that not all eight elements need to be used. The number of elements acquired and used may be reduced, as long as the elements relating to needs and the elements relating to seeds still correspond to each other. For example, a total of four elements may be acquired: the purpose and content of the request for needs, and the features and functions of the technology for seeds.
In addition, the eight elements need not be extracted according to the elevator pitch syntax. Other items may be set so that the correspondence between needs and seeds can be appropriately evaluated quantitatively, as described later. For example, the items (elements) relating to needs may be predetermined from business factors such as management status and market size, and the items (elements) relating to seeds may be predetermined so as to include joint research status, publication status, and the like.
The process of extracting these elements from the source document may be performed manually by a person in charge. Alternatively, part or all of the process may be performed by the information processing device 1 or another terminal device based on input or instructions from the terminal device 3. When the elements are extracted partly by machine, well-known text mining techniques can make tendencies easier to judge: for example, morphological analysis decomposes the input data into morphemes such as words, and syntactic analysis then determines dependencies between morphemes or obtains co-occurrence relationships, so that words and phrases appearing frequently in the information are extracted first. Based on this result, the person in charge may then extract the four elements while selecting words and phrases in line with the tendency.
The longer these extracted elements are, the more information they carry, but the more likely they are to include words of low importance. In the matching system 100 of this embodiment, so that each element does not become too long, elements are extracted with a predicate and an object as the minimum unit, with modifiers added according to necessity and conditions. The predicate is a verb or a noun used in a functional expression (a noun indicating a function, such as a noun that becomes a verb when combined with the Japanese verb "suru" ("to do")). For example, for a heat-insulating structure (such as a heat-retaining container), the content is organized as "of hot water" (a modifier, or part of the object), "the temperature" (object), and "maintain" (predicate).
When all of the processing is performed by the information processing device 1 or the like, the gist of the content may be determined based on, for example, the name of the document file or the title of the text document. The main content, purpose, functions, features, and so on corresponding to the name can then be extracted automatically by the morphological analysis and syntactic analysis described above.
FIG. 2B is a flowchart showing the control procedure, performed by the control unit 11, of the database generation process executed by the information processing device 1 or the like. This process, which generates the comparison target data, is executed with one or many pieces of text data prepared in advance in a readable form, either in response to a predetermined input operation or at an execution timing such as that of periodic processing. The database generation process may be executed so that a setting indicating whether the text data to be read corresponds to needs or to seeds can be acquired.
When the database generation process starts, the control unit 11 selects and acquires one piece of the prepared text data (step S301; acquisition unit, acquisition step, acquisition means). The control unit 11 determines whether the input data relates to needs or to seeds, and extracts the four elements corresponding to the determination result (step S302). When a plurality of needs or seeds are identified in one piece of text data, four elements may be extracted for each of them.
The control unit 11 organizes the content of each of the four extracted elements into a verb (predicate), an object, and modifiers (step S303). The control unit 11 determines whether the organized content duplicates content already stored in the storage unit 21 of the database device 2 (step S304). Depending on whether there is duplication, the control unit 11 stores the organized data in the storage unit 21 of the database device 2 either as a new entry or as a partial update (step S305). When the newly organized data completely duplicates existing content, no update is needed, so the control unit 11 may omit step S305.
The control unit 11 determines whether all of the input data to be processed has been acquired (step S306). When it is determined that all of the input data has been acquired ("YES" in step S306), the control unit 11 ends the database generation process. When it is determined that acquisition of the input data has not finished ("NO" in step S306), the processing of the control unit 11 returns to step S301.
In the embodiment described above, each piece of input data is read in turn and its four elements are extracted and organized. Alternatively, all of the input data for needs, or for seeds, may be read at once, and the four elements for a plurality of kinds of needs or seeds may be extracted and organized together.
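The loop of steps S301 to S306 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `toy_extract` is a hypothetical stand-in for the element extraction and organization of steps S302 and S303, and keying stored records by kind and name is also an assumption.

```python
def toy_extract(text):
    """Hypothetical stand-in for steps S302-S303.

    Each line of the text is assumed to be 'kind|name|predicate|object'.
    """
    for line in text.splitlines():
        kind, name, predicate, obj = line.split("|")
        yield kind, {"name": name, "predicate": predicate, "object": obj}


def build_database(texts, extract, store=None):
    """Add organized records to the store, skipping complete duplicates."""
    store = {} if store is None else store
    for text in texts:                            # S301, repeated until S306 is "YES"
        for kind, elements in extract(text):      # S302-S303
            key = (kind, elements["name"])        # assumed record key
            if store.get(key) == elements:        # S304: complete duplicate
                continue                          # S305 is omitted
            # S305: add a new record or partially update an existing one
            store[key] = {**store.get(key, {}), **elements}
    return store


texts = [
    "need|n1|maintain|temperature",
    "need|n1|maintain|temperature\nseed|s1|insulate|container",
]
db = build_database(texts, toy_extract)
```

The second text repeats the first need verbatim, so only two records end up in the store.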
With reference to the needs/seeds data 211, a database obtained in advance in this way either manually or automatically by the database generation process, a search is performed for seeds that match a separately input need, or for needs that match an input seed.
FIG. 3 is a flowchart showing the control procedure, performed by the control unit 11, of the correspondence search control process executed by the information processing device 1. The correspondence search control process starts, for example, when the control unit 11 acquires a search execution command together with input data on needs or seeds entered at the terminal device 3. The input data has its necessary elements defined in advance as described above.
When the correspondence search control process starts, the control unit 11 acquires the input data from the terminal device 3 (step S101; acquisition unit, acquisition step, acquisition means). The control unit 11 acquires a setting indicating whether the input data relates to needs or to seeds (step S102).
The control unit 11 determines whether the input is data relating to needs (first data) (step S103). When the input is determined to be data relating to needs (first data) ("YES" in step S103), the control unit 11 sets the held data relating to seeds (second data) as the search target (step S104). The processing of the control unit 11 then proceeds to step S106. When the input is determined not to be data relating to needs, that is, to be data relating to seeds (second data) ("NO" in step S103), the control unit 11 sets the held data relating to needs (first data) as the search target (step S105). The processing of the control unit 11 then proceeds to step S106.
In step S106, the control unit 11 executes the matching degree calculation process described later and calculates the degree of matching of each search target (step S106). The control unit 11 extracts the seed or need data among the search targets whose degree of matching satisfies a criterion (step S107). The control unit 11 processes the extracted data as necessary and outputs it to the terminal device 3 as matching information in an easy-to-read form (step S108; output unit, output step, output means). The control unit 11 then ends the correspondence search control process.
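The branching in steps S103 to S105 and the filtering in steps S106 to S108 can be sketched as below. The function names, the dictionary layout of the held data, and the word-overlap score are all assumptions made for illustration; the score is only a stand-in for the matching degree calculation described later.

```python
def word_overlap(a, b):
    """Toy score (assumption): Jaccard overlap of words, in [0, 1]."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def correspondence_search(query, query_kind, database, score, threshold):
    """Return (score, item) pairs of the opposite kind that meet the threshold."""
    if query_kind not in ("need", "seed"):
        raise ValueError("query_kind must be 'need' or 'seed'")
    # S103-S105: a need searches the held seeds, and a seed searches the held needs.
    target_kind = "seed" if query_kind == "need" else "need"
    # S106: score every search target.
    scored = [(score(query, item), item) for item in database[target_kind]]
    # S107: keep the targets whose score satisfies the criterion.
    hits = [pair for pair in scored if pair[0] >= threshold]
    # S108: order the output so the best matches come first.
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


database = {
    "seed": ["vacuum insulation to maintain temperature", "print color images fast"],
    "need": ["maintain water temperature"],
}
hits = correspondence_search("maintain water temperature", "need",
                             database, word_overlap, 0.2)
```

Passing the score function as a parameter keeps the search flow unchanged whichever of the matching degree examples is used.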
Next, calculation of the degree of matching will be described.
In the matching system 100 of this embodiment, numerical evaluation is performed by converting the content that has been extracted and organized in the form of verbs (predicates) and objects into multidimensional vectors. The number of dimensions of the multidimensional vector is not particularly limited, but is, for example, 50 to 200 in total. Alternatively, each element may be converted into its own multidimensional vector and the resulting vectors simply concatenated. For example, the request and the content of a need may each be represented by a 50-dimensional vector, and the need may be represented by the 100-dimensional vector obtained by concatenating them. Similarly, the features and the functions of a seed may each be represented by a 50-dimensional vector, and the seed may be represented by the 100-dimensional vector obtained by concatenating them. The method of converting natural language expressions into multidimensional vector expressions is not particularly limited; known methods include Word2vec, doc2vec derived from it, and BERT (Bidirectional Encoder Representations from Transformers). The machine learning for these conversions may be executed within the matching system 100 of this embodiment. Alternatively, an already trained model may be obtained from outside and used, or these programs may be used by accessing an external server.
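The concatenation of two 50-dimensional element vectors into one 100-dimensional vector can be sketched as follows. `toy_embed` is a hypothetical placeholder that produces a deterministic pseudo-vector with no semantic meaning; a real system would substitute an embedding model such as those named above.

```python
import hashlib


def toy_embed(text, dim=50):
    """Hypothetical placeholder embedding: deterministic values in [0, 1]."""
    return [hashlib.sha256(f"{text}:{i}".encode()).digest()[0] / 255.0
            for i in range(dim)]


def combine(element_vectors):
    """Concatenate per-element vectors into a single vector."""
    combined = []
    for vector in element_vectors:
        combined.extend(vector)
    return combined


# Two 50-dimensional element vectors become one 100-dimensional need vector.
need_vector = combine([toy_embed("keep drinks warm"),
                       toy_embed("maintain the temperature of hot water")])
```

Because the elements are concatenated in a fixed order, each block of 50 components always refers to the same element, which is what allows a need vector and a seed vector to be compared position by position.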
As a first example, the degree of matching is represented by the distance (an example of similarity) between the first vector representing a need and the second vector representing a seed. As the distance, cosine similarity is used, for example. Cosine similarity is the inner product of two vectors divided by the product of their magnitudes. If both vectors are unit vectors, their inner product is simply the cosine similarity. Alternatively, the distance may be expressed by another index, such as the Euclidean distance.
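The definition above translates directly into code. In the sketch below, `cosine_similarity` follows the stated definition. Because the surrounding description treats a smaller distance as a closer match, the sketch also defines `cosine_distance` as 1 minus the similarity; that conversion is an assumption and is not stated in the source.

```python
import math


def cosine_similarity(a, b):
    """Inner product of a and b divided by the product of their magnitudes."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Assumed distance form: 0 for the same direction, larger for worse matches."""
    return 1.0 - cosine_similarity(a, b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Here `same_direction` is 1 up to rounding and `orthogonal` is 0, so the corresponding distances are 0 and 1.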
FIG. 4 is a flowchart showing the procedure of the matching degree calculation process called in the correspondence search control process described above.
This matching degree calculation process constitutes the analysis step in the matching method of this embodiment, and also constitutes the analysis means in the program 121.
When the matching degree calculation process is called, the control unit 11 converts the content of the input data into a multidimensional vector as described above (step S151).
The control unit 11 acquires one piece of search target data from the needs/seeds data 211 (step S152). The control unit 11 converts the acquired search target data into a multidimensional vector (step S153). The control unit 11 calculates the distance (for example, the cosine similarity) between the multidimensional vector of the input data and the multidimensional vector of the search target data (step S154).
The control unit 11 determines whether all of the search target data has been acquired (step S155). When it is determined that all of the search target data has been acquired ("YES" in step S155), the control unit 11 ends the matching degree calculation process and returns the processing to the correspondence search control process.
When it is determined that not all of the search target data has been acquired, that is, some search target data remains ("NO" in step S155), the processing of the control unit 11 returns to step S152.
Alternatively, as another example of the degree of matching (a second example), a machine learning model for image recognition may be used. For example, a 50 × 2 pixel matrix, in which the component values of the multidimensional vectors for the purpose and the content of a need (request) are each arranged in one line, is combined with a 50 × 2 pixel matrix, in which the component values of the multidimensional vectors for the features and the functions of a seed (technology) are each arranged in one line, to generate a 50 × 4 pixel matrix (200 pixels, each component value corresponding to a gradation value). The degree of matching is then obtained from the degree to which this matrix pattern resembles the tendency of matrix patterns observed when seeds and needs match.
FIG. 5A is a diagram explaining how the degree of matching is set in this second example.
As shown in FIG. 5A, the 200-pixel matrix pattern described above is defined. Although a matrix pattern of 4 rows and 50 columns is used here, the pattern is not limited to this. It may be a simple vector in which the 200 components are arranged in a single line, or a matrix pattern with a different number of rows and columns, such as 2 rows and 100 columns.
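Building the 4-row, 50-column pattern and rearranging it into another shape can be sketched as follows. The function names and the row order (need purpose, need content, seed features, seed functions) are assumptions for illustration.

```python
def to_matrix(need_purpose, need_content, seed_features, seed_functions):
    """Stack four equal-length element vectors as the rows of one matrix."""
    rows = [need_purpose, need_content, seed_features, seed_functions]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all element vectors must have the same length")
    return [list(row) for row in rows]


def reshape(matrix, n_rows):
    """Rearrange the same components, in row-major order, into n_rows rows."""
    flat = [value for row in matrix for value in row]
    if n_rows <= 0 or len(flat) % n_rows != 0:
        raise ValueError("component count must be divisible by n_rows")
    width = len(flat) // n_rows
    return [flat[i * width:(i + 1) * width] for i in range(n_rows)]


# Constant rows make it easy to see where each element ends up.
matrix = to_matrix([0.0] * 50, [1.0] * 50, [2.0] * 50, [3.0] * 50)
two_by_hundred = reshape(matrix, 2)   # 2 rows x 100 columns
single_line = reshape(matrix, 1)[0]   # simple 200-component vector
```

Whichever shape is chosen, the same shape would have to be used both when training the model and when querying it.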
The degree of matching of a matrix pattern is obtained as the output of a trained model 1210, a machine learning model trained in advance, when the 200-pixel matrix (that is, the first vector and the second vector) is input to it. The trained model 1210 may be generated using the data relating to seeds and the data relating to needs stored in the database device 2 as described above.
Specifically, the distance (cosine similarity) or the like is obtained as described above between the second vector for a given piece of seed data and the first vector for a given piece of need data. For pairs whose degree of matching according to the distance is high (for cosine similarity, the value is small) and which satisfy the condition for matching, the matrix data described above is generated and used as training data. The machine learning model is trained by associating this training data with teacher data such as a numerical value representing a high match ("Good").
Conversely, a plurality of pairs are extracted whose degree of matching according to the obtained distance is low (for cosine similarity, the value is large) and which satisfy the condition for not matching. The vector components for the seed elements and the need elements in the extracted pairs are further recombined at random to generate training data. The machine learning model is trained by associating this training data with teacher data such as a numerical value representing a low match ("Bad").
Through this training, a trained model 1210 is obtained that outputs, as a numerical value or the like representing the degree of matching, the proportion or the like to which an input matrix pattern (first data and second data) is a high match. As the algorithm of the machine learning model, a supervised algorithm for pattern recognition may be used, for example a support vector machine or a neural network, and in particular deep learning.
Note that the trained model that outputs matching needs for an input multidimensional vector of a seed and the trained model that outputs matching seeds for an input multidimensional vector of a need may be a single common model or may be generated separately.
FIG. 5B is a flowchart showing the control procedure, performed by the control unit 11, of the trained model generation process. For this process, the algorithm of the machine learning model is determined in advance and the model is prepared in an untrained state. The process can be started by a predetermined input operation from the terminal device 3, or automatically in response to, for example, an update of the stored data on needs and seeds in the database device 2.
When the trained model generation process starts, the control unit 11 converts the content of a given seed or need into a multidimensional vector (step S201). The control unit 11 acquires one piece of comparison target data (need data when the input is a seed, seed data when the input is a need) (step S202). The control unit 11 converts the acquired comparison target data into a multidimensional vector (step S203).
The control unit 11 calculates the distance between the two multidimensional vectors obtained (step S204). The control unit 11 determines whether the calculated distance is within the lower criterion (less than the lower reference value) (step S205). When the distance is determined to be within the lower criterion ("YES" in step S205), the control unit 11 generates matrix data combining the two multidimensional vectors (step S206). As described above, the generated matrix data is not particularly limited, but may be, for example, 4 rows by 50 columns. The control unit 11 inputs the generated matrix data to the machine learning model. The control unit 11 also sets a high match ("Good") as the teacher data for this matrix data and optimizes the parameters of the machine learning model by, for example, backpropagating the deviation (error) from the output result (step S207). The processing of the control unit 11 then proceeds to step S210.
When it is determined in step S205 that the calculated distance is not within the lower criterion (is equal to or greater than the lower reference value) ("NO" in step S205), the control unit 11 determines whether the calculated distance is within the upper criterion (greater than the upper reference value) (step S208). Distances within the upper criterion are larger than those within the lower criterion. Between the two criteria there may be a range of distances that belongs to neither. When the distance is determined to be within the upper criterion ("YES" in step S208), the control unit 11 stores the pair of original need data and seed data from which the two multidimensional vectors giving that distance were obtained (step S209). The processing of the control unit 11 then proceeds to step S210.
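The two threshold tests of steps S205 and S208 amount to a three-way classification of each pair. The sketch below is illustrative only: the function name, the label strings, and the sample reference values are assumptions.

```python
def label_pair(distance, lower, upper):
    """Classify a need/seed pair by its distance (smaller means closer)."""
    if lower > upper:
        raise ValueError("lower reference value must not exceed upper reference value")
    if distance < lower:        # S205 "YES": train immediately as a high match
        return "Good"
    if distance > upper:        # S208 "YES": store the pair for later recombination
        return "BadCandidate"
    return None                 # in neither range: not used for training


# Hypothetical reference values, for illustration only.
labels = [label_pair(d, lower=0.2, upper=0.8) for d in (0.1, 0.5, 0.9)]
```

Leaving a gap between the two reference values keeps ambiguous pairs out of both the high-match and the low-match training sets.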
In step S210, the control unit 11 determines whether all of the comparison target data has been acquired (step S210). When it is determined that not all of the comparison target data has been acquired, that is, some comparison target data remains ("NO" in step S210), the processing of the control unit 11 returns to step S202. When it is determined that all of the comparison target data has been acquired ("YES" in step S210), the control unit 11 determines whether all of the need or seed data to be input has been input (step S211). When it is determined that not all of the need or seed data to be input has been input, that is, some data remains to be input ("NO" in step S211), the processing of the control unit 11 returns to step S201. At this time, all of the acquisition information for the comparison target data is initialized.
When it is determined that all of the need or seed data to be input has been input ("YES" in step S211), the control unit 11 takes each pair stored in step S209 and, in either its need data or its seed data, replaces some of the elements with the same elements of another stored pair as appropriate. It then generates the multidimensional vectors again and combines them to generate matrix data (step S212). The replacement data may be chosen so that the replacement does not bring the need and the seed closer together. The control unit 11 inputs the matrix data to the machine learning model. The control unit 11 also sets a low match ("Bad") as the teacher data and optimizes the parameters of the machine learning model by, for example, backpropagating the difference (error) between the output result and the teacher data (step S213). When all of the generated matrix data has been input and the optimization of the parameters by error backpropagation or the like is complete, the trained model 1210 is stored in the storage unit 12, and the control unit 11 ends the trained model generation process.
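The element replacement of step S212 can be sketched as below. This is one possible reading, with several assumptions: records are plain dictionaries, exactly one seed element is replaced per pair, the donor pair is chosen at random, and the optional check that the replacement does not bring the need and the seed closer together is omitted.

```python
import random


def recombine(pairs, rng):
    """Replace one seed element in each stored pair with that of another pair."""
    if len(pairs) < 2:
        raise ValueError("at least two stored pairs are needed for replacement")
    recombined = []
    for i, (need, seed) in enumerate(pairs):
        donor = rng.choice([k for k in range(len(pairs)) if k != i])
        key = rng.choice(sorted(seed))          # element to replace
        new_seed = dict(seed)
        new_seed[key] = pairs[donor][1][key]    # same element of the other pair
        recombined.append((dict(need), new_seed))
    return recombined


# Made-up low-match pairs, for illustration only.
stored = [
    ({"purpose": "p1", "content": "c1"}, {"features": "f1", "functions": "g1"}),
    ({"purpose": "p2", "content": "c2"}, {"features": "f2", "functions": "g2"}),
    ({"purpose": "p3", "content": "c3"}, {"features": "f3", "functions": "g3"}),
]
negatives = recombine(stored, random.Random(0))
```

Each recombined pair would then be vectorized again, arranged as matrix data, and paired with the "Bad" teacher data.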
FIG. 6 is a flowchart showing the control procedure of the matching degree calculation process of the second example, which uses the trained model 1210 generated in this way. In this matching degree calculation process, steps S161 and S162 replace step S154 of the matching degree calculation process of the first example. The other steps are the same; the same processing is given the same reference signs, and detailed description is omitted.
When the content of the search target data has been converted into a multidimensional vector in step S153, the control unit 11 combines the multidimensional vector of the input content with the multidimensional vector of the search target data to generate matrix data (step S161). The control unit 11 inputs this matrix data to the trained model 1210 and performs the arithmetic processing of the trained model 1210. The control unit 11 acquires the value of the degree of matching obtained as a result of the processing, that is, output from the trained model 1210 (step S162). The processing of the control unit 11 then proceeds to step S155.
Alternatively, as a third example of the degree of matching, a collaborative filtering technique may be used.
Collaborative filtering defines the correspondence between two parameters (here, needs and seeds). When one parameter is input, the technique finds entries of that parameter whose tendency with respect to the other parameter (here, whether each entry has been selected or output) is similar to that of the input. Based on that tendency, it selects and outputs those entries of the other parameter that have not yet been selected or output for the input parameter.
 図7Aは、この協調フィルタリングに係るニーズとシーズの対応関係について説明する図である。
 この図7Aでは、ニーズとシーズがマトリクス状に組まれており、選択されたことのある対応関係に対して「1」が入力されている。ここでは、ニーズ03とニーズNMに対して選択されるシーズの傾向が似ており(シーズ02、05、06など)、対応関係が近いものとされる。ここでニーズ03が新たに入力されたときに、ニーズNMに対して選択されており、ニーズ03に対して選択されていないシーズ01が出力対象とされる。
FIG. 7A is a diagram illustrating the correspondence between needs and seeds related to this collaborative filtering.
In FIG. 7A, needs and seeds are arranged in a matrix, and "1" is entered for each correspondence that has been selected at least once. Here, the tendencies of the seeds selected for need 03 and need NM are similar (seeds 02, 05, 06, and so on), so the two needs are treated as having a close correspondence. When need 03 is newly input, seed 01, which has been selected for need NM but not for need 03, becomes the output target.
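The FIG. 7A example can be sketched in a few lines. Only the relationship between need 03, need NM, and seeds 01, 02, 05, and 06 comes from the text; the `need01` row, the identifier spellings, and the use of Jaccard overlap to measure how similar two selection tendencies are all are assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical selection history: need -> seeds selected for it (the "1" cells).
selected: Dict[str, Set[str]] = {
    "need01": {"seed03", "seed04"},
    "need03": {"seed02", "seed05", "seed06"},
    "needNM": {"seed01", "seed02", "seed05", "seed06"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two selection sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(need: str, history: Dict[str, Set[str]]) -> List[str]:
    """Return seeds chosen for the most similar need but not yet for `need`."""
    own = history[need]
    others = [n for n in history if n != need]
    nearest = max(others, key=lambda n: jaccard(own, history[n]))
    return sorted(history[nearest] - own)


print(recommend("need03", selected))  # ['seed01']
```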
 本実施形態のマッチングシステム100では、この協調フィルタリングの技術を利用し、ニーズデータ又はシーズデータ(一方)の入力があった場合には、当該入力内容に係る多次元ベクトルと距離の近い多次元ベクトルを有する既知のニーズ又はシーズ(一方)を複数選択し、選択された複数のニーズ又はシーズ(一方)に対応するシーズ又はニーズ(他方)を選択する。すなわち、入力されるニーズ又はシーズ(一方)は、必ずしも既に保持されているものとは同一ではないので、近いものが利用される。また、この協調フィルタリングでは、ある程度選択がなされて類似傾向が定まっていないと適切な出力ができないが、このように入力側の類似関係を定めることで、出力数が少なくても精度よく選択が可能となる。 The matching system 100 of the present embodiment uses this collaborative filtering technique as follows. When needs data or seeds data (one kind) is input, the system selects a plurality of known needs or seeds (of the same kind) whose multidimensional vectors are close in distance to the multidimensional vector of the input content, and then selects the seeds or needs (the other kind) that correspond to the selected needs or seeds. In other words, because the input need or seed is not necessarily identical to one already held, nearby ones are used in its place. Collaborative filtering ordinarily cannot produce appropriate output until enough selections have been made for similarity tendencies to emerge; by determining similarity on the input side in this way, accurate selection is possible even when the number of past outputs is small.
 この場合、対応関係に係る選択は、対応検索制御処理での選択動作に限るものではない。実際の製品の開発、販売などによる対応も選択に含まれてよい。具体的には、上記のようにシーズとニーズとを各々別個のテキストデータから抽出するデータベース生成処理の代わりに、製品開発情報や販売情報などのデータにおけるシーズデータに対応付けられて記載されたニーズデータを取得して、選択された対応関係にあるものとして定めてもよい。 In this case, the selections that define a correspondence are not limited to selection operations in the correspondence search control process. Correspondences established through the development, sale, and so on of actual products may also count as selections. Specifically, instead of the database generation process described above, which extracts seeds and needs from separate text data, needs data described in association with seeds data in sources such as product development information or sales information may be acquired and treated as being in a selected correspondence.
 図7Bは、協調フィルタリングに係るマトリクス生成処理の制御部11による制御手順を示すフローチャートである。
 制御部11は、まず、データベース装置2から記憶されている各シーズ及びニーズの内容(それぞれ1~4要素)を取得して、多次元ベクトルに変換する(ステップS251)。
FIG. 7B is a flowchart showing a control procedure by the control unit 11 for matrix generation processing related to collaborative filtering.
First, the control unit 11 acquires the contents of each seed and need (one to four elements each) stored from the database device 2, and converts them into multidimensional vectors (step S251).
 制御部11は、各シーズ及びニーズを得られた多次元ベクトルとともに二次元マトリクスの各行/列の成分として割り当てる(ステップS252)。制御部11は、シーズ及びニーズの組み合わせをそれぞれ示すセルごとに、選択済みのものに対して「1」を設定する(ステップS253)。なお、例えば、各セルには初期値として「0」が設定されており(初期化されており)、この「1」の設定動作がなされたセルのみが「0」から「1」に変更されればよい。そして、制御部11は、マトリクス生成処理を終了する。 The control unit 11 assigns each seed and each need, together with the multidimensional vector obtained for it, as a row or column of a two-dimensional matrix (step S252). For each cell representing a combination of a seed and a need, the control unit 11 sets "1" if that combination has already been selected (step S253). For example, every cell may be initialized to "0" so that only the cells on which this setting operation is performed change from "0" to "1". The control unit 11 then ends the matrix generation process.
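A minimal sketch of steps S252 and S253 follows, assuming needs are assigned to rows and seeds to columns (the text allows either) and omitting the multidimensional vectors that would be stored alongside each row and column label. The function name and the sample identifiers are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def generate_matrix(
    needs: Sequence[str],
    seeds: Sequence[str],
    selected_pairs: Iterable[Tuple[str, str]],
) -> List[List[int]]:
    """Build the need x seed matrix: 0 by default, 1 for selected pairs."""
    row = {need: i for i, need in enumerate(needs)}  # S252: needs as rows
    col = {seed: j for j, seed in enumerate(seeds)}  # S252: seeds as columns
    matrix = [[0] * len(seeds) for _ in needs]       # initialize every cell to 0
    for need, seed in selected_pairs:                # S253: mark selected cells
        matrix[row[need]][col[seed]] = 1
    return matrix


matrix = generate_matrix(
    needs=["need01", "need02"],
    seeds=["seed01", "seed02", "seed03"],
    selected_pairs=[("need01", "seed02"), ("need02", "seed03")],
)
print(matrix)  # [[0, 1, 0], [0, 0, 1]]
```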
 図8は、この協調フィルタリングの技術を用いた第3の例の適合度合算出処理の制御部11による制御手順を示すフローチャートである。
 この適合度合算出処理は、図4に示した第1の例の適合度合算出処理におけるステップS151のみが残り、ステップS152以降の代わりにステップS171~S175の処理が追加されている。
FIG. 8 is a flow chart showing the control procedure by the control unit 11 of the matching degree calculation process of the third example using this collaborative filtering technique.
In this matching degree calculation process, only step S151 of the first example of the matching degree calculation process shown in FIG. 4 remains, and steps S171 to S175 are added in place of step S152 and the subsequent steps.
 ステップS151の処理に続いて、制御部11は、入力データと、マトリクス設定されている同分類(シーズ又はニーズ)のデータとの距離をそれぞれ算出する(ステップS171)。制御部11は、算出された距離が近い順に基準個数のデータを抽出する(ステップS172)。 Following step S151, the control unit 11 calculates the distance between the input data and each item of data of the same classification (seeds or needs) set in the matrix (step S171). The control unit 11 extracts a reference number of data items in ascending order of the calculated distance, that is, nearest first (step S172).
 制御部11は、抽出された同分類データに対応して選択されたことのある他分類データを選択する(ステップS173)。制御部11は、選択された他分類データについて、それぞれスコアを算出する(ステップS174)。スコアは、距離が大きくなる(類似の度合が小さくなる)ほど小さくなるように、コサイン類似度の絶対値やユークリッド距離の逆数などに基づいて定められてもよい。 The control unit 11 selects the data of the other classification that have previously been selected for the extracted same-classification data (step S173). The control unit 11 calculates a score for each of the selected other-classification data items (step S174). The score may be determined from, for example, the absolute value of the cosine similarity or the reciprocal of the Euclidean distance, so that it becomes smaller as the distance becomes larger (as the degree of similarity becomes smaller).
 制御部11は、各選択データを上記スコアとともに出力する(ステップS175)。そして、制御部11は、適合度合算出処理を終了して処理を対応検索制御処理に戻す。 The control unit 11 outputs each selection data together with the score (step S175). Then, the control unit 11 terminates the matching degree calculation process and returns the process to the correspondence search control process.
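Steps S171 to S175 can be sketched as below. Several details are assumptions not fixed by the text: Euclidean distance is used for step S171, the score is `1 / (1 + distance)` (a reciprocal-style score that avoids division by zero), and when the same item is reached through several neighbours the highest score is kept. All names and sample data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def third_example(
    input_vec: Sequence[float],
    known_vectors: Dict[str, Sequence[float]],  # same classification as the input
    selections: Dict[str, Set[str]],            # same-class item -> other-class items
    k: int,                                     # the "reference number"
) -> List[Tuple[str, float]]:
    # S171: distance from the input to every known item of the same classification
    distances = sorted(
        (euclidean(input_vec, vec), name) for name, vec in known_vectors.items()
    )
    # S172: keep the k nearest items
    nearest = distances[:k]
    # S173 / S174: collect the other-class items selected for them and score each
    scores: Dict[str, float] = {}
    for dist, name in nearest:
        score = 1.0 / (1.0 + dist)  # smaller score for larger distance
        for other in selections.get(name, set()):
            scores[other] = max(scores.get(other, 0.0), score)
    # S175: output each selected item with its score, best first
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


known = {"needA": [0.0, 0.0], "needB": [3.0, 4.0], "needC": [10.0, 0.0]}
history = {"needA": {"seed1"}, "needB": {"seed1", "seed2"}, "needC": {"seed3"}}
print(third_example([0.0, 0.0], known, history, k=2))
```

With `k=2`, `needC` is too far away to be extracted, so `seed3` does not appear in the output.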
 図9は、出力結果の例を示す図である。
 ここでは、技術(シーズ)Dに対して予め保持されている要求(ニーズ)のうち一部の適合度合(適合性)をパーセントにより一覧表示した棒グラフを示している。
FIG. 9 is a diagram showing an example of output results.
Here, a bar graph is shown that lists, as percentages, the degrees of matching (suitability) of some of the requirements (needs) held in advance with respect to technology (seed) D.
 このように図示を行う画像データを出力することで、端末装置3(マッチングシステム100)のユーザーに対してより直感的に分かりやすい適合度合の結果を示すことができる。このとき、ユーザーの操作受付部34への入力操作などに応じて一覧表示させる基準値(適合度合の下限値など)を変更設定することで、ユーザーが出力結果に要求する精度などを調整することができてもよい。基準値を高く定めることで、精度(確度)の高い情報のみに絞り込まれて表示が行われ、ユーザーが精度(確度)よりも参考となる情報量を望む場合には、基準値を低く定めることで、雑多な情報が多数表示される。 Outputting image data that presents the results graphically in this way shows the user of the terminal device 3 (matching system 100) the degree of matching in a more intuitive and easily understood form. The reference value for listing (for example, the lower limit of the degree of matching) may be changed in response to the user's input to the operation reception unit 34, so that the user can adjust the precision required of the output. Setting a high reference value narrows the display to high-precision (high-certainty) information only; if the user wants a larger amount of reference information rather than precision, setting a low reference value displays a large number of more varied results.
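The effect of the reference value can be illustrated with a small filter. This is only a sketch: the function name, the percentage scale, and the sample results are hypothetical, and the lower limit is assumed to be inclusive.

```python
from typing import List, Tuple


def filter_results(
    results: List[Tuple[str, float]], lower_limit: float
) -> List[Tuple[str, float]]:
    """Keep results at or above the lower limit, highest degree of matching first."""
    kept = [(name, percent) for name, percent in results if percent >= lower_limit]
    return sorted(kept, key=lambda item: item[1], reverse=True)


results = [("need A", 82.0), ("need B", 41.5), ("need C", 67.0), ("need D", 12.0)]
print(filter_results(results, 60.0))       # [('need A', 82.0), ('need C', 67.0)]
print(len(filter_results(results, 10.0)))  # 4
```

A high limit leaves only the strongest matches, while a low limit lists nearly everything.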
 以上のように、本実施形態のマッチングシステム100は、情報処理装置1の制御部11を備える。制御部11は、取得部として、ニーズに係る第1データと、シーズに係る第2データを取得し、解析部として、取得された第1データ及び第2データをそれぞれ数値化して解析し、出力部として、この解析結果を用いて、第1データと第2データの適合度合を示すマッチング情報を出力する。
 このように、シーズとニーズとを数値化して適合度合を定量的に判断することができるので、このマッチングシステム100では、容易かつより客観的に対象となるシーズ及びニーズのうち一方に適合すると考えられる他方についての情報を得ることが可能になる。よって、このマッチングシステム100によれば、シーズとニーズの適合をより容易かつ客観的に定量評価することができる。
As described above, the matching system 100 of the present embodiment includes the control unit 11 of the information processing device 1. The control unit 11, as an acquisition unit, acquires first data related to needs and second data related to seeds; as an analysis unit, quantifies and analyzes each of the acquired first data and second data; and, as an output unit, uses the analysis result to output matching information indicating the degree of matching between the first data and the second data.
Because seeds and needs are quantified in this way and the degree of matching can be judged quantitatively, the matching system 100 makes it possible to obtain, easily and more objectively, information about which of the other kind is considered to match a given seed or need. The matching system 100 can therefore evaluate the match between seeds and needs quantitatively, more easily and more objectively.
 また、制御部11は、解析部として、シーズ及びニーズの数値化において第1データ及び第2データをそれぞれ多次元空間を表すベクトルに変換する。シーズやニーズを表す自然言語表現は、簡潔にまとめたとしても依然多くの情報量を含むので、多次元空間でその意味を表すことで、マッチングシステム100は、表現可能な幅を容易に広げてより正確に意味に対応した数値を得ることができる。 In addition, the control unit 11, as the analysis unit, converts the first data and the second data each into a vector representing a multidimensional space when quantifying seeds and needs. A natural language expression of a seed or need still carries a large amount of information even when summarized concisely, so by representing its meaning in a multidimensional space the matching system 100 can easily widen the range of what can be expressed and obtain numerical values that correspond more accurately to the meaning.
 また、制御部11は、解析部として、第1データから得られた第1ベクトルと、第2データから得られた第2ベクトルとの類似度(例えば、コサイン類似度)に基づいて適合度合を算出する。このような処理によれば、得られた第1ベクトル及び第2ベクトルから容易な演算で数値的な類似度を得ることができるので、マッチングシステム100は、簡便な処理で客観的かつ正確な評価を得ることができる。 In addition, the control unit 11, as the analysis unit, calculates the degree of matching based on the similarity (for example, cosine similarity) between the first vector obtained from the first data and the second vector obtained from the second data. With this processing, a numerical similarity can be obtained from the first vector and the second vector by a simple operation, so the matching system 100 can obtain an objective and accurate evaluation with simple processing.
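Cosine similarity, named above as one possible similarity measure, is the dot product of the two vectors divided by the product of their lengths. A minimal sketch follows; returning 0.0 for a zero-length vector is a convention chosen here, not something the text specifies.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between vectors a and b (range -1.0 to 1.0)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_a * norm_b)


first_vector = [0.2, 0.7, 0.1]   # hypothetical vector for a need
second_vector = [0.1, 0.8, 0.3]  # hypothetical vector for a seed
print(cosine_similarity(first_vector, second_vector))
```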
 また、解析部としての制御部11は、第1ベクトルと第2ベクトルとを入力に用いて、適合度合を出力する学習済モデル1210を利用する。このように、機械学習モデルを適切に学習させて、第1ベクトル及び第2ベクトルの適合度合を出力可能とすることで、マッチングシステム100は、客観的かつ定量的な評価を得ることができる。 In addition, the control unit 11, as the analysis unit, uses the trained model 1210, which takes the first vector and the second vector as inputs and outputs the degree of matching. By appropriately training a machine learning model so that it can output the degree of matching between the first vector and the second vector, the matching system 100 can obtain an objective and quantitative evaluation.
 また、この学習済モデル1210は、パターン認識アルゴリズムによるものである。多次元ベクトルに変換した各方向成分の値を配列したマトリクスパターンに基づいて、この学習済モデル1210を利用してパターン認識処理を行うことで、マッチングシステム100は、多次元ベクトルの全体的な類似の度合について適切に定量評価することができる。 The trained model 1210 is based on a pattern recognition algorithm. By using the trained model 1210 to perform pattern recognition on a matrix pattern in which the values of the directional components of the multidimensional vectors are arranged, the matching system 100 can appropriately and quantitatively evaluate the overall degree of similarity between the multidimensional vectors.
 また、端末装置3は、入力データを取得する入力部としての操作受付部34を備える。
 制御部11は、取得部はとして入力データから第1データ又は前第2データを取得する。このように、ニーズ又はシーズのうち一方を指定して入力することにより、他方との適合度合が、データベース装置2に保持されているデータの多く(ここでは総当たり)に対してそれぞれ算出されるので、マッチングシステム100では、ユーザーが1つのニーズ又はシーズに係る情報に対して適合する他方の情報を取得したい場合に容易かつ適切に所望の情報を得ることができる。
The terminal device 3 also includes an operation reception unit 34 as an input unit that acquires input data.
The control unit 11, as the acquisition unit, acquires the first data or the second data from the input data. When one of a need or a seed is specified and input in this way, its degree of matching with the other kind is calculated for many of the data items held in the database device 2 (here, exhaustively for all of them). The matching system 100 therefore lets a user who wants information of the other kind that matches a given need or seed obtain the desired information easily and appropriately.
 また、第1データは、ニーズに係る予め定められた要素を少なくとも一つ含み、第2データは、シーズに係る予め定められた要素を少なくとも一つ含む。このように、抽出項目を共通化し、かつ第1データと第2データとで対応しやすくすることで、マッチングシステム100では、より適切に適合度合を算出してマッチング情報の精度を向上させることができる。 The first data includes at least one predetermined element related to needs, and the second data includes at least one predetermined element related to seeds. By standardizing the extracted items in this way and making it easy for the first data and the second data to correspond, the matching system 100 can calculate the degree of matching more appropriately and improve the accuracy of the matching information.
 また、第1データの要素は、ニーズに係る目的、内容、分野及び名称のうち少なくともいずれかを含み、第2データの要素は、シーズに係る特徴、機能、競合技術及び名称のうち少なくともいずれかを含む。このように、ニーズやシーズを簡潔かつ端的に表す情報を含むデータを入力データとすることで、ノイズを少なくしつつ、より正確にこれらのデータを数値化することができる。したがって、このマッチングシステム100では、出力される適合度合の結果を含むマッチング情報の精度も向上させることができる。 The elements of the first data include at least one of the purpose, content, field, and name of the need, and the elements of the second data include at least one of the features, functions, competing technologies, and name of the seed. Using data that expresses needs and seeds concisely and directly as input data in this way reduces noise and allows these data to be quantified more accurately. The matching system 100 can therefore also improve the accuracy of the matching information, including the output degree-of-matching results.
 また、制御部11は、取得部として、操作受付部34により受け付けられた入力データを取得すると、この入力データから上記エレベーターピッチ構文に係る8要素のうち少なくともいずれかを抽出する。
 このように、入力される文書データからの要素の抽出が自動で行われてもよい。これにより、事前に手作業で入力情報を整理して生成する必要がなくなり、手間が軽減される。
Further, when the control unit 11 acquires the input data accepted by the operation accepting unit 34 as an acquisition unit, the control unit 11 extracts at least one of the eight elements related to the elevator pitch syntax from the input data.
In this way, elements may be automatically extracted from input document data. This eliminates the need to organize and generate input information manually in advance, thereby reducing labor.
 また、第1データ及び第2データは、それぞれ機能を示す名詞、又は動詞を少なくとも含む。特に技術内容に係るニーズやシーズでは、どのような動作がなされる又は要求されているのかが適切に表現された自然言語表現を含む入力データを得ることで、精度の高い適合度合の算出を行うことができる。 The first data and the second data each include at least a noun or a verb indicating a function. Particularly for needs and seeds concerning technical content, obtaining input data that includes a natural language expression appropriately stating what operation is performed or required allows the degree of matching to be calculated with high accuracy.
 また、第1データ及び第2データは、それぞれ機能を示す名詞又は動詞に対する目的語を含む。上記に加えて動作の対象も第1データ及び第2データに含まれることで、簡潔な表現でより正確に動作内容又は要求内容を表現する入力データが得られ、これにより数値化されたデータや当該数値化データに基づいて得られた適合度合の結果の精度も向上させることができる。 Also, the first data and the second data each include an object for a noun or a verb indicating a function. In addition to the above, since the target of the operation is also included in the first data and the second data, it is possible to obtain input data that more accurately expresses the operation content or the request content in a concise manner. It is also possible to improve the accuracy of the fitness results obtained based on the quantified data.
 また、本実施形態のマッチング方法は、ニーズに係る第1データと、シーズに係る第2データを取得する取得ステップ、取得された第1データ及び第2データをそれぞれ数値化して解析する解析ステップ、解析ステップにおける解析結果を用いて、第1データと第2データの適合度合を示すマッチング情報を出力する出力ステップ、を含む。このようなマッチング方法では、シーズとニーズとを数値化して適合度合を定量的に判断することができるので、容易かつより客観的に対象となるシーズ及びニーズのうち一方に適合すると考えられる他方についての情報を得ることが可能になる。よって、このマッチング方法によれば、シーズとニーズの適合をより容易かつ客観的に定量評価することができる。 The matching method of the present embodiment includes an acquisition step of acquiring first data related to needs and second data related to seeds, an analysis step of quantifying and analyzing each of the acquired first data and second data, and an output step of outputting matching information indicating the degree of matching between the first data and the second data using the analysis result of the analysis step. Because this matching method quantifies seeds and needs and allows the degree of matching to be judged quantitatively, it makes it possible to obtain, easily and more objectively, information about which of the other kind is considered to match a given seed or need. This matching method can therefore evaluate the match between seeds and needs quantitatively, more easily and more objectively.
 また、上記マッチング方法に係るプログラム121をコンピューター(情報処理装置1など)にインストールして実行することで、特別な構成を必要とせずに容易に汎用の装置でシーズとニーズの適合をより容易かつ客観的に定量評価することができる。 By installing the program 121 for the above matching method on a computer (such as the information processing device 1) and executing it, the match between seeds and needs can be evaluated quantitatively, more easily and more objectively, on a general-purpose device without any special configuration.
 また、本実施形態の学習済モデル1210は、ニーズに係る第1データから得られた第1ベクトルと、シーズに係る第2データから得られた第2ベクトルとを入力に用いて、これら第1データと第2データとの適合度合を出力する。このような学習済モデル1210により、シーズデータとニーズデータとの全体的な適合度合をより適切に客観的な値で得ることができる。 The trained model 1210 of the present embodiment takes as inputs the first vector obtained from the first data related to needs and the second vector obtained from the second data related to seeds, and outputs the degree of matching between the first data and the second data. With such a trained model 1210, the overall degree of matching between seeds data and needs data can be obtained more appropriately as an objective value.
 なお、本発明は、上記実施の形態に限られるものではなく、様々な変更が可能である。
 例えば、上記実施の形態では、ニーズ又はシーズのデータ(一方のデータ)を一つ入力し、これに対して保持されている複数の他方のデータに対する適合度合を算出して、適合度合の大きいものを抽出するものとして説明したが、ニーズに係るデータ及びシーズに係るデータをそれぞれ多数ずつ有している場合に、総当たりで組み合わせて、実施漏れなどの検出を行うなどされてもよい。
It should be noted that the present invention is not limited to the above embodiments, and various modifications are possible.
For example, the above embodiment was described as inputting a single item of needs or seeds data (data of one kind), calculating its degree of matching with each of a plurality of held data items of the other kind, and extracting those with a high degree of matching. However, when many items of needs data and many items of seeds data are both available, they may instead be combined exhaustively, for example to detect omissions in implementation.
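One way to read the exhaustive variation is sketched below: every need is paired with every seed, and pairs that score highly but are not recorded as implemented are reported. Using cosine similarity as the degree of matching, the `implemented` set, and the threshold are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_omissions(
    need_vectors: Dict[str, Sequence[float]],
    seed_vectors: Dict[str, Sequence[float]],
    implemented: Set[Tuple[str, str]],  # (need, seed) pairs already acted on
    threshold: float,
) -> List[Tuple[str, str, float]]:
    """Score all need/seed pairs and return strong matches not yet implemented."""
    hits = []
    for need, need_vec in need_vectors.items():
        for seed, seed_vec in seed_vectors.items():
            if (need, seed) in implemented:
                continue
            score = cosine_similarity(need_vec, seed_vec)
            if score >= threshold:
                hits.append((need, seed, score))
    return sorted(hits, key=lambda hit: -hit[2])


needs = {"n1": [1.0, 0.0], "n2": [0.0, 1.0]}
seeds = {"s1": [1.0, 0.1], "s2": [0.0, 1.0]}
print(find_omissions(needs, seeds, implemented={("n2", "s2")}, threshold=0.9))
```

The cost grows with the product of the number of needs and the number of seeds, so this suits batch runs rather than interactive queries.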
 また、上記実施の形態では、データベース装置2の記憶部21に保持されているニーズ/シーズデータ211も、適合度合算出処理で取得されてから多次元ベクトルに変換されるものとして説明したが、ニーズ/シーズデータ211内で予め多次元ベクトルが保持されていてもよい。この場合、自然言語データを多次元ベクトルに変換するための学習済モデルが更新された場合には、随時ニーズ/シーズデータ211内の多次元ベクトルデータも更新されればよい。 In the above embodiment, the needs/seeds data 211 held in the storage unit 21 of the database device 2 was described as being acquired in the matching degree calculation process and then converted into multidimensional vectors, but the multidimensional vectors may instead be held in advance in the needs/seeds data 211. In that case, whenever the trained model for converting natural language data into multidimensional vectors is updated, the multidimensional vector data in the needs/seeds data 211 may be updated accordingly.
 また、上記実施の形態の第3の例において、協調フィルタリングを用いる場合に、出力の有無によって2値設定のみが行われることとして説明したが、これに限られない。出力頻度や、出力後の実際の実施状況、出力結果の有無にかかわらず組合せに応じてアイデアの開発、頒布などを行ったユーザーの所感(「いいね」など)などに応じて加算又は重み付けなどを行った多値設定を行ってもよい。設定される値は整数に限らず、正負を問わない任意の実数であってもよい。これらを考慮したマトリクス生成処理が随時行われてマトリクスデータが更新されてもよい。また、この協調フィルタリングでは、入力データと同一種別の保持データとの多次元ベクトルの類似の度合を考慮するものとして説明したが、必ずしもこの類似の度合を考慮しなければならないものではなく、また、出力設定の増加などに応じて不要な場合には考慮しないように切り替えるなどとしてもよい。 In the third example of the above embodiment, collaborative filtering was described as using only a binary setting based on whether or not an output occurred, but this is not a limitation. A multi-valued setting may be used in which values are added or weighted according to, for example, the output frequency, the actual implementation status after output, or the impressions (such as a "like") of users who developed or distributed ideas based on a combination, regardless of whether that combination was output. The values set are not limited to integers and may be any real numbers, positive or negative. The matrix generation process may be run as needed with these factors taken into account to update the matrix data. Also, although this collaborative filtering was described as taking into account the degree of similarity between the multidimensional vectors of the input data and of held data of the same type, this similarity need not always be considered; for example, the system may switch to ignoring it once it becomes unnecessary as the number of recorded outputs grows.
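The multi-valued setting can be sketched as an accumulation of weighted events per need/seed cell. The event names and weight values below are purely hypothetical; the text only says that values may be added or weighted and may be any real number, positive or negative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical weights per event type (not taken from the source).
EVENT_WEIGHTS: Dict[str, float] = {
    "output": 1.0,       # the pair was output by the system
    "implemented": 3.0,  # the pair was actually put into practice
    "like": 0.5,         # positive user impression
    "dislike": -0.5,     # negative user impression
}


def build_weighted_matrix(
    events: Iterable[Tuple[str, str, str]]
) -> Dict[Tuple[str, str], float]:
    """Sum event weights per (need, seed) cell; unseen cells are implicitly 0."""
    matrix: Dict[Tuple[str, str], float] = defaultdict(float)
    for need, seed, kind in events:
        matrix[(need, seed)] += EVENT_WEIGHTS[kind]
    return dict(matrix)


events = [
    ("n1", "s1", "output"),
    ("n1", "s1", "like"),
    ("n1", "s2", "dislike"),
    ("n2", "s1", "implemented"),
]
print(build_weighted_matrix(events))
```

Rerunning this over the full event log whenever new events arrive corresponds to updating the matrix data as needed.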
 また、上記実施の形態では、数値化を多次元ベクトル化であるものとして説明したが、スカラー値で表現可能な範囲の情報についてはスカラー値のみで表現してもよい。 Also, in the above embodiment, the digitization is explained as multi-dimensional vectorization, but the range of information that can be expressed with scalar values may be expressed only with scalar values.
 また、上記実施の形態では、第1ベクトルの成分配列と第2ベクトルの成分配列の全体を比較する第2の例として、画像認識の技術に基づいて適合度合を求めることとしたが、他の手法、例えば、第1ベクトルと第2ベクトルのそれぞれについての一次元配列をそれぞれ波形近似することなどによって、当該波形の類似性などに基づいて適合度合を求めることとしてもよい。 In the above embodiment, the second example, which compares the component array of the first vector with the component array of the second vector as a whole, obtains the degree of matching using image recognition technology. Other methods may be used instead; for example, the one-dimensional arrays of the first vector and the second vector may each be approximated by a waveform, and the degree of matching may be obtained from the similarity of the waveforms.
 また、機械学習モデルを利用する場合に、上記のようにパターン認識に主に用いられる手法を用いる代わりに、上記成分配列に対してランダムフォレストや勾配ブースティングなどの決定木を用いた手法が機械学習モデルに利用されてもよい。 When a machine learning model is used, instead of a method mainly used for pattern recognition as described above, a decision-tree-based method such as random forest or gradient boosting may be applied to the above component arrays as the machine learning model.
 また、上記実施の形態では、出力データとしてグラフなどを含む画像データを生成して出力するものとして説明したが、これに限られない。単純にテキストデータなどが出力されてもよいし、所定の専用出力フォーマットが構造化言語で定められて出力されたり、表計算ソフトウェアの標準データ形式などに従ったデータが出力されたりしてもよい。また、出力は端末装置3に返送されるのではなく、プリンターなどに送信されて画像形成出力されてもよい。 Also, in the above embodiment, it is assumed that image data including graphs and the like is generated and output as output data, but the present invention is not limited to this. Text data or the like may simply be output, a predetermined dedicated output format may be defined in a structured language, and data may be output according to the standard data format of spreadsheet software. . Also, the output may be sent to a printer or the like instead of being sent back to the terminal device 3 to form an image.
 また、上記実施の形態では、エレベーターピッチ構文の構成に従ってニーズ及びシーズについてそれぞれ4要素ずつを抽出して数値化に利用するものとして説明したが、これに限るものではない。入力データは、他の基準で簡潔に整理されてもよいし、特に整理しなおさずに必要な文や文節などを単位として自然言語表現でのデータから抽出されるのみであってもよい。 Also, in the above embodiment, it was explained that four elements each of needs and seeds are extracted according to the structure of the elevator pitch syntax and used for quantification, but it is not limited to this. The input data may be concisely arranged according to other criteria, or may be simply extracted from the data expressed in natural language in units of necessary sentences and clauses without rearrangement.
 また、上記実施の形態では、機能を示す名詞又は動詞と、目的語との組み合わせ、及び必要に応じて修飾語を伴う形に整理してから数値化されるものとして説明したが、これに限られない。形容詞などが含まれてもよい。 In the above embodiment, the data was described as being quantified after being arranged into a combination of a noun or verb indicating a function and its object, with modifiers where necessary, but this is not a limitation. Adjectives and the like may also be included.
 また、上記実施の形態では、情報処理装置1、データベース装置2及び端末装置3を別個の構成として説明したが、単一のコンピューター(マッチング装置)により処理が全てなされるのであってもよい。一方で、対応検索制御処理が複数のサーバー装置の制御部により分散処理されてもよい。また、端末装置3は、特定の一台に限られず、複数台存在してもよい。 Also, in the above embodiment, the information processing device 1, the database device 2, and the terminal device 3 are described as separate configurations, but all processing may be performed by a single computer (matching device). On the other hand, the corresponding search control process may be distributed by the controllers of a plurality of server devices. Moreover, the terminal device 3 is not limited to one specific device, and a plurality of devices may exist.
 また、以上の説明では、本発明の適合度合の算出などの制御に係るプログラム121を記憶するコンピューター読み取り可能な媒体としてHDD、フラッシュメモリーなどの不揮発性メモリーなどからなる記憶部12を例に挙げて説明したが、これらに限定されない。その他のコンピューター読み取り可能な媒体として、MRAMなどの他の不揮発性メモリーや、CD-ROM、DVDディスクなどの可搬型記録媒体を適用することが可能である。また、本発明に係るプログラムのデータを通信回線を介して提供する媒体として、キャリアウェーブ(搬送波)も本発明に適用される。
 その他、上記実施の形態で示した具体的な構成、処理動作の内容及び手順などは、本発明の趣旨を逸脱しない範囲において適宜変更可能である。本発明の範囲は、特許請求の範囲に記載した発明の範囲とその均等の範囲を含む。
In the above description, the storage unit 12, composed of an HDD or a non-volatile memory such as flash memory, was given as an example of a computer-readable medium storing the program 121 for control such as calculation of the degree of matching according to the present invention, but the medium is not limited to these. Other non-volatile memories such as MRAM and portable recording media such as CD-ROMs and DVDs may be used as the computer-readable medium. A carrier wave is also applicable to the present invention as a medium for providing the program data according to the present invention via a communication line.
In addition, the specific configurations, contents and procedures of processing operations, etc. shown in the above embodiments can be changed as appropriate without departing from the scope of the present invention. The scope of the present invention includes the scope of the invention described in the claims and the scope of equivalents thereof.
 この発明は、マッチングシステム、マッチング方法、プログラム及び学習済モデルに利用することができる。 This invention can be used for matching systems, matching methods, programs, and trained models.
1 情報処理装置
11 制御部
12 記憶部
121 プログラム
1210 学習済モデル
13 通信部
2 データベース装置
21 記憶部
211 ニーズ/シーズデータ
3 端末装置
31 制御部
32 通信部
33 表示部
34 操作受付部
100 マッチングシステム
N ネットワーク
1 information processing device
11 control unit
12 storage unit
121 program
1210 trained model
13 communication unit
2 database device
21 storage unit
211 needs/seeds data
3 terminal device
31 control unit
32 communication unit
33 display unit
34 operation reception unit
100 matching system
N network

Claims (14)

  1.  ニーズに係る第1データと、シーズに係る第2データを取得する取得部と、
     取得された前記第1データ及び前記第2データをそれぞれ数値化して解析する解析部と、
     前記解析部による解析結果を用いて、前記第1データと前記第2データの適合度合を示すマッチング情報を出力する出力部と、
     を備えるマッチングシステム。
    1. A matching system comprising:
    an acquisition unit that acquires first data related to needs and second data related to seeds;
    an analysis unit that quantifies and analyzes each of the acquired first data and second data; and
    an output unit that outputs matching information indicating the degree of matching between the first data and the second data, using the analysis result of the analysis unit.
  2.  前記解析部は、前記数値化において前記第1データ及び前記第2データをそれぞれ多次元空間を表すベクトルに変換する請求項1記載のマッチングシステム。 The matching system according to claim 1, wherein the analysis unit converts the first data and the second data into vectors representing a multidimensional space in the quantification.
  3.  前記解析部は、前記第1データから得られた第1ベクトルと、前記第2データから得られた第2ベクトルとの類似度に基づいて前記適合度合を算出する請求項2記載のマッチングシステム。 3. The matching system according to claim 2, wherein the analysis unit calculates the degree of matching based on the degree of similarity between the first vector obtained from the first data and the second vector obtained from the second data.
  4.  前記解析部は、前記第1データから得られた第1ベクトルと、前記第2データから得られた第2ベクトルとを入力に用いて、前記適合度合を出力する学習済モデルを有する請求項2記載のマッチングシステム。 4. The matching system according to claim 2, wherein the analysis unit has a trained model that takes as inputs a first vector obtained from the first data and a second vector obtained from the second data and outputs the degree of matching.
  5.  前記学習済モデルは、パターン認識アルゴリズムによるものである請求項4記載のマッチングシステム。  The matching system according to claim 4, wherein the trained model is based on a pattern recognition algorithm.
  6.  入力データを取得する入力部を備え、
     前記取得部は、前記入力データから前記第1データ又は前記第2データを取得する
     請求項1~5のいずれか一項に記載のマッチングシステム。
    6. The matching system according to any one of claims 1 to 5, further comprising an input unit that acquires input data,
    wherein the acquisition unit acquires the first data or the second data from the input data.
  7.  前記第1データは、ニーズに係る予め定められた要素を少なくとも一つ含み、
     前記第2データは、シーズに係る予め定められた要素を少なくとも一つ含む
     請求項1~6のいずれか一項に記載のマッチングシステム。
    7. The matching system according to any one of claims 1 to 6, wherein the first data includes at least one predetermined element related to needs, and
    the second data includes at least one predetermined element related to seeds.
  8.  前記第1データの前記要素には、ニーズに係る目的、内容、分野及び名称のうち少なくともいずれかが含まれ、
     前記第2データの前記要素には、シーズに係る特徴、機能、競合技術及び名称のうち少なくともいずれかが含まれる
     請求項7記載のマッチングシステム。
    8. The matching system according to claim 7, wherein the elements of the first data include at least one of the purpose, content, field, and name of the need, and
    the elements of the second data include at least one of the features, functions, competing technologies, and name of the seed.
  9.  入力データを取得する入力部を備え、
     前記取得部は、前記入力データから前記第1データ又は前記第2データを取得する際に、前記要素を抽出する請求項7又は8記載のマッチングシステム。
    9. The matching system according to claim 7 or 8, further comprising an input unit that acquires input data,
    wherein the acquisition unit extracts the elements when acquiring the first data or the second data from the input data.
  10.  前記第1データ及び前記第2データは、それぞれ機能を示す名詞、又は動詞を少なくとも含む請求項1~9のいずれか一項に記載のマッチングシステム。  The matching system according to any one of claims 1 to 9, wherein the first data and the second data each include at least a noun or a verb indicating a function.
  11.  前記第1データ及び前記第2データは、それぞれ前記機能を示す名詞又は動詞に対する目的語を含む請求項10記載のマッチングシステム。 11. The matching system according to claim 10, wherein said first data and said second data each include an object for a noun or verb indicating said function.
  12.  コンピューターの制御部により行われるマッチング方法であって、
     ニーズに係る第1データと、シーズに係る第2データを取得する取得ステップ、
     取得された前記第1データ及び前記第2データをそれぞれ数値化して解析する解析ステップ、
     前記解析ステップにおける解析結果を用いて、前記第1データと前記第2データの適合度合を示すマッチング情報を出力する出力ステップ、
     を含むマッチング方法。
    12. A matching method performed by a control unit of a computer, the method comprising:
    an acquisition step of acquiring first data related to needs and second data related to seeds;
    an analysis step of quantifying and analyzing each of the acquired first data and second data; and
    an output step of outputting matching information indicating the degree of matching between the first data and the second data, using the analysis result of the analysis step.
  13.  コンピューターを、
     ニーズに係る第1データと、シーズに係る第2データを取得する取得手段、
     取得された前記第1データ及び前記第2データをそれぞれ数値化して解析する解析手段、
     前記解析手段による解析結果を用いて、前記第1データと前記第2データの適合度合を示すマッチング情報を出力する出力手段、
     として機能させるプログラム。
    13. A program that causes a computer to function as:
    acquisition means for acquiring first data related to needs and second data related to seeds;
    analysis means for quantifying and analyzing each of the acquired first data and second data; and
    output means for outputting matching information indicating the degree of matching between the first data and the second data, using the analysis result of the analysis means.
  14.  ニーズに係る第1データから得られた第1ベクトルと、シーズに係る第2データから得られた第2ベクトルとを入力に用いて、前記第1データと前記第2データとの適合度合を出力する学習済モデル。 14. A trained model that takes as inputs a first vector obtained from first data related to needs and a second vector obtained from second data related to seeds, and outputs the degree of matching between the first data and the second data.
PCT/JP2022/038679 2021-10-26 2022-10-18 Matching system, matching method, program, and trained model WO2023074457A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-174687 2021-10-26
JP2021174687 2021-10-26

Publications (1)

Publication Number Publication Date
WO2023074457A1 true WO2023074457A1 (en) 2023-05-04

Family

ID=86157701

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/038679 WO2023074457A1 (en) 2021-10-26 2022-10-18 Matching system, matching method, program, and trained model

Country Status (1)

Country Link
WO (1) WO2023074457A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019013344A1 (en) * 2017-07-14 2019-01-17 株式会社マスターリンク Information processing device
JP2019211846A (en) * 2018-05-31 2019-12-12 リンカーズ株式会社 Technical information providing system
JP2021026413A (en) * 2019-08-01 2021-02-22 株式会社大和総研 Matching system and program
JP2021157363A (en) * 2020-03-26 2021-10-07 株式会社野村総合研究所 Needs matching device and program

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22886785; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2023556340; Country of ref document: JP)