HK1219552B - Product code analysis system and product code analysis method - Google Patents
- Publication number
- HK1219552B (application HK16107603.6A)
- Authority
- HK
- Hong Kong
- Prior art keywords
- dictionary
- product
- keyword
- product name
- classification
Description
Technical Field
The present invention relates to a product code analysis system and a product code analysis program that analyze an analysis target database in which hierarchically classified product names are stored as records, and that aggregate the records according to the hierarchical structure.
Background
Retail companies such as supermarkets need to understand the needs of increasingly diverse customers in order to develop their business. To that end, they obtain marketing data, for example by investigating which commodities are selling across the whole market, and analyze the sales trends of all commodities in the market.
One known technique for analyzing such sales trends is disclosed in patent document 1. Patent document 1 discloses a system that quickly and easily analyzes market trends from the stock status of a commodity across the entire market, based on the sales quantity data and stock quantity data of the commodity obtained from the POS (point of sale) terminals of retail dealers.
Documents of the prior art
Patent document
Patent document 1: Japanese Patent Laid-Open Publication No. 2005-8341
Disclosure of Invention
Problems to be solved by the invention
However, each store (business) manages its products individually: the product information of each store is classified into store-specific product types (categories), or is managed as product master information in which a store-specific product code is assigned to each product. Therefore, if the product master information of each store is simply collected and accumulated in a database, even identical products end up classified into different categories, and sales trends cannot be analyzed accurately.
In addition, a store may include product-related information, such as the place of production or the quantity, in the product master information. As a result, even the same product may be registered as different products, once under a product name that includes the related information and once under a product name that does not. Reclassifying the product master information of every store and changing the product names, however, is very laborious.
The present invention has been made to solve the above-described problems, and an object of the present invention is to provide a product code analysis system and a product code analysis program that can easily classify product information registered by stores under differing classifications or product names into uniform categories, and can unify the product information by changing each product name to an appropriate product name.
Means for solving the problems
In order to solve the above problem, a product code analysis system according to the present invention analyzes an analysis target database in which hierarchically classified product names are stored as records, and aggregates the records according to the hierarchical structure. The product code analysis system includes:
an input interface that inputs the analysis target database with the hierarchical structure maintained;
a classification dictionary that stores keywords of the classification names in each hierarchy constituting the hierarchical structure, in association with the unit columns that serve as storage destinations of the product names;
a product name dictionary that stores, for each unit column classified by the hierarchical structure, keywords of the product names belonging to that unit column;
a provisional classification execution unit that, for each record in the analysis target database input from the input interface, provisionally classifies and registers the product name of the record according to the occurrence rate of classification-name keywords in the classification dictionary;
a product name registration unit that, for each record in the analysis target database, registers the product name of the record in a unit column according to the occurrence rate of product-name keywords in the product name dictionary, based on the provisional classification registration made by the provisional classification execution unit; and
a dictionary search execution unit that specifies the application order of the dictionaries, the application order of the keywords, and the combinations of keywords used when the provisional classification execution unit and the product name registration unit calculate the occurrence rate of keywords.
In the present invention, each input record is provisionally classified and registered in the unit column that serves as its storage destination, according to the occurrence rate of classification-name keywords in the classification dictionary. The provisionally registered product name is then changed to a uniform keyword according to the occurrence rate of product-name keywords in the product name dictionary. Records registered by stores under differing classifications or product names can thus easily be classified into uniform unit columns and given appropriate product names, which unifies the product information.
In particular, in the present invention, the dictionary search execution unit defines the application order of the dictionaries, the application order of the keywords, and the combinations of keywords used when the provisional classification execution unit and the product name registration unit calculate the occurrence rate of keywords. Here, the application order of the keywords means, for example, assigning priorities to the product keywords within a category and searching from the keyword with the highest priority, or searching from the keyword with the longest character string. A combination of keywords means a combination of 2 or more keywords needed to identify a product name, such as the product name together with its product form, manufacturer, or term limit information. Search methods based on such combinations include searching for all of the specified keywords, searching for keywords that include a specified keyword, and searching for a single keyword formed by concatenating a plurality of keywords.
As described above, according to the present invention, the application order of the dictionaries, the application order of the keywords, and the combinations of keywords are specified. Processing can therefore follow a keyword order or keyword combination suited to the number or combination of characters constituting a category or product name, even for products belonging to different unit columns, and the records of each store can be stored in the appropriate unit columns.
In the above invention, the product code analysis system further includes: an annotation dictionary that stores, in each unit column classified by the hierarchical structure, information associated with the product names registered in the product name dictionary; and a comment registration unit that, for each record in the analysis target database, registers information associated with the product name of the record in the unit column to which the product belongs, according to the occurrence rate of keywords in the annotation dictionary. The dictionary search execution unit specifies the application order of the dictionaries, the application order of the keywords, and the combinations of keywords used when the comment registration unit calculates the occurrence rate of keywords.
Here, the information associated with the product name includes, for example, the place of origin, the quantity, the manufacturer, and the number of pieces packed. In this case, the annotation dictionary is consulted for information other than the product name, and that information is registered in the unit column according to the occurrence rate of its keywords, so that additional information other than the product name can be registered in association with the classification of the product.
In this case, the dictionary search execution unit specifies the application order of the dictionaries, the application order of the keywords, and the combinations of keywords used when the comment registration unit calculates the occurrence rate of keywords. Therefore, even when the pieces of product-related information belong to different items, each piece can be stored in the appropriate item by specifying a keyword order or keyword combination suited to the number or combination of characters in that information.
In the present invention, the product name registration unit has the following collation function: it executes both a provisional classification mode, in which the dictionary search for the product name is based on the provisional classification registration performed by the provisional classification execution unit, and a collation mode, in which the dictionary search covers all classifications regardless of the result of the provisional classification registration, and it issues a notification when the results of the two modes differ.
In the above invention, the product code analysis system further includes a learning function unit that reflects the dictionary search results of the two modes in the corresponding dictionaries, based on the result of the collation function.
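The two-mode check can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the dictionary layout (a mapping from unit column to a set of product-name keywords), the function names `search_in_columns` and `verify_registration`, and the sample keywords are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical product name dictionary: unit column -> product-name keywords.
PRODUCT_NAME_DICT = {
    ("agricultural products", "vegetables"): {"AAA"},
    ("agricultural products", "fruits"): {"BBB"},
}


def search_in_columns(product_name: str, columns) -> Optional[tuple]:
    """Return the first unit column whose keyword occurs in the product name."""
    for column in columns:
        if any(keyword in product_name for keyword in PRODUCT_NAME_DICT[column]):
            return column
    return None


def verify_registration(product_name: str, provisional_column: tuple) -> dict:
    """Run both search modes and flag the record when their results differ."""
    # Provisional classification mode: search only the provisionally assigned column.
    provisional = search_in_columns(product_name, [provisional_column])
    # Collation mode: search every column, ignoring the provisional result.
    collated = search_in_columns(product_name, PRODUCT_NAME_DICT.keys())
    return {
        "provisional": provisional,
        "collated": collated,
        "needs_review": provisional != collated,
    }


ok = verify_registration("AAA 200g", ("agricultural products", "vegetables"))
mismatch = verify_registration("BBB 1 pack", ("agricultural products", "vegetables"))
```

In this sketch a record flagged with `needs_review` would be the one reported to the operator; how the learning function unit then updates the dictionaries is not modeled here.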
In the above invention, the dictionary search execution unit decomposes the character strings of the product name and its associated information in each record into word units, and applies each dictionary to the decomposed words. In this case, for example, even when a record input at a store mixes the product name with product-related information, the dictionary search execution unit decomposes the input data into words and applies each dictionary to them, so that the record can be registered in the appropriate unit column.
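As a rough illustration of word-unit decomposition, the sketch below splits a mixed product-name field and applies two dictionaries to each word. A regular-expression split stands in for real word segmentation, and the dictionaries, keyword values, and function names are hypothetical.

```python
import re

# Hypothetical dictionaries; real ones would be built from store records.
PRODUCT_NAME_KEYWORDS = {"AAA"}
ANNOTATION_KEYWORDS = {"manufacturer": {"MakerX"}, "size": {"200g"}}


def split_into_words(text: str) -> list:
    """Split a raw product-name field on whitespace, brackets, slashes, commas."""
    return [word for word in re.split(r"[\s()\[\]/,]+", text) if word]


def apply_dictionaries(text: str) -> dict:
    """Apply each dictionary to every word and sort the words into fields."""
    result = {"product_name": None, "annotations": {}, "unmatched": []}
    for word in split_into_words(text):
        if word in PRODUCT_NAME_KEYWORDS:
            result["product_name"] = word
            continue
        for item, keywords in ANNOTATION_KEYWORDS.items():
            if word in keywords:
                result["annotations"][item] = word
                break
        else:
            # No dictionary recognised the word.
            result["unmatched"].append(word)
    return result


parsed = apply_dictionaries("AAA (MakerX) 200g special")
```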
In the above invention, the dictionary search execution unit further includes a keyword control unit that sets the application order of the keywords based on the string length of each keyword and the string length of keywords obtained by combining keywords. For example, suppose the product name "AAABB" is to be registered and the product name dictionary contains both "AAA", which has a longer character string, and "BB", which has a shorter one. The dictionary search execution unit searches from the longer "AAA" first, which prevents the product name "AAABB" from being registered under the classification of "BB".
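The "AAABB" example can be reproduced with a short sketch of longest-first keyword ordering. The function names are hypothetical, and a plain substring test is assumed as the matching rule.

```python
def order_by_length(keywords):
    """Return keywords sorted longest first (ties fall back to alphabetical order)."""
    return sorted(keywords, key=lambda k: (-len(k), k))


def classify(product_name, keywords):
    """Return the first keyword, tried longest first, contained in the product name."""
    for keyword in order_by_length(keywords):
        if keyword in product_name:
            return keyword
    return None


print(classify("AAABB", ["BB", "AAA"]))  # prints AAA
```

If the keywords were instead tried in their listed order, "BB" would match first, which is the misregistration the length-based ordering avoids.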
The dictionary search execution unit can perform AND/OR searches by combining related keywords such as AA1, AA2, and AA3 with one another in round-robin fashion, as AA1 × AA2, AA1 × AA3, AA2 × AA1, AA2 × AA3, AA3 × AA1, and AA3 × AA2. In this case, searching in descending order of the total string length of the combined keywords enables more appropriate classification. The dictionary search execution unit can also be provided with a function that generates new search keywords by concatenating related keywords, such as AA1AA2 and AA1AA3. By combining such generated search keywords with the original keywords, the string length used in the search can be adjusted arbitrarily, so that the application order of the limited set of decomposed keywords can be adjusted and the analysis accuracy improved.
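A minimal sketch of the round-robin combination follows, using `itertools.permutations` to enumerate ordered pairs. The function names, and the choice of a substring test for the AND search, are assumptions made for illustration.

```python
from itertools import permutations


def combined_searches(keywords):
    """Build ordered keyword pairs, longest total string length first."""
    pairs = list(permutations(keywords, 2))
    # sorted() is stable, so pairs with equal total length keep their
    # original round-robin order.
    return sorted(pairs, key=lambda pair: -(len(pair[0]) + len(pair[1])))


def joined_keywords(keywords):
    """Concatenate each ordered pair into a single new search keyword."""
    return ["".join(pair) for pair in permutations(keywords, 2)]


def matches_all(product_name, pair):
    """AND search: every keyword in the pair must occur in the product name."""
    return all(keyword in product_name for keyword in pair)


pairs = combined_searches(["AA1", "AA2", "AA3"])
```

With three keywords of equal length, `pairs` keeps the order AA1 × AA2, AA1 × AA3, AA2 × AA1, and so on; with keywords of different lengths the longest combinations are tried first.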
As described above, according to the present invention, since the order of application of the keywords is set according to the string length of the keywords or the combined keywords, it is possible to register the record in an appropriate unit column.
The system of the present invention can be realized by executing a program written in a predetermined language on a computer. Specifically, the present invention is a product code analysis program that analyzes an analysis target database in which hierarchically classified product names are stored as records and aggregates the records according to the hierarchical structure, the program causing a computer to perform:
(1) an input step of inputting the analysis target database through an input interface with the hierarchical structure maintained;
(2) a provisional classification execution step of reading a classification dictionary, in which keywords of the classification names in each hierarchy constituting the hierarchical structure are stored in association with the unit columns serving as storage destinations of the product names, and, for each record in the analysis target database input from the input interface, provisionally classifying and registering the product name of the record according to the occurrence rate of classification-name keywords in the classification dictionary;
(3) a product name registration step of reading a product name dictionary, which stores, for each unit column classified by the hierarchical structure, keywords of the product names belonging to that unit column, and, for each record in the analysis target database, registering the product name of the record according to the occurrence rate of product-name keywords in the product name dictionary, based on the provisional classification registration made in the provisional classification execution step; and
(4) a dictionary search execution step of defining the application order of the dictionaries, the application order of the keywords, and the combinations of keywords used when the occurrence rate of keywords is calculated in the provisional classification execution step and the product name registration step.
Further, by installing the program on a computer such as a user terminal or a Web server, or on an IC chip, and executing it on a CPU, a system having the above-described functions and effects can be easily constructed. The program may be distributed via a communication line, for example, or may be supplied as a packaged application that runs on a stand-alone computer.
Such a program can also be recorded on a recording medium readable by a general-purpose computer. Using that recording medium, the system or the method can be realized on a general-purpose or dedicated computer, and the program can be easily stored, transported, and installed.
Effects of the invention
As described above, according to the present invention, it is possible to easily classify the product master information registered in different classifications or product names in each store into a uniform category and unify the product information by changing the product name to an appropriate product name.
Drawings
Fig. 1 is a conceptual diagram illustrating a product code analysis system according to an embodiment.
Fig. 2 is table data showing records for displaying product information on the store side according to the embodiment.
Fig. 3 is table data showing each information accumulated in a unit column in the product master information database according to the embodiment.
Fig. 4 is table data showing each information accumulated in the annotation dictionary database according to the embodiment.
Fig. 5 is an explanatory diagram showing an outline of the product code analysis method according to the embodiment.
Fig. 6 is a flowchart showing a method of generating various dictionary data according to the embodiment.
Fig. 7 is a flowchart showing a product information classification method according to the embodiment.
Fig. 8 is a flowchart showing a product information classification method according to the embodiment.
Detailed Description
Hereinafter, an embodiment of the product code analysis system according to the present invention will be described in detail with reference to the drawings. Fig. 1 is a block diagram showing an internal structure of a management server according to the present embodiment, and fig. 2 is table data showing product master information accumulated in a product master information database according to the present embodiment. Fig. 3 is table data of information accumulated in the annotation dictionary database according to the present embodiment, and fig. 4 is table data showing the store-side commodity master information according to the present embodiment. The term "module" used in the description means a functional unit configured by hardware such as a device or an apparatus, software having a function thereof, a combination of these, or the like to achieve a predetermined operation.
The system of the present embodiment acquires, as records, the hierarchically classified product names generated by the information processing terminals 3 and the like of a plurality of stores S, and aggregates the records according to the hierarchical structure. It consists of a management server 1 and a database group 2.
The information processing terminal 3 is, for example, a terminal held by a retail dealer such as a supermarket that sells foods, daily necessities, or the like, and has the arithmetic processing function of a CPU and the communication processing function of a communication interface. It can be realized by a general-purpose computer such as a personal computer or by a dedicated device with specialized functions (for example, a POS device), and may also be a mobile computer, a PDA (personal digital assistant), a mobile phone, or a similar mobile terminal.
The database group 2 is a database server that accumulates information related to the system itself, together with the dictionary data used when the records of different stores are registered as unified product information.
Specifically, the database group 2 includes a product master information database 21, a classification dictionary database 22, a product name dictionary database 23, an annotation dictionary database 24, a JAN code database 25, and an analysis target database 26.
The analysis target database 26 is table data in which product information, including the product names of the stores to be analyzed, is accumulated, and it stores the hierarchically classified product names in units of records. Specifically, as shown in fig. 2, the items "categories 1 to 4", "JAN code", "product code", and "product name" are stored in the analysis target database 26. Here, "categories 1 to 4" are attribute information about the products of each department. In the example shown in fig. 2, category 1 indicates a department such as agricultural products, category 2 indicates a group of products such as vegetables, category 3 indicates a more detailed group of products such as mushrooms, and category 4 indicates a variety such as Hypsizygus marmoreus.
The "JAN code" item records the common product code used in Japan, and the "product code" item records a code assigned independently by each store. The "product name" item records the name of the product together with product-related information such as the origin or quantity of the product.
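To make the record layout and the aggregation by hierarchy concrete, the following sketch models one record with the four items described above and counts records at every level of the category path. The class and function names are hypothetical, and the JAN codes and product codes are placeholders rather than real codes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Record:
    """One row of the analysis target database (field names are illustrative)."""
    categories: tuple   # category 1 to category 4, top level first
    jan_code: str       # placeholder values below, not real JAN codes
    product_code: str   # code assigned independently by the store
    product_name: str


def aggregate_by_hierarchy(records):
    """Count records under every prefix of their category path."""
    totals = Counter()
    for record in records:
        for depth in range(1, len(record.categories) + 1):
            totals[record.categories[:depth]] += 1
    return totals


records = [
    Record(("agricultural products", "vegetables", "mushrooms"), "0000000000001", "A-001", "AAA"),
    Record(("agricultural products", "vegetables", "mushrooms"), "0000000000002", "B-010", "BBB"),
    Record(("agricultural products", "fruits"), "0000000000003", "A-002", "CCC"),
]
totals = aggregate_by_hierarchy(records)
```

Counting every prefix means a single pass yields totals for the department, the product group, and each finer level at once.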
The product master information database 21 is a storage device that accumulates the product names of the input records in the unit columns (columns) that serve as the storage destinations of the product names. Here, as shown in fig. 3, a unit column is information delimited by the items "category 1" to "category 4"; the example in fig. 3 shows the unit column for the product "yuzu". Within each unit column, the database further stores the "product name" of each product and "comment information", which is information related to the product.
"categories 1 to 4" are attribute information about the products of each department. In the example shown in fig. 3, category 1 indicates a department such as agricultural products, category 2 indicates a group of products such as vegetables, category 3 indicates a more detailed group of products such as mushrooms, and category 4 indicates a variety such as Hypsizygus marmoreus.
The "product name" item records the name of the product with predetermined comment information added, such as the origin and quantity of the product. In the example shown in the figure, the accumulated information includes "manufacturer", which identifies the manufacturing source; "brand", which distinguishes the product from others; the place of production; "size", which indicates the size or weight of the product; and "number of packages", which is sales form information such as the number of pieces put in a box. In the present embodiment the "product name" is stored with the comment information added, but the product name alone may be stored instead.
In addition, although not shown, management-side product identification information for identifying each product is added to the product master information database 21. In another database, identification information for identifying the store, use information including the sales status of the product, and the like are recorded in association with the management-side product identification information. Here, the use information includes sales status information set at each store, such as "average price", "sales amount", "number of sales", "sales shop rate", and "national sales end result", and "update status information" such as "update date". Each product can be analyzed by searching for its use information or for the product information of each store based on the management-side product identification information. When comment information has been added to the "product name" item, the search can use a combination of the product name and the added comment information.
The classification dictionary database 22 is a storage device that stores keywords of classification names in each hierarchy constituting a hierarchical structure in association with unit columns serving as storage destinations of the respective product names. In the present embodiment, among keywords appearing in each classification, keywords having a high occurrence rate are recorded as classification keywords, and keywords having a low occurrence rate are accumulated while being associated with keywords having a high occurrence rate.
The product name dictionary database 23 is a storage device that stores keywords of product names belonging to each unit column classified according to the hierarchical structure. In the present embodiment, keywords with a high occurrence rate among keywords of product names appearing in each category are recorded as keywords for product name assignment, and keywords with a low occurrence rate are accumulated in association with keywords with a high occurrence rate.
The annotation dictionary database 24 is a storage device that stores, in each unit column classified according to the hierarchical structure, information other than product names that is associated with the product names registered in the product name dictionary database 23. As shown in fig. 4, the words accumulated in the annotation dictionary database 24 are roughly classified into "product-related information", "attribute-related information", and "conditioning-related information", and are further classified by content. Specifically, the product information accumulated under "product-related information" is divided into the following items:
- "manufacturer"
- "brand"
- "place of origin/country name"
- "volume/weight (kg/ml)"
- "size/length"
- "number of pieces put in/number of pieces of a pallet"
- "flavor", indicating a kind of taste
- "character", indicating the name of a character
- "container, package", indicating a kind of container such as a can or a bag package
- "material, variety, seasoning"
- "allergen", indicating a material that is an allergen
- "age restriction", indicating a purchase restriction age
- "sales time/season", indicating the selling time of the product (weekday, morning, Olympic period, etc.) or information on the season (spring, Mother's Day, etc.)
- "sales area/special product", indicating information on the sales area, etc.
- "sales characteristics", indicating discount information, etc.
The "attribute-related information" is information about the customers who purchase a product, and is divided into items such as "rank/decile" (customers ranked in order of purchase amount), "gender", "age group", "aspiration" (the aspirations of the customer), and "time" (the time of sale). The "conditioning-related information" stores information about the conditioning of the product, and is divided into items such as "storage period", "storage method", "processing degree", and "dining environment" (the usage situation). Even when only one store uses any of the above items, the corresponding data are accumulated in the annotation dictionary database 24.
The JAN code database 25 stores JAN codes, which are common product codes, in association with the items of the product master information database 21: categories 1 to 4, the product names, and the words of the comment information. The JAN code database 25 includes official JAN table data, in which JAN codes are associated with classifications and product names common to all stores, and temporary JAN table data, in which the management side provisionally assigns temporary classifications and temporary product names to JAN codes. Because JAN codes are updated daily as new products are registered, it is difficult to keep all data in the official JAN table data; the management side therefore first accumulates, on a temporary basis, table data that associates JAN codes with classifications and product names it has determined itself. The information accumulated in the temporary JAN table data is then integrated into the official JAN table data at regular intervals, at which point the temporarily registered category and product name can be changed to the official category and product name. Registration in the temporary JAN table data may be performed by a manager's user operation, or the system may be configured to register automatically any product information not yet registered in the official JAN table data.
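The relationship between the two tables can be sketched as a lookup with fallback plus a periodic merge. This is only an illustration under stated assumptions: the table layout, the function names, and the JAN code values (placeholders, not real codes) are hypothetical, and the merge simply moves entries without modeling the change to official names.

```python
# Hypothetical tables: JAN code -> (categories, product name).
official_jan = {"0000000000001": (("agricultural products", "vegetables"), "AAA")}
temporary_jan = {"0000000000002": (("agricultural products", "fruits"), "BBB")}


def lookup_jan(code):
    """Prefer the official table and fall back to the temporary one."""
    if code in official_jan:
        return official_jan[code], "official"
    if code in temporary_jan:
        return temporary_jan[code], "temporary"
    return None, "unregistered"


def merge_temporary_into_official():
    """Periodic integration: move temporary entries into the official table."""
    for code, entry in list(temporary_jan.items()):
        # Keep an existing official entry if one has appeared in the meantime.
        official_jan.setdefault(code, entry)
        del temporary_jan[code]
```

A code that returns "unregistered" is the case in which, per the text, the product could be registered in the temporary table either by a manager or automatically.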
On the other hand, the management server 1 is a server device that classifies product information from stores into unit columns and registers the product information in a database, and is realized by a server computer that executes various kinds of information processing or software having the functions thereof. As shown in fig. 1, the management server 1 includes a communication interface 11, an input interface 12, an output interface 13, and a control unit 14.
The input interface 12 is a device for inputting a user operation, such as a mouse or a keyboard, and in the present embodiment, records are input to the analysis target database 26 while maintaining the hierarchical structure. The output interface 13 is a device for outputting video or audio, such as a display or a speaker. In particular, the output interface 13 includes a display unit 13a such as a liquid crystal display. The communication interface 11 is a communication interface capable of performing a call or data communication, and performs transmission and reception of packet data via a communication network to acquire records of each store S. The memory 18 is a storage device that accumulates an OS (Operating System) or a product code analysis program according to the present embodiment.
The control unit 14 is an arithmetic module configured by hardware such as a processor (a CPU or a DSP (Digital Signal Processor)), a memory, and other electronic circuits, by software such as a program having the corresponding functions, or by a combination of these. By reading and executing the program as appropriate, it virtually constructs various functional modules, and through the constructed functional modules it controls the operation of each part and performs various processes in response to user operations. In the present embodiment, the control unit 14 includes a product information registration unit 15, a product information search unit 16, and a dictionary data generation unit 17.
The dictionary data generating unit 17 is a module for constructing the various dictionary databases. First, on receiving input of information such as sample product names, the dictionary data generating unit 17 extracts the words from each item of the product information using a language analysis program, for example morphological analysis.
Then, the dictionary data generating unit 17 calculates the occurrence rate of the keywords for each item, sets the keyword with a high occurrence rate as the unified word, and accumulates the words in each dictionary database. The setting of the dictionary data is described in detail below. In the present embodiment, as shown in fig. 2, it is assumed that records of Company A, Company B, and Company C are input as dictionary registration data.
First, a case will be described in which keywords for categories 1 to 4 are constructed in the dictionary database based on the product information input from the stores. In the present embodiment, category 1 is "agricultural products" at Company A, "vegetables and fruits" at Company B, and "agricultural products" at Company C. In this case, the dictionary data generating unit 17 sets "agricultural products", which has a high occurrence rate, as the high-occurrence keyword for category 1.
In category 2, Companies A, B, and C all use the word "vegetables", so "vegetables" is set as the high-occurrence keyword. In category 3, Company A uses the word "mushroom", Company B uses the word "mushroom", and Company C uses the word "mushroom". In this case, the "mushroom" of Company B, which has a high occurrence rate, is set as the high-occurrence keyword for category 3.
In category 4, Company A uses the word "harziana", Company B uses the word "hypsizygus", and Company C uses the words "hypsizygus" and "hypsizygus". In this case, the "hypsizygus" of Companies B and C, which has a high occurrence rate, is set as the high-occurrence keyword for category 4. Each keyword with a low occurrence rate that is not set as a high-occurrence keyword is associated with the corresponding high-occurrence keyword and stored in each dictionary database.
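The occurrence-rate rule for one category level can be sketched as follows, using the category 1 names given for the three companies. The function name and return layout are hypothetical, and ties are resolved simply by whichever name `Counter.most_common` returns first, which the source does not specify.

```python
from collections import Counter


def build_category_entry(names_by_store):
    """Pick the most frequent name as the unified keyword and map the rest to it."""
    counts = Counter(names_by_store.values())
    unified, _ = counts.most_common(1)[0]
    # Low-occurrence names are kept, associated with the unified keyword.
    synonyms = {name: unified for name in counts if name != unified}
    rates = {name: count / len(names_by_store) for name, count in counts.items()}
    return unified, synonyms, rates


category1 = {
    "Company A": "agricultural products",
    "Company B": "vegetables and fruits",
    "Company C": "agricultural products",
}
unified, synonyms, rates = build_category_entry(category1)
```

Here `unified` is "agricultural products" and "vegetables and fruits" is retained as a synonym pointing to it, mirroring how low-occurrence keywords are stored in association with high-occurrence ones.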
Next, a case will be described in which keywords for product names are constructed in the dictionary database. First, the dictionary data generating unit 17 accepts an operation that reduces the product name in the product master information to the product name alone. For example, as shown in fig. 4, when the product name is "macy mushroom (beidou)", it accepts an operation in which the characters "beidou" are extracted and the entry is replaced with the word "macy mushroom" alone. Then, the dictionary data generating unit 17 counts homophones among the product names, and registers the product name with a high occurrence rate as the high-occurrence keyword. Here, although the words "chamaemelum maculatum-jade-mushroom" and "chamaemelum maculatum-mushroom" exist as homophones, the word "chamaemelum maculatum" has a high occurrence rate, so the product name is set to "chamaemelum maculatum". In this case, the keywords registered in the department may be given a priority indicating the order in which they are used at the time of operation.
At this time, the dictionary data generating unit 17 accepts combinations of 2 or more keywords needed to specify a product name, such as the product name and the form of the product, and registers each combination as a keyword. For products that are the same but have different names depending on the region (for example, "chamomile" in Kanto and "chrysanthemum" in Kansai), a selection operation chooses which product name becomes the keyword, and the product names are unified.
Next, the setting of the annotation information in the annotation dictionary database will be described. The dictionary data generating unit 17 records product-related information under each item in the annotation dictionary database 24. For example, as shown in fig. 3, the word "beidou" extracted from the product name "yuzu (beidou)" of Company A is registered under the item "manufacturer" following a user operation. Then, for the comment information, the occurrence rate of keywords is calculated for each item, and keywords with a high occurrence rate are set and accumulated in each dictionary database.
Through the above-described processing by the dictionary data generating unit 17, keywords of the category, the product name, and the comment information are constructed in various databases. Then, the product information registration unit 15 refers to the various dictionary databases 22 to 25 constructed, and thereafter analyzes the product information (product name, category name for each store, comment information, and the like) inputted from each store, and totals the product information as uniform information in the product master information database 21.
The product information registration unit 15 includes a provisional classification execution unit 15a, a product name registration unit 15b, a dictionary search execution unit 15c, a collation function unit 15d, a learning function unit 15e, and a comment registration unit 15f.
The provisional classification execution unit 15a is a module that provisionally classifies and registers the product names of the records in the analysis target database 26 input from the input interface 12, in accordance with the occurrence rate of the keyword of the classification name in the classification dictionary database 22. Specifically, when a record is input, the provisional classification execution unit 15a compares the classification name of the record with the keywords of the classification names in the classification dictionary database 22 in the order of categories 1 to 4, replaces the classification name of the record with the keyword having a high occurrence rate, and performs provisional classification registration.
For example, as shown in fig. 2, assume that a record of Company A is input. The word "agricultural product" of category 1 and the word "vegetable" of category 2 in the input record are the same as the keywords having a high occurrence rate stored in the classification dictionary database 22, so the record is provisionally classified and registered under "agricultural product" in category 1 and under "vegetable" in category 2. On the other hand, when the classification dictionary database 22 is referred to for the word "mushroom" of category 3, there is a "mushroom" whose occurrence rate is higher than that of "mushroom", so the record is associated with that "mushroom" keyword and provisionally classified and registered under "mushroom" in category 3. Similarly, for "Hypsizygus marmoreus" of category 4, the record is provisionally classified and registered in category 4 under "Hypsizygus marmoreus", the keyword having a high occurrence rate.
Similarly, assume that the record of Company B is input. When the classification dictionary database 22 is referred to, there is a keyword "agricultural product" having a higher occurrence rate than the "vegetables" of category 1, so the record is provisionally classified and registered under "agricultural product" in category 1. The input words "vegetables" of category 2, "mushrooms" of category 3, and "mushrooms" of category 4 are themselves keywords with high occurrence rates, so the record is provisionally classified and registered under those keywords.
Further, assume that the record of Company C is input. When the classification dictionary database 22 is referred to, the word "agricultural product" of category 1 and the word "vegetable" of category 2 are the same as the keywords having a high occurrence rate in the classification dictionary database 22, so the record is provisionally classified and registered under "agricultural product" in category 1 and under "vegetable" in category 2. On the other hand, when the classification dictionary database 22 is referred to for the word "mushroom" of category 3, there is a keyword having a higher occurrence rate than "mushroom", namely "mushroom", so the record is provisionally classified and registered under "mushroom" in category 3. Similarly, for "Hypsizygus marmoreus" of category 4, the record is provisionally classified and registered in category 4 under "Hypsizygus marmoreus", the keyword having a high occurrence rate. Words not yet accumulated in the dictionary database are input to the dictionary data generating unit 17 and then registered in the dictionary.
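The provisional classification of one record can be sketched as a per-level dictionary lookup. This is a minimal sketch under stated assumptions: `provisional_classify`, the list-of-dicts layout, and the word "fungi" are hypothetical; the category 1 and 2 words follow the Company B example above.

```python
def provisional_classify(record, category_dicts):
    """Replace each category word with its high-occurrence keyword.

    record: category words in order (category 1 first).
    category_dicts: one {variant: canonical keyword} dict per category level.
    Returns (classified, unknown); unknown words would be handed to the
    dictionary generation step for registration.
    """
    classified, unknown = [], []
    for level, (word, mapping) in enumerate(zip(record, category_dicts), start=1):
        if word in mapping:
            classified.append(mapping[word])
        else:
            classified.append(None)  # not yet in the dictionary
            unknown.append((level, word))
    return classified, unknown


category_dicts = [
    {"agricultural product": "agricultural product", "vegetables": "agricultural product"},
    {"vegetables": "vegetables"},
    {},  # empty category 3 dictionary, to show the unknown-word path
]
# Company B uses "vegetables" for both category 1 and category 2.
classified, unknown = provisional_classify(["vegetables", "vegetables", "fungi"], category_dicts)
print(classified, unknown)
```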
The product name registration unit 15b is a module that registers, for each record in the analysis target database 26, the product name of each record in the unit column according to the occurrence rate of the keyword of the product name in the product name dictionary database 23 based on the provisional classification registration in the provisional classification execution unit 15 a.
To describe the processing of the product name registration unit 15b in detail, first, the product name registration unit 15b compares the product name of the input record with the keywords for each division stored in the product name dictionary database 23 in order, detects a keyword having a high occurrence rate associated with the input product name, and registers the product name of the keyword having the high occurrence rate in the item "product name" column in the unit column.
Specifically, as shown in fig. 2, when the record of Company A is input, "hypsizygus" in the first row is the same as "hypsizygus", which is the keyword having a high occurrence rate, so the characters "hypsizygus" are registered in the unit column.
On the other hand, when the product name dictionary database 23 is referred to for the product name "sanguisorba gilsonii" of Company B, the keyword having a high occurrence rate is set as "sanguisorba gilsonii". Therefore, the product name "naematoloma mushroom" of Company B is converted to "naematoloma mushroom" and registered in the unit column. Likewise, "Hypsizygus marmoreus" of Company B is converted into "Hypsizygus marmoreus" and registered. The other records are similarly converted into the keywords having a high occurrence rate and registered.
The comment registration unit 15f is a module for registering comment information of the product with reference to the comment dictionary database 24. Specifically, the annotation registering unit 15f registers, for each record in the analysis target database 26, information associated with the product name of each record in the unit column to which the product belongs, in accordance with the occurrence rate of the keyword in the annotation dictionary database 24.
For example, as shown in fig. 2, when the selected keyword is "beidou", it is determined whether or not the word is contained in the annotation dictionary database 24. Here, the word "beidou" is registered in the "manufacturer" item, and therefore, as shown in fig. 3, the comment registration unit 15f assigns the word "beidou" to the "manufacturer" item of the comment information. Similarly, keywords having a high occurrence rate for each item are assigned to the corresponding items of the comment information. For example, the keyword "Chinese" is assigned to the "producing area" item, and a keyword of the form "numeric value + g (grams)" is assigned to the "size" item.
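The assignment of words to comment items can be sketched as follows. This is an illustrative sketch only: `ANNOTATION_DICT`, `SIZE_PATTERN`, `assign_annotations`, and the sample size "100g" are assumptions, and a regular expression stands in for the "numeric value + g" rule.

```python
import re

# Hypothetical annotation dictionary: word -> comment item.
ANNOTATION_DICT = {"beidou": "manufacturer", "Chinese": "producing area"}
# A number followed by "g" is treated as a size (assumed pattern).
SIZE_PATTERN = re.compile(r"^\d+(?:\.\d+)?\s*g$")


def assign_annotations(words):
    """Assign each decomposed word to a comment item when one applies."""
    info = {}
    for word in words:
        if word in ANNOTATION_DICT:
            info[ANNOTATION_DICT[word]] = word
        elif SIZE_PATTERN.match(word):
            info["size"] = word
    return info


annotations = assign_annotations(["beidou", "Chinese", "100g"])
print(annotations)
```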
The dictionary search execution unit 15c is a module that defines the order in which the dictionaries and the keywords are applied, and the combinations of keywords to be used, when the provisional classification execution unit 15a and the product name registration unit 15b calculate the occurrence rate of keywords.
Here, the application order of each dictionary and each keyword includes, for example, a method of setting a priority to a product keyword and then searching from a keyword having a higher priority, or a method of searching from a keyword having a longer string length. Further, the keyword control section 15g may execute a search based on the character string length. The keyword control unit 15g is a module for setting the application order of keywords based on the string length of each keyword and the string length of a keyword obtained by combining the keywords.
In the present embodiment, priorities in 10 levels are set for the product keywords of all departments. The search starts from the keyword having the highest priority, and keywords having the same priority are searched in descending order of character string length.
For example, when the product name "AAABB" is registered and when the keyword "AAA" having a long string length and the keyword "BB" having a short string length in the product name dictionary have the same priority, the dictionary search execution unit can search from the "AAA" having a long string length based on the string length, and therefore, can prevent the product name "AAABB" from being registered under the classification of "BB". On the other hand, if the priority of the keyword "BB" with a short string length is set to be higher than the keyword "AAA" with a long string length, the product is registered in the product column of "BB" even with the same product name "AAABB". The order of application of the keywords can be appropriately selected according to the product department or product name, and the search can be performed only by either the priority or the character string length. Further, the application order may be changed so that the search is performed according to the string length, and the priority may be referred to when the same string length exists. The priority level may be arbitrarily changed.
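The "AAABB" example can be sketched as a sort on priority and then on string length. This is a minimal sketch: `match_keyword`, the substring test, and the convention that a larger number means a higher priority are assumptions not stated in the description.

```python
def match_keyword(product_name, keywords):
    """Return the first keyword found in product_name.

    keywords: list of (keyword, priority) pairs. Larger priority values are
    searched first (assumed convention); ties go to the longer keyword.
    """
    ordered = sorted(keywords, key=lambda kp: (-kp[1], -len(kp[0])))
    for keyword, _priority in ordered:
        if keyword in product_name:
            return keyword
    return None


# Same priority: the longer keyword "AAA" is applied first.
same_priority = match_keyword("AAABB", [("BB", 5), ("AAA", 5)])
# Higher priority on "BB": the shorter keyword wins despite its length.
bb_preferred = match_keyword("AAABB", [("BB", 6), ("AAA", 5)])
print(same_priority, bb_preferred)
```

Swapping the two sort keys gives the alternative order mentioned above, in which string length is compared first and priority only breaks ties.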
The dictionary search execution unit 15c has a function of specifying a combination of keywords. Specifically, the dictionary search execution unit 15c searches for a combination of two or more keywords necessary for identifying a product name. The information combined with the product is information included in the annotation dictionary database 24, such as "product form", "manufacturer", "time/season of sale", and "flavor", and can be extracted from the database as desired. As the extraction method, for example, a prompt asking which conditions the search should use may be displayed on the administrator's screen and the search conditions accepted, or the search may be performed in a predetermined application order in which the keyword combinations have been set.
Then, the dictionary search execution unit 15c exhaustively combines related keywords such as AA1, AA2, and AA3 with each other, in the patterns AA1 × AA2, AA1 × AA3, AA2 × AA1, AA2 × AA3, AA3 × AA1, and AA3 × AA2, and can perform an AND search that includes all the specified keywords, an OR search that includes any of the keywords, or the like. In this case, the keywords can be classified more appropriately by searching in descending order of the total string length of the keywords or in order of priority. The dictionary search execution unit 15c can also be provided with a function of generating a new search keyword by appropriately connecting related keywords, such as AA1AA2 and AA1AA3. By performing a search in which the generated search keyword and the original keywords are combined to adjust the string length as desired, the order of application of the decomposed, limited set of keywords can be adjusted, and the analysis accuracy can be improved. Further, even if another word is inserted between the keywords of a combination, that word is ignored in the determination, so a match can still be determined.
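The keyword combinations and the AND/OR searches can be sketched with ordered pairs. This is a minimal sketch: the function names and the use of a substring test are assumptions. Because each keyword is tested independently, a word inserted between two keywords does not prevent a match.

```python
from itertools import permutations


def ordered_pairs(keywords):
    """All ordered pairs of distinct keywords (AA1 x AA2, AA1 x AA3, ...)."""
    return list(permutations(keywords, 2))


def and_search(text, combo):
    """True when every keyword of the combination occurs in text."""
    return all(keyword in text for keyword in combo)


def or_search(text, combo):
    """True when at least one keyword of the combination occurs in text."""
    return any(keyword in text for keyword in combo)


def joined_keyword(combo):
    """Generate a new search keyword by connecting related keywords."""
    return "".join(combo)


pairs = ordered_pairs(["AA1", "AA2", "AA3"])
# Apply the combinations from the longest total string length to the shortest.
by_length = sorted(pairs, key=lambda combo: -sum(len(k) for k in combo))
print(pairs)
print(and_search("AA1 xx AA2", ("AA1", "AA2")), joined_keyword(("AA1", "AA2")))
```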
Before the product names and associated information of the records are input to the provisional classification execution unit 15a and the product name registration unit 15b, the dictionary search execution unit 15c decomposes the product name and associated information strings in each record into word units by a language analysis program such as morphological analysis, and applies each dictionary in units of the words thus decomposed. For example, as shown in fig. 2, the product name "hypsizygus (beidou)" of the record input from Company A is decomposed into the words "hypsizygus" and "beidou".
The dictionary search execution unit 15c also has a function of defining the order in which the dictionaries and the keywords are applied, and the combinations of keywords to be used, when the comment registration unit 15f calculates the occurrence rate of keywords.
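The decomposition step can be illustrated with a simple splitter. This sketch is only a stand-in: real morphological analysis requires a language-specific analyzer, and `split_name`, which splits on parentheses and whitespace, is an assumption made for illustration.

```python
import re


def split_name(raw):
    """Split a product name string into words on parentheses and whitespace.

    A stand-in for morphological analysis, not an equivalent of it.
    """
    return [token for token in re.split(r"[()\s]+", raw) if token]


words = split_name("hypsizygus (beidou)")
print(words)
```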
The dictionary search execution unit 15c has the following functions: as shown in fig. 2, when the record obtained from the shop side contains the JAN code, the JAN code database 25 is referred to, and the words of the categories 1 to 4, the product names, and the comment information associated with the JAN code are extracted and registered in the commodity master information database 21 (in the figure, P1 to P5) as shown in fig. 3. In this case, for example, a name obtained by combining comment information such as a manufacturer name or a brand name is recorded in the product name.
The collation function unit 15d is a module that executes a provisional classification mode, in which the dictionary search of product names is performed based on the provisional classification registration by the provisional classification execution unit 15a, and a collation mode, in which the dictionary search is performed over all classifications regardless of the result of the provisional classification registration, and that issues a notification when the results of the two modes differ. The collation result may be notified, for example, by e-mail or by displaying the results of the two modes in a pop-up on the display unit 13a. The unit also has a function of accepting, after the notification, a selection of the classification (department) in which the product is to be registered.
When the JAN code included in the input product information is not registered in the JAN code database 25, the collation function unit 15d refers to the temporary JAN table data and determines whether or not the JAN code is included in it. When the temporary JAN table data does not include the JAN code, the display unit 13a displays this fact and accepts a user operation specifying the category (department) in which the product is to be registered.
On the other hand, when the temporary JAN table data includes the JAN code, the product is given the provisional classification registered in the temporary JAN table data. In this case, the result of the classification may be displayed on the display unit 13a, and an operation of changing the classification destination may be accepted. The collation function unit 15d also includes a function of moving a specific product name to another classification destination in response to a user operation. As a method of accepting the user operation, for example, a list of unit columns can be displayed on a screen, and the administrator can move an item to any unit column intuitively by dragging it on the display screen.
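The comparison of the two modes can be sketched as two searches over different sets of classifications. This is a minimal sketch under stated assumptions: `best_match`, `collate`, the classification names "X" and "Y", and the longest-keyword rule are all hypothetical.

```python
def best_match(name, dictionaries, categories):
    """Longest keyword found in name among the given classifications."""
    hits = [
        (keyword, category)
        for category in categories
        for keyword in dictionaries.get(category, ())
        if keyword in name
    ]
    if not hits:
        return None
    keyword, category = max(hits, key=lambda hit: len(hit[0]))
    return category, keyword


def collate(name, provisional_category, dictionaries):
    """Run both modes and report whether their results differ."""
    provisional = best_match(name, dictionaries, [provisional_category])
    full = best_match(name, dictionaries, list(dictionaries))
    return provisional, full, provisional != full


# Hypothetical classifications "X" and "Y", each with one keyword.
dictionaries = {"X": ["BB"], "Y": ["AAA"]}
provisional, full, notify = collate("AAABB", "X", dictionaries)
print(provisional, full, notify)
```

When `notify` is true, the two results would be presented to the user, who then selects the classification to register.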
The learning function unit 15e is a module that reflects the dictionary search results of the two modes in the corresponding dictionary based on the result of the collation function. Specifically, in accordance with the user operation received by the collation function unit 15d, the learning function unit 15e modifies the dictionary data and changes the order of application of the keywords through the keyword control unit 15g, so that when the same product is input from then on, it is automatically accumulated in the corresponding unit column without notification processing. When a change operation is performed that moves a specific product name classified into a unit column to another classification destination, the learning function unit 15e automatically changes, for example, the order of application of the keywords used when the same product is input, so that the change operation is reflected in the dictionary search results from then on.
The processing of the learning function unit 15e will be described in detail. For example, when a specific product name is moved to another classification destination in accordance with an arbitrary operation by the user or as a result of the collation function, for example, a list of unit columns (classification list) is displayed on the screen, and the list is specified by dragging or the like on the display screen, thereby specifying the product name to be changed and the unit column of the movement destination. In accordance with this change operation, the learning function section 15e automatically changes the priority given to the keyword, the number of character strings, and the combination with another keyword so that the product name to be changed does not affect the search result of another keyword after the change operation, thereby changing the order of application of the keyword.
When the change operation is performed, the following operation is specifically performed.
(1) First, the classification source and the classification destination after the change are compared, and it is determined which of the two classifications is searched preferentially and whether the application order of the product name (keyword) to be changed moves up or down (movement type determination processing).
(2) Next, the range in which interference is likely to occur due to the change processing is determined based on the result of the movement type determination processing (range determination processing). Specifically, depending on whether the application order of the product name to be changed moves up or down, it is determined whether the check is performed over the range of keywords having a higher priority or a longer character string length than the product name to be changed, or over the range of keywords having a shorter character string length.
(3) Then, in accordance with the range determination processing, the keywords included in the determined range are checked for the occurrence of interference. Specifically, keywords are extracted by referring to the dictionaries of the classification source to which the product name to be changed belongs and of the classification destination after the change (reverse extraction processing).
(4) Next, the keyword extracted by the reverse extraction processing is compared with the product name (keyword) to be changed, and the priority is adjusted or a search keyword is generated based on the priority or the character string length. In the present embodiment, since the number of priority levels is limited, the interference is eliminated as far as possible by generating a search keyword, and the priority is adjusted only when the interference cannot be eliminated by generating a search keyword alone. A search keyword is generated, for example, by appropriately connecting related keywords, such as AA1AA2 and AA1AA3, and the character string length is adjusted as desired by combining the search keyword with the original keywords. Because the dictionary search execution unit 15c performs the search with a plurality of keywords and applies them in descending order of their total string length, the order of application can be adjusted by generating a search keyword of the desired string length.
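Steps (1) to (4) can be sketched, in much simplified form, as follows. This is a hypothetical illustration rather than the described processing itself: `resolve`, `learn_move`, and the unit column names are assumptions, the generated search keyword is the connection of "AAA" and "BB" (which here equals the full product name), and the 10-level limit on priorities is not modelled.

```python
def _ordered(entries):
    # Higher priority first, then longer keyword first (stable for ties).
    return sorted(entries, key=lambda e: (-e[1], -len(e[0])))


def resolve(name, entries):
    """entries: (keyword, priority, destination) triples; return the first hit."""
    for keyword, priority, destination in _ordered(entries):
        if keyword in name:
            return keyword, priority, destination
    return None


def learn_move(name, target, entries):
    """Reflect a user's move of `name` to `target` in the keyword entries."""
    hit = resolve(name, entries)
    if hit is not None and hit[2] == target:
        return list(entries)  # nothing to change
    base_priority = hit[1] if hit else 1
    # First try a generated, longer search keyword at the same priority.
    candidate = list(entries) + [(name, base_priority, target)]
    if resolve(name, candidate)[2] == target:
        return candidate
    # Only if string length alone cannot win, raise the priority.
    return list(entries) + [(name, base_priority + 1, target)]


entries = [("AAA", 5, "col_AAA"), ("BB", 5, "col_BB")]
learned = learn_move("AAABB", "col_BB", entries)
print(resolve("AAABB", learned), resolve("AAACC", learned))
```

In this example the priorities are left unchanged, and a different product such as "AAACC" still resolves to the "AAA" column.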
The product information search unit 16 is a module that searches for product information for each main unit corresponding to the search condition by referring to the product master information database 21. In addition, the search conditions may be classified into categories 1 to 4, product names, and comment information, and may be searched for each store based on store identification information. In addition, for the searched product, the sales status and the like may be searched based on the product identification information.
(Product code analysis method)
By operating the product code analysis system having the above configuration, a product code analysis method in which records are aggregated into a unified database can be implemented. Fig. 5 is an explanatory diagram showing an outline of the product code analysis method of the present embodiment, fig. 6 is a flowchart showing a method of generating various dictionary data of the present embodiment, and fig. 7 and 8 are flowcharts showing a method of classifying product master information of the present embodiment.
As shown in fig. 5, first, in step S100, a process of constructing (creating) various dictionary data for analysis is executed, and then, in step S200 and step S300, if a record is input from each store, the records are classified and registered in a unified commodity master information database.
(1) Method for generating various dictionary data
A method of generating dictionary data will be described. As shown in fig. 6, first, the number of categories of product types (categories) is determined (S101). In the present embodiment, the classification is classified into classification 1 (business department), classification 2 (product group), classification 3 (more detailed product group), and classification 4 (item).
Next, the dictionary data generating unit 17 receives an input of a record as a sample (S102). The record may be information input from a product selection field or the like displayed on the browser, or may be information read from data recorded on a recording medium.
When the reception of the record input is completed, the dictionary data generating unit 17 extracts the words of the items of the categories 1 to 4, the product names, and the comment information of the record (S103). Then, the occurrence rate of keywords in each item is calculated, and a keyword having a high occurrence rate is set and accumulated in each dictionary database (S105). The keywords with low occurrence rates are associated with the keywords with high occurrence rates and stored in the respective dictionary databases (S106).
(2) Product classification method
Next, a method of classifying the product names of the records will be described. In the present embodiment, it is assumed that the order in which the dictionaries and keywords are applied and the combinations of keywords are determined in advance. This specification also includes setting the application order of the keywords based on the string length of each keyword and the string length of each keyword combination. In the present embodiment, the search starts from the keyword having the highest priority in the dictionary, and keywords having the same priority are searched starting from the one with the longest character string. The order of application of the dictionaries, keywords, and keyword combinations may likewise be set for the comment information.
First, as shown in fig. 7, when each record in the analysis target database 26 is input through the input interface 12 while maintaining the hierarchical structure (S201), the dictionary search execution unit 15c determines whether or not the JAN code is included in the record (S202). If the record includes the JAN code (yes in S202), it is determined whether the JAN code is registered in the official JAN table data in the JAN code database 25 (S203). When the JAN code is included in the JAN table data (yes in S203), the classification (classifications 1 to 4) of the product, the product name, and the comment information are determined and registered based on the JAN code (S210).
On the other hand, if the JAN code is not included in the formal JAN table data (no in S203), the temporary JAN table data is referred to, and it is determined whether or not the JAN code is included in the temporary JAN table data (S204).
When the JAN code exists in the temporary JAN table data (yes in S204), the assigned temporary category and temporary product name are selected and registered as the temporary category (S210). At this time, the result of the provisional classification is displayed on the display unit 13a, and the operation of changing the classification destination is accepted.
On the other hand, when the JAN code is not registered in the temporary JAN table data (no in S204), the dictionary search execution unit 15c extracts the words of the respective information registered in the record for each item, and decomposes the product name and the associated information character string in each record into word units by morphological analysis. Then, the collation function unit 15d causes the display unit 13a to display the notification information and accepts a user operation (S211). In accordance with the user operation, the collation function unit 15d registers the selected classification keyword in each dictionary and provisionally classifies and registers the product information in that classification (S210).
When the record contains no JAN code (no in S202), the provisional classification execution unit 15a provisionally classifies and registers the product name of each record in the analysis target database 26 input from the input interface 12, in accordance with the occurrence rate of the keyword of the classification name in the classification dictionary database 22. Specifically, the keyword of the classification name of each department is read (S205), the classification dictionary database 22 is read (S206), and it is determined whether or not the classification name of the record is registered in the classification dictionary database 22 (S207).
When the word in the record is registered in the classification dictionary database 22 (yes in S207), the record is provisionally classified and registered in the unit column of the keyword having a high occurrence rate, in accordance with the occurrence rate of the keyword (S209, S210). On the other hand, when the word in the record is not registered in the classification dictionary database 22 (no in S207), the keyword of the classification is newly registered in the dictionary (S208). Specifically, the dictionary search execution unit 15c extracts the words of the respective information registered in the record for each item, and decomposes the product names and associated information character strings in the records into word units by morphological analysis. Then, the collation function unit 15d causes the display unit 13a to display the notification information and accepts a user operation. In accordance with the user operation, the keywords of the classification are registered in each dictionary, and the product information is provisionally classified and registered in that classification (S210).
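The branching of steps S202 to S211 can be sketched as a single decision function. This is a minimal sketch under stated assumptions: `classify_record`, the record layout, the returned labels, and the JAN codes used are hypothetical, and the user operation is represented only by a label.

```python
def classify_record(record, formal_jan, temporary_jan, category_dict):
    """Decide how one record is classified (step numbers follow fig. 7)."""
    jan = record.get("jan")
    if jan is not None:                            # S202: record has a JAN code
        if jan in formal_jan:                      # S203: formal JAN table data
            return "formal", formal_jan[jan]
        if jan in temporary_jan:                   # S204: temporary JAN table data
            return "temporary", temporary_jan[jan]
        return "ask_user", None                    # S211: accept a user operation
    word = record["category"]
    if word in category_dict:                      # S207: word is in the dictionary
        return "dictionary", category_dict[word]   # S209, S210
    return "register_new", word                    # S208: register a new keyword


# Hypothetical table data and records.
formal_jan = {"4900000000001": "agricultural product"}
temporary_jan = {"4900000000002": "agricultural product"}
category_dict = {"vegetables": "agricultural product"}

results = [
    classify_record({"jan": "4900000000001"}, formal_jan, temporary_jan, category_dict),
    classify_record({"jan": "4900000000002"}, formal_jan, temporary_jan, category_dict),
    classify_record({"jan": "4900000000003"}, formal_jan, temporary_jan, category_dict),
    classify_record({"category": "vegetables"}, formal_jan, temporary_jan, category_dict),
    classify_record({"category": "fungi"}, formal_jan, temporary_jan, category_dict),
]
print(results)
```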
Next, as shown in fig. 8, the product name registration unit 15b performs a product name registration step of: for each record in the analysis target database 26, the product name of each record is registered in the unit column according to the occurrence rate of the keyword of the product name in the product name dictionary database 23.
Specifically, the record registered in the provisional classification performed in the provisional classification execution step is selected (S301), the product name dictionary database 23 is read for each unit column classified by the hierarchical structure (S302), and it is determined whether or not the product name is registered in the product name dictionary database 23 (S303).
When the selected product name is not registered in the product name dictionary database 23 (no in S303), a word of the product name is registered in the dictionary (S304), and then the product name is registered in the unit column (S306). Further, the word registration process in the dictionary is performed in the same manner as in steps S103 to S106. On the other hand, when the product name is registered in the product name dictionary database 23 (yes in S303), the registered product name is registered in the corresponding unit column in accordance with the appearance rate of the keyword of the product name (S305) (S306).
In the product name registration step, a provisional classification mode, in which the dictionary search of product names is performed based on the provisional classification registration in the provisional classification registration step, and a collation mode, in which the dictionary search is performed over all classifications regardless of the result of the provisional classification registration, are both executed, and a notification is issued when the results of the two modes differ. In this case, the dictionary search results of both modes are reflected in the corresponding dictionary based on the result of the collation step.
Next, the comment registration unit 15f performs the comment registration step of: for each record in the analysis target database 26, information associated with the product name of each record is registered in the unit column to which the product belongs, in accordance with the occurrence rate of the keyword in the annotation dictionary database 24.
Specifically, first, the information associated with the product name registered in the product name dictionary database 23 and the annotation dictionary database 24 stored in each unit column are read out (S307 and S308), and it is determined whether or not the word is registered in the dictionary (S309).
When the selected word is registered in the annotation dictionary database 24 (yes at S309), the word is assigned to the item (for example, "manufacturer", "brand", "place of production", "size", and "number of packages") of the registered annotation information, and the annotation information is registered (S311).
On the other hand, when the selected word is not registered in the annotation dictionary database 24 (no in S309), the annotation information is registered in the dictionary (S310), and the annotation information is registered in each item (S311). The word registration process in the dictionary is performed in the same manner as in steps S103 to S106. The comment registration unit 15f repeats the processing of steps S307 to S311 until no words remain in the record. Thereafter, the processing in steps S201 to S311 is repeated for the next record until no records remain.
(Product code analysis program)
The product code analysis system and the product code analysis method according to the present embodiment described above can be realized by executing a product code analysis program written in a predetermined language on a computer. That is, the program is installed in a portable terminal device having a portable telephone/communication function integrated with a portable information terminal (PDA), a server device disposed on a network to provide data or functions to a client side, a dedicated device such as a game device, or an IC chip, and executed on the CPU, whereby a system having the above-described functions can be easily constructed. The program may be distributed via a communication line, for example, or may be transferred as an application package running on a stand-alone computer.
Such a program may be recorded on a recording medium readable by a personal computer. Specifically, the program can be recorded on a magnetic recording medium such as a floppy (hard) disk or a cassette tape, an optical disk such as a CD-ROM or a DVD-ROM, or various other recording media such as a USB memory or a memory card.
(Action/Effect)
According to the present embodiment as described above, for each input record, the provisional classification execution unit 15a first provisionally classifies and registers the record in the unit column serving as its storage destination, according to the occurrence rate of the keywords of the classification names in the classification dictionary database 22. The product name registration unit 15b then replaces the provisionally registered product name with the unified keyword for that product name, according to the occurrence rate of the keywords of the product names in the product name dictionary database 23, and registers it.
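The two-stage registration can be illustrated with the following minimal sketch. It assumes that the occurrence rate is the fraction of a record's words that appear among the keywords of a given entry; the embodiment does not fix a formula in this passage, so this definition, like the names `occurrence_rate`, `best_match`, and `register_record`, is an assumption made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple


def occurrence_rate(words: List[str], keywords: Set[str]) -> float:
    """Assumed definition: share of the record's words found in `keywords`."""
    if not words:
        return 0.0
    return sum(1 for w in words if w in keywords) / len(words)


def best_match(words: List[str], dictionary: Dict[str, Set[str]]) -> Optional[str]:
    """Return the dictionary entry with the highest non-zero occurrence rate."""
    best_key, best_rate = None, 0.0
    for key, keywords in dictionary.items():
        rate = occurrence_rate(words, keywords)
        if rate > best_rate:
            best_key, best_rate = key, rate
    return best_key


def register_record(
    words: List[str],
    classification_dict: Dict[str, Set[str]],
    product_name_dict: Dict[str, Dict[str, Set[str]]],
) -> Tuple[Optional[str], Optional[str]]:
    """Stage 1: provisional classification. Stage 2: unified product name."""
    column = best_match(words, classification_dict)
    if column is None:
        return None, None
    # Only the product names of the provisional unit column are searched.
    unified_name = best_match(words, product_name_dict.get(column, {}))
    return column, unified_name
```

In this sketch the product name dictionary maps each unit column to unified product names and their keyword variants, so that differently written names from different stores resolve to one unified name.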
In particular, according to the present embodiment, the dictionary search execution unit 15c specifies the application order of each dictionary and each keyword, and the combination of keywords, used when the provisional classification execution unit 15a and the product name registration unit 15b calculate the occurrence rate of the keywords. Specifically, for example, even when the product name "AAABB" is registered and the product name dictionary includes both "AAA", which has a long character string length, and "BB", which has a short character string length, the dictionary search execution unit 15c can perform the search in order of character string length, starting from the longer "AAA", and can therefore prevent the product name "AAABB" from being registered under the classification of "BB". Alternatively, for example, a priority may be set for each keyword in each product column, and the search may be performed starting from the keyword with the higher priority.
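The "AAA" / "BB" example can be sketched as follows. The function name `classify`, the substring test, and the optional `priorities` mapping are assumptions made for illustration and are not taken from the embodiment.

```python
from typing import Dict, Optional


def classify(
    product_name: str,
    keyword_to_class: Dict[str, str],
    priorities: Optional[Dict[str, int]] = None,
) -> Optional[str]:
    """Apply keywords in order of priority, then of string length (long first)."""
    priorities = priorities or {}
    ordered = sorted(
        keyword_to_class,
        # Higher priority first; among equal priorities, longer keyword first.
        key=lambda kw: (-priorities.get(kw, 0), -len(kw)),
    )
    for keyword in ordered:
        if keyword in product_name:  # simple substring match (assumption)
            return keyword_to_class[keyword]
    return None


keywords = {"AAA": "class A", "BB": "class B"}
print(classify("AAABB", keywords))              # class A
print(classify("AAABB", keywords, {"BB": 10}))  # class B
```

With length ordering alone, "AAABB" is matched by "AAA" before "BB" is tried; assigning a higher priority to "BB" reverses the outcome, which corresponds to the priority-based alternative.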
In the present embodiment, the dictionary search execution unit 15c performs the determination using a combination of two or more keywords necessary for specifying the product name, the product form, and the like. Specifically, for example, related keywords such as AA1, AA2, and AA3 can be combined cyclically, as in AA1 × AA2, AA1 × AA3, AA2 × AA1, AA2 × AA3, AA3 × AA1, and AA3 × AA2, and a search can be performed with each combination. In this case, performing the search in descending order of the total string length of the combined keywords enables more appropriate classification. The dictionary search execution unit 15c can also be provided with a function of generating a new search keyword by appropriately connecting related keywords, such as AA1AA2 and AA1AA3. By combining such search keywords with the original keywords so as to adjust the string length arbitrarily before searching, the order of application of the decomposed keywords can be adjusted, and the analysis accuracy can be improved.
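One possible reading of the keyword connection function is sketched below: ordered pairs of related keywords are concatenated into new search keywords, pooled with the original keywords, and applied from the longest string to the shortest. The helper names `search_candidates` and `first_match` are hypothetical, and plain concatenation with a substring test is an assumption.

```python
from itertools import permutations
from typing import List, Optional, Sequence


def search_candidates(keywords: Sequence[str]) -> List[str]:
    """Original keywords plus all ordered pairs, longest string first."""
    # permutations(..., 2) yields AA1+AA2, AA1+AA3, AA2+AA1, ... in that order.
    joined = ["".join(pair) for pair in permutations(keywords, 2)]
    # sorted() is stable, so candidates of equal length keep their order.
    return sorted(list(keywords) + joined, key=len, reverse=True)


def first_match(text: str, keywords: Sequence[str]) -> Optional[str]:
    """Return the first (longest) candidate contained in `text`, if any."""
    for candidate in search_candidates(keywords):
        if candidate in text:
            return candidate
    return None


print(search_candidates(["AA1", "AA2", "AA3"]))
# ['AA1AA2', 'AA1AA3', 'AA2AA1', 'AA2AA3', 'AA3AA1', 'AA3AA2', 'AA1', 'AA2', 'AA3']
```

Because the connected keywords are longer than their parts, they are tried first, so a record containing "AA2AA1" is matched as a whole before the individual keywords "AA1" or "AA2" are applied.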
Further, according to the present embodiment, information other than the product name is registered in the unit column to which the product belongs by referring to the annotation dictionary, and therefore, the classification of the product and additional information other than the product name can be registered in association with each other.
Further, the present embodiment has a collation function that executes both the provisional classification mode and the collation mode and gives a notification when the results of the two modes differ. Therefore, for example, when a product name is used in common across different classifications, that fact is notified, and the classification to which the product name belongs can be determined reliably. Further, since a learning function is provided that reflects the handling of the notified result in each dictionary, the product can be assigned automatically at the next registration.
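The collation and learning functions can be illustrated by the following sketch. The data layout (unit column mapped to a set of keywords), the longest-keyword rule used inside `lookup`, and the names `collate` and `learn` are all assumptions made for illustration; the notification itself is represented only by the `mismatch` flag.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Set


class CollationResult(NamedTuple):
    provisional: Optional[str]  # result of the provisional classification mode
    overall: Optional[str]      # result of the collation mode (all columns)
    mismatch: bool              # True when a notification would be issued


def lookup(name: str, name_dict: Dict[str, Set[str]], columns: Iterable[str]) -> Optional[str]:
    """Return the column holding the longest keyword contained in `name`."""
    best_column, best_len = None, 0
    for column in columns:
        for keyword in name_dict.get(column, ()):
            if keyword in name and len(keyword) > best_len:
                best_column, best_len = column, len(keyword)
    return best_column


def collate(name: str, provisional_column: str, name_dict: Dict[str, Set[str]]) -> CollationResult:
    provisional = lookup(name, name_dict, [provisional_column])
    overall = lookup(name, name_dict, list(name_dict))
    return CollationResult(provisional, overall, provisional != overall)


def learn(name: str, chosen_column: str, name_dict: Dict[str, Set[str]]) -> None:
    """Reflect the operator's decision in the dictionary for later records."""
    name_dict.setdefault(chosen_column, set()).add(name)


names = {"fruit": {"orange"}, "beverages": {"orange juice"}}
result = collate("orange juice 1L", "fruit", names)
print(result.mismatch)  # True: the two modes disagree, so a notification is given
learn("orange juice 1L", "beverages", names)
```

After `learn` is called, the same product name resolves to "beverages" in both modes when it is provisionally classified there, so no further notification arises for it in this sketch.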
In the present embodiment, the dictionary search execution unit 15c decomposes the product name and the associated information character string in each record into word units, and applies each dictionary in units of the decomposed words. Therefore, for example, even when product names and product-related information are mixed together in the records input from a store, the records can be registered in the appropriate unit columns, because the provisional classification registration and the product name registration are performed on the smallest word units.
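A minimal sketch of the word-unit processing follows. It assumes that words are separated by whitespace, commas, or slashes; records in a language without explicit word boundaries would need a morphological analyzer instead, which is outside this sketch. The names `decompose` and `apply_dictionaries` and the "unregistered" bucket are hypothetical.

```python
import re
from typing import Dict, List, Sequence, Set, Tuple


def decompose(record_text: str) -> List[str]:
    """Split a record string into word units (delimiter set is an assumption)."""
    return [w for w in re.split(r"[\s,/]+", record_text) if w]


def apply_dictionaries(
    words: Sequence[str],
    dictionaries: Sequence[Tuple[str, Set[str]]],
) -> Dict[str, List[str]]:
    """Apply each dictionary, in the given order, to every decomposed word."""
    result: Dict[str, List[str]] = {name: [] for name, _ in dictionaries}
    result["unregistered"] = []
    for word in words:
        for name, keywords in dictionaries:
            if word in keywords:
                result[name].append(word)
                break
        else:
            # No dictionary contains the word.
            result["unregistered"].append(word)
    return result
```

For instance, `apply_dictionaries(decompose("tea Acme,500ml"), [("classification", {"tea"}), ("annotation", {"500ml"})])` routes "tea" and "500ml" to their dictionaries and leaves "Acme" as unregistered.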
[Modification example]
The above description of the embodiments is an example of the present invention. Therefore, the present invention is not limited to the above-described embodiments, and various modifications can be made in accordance with design and the like without departing from the technical spirit of the present invention.
For example, in the above-described embodiment, the input product information is provisionally classified and registered with reference to the classification dictionary database 22, and then registered in the unit column based on the product name dictionary database 23. However, the input product name may instead be registered directly in the unit column with reference to the product name dictionary database 23, without performing the provisional classification registration.
In this case, the same processing as the collation mode described above, in which a dictionary search is performed for all the classifications, is performed, and the input product name is compared with the keywords of all the classifications. The order of application of the keywords can then be selected arbitrarily, for example according to the priority, the character string length, or the combination of keywords.
In such a modification, the product names can be associated with categories 1 to 4 in advance, and therefore categories 1 to 4 can be automatically assigned to the aggregated commodity master information. In this case, the provisional registration process is omitted, and therefore the overall processing speed can be increased.
Description of the symbols
1 management server
2 database group
3 information processing terminal
11 communication interface
12 input interface
13 output interface
13a display unit
14 control unit
15 commodity information registration unit
15a provisional classification execution unit
15b product name registration unit
15c dictionary search execution unit
15d collation function unit
15e learning function unit
15f comment registration unit
15g keyword control unit
16 commodity information search unit
17 dictionary data generating unit
18 memory
21 commodity master information database
22 classification dictionary database
23 product name dictionary database
24 annotation dictionary database
25 JAN code database
26 analysis object database
Claims (10)
1. A product code analysis system that analyzes an analysis target database storing product names classified in layers as records and totals the product names according to a hierarchical structure, the product code analysis system comprising:
an input interface that inputs the analysis target database in a state where the hierarchical structure is maintained;
a classification dictionary that stores keywords of classification names in each hierarchy constituting the hierarchical structure in association with a unit column serving as a storage destination of each product name;
a product name dictionary that stores a keyword of a product name belonging to each unit column classified by the hierarchical structure in each unit column;
a provisional classification execution unit that provisionally classifies and registers, for each record in the analysis target database input from the input interface, a product name of each record in accordance with an occurrence rate of a keyword of a classification name in the classification dictionary;
a product name registration unit that registers, for each record in the analysis target database, a product name of each record in the unit column according to an appearance rate of a keyword for a product name in the product name dictionary, based on the provisional classification registration in the provisional classification execution unit; and
a dictionary search execution unit that specifies an application order of each dictionary and each keyword and a combination of keywords when calculating the occurrence rate of the keywords in the provisional classification execution unit and the product name registration unit;
wherein the product name registration unit has a collation function of: executing a provisional classification mode for performing dictionary search of the product name based on provisional classification registration performed by the provisional classification execution unit and a collation mode for performing dictionary search of all classifications regardless of the result of the provisional classification registration, and notifying the result when the results of the two modes are different.
2. The product code analysis system of claim 1,
the product code analysis system further includes:
an annotation dictionary that stores information associated with the product name registered in the product name dictionary in each unit column classified by the hierarchical structure; and
an annotation registration unit that registers, for each record in the analysis target database, information associated with the product name of each record in a unit column to which the product belongs, in accordance with the occurrence rate of the keyword in the annotation dictionary,
the dictionary search execution unit specifies each dictionary, an application order of each keyword, and a combination of keywords when calculating the occurrence rate of the keyword in the annotation registration unit.
3. The product code analysis system of claim 1,
the product code analysis system further includes: a learning function unit that reflects the dictionary search results of the two modes in the corresponding dictionary based on the result of the collation function.
4. The product code analysis system according to claim 1 or 2,
the dictionary search execution unit decomposes the product name and the associated information character string in each record into word units, and executes the application of each dictionary in the decomposed word units.
5. The product code analysis system according to claim 1 or 2,
the dictionary search execution unit further includes: a keyword control unit that sets an application order of the keywords based on a string length of each keyword and a string length of a keyword obtained by combining the keywords.
6. A product code analysis method for analyzing an analysis target database in which hierarchically classified product names are stored as records and totaling the product names according to a hierarchical structure, the product code analysis method comprising the steps of:
an input step of inputting the analysis object database through an input interface in a state where the hierarchical structure is maintained;
a provisional classification execution step of reading a classification dictionary in which keywords of classification names in each hierarchy constituting the hierarchical structure and unit columns serving as storage destinations of the respective product names are stored in association with each other, and provisionally classifying and registering the product names of the respective records in accordance with an appearance rate of the keywords of the classification names in the classification dictionary for the respective records in the analysis target database input from the input interface;
a product name registration step of reading out a product name dictionary storing a keyword of a product name belonging to each unit column in each unit column classified by the hierarchical structure, and registering, for each record in the analysis target database, a product name of each record in accordance with an appearance rate of the keyword of the product name in the product name dictionary in accordance with provisional classification registration in the provisional classification execution step; and
a dictionary search execution step of specifying an application order of each dictionary and each keyword and a combination of keywords when calculating the occurrence rate of the keyword in the provisional classification execution step and the product name registration step;
wherein the product name registration step includes a collation step in which: a provisional classification mode for performing dictionary search of the product name based on provisional classification registration performed in the provisional classification execution step and a collation mode for performing dictionary search of all classifications regardless of the result of the provisional classification registration are executed, and the result is notified when the results of both modes are different.
7. The product code analysis method according to claim 6,
the product code analysis method further includes: an annotation registration step of reading an annotation dictionary that stores information associated with the product name registered in the product name dictionary in each unit column classified by the hierarchical structure, and registering, for each record in the analysis target database, information associated with the product name of each record in a unit column to which the product belongs in accordance with the occurrence rate of the keyword in the annotation dictionary,
in the dictionary search execution step, when calculating the occurrence rate of the keyword related to the annotation dictionary, the application order of each dictionary and each keyword and the combination of keywords are specified.
8. The product code analysis method according to claim 6,
the product code analysis method further includes: a learning step of reflecting the dictionary search results of the two modes in the corresponding dictionary based on the result of the collation step.
9. The product code analysis method according to claim 6 or 7,
in the dictionary search execution step, the product name and the associated information character string in each record are decomposed into word units, and the application of each dictionary is executed in the decomposed word units.
10. The product code analysis method according to claim 6 or 7,
the dictionary search execution step further includes: a keyword control step of setting an application order of the keywords based on a character string length of each keyword and a character string length of a keyword obtained by combining the keywords.
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2013-104749 | 2013-05-17 | | |
| JP2013104749A JP5753217B2 (en) | | 2013-05-17 | Product code analysis system and product code analysis program |
| PCT/JP2014/063036 WO2014185507A1 (en) | | 2014-05-16 | Product code analysis system and product code analysis program |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1219552A1 | 2017-04-07 |
| HK1219552B | 2018-03-02 |