CN111159491A - Material duplicate checking data method - Google Patents

Publication number
CN111159491A
Authority
CN
China
Prior art keywords
processing
data
standard
checking
character string
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911299014.7A
Other languages
Chinese (zh)
Inventor
康利生
王磊
周涛
范海竹
李朝鹏
王海霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bohai Shipyard Group Co Ltd
Original Assignee
Bohai Shipyard Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-12-17
Filing date
2019-12-17
Publication date
2020-05-15
Application filed by Bohai Shipyard Group Co Ltd
Priority to CN201911299014.7A
Publication of CN111159491A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/90335: Query processing
    • G06F16/90344: Query processing by using string matching techniques

Abstract

The invention provides a method for generating duplicate-checking data for materials. When the duplicate-checking data are generated by splicing the fields of a material record (standard number, name, specification, material, material standard, and additional attributes), the data are processed as follows: (1) Standard number processing: the standard number entered by the user is converted into a canonical checking format, that is, the year is converted to four digits and the recommended-standard mark is removed. (2) Name processing: if a material name is already recorded in the system for the standard number, the recorded name is used. (3) Multiplication sign processing: specification data in which the multiplication sign is written incorrectly are corrected. (4) Parameter order processing: the parameters in the specification are put into a fixed order. (5) The data are spliced to generate an initial duplicate-checking character string. (6) Confusable characters are replaced. (7) Spaces are removed and letter case is unified. After the duplicate-checking character string has been processed in this way, duplicate records caused by non-standard formatting of the standard number, standard name, and specification can be effectively kept out of the database. The method is suitable for use as a material data query method.

Description

Material duplicate checking data method
Technical Field
The invention relates to a method for generating duplicate-checking data for materials. It produces identical duplicate-checking data for the same material even when the entered data differ, prevents duplicate materials from being created, and belongs to the technical field of inventory or stock management in computer data processing systems.
Background
For manufacturing enterprises, material management is a key area of enterprise informatization, and systems such as ERP, PDM, and MES all depend directly on material data. To facilitate material management, an enterprise information system generally needs to establish a material library and assign a material code to each material. Whatever coding scheme is used, one problem is unavoidable: when the data for a material are entered, differences in the entered characters cause the same material to be entered several times and assigned several codes.
The main way an information system judges whether a material is a duplicate is to splice the fields used for duplicate checking, such as standard number, name, specification, and material, into one character string, and then to treat two materials as the same when their strings are identical. Because differences in the entered characters produce different duplicate-checking strings, some processing is needed when the string is generated in order to reduce such duplicates. Existing information systems mainly remove spaces and convert all letters to uppercase or lowercase, which avoids duplicates caused by differences in spacing and letter case. However, this simple processing catches only a small share of entry errors, and a large number of duplicate materials still enter the system.
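The limitation of this existing approach can be illustrated with a minimal Python sketch. The function name `baseline_key` and the sample values are hypothetical and chosen only for illustration; the patent does not give an implementation of the existing method.

```python
def baseline_key(fields):
    """Existing approach as described above: splice, strip spaces, unify case."""
    joined = "".join(fields)
    # str.split() with no argument drops every run of whitespace.
    return "".join(joined.split()).upper()


# Differences in spacing and letter case are absorbed...
same = baseline_key(["gb 5782-2000", "bolt"]) == baseline_key(["GB5782-2000", "Bolt"])

# ...but a "/T" mark in the standard number still yields a different key.
different = baseline_key(["GB/T5782-2000", "Bolt"]) != baseline_key(["GB5782-2000", "Bolt"])
```

Both `same` and `different` evaluate to `True`, which is why the additional processing steps are needed.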
In practice, statistics show that the following causes of duplicate materials are the most common:
1. Standard numbers are not uniform
This problem is serious in practice. For example, the standard number of the throttle valve, CB/T315-93, is entered as CB315-93, CB/T315-1993, CB*315-93, and so on. The main causes are that the year may be written with 2 or 4 digits, and that recommended standards may or may not carry a mark such as "/T" or "*", so users often make mistakes when entering them.
2. The names of materials are not uniform
For example, the bolt material of GB/T5782-.
3. Errors caused by confusing characters
For example, the letter "I", the Roman numeral "Ⅰ", and the full-width letter "Ｉ" look alike, and the letter "O" resembles the digit "0". These confusable characters are often mistyped during data entry.
4. The multiplication sign (×) is replaced with other characters
For example, the bolt specification "M12×80" is also entered as "M12X80" and "M12*80".
5. The order of multiple parameters in the specification is not uniform
For example, the specifications of some pipe fittings and valves state both the pressure and the nominal diameter. "PN10 DN32" and "DN32 PN10" mean the same thing, but the generated duplicate-checking strings differ because the order differs.
To improve the accuracy of duplicate checking and reduce the entry of duplicate materials into the system, each of these situations must be handled accordingly.
Disclosure of Invention
To address the problem that the traditional method of generating a duplicate-checking string can only mask simple differences in spaces and letter case, the invention provides a method for generating duplicate-checking data for materials. The method generates identical duplicate-checking data for the same material even when the entered data differ, which further improves duplicate-checking accuracy, reduces the creation of duplicate materials in the information system, and solves the technical problem of material duplicate checking.
The technical solution adopted by the invention is as follows:
When the duplicate-checking data are generated by splicing the fields of a material record (standard number, name, specification, material, material standard, and additional attributes), the data are processed as follows:
(1) Standard number processing: the standard number entered by the user is converted into a canonical checking format, that is, the year is converted to four digits and the recommended-standard mark is removed.
(2) Name processing: if a material name is already recorded in the system for the standard number, the recorded name is used.
(3) Multiplication sign processing: specification data in which the multiplication sign is written incorrectly are corrected.
(4) Parameter order processing: the parameters in the specification are put into a fixed order.
(5) The data are spliced to generate an initial duplicate-checking character string.
(6) Confusable characters are replaced.
(7) Spaces are removed and letter case is unified.
Positive effects: after the duplicate-checking string has been processed by the invention, duplicate records caused by non-standard formatting of the standard number, standard name, and specification are effectively kept out of the database, which greatly helps the data cleanliness of the enterprise's material-management information systems. The method is suitable for use as a material data query method.
Drawings
FIG. 1 is a flow chart of generating a duplicate checking character string according to the present invention.
Detailed Description
The material warehousing information is processed as follows:
(1) Standard number processing: the standard number entered by the user is converted into a canonical checking format, that is, the year is converted to four digits and the recommended-standard mark is removed.
(2) Name processing: if a material name is already recorded in the system for the standard number, the recorded name is used.
(3) Multiplication sign processing: specification data in which the multiplication sign is written incorrectly are corrected.
(4) Parameter order processing: the parameters in the specification are put into a fixed order.
(5) The data are spliced to generate an initial duplicate-checking character string.
(6) Confusable characters are replaced.
(7) Spaces are removed and letter case is unified.
After the character string has been spliced, its confusable characters are replaced, and the process ends after space and case processing.
The method comprises standard number processing, name processing, multiplication sign processing, parameter order processing, character string splicing, confusable character replacement, and space and case processing.
Standard number processing mainly unifies the year to 4 digits and removes the various recommended-standard marks. If the return value is an empty string, the standard field contains a non-standard number that needs no processing, and the original content is used.
Name processing looks up the standard name in a pre-built data table using the processed standard number and uses the standard name if one exists. After this step, some non-uniform material names are standardized. For example, the material names of serial numbers 1 and 2 both become "Bolt".
Multiplication sign processing checks the material data, mainly the specification column, for irregular use of the multiplication sign; the check can be implemented with a function.
Parameter order processing: assuming the order of the valve pressure and nominal diameter marks PN and DN is defined as PN first and DN second, the specification data of serial numbers 3 and 4 are both processed into "PN25 DN32".
Character string splicing joins the data to be checked with a connecting character to generate the duplicate-checking string.
Confusable character replacement extracts the list of confusable characters from the confusable-character library and replaces the confusable characters in the duplicate-checking string with the specified characters.
Space and case processing removes spaces and unifies letter case.
Example 1: the specific implementation steps are as follows:
Serial number   Standard number   Name                Specification   Material        ……
1               GB5782-2000       Hexagon head bolt   M12X100         Grade 8.8
2               GB/T5782-2000     Bolt                M12*100         Grade 8.8
3               CB 598-90         Stop valve IV       DN32 PN25       Assembly part
4               cb* 598-1990      Stop valve Ⅳ        PN25 DN32       Assembly part
Table 1 Example data
1. Standard number processing:
Standard number processing mainly unifies the year to 4 digits and removes the various recommended-standard marks. It can be implemented with the following C# code:
Code 1:
[Code 1 figure: C# listing not reproduced]
If the return value is an empty string, the standard field contains a non-standard number that needs no processing, and the original content is used. After processing, the standard numbers of the four records become GB5782-2000 (serial numbers 1 and 2) and CB 598-1990 (serial numbers 3 and 4).
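As an illustration of this step, here is a minimal Python sketch; it is not the patent's C# listing. It assumes a standard number consists of a letter prefix, an optional "/T" or "*" mark, a serial number, a hyphen, and a 2- or 4-digit year. The function name, the regular expression, the uppercase prefix, the removal of inner spaces, and the rule that 2-digit years belong to the 1900s are all assumptions made for this example.

```python
import re

# Assumed shape: prefix, optional "/T" or "*" mark, serial number, "-", 2- or 4-digit year.
_STD_RE = re.compile(
    r"^\s*([A-Za-z]+)\s*(?:/\s*[Tt]|\*)?\s*(\d+(?:\.\d+)*)\s*-\s*(\d{2}|\d{4})\s*$"
)


def normalize_standard_number(raw: str) -> str:
    """Return the canonical form, or "" when the text is not a standard number."""
    match = _STD_RE.match(raw)
    if match is None:
        return ""  # caller keeps the original field content
    prefix, serial, year = match.groups()
    if len(year) == 2:
        year = "19" + year  # assumption: 2-digit years are 19xx
    return f"{prefix.upper()}{serial}-{year}"


for entered in ["GB5782-2000", "GB/T5782-2000", "CB 598-90", "cb* 598-1990"]:
    print(entered, "->", normalize_standard_number(entered))
```

An empty return value plays the role described above: the field does not hold a recognizable standard number, so the caller would keep the original text.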
2. Name processing:
The standard name is looked up in a pre-built data table using the processed standard number, and the standard name is used if one exists. After this step, some non-uniform material names are standardized. For example, the material names of serial numbers 1 and 2 both become "Bolt".
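The lookup can be sketched in Python as a dictionary keyed by the processed standard number. The table contents and the names `STANDARD_NAMES` and `resolve_name` are hypothetical; in a real system the table would presumably live in the database.

```python
# Hypothetical pre-built table: processed standard number -> registered material name.
STANDARD_NAMES = {
    "GB5782-2000": "Bolt",
}


def resolve_name(standard_no: str, entered_name: str, table=STANDARD_NAMES) -> str:
    """Use the registered name for the standard if one exists, else keep the entry."""
    return table.get(standard_no, entered_name)


print(resolve_name("GB5782-2000", "Hexagon head bolt"))
print(resolve_name("CB598-1990", "Stop valve IV"))
```

Falling back to the entered name keeps materials without a registered standard name unchanged.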
3. Multiplication sign processing:
The material data, mainly the specification column, are checked for irregular use of the multiplication sign. The check can be implemented with the following function:
Code 2:
[Code 2 figure: listing not reproduced]
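A minimal Python sketch of such a check follows; it is not the patent's Code 2. It assumes that an irregular multiplication sign is the letter "x" or "X" or an asterisk standing between two digits, and that "×" is the canonical sign. Both function names and the pattern are assumptions.

```python
import re

# Assumed irregular forms: "x", "X" or "*" between two digits, with optional spaces.
_IRREGULAR_TIMES = re.compile(r"(?<=\d)\s*[xX*]\s*(?=\d)")


def has_irregular_times(spec: str) -> bool:
    """Report whether the specification uses a substitute for the "×" sign."""
    return _IRREGULAR_TIMES.search(spec) is not None


def fix_times(spec: str) -> str:
    """Replace each substitute character with the canonical "×" sign."""
    return _IRREGULAR_TIMES.sub("×", spec)


for spec in ["M12X100", "M12*100", "M12 x 80", "DN32 PN25"]:
    print(spec, has_irregular_times(spec), fix_times(spec))
```

The detection function fits a workflow in which the user is asked to correct the entry, while `fix_times` shows an automatic correction as an alternative.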
4. Parameter order processing:
Assuming the order of the valve pressure and nominal diameter marks PN and DN is defined as PN first and DN second, the specification data of serial numbers 3 and 4 are both processed into "PN25 DN32".
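The reordering can be sketched in Python as follows. The sketch handles only the PN and DN marks from the example; the names `PARAM_ORDER` and `order_parameters`, the pattern, and the choice to place any remaining text after the ordered parameters are assumptions.

```python
import re

PARAM_ORDER = ("PN", "DN")  # PN first, DN second, as in the example
_PARAM_RE = re.compile(r"\b(PN|DN)\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def order_parameters(spec: str) -> str:
    """Rewrite a specification so that PN precedes DN when both are present."""
    found = {m.group(1).upper(): m.group(2) for m in _PARAM_RE.finditer(spec)}
    if any(mark not in found for mark in PARAM_ORDER):
        return spec  # nothing to reorder
    rest = _PARAM_RE.sub("", spec).split()  # text other than the two parameters
    ordered = [f"{mark}{found[mark]}" for mark in PARAM_ORDER]
    return " ".join(ordered + rest)


print(order_parameters("DN32 PN25"))
print(order_parameters("PN25 DN32"))
```

Specifications that do not contain both marks, such as a bolt size, are returned unchanged.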
5. Character string splicing:
The data to be checked are spliced with a connecting character to generate the duplicate-checking string. The connecting character is generally an uncommon special symbol; this example uses "■". The duplicate-checking strings of the four materials are as follows:
[Table 2 figure: not reproduced]
Table 2 Spliced duplicate-checking strings
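A minimal Python sketch of the splicing step is shown here, assuming each material is held as a dictionary. The field keys in `FIELD_ORDER` and the function name are hypothetical; the separator "■" and the field order (standard number, name, specification, material, material standard, additional attributes) follow the text.

```python
SEPARATOR = "■"  # uncommon symbol used as the connecting character
FIELD_ORDER = ("standard_no", "name", "spec", "material", "material_standard", "extra")


def splice_fields(record: dict, keys=FIELD_ORDER, sep=SEPARATOR) -> str:
    """Join the duplicate-checking fields in a fixed order; missing fields are empty."""
    return sep.join(str(record.get(key, "")) for key in keys)


row = {"standard_no": "GB5782-2000", "name": "Bolt", "spec": "M12×100", "material": "Grade 8.8"}
print(splice_fields(row))
```

One reason to use a separator that rarely occurs in real data is that it keeps field boundaries visible in the key: without it, the fields "AB" + "C" and "A" + "BC" would produce the same string.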
6. Confusable character replacement:
The list of confusable characters is extracted from the confusable-character library, and the confusable characters in the duplicate-checking string are replaced with the specified characters. Table 3 is an example of a confusable-character table.
[Table 3 figure: not reproduced]
Table 3 Confusable-character table
After the replacement, the Roman numeral "Ⅳ" in the material name of serial number 4 is replaced with the two letters "IV".
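The replacement can be sketched in Python with a translation table. The mapping below is hypothetical and much smaller than a real confusable-character library would be; it covers a few Roman numerals, full-width letters, and the letter "O" versus the digit "0".

```python
# Hypothetical confusable-character list: each key is replaced by its value.
CONFUSABLE_MAP = {
    "Ⅰ": "I", "Ⅱ": "II", "Ⅲ": "III", "Ⅳ": "IV", "Ⅴ": "V",  # Roman numeral code points
    "Ｉ": "I", "Ｏ": "0",                                    # full-width letters
    "O": "0", "o": "0",                                     # letter O versus digit 0
}
_TABLE = str.maketrans(CONFUSABLE_MAP)


def replace_confusables(text: str) -> str:
    """Replace every listed confusable character with its specified substitute."""
    return text.translate(_TABLE)


print(replace_confusables("VALVE Ⅳ"))
print(replace_confusables("DN5O"))
```

Mapping the letter "O" to the digit "0" makes the key harder to read, which is acceptable under the assumption that the key is used only for comparison and never shown to users. Both the uppercase and lowercase letter are mapped so that the result does not depend on the later case conversion.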
7. Space and case processing:
After spaces are removed and letter case is unified, the final duplicate-checking strings are generated, as shown in Table 4:
[Table 4 figure: not reproduced]
Table 4 Final duplicate-checking strings
When duplicate checking is performed at material entry, materials 1 and 2, and materials 3 and 4, produce the same duplicate-checking string despite their different input data. The second entry of each pair fails the duplicate check, so each material can be entered only once, which achieves the purpose of duplicate checking.
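The whole sequence can be illustrated end to end with a compact Python sketch applied to the four rows of Table 1. Everything in it is an assumption made for illustration: the regular expressions, the one-entry name table, the small confusable map, the 19xx rule for 2-digit years, and the automatic correction of the multiplication sign (the claims describe returning such entries to the user instead).

```python
import re

STD_RE = re.compile(r"^\s*([A-Za-z]+)\s*(?:/\s*[Tt]|\*)?\s*(\d+)\s*-\s*(\d{2}|\d{4})\s*$")
TIMES_RE = re.compile(r"(?<=\d)\s*[xX*]\s*(?=\d)")
PARAM_RE = re.compile(r"\b(PN|DN)\s*(\d+)", re.IGNORECASE)
NAMES = {"GB5782-2000": "Bolt"}  # hypothetical standard-name table
CONFUSABLES = str.maketrans({"Ⅳ": "IV", "Ｉ": "I", "O": "0", "o": "0"})  # hypothetical
SEP = "■"


def dedup_key(std: str, name: str, spec: str, material: str) -> str:
    # (1) standard number: drop "/T" or "*" mark, four-digit year (assumed 19xx)
    m = STD_RE.match(std)
    if m:
        prefix, serial, year = m.groups()
        year = "19" + year if len(year) == 2 else year
        std = f"{prefix.upper()}{serial}-{year}"
    # (2) name: prefer the name registered for the standard
    name = NAMES.get(std, name)
    # (3) multiplication sign: normalize substitutes to "×"
    spec = TIMES_RE.sub("×", spec)
    # (4) parameter order: PN before DN
    params = {mark.upper(): value for mark, value in PARAM_RE.findall(spec)}
    if all(mark in params for mark in ("PN", "DN")):
        rest = PARAM_RE.sub("", spec).split()
        spec = " ".join([f"PN{params['PN']}", f"DN{params['DN']}"] + rest)
    # (5) splice, (6) replace confusable characters, (7) strip spaces and unify case
    key = SEP.join([std, name, spec, material]).translate(CONFUSABLES)
    return "".join(key.split()).upper()


rows = [
    ("GB5782-2000", "Hexagon head bolt", "M12X100", "Grade 8.8"),
    ("GB/T5782-2000", "Bolt", "M12*100", "Grade 8.8"),
    ("CB 598-90", "Stop valve IV", "DN32 PN25", "Assembly part"),
    ("cb* 598-1990", "Stop valve Ⅳ", "PN25 DN32", "Assembly part"),
]
keys = [dedup_key(*row) for row in rows]
```

Under these assumptions, `keys[0] == keys[1]` and `keys[2] == keys[3]`, so the second record of each pair would be rejected as a duplicate, while the bolt and the valve keep distinct keys.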

Claims (8)

1. A method for generating duplicate-checking data for materials, characterized in that: the method comprises standard number processing, name processing, multiplication sign processing, parameter order processing, character string splicing, confusable character replacement, and space and case processing;
(1) standard number processing: converting the standard number entered by the user into a canonical checking format, that is, converting the year to four digits and removing the recommended-standard mark;
(2) name processing: if a material name is already recorded in the system for the standard number, using the recorded material name;
(3) multiplication sign processing: processing specification data in which the multiplication sign is used incorrectly;
(4) parameter order processing: processing the order of the parameters in the specification;
(5) character string splicing: splicing the data to generate an initial duplicate-checking character string;
(6) confusable character replacement: replacing the confusable characters;
(7) space and case processing: removing spaces and converting letter case;
wherein, after standard number processing and name processing, multiplication sign processing is performed; an incorrect entry is returned to multiplication sign processing after being handled by the user; once the entry is correct, the parameters are arranged in order and the character string is spliced; the confusable characters in the spliced string are replaced; and the process ends after space and case processing.
2. The method for generating duplicate-checking data for materials as claimed in claim 1, wherein: the standard number processing mainly unifies the year to 4 digits and removes the various recommended-standard marks, and is implemented in C# code; if the return value is an empty string, the standard field contains a non-standard number that needs no processing, and the original content is used.
3. The method for generating duplicate-checking data for materials as claimed in claim 1, wherein: the name processing looks up the standard name in a pre-built data table using the processed standard number and uses the standard name if it exists; after this step, some non-uniform material names are standardized.
4. The method for generating duplicate-checking data for materials as claimed in claim 1, wherein: the multiplication sign processing checks the material data, mainly the specification column, for irregular use of the multiplication sign, and the check can be implemented with a function.
5. The method for generating duplicate-checking data for materials as claimed in claim 1, wherein: the parameter order processing assumes that the order of the valve pressure and nominal diameter marks PN and DN is PN first and DN second, and the specification data of serial numbers 3 and 4 are processed into: PN25 DN32.
6. The method for generating duplicate-checking data for materials as claimed in claim 1, wherein: the character string splicing joins the data to be checked with a connecting character to generate the duplicate-checking character string.
7. The method for generating duplicate-checking data for materials as claimed in claim 1, wherein: the confusable characters in the duplicate-checking character string are replaced with the specified characters.
8. The method for generating duplicate-checking data for materials as claimed in claim 1, wherein: the space and case processing ends after spaces are removed and letter case is converted.
CN201911299014.7A 2019-12-17 2019-12-17 Material duplicate checking data method Pending CN111159491A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911299014.7A CN111159491A (en) 2019-12-17 2019-12-17 Material duplicate checking data method

Publications (1)

Publication Number Publication Date
CN111159491A (en) 2020-05-15

Family

ID=70557313

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911299014.7A Pending CN111159491A (en) 2019-12-17 2019-12-17 Material duplicate checking data method

Country Status (1)

Country Link
CN (1) CN111159491A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6098892A (en) * 1998-05-27 2000-08-08 Peoples, Jr.; Max J. Device for conversion from a pharmaceutical identification number to a standardized number and method for doing the same
CN101183373A (en) * 2007-12-17 2008-05-21 渤海船舶重工有限责任公司 Encode coding method of marine vehicle material computer management
CN202177927U (en) * 2011-07-29 2012-03-28 罗勇 Material uniformly-coding checking management system
CN107122947A (en) * 2017-05-04 2017-09-01 四川省红地科技有限责任公司 A kind of method of materiel code generation and management based on material catalogue
CN110309132A (en) * 2019-05-08 2019-10-08 广东中建普联科技股份有限公司 A kind of ration standard method of priced bill of quantities

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
韩新伟 等: "多层构架的物资编码管理系统的设计与实现", 计算机工程与设计 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination