CN111460777B

CN111460777B - A DUS testing method for plant varieties

Info

Publication number: CN111460777B
Application number: CN202010172640.6A
Authority: CN
Inventors: 付深造; 徐东辉; 杨坤
Original assignee: Institute of Vegetables and Flowers Chinese Academy of Agricultural Sciences
Current assignee: Institute of Vegetables and Flowers Chinese Academy of Agricultural Sciences
Priority date: 2020-03-12
Filing date: 2020-03-12
Publication date: 2023-11-14
Anticipated expiration: 2040-03-12
Also published as: CN111460777A

Abstract

A DUS testing method for plant varieties, comprising conducting plant growing trials, collecting and processing trial data in a unified format, and performing DUS analysis. The method optimizes DUS trial design and makes DUS test results more objective. Validity checks on the data and on the statistical assumptions, as well as the DUS data analysis itself, can be carried out efficiently in Excel.

Description

A DUS testing method for plant varieties

Technical Field

The present invention belongs to the technical field of DUS testing of plant varieties. It relates mainly to methods for processing and analyzing the data obtained in DUS testing, reaching DUS decisions, and optimizing DUS test guidelines and trial design.

Background Art

Testing of plant varieties for distinctness, uniformity and stability (DUS testing) is the process of evaluating the distinctness, uniformity and stability of a plant variety through growing trials or laboratory analysis, using the appropriate testing techniques and standards. DUS testing is the basic technical foundation of variety management in all countries and a prerequisite for the protection of new plant varieties and for variety approval or registration. It is of great significance for plant selection and breeding, for the development of the seed sector, and for agricultural and forestry production.

Distinctness means that a plant variety is clearly distinguishable from known varieties in one or more traits. Uniformity means that, apart from expected natural variation, the relevant characteristics are expressed consistently among the individuals of a variety. Stability means that the main traits of a variety remain unchanged after repeated propagation or at the end of a particular propagation cycle.

Internationally, the history of DUS testing runs in parallel with that of the plant variety protection system. In February 1957, the French government invited 12 Western European countries to a diplomatic conference held in Paris in May of that year to discuss establishing a dedicated system for protecting new plant varieties. At the second diplomatic conference, held in Paris at the end of 1961, the "International Convention for the Protection of New Varieties of Plants" was adopted, and the "International Union for the Protection of New Varieties of Plants (UPOV)" was established on that basis. The convention was revised three times, in 1972, 1978 and 1991, producing the most complete set of DUS terms and definitions to date. Drawing on the DUS testing experience of its members, UPOV has drafted and gradually adopted 15 TGP technical documents and 323 DUS test guidelines. These comprehensively standardize the basic concepts and principles of DUS testing, the identification of known varieties and the construction and maintenance of variety collections, testing experience and cooperation, the development of DUS test guidelines, trial design and statistical analysis, DUS examination procedures, guidance on testing new plant types, and the principles for applying molecular techniques.

On March 20, 1997, China promulgated the Regulations on the Protection of New Plant Varieties, which made DUS a necessary substantive condition for granting variety rights. The regulations as a whole were drafted with reference to the 1978 Act of the UPOV Convention, while the definition of DUS follows the 1991 Act. On April 23, 1999, China acceded to the 1978 Act of the UPOV Convention and became the 39th member of UPOV; in the same year it began accepting applications for new plant variety rights. On September 29, 2000, the former Ministry of Agriculture established one national DUS testing center and 14 sub-centers to carry out DUS testing of new plant varieties. On November 4, 2015, the newly revised Seed Law brought the protection of new plant varieties within the scope of variety management for the first time and made DUS testing a precondition for variety approval and registration. As the DUS testing workload kept growing, the former Ministry of Agriculture later established a further 13 sub-centers and 3 testing stations. To date, the Ministry of Agriculture and Rural Affairs has issued a total of 159 plant protection catalogs, placed 5 crops, including corn, in the variety approval catalog, and placed 29 crops, including potato, in the variety registration catalog.

Although UPOV has published a complete set of DUS testing guidance documents, it does not provide concrete operating procedures. The UK's DUST statistical analysis software and France's GAIA variety management software are currently the only two programs shared free of charge within UPOV. The former covers only statistical analysis and is cumbersome to operate, while the latter covers only the analysis of variety descriptions and relies on highly subjective weight settings. Both are therefore limited in use.

In recent years, national representatives have presented their own data analysis methods at the annual meetings of the UPOV Technical Working Parties. These methods concentrate on statistical analysis, their functions are scattered, and they involve many operating steps. In particular, they lack quality control for the raw data, which leads to unstable results and low analysis efficiency.

Through sustained research, the inventors have devised a comprehensive DUS analysis method. With a single fixed set of trait parameters, data from different locations and years can be pooled and analyzed to obtain scientifically sound and accurate test results, and the analysis is considerably faster.

Summary of the Invention

In order to analyze DUS test data of plant varieties efficiently, to raise the standard of DUS trials and the quality of their evaluation, and to provide a comprehensive and accurate basis for DUS decisions, the inventors have developed a set of methods for DUS trials of plant varieties and for the evaluation and analysis of the resulting data. The data analysis methods can be implemented conveniently in Excel.

Accordingly, a DUS testing method for plant varieties is provided, comprising conducting a plant growing trial, collecting and processing trial data in a unified format, and performing DUS analysis, wherein the variety types in the growing trial include candidate varieties, standard varieties and, optionally, similar varieties; characterized in that collecting and processing the trial data in a unified format comprises: preparing unified tables; setting a unified data format (such as the data type), value range and, optionally, unit of measurement; and checking the validity of the trial data and of the statistical assumptions;

Checking the validity of the trial data includes checking that each value matches the configured data format, value range and/or optional unit. A program performs this check automatically during or after data entry. An entry that does not match the configured data format, value range and/or optional unit is treated as an abnormal value. Abnormal values are flagged automatically, and the original records or field samples are then checked manually. If the value is an input error, it is corrected directly; if it reflects what was actually observed, it is retained and the subsequent steps proceed.

The growing trial may consist of a single trial period. Preferably, it comprises two or more trial periods. For annual or biennial plants, a complete growing cycle runs from sowing to harvest; for perennial plants, it runs from bud break to harvest in a year of normal flowering and fruiting. Plant growth is strongly affected by temperature, rainfall and light and can differ considerably between years, so a single year is often insufficient for an accurate variety description or an accurate assessment of differences. DUS testing therefore generally requires two growing cycles. Crops or varieties with poor uniformity and small differences between varieties (such as forage grasses) often need three growing cycles, whereas for crops or varieties that are vegetatively propagated, grown in a controlled environment (greenhouse) and clearly different from one another (such as Phalaenopsis), the DUS test can be completed in one growing cycle.

Stability is generally tested by growing seed of different generations of the same variety and analyzing the results. If the traits of the later generation are expressed in the same way as those of the earlier generation, and both generations are uniform, the variety is stable. If a variety has been shown to be uniform in a given trial, this also implies that it is stable.

The uniformity and stability of a hybrid variety can also be judged by testing the uniformity or stability of its parents.

Known varieties are plant varieties for which an application has been accepted, which have passed variety approval, variety registration or new variety protection, or which have already been sold or promoted. Candidate varieties are varieties for which variety rights protection, approval or registration has been applied for, or varieties sampled from the market for evaluation. Similar varieties are varieties selected from the variety collection for the purpose of distinctness testing; they resemble the candidate variety in phenotype or molecular characteristics and need further verification in a field growing trial. Standard varieties are known varieties used in growing trials to assess environmental effects and to illustrate the expression states of traits.

By type of expression, traits are divided into qualitative traits (QL), pseudo-qualitative traits (PQ) and quantitative traits (QN). By type of observation, they are divided into visual observation (V) and measurement (M). By type of record, they are divided into group (G) and individual (S). Combining observation type and record type gives group visual observation (VG), group measurement (MG), individual visual observation (VS) and individual measurement (MS).

Expression state: in the DUS test guidelines or standards for plant varieties, the range of expression of each tested trait is divided into a series of expression states. To make trait definitions and descriptions consistent, each expression state is assigned a numeric code, which simplifies data recording, data processing and variety description.

Preferably, preparing unified tables includes preparing a unified parameter table containing at least the following fields: code, standard value, expression state, standard variety, trait number, trait name, and value type. Preferably, the parameter table also contains one or more of the following fields: expression type, observation type, observation time, unit of measurement, grading value, maximum value, minimum value, code index, grading value index, grouping, weight, threshold, and photo.

Of these, the trait number, code, expression state, standard variety, trait name, expression type, observation type, observation time, unit of measurement, value type, maximum value and minimum value can be preset from the DUS test guidelines.

The standard value is determined from the DUS test guidelines or by other methods.

The measured values of the standard varieties are the data recorded in the current trial.

The code index is an identifier assigned to each code. It can be formed as "trait number*10000+code", so that the corresponding information can later be retrieved from the trait number and code.

The grading value index is a second identifier assigned to each code. It can be formed as "trait number*10000+grading value", which allows the interval method to convert raw values into codes.
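The index formula above is plain arithmetic. The following is a minimal Python sketch, assuming integer trait numbers and codes smaller than 10000; the function names are illustrative and are not taken from the patent.

```python
def code_index(trait_number, code):
    """Combine a trait number and a code as trait_number * 10000 + code."""
    return trait_number * 10000 + code


def split_code_index(index):
    """Recover (trait_number, code) from a code index (assumes code < 10000)."""
    return divmod(index, 10000)


example = code_index(12, 5)  # hypothetical trait 12 with code 5
```

Because the code occupies the last four digits, a single lookup key identifies both the trait and the code, which is convenient for spreadsheet lookups.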

The grading value defines the grading interval that maps a raw value to a code; it is the lower bound of the grading interval belonging to each code.

The code index, grading value index and grading value are calculated automatically from the standard values and measured values using preset formulas.

Grouping: used in the grouping procedure. Grouping is based on the grouping traits recorded in the DUS test guidelines. In view of how well they work in practice, however, qualitative and/or pseudo-qualitative traits whose expression states are discrete, easy to tell apart and accurately observable may be chosen as grouping traits instead.

Weight: used when calculating variety value and set on the basis of experience. For example, an important trait such as the growth period is given a weight of 3, and a less important trait a weight of 1. The weights are applied when scoring variety value, which assesses from a technical standpoint how worthwhile it is to grow the variety.

Threshold: used when screening similar varieties by the threshold method, and set empirically from the variation observed in data over several years. For example, the threshold is preset to 0 for qualitative traits, 1 for pseudo-qualitative traits and 2 for quantitative traits. If a preset threshold proves unsuitable after more samples have been added or the techniques have improved, it can be adjusted.

Photo: each type of photographed subject is manually assigned a specific character string (for example a number) in advance, and photos are named with that string. Corn photos, for instance, fall into five types: seedling, plant, tassel, silk and ear, numbered 1, 2, 3, 4 and 5 respectively. The photos are named with these numbers, so that 1.jpg is the seedling photo.

To simplify analysis and processing, photo storage can be organized by creating folders at each level in a uniform way, for example photos\corn\2019\variety name, and storing each photo in the folder of the corresponding variety.

Following this correspondence, the string used to name a photo (such as its number) is entered in the photo field of the relevant trait and linked to the photo.

Interfacing with other systems sometimes requires photo or folder names to be changed in bulk. In that case, a table is prepared with the fields old name, file type, file address and new name, the new names are entered, and a program that links to the folders and/or photos renames the files or folders in a batch.

As an example, see Table 1.

Table 1

The data format (such as the data type), value range and optional unit can be set with the built-in functions of the spreadsheet or database. Taking Excel cells as an example, the available validation types include any value, whole number, decimal, list, date, time and text length. For coded visual observations, choose list and enter the permitted code values as the source. For continuous measurements, choose decimal and enter the permitted minimum and maximum. For discrete measurements, choose whole number and enter the permitted minimum and maximum. For dates, choose date and enter the start and end dates. For color chart readings, choose text length and enter a minimum length of 4 and a maximum length of 5, and so on. Taking corn plant height as an example, the value range can be set to 30-500, with a unit such as cm set as needed.
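The same kind of rule can be expressed outside a spreadsheet. The sketch below is a hypothetical Python analogue of such validation settings, not the Excel feature itself; the `FieldRule` class and the rule names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FieldRule:
    kind: str                        # "whole", "decimal" or "list" (illustrative names)
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    allowed: Tuple = ()              # permitted codes when kind == "list"


def is_valid(value, rule):
    """Return True when the entry satisfies the rule; False marks an abnormal value."""
    if rule.kind == "list":
        return value in rule.allowed
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False                 # wrong data type
    if rule.kind == "whole" and not float(value).is_integer():
        return False
    return rule.minimum <= value <= rule.maximum


plant_height_cm = FieldRule("decimal", 30, 500)      # range taken from the text
leaf_code = FieldRule("list", allowed=(1, 3, 5, 7, 9))  # hypothetical code list

flags = [is_valid(v, plant_height_cm) for v in (182.5, 25, "tall")]
```

Entries for which `is_valid` returns False would be flagged for manual checking against the original records rather than deleted.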

Preferably, data are collected in tables of a unified format, for example a horizontal data table or a vertical data table, and preferably a horizontal data table. In the horizontal data table, the fields tested, variety, trial and trait number are laid out across the columns. When several individual plant samples are measured for the same trait, that trait number is repeated in consecutive columns. Each variety appears only once per trial; duplicates are not allowed.

In the vertical data table, the fields tested, variety, trial, trait and the sample number of each individual plant for the same trait are laid out across the columns, while the trait numbers are entered as data down the rows.
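The relationship between the two layouts can be sketched as a small conversion routine. This is a minimal Python illustration under assumed conventions: the first three columns are the tested, variety and trial fields, and every remaining header cell holds a trait number. The function name is hypothetical.

```python
def horizontal_to_vertical(header, rows, n_id=3):
    """Turn one wide row per variety into one row per (variety, trait).

    header: identifying field names followed by trait numbers, where a trait
            number is repeated once per individual plant sample.
    rows:   data rows laid out like the header.
    """
    vertical = []
    for row in rows:
        samples = {}  # trait number -> list of sample values, in column order
        for trait, value in zip(header[n_id:], row[n_id:]):
            samples.setdefault(trait, []).append(value)
        for trait, values in samples.items():
            vertical.append(list(row[:n_id]) + [trait] + values)
    return vertical


header = ["tested", "variety", "trial", 1, 1, 1, 2]
rows = [["yes", "A", "2019", 180, 182, 179, 5]]
table = horizontal_to_vertical(header, rows)
```

Each output row carries the identifying fields, the trait number, and then the sample values for that trait, which is the shape needed for per-trait outlier checks.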

In every table, varieties marked "yes" in the tested field are those to be evaluated and for which an analysis report must be issued. All other varieties, such as standard varieties and similar varieties, are marked "no"; they are not under evaluation and no analysis report is issued for them.

Preferably, in the above DUS testing method, checking the validity of the trial data further includes applying the BoxPlot method (box plot method) and/or the 3σ method (three standard deviations method) to the MS data of the multiple samples collected in the trial. Any abnormal value found is flagged automatically. If a value is still abnormal after both methods have been applied, the original records or field samples must be checked manually, and an input error is corrected directly. Where the value reflects what was actually observed and only a very small number of abnormal values (for example no more than two) cannot be explained, they can be replaced by the mean of the preceding and following values, which the program provides. Other approaches to estimating the value, such as the overall mean or maximum likelihood, are not suitable for correcting DUS test data; estimating it from the results of neighboring plants reflects the actual trial conditions better. If there are many abnormal values, they are left unprocessed, and uniformity can be assessed with the relative variance method or the COYU method. Abnormal values may also be caused by the environment or by the sampling procedure, for example uneven soil fertility, inconsistent growing conditions, or failure to exclude border plants when sampling. In such cases the trial design and sampling procedure need to be improved and, where necessary, the sample size increased.

For traits measured on repeated samples, the BoxPlot and 3σ methods detect abnormal values more precisely. In the former, extreme values do not enter the calculation; in the latter, they do. The two methods therefore complement each other.
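Both checks can be sketched in a few lines of Python with the standard library. The 1.5 × IQR fence is the usual box plot convention and is assumed here because the text does not state the fence multiplier; the function names and sample data are illustrative.

```python
import statistics


def boxplot_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; quartiles resist extremes."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def three_sigma_outliers(values):
    """Flag values more than 3 sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > 3 * sd]


# Hypothetical plant heights (cm) for 20 plants, one of them suspicious.
heights = list(range(176, 195)) + [260]
```

Because an extreme value inflates the standard deviation it is compared against, the 3σ rule can miss a single outlier in a small sample, whereas the quartile-based fences are barely affected. This is one way to read the statement that the two methods are complementary.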

Preferably, the BoxPlot and/or 3σ checks are carried out on the vertical data table. Data collected in a horizontal data table can be converted into a vertical data table by a purpose-built program. Both checks can be run on the same data layout, with each kind of abnormal value marked in a different color.

Preferably, collecting and processing the trial data in a unified format further includes a step of converting the raw trait data into codes; this step comprises:

for the raw data of each visually observed trait (VG and VS), directly assigning the corresponding code for use in DUS testing;

(1) for the raw data of each measured trait (MG and MS), performing a frequency distribution analysis, including: selecting the raw trait data of any one trial, or of any several trials, for LSD analysis (any trial or trials will do, preferably with as many varieties as possible; the calculation is done once and need not be repeated for every trial) to obtain the LSD_0.05 value of the measured trait; calculating the mean of the raw data of the measured trait over all varieties; taking that mean as the central standard value and 2 × LSD_0.05 as the step, and setting the standard value of each level; taking the interval that extends half a step either side of each standard value as the grading interval of the corresponding code; taking the lower bound of each interval as the grading value; and counting the number and percentage of varieties in each grading interval;

(2) determining the standard values, grading values and interval codes, including: determining from these statistics whether the intervals in total cover fewer than 3 levels or more than 9 levels; traits covering fewer than 3 levels are discarded as unsuitable for DUS analysis; for traits covering more than 9 levels, the multiple of LSD_0.05 is increased and the grading is adjusted, taking into account how evenly the varieties are distributed across the grading intervals, until the grading range is within 9 levels; for traits covering between 3 and 9 levels, 1-2 levels can be left free at each end for new varieties that may appear in the future; if the lower bound of the lowest grading interval is less than zero, it is set to 0, the levels are then rebuilt from the lowest upward using the multiple of LSD_0.05 determined above, and the standard values are re-determined; preferably, the resulting standard values are rounded to a suitable number of decimal places (for example 0 decimal places, that is, whole numbers);

For each grading interval, a variety whose measured value for the trait lies at or near the standard value of that level is chosen as the standard variety. Where the discrepancy is large, the standard value is shifted so that the measured value of the standard variety is close to its standard value. This eventually yields a relatively fixed set of standard values and standard varieties. Standard varieties should be chosen with regard to the adaptability of the variety, the availability of propagating material and how representative the variety is of the expression state.

The standard values usually stay the same from one trial to the next and can be verified in later steps. They are adjusted and re-determined only when they prove unsuitable, for example when a new expression state appears. The standard varieties usually stay the same as well and are grown again in every trial. The measured values of the standard varieties, and the grading values derived from them, vary from trial to trial.
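The grading construction described above (central standard value at the overall mean, step of 2 × LSD_0.05, intervals of half a step either side) can be sketched as follows. This is a minimal Python illustration: the LSD_0.05 value is taken as given, and the rule of raising the multiple in steps of 0.5 is an assumption, since the text only says the multiple is increased. Handling of a negative lower bound and of spare levels at either end is left out.

```python
import math


def level_of(value, center, step):
    """Index k of the interval [center + (k - 0.5)*step, center + (k + 0.5)*step)."""
    return math.floor((value - center) / step + 0.5)


def build_grades(center, lsd, data_min, data_max, multiple=2.0, increment=0.5):
    """Return (multiple, standard_values, grading_values), or None if < 3 levels.

    center:   mean of the raw data over all varieties
    lsd:      LSD_0.05 of the trait, computed elsewhere
    multiple: starting multiple of LSD_0.05 used as the step (2 in the text)
    """
    while True:
        step = multiple * lsd
        k_low = level_of(data_min, center, step)
        k_high = level_of(data_max, center, step)
        levels = k_high - k_low + 1
        if levels < 3:
            return None            # trait unsuitable for DUS analysis
        if levels <= 9:
            break
        multiple += increment      # assumed adjustment rule: widen the step
    standards = [center + k * step for k in range(k_low, k_high + 1)]
    grading = [s - step / 2 for s in standards]  # lower bound of each interval
    return multiple, standards, grading


# Hypothetical trait: mean 200, LSD_0.05 = 10, observed range 151 to 262.
result = build_grades(200, 10, 151, 262)
```

With these assumed inputs the sketch yields six levels with standard values 160 to 260 and grading values 150 to 250; rounding the standard values and picking a standard variety per level would follow as described in the text.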

Using the grading intervals and codes so defined, the raw data of the measured traits (MG and MS) in the trial are coded to obtain interval codes. The interval codes can be used directly for DUS analysis, or used for DUS analysis after further optimization.
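Since each grading value is the lower bound of its interval, coding a raw value amounts to finding the last grading value that does not exceed it. The following is a minimal Python sketch using a binary search; the grading values and codes are hypothetical, and values below the first grading value are assumed to fall into the lowest code.

```python
from bisect import bisect_right


def to_interval_code(value, grading_values, codes):
    """Map a raw value to the code whose interval lower bound is the
    largest grading value <= value. grading_values must be sorted ascending."""
    position = bisect_right(grading_values, value) - 1
    return codes[max(position, 0)]  # clamp values below the first bound


grading_values = [150, 170, 190, 210, 230, 250]  # hypothetical lower bounds
codes = [2, 3, 4, 5, 6, 7]                       # hypothetical codes

interval_codes = [to_interval_code(v, grading_values, codes) for v in (195, 190, 250, 100)]
```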

Preferably, the frequency distribution analysis is carried out in a horizontal cross-trial data table. When the MS trait data of a single trial are analyzed, the tested field, variety, trait numbers and raw values are copied directly from the horizontal data table into the horizontal cross-trial data table. When the MS trait data of two or more trials are analyzed, the tested field, variety and trait numbers are extracted from the horizontal data table, the trial mean of each trait is calculated, and all of these are transferred to the horizontal cross-trial data table. The format of the horizontal cross-trial data table is: tested, variety, followed in consecutive columns by the means of the same trait in the different trials or the raw values of the different plants.

Preferably, in the above DUS testing method, for MG and MS traits, where the set of standard values in use was not derived from the data of the current trial, the method further includes a step of checking whether the standard varieties perform in the current trial as they did in the trial from which the standard values were obtained. In this step, the measured value of each standard variety in the current trial is compared with its standard value: the absolute difference between the two is divided by the standard value, and if the result exceeds 10% the value is treated as abnormal and flagged (for example in a specific color). Where abnormal values occur, it is decided manually whether the situation is acceptable. If several standard varieties show a similar shift in a trait owing to some common factor, the abnormality is regarded as acceptable. If the abnormal standard variety behaves differently from the other standard varieties for that trait, its measured value must be discarded.
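The 10% check above can be sketched directly. This is a minimal Python illustration with hypothetical variety names and values; the decision whether a flagged deviation is acceptable remains a manual one, as the text states.

```python
def flag_standard_varieties(records, tolerance=0.10):
    """Return names of standard varieties whose measured value deviates from
    the standard value by more than `tolerance` (relative to the standard value).

    records: iterable of (name, standard_value, measured_value)
    """
    flagged = []
    for name, standard, measured in records:
        if abs(measured - standard) / standard > tolerance:
            flagged.append(name)
    return flagged


# Hypothetical standard varieties for one measured trait.
records = [("Std-3", 180, 185), ("Std-5", 200, 228), ("Std-7", 220, 221)]
abnormal = flag_standard_varieties(records)
```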

Preferably, in the step of checking whether the standard varieties perform consistently in the current test and in the test from which the standard values were obtained, the measured value, standard deviation, and sample count of each trait in the current test are calculated in a vertical processed data table, and the measured values of the standard varieties are extracted into the parameter table for checking. The vertical processed data table can be derived from the vertical data table. Its format is: the "to be tested" field, variety, test, and trait are laid out as columns; the multiple sample values of the same MS trait in the vertical data table are reduced to a mean, a standard deviation, and a sample count; and fields are reserved for the interval code, known code, regression code, optimized code, and expression state, all laid out as columns.

Preferably, for MG and MS traits, when the set of standard values in use was not derived from the data of the current test, the method further includes a step of correcting the grading range using the standard varieties. In this step, the first grading value is zero, and each subsequent grading value is: the sum over all standard varieties of (measured value - standard value)/standard value, divided by the number of standard varieties, plus 1/2 of the sum of the standard value of the current code and the standard value of the preceding code.

Preferably, in the above plant variety DUS testing method, for quantitative traits, the method further includes: fitting a linear regression function between the known codes of known varieties (standard varieties and/or similar varieties) and the corresponding means of those varieties in the current test; substituting the mean of the raw data of each variety to be tested in the current test into the regression function to obtain its regression code; and analyzing the interval code, known code, and regression code, selecting their mode, their median, or the mean of the three as the further optimized code.

Preferably, in the above plant variety DUS testing method, when at least two of the interval code, known code, and regression code are available, the method further includes a check based on the code range. The check includes: calculating the range as the maximum of these codes minus the minimum, and marking ranges of different sizes differently. For example, a difference of 1 code is shown in yellow, 2 codes in orange, 3 codes in red, and 4 or more codes in purple. For marked codes, the raw data is checked manually, or photos are retrieved for confirmation, and the optimized code is corrected manually where necessary.

Preferably, the above plant variety DUS testing method further includes comparing the optimized codes of several tests side by side: the code range (maximum minus minimum) is used to check the codes, and codes are marked according to the size of the range, for example in different specific colors. For marked codes, the raw data is checked manually, or photos are retrieved for confirmation, and the code is corrected manually where necessary, yielding a composite cross-test code.

Preferably, the check based on the code range is performed in a vertical cross-test data table. The format of the vertical cross-test data table is: the "to be tested" field, variety, and trait, followed side by side by the mean, standard deviation, sample count, code, and expression state from each test. The mean, standard deviation, and sample count over all tests, and the mean of the optimized codes, are also calculated and shown side by side; the mean of the optimized codes is reduced to an integer, either by discarding the digits after the decimal point or by rounding. The code range across tests is calculated and marked according to its size, for example in different specific colors such as yellow (difference of 1), orange (difference of 2), red (difference of 3), and purple (difference of 4 or more). For colored codes, that is, codes that differ between tests, the raw data is checked manually, or photos are retrieved for confirmation, and the code is corrected manually where necessary. The confirmed code is used as the composite cross-test code.

Preferably, the above plant variety DUS testing method further includes a step of transferring the converted codes into a variety library. Preferably, if the variety or the corresponding trait already exists in the variety library, the existing result is overwritten; if not, the variety is added as the last row of the variety library, or the trait as the last column, and the variety library is updated.

Preferably, after two or more tests have been completed, the photos from these tests are compared one by one to determine whether the tests differ. If they differ, the cause is investigated. If they do not, a set of standard photos is selected and stored in a preset folder, for example DUS\Corn\Standard Photos\Variety Name.

Preferably, the method further includes a step of confirming or correcting trait codes using photos. Preferably, the codes of a given trait are sorted in ascending order, and the photos corresponding to that trait are retrieved and compared visually in that order. Preferably, this operation is performed in the horizontal data table or in the variety library: when the user clicks on a code-type trait column, the program automatically sorts the rows by code and, using the characters in the photo file names (preferably numeric identifiers), retrieves in batch the photo of each variety for that trait and places it in the corresponding position in the next column. The photos are then viewed in order to confirm manually whether any code was assigned incorrectly, so that the codes and photos in the report to be issued are consistent.

The method of the present invention optimizes the design of DUS test trials, enables joint correction of data from multiple tests, and increases the objectivity of DUS analysis results. In addition, with the help of the EXCEL program, the method can efficiently perform validity checks on the test data and on the statistical assumptions, as well as the DUS data analysis itself, so that DUS data analysis that previously took 2 months can be completed in 1 day.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows a vertical processed data table with linked photos displayed.

Figure 2 shows a vertical cross-test data table with linked photos displayed.

Figure 3 shows an example of the interface for comparing two tests.

Figure 4 shows an example of the interface for photo confirmation in the variety library.

Figure 5 shows an example of the interface for comparing photos of a variety to be tested with those of similar varieties in the variety library.

DETAILED DESCRIPTION

Specific embodiments of the present invention are described in detail below. It should be understood that the embodiments described here only illustrate the invention by way of example and do not limit it.

The crop in this trial is corn, with 149 varieties in total: 147 varieties other than standard varieties, and 2 standard varieties. The trial is planted once in the first year and once in the second year. Each plot is 5 m long and 2.4 m wide, planted in four rows with a plant and row spacing of 30 cm × 60 cm. Two seeds are sown per hill, seedlings are thinned at the two-leaves-and-one-heart stage, at least 80 plants are kept per plot, and two replicates are set up. Field management follows normal field production practice. During data collection, a single value is recorded for each VG and MG trait, and 20 single-plant values are recorded for each VS and MS trait. Five kinds of photos (seedling, plant, tassel, silk, and ear) are taken over the whole growth period, with all samples taken from the same plot.

1. Set the parameter table for corn according to the DUS test guidelines

Because there are many traits, only three are used as examples below. The parameter table settings are shown in Table 2:

Table 2

In Table 2, parameters such as trait number, code, expression state, standard variety, trait name, expression type, observation type, observation time, unit, value type, maximum, and minimum can be preset according to the DUS test guidelines, and the standard values are determined by the method described above. Thresholds and weights are set empirically. The code index, grading value index, and grading values are calculated automatically from the standard values and measured values using preset formulas.

2. Record the raw data in the horizontal data table

The record format is shown in Table 3. Because there are many varieties and traits, only the first 16 traits and the first 14 varieties are listed.

Table 3

Data is collected in the unified horizontal format to build the horizontal data table, with the fields "to be tested", variety, test, and trait number laid out as columns. When several single-plant sample values are measured for the same trait, the trait number is repeated in consecutive columns. For example, in Table 3, 20 single-plant samples were measured for trait 16, so trait number 16 appears in 20 consecutive columns.

In Table 3, varieties marked "yes" in the "to be tested" field are those that are to be tested and evaluated and for which an analysis report must be issued. All other varieties, such as standard varieties and similar varieties, are marked "no": they are not under evaluation and no analysis report is issued for them.

Each variety may appear only once within the same test; duplicates are not allowed.

3. Data checks

(1) Checking data format (for example data type), value range, and, optionally, unit of measurement

The horizontal data is checked against the parameter table from Part 1: data format (such as data type), value range (minimum and maximum), and unit (such as day or cm).

In the check results, abnormal values are marked in a specific color, for example red. For example, according to the parameter table from Part 1, values of trait 2 must fall in the range 1-5, so a recorded value of 6 is abnormal; values of trait 16 must fall in the range 10-70, so a recorded value of 4 is abnormal. The program automatically displays such values in red.
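The range check described above can be sketched as follows. This is a minimal illustration, not the workbook logic used in the patent: the `TraitSpec` class, the `PARAMS` dictionary, and the record layout are hypothetical, and only the two ranges quoted in the text (1-5 for trait 2, 10-70 for trait 16) come from the source.

```python
from dataclasses import dataclass


@dataclass
class TraitSpec:
    """Allowed value range of one trait (hypothetical stand-in for a parameter-table row)."""
    minimum: float
    maximum: float


# Ranges quoted in the text: trait 2 must lie in 1-5, trait 16 in 10-70.
PARAMS = {2: TraitSpec(1, 5), 16: TraitSpec(10, 70)}


def find_out_of_range(records, params):
    """Return every (variety, trait, value) whose value is non-numeric or outside its range."""
    flagged = []
    for variety, trait, value in records:
        spec = params[trait]
        is_number = isinstance(value, (int, float))
        if not is_number or not (spec.minimum <= value <= spec.maximum):
            flagged.append((variety, trait, value))
    return flagged


# Made-up records: variety B carries the two abnormal entries used as examples in the text.
records = [("A", 2, 3), ("A", 16, 42.5), ("B", 2, 6), ("B", 16, 4)]
flagged = find_out_of_range(records, PARAMS)
```

In a spreadsheet the flagged cells would be colored red; here they are simply returned as a list.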

(2) Checking the validity of the test data with the BoxPlot method and the 3σ method

The horizontal data format is converted into the vertical data format to build the vertical data table, and the validity of the test data is then checked with the BoxPlot method and the 3σ method.

The converted vertical data table is shown in Table 4. The fields "to be tested", variety, test, trait, and the sample number of each single plant for the same trait are laid out as columns, while the trait numbers run down the rows as data.

Table 4

Both calculations can be run on the same vertical data table, with the different kinds of abnormal values marked in different colors.

The results of the BoxPlot method are shown in Table 5 (yellow marks values beyond 1.5 times the interquartile range, red marks values beyond 3 times the interquartile range).

Table 5

The results of the 3σ method are shown in Table 6 (yellow marks values beyond 2 standard deviations, red marks values beyond 3 standard deviations).

Table 6

Values that both methods identify as abnormal require a manual check of the original records or the field samples. Input errors are corrected directly. If the values reflect what was actually observed and only a very small number of them (for example, no more than two) cannot be explained, the program replaces each with the mean of the preceding and following values. If there are many abnormal values, they are left unchanged and uniformity is tested with the relative variance method or the COYU method.
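A minimal sketch of the two outlier checks follows. It assumes the usual box-plot convention (distance measured from the nearer quartile) and the sample standard deviation; the patent text does not spell out either choice, and the 20 plant values are made up.

```python
import statistics


def boxplot_flags(values):
    """Flag each value: 'yellow' beyond 1.5 x IQR from the quartiles, 'red' beyond 3 x IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    flags = []
    for v in values:
        distance = max(q1 - v, v - q3, 0.0)  # 0 when the value lies inside the box
        if distance > 3 * iqr:
            flags.append("red")
        elif distance > 1.5 * iqr:
            flags.append("yellow")
        else:
            flags.append("")
    return flags


def sigma_flags(values):
    """Flag each value: 'yellow' beyond 2 standard deviations from the mean, 'red' beyond 3."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    flags = []
    for v in values:
        z = abs(v - mean) / sd
        flags.append("red" if z > 3 else "yellow" if z > 2 else "")
    return flags


# Twenty hypothetical single-plant values for one MS trait; the last one is a typing error.
plants = [40, 41, 39, 42, 40, 41, 38, 43, 40, 41,
          39, 42, 40, 41, 40, 39, 41, 40, 42, 80]
box = boxplot_flags(plants)
sig = sigma_flags(plants)
# Values flagged by both methods are the ones that need a manual check.
confirmed = [v for v, b, s in zip(plants, box, sig) if b and s]
```

Requiring both flags mirrors the rule in the text that only values identified as abnormal by both methods go to manual review.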

4. Frequency distribution analysis

(1) When analyzing the MS trait data of a single test, the "to be tested" field, variety, trait number, and raw values can be extracted directly from the horizontal data table into the horizontal cross-test data table. (2) When analyzing the MS trait data of two tests, the "to be tested" field, variety, and trait number can be extracted from the horizontal data table, the per-test mean of each trait calculated, and all of these transferred into the horizontal cross-test data table. The format of the horizontal cross-test data table is: "to be tested" field, variety, and then, in consecutive columns, either the means of the same trait from different tests or the raw values of different plants.

The horizontal cross-test data table is shown in Table 7.

Table 7

Frequency distribution analysis and determination of standard values and grading values

An example of the frequency distribution results for quantitative traits is shown in Table 8.

Table 8

| Trait | 16 | 17 | 18 | 19 | 22 | 25.2 | 26.2 | 27.2 | 29.2 | 30.2 | 31.2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Overall mean | 40.68 | 29.72 | 7.956 | 25.19 | 10.58 | 103 | 276.6 | 0.372 | 19.96 | 4.599 | 16.26 |
| Sum | 57769 | 42198 | 11297 | 35771 | 15019 | 1E+05 | 4E+05 | 528 | 28338 | 6530 | 23088 |
| Total variance | 18.24 | 15.54 | 9.829 | 13.03 | 0.92 | 276.7 | 711 | 0.002 | 3.612 | 0.068 | 3.169 |
| Total sum of squares | 2E+06 | 1E+06 | 1E+05 | 9E+05 | 2E+05 | 2E+07 | 1E+08 | 199 | 6E+05 | 30125 | 4E+05 |
| LSD_0.05 | 1.63 | 1.488 | 1.079 | 1.649 | 0.424 | 5.11 | 6.855 | 0.018 | 0.723 | 0.094 | 0.846 |

Trait 16

| Code | Midpoint | Interval minimum | Count | Frequency | Observed minimum | Mean | Observed maximum |
|---|---|---|---|---|---|---|---|
| 1 | | | | | 34.1 | 40.68 | 51.2 |
| 2 | | | | | | | |
| 3 | 34.16 | 32.53 | 8 | 0.113 | | | |
| 4 | 37.42 | 35.79 | 11 | 0.155 | | | |
| 5 | 40.68 | 39.05 | 32 | 0.451 | | | |
| 6 | 43.94 | 42.31 | 14 | 0.197 | | | |
| 7 | 47.2 | 45.57 | 4 | 0.056 | | | |
| 8 | 50.46 | 48.83 | 2 | 0.028 | | | |
| 9 | | | | | | | |

Trait 17

| Code | Midpoint | Interval minimum | Count | Frequency | Observed minimum | Mean | Observed maximum |
|---|---|---|---|---|---|---|---|
| 1 | | | | | 23.75 | 29.72 | 37.25 |
| 2 | | | | | | | |
| 3 | 23.77 | 22.28 | 5 | 0.07 | | | |
| 4 | 26.74 | 25.25 | 16 | 0.225 | | | |
| 5 | 29.72 | 28.23 | 25 | 0.352 | | | |
| 6 | 32.69 | 31.2 | 19 | 0.268 | | | |
| 7 | 35.67 | 34.18 | 5 | 0.07 | | | |
| 8 | 38.64 | 37.16 | 1 | 0.014 | | | |
| 9 | | | | | | | |

For each measured trait, the mean of all raw data of all varieties is taken as the central standard value, and 2 times LSD_0.05 is taken as the step between levels, which fixes the standard value of each level. The interval that extends half a step to either side of each standard value is the grading interval of the corresponding code, and the minimum of each interval is its grading value. Once the grading value and grading interval of every level are fixed, the number and percentage of varieties falling in each interval are counted. From these statistics it is determined whether the intervals that are actually occupied span fewer than 3 levels or more than 9 levels. Traits spanning fewer than 3 levels are unsuitable for DUS testing and are discarded. For traits spanning more than 9 levels, the multiple of LSD_0.05 is increased, and the grading is adjusted with reference to how evenly the percentages are spread over the intervals, so that the grading fits within 9 levels. For traits spanning between 3 and 9 levels, 1-2 levels can be left empty at each end to accommodate new varieties in the future. This yields a more reasonable grading.

If the minimum of the lowest grading interval is less than zero, it is set to 0, the levels are rebuilt upward from there using the multiple of LSD_0.05 determined above, the standard values are recalculated, and the standard value of each level is rounded to an integer. Next, based on how the standard varieties performed in the test, it is decided whether the code position of each standard variety is appropriate. If not, the standard variety is moved to an appropriate code position, and the whole set of standard values is shifted according to the measured values of the standard varieties, so that each measured value is close to its corresponding standard value. This fixes the standard varieties and produces a relatively stable set of standard values, as shown in Table 9.
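The construction of the grading intervals can be sketched as follows, using the figures reported for trait 16 in Table 8 (overall mean 40.68, LSD_0.05 1.63). Placing the overall mean at code 5 matches that table but is otherwise an assumption, and the function names are hypothetical.

```python
def grading_table(center, lsd, multiplier=2.0, center_code=5, codes=range(1, 10)):
    """Map each code to (standard value, grading value).

    The step between levels is multiplier x LSD; the grading value is the lower
    bound of the interval that extends half a step either side of the standard value.
    """
    step = multiplier * lsd
    table = {}
    for code in codes:
        standard = center + (code - center_code) * step
        table[code] = (standard, standard - step / 2)
    return table


def interval_code(value, table):
    """Return the highest code whose grading value (lower bound) does not exceed value."""
    best = min(table)
    for code in sorted(table):
        if value >= table[code][1]:
            best = code
    return best


def occupied_levels(values, table):
    """Count how many values fall into each code's interval."""
    counts = {}
    for v in values:
        code = interval_code(v, table)
        counts[code] = counts.get(code, 0) + 1
    return counts


table16 = grading_table(center=40.68, lsd=1.63)
low_code = interval_code(34.1, table16)    # observed minimum of trait 16 in Table 8
high_code = interval_code(51.2, table16)   # observed maximum of trait 16 in Table 8
span = high_code - low_code + 1            # number of levels actually covered
```

With these inputs the observed values span codes 3 to 8, which agrees with the occupied rows of Table 8 and falls between the 3-level and 9-level limits given in the text.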

Table 9

5. Convert the vertical data format into the vertical processed data format, build the vertical processed data table, and check and optimize the code data

The vertical data format is converted into the vertical processed data format: the "to be tested" field, variety, test, and trait are laid out as columns; the multiple sample values of the same MS trait in the vertical data table are reduced to a mean, a standard deviation, and a sample count; and fields are reserved for the interval code, known code, regression code, optimized code, and expression state, all laid out as columns. The resulting table format is shown in Table 10.

Table 10

For MG and MS traits, when the set of standard values in use was not derived from the data of the current test (for example, in the second-year test), it is checked whether the standard varieties perform consistently in the current test and in the test from which the standard values were obtained.

In the vertical processed data table, the mean (that is, the measured value), standard deviation, and sample count of each trait in the current test are calculated, and the measured values of the standard varieties are extracted into the parameter table. The absolute value of the difference between the measured value and the standard value of a standard variety is divided by the standard value; if the result is greater than 10%, the difference is considered too large and the value is shown in a specific color (for example red) as abnormal. For each abnormal value, it is decided manually whether the abnormality is acceptable (qualitative traits should in principle show no abnormality; if one occurs, the cause must be checked manually and corrected). If several standard varieties show a similar change in a trait because of some common factor, the abnormality is acceptable and the data is kept. If the abnormal standard variety changes differently from the other standard varieties for that trait, its measured value must be discarded.
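The 10% rule can be written as a short sketch. The variety names and values are made up; only the formula |measured - standard| / standard and the 10% limit come from the text.

```python
def relative_deviation(measured, standard):
    """Absolute difference between measured and standard value, relative to the standard value."""
    return abs(measured - standard) / standard


def flag_standard_varieties(measured, standards, limit=0.10):
    """Return {variety: True/False}; True means the deviation exceeds the limit (shown in red)."""
    return {
        name: relative_deviation(value, standards[name]) > limit
        for name, value in measured.items()
    }


# Hypothetical standard varieties S1 and S2 for one trait.
standards = {"S1": 40.0, "S2": 30.0}
measured = {"S1": 45.0, "S2": 30.5}
flags = flag_standard_varieties(measured, standards)
```

Whether a flagged value is then kept or discarded remains a manual decision, as described above.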

For MG and MS traits, when the set of standard values in use was not derived from the data of the current test, the grading range is corrected using the standard varieties. The first grading value is zero, and each subsequent grading value is: the sum over all standard varieties of (measured value - standard value)/standard value, divided by the number of standard varieties, plus 1/2 of the sum of the standard value of the current code and the standard value of the preceding code.
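The correction rule can be transcribed literally as below. This sketch follows the wording of the text term by term, so the shift is the mean relative deviation of the standard varieties; whether the original workbook scales that term further is not stated. The function name and the numbers are hypothetical.

```python
def corrected_grading_values(standard_values, measured, reference):
    """Compute corrected grading values, one per code.

    standard_values: standard value of each code, in code order.
    measured, reference: measured value and standard value of each standard variety.
    """
    # Mean of (measured - standard) / standard over the standard varieties.
    shift = sum((m - s) / s for m, s in zip(measured, reference)) / len(measured)
    values = [0.0]  # the first grading value is zero
    for previous, current in zip(standard_values, standard_values[1:]):
        # Shift plus the midpoint between this code's standard value and the previous one.
        values.append(shift + (previous + current) / 2)
    return values


# Three codes with standard values 30, 34, 38; two standard varieties, each 10% above standard.
grading = corrected_grading_values([30, 34, 38], measured=[44, 33], reference=[40, 30])
```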

After the grading value, grading value index, and grading interval of each code have been calculated from the standard values and measured values, the code of each trait of each variety is calculated, giving the interval code.

For quantitative traits, if the current test includes known varieties that already have data in the variety library, their codes are copied from the variety library into the known code field of the vertical processed data table. A linear regression is fitted between the means of these known varieties in the current test and their known codes, and the mean of the raw data of every other variety in the current test is substituted into the regression, giving a regression code for all varieties. The interval code, known code, and regression code are then analyzed, and their mode, their median, or the mean of the three is selected as the further optimized code.
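The regression step and the choice of an optimized code can be sketched as follows. The least-squares fit is standard; the selection rule (use the mode when two codes agree, otherwise the median rounded half up) is only one of the options the text allows, and all names and numbers are hypothetical.

```python
import math
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_code(trait_mean, slope, intercept):
    """Predict a code from a variety's trait mean and round it half up to an integer."""
    return math.floor(slope * trait_mean + intercept + 0.5)


def optimized_code(codes):
    """Pick one code from (interval, known, regression); None marks a missing code."""
    present = [c for c in codes if c is not None]
    modes = statistics.multimode(present)
    if len(modes) == 1 and present.count(modes[0]) > 1:
        return modes[0]  # at least two of the codes agree
    return math.floor(statistics.median(present) + 0.5)


# Known varieties: trait means in the current test and their codes in the variety library.
known_means = [30.0, 35.0, 40.0, 45.0]
known_codes = [3, 4, 5, 6]
slope, intercept = fit_line(known_means, known_codes)
candidate_regression = regression_code(42.0, slope, intercept)
```

For a candidate with interval code 5, no known code, and the regression code above, `optimized_code([5, None, candidate_regression])` returns the agreed value.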

When at least two of the interval code, known code, and regression code are available, the codes are checked using the code range. The range is calculated as the maximum of these codes minus the minimum and is shown in a color that depends on its size: for example, yellow for a difference of 1 code, orange for 2 codes, red for 3 codes, and purple for 4 or more codes. For colored codes, the raw data is checked manually, or photos are retrieved for confirmation, and the code can be corrected manually where necessary. The program allows photos to be retrieved quickly for confirmation: when the mouse pointer is placed on a code, the program links the corresponding photo and displays it in the empty space on the same row, as shown in Figure 1.
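The range check and its color scale can be sketched as below. The function name is hypothetical, and `None` stands for a code that is not available.

```python
RANGE_COLOURS = {1: "yellow", 2: "orange", 3: "red"}


def range_colour(codes):
    """Return (range, colour) for the available codes, or None if fewer than two exist.

    A range of 0 gets no colour; a range of 4 or more is purple.
    """
    present = [c for c in codes if c is not None]
    if len(present) < 2:
        return None
    spread = max(present) - min(present)
    if spread == 0:
        return (0, None)
    return (spread, RANGE_COLOURS.get(spread, "purple"))


# (interval code, known code, regression code) for a few hypothetical varieties.
checks = [range_colour(c) for c in [(5, 5, None), (4, 5, 6), (3, None, 8), (5, None, None)]]
```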

6. Convert the vertical processed data format into the vertical cross-test data format and determine the composite cross-test code

The vertical processed data of the two years of tests is converted into the vertical cross-test data format to build the vertical cross-test data table, as shown in the table below. The format of this table is: the "to be tested" field, variety, and trait, followed side by side by the mean, standard deviation, sample count, code, and expression state from each year's test. The mean, standard deviation, and sample count over both years, and the mean of the optimized codes, are also calculated and shown side by side. The mean of the optimized codes is reduced to an integer (either by discarding the digits after the decimal point or by rounding). In addition, the range of the codes across years is calculated and shown in a color that depends on its size, for example yellow (difference of 1), orange (difference of 2), red (difference of 3), and purple (difference of 4 or more). For colored codes (codes that differ between years), the raw data is checked manually, or photos are retrieved for confirmation, and the code can be corrected manually where necessary. The confirmed code is used as the composite cross-test code. The program allows photos to be retrieved quickly for confirmation: when the mouse pointer is placed on a code, the program links the corresponding photo and displays it in the empty space on the same row, as shown in Figure 2.
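The two ways of reducing the mean optimized code to an integer give different results whenever the yearly codes differ by an odd amount, as this small sketch shows. The function names are hypothetical, and codes are assumed to be positive integers.

```python
import math


def composite_code(codes_by_test, half_up=True):
    """Average the optimized codes of several tests and reduce the mean to an integer.

    half_up=True rounds half up; half_up=False discards the digits after the decimal point.
    """
    mean = sum(codes_by_test) / len(codes_by_test)
    return math.floor(mean + 0.5) if half_up else math.floor(mean)


def cross_test_range(codes_by_test):
    """Maximum minus minimum of the codes across tests; non-zero values call for a manual check."""
    return max(codes_by_test) - min(codes_by_test)


year_codes = [5, 6]  # optimized codes of one trait in year 1 and year 2 (made-up values)
rounded = composite_code(year_codes)
truncated = composite_code(year_codes, half_up=False)
spread = cross_test_range(year_codes)
```

Because the two options disagree here, a code with a non-zero range is exactly the kind the text sends to manual confirmation before it becomes the composite code.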

7. Transfer the composite cross-test codes into the variety library

The composite codes in the cross-test data table are transferred into the variety library, as shown in Table 11. For varieties or traits that already exist in the variety library, the code data is overwritten directly. For varieties or traits that do not yet exist, a new entry is added after the last row (for a variety) or after the last column (for a trait), and the composite code data is imported. In this way the variety library is updated and extended year by year.
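The overwrite-or-append behavior can be sketched with a dictionary-based variety library. This is an illustration only: the data structure and function name are hypothetical, and the patent implements the library as a spreadsheet.

```python
def upsert_codes(library, trait_columns, variety, codes):
    """Write composite codes for one variety into the library.

    library: {variety: {trait: code}}; dicts keep insertion order, so a new
    variety lands after the last existing row.
    trait_columns: ordered list of trait numbers; a new trait is appended as the last column.
    """
    row = library.setdefault(variety, {})
    for trait, code in codes.items():
        if trait not in trait_columns:
            trait_columns.append(trait)
        row[trait] = code  # overwrites any existing code for this variety and trait


# Hypothetical starting library with one variety and two traits.
library = {"V1": {1: 2, 2: 4}}
traits = [1, 2]
upsert_codes(library, traits, "V1", {2: 5})        # existing variety and trait: overwrite
upsert_codes(library, traits, "V2", {1: 3, 3: 6})  # new variety and new trait: append
```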

Table 11

| To be tested | Variety | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Yes | DK145 | 2 | 2 | 5 | 5 | 4 | 2 | 4 | 2 | 2 | 2 | 4 | 4 | 4 | 2 | 2 | 6 | 6 | 2 | 6 | 2 | 2 | 4 | 2 | 2 | 8 | 7 |
| Yes | DK817 | 6 | 4 | 4 | 5 | 4 | 2 | 4 | 3 | 2 | 2 | 2 | 5 | 4 | 4 | 1 | 6 | 8 | 2 | 6 | 2 | 2 | 5 | 2 | 2 | 8 | 8 |
| Yes | FH218 | 4 | 4 | 5 | 5 | 4 | 2 | 4 | 2 | 2 | 2 | 2 | 4 | 2 | 2 | 2 | 6 | 6 | 2 | 5 | 2 | 4 | 5 | 2 | 2 | 8 | 7 |
| Yes | MC598 | 2 | 4 | 4 | 5 | 4 | 2 | 4 | 3 | 2 | 2 | 4 | 4 | 4 | 1 | 1 | 5 | 5 | 2 | 4 | 2 | 2 | 5 | 2 | 2 | 6 | 6 |
| Yes | MC858 | 4 | 4 | 4 | 5 | 4 | 3 | 4 | 2 | 2 | 2 | 5 | 4 | 4 | 1 | 3 | 6 | 6 | 2 | 6 | 2 | 3 | 5 | 2 | 2 | 7 | 6 |
| Yes | ND688 | 2 | 4 | 4 | 5 | 4 | 1 | 4 | 3 | 2 | 4 | 4 | 4 | 4 | 2 | 2 | 5 | 5 | 2 | 6 | 2 | 3 | 4 | 2 | 2 | 8 | 6 |
| Yes | 北农486 (Beinong 486) | 5 | 4 | 5 | 5 | 4 | 2 | 4 | 4 | 2 | 4 | 4 | 6 | 4 | 1 | 2 | 6 | 6 | 2 | 6 | 2 | 2 | 5 | 2 | 2 | 8 | 6 |
| Yes | 北农851 (Beinong 851) | 4 | 4 | 5 | 6 | 6 | 2 | 4 | 4 | 2 | 2 | 2 | 4 | 4 | 1 | 4 | 5 | 5 | 4 | 6 | 2 | 2 | 5 | 2 | 2 | 8 | 7 |
| Yes | 北农861 (Beinong 861) | 2 | 4 | 5 | 6 | 6 | 2 | 4 | 3 | 2 | 2 | 3 | 4 | 4 | 2 | 2 | 6 | 5 | 2 | 6 | 2 | 2 | 5 | 2 | 2 | 10 | 8 |
| Yes | 北农青贮36 (Beinong Qingzhu 36) | 6 | 4 | 5 | 6 | 6 | 2 | 4 | 2 | 2 | 4 | 2 | 4 | 4 | 1 | 2 | 5 | 3 | 2 | 5 | 2 | 3 | 5 | 2 | 2 | 11 | 10 |
| Yes | 畅玉88 (Changyu 88) | 2 | 2 | 5 | 5 | 4 | 3 | 4 | 2 | 6 | 2 | 2 | 4 | 4 | 2 | 2 | 5 | 4 | 2 | 6 | 2 | 3 | 5 | 2 | 2 | 8 | 6 |
| Yes | 大唐121 (Datang 121) | 2 | 4 | 4 | 4 | 4 | 2 | 4 | 2 | 2 | 2 | 2 | 5 | 4 | 1 | 1 | 5 | 5 | 2 | 4 | 2 | 2 | 4 | 2 | 2 | 8 | 6 |
| No | 迪卡517 (Dika 517) | 2 | 3 | 4 | 4 | 4 | 2 | 4 | 2 | 2 | 2 | 4 | 4 | 4 | 1 | 1 | 5 | 4 | 4 | 5 | 2 | 3 | 4 | 2 | 2 | 8 | 6 |
| Yes | 和育501 (Heyu 501) | 3 | 4 | 4 | 4 | 4 | 2 | 4 | 3 | 2 | 2 | 2 | 4 | 4 | 1 | 1 | 6 | 6 | 2 | 6 | 2 | 3 | 4 | 2 | 2 | 6 | 6 |

8. Organizing the standard photos

After the two years of tests are complete, the photos from the two years are compared to determine whether they differ. If they differ, the cause is investigated. If they do not, a set of standard photos is selected and stored in the folder DUS\Corn\Standard Photos\Variety Name. An example of the comparison interface is shown in Figure 3.

9. Retrieve photos in batch after sorting by code to confirm the codes

In the horizontal data table or the variety library, when the user clicks on a code-type trait column, the program automatically sorts the rows by code. Using the photo type number preset for that trait under the trait photo field of the parameter table from Part 1, it then retrieves in batch the photo of each variety for that trait and places it in the corresponding position in the next column. The photos are viewed in order to confirm manually whether any code was assigned incorrectly, so that the codes and photos in the report to be issued are consistent. An example of the interface for photo confirmation in the variety library is shown in Figure 4.

10. Distinctness analysis

(1) Screening for similar varieties in the variety library

In the final variety library, all varieties are first grouped and sorted according to the grouping information set in the parameter table, which quickly gives a preliminary division into groups. If a group contains only one variety, that variety does not need to be compared with any other.

Then the degree of similarity between varieties is analyzed using one of four methods: the cumulative count of traits that differ, the cumulative count of traits whose difference exceeds the threshold, the correlation coefficient, or the minimum distance. A general reminder range and a special warning range are set for each, as shown in Table 12.

Table 12

Taking the correlation coefficient as an example: in the result area, the varieties to be tested run across the columns and all varieties run down the rows. Correlation coefficients greater than 90% are shown in yellow and those greater than 95% in red.
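The correlation-based screening can be sketched as follows. This is a minimal illustration, assuming each variety is represented by a vector of trait codes and that the Pearson correlation is the coefficient meant; the function names and the sample code profiles are hypothetical, and only the 90% and 95% thresholds come from the text.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length code vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)

def flag_similar(candidate, library, remind=0.90, warn=0.95):
    """Return {variety: (r, flag)} where flag is 'red', 'yellow', or ''."""
    out = {}
    for name, codes in library.items():
        r = pearson(candidate, codes)
        flag = "red" if r > warn else "yellow" if r > remind else ""
        out[name] = (r, flag)
    return out

# Hypothetical code profiles, not taken from the patent's tables
candidate = [2, 4, 4, 5, 4, 2, 4, 3]
library = {
    "A": [2, 4, 4, 5, 4, 2, 4, 3],  # identical profile
    "B": [6, 2, 5, 1, 4, 7, 2, 6],  # dissimilar profile
}
result = flag_similar(candidate, library)
```

Varieties flagged red or yellow would then go forward to the photo comparison in step (2).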

(2) Further confirmation by comparing variety photos

Delete all data under the "to be tested" field in the variety library table, enter "yes" in front of each variety to be tested that needs further comparison, and enter "no" in front of each similar variety identified in step (1). Based on this input, the program retrieves the photos of the variety to be tested and of the similar varieties in turn and displays them side by side, so that differences can be checked and confirmed quickly by eye. An example of the interface for comparing photos of the variety to be tested with those of similar varieties is shown in Figure 5.

(3) Retrieving the original data

If the photos of a similar variety and the variety to be tested show no difference, the means and codes of all measured traits of the two varieties (plus standard deviations for MS traits) are retrieved from the vertical processed data table or the vertical cross-test data table and placed in the data comparison table for side-by-side comparison, as shown in Table 13, to check for values that differ substantially.

Table 13

(4) For MG and MS traits, the single-test original data or the two-test summary data of the two varieties are then retrieved for a t-test or a COYD analysis, as shown in Tables 14 and 15.

Table 14

Table 15

(5) For VS traits, the distinctness of the variety to be tested and the similar varieties is analyzed in the vertical row-column data table using the Pearson chi-square test.

The original data can be tallied from the vertical data format into the vertical row-column data format. In the vertical row-column data table, the fields "to be tested", variety, test, trait, and one field per code are arranged horizontally, and the value under each code field is the number of times that code occurs in the population, as shown in Table 16. Field data can also be collected directly in the vertical row-column format.

Table 16

To be tested | Variety | Test | Trait | 1 | 2 | 3
yes | C01 | 2018 | 1 | 34 | 6 | 6
no | R01 | 2018 | 1 | 12 | 23 | 9
no | R02 | 2018 | 1 | 6 | 20 | 19
no | R03 | 2018 | 1 | 1 | 18 | 9
no | R04 | 2018 | 1 | 7 | 22 | 15
yes | C02 | 2018 | 1 | 9 | 3 | 34
no | R05 | 2018 | 1 | 4 | 8 | 34
no | R06 | 2018 | 1 | 1 | 11 | 34
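The tallying step that produces a table like Table 16 can be sketched as below. This is only an illustration: it assumes the plant-level observations are available as one record per plant (a simplification of the vertical data table), and the function name and the sample records are hypothetical.

```python
from collections import Counter

def tally_codes(records, codes=(1, 2, 3)):
    """Count how often each code occurs per (to_be_tested, variety, test, trait).

    records: iterable of (to_be_tested, variety, test, trait, code) tuples,
    one per observed plant. Returns one row per group, in first-seen order,
    with the group key followed by the count of each code.
    """
    counts = {}
    for flag, variety, test, trait, code in records:
        counts.setdefault((flag, variety, test, trait), Counter())[code] += 1
    return [key + tuple(c[code] for code in codes) for key, c in counts.items()]

# Hypothetical plant-level records chosen to reproduce the C01 row of Table 16
records = (
    [("yes", "C01", 2018, 1, 1)] * 34
    + [("yes", "C01", 2018, 1, 2)] * 6
    + [("yes", "C01", 2018, 1, 3)] * 6
)
rows = tally_codes(records)
```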

The results of the Pearson chi-square test are shown in Table 17.

Table 17

To be tested | Variety | Test | Trait | 1 | 2 | 3 |  | C01 | Result | C02 | Result
yes | C01 | 2018 | 1 | 34 | 6 | 6 | 1 |  |  | 2E-08 | D
no | R01 | 2018 | 1 | 12 | 23 | 9 |  | 3E-05 | D | 3E-07 | D
no | R02 | 2018 | 1 | 6 | 20 | 19 |  | 4E-08 | D | 0.0002 | D
no | R03 | 2018 | 1 | 1 | 18 | 9 |  | 2E-08 | D | 5E-07 | D
no | R04 | 2018 | 1 | 7 | 22 | 15 |  | 2E-07 | D | 2E-05 | D
yes | C02 | 2018 | 1 | 9 | 3 | 34 | 2 | 2E-08 | D |  | 
no | R05 | 2018 | 1 | 4 | 8 | 34 |  | 3E-10 | D | 0.1227 | ND
no | R06 | 2018 | 1 | 1 | 11 | 34 |  | 5E-12 | D | 0.0041 | D
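As a rough check on Table 17, the following sketch runs a Pearson chi-square test of homogeneity on two rows of code counts. It is an assumption that this is the exact variant used. The 0.01 cut-off for D versus ND is inferred from the verdicts in Tables 17 and 18 and is not stated in the text, and the function names are hypothetical. For the C01/R01 and C02/R05 pairs it gives p-values of about 3E-05 and 0.1227, in line with the table.

```python
from math import erfc, exp, pi, sqrt

def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution, using the
    closed forms that exist for integer degrees of freedom."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        term = total = 1.0
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return exp(-h) * total
    total = 0.0
    term = sqrt(x)  # x^(j - 1/2) / (1 * 3 * ... * (2j - 1)) for j = 1
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return erfc(sqrt(h)) + sqrt(2.0 / pi) * exp(-h) * total

def pearson_chi2(a, b):
    """Chi-square statistic and p-value for two rows of code counts."""
    cols = [(x, y) for x, y in zip(a, b) if x + y > 0]  # drop unused codes
    ra, rb = sum(x for x, _ in cols), sum(y for _, y in cols)
    if ra == 0 or rb == 0:
        raise ValueError("each variety needs at least one observation")
    n = ra + rb
    stat = 0.0
    for x, y in cols:
        ex, ey = ra * (x + y) / n, rb * (x + y) / n  # expected counts
        stat += (x - ex) ** 2 / ex + (y - ey) ** 2 / ey
    return stat, chi2_sf(stat, len(cols) - 1)

ALPHA = 0.01  # assumption inferred from the tables, not given in the text

stat, p = pearson_chi2([34, 6, 6], [12, 23, 9])       # C01 vs R01
verdict = "D" if p < ALPHA else "ND"
_, p_c02_r05 = pearson_chi2([9, 3, 34], [4, 8, 34])   # C02 vs R05
```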

(6) When the trait in (5) has only two states of expression, distinctness can be analyzed with Fisher's exact test, which is more accurate.

The results of Fisher's exact test are shown in Table 18.

Table 18

To be tested | Variety | Test | Trait | 1 | 2 | C01 | Result | C02 | Result
yes | C01 | 2018 | 1 | 34 | 6 |  |  | 0.2295 | ND
no | R01 | 2018 | 1 | 12 | 23 | 6E-06 | D | 0.0146 | ND
no | R02 | 2018 | 1 | 6 | 20 | 5E-07 | D | 0.0033 | D
no | R03 | 2018 | 1 | 1 | 18 | 3E-09 | D | 9E-05 | D
no | R04 | 2018 | 1 | 7 | 22 | 4E-07 | D | 0.0033 | D
yes | C02 | 2018 | 1 | 9 | 3 | 0.2295 | ND |  | 
no | R05 | 2018 | 1 | 4 | 8 | 0.0011 | D | 0.0436 | ND
no | R06 | 2018 | 1 | 1 | 11 | 2E-06 | D | 0.0013 | D
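The text does not say which form of Fisher's exact test produced Table 18. The sketch below therefore returns two quantities for a 2x2 table of code counts: the hypergeometric probability of the observed table, and the conventional two-sided p-value (the sum of the probabilities of all tables no more likely than the observed one). The 0.2295 (C01 vs C02) and 0.0436 (C02 vs R05) entries in Table 18 appear to coincide with the first quantity, so the two-sided value will generally be larger than the tabulated figure. The function name is hypothetical.

```python
from math import comb

def fisher_2x2(a, b, c, d):
    """Fisher's exact test for the table [[a, b], [c, d]].

    Returns (point_prob, two_sided_p): the probability of the observed table
    with all margins fixed, and the two-sided p-value.
    """
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)

    def prob(x):
        # P(top-left cell == x) under the hypergeometric distribution
        return comb(r1, x) * comb(r2, c1 - x) / denom

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_obs = prob(a)
    # small relative tolerance so ties are not lost to floating-point error
    two_sided = sum(p for p in map(prob, range(lo, hi + 1))
                    if p <= p_obs * (1 + 1e-9))
    return p_obs, min(1.0, two_sided)

point_c01_c02, p2_c01_c02 = fisher_2x2(34, 6, 9, 3)  # C01 vs C02
point_c02_r05, p2_c02_r05 = fisher_2x2(9, 3, 4, 8)   # C02 vs R05
```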

11. Uniformity analysis of discrete data

The uniformity of discrete data is analyzed with the UPOV off-type method. The display interface has been adapted in this method to make data recording and batch calculation easier; it is shown in Table 19.

Table 19

To be tested | Variety | Test | Trait | Total plants | Off-types | Off-type method P value | Uniformity
no | R01 | 2017 | 1 | 20 | 3 | 0.978991644 | NU
no | R02 | 2017 | 1 | 30 | 3 | 0.996682291 | NU
no | R03 | 2017 | 1 | 40 | 3 | 0.992502637 | NU
no | R04 | 2017 | 1 | 50 | 3 | 0.810798075 | U
no | R05 | 2017 | 1 | 60 | 3 | 0.73146611 | U
no | R06 | 2017 | 1 | 70 | 3 | 0.649236912 | U
no | R07 | 2017 | 1 | 80 | 6 | 0.966693618 | NU
no | R08 | 2017 | 1 | 90 | 6 | 0.946079612 | U
no | R09 | 2017 | 1 | 100 | 6 | 0.919162871 | U
no | R10 | 2017 | 1 | 110 | 6 | 0.886011935 | U
no | R11 | 2017 | 1 | 120 | 6 | 0.847081858 | U
yes | C01 | 2017 | 1 | 130 | 6 | 0.803137839 | U
yes | C02 | 2017 | 1 | 140 | 9 | 0.974122695 | NU
yes | C03 | 2017 | 1 | 150 | 9 | 0.962190432 | NU
yes | C04 | 2017 | 1 | 160 | 9 | 0.946973552 | U
yes | C05 | 2017 | 1 | 170 | 9 | 0.928236031 | U
yes | C06 | 2017 | 1 | 180 | 9 | 0.905863816 | U
yes | C07 | 2017 | 1 | 190 | 9 | 0.879872217 | U

The P value and the result are calculated from three values: population size, number of off-types, and population standard.
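The text names the three inputs but not the formula. One reading that reproduces the first row of Table 19 (20 plants, 3 off-types, P of about 0.979) is the cumulative binomial probability of seeing fewer off-types than were recorded, under a population standard of 3%, with the variety judged non-uniform (NU) when that probability exceeds 0.95. The 3% standard and the 0.95 cut-off are both inferences from the table, and the population standard is not listed there, so it is passed in as a parameter. The function names are hypothetical.

```python
from math import comb

def offtype_p(n, k, standard):
    """P(X <= k - 1) for X ~ Binomial(n, standard).

    n: total plants examined, k: off-types observed,
    standard: population standard (tolerated off-type proportion).
    """
    return sum(comb(n, i) * standard ** i * (1 - standard) ** (n - i)
               for i in range(k))

def is_uniform(n, k, standard, acceptance=0.95):
    """Assumed decision rule: uniform unless the P value exceeds `acceptance`."""
    return offtype_p(n, k, standard) <= acceptance

p_r01 = offtype_p(20, 3, 0.03)       # first row of Table 19
uniform_r01 = is_uniform(20, 3, 0.03)
uniform_r04 = is_uniform(50, 3, 0.03)
```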

12. Uniformity analysis of continuous data: relative variance method

One year of data can be analyzed for uniformity with the relative variance method. The data collection and analysis interface is shown in Table 20.

Table 20

To be tested | Variety | Test | Trait | Total plants | Standard deviation | Relative variance | Allowed relative variance | Uniformity
no | R01 | 2017 | 1 | 20 | 1.5 | 1.012269939 | 1.878311735 | U
no | R02 | 2017 | 1 | 20 | 2.2 | 1.484662577 | 1.878311735 | U
no | R03 | 2017 | 1 | 20 | 2.3 | 1.552147239 | 1.878311735 | U
no | R04 | 2017 | 1 | 20 | 1.4 | 0.944785276 | 1.878311735 | U
no | R05 | 2017 | 1 | 20 | 0.5 | 0.337423313 | 1.878311735 | U
no | R06 | 2017 | 1 | 20 | 1.4 | 0.944785276 | 1.878311735 | U
no | R07 | 2017 | 1 | 20 | 1.4 | 0.944785276 | 1.878311735 | U
no | R08 | 2017 | 1 | 20 | 1.4 | 0.944785276 | 1.878311735 | U
no | R09 | 2017 | 1 | 20 | 1.4 | 0.944785276 | 1.878311735 | U
no | R10 | 2017 | 1 | 20 | 1.4 | 0.944785276 | 1.878311735 | U
no | R11 | 2017 | 1 | 20 | 1.4 | 0.944785276 | 1.878311735 | U
yes | C01 | 2017 | 1 | 20 | 1.7 | 1.147239264 | 1.878311735 | U
yes | C02 | 2017 | 1 | 20 | 4.1 | 2.766871166 | 1.878311735 | NU
yes | C03 | 2017 | 1 | 20 | 1.53 | 1.032515337 | 1.878311735 | U
yes | C04 | 2017 | 1 | 20 | 1.47 | 0.99202454 | 1.878311735 | U
yes | C05 | 2017 | 1 | 20 | 1.41 | 0.951533742 | 1.878311735 | U
yes | C06 | 2017 | 1 | 20 | 3 | 2.024539877 | 1.878311735 | NU
yes | C07 | 2017 | 1 | 20 | 1.29 | 0.870552147 | 1.878311735 | U
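The figures in the "relative variance" column of Table 20 match each variety's standard deviation divided by the mean standard deviation of the eleven reference varieties R01 to R11 (for example 1.5 / (16.3 / 11) is about 1.0123). The sketch below reproduces that calculation. It is an inference from the table rather than a stated formula, and the allowed value 1.878311735 is simply copied from the table because its derivation is not given. The function names are hypothetical.

```python
def relative_sd(sd, reference_sds):
    """Ratio of a variety's SD to the mean SD of the reference varieties."""
    return sd / (sum(reference_sds) / len(reference_sds))

# Standard deviations of R01..R11 from Table 20
REFERENCE_SDS = [1.5, 2.2, 2.3, 1.4, 0.5, 1.4, 1.4, 1.4, 1.4, 1.4, 1.4]

# Threshold copied from Table 20; how it is derived is not stated in the text
ALLOWED = 1.878311735

def uniformity(sd, reference_sds=REFERENCE_SDS, allowed=ALLOWED):
    """'U' if the ratio is within the allowed value, otherwise 'NU'."""
    return "U" if relative_sd(sd, reference_sds) <= allowed else "NU"

ratio_r01 = relative_sd(1.5, REFERENCE_SDS)
verdict_c01 = uniformity(1.7)
verdict_c02 = uniformity(4.1)
```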

13. Uniformity analysis of continuous data: COYU method

Two years of data are analyzed with the COYU method. The data can be taken from the horizontal cross-test data table. The analysis interface is shown in Table 21.

Table 21

Claims

1. A plant variety DUS testing method comprising performing a plant planting test, collecting and processing test data in a unified format, and performing DUS analysis, the variety types in the plant planting test comprising a variety to be tested, a standard variety, and optionally an approximate variety; it is characterized in that the method comprises the steps of,

the step of collecting and processing test data in a unified format includes the step of converting character raw data into codes; the method comprises the following steps:

directly giving corresponding codes to the original data of each visual trait VG and VS, and carrying out plant variety DUS test;

for the raw data of each measured trait MG and MS, performing a frequency distribution analysis comprising: selecting the trait raw data of any one test or of any plurality of tests and carrying out an LSD analysis to obtain the LSD_0.05 value of the measured trait; calculating the average value of the raw data of the measured trait over all varieties; taking that average value as the central standard value and 2 times LSD_0.05 as the level difference, setting the standard values of all levels; taking the standard value of each level as the center, setting the interval formed by extending 1/2 of the level difference to each side as the grading interval of each code; taking the minimum value of each interval as the grading value; and counting the number and percentage of varieties in each grading interval;

determining standard values, grading values, and interval codes, comprising: judging from the statistical result whether the total interval coverage is smaller than 3 levels or larger than 9 levels; traits covering fewer than 3 levels are removed as unsuitable for DUS analysis of plant varieties; for traits covering more than 9 levels, adjusting the multiple of LSD_0.05 and adjusting the classification according to whether the percentages of the grading intervals are uniform, so that the classification range is within 9 levels; traits covering between 3 and 9 levels can be used, provided that 1-2 levels are left at each end for new varieties that may appear in the future; if the minimum value of the lowest grading interval is smaller than zero, setting that minimum value to 0, grading from small to large according to the above-determined multiple of LSD_0.05, and re-determining the standard values;

selecting a plant variety with the measured property actual measurement value at or near each level of standard value as a standard variety of a corresponding grading interval, and properly translating the standard value when the error is large to enable the standard variety actual measurement value to be close to the corresponding standard value, thereby finally forming a set of relatively fixed standard value and standard variety;

Coding the original data of the measurement characters MG and MS in the test based on the grading interval and the code to obtain an interval code, wherein the interval code can be directly used for plant variety DUS analysis or used for plant variety DUS analysis after the interval code is further optimized;

the collecting and processing test data in a unified format further comprises: preparing a unified form, setting a unified data format, a numerical range and optional numerical units, and checking validity of test data and statistical assumptions;

the checking of the validity of the test data comprises: performing data format, numerical range and/or optional numerical unit matching check, wherein the check is performed automatically on data during or after input by a program, if the input data does not accord with the set data format, numerical range and/or optional numerical unit, an abnormal value is indicated, if the abnormal value is found, the abnormal value is automatically identified, and an original record or a field sample is manually checked; if the input error is included, directly correcting; if the abnormal data belongs to objective facts, the abnormal data is reserved, and the following steps are continued;

The step of making the unified table comprises making a unified parameter table, wherein the parameter table at least comprises the following fields of parameters: code, standard value, expression state, standard variety, character number, character name and numerical type; the parameter table also includes one or more parameter fields selected from the following: expression type, observation time, number unit, rating value, maximum value, minimum value, code index, rating value index, group, weight, threshold value, and photograph;

the method comprises the steps of presetting character numbers, codes, expression states, standard varieties, character names, expression types, observation time, quantity units, numerical types, maxima and minima parameters according to DUS test guidelines; the standard value is determined according to the method; the threshold value and the weight are set according to experience; the code index, the grading value index and the grading value are automatically calculated by standard values and actual measurement values according to a preset formula;

wherein, a horizontal data table or a vertical data table is adopted to collect data;

the format of the horizontal data table is as follows: performing horizontal arrangement according to fields of to-be-detected, variety, test and character numbers, and continuously repeating the horizontal arrangement of the same character number when a plurality of single plant sample values are measured for the same character; the same variety under the same test is listed only once, and no repetition can occur;

The format of the vertical data table is as follows: the horizontal arrangement is carried out according to the fields of the sample numbers of each single plant of the to-be-detected, variety, test, character and the same character, and the character numbers are used as the data vertical arrangement;

the varieties marked by 'yes' under the fields to be tested in each table represent varieties which need to be tested and evaluated and need to give an analysis report, and other varieties are marked by 'no', and are not tested and evaluated varieties, and the analysis report is not needed to be given.

2. The plant variety DUS testing method of claim 1, wherein the checking of the validity of the test data further comprises checking the MS data of the plurality of samples collected in the test using a BoxPlot method and/or a 3σ method; if an abnormal value occurs, it is automatically identified and the original record or field sample is manually checked; if it is an input error, it is corrected directly; in the case of objective facts, if a few abnormal values of unexplained cause occur, they can be corrected by the program supplying the average of the preceding and following values; if there are many abnormal values, the data are left unprocessed and the uniformity is checked using a relative variance method or a COYU method.

3. The plant variety DUS testing method of claim 1 or 2, further comprising comparing the codes from a plurality of tests side by side, checking the codes using the code range (maximum value minus minimum value), applying different markings according to the range, manually checking the original data or retrieving photos to confirm the marked codes, i.e., codes that differ between tests, and manually modifying the codes as needed to obtain the integrated cross-test codes.

4. The plant variety DUS testing method of claim 3, wherein the code check is performed in a vertical cross-test data table whose format is: the average value, standard deviation, sample number, code, and expression state of each test are displayed side by side, and the average value, standard deviation, sample number, and average optimized code over all tests are calculated and displayed side by side, wherein the average optimized code is converted to an integer, namely the digits after the decimal point are either truncated or rounded; and the range of the codes across tests is calculated and different markings are applied according to the range.

5. The plant variety DUS testing method of claim 4, further comprising the step of transferring the obtained integrated codes into a variety library; if the variety and/or the corresponding trait already exists in the variety library, the existing result is overwritten; if not, the variety is added in the last row of the variety library and/or the trait is added in the last column, and the variety library is updated.

6. The plant variety DUS testing method of claim 5, further comprising the step of confirming or correcting the trait codes with photographs; the operation is carried out in a horizontal data table or the variety library: the mouse is clicked on a code-type trait, the program automatically sorts by code value, the photos of each variety corresponding to the trait are extracted in batches by means of the naming characters of the photos and placed at the corresponding positions in the next column, the photos are checked in sequence, whether any code is wrong is confirmed manually, and finally the codes to be reported are ensured to be consistent with the photos.