RU2559771C2

RU2559771C2 - Device for main division of modular numbers

Info

Publication number: RU2559771C2
Application number: RU2013148505/08A
Authority: RU
Inventors: Николай Иванович Червяков; Михаил Григорьевич Бабенко; Павел Алексеевич Ляхов; Ирина Николаевна Лавриненко
Priority date: 2013-10-30
Filing date: 2013-10-30
Publication date: 2015-08-10
Also published as: RU2013148505A

Abstract

FIELD: physics, computation hardware.

SUBSTANCE: invention relates to computer engineering and can be used in computer systems operating in the residue number system. The proposed device comprises a multiplier, a multiplexer, a comparison circuit, registers, a counter, a subtractor, memory, a control circuit, inhibit elements and switches.

EFFECT: higher speed of division, simplified design, enhanced operating performances.

1 dwg, 1 tbl

Description

The invention relates to modular computing systems and is intended for performing the main division of numbers represented in the residue number system (RNS).

In the RNS, an ordinary integer is represented by its remainders modulo a set of moduli. Arithmetic operations on numbers are replaced by operations on the residues. The operations are performed in parallel and without carries between digits, so addition, subtraction and multiplication can be implemented very quickly. Division, however, presents certain difficulties, which researchers try to ease by proposing new computing architectures and hardware implementations.
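The carry-free, digit-parallel arithmetic described above can be illustrated with a minimal Python sketch. The moduli (2, 3, 5, 7) are borrowed from Example 1 later in the description; the helper names `to_rns`, `rns_add` and `rns_mul` are hypothetical and chosen only for illustration.

```python
from math import prod

MODULI = (2, 3, 5, 7)   # pairwise coprime moduli (assumed, taken from Example 1)
P = prod(MODULI)        # dynamic range, 210

def to_rns(x, moduli=MODULI):
    """Return the least non-negative residues of x for each modulus."""
    return tuple(x % p for p in moduli)

def rns_add(a, b, moduli=MODULI):
    """Add two RNS numbers digit by digit; no carries cross between digits."""
    return tuple((x + y) % p for x, y, p in zip(a, b, moduli))

def rns_mul(a, b, moduli=MODULI):
    """Multiply two RNS numbers digit by digit."""
    return tuple((x * y) % p for x, y, p in zip(a, b, moduli))

a, b = to_rns(25), to_rns(6)
print(a)                               # (1, 1, 0, 4)
print(rns_add(a, b) == to_rns(31))     # True
print(rns_mul(a, b) == to_rns(150))    # True
```

Each digit is processed independently of the others, which is why these operations parallelize well, and why division, which needs magnitude information, does not.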

A known device is the "Neural network for dividing numbers represented in the residue number system" (patent RU 2318239, G06F, published 27.02.2008), which comprises a neural network for extending the tuple of the residue number system and finite-ring neural networks for summation and multiplication.

The disadvantages of this device are its low division speed and limited functionality, since one of the moduli of the residue number system (RNS) is chosen as the divisor.

The closest to the invention is the "Neural network for the main division of modular numbers" (patent RU 2400813, G06F 3/02, G06F 7/72, published 27.04.2010). Its disadvantage is the large amount of hardware. This known neural network is intended for dividing modular numbers when the divisor is either a positive integer coprime with $p_1, p_2, \ldots, p_n$, or a positive integer that is a product of numbers coprime with the $p_i$. To meet this condition, an approximate divisor has to be found using the generalized positional number system. Finding the approximate divisor requires additional hardware and time.

The technical result of the invention is extended functionality, expressed in the absence of restrictions on the dividend, the divisor and the choice of the set of RNS moduli, together with reduced hardware and a higher speed of the division operation.

This technical result is achieved in that the device for the main division of modular numbers comprises (Fig. 1):

input buses: bus 31 for signalling the start of the division process and buses 1 and 2 for supplying the dividend and the divisor, together with the quotient output bus 30;

multiplication circuit 7, which multiplies the divisor by the highest power of the approximating series of the quotient;

multiplexer 8, whose input receives either the divisor itself (bus 2) or the divisor multiplied by the highest power of the approximating series of the quotient (bus 32);

comparison circuit 11 (invention patent No. 2503992, published 10.01.2014), which compares the relative values of the dividend and the divisor arriving on buses 1 and 33;

dividend adder 37 and divisor adder 10, which sum the products of the constants of the chosen RNS and the digits of the dividend and the divisor, respectively, arriving on buses 26 and 25; in addition, dividend adder 37 receives on bus 39 the remainder left after subtracting the powers of the series terms from the dividend;

subtractor 14, which subtracts from the output of dividend adder 37 (bus 38) the corresponding values of the powers of the approximating series of the quotient arriving on bus 34;

shift register 9, which shifts the divisor left until a carry into the sign bit appears when the series of the quotient is being approximated, or right when the approximating series of the quotient, arriving on bus 23, is being refined;

register 36, which temporarily stores the results of subtracting (subtractor 14) the corresponding powers of the quotient series from the dividend; these results arrive on bus 41 when no inhibit signal is present on bus 27;

inhibit circuit 35, which blocks the result of subtractor 14 (bus 40) when that result is negative, that is, when the inhibit signal on bus 27 is active;

inhibit circuit 12, which passes the values of the corresponding powers, arriving on bus 28, if they belong to the refined series of the quotient;

bus 27, which delivers signals to the inhibit inputs of inhibit circuits 12 and 35 when the sign bit of subtractor 14 holds "1", that is, when the number is negative;

counter 4, which counts the clock pulses arriving on bus 17; their number corresponds to the number of shifts of shift register 9;

memory 5 (the number of memory cells is given by the expression $i = \log P$, where $P = p_1 p_2 \ldots p_n$), which stores the powers of two $q_i \bmod p_j$, $j = 1, 2, \ldots, n$, represented in the RNS; these are fed on bus 24 through switch 6 to the data input of inhibit circuit 12 on bus 28 and then to the input of adder 13 on bus 29, and are read out via address bus 21 after the "Read enable" signal arrives on bus 22 (counter 4 stops and "Read enable" is asserted at the moment a one is carried into the sign bit of shift register 9);

switch 6, which passes the powers of the series to the inputs of multiplication circuit 7 and inhibit circuit 12 when clock pulses arrive from bus 16;

control circuit 3, which generates the clock and synchronization pulses and the signals that reset the elements of the division device to "0" (buses 15, 16, 17, 18);

quotient adder 13, which sums the powers of the series, represented in the RNS, arriving on bus 29, or on bus 42 in the case $a = b$;

the output signals of comparison circuit 11, which are fed to the input of the control circuit: on bus 20 if $a > b$, and on bus 19 if $a < b$.
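To make the role of memory 5 more concrete, the following Python sketch builds a table of powers of two in residue form. It is only an illustration under stated assumptions: the moduli (2, 3, 5, 7) are those of Example 1, the text's "i = log P" is read here as "one cell for every power of two below P" (the exact rounding is not given in the description), and `power_table` is a hypothetical helper name.

```python
from math import prod

def power_table(moduli):
    """Return the residue tuples of 2**i for every power of two below P.

    Assumption: the cell count is P.bit_length(), i.e. all 2**i < P.
    """
    P = prod(moduli)
    cells = P.bit_length()
    return [tuple(pow(2, i, p) for p in moduli) for i in range(cells)]

table = power_table((2, 3, 5, 7))
print(len(table))    # 8 cells for P = 210
print(table[0])      # (1, 1, 1, 1)  -> 2**0
print(table[3])      # (0, 2, 3, 1)  -> 2**3 = 8
```

Addressing the table by the shift count from counter 4 would then yield the constant for the corresponding power of two without any run-time conversion.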

Known algorithms for dividing numbers represented in the RNS are based on the absolute values of the dividend and the divisor. The invention proposes to use their relative values instead.

Consider a modification of the algorithm for the main division of numbers represented in the residue number system for the case where both the dividend and the divisor are arbitrary integers and the divisor is not reduced to a number coprime with the RNS moduli. The modification is based on representing the dividend and the divisor as relative values.

Recently, considerable interest has been shown in the RNS, which offers a high degree of natural parallelism in arithmetic operations, high accuracy, reliability and robustness.

Specialized processors based on RNS arithmetic can play an important role in high-speed real-time data processing systems. Addition, subtraction and multiplication, called modular operations, can be implemented very quickly, without carry propagation between digits. The non-modular operations (division, comparison of numbers, sign determination and detection of range overflow) remain comparatively slow. Any speed-up of these slow algorithms substantially improves the performance of multi-modulus arithmetic logic units (ALUs). Division in the RNS is usually split into three categories: division with zero remainder, scaling, and general division. The general division problem in the RNS attracts many researchers working on high-performance multi-modulus ALUs. Known RNS division algorithms, based on conversion to the generalized positional number system, scaling, rounding, base extension and other operations, are slow and require a large number of arithmetic operations. Most known algorithms rely on comparing the dividend with the divisor or with its doubled value, and such comparisons are themselves difficult. Hence the structure of the computations used to compare modular numbers needs to be simplified. One way to simplify the comparison is an approach based on an approximate method of computing the positional characteristic of a modular number. This method makes it possible to implement correctly the main classes of decision procedures: testing two values for equality (inequality), comparing two values (greater, less) and others, which cover the main range of problems arising in hardware or software implementation of computations in the residue number system.

The approximate method for computing the positional characteristic of a modular number, and its use for dividing modular numbers, relies on the relative value of the analyzed number with respect to the full range defined by the Chinese remainder theorem. The theorem relates a positional number $a$ to its residue representation $(\alpha_1, \alpha_2, \ldots, \alpha_n)$, where the $\alpha_i$ are the least non-negative residues of the number modulo the RNS moduli $p_1, p_2, \ldots, p_n$, by the expression

$$a = \left| \sum_{i=1}^{n} P_i \left| P_i^{-1} \right|_{p_i} \alpha_i \right|_P , \qquad (1)$$

where $P = \prod_{i=1}^{n} p_i$, the $p_i$ are the RNS moduli, $\left| P_i^{-1} \right|_{p_i}$ is the multiplicative inverse of $P_i$ modulo $p_i$, and $P_i = P / p_i$.
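The Chinese remainder theorem reconstruction referred to above can be sketched in Python as follows. This is the textbook form of the theorem, written with the quantities named in the text ($P_i = P / p_i$ and its multiplicative inverse modulo $p_i$); the function name `crt_reconstruct` is hypothetical.

```python
from math import prod

def crt_reconstruct(residues, moduli):
    """Recover the positional value of an RNS number via the CRT."""
    P = prod(moduli)
    total = 0
    for alpha, p in zip(residues, moduli):
        P_i = P // p                  # product of all other moduli
        inv = pow(P_i, -1, p)         # multiplicative inverse of P_i mod p (Python 3.8+)
        total += P_i * inv * alpha
    return total % P                  # final reduction modulo the full range P

print(crt_reconstruct((1, 1, 0, 4), (2, 3, 5, 7)))   # 25
```

The final reduction modulo the large number $P$ is the expensive step that the approximate method of the description is designed to avoid.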

Dividing both sides of expression (1) by the constant $P$, which corresponds to the range of numbers, gives the approximate value

$$\left| \frac{a}{P} \right|_1 = \left| \sum_{i=1}^{n} \frac{\left| P_i^{-1} \right|_{p_i}}{p_i} \, \alpha_i \right|_1 = \left| \sum_{i=1}^{n} k_i \alpha_i \right|_1 , \qquad (2)$$

where $k_i = \left| P_i^{-1} \right|_{p_i} / p_i$ are constants of the chosen system and the $\alpha_i$ are the digits of the number represented in the RNS modulo $p_i$, $i = 1, 2, \ldots, n$; the value of the sum (2) lies in the interval $[0, 1)$. The final result is obtained by summing and then discarding the integer part of the number while keeping the fractional part of the sum. The fractional value $F(a) = \left| a / P \right|_1 \in [0, 1)$ carries information about both the magnitude of the number and its sign. If $F(a) \in [0, 1/2)$, the number $a$ is positive and $F(a)$ equals $a$ divided by $P$. Otherwise $a$ is negative, and $1 - F(a)$ gives the relative magnitude of $a$. The value of $F(a)$ rounded to $2^{-N}$ will be denoted $[F(a)]_{2^{-N}}$. The exact value of $F(a)$ is bounded by the inequalities

$$[F(a)]_{2^{-N}} \le F(a) < [F(a)]_{2^{-N}} + 2^{-N} .$$

The integer part of the sum of the terms $k_i \alpha_i$ is the rank of the number, a non-positional characteristic showing how many times the range $P$ of the system was exceeded in passing from the RNS representation of the number to its positional representation. If required, the rank can be determined directly while the terms are being summed. The fractional part can also be written as $A \bmod 1$, because $A = \lfloor A \rfloor + (A \bmod 1)$.

The number of digits in the fractional part is determined by the maximum possible difference between adjacent numbers. An exact comparison, which is widely used in division, requires computing a value equivalent to a conversion from the RNS to a positional number system. To compare the numbers $a$ and $b$ it suffices to know approximately the relative values $F(a)$ and $F(b)$ with respect to the dynamic range $[0, 1)$, which is quite simple to do while still correctly deciding among $a = b$, $a > b$ and $a < b$.
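The relationship between the sum of the terms $k_i \alpha_i$, the rank and the relative value can be checked with exact rational arithmetic. The sketch below assumes the constants have the standard form $k_i = |P_i^{-1}|_{p_i} / p_i$ and uses the moduli (2, 3, 5, 7) of Example 1; `rank_and_fraction` is a hypothetical helper name. A hardware implementation would use a finite number of fractional bits instead of exact fractions.

```python
from fractions import Fraction
from math import prod

def rank_and_fraction(residues, moduli):
    """Return (rank, F) where the sum of k_i * alpha_i equals rank + F, 0 <= F < 1."""
    P = prod(moduli)
    s = Fraction(0)
    for alpha, p in zip(residues, moduli):
        inv = pow(P // p, -1, p)            # |P_i^{-1}| modulo p_i
        s += Fraction(inv, p) * alpha       # k_i * alpha_i
    rank = s.numerator // s.denominator     # integer part: how many times P was exceeded
    return rank, s - rank                   # fractional part: relative value a / P

moduli = (2, 3, 5, 7)
rank_a, F_a = rank_and_fraction((1, 1, 0, 4), moduli)   # a = 25
rank_b, F_b = rank_and_fraction((0, 0, 0, 2), moduli)   # b = 30
print(rank_a, F_a)    # 3 5/42   (5/42 == 25/210)
print(rank_b, F_b)    # 1 1/7    (1/7  == 30/210)
print(F_a < F_b)      # True, so a < b
```

With exact fractions the relative value coincides with $a / P$, so comparing relative values gives the same answer as comparing the numbers themselves as long as both lie in $[0, P)$.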

Example 1. Let the moduli be $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, $p_4 = 7$, so that the range is $P = 2 \cdot 3 \cdot 5 \cdot 7 = 210$. Assume that only positive numbers are represented in this RNS. Determine the quantities

$$P_1 = 105, \quad P_2 = 70, \quad P_3 = 42, \quad P_4 = 30$$

and compare the two numbers $a = 25$ and $b = 30$ represented in the RNS with moduli $p_1, p_2, p_3, p_4$. In the RNS, $a = (1, 1, 0, 4)$ and $b = (0, 0, 0, 2)$. To apply the proposed algorithm, find the constants

$$k_1 = \frac{1}{2} = 0.5, \quad k_2 = \frac{1}{3} \approx 0.3333, \quad k_3 = \frac{3}{5} = 0.6, \quad k_4 = \frac{4}{7} \approx 0.5714 .$$

Next, using expression (2), find $F(a)$ and $F(b)$:

$$F(a) = \left| 0.5 \cdot 1 + 0.3333 \cdot 1 + 0.6 \cdot 0 + 0.5714 \cdot 4 \right|_1 = \left| 3.1189 \right|_1 = 0.1189 ,$$

$$F(b) = \left| 0.5 \cdot 0 + 0.3333 \cdot 0 + 0.6 \cdot 0 + 0.5714 \cdot 2 \right|_1 = \left| 1.1428 \right|_1 = 0.1428 .$$

Since $F(b) > F(a)$, that is, $0.1428 > 0.1189$, we have $b > a$, and indeed $30 > 25$.
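The figures 0.1189 and 0.1428 in Example 1 can be reproduced with integer fixed-point arithmetic. The sketch below assumes that the constants are truncated to four decimal digits, which is what yields exactly these values; the scale factor and the helper names `truncated_constants` and `approx_F` are illustrative assumptions, not part of the description.

```python
from math import prod

SCALE = 10**4   # four decimal digits (assumed precision)

def truncated_constants(moduli, scale=SCALE):
    """Constants k_i = |P_i^{-1}|_{p_i} / p_i, truncated and stored as integers."""
    P = prod(moduli)
    return [pow(P // p, -1, p) * scale // p for p in moduli]

def approx_F(residues, consts, scale=SCALE):
    """Fractional part of sum(k_i * alpha_i), as an integer number of 1/scale units."""
    return sum(k * a for k, a in zip(consts, residues)) % scale

moduli = (2, 3, 5, 7)
consts = truncated_constants(moduli)
print(consts)                           # [5000, 3333, 6000, 5714]
print(approx_F((1, 1, 0, 4), consts))   # 1189, i.e. F(25) ~ 0.1189
print(approx_F((0, 0, 0, 2), consts))   # 1428, i.e. F(30) ~ 0.1428
```

Discarding the integer part is a plain reduction modulo the scale, which in binary hardware amounts to dropping the overflow bits of the adder.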

The algorithm for integer division $\frac{a}{b}$ can be described by an iterative scheme carried out in two stages. In the first stage, the highest power $2^i$ is found in the approximation of the quotient by a binary series. In the second stage, the approximating series is refined. To obtain a range larger than $P$, one could choose $P' = P \cdot p_{n+1}$, that is, extend the RNS base with an additional modulus. To avoid this base extension, which is computationally expensive, one should compare not the dividend with the intermediate divisors but the result of the current iteration $(i)$ with that of the previous iteration $(i-1)$. This makes it possible to satisfy the condition $0 < b < P - 1$.

The division algorithm can be described by the following rules.

A rule $\varphi$ is constructed that assigns to each pair of positive integers $a$ and $b$ a positive number $q_i$, where $i$ is the iteration number. Division of $a$ by $b$ then proceeds as follows: by the operation $\varphi$, the pair $a$, $b$ is assigned a number $q_1 = q_i$ such that $a - b q_1 = r_1 \ge 0$, that is, $a \ge b q_1$. The values $2^i$ are taken as the $q_i$ and stored in memory as the constants $q_i = (2^i \bmod p_1, 2^i \bmod p_2, \ldots, 2^i \bmod p_n)$. Iteration $i+1$ does not depend on iteration $i$, which allows the iterations to be performed in parallel. Moreover, each iteration involves only two operations: multiplying the divisor by the constant $2^i$ and comparing the result with the dividend.

If $r_1 < b$, the division is finished; if $r_1 \ge b$, then by the rule $\varphi$ the pair $(r_1, b)$ is assigned $q_2$ such that $a - b q_2 = r_2 \ge 0$, that is, $a \ge b q_2$. If $r_2 < b$, the division ends; if $r_2 \ge b$, then by the rule $\varphi$ the pair $(r_2, b)$ is assigned $q_3$ such that $a - b q_3 = r_3 \ge 0$, and so on. Since repeated application of the operation $\varphi$ yields a decreasing sequence of integers $a > r_1 > r_2 > \ldots \ge 0$, the algorithm terminates in a finite number of steps. Suppose that at step $n$ the case $a < b q_n$ is detected, which signals the end of the division operation. The result is $a \cong (q_1 + q_2 + \ldots + q_n) b + r_n$, where the series $q_1 + q_2 + \ldots + q_n$ is an approximation of the quotient that may contain superfluous terms $q_i$. The approximating series obtained must then be refined.

The refinement starts with the highest term $q_n$. If $a > b q_n$, then $q_n$ is a term of the approximating series of the quotient. Next, $(q_n + q_{n-1})$ is taken: if $a > b (q_n + q_{n-1})$, then $q_{n-1}$ is included in the series; otherwise, if $a < b (q_n + q_{n-1})$, then $q_{n-1}$ is excluded from the series, and so on. After all the $q_i$ have been checked, the quotient is determined by the terms remaining in the series. The sought quotient is then the sum of these remaining terms.
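The two stages (search for the highest power of two, then refinement from the highest term downward) can be sketched on ordinary integers before any RNS machinery is involved. This is an illustrative reading of the rules above, not the patented circuit: the function name `divide_two_stage` is hypothetical, and the comparisons use "less than or equal" so that exact multiples are handled, whereas the text states strict inequalities and treats equality separately.

```python
def divide_two_stage(a, b):
    """Return (quotient, remainder) of a / b for a >= 0, b > 0."""
    if a < 0 or b <= 0:
        raise ValueError("expects a >= 0 and b > 0")

    # Stage 1: double the divisor until b * 2**i exceeds the dividend.
    i = 0
    while (b << i) <= a:
        i += 1

    # Stage 2: refine from the highest power down; keep 2**j only if
    # the divisor times the partial sum still does not exceed a.
    q = 0
    for j in range(i - 1, -1, -1):
        if b * (q + (1 << j)) <= a:
            q += 1 << j          # 2**j stays in the series
    return q, a - b * q

print(divide_two_stage(25, 6))     # (4, 1)
print(divide_two_stage(197, 13))   # (15, 2)
```

Note that the dividend `a` is never modified; only the multiple of the divisor changes from step to step, which is the property the description relies on when moving the comparison into the RNS.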

This algorithm is easily adapted to modular form, with the absolute values of the quantities replaced by their relative values. The structure of the algorithm is based on comparing relative values, which is done using subtraction. However, some operations in the residue number system, such as comparison of numbers and division, are quite complex. To reduce the time complexity in modular form, the absolute values of the dividend and the divisor are replaced by their values relative to the full range of the residue number system.

Known division algorithms determine the quotient by the iteration $A' = A - QD$, where $A$ and $A'$ are the current and the next dividend, respectively, $D$ is the divisor, and $Q$ is the quotient, which is generated at each iteration from the full range of the RNS rather than chosen from a small set of constants. In the proposed algorithm the quotient is determined by the iteration $r_i = A - b \cdot 2^i$, where $A$ is the dividend, $b$ is the divisor, and $2^i$ is a term of the approximating series of the quotient.

A comparison of the algorithms shows that in the proposed one the dividend stays the same in all iterations and only the divisor is multiplied by a constant, which substantially reduces the computational complexity. The algorithm above is easily adapted to the residue number system by using the approximate method of comparing modular numbers. In the iterative division process in a positional number system, the dividend is compared with doubled divisors, or with the sum of the terms of the series, both to find the highest power of the series approximating the quotient and to refine that series. Applying this principle directly in the RNS can lead to a division error: when the dynamic range overflows, the restored number falls outside the working range and wraps around to a value that appears smaller than the dividend, whereas in reality it exceeds the range P. For example, if the RNS moduli are p₁ = 2, p₂ = 3, p₃ = 5, p₄ = 7, then the range is P = 2·3·5·7 = 210. Suppose that the restored number is A = 220. In the RNS, A = 220 = (0, 1, 0, 3). The range P is exceeded by 10, which in the RNS is also (0, 1, 0, 3). When relative values are used, the number A = 220 is therefore expressed as A′ = 10, which does not correspond to the true value.
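The wrap-around in this example can be checked with a few lines of Python. This is only a sketch; the helper name `to_rns` is an assumption introduced for illustration.

```python
MODULI = (2, 3, 5, 7)  # moduli from the example

def to_rns(x: int, moduli=MODULI) -> tuple:
    """Residues of x with respect to each modulus."""
    return tuple(x % p for p in moduli)

# Dynamic range P is the product of the moduli.
P = 1
for p in MODULI:
    P *= p

overflowed = to_rns(220)      # a value that exceeds the range P
wrapped = to_rns(220 % P)     # the value it is indistinguishable from
```

Both tuples are identical, which is why a value above P cannot be told apart from its remainder modulo P by looking at the residues alone.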

To overcome this difficulty, the results of the current iteration must be compared in the RNS with those of the previous iteration, which makes it possible to determine correctly which number is larger. Thus the overflow of the dynamic range in the RNS can itself be used to make the “greater or less” decision. At the first iteration the dividend is compared with the divisor; at the remaining iterations the doubled values of the divisor are compared, q_i·b < q_{i+1}·b. At each new iteration the current value is compared with the previous one. The number of iterations required depends on the values of the dividend and the divisor. Applying this operation repeatedly produces the sequence of integers b·q₁ < b·q₂ < … < b·q_n > b·q_{n+1}, so the algorithm terminates after a finite number of iterations. Suppose that at iteration n+1 the case b·q_n > b·q_{n+1} is detected. This corresponds to an overflow of the RNS range, that is, b·q_{n+1} > P and a < b·q_{n+1}. This completes the interpolation of the quotient by a binary series, that is, by a set of constants in the RNS. Thus the quotient can be approximated by comparing only adjacent doubled values of the divisor.

Let us consider the division algorithm using an example. The operations are performed both on decimal numbers and on numbers represented in the residue number system.

Example 2. Find the quotient of dividing the number a = 201 by the number b = 8. We choose an RNS with the moduli 2, 3, 5, 7; then P = p₁p₂p₃p₄ = 210. The constants k_i are, respectively, k₁ = 0.5; k₂ = 0.3333; k₃ = 0.6; k₄ = 0.5714.
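The quoted constants can be reproduced under the assumption that k_i is the modular inverse of P/p_i modulo p_i, divided by p_i. This definition is not stated in this part of the text; it is the usual choice in the approximate comparison method and is adopted here only because it matches the four values above. The function name `rns_constants` is hypothetical.

```python
from math import prod

def rns_constants(moduli):
    """k_i = |(P / p_i)^(-1)|_{p_i} / p_i  (assumed definition, see lead-in)."""
    P = prod(moduli)
    constants = []
    for p in moduli:
        P_i = P // p
        inverse = pow(P_i, -1, p)  # modular inverse, Python 3.8+
        constants.append(inverse / p)
    return constants

k = [round(value, 4) for value in rns_constants((2, 3, 5, 7))]
```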

We represent the numbers a and b in the RNS.

a₁₀ = 201 → (1, 0, 1, 5)_RNS,

b₁₀ = 8 → (0, 2, 3, 1)_RNS.

The relative values of these numbers are, respectively, F(a) ≈ 0.957 and F(b) ≈ 0.038.
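The relative values 0.957 and 0.038 used throughout the example can be reproduced as the sum of k_i times the residues, taken modulo 1. The sketch below uses the rounded constants quoted in Example 2; the function name `relative_value` is an assumption made for illustration.

```python
K = (0.5, 0.3333, 0.6, 0.5714)  # rounded constants k_i quoted in Example 2

def relative_value(residues, k=K):
    """Fractional part of sum(k_i * residue_i): the approximate relative value."""
    return sum(ki * r for ki, r in zip(k, residues)) % 1.0

F_a = relative_value((1, 0, 1, 5))  # a = 201
F_b = relative_value((0, 2, 3, 1))  # b = 8
```

Because the constants are rounded, the results agree with 201/210 and 8/210 only to about three decimal places.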

Solution. The division of a by b is carried out according to the following algorithm. To interpolate the quotient, we determine the powers 2ⁱ, represented in the RNS.

I. Search for the highest power when approximating the quotient by a binary series.

1. At the first iteration we compare the relative values of the dividend a and the divisor b. If a < b, nothing is written to memory, since in this case the divisor is larger than the dividend; the division process ends and the quotient is 0. If a = b, the constant q₁ = 2⁰ is written to memory, and if a > b, the iterative division process is carried out.

Assume that the dividend a and the divisor b have the following values:

a₁₀ = 201 → (1, 0, 1, 5)_RNS, b₁₀ = 8 → (0, 2, 3, 1)_RNS.

We find the quotient of dividing a by b.

The relative values are 0.957 for a and 0.038 for b; since 0.957 > 0.038, we have a > b.

We write to memory the constant in binary code, q₁ = 2⁰, and in the RNS, q₁ = (1, 1, 1, 1).

2. In all the remaining iterations we compare the current values with the previous ones. At the second iteration we multiply the divisor by 2¹, that is, q₂ = 2; if the result is greater than the previous value, we write 2¹ to memory.

Then

b₁ = 8·2 = 16, b₁ = 16 → (0, 2, 3, 1)·(0, 2, 2, 2) = (0, 1, 1, 2).

The comparison gives the following result: since 0.038 < 0.0761, we have b < b₁.

We write to memory the number in binary code, q₂ = 2¹, and in the RNS, q₂ = (0, 2, 2, 2).

3. The comparison results at the following iterations are obtained in the same way.

At the third iteration we multiply the divisor b by q₃ = 2²; if the result is greater than the previous value, we write 2² to memory.

Here b₂ = 8·4 = 32, b₂ = 32 → (0, 2, 3, 1)·(0, 1, 4, 4) = (0, 2, 2, 4).

Then b₁ < b₂, since 0.0761 < 0.1522.

We write to memory the number in binary code, q₃ = 2², and in the RNS, q₃ = (0, 1, 4, 4).

4. At the fourth iteration we obtain

b₃ = 8·8 = 64; b₃ = 64 → (0, 2, 3, 1)·(0, 2, 3, 1) = (0, 1, 4, 1).

Then b₂ < b₃, since 0.1522 < 0.3047.

We write to memory the number in binary code, q₄ = 2³, and in the RNS, q₄ = (0, 2, 3, 1).

5. At the fifth iteration we obtain

b₄ = 8·16 = 128, b₄ = 128 → (0, 2, 3, 1)·(0, 1, 1, 2) = (0, 2, 3, 2).

Then b₃ < b₄, since 0.3047 < 0.6094.

We write to memory the number in binary code, q₅ = 2⁴, and in the RNS, q₅ = (0, 1, 1, 2).

6. At the sixth iteration we obtain

b₅ = 8·32 = 256, b₅ = 256 → (0, 2, 3, 1)·(0, 2, 2, 4) = (0, 1, 1, 4).

Then the relative value has decreased, since 0.6094 > 0.2189.

Thus an overflow has occurred, and nothing is written to memory. The formation of the series approximating the quotient is complete.

Thus the first inequality, 0.957 > 0.038, determines the start of the iterative division process, and the sequence of subsequent inequalities 0.0761 > 0.038; 0.1522 > 0.0761; 0.3047 > 0.1522; 0.6094 > 0.3047, or 0.038 < 0.0761 < 0.1522 < 0.3047 < 0.6094 > 0.2189, determines the number of iterations. The increasing sequence ends at the sixth step. This signals the end of the iterations, since the divisor obtained by successive multiplication by 2 has exceeded the dividend.
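Stage I above can be sketched in Python as follows. The sketch uses exact rational arithmetic (`fractions.Fraction`) instead of the rounded decimal constants, assumes the usual definition of k_i (the modular inverse of P/p_i, divided by p_i), and returns the exponents i of the stored powers 2ⁱ. All function names are hypothetical.

```python
from fractions import Fraction
from math import prod

MODULI = (2, 3, 5, 7)          # moduli of Example 2
P = prod(MODULI)               # dynamic range, 210
# Assumed definition of k_i: modular inverse of P/p_i, divided by p_i.
K = [Fraction(pow(P // p, -1, p), p) for p in MODULI]

def to_rns(x):
    return tuple(x % p for p in MODULI)

def rns_mul(x, y):
    return tuple((u * v) % p for u, v, p in zip(x, y, MODULI))

def rel(x):
    """Relative value F(x): sum of k_i * x_i taken modulo 1."""
    return sum(k * r for k, r in zip(K, x)) % 1

def search_powers(a, b):
    """Stage I: exponents i of the powers 2**i written to memory."""
    fa, prev = rel(a), rel(b)
    if prev == 0:
        raise ZeroDivisionError("divisor is zero")
    if fa < prev:
        return []              # divisor larger than dividend: quotient is 0
    stored = [0]               # q1 = 2**0
    if fa == prev:
        return stored
    i = 1
    while True:
        cur = rel(rns_mul(b, to_rns(1 << i)))
        if cur < prev:         # relative value dropped: range P overflowed
            break
        stored.append(i)
        prev = cur
        i += 1
    return stored

powers = search_powers(to_rns(201), to_rns(8))
```

For a = 201 and b = 8 the loop stops at the sixth comparison, when the relative value of 8·32 = 256 wraps around to 46/210.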

II. The refinement of the series approximating the quotient of a divided by b starts with the highest term q_n.

1. We take the highest power from memory, that is, q₅ = (0, 1, 1, 2), multiply it by the divisor b = (0, 2, 3, 1), and compare the result with a. The product is (0, 2, 3, 2), whose relative value is 0.6094.

Since 0.957 > 0.6094, we take 2⁴ as the highest power, or (0, 1, 1, 2) in the RNS.

2. We take the power 2³ from memory and calculate (2⁴ + 2³)·8 = 192, or in the RNS ((0, 1, 1, 2) + (0, 2, 3, 1))·(0, 2, 3, 1) = (0, 0, 2, 3). The relative value of this product is 0.9142.

Since 0.9142 > 0.6094 and 0.957 > 0.9142, we take 2³ as the next term of the series, or (0, 2, 3, 1) in the RNS.

3. We take 2² from memory and calculate (2⁴ + 2³ + 2²)·8 = 224. Then ((0, 1, 1, 2) + (0, 2, 3, 1) + (0, 1, 4, 4))·(0, 2, 3, 1) = (0, 2, 4, 0), whose relative value is 0.0666.

Since 0.0666 < 0.9142, the range P has overflowed, and the power 2², or (0, 1, 4, 4) in the RNS, is excluded from the approximating series.

4. We take 2¹ from memory and calculate (2⁴ + 2³ + 2¹)·8 = 208. Then ((0, 1, 1, 2) + (0, 2, 3, 1) + (0, 2, 2, 2))·(0, 2, 3, 1) = (0, 1, 3, 5), whose relative value is 0.9903.

Since 0.957 < 0.9903, the power 2¹, or (0, 2, 2, 2), is excluded from the approximating series.

5. We take 2⁰ from memory and calculate (2⁴ + 2³ + 2⁰)·8 = 200. Then ((0, 1, 1, 2) + (0, 2, 3, 1) + (1, 1, 1, 1))·(0, 2, 3, 1) = (0, 2, 0, 4), whose relative value is 0.9522.

Since 0.957 > 0.9522, we take 2⁰ as the lowest power, or (1, 1, 1, 1) in the RNS. Consequently, the quotient is (0, 1, 1, 2) + (0, 2, 3, 1) + (1, 1, 1, 1) = (1, 1, 0, 4).
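The refinement steps of stage II can be sketched as follows. A candidate power is kept only if the relative value of the trial product grows compared with the last accepted product (no overflow of P) and does not exceed the relative value of the dividend. The sketch again uses exact fractions and the assumed definition of k_i; the function names are hypothetical, and the list of stored exponents is taken from stage I.

```python
from fractions import Fraction
from math import prod

MODULI = (2, 3, 5, 7)
P = prod(MODULI)
# Assumed definition of k_i: modular inverse of P/p_i, divided by p_i.
K = [Fraction(pow(P // p, -1, p), p) for p in MODULI]

def to_rns(x):
    return tuple(x % p for p in MODULI)

def rns_add(x, y):
    return tuple((u + v) % p for u, v, p in zip(x, y, MODULI))

def rns_mul(x, y):
    return tuple((u * v) % p for u, v, p in zip(x, y, MODULI))

def rel(x):
    """Relative value F(x): sum of k_i * x_i taken modulo 1."""
    return sum(k * r for k, r in zip(K, x)) % 1

def refine(a, b, stored):
    """Stage II: sum the stored powers 2**i that pass both comparisons."""
    fa = rel(a)
    q = (0,) * len(MODULI)     # quotient accumulated so far, in RNS form
    prev = Fraction(0)         # relative value of the last accepted product
    for i in reversed(stored):
        candidate = rns_add(q, to_rns(1 << i))
        cur = rel(rns_mul(candidate, b))
        # Accept only if there was no overflow (cur > prev) and cur <= F(a).
        if prev < cur <= fa:
            q, prev = candidate, cur
    return q

quotient = refine(to_rns(201), to_rns(8), [0, 1, 2, 3, 4])
```

For the example, the powers 2⁴, 2³ and 2⁰ are accepted and 2² and 2¹ are rejected, as in steps 1 to 5.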

To determine the quotient, the remaining terms of the approximating series must be added. In the example, the following terms remain: (0, 1, 1, 2), (0, 2, 3, 1) and (1, 1, 1, 1). The quotient is then obtained by summing them: (0, 1, 1, 2) + (0, 2, 3, 1) + (1, 1, 1, 1) = (1, 1, 0, 4).

From the residues of the quotient we restore the positional number using expression (1), which gives 25.

Indeed, the integer part of 201/8 = 25.125 is 25.

The results in the RNS and in the positional number system coincide, which confirms that the division was carried out correctly.
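The restoration of the positional number from the residues (1, 1, 0, 4) can be checked with the standard Chinese remainder theorem reconstruction, which is assumed here to be equivalent to expression (1). The function name `from_rns` is hypothetical.

```python
from math import prod

def from_rns(residues, moduli):
    """Chinese remainder theorem reconstruction of x from its residues."""
    P = prod(moduli)
    total = 0
    for r, p in zip(residues, moduli):
        P_i = P // p
        total += r * P_i * pow(P_i, -1, p)  # modular inverse, Python 3.8+
    return total % P

restored = from_rns((1, 1, 0, 4), (2, 3, 5, 7))
```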

A new algorithm for the basic division of modular numbers.

The improved algorithm for dividing modular numbers, based on the approximate method of comparing numbers, consists of the following steps.

1. We calculate the approximate values of the dividend, F(a), and of the divisor, F(b), and compare them. If F(a) < F(b), the division process ends and the quotient is 0. If F(a) = F(b), the division process ends and the quotient is 1. If F(a) > F(b), we search for the highest power 2^k in the approximation of the quotient by a binary code.

2. We shift F(b) to the left until the most significant digit is carried into the sign digit. The number of shifts determines the highest power, which is recorded by the pulse counter.

3. From memory 5 we select the constant 2^k (the highest power of the series), multiply the divisor by it, F₁(b) = b·2^k, and feed the result to the input of the comparison circuit.

4. We find Δ_i = F(a) − F₁(b). If the sign digit of Δ_i is “1”, the corresponding power of the series is discarded; if it is “0”, the value of the term with this power, that is, 2^k, is added to quotient adder 13.

5. We shift F₁(b) to the “right” and check the term of the series with the power 2^(k−1).

6. We find Δ₂ = Δ₁ − F₁(b) and proceed as in step 4.

7. We check all the remaining terms of the series in the same way, down to the zero power. The resulting remainder is Δ_i = Δ_{i−1} − F_{i−1}(b) ≈ 0. When the divisor is equal to a few units, the threshold for Δ_i can be taken greater than zero, which reduces the number of iterations when a large dividend is divided by a small divisor. During these transformations we sum all the accepted terms of the series.

The shift-and-subtract process ends with the check of the zero power of the series. Only the accepted terms of the series approximating the quotient are summed in the accumulating adder. After the last step the device is reset to its initial state.
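The seven steps above can be sketched in Python under several simplifying assumptions: exact rational arithmetic replaces the truncated binary fractions of the device, the constants k_i are taken as the modular inverse of P/p_i divided by p_i, and the optional non-zero threshold of step 7 is not modelled. The function name `divide_rns` and its signature are hypothetical.

```python
from fractions import Fraction
from math import prod

def divide_rns(a, b, moduli):
    """Shift-and-subtract division on relative values; returns the quotient in RNS form."""
    P = prod(moduli)
    # Assumed definition of k_i: modular inverse of P/p_i, divided by p_i.
    K = [Fraction(pow(P // p, -1, p), p) for p in moduli]

    def rel(x):
        return sum(k * r for k, r in zip(K, x)) % 1

    fa, fb = rel(a), rel(b)
    if fb == 0:
        raise ZeroDivisionError("divisor is zero")
    zero = tuple(0 for _ in moduli)
    if fa < fb:                       # step 1: quotient is 0
        return zero
    if fa == fb:                      # step 1: quotient is 1
        return tuple(1 % p for p in moduli)
    # Step 2: shift F(b) left until a carry into the sign (integer) digit appears.
    shifts, shifted = 0, fb
    while shifted < 1:
        shifted *= 2
        shifts += 1
    k = shifts - 1                    # highest power 2**k of the series
    # Step 3: F1(b) = F(b * 2**k), with the product formed residue by residue.
    f1 = rel(tuple((r * (1 << k)) % p for r, p in zip(b, moduli)))
    delta, q = fa, zero
    for j in range(k, -1, -1):        # steps 4 to 7
        trial = delta - f1
        if trial >= 0:                # sign digit "0": accept the term 2**j
            delta = trial
            q = tuple((r + (1 << j)) % p for r, p in zip(q, moduli))
        f1 /= 2                       # shift F1(b) one bit to the right
    return q

Q = divide_rns((1, 0, 1, 5), (0, 2, 3, 1), (2, 3, 5, 7))
```

With exact fractions the loop behaves like ordinary restoring division, so the result equals the integer quotient whenever the dividend and divisor lie inside the range P.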

The new simplified algorithm for dividing modular numbers is based on the approximate comparison method, which considerably reduces its computational complexity. It requires only one approximate comparison, one multiplication, and as many shift and subtraction operations as there are steps. Known algorithms, by contrast, use comparison, multiplication and addition operations at every iteration. On this basis the proposed algorithm can be regarded as the best currently available way of performing division, which simplifies the implementation of modular hardware logic circuits as a whole.

Example 3. Let us consider the operation of the division device using a specific example. We use the dividend and divisor from Example 2, where a = 201 = (1, 0, 1, 5) and b = 8 = (0, 2, 3, 1), and find the quotient of dividing a by b.

1. The dividend a = (1, 0, 1, 5) is fed to input 1 and the divisor b = (0, 2, 3, 1) to input 2. From the LUT tables of comparison circuit 11 we select the values k_i·α_i and k_i·β_i, which the comparison circuit uses to compare the modular numbers and also outputs for summation modulo 1 in dividend adder 37 and divisor adder 10. These operations are carried out in parallel in the comparison circuit and in adders 37 and 10. Since F(a) = 0.957 and F(b) = 0.038, comparison circuit 11 generates the signal a > b, and the adders form the values in binary form, F(a) = 0.11110 and F(b) = 0.00001, respectively.

2. In shift register 9 we shift F(b) left by 5 bits, giving F(b)_c = 1.00000; counter 4 takes the state corresponding to the highest power of the series, 2⁴.

3. We multiply the divisor b = 8 = (0, 2, 3, 1) by 2⁴ = 16 = (0, 1, 1, 2); then b·2⁴ = (0, 2, 3, 1)·(0, 1, 1, 2) = (0, 2, 3, 2).

4. We calculate F(b·2⁴) ≈ 0.6094.

5. We calculate F(a) − F(b·2⁴) ≈ 0.957 − 0.6094 = 0.3476. The difference is positive, so the sign digit is “0”. This allows the term of the series (0, 1, 1, 2) to pass through inhibit circuit 12 to the input of quotient adder 13, and allows the difference F(a) − F(b·2⁴) ≈ 0.3476 to pass through inhibit circuit 35 to the input of register 36, which stores the remainder left after the terms of the quotient series are subtracted from the dividend; from there it goes to the second input of dividend adder 37. The value of F(b·2⁴) is written to shift register 9, which is switched to the “right” shift mode.

6. In shift register 9 we shift F(b·2⁴) one bit to the “right”, that is, we halve the number, obtaining F_c(b·2⁴) ≈ 0.3047. The result is fed to the first inputs of subtractor 14, while the difference F(a) − F(b·2⁴) ≈ 0.3476 arrives at its second inputs. At the output of subtractor 14 we obtain F(a) − F(b·2⁴) − F(b·2³) ≈ 0.3476 − 0.3047 = 0.0429. Since this number is positive, the circuit behaves as in step 5: in quotient adder 13 the contents (0, 1, 1, 2) are added to 2³ = 8 = (0, 2, 3, 1), that is, (0, 1, 1, 2) + (0, 2, 3, 1) = (0, 0, 4, 3), and the new value 0.0429 is written to dividend adder 37.

7. We shift the contents of shift register 9, F(b·2³), to the right and obtain F(b·2²). The first input of the subtractor then receives 0.1524 and the second 0.0429. The result of the subtraction is negative, so inhibit circuits 12 and 35 block the data and nothing passes to the input of storage register 36. The contents of quotient adder 13 and of storage register 36 are retained, and the power 2² is discarded.

The contents of shift register 9 are then shifted again, giving F(b·2¹); the first input of the subtractor receives 0.0762 and the second 0.0429. The subtraction proceeds as for the power 2², so the power 2¹ is also discarded. A further shift gives F(b·2⁰). The first input of the subtractor then receives 0.0381 and the second 0.0429. The result of the subtraction is positive, and (1, 1, 1, 1) is added to the contents (0, 0, 4, 3) of quotient adder 13, that is, (0, 0, 4, 3) + (1, 1, 1, 1) = (1, 1, 0, 4).

We have obtained the quotient Q = (1, 1, 0, 4) of dividing the number a = (1, 0, 1, 5) by the number b = (0, 2, 3, 1).

Table 1 gives the initial and intermediate values of the algorithm. The algorithm uses the binary values of F(a) and F(b), rounded to the t-th bit, where t is the position of the first significant bit of F(b). The computational complexity of the new division algorithm is also presented in Table 1. A comparative analysis of the new algorithm against the modified ones showed the advantages of the former. If all operations are reduced to elementary actions such as bitwise addition and shift, the gain of the new algorithm in computational complexity is approximately equal to 5.
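The binary values and the shift count of Example 3 can be reproduced with a small fixed-point sketch. It assumes that the rounding to the t-th bit is a truncation and that the shift register has one sign digit above t fractional bits; the names `to_fixed` and `count_shifts` are hypothetical.

```python
def to_fixed(x: float, t: int) -> int:
    """Truncate a fraction 0 <= x < 1 to t fractional bits (assumed rounding mode)."""
    return int(x * (1 << t))

def count_shifts(frac_bits: int, t: int) -> int:
    """Left shifts needed until a 1 is carried into the sign digit (bit t)."""
    if frac_bits <= 0:
        raise ValueError("the fraction must be non-zero")
    shifts = 0
    while not (frac_bits >> t) & 1:
        frac_bits <<= 1
        shifts += 1
    return shifts

T = 5                                   # first significant bit of F(b) = 0.038
fa_bits = format(to_fixed(0.957, T), "05b")
fb_bits = format(to_fixed(0.038, T), "05b")
shifts = count_shifts(to_fixed(0.038, T), T)
highest_power = shifts - 1              # 5 shifts correspond to the power 2**4
```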

The drawing shows the circuit for dividing modular numbers. The principle of operation of the invention is described below. The device for dividing modular numbers performs division for arbitrary values of the dividend and the divisor without any additional preliminary operations other than those that set the device to its initial state.

Mode of approximating the quotient by a binary series

In the initial state, control circuit 3 issues the “set to 0” signal on bus 18. This signal sets shift register 9, counter 4, divisor adder 10, quotient adder 13 and subtractor 14 to the initial state, and multiplexer 8 (the signal arrives at its address input) connects bus 2 to the input of comparison circuit 11 via bus 33.

The dividend and the divisor, represented in the residue number system with the moduli p₁, p₂, …, p_n, arrive on buses 1 and 33, respectively, at the input of comparison circuit 11, which compares their relative values. If a < b, the comparison circuit generates a signal on bus 19, which goes to the input of control circuit 3, and the device is set to the initial state. The quotient (bus 30) is zero. If a = b, comparison circuit 11 issues the signal that the dividend and the divisor are equal on bus 42 to the input of quotient adder 13, into which the constant “1” is written. If a > b, comparison circuit 11 generates a signal on bus 20, under which control circuit 3 switches the device to the mode of approximating the quotient series. The values k_i·β_i and k_i·α_i from the outputs of the LUT tables of comparison circuit 11 arrive on buses 25 and 26, respectively, at the inputs of divisor adder 10 and dividend adder 37. The relative value of the dividend, in two's complement code, leaves dividend adder 37 and arrives on bus 38 at the input of subtractor 14.

The relative value of the divisor, taken from the output of divisor adder 10, arrives on bus 23 at the input of shift register 9. Under the clock pulses of control circuit 3 (bus 17), the contents of shift register 9 are shifted and counter 4 counts the pulses. As soon as the most significant digit of the divisor reaches the sign position, that is, an overflow occurs, this “1” on bus 22 stops counter 4 and activates the “read enable” signal for memory 5. Counter 4 holds the state that defines the highest power 2^k of the quotient series. At this time register 36, which stores the remainder left after the terms of the quotient series are subtracted from the dividend, holds “0” in all digits and has no effect on dividend adder 37. This ends the mode of interpolating the quotient. Thus interpolating the quotient required one comparison iteration, one summation operation and one shift operation by k bits.

A shift operation is equivalent to one iteration of the known algorithm, each of which includes doubling the divisor and comparing the result of the doubling with the value of the divisor. Replacing absolute values with relative values gives a k-fold gain in interpolating the quotient compared with the algorithms considered above, where k is the number of shifts before the overflow appears.

Mode for refining the approximating series of the quotient of dividing a by b.

Control circuit 3 generates signals on buses 15 and 17, which are applied to the address inputs of multiplexer 8; it switches the divisor multiplied by the highest power of the quotient series (bus 32) to the output of multiplexer 8 and sets shift register 9 to shift-right mode. Switch 6, driven by the two signals arriving at its inputs over buses 16 and 24, passes the value of the highest power of the series (bus 28) to the inputs of multiplication circuit 7 and inhibit circuit 12. The data arriving at the input of comparison circuit 11 are compared. If

then comparison circuit 11 issues a signal on bus 19 and returns the device to its initial state. If

then the a = b signal is generated and writes a "one" into the adder over bus 42, and the device returns to its initial state. If

then the signal on bus 20 puts the device into the mode for refining the approximation series of the quotient.

The values of the divisor multiplied by the highest power and of the dividend are fed, as in the interpolation of the series, to the inputs of divisor adder 10 and dividend adder 37. The contents of divisor adder 10 are passed over bus 23 to the input of shift register 9, where they overwrite the old contents, and then over bus 34 to the input of subtractor 14. The contents of dividend adder 37, in two's complement form, are applied to the second inputs of subtractor 14, where the divisor multiplied by the highest power is subtracted from the dividend. As before, the dividend and divisor are represented by their relative values.

If the sign bit of the result in subtractor 14 is "0", i.e. the dividend is greater than the divisor, then "zeros" appear on the inhibiting inputs of inhibit circuits 12 and 35. The highest power of the quotient on bus 28 passes through inhibit circuit 12 to the input of quotient adder 13 over bus 29. The result of subtractor 14, i.e. the remainder of the dividend attributable to the remaining powers, passes over bus 40 through inhibit circuit 35 to the input of register 36, which stores the remainder left after subtracting the terms of the quotient series from the dividend, and then over bus 39 to the input of dividend adder 37, where the old contents are cleared and the new value is written. The contents of shift register 9 are then shifted right and the process repeats as described above.

If the sign bit of the result in subtractor 14 is "1", i.e. the relative value of the divisor is greater than the dividend, then this one, applied to the inhibiting inputs of inhibit circuits 12 and 35, blocks the corresponding power from reaching quotient adder 13 and blocks the result of subtractor 14 from reaching the input of register 36, so the register keeps its previous value. Thus only those refined powers that are terms of the quotient series reach the input of quotient adder 13.

The conversion process ends after the power 2^0 has been analysed. Thus, during refinement the superfluous terms of the approximation series of the quotient are removed iteratively by simple transformations consisting of shift and addition operations. In the known algorithm, refinement uses complex operations such as multiplication and comparison in every iteration.
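The refinement loop described above behaves like a shift-and-subtract division. The sketch below models it on ordinary non-negative integers rather than on the relative values used by the device; the function name `refine_quotient` and its parameters are hypothetical, and it assumes that k was chosen large enough that divisor · 2^(k+1) exceeds the dividend.

```python
def refine_quotient(dividend: int, divisor: int, k: int) -> tuple[int, int]:
    """Return (quotient, remainder) by testing the powers 2^k down to 2^0.

    Hypothetical integer model of the refinement mode: `shifted` stands in
    for shift register 9, `remainder` for remainder register 36 and
    `quotient` for quotient adder 13.
    """
    if divisor <= 0 or dividend < 0 or k < 0:
        raise ValueError("expected divisor > 0, dividend >= 0 and k >= 0")

    shifted = divisor << k        # divisor multiplied by the highest power 2^k
    remainder = dividend
    quotient = 0
    for power in range(k, -1, -1):
        difference = remainder - shifted
        if difference >= 0:
            # Sign bit 0: the inhibit circuits pass the power and the new remainder.
            quotient += 1 << power
            remainder = difference
        # Sign bit 1: both are blocked and the remainder register keeps its value.
        shifted >>= 1             # shift register 9 shifts right by one bit
    return quotient, remainder


if __name__ == "__main__":
    print(refine_quotient(100, 7, 3))
```

With dividend 100, divisor 7 and k = 3 the loop accepts the powers 8, 4 and 2, rejects 1, and returns quotient 14 with remainder 2.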

Thus, the basic division of modular numbers takes roughly 2 comparison iterations and k-1 shift-and-add operations, whereas the known algorithm requires 2k comparison iterations, k multiplications and 2k additions. The gain in division speed reaches roughly k iterations. To date this is the best algorithm for the basic division of modular numbers. This advantage over the known algorithm comes from the close coupling of the computing architecture with the hardware implementation, which significantly reduced the computational complexity of dividing modular numbers. The proposed algorithm differs from the known ones in its simplicity of implementation and requires less computation than existing algorithms.
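The operation counts quoted above can be tabulated for a given k. The helper below only restates the figures from the text; the function name `operation_counts` and the dictionary layout are illustrative assumptions, and the counts are approximate, as the text itself says.

```python
def operation_counts(k: int) -> dict[str, dict[str, int]]:
    """Tabulate the approximate operation counts stated in the text for a given k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {
        "proposed": {"comparisons": 2, "shift_and_add": k - 1},
        "known": {"comparisons": 2 * k, "multiplications": k, "additions": 2 * k},
    }


if __name__ == "__main__":
    for name, counts in operation_counts(8).items():
        print(name, counts)
```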

A comparative analysis of the complexity and division time of the invention against the known device (patent RU 2400813, published 27.09.2010, Bull. No. 27) showed significant advantages in hardware and time resources.

The known invention uses:

a circuit for conversion to the generalized positional number system (OPSS);

a circuit for finding the approximate divisor;

a circuit for division with zero remainder;

a memory circuit;

a circuit for extending the RNS bases;

a switch circuit;

a circuit for comparison, multiplication and subtraction.

The complexity of each of the listed circuits is proportional to n, where n is the number of RNS bases, so the complexity of the whole device is N = O(zn), where z is the number of circuits listed above.

The proposed device for dividing numbers uses:

a modular-number comparison circuit;

a multiplication and addition circuit;

register circuits and a counter;

multiplexer circuits;

switch circuits.

The complexity of the device is O(5k), where k is the number of binary digits in the relative values of the dividend and divisor.

Under the mild assumption that the number of bits in the binary representation of the RNS equals the number of bits in the relative values of the dividend and divisor, the gain in hardware cost is a factor of 1.4.
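One reading that reproduces the 1.4 figure is the ratio of the two complexity estimates, O(zn) for the known device and O(5k) for the proposed one. The sketch below assumes z = 7, obtained by counting the circuits listed for the known device, and n = k; both choices are interpretations made here, not statements from the source, and the function name `hardware_gain` is hypothetical.

```python
from fractions import Fraction


def hardware_gain(z_known: int, z_proposed: int, n: int, k: int) -> Fraction:
    """Ratio (z_known * n) / (z_proposed * k) of the two complexity estimates."""
    if min(z_known, z_proposed, n, k) <= 0:
        raise ValueError("all arguments must be positive")
    return Fraction(z_known * n, z_proposed * k)


if __name__ == "__main__":
    # Assumed values: seven circuits against five, with n equal to k.
    gain = hardware_gain(z_known=7, z_proposed=5, n=16, k=16)
    print(gain, float(gain))
```

With n equal to k the ratio reduces to 7/5, which matches the factor of 1.4 quoted in the text.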

In terms of time, on a specific example (the same example worked for both the known and the proposed device), the gain is approximately a factor of 5. All iterations in the proposed device consist of shift and addition operations. To date this is the best solution.

Claims

A device for the basic division of modular numbers, comprising input buses for the dividend and the divisor, which supply the dividend directly, and the divisor through a multiplication circuit or through a multiplexer, to the input of a modular-number comparison circuit whose outputs implement the computational model a < b, a > b or a = b, where a is the dividend and b is the divisor; the control outputs a < b and a > b of the comparison circuit are connected to a control circuit whose outputs are connected to the address inputs of the multiplexer, to the control inputs of the counter, the shift and storage registers, the quotient adder, the divisor adder and the subtractor, and to one of the inputs of the switch, the second inputs of which are connected to the output of the memory, whose inputs are connected to the shift register; the a = b output of the comparison circuit is connected to the input of the quotient adder, writing a "one" into it, and the information outputs are connected to the dividend and divisor adder circuits, whose outputs are connected to the left-shift register, the output of which is connected to the counter that determines the highest power of the approximation series of the quotient; the output of the counter is connected to the address inputs of the memory, whose outputs, through the switch circuit and the inhibit circuit, supply to the input of the quotient adder the power of the series term that belongs to the refined quotient, and to the input of the circuit that multiplies the divisor by the highest power of the series, the output of which is connected through the multiplexer to the comparison circuit, whose outputs are connected to the dividend and divisor adders; the output of the divisor adder is connected to the input of the right-shift register, whose output is connected to the subtractor circuit; the output of the dividend adder is connected to the second input of the subtractor; the output of the subtractor is connected to the register that stores the remainder left after subtracting the terms of the quotient series from the dividend, the output of which is connected through the dividend adder to the subtractor, whose output is connected to the inhibit circuits, the outputs of which are connected to that remainder storage register and to the quotient adder circuit.