RU2802599C1

RU2802599C1 - System and method for providing failure tolerance of network nodes

Info

Publication number: RU2802599C1
Application number: RU2023107516A
Authority: RU
Inventors: Гопикришна ЧАРИПАДИ
Original assignee: Радисус Индия Приват Лимитед
Priority date: 2022-01-19
Filing date: 2023-01-04
Publication date: 2023-08-30

Abstract

FIELD: network.

SUBSTANCE: the 3GPP-defined radio resource control timers and counters T310 (radio link failure timer), N310 (radio link failure counter), T311 (radio link recovery timer) and N311 (radio link recovery counter) are used to ensure that L1 recovers within the sum of timers T310 and T311, for example, typically 100 ms, after an L1 software exception event.

EFFECT: efficient and reliable systems and methods for providing network fault tolerance by using 3GPP-defined radio resource control timers.

20 cl, 2 dwg

Description

TECHNICAL FIELD

[0001] Embodiments of the present invention generally relate to telecommunications deployments. In particular, the present invention relates to systems and methods for providing fault tolerance in a network.

BACKGROUND OF THE ART

[0002] The following description of the related art is intended to provide background information pertaining to the field of the invention. This section may include certain aspects of the art that may be related to various features of the present invention. However, it should be understood that this section is used only to enhance the reader's understanding of the present invention and not as an admission of prior art.

[0003] A wireless communication system comprises a gNodeB digital unit coupled to at least one radio unit, through which the gNodeB transmits wireless radio frequency signals to a user device and receives wireless radio frequency signals from the user device within the coverage area of the radio unit. The digital unit contains the layer 1 (L1), layer 2 (L2) and layer 3 (L3) protocols of the 5G wireless communication standards protocol stack. In such a wireless communication system deployed in live networks, the gNodeB/core network subsystem must satisfy two main criteria: (a) high availability, often called the FIVE 9s criterion, i.e. 99.999% availability, and (b) low gNodeB downtime of, for example, 21 minutes/year/gNodeB. In addition, gNodeB equipment manufacturers must comply with the network key performance indicators (KPIs) defined by the 3rd Generation Partnership Project (3GPP) specifications, for example, a dropped call rate < 0.1%, a call setup success rate > 99.5%, etc., as service level agreements with the network operator.
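As a rough check of the availability arithmetic, the sketch below converts between an availability percentage and a yearly downtime budget. It assumes a 365.25-day year; the function names are illustrative and are not taken from the patent. Under that assumption, 99.999% availability corresponds to roughly 5.3 minutes of downtime per year, and a 21-minute yearly budget corresponds to roughly 99.996% availability.

```python
# Assumption: a year of 365.25 days (525,960 minutes).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability percentage."""
    return (1.0 - availability_percent / 100.0) * MINUTES_PER_YEAR


def availability_percent(downtime_minutes: float) -> float:
    """Availability percentage implied by a yearly downtime budget in minutes."""
    return (1.0 - downtime_minutes / MINUTES_PER_YEAR) * 100.0


if __name__ == "__main__":
    print(f"99.999% -> {downtime_minutes_per_year(99.999):.2f} min/year")
    print(f"21 min/year -> {availability_percent(21.0):.4f}%")
```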

[0004] In the prior art, an unexpected software error in the layer 1 software (L1 software) is treated as a failure. Consequently, when the layer 2 software (L2 software), which controls the L1 software, does not receive the periodic heartbeat from the L1 software, it instructs the layer 3 software (L3 software) to stop cell transmission and reception, which ultimately leads to a reboot or restart of the entire digital unit containing the L1, L2 and L3 software. This process typically takes on the order of several minutes, during which the live network remains unavailable and is considered faulty in the given geographic region. In the prior art, an unexpected L1 software event drives the L1 software into an assert state and, consequently, to abnormal termination of the L1 software. If the L1 software does not respond for several consecutive transmission intervals, for example, 15 transmission intervals, the L2 software recognizes the failure and reports it to the L3 software. The L3 software can inform the core network that the cell is down, and the core network can delete the cell and bring it up again. The time from the occurrence of the unexpected L1 software event until the cell is brought up again is typically several seconds. Consequently, the configured 3GPP timers T310 and T311 expire on the user devices, and all user devices go to the IDLE state and drop their calls, so that the network KPIs are significantly violated.
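The prior-art detection rule described above, in which L2 declares an L1 failure after a run of consecutive unanswered transmission intervals, can be sketched as a simple counter. This is a minimal illustration only: the class and method names are hypothetical, and the default threshold of 15 is the example value given in the text.

```python
class HeartbeatMonitor:
    """Hypothetical L2-side watchdog for the periodic L1 heartbeat."""

    def __init__(self, max_missed: int = 15):
        self.max_missed = max_missed  # consecutive missed intervals tolerated
        self.missed = 0
        self.failed = False

    def on_interval(self, heartbeat_received: bool) -> bool:
        """Call once per transmission interval.

        Returns True once a failure has been declared; the flag is sticky.
        """
        if heartbeat_received:
            self.missed = 0  # any heartbeat resets the run of misses
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.failed = True  # L2 would now report the failure to L3
        return self.failed


if __name__ == "__main__":
    monitor = HeartbeatMonitor()
    results = [monitor.on_interval(False) for _ in range(15)]
    print(results[13], results[14])
```

Because a single received heartbeat resets the counter, only an uninterrupted run of misses triggers the failure report.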

[0005] Typically, part of the L1 of the 5G wireless protocol stack is implemented in hardware accelerators that perform, for example, low-density parity-check (LDPC) and rate matching operations in the downlink processing chain, as well as LDPC and rate matching operations in the uplink processing chain. This architecture offloads computation to the hardware accelerator to save time, rather than performing it in software. The hardware accelerators are controlled by the configuration that the L1 software provides in accordance with the DL CONFIG REQUEST and UL CONFIG REQUEST messages received from the L2 software at a transmission interval of 0.5 ms. Sometimes, if the attributes received from the L2 software are out of range, or the L1 software passes unexpected attributes to the hardware accelerator in a transmission interval, the hardware accelerator may need many cycles to complete, or in some cases may not complete at all. In prior-art L1 software implementations, this may cause the L1 software to crash, after which the L2 software receives no response for several transmission intervals, which ultimately leads to a reboot of the digital unit software. It takes several seconds for the gNodeB to become operational again, resulting in a network outage and dropped calls on all connected user devices.
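The out-of-range condition described above can be illustrated by a range check applied to a configuration request before it is handed to the hardware accelerator. The attribute names and limits below are purely hypothetical placeholders; the patent does not list the actual attributes or their valid ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttrRange:
    low: int
    high: int


# Hypothetical attribute names and limits, chosen only for illustration.
LIMITS = {
    "code_block_count": AttrRange(1, 64),
    "rate_match_len": AttrRange(1, 8192),
}


def out_of_range_attributes(request: dict[str, int]) -> list[str]:
    """Return the names of attributes that are missing or outside their range."""
    bad = []
    for name, rng in LIMITS.items():
        value = request.get(name)
        if value is None or not (rng.low <= value <= rng.high):
            bad.append(name)
    return bad


if __name__ == "__main__":
    print(out_of_range_attributes({"code_block_count": 0, "rate_match_len": 1024}))
```

A check of this kind only identifies the offending attributes; what the L1 software does with that information is described in the summary and detailed description.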

[0006] Therefore, there is a need in the art for systems and methods that overcome the disadvantages of the prior art.

OBJECTIVES OF THE PRESENT INVENTION

[0007] Some of the objects of the present invention, which at least one embodiment achieves, are listed below.

[0008] An object of the present invention is to build a network with fault-tolerant nodes.

[0009] An object of the present invention is to provide a system that minimizes network downtime so that degradation of network KPIs is minimized.

[0010] An object of the present invention is to provide a method that allows the L1 software to meet high-availability requirements and protects the gNodeB against any unexpected L1 software exception in live deployments.

SUMMARY OF THE INVENTION

[0011] This section provides a simplified presentation of some of the objects and aspects of the present invention, which are described in more detail in the description below. This summary is not intended to identify the key features or the scope of protection of the claimed invention.

[0012] In one aspect, the present invention provides a system that minimizes cell outages and degradation of key performance indicators (KPIs) by providing fault tolerance of nodes. The system may comprise a cell and is characterized by a plurality of user devices communicatively coupled to a network; one or more electronic devices (gnodes) in the network, configured to create, receive and/or transmit information over a communication link communicatively coupled to the plurality of user devices; and one or more hardware accelerators communicatively coupled to the one or more gnodes. The one or more hardware accelerators may be configured by a layer 1 (L1) module to process the information received over the communication link in accordance with an uplink and downlink configuration instruction received in each predetermined transmission interval from a layer 2 (L2) module associated with the network. The L1 module is associated with the physical (PHY) layer of the network, and the L2 module is associated with the data link layer of the network. The L1 module may further comprise a processor that executes a set of executable instructions stored in a memory, upon execution of which the processor causes the L1 module to monitor for one or more inconsistencies in the information received by the one or more hardware accelerators, and to extract, from the one or more inconsistencies, a set of attributes pertaining to out-of-range attributes received from the L2 module or to unexpected attributes sent by the L1 module to the one or more hardware accelerators in the predetermined transmission interval, as a result of which the one or more hardware accelerators require more cycles to complete processing of the information or do not complete processing at all. Based on the set of extracted attributes, the L1 module may reset any one or a combination of one or more queues in the hardware accelerator and the hardware accelerator core. The reset may take a predetermined outage time interval that is shorter than the expiry time of the radio link failure timer (T310) associated with the network. The reset of any one or a combination of the one or more queues in the hardware accelerator and the hardware accelerator core is performed in parallel with a request-response interaction between the L1 and L2 modules within the predetermined time interval. The request-response interaction may include sending an error message to the L2 module so that the L2 module can deactivate the L1 module until the combination of the one or more queues in the hardware accelerator and the hardware accelerator core has been restarted. In addition, the plurality of user devices in the cell may be configured to remain in the radio resource control (RRC) connected state because the predetermined outage time interval is shorter than the expiry time of timer T310.

[0013] In one aspect, the present invention provides a method for minimizing cell outages and degradation of key performance indicators (KPIs) by providing fault tolerance of nodes. The method may comprise the step of monitoring for one or more inconsistencies in information received by one or more hardware accelerators communicatively coupled to one or more electronic devices (gnodes) configured to create, receive and/or transmit information over a communication link communicatively coupled to a plurality of user devices. The one or more hardware accelerators are configured by a layer 1 (L1) module to process the information received over the communication link in accordance with an uplink and downlink configuration instruction received in each predetermined transmission interval from a layer 2 (L2) module associated with the network. The L1 module is associated with the physical (PHY) layer of the network, and the L2 module is associated with the data link layer of the network. The method may further comprise the step of extracting, from the one or more inconsistencies, a set of attributes pertaining to out-of-range attributes received from the L2 module or to unexpected attributes sent by the L1 module to the one or more hardware accelerators in the predetermined transmission interval, as a result of which the one or more hardware accelerators require more cycles to complete processing of the information or do not complete processing at all. Based on the set of extracted attributes, the method may comprise the step of resetting any one or a combination of one or more queues in the hardware accelerator and the hardware accelerator core. The reset may take a predetermined outage time interval that is shorter than the expiry time of the radio link failure timer (T310) associated with the network, and the reset of any one or a combination of the one or more queues in the hardware accelerator and the hardware accelerator core may be performed in parallel with a request-response interaction between the L1 and L2 modules within the predetermined time interval. The request-response interaction may include sending an error message to the L2 module, as a result of which the L2 module deactivates the L1 module until the combination of the one or more queues in the hardware accelerator and the hardware accelerator core has been restarted, and the plurality of user devices in the cell may be configured to remain in the radio resource control (RRC) connected state because the predetermined outage time interval is shorter than the expiry time of timer T310.
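The timing argument behind the method can be sketched as follows. Because the accelerator reset runs in parallel with the L1/L2 request-response exchange, the outage is bounded by the longer of the two activities rather than by their sum, and the user devices stay connected as long as that outage is shorter than the T310 expiry. The class, its field names and the durations in the usage example are hypothetical; only the 100 ms T310 example value comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryTiming:
    reset_ms: float         # hypothetical: time to reset accelerator queues and/or core
    handshake_ms: float     # hypothetical: duration of the L1/L2 request-response exchange
    t310_ms: float = 100.0  # T310 expiry; 100 ms is the example value from the text

    def parallel_outage_ms(self) -> float:
        # Reset and handshake overlap, so the longer one bounds the outage.
        return max(self.reset_ms, self.handshake_ms)

    def serial_outage_ms(self) -> float:
        # For comparison: the outage if the two steps ran one after the other.
        return self.reset_ms + self.handshake_ms

    def ues_stay_connected(self, parallel: bool = True) -> bool:
        outage = self.parallel_outage_ms() if parallel else self.serial_outage_ms()
        return outage < self.t310_ms


if __name__ == "__main__":
    timing = RecoveryTiming(reset_ms=60.0, handshake_ms=70.0)
    print(timing.parallel_outage_ms(), timing.ues_stay_connected())
    print(timing.serial_outage_ms(), timing.ues_stay_connected(parallel=False))
```

With these illustrative durations the parallel recovery fits inside T310 while a serial recovery would not, which is the motivation for overlapping the two activities.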

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] The accompanying drawings, which are incorporated herein and form a part of the present disclosure, illustrate exemplary embodiments of the disclosed methods and systems, with like reference numerals referring to like elements in different drawings. Components in the drawings are not necessarily drawn to scale; instead, emphasis is placed on clearly illustrating the principles of the present invention. Some drawings may show components as block diagrams, which do not necessarily represent the internal circuitry of each component. It will be apparent to those skilled in the art that such drawings include the electrical components, electronic components or circuitry commonly used to implement such components.

[0015] FIG. 1 illustrates an exemplary system architecture in which or with which the system proposed by the present invention may be implemented, in accordance with an embodiment of the present invention.

[0016] FIG. 2 illustrates an exemplary flow diagram of the proposed method, in accordance with an embodiment of the present invention.

[0017] The foregoing will become clearer from the following detailed description of the invention.

DETAILED DESCRIPTION OF THE INVENTION

[0018] In the following description, for purposes of explanation, various specific details are set forth to provide a thorough understanding of the embodiments of the present invention. However, it will be apparent that embodiments of the present invention may be practiced without these specific details. Several of the features described below may be used independently of one another or in any combination with other features. An individual feature may not address all of the problems discussed above, or may address only some of them. Some of the problems discussed above might not be fully addressed by any of the features described herein.

[0019] The following description provides exemplary embodiments only and is not intended to limit the scope, applicability or configuration of the invention. Rather, the following description of the exemplary embodiments provides those skilled in the art with information enabling them to implement an exemplary embodiment. It should be understood that the functions and arrangement of elements may be changed in various ways without departing from the spirit and scope of the invention as set forth.

[0020] The present invention provides efficient and reliable systems and methods for providing network fault tolerance by using the 3GPP-defined radio resource control timers and counters T310 (radio link failure timer), N310 (radio link failure counter), T311 (radio link recovery timer) and N311 (radio link recovery counter), which ensure that L1 recovers within the sum of timers T310 and T311, for example, typically 100 ms, after an L1 software exception event.

[0021] FIG. 1 illustrates an exemplary wireless network architecture (100) (also referred to as the network architecture (100)) in which or with which the system (110) proposed by the present invention may be implemented, in accordance with an embodiment of the present invention. As shown in the figure, the exemplary network architecture (100) may include a radio resource control function, a radio link failure timer (T310), a radio link failure counter (N310), a radio link recovery timer (T311) and a radio link recovery counter (N311). The system (110) may be communicatively coupled to a plurality of first computing devices (102-1, 102-2, 102-3...102-N) (also referred to as user devices (102-1, 102-2, 102-3...102-N), individually as a user device (102) and collectively as user devices (102)) through second computing devices (104-1, 104-2,...104-N) (also referred to as base stations (104-1, 104-2,...104-N), individually as a base station (104) and collectively as base stations (104)). The system (110) may also be operatively coupled to the base stations (104) through a radio unit (114) of an open radio access network. The system (110) may be further communicatively coupled to one or more third computing devices (106) (also referred to as gNB distributed units (DUs) or gNB DU (106)) and to one or more fourth computing devices (116) (also referred to as gNB control units (CUs) or gNB CU (116)). The one or more fourth computing devices (116) may be communicatively coupled to a plurality of fifth computing devices (118) (also referred to as first nodes (118)). The one or more third computing devices (106), or gNB DUs (106), may be, but are not limited to, satellites or any devices deployed off the ground.

[0022] In one exemplary embodiment, the predetermined expiration time of the radio link failure timer (also referred to as T310) may be at least 100 ms. Such a short interval may cause a brief outage for the user devices (102) that are connected to the cell and running at a certain throughput with various traffic profiles, if an unexpected L1 software (SW) event occurs that causes the user devices (102) to experience a radio link failure for 100 ms. Because the duration of the radio link failure is within the duration of timer T310, and is typically 100 ms, the user devices (102) remain in the RRC CONNECTED state and do not transition to the RRC IDLE state. Thus, none of the active user devices loses its connection, and only a brief (100 ms) outage and loss of throughput is observed.
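The timing argument in this paragraph can be sketched in a few lines. This is a deliberately simplified model, not part of the disclosure: the helper name `rrc_state_after_outage` and the state strings are hypothetical, the 100 ms default only mirrors the example value above, and the N310 counter that starts T310 in a real network is ignored.

```python
T310_MS = 100.0  # example expiry from the paragraph above; the real value is network-configured


def rrc_state_after_outage(outage_ms: float, t310_ms: float = T310_MS) -> str:
    """Hypothetical helper: state of a user device after an L1 outage.

    An outage that ends within the T310 window leaves the device in
    RRC CONNECTED; a longer outage lets T310 expire.
    """
    if outage_ms < 0 or t310_ms <= 0:
        raise ValueError("outage must be non-negative and T310 must be positive")
    return "RRC CONNECTED" if outage_ms <= t310_ms else "T310 EXPIRED"
```

For example, `rrc_state_after_outage(20.0)` and `rrc_state_after_outage(100.0)` both yield `"RRC CONNECTED"` under this model, while `rrc_state_after_outage(250.0)` yields `"T310 EXPIRED"`.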

[0023] In addition, the system may include one or more hardware accelerators communicatively coupled to the one or more gnodes (106) and configured by a layer 1 (L1) module to process information received over the communication link in accordance with an uplink and downlink configuration instruction received in each predetermined transmission interval from a layer 2 (L2) module associated with the network. The L1 module is associated with the physical (PHY) layer of the network, and the L2 module is associated with the data link layer of the network. The L1 module further includes a processor that executes a set of executable instructions stored in memory; when executing these instructions, the processor causes the L1 module to monitor for one or more inconsistencies in the information received by the one or more hardware accelerators. The L1 module may then extract from the one or more inconsistencies a set of attributes relating to out-of-range attributes received from the L2 module, or to unexpected attributes passed by the L1 module to the one or more hardware accelerators in the predetermined transmission interval, as a result of which the one or more hardware accelerators need more cycles to finish processing the information or do not finish processing at all. Based on the set of extracted attributes, the L1 module may reset any one or a combination of one or more queues in the hardware accelerator and the hardware accelerator core. The reset may take a predetermined outage time interval that is less than the expiration time associated with timer T310 of the network, and the reset of any one or a combination of one or more queues in the hardware accelerator and the hardware accelerator core may be performed in parallel with a request-response interaction between the L1 and L2 modules during the predetermined time interval. The request-response interaction may include sending an error message to the L2 module, and the L2 module may deactivate the L1 module until the combination of one or more queues in the hardware accelerator and the hardware accelerator core has been restarted. The plurality of user devices in the cell may be configured to remain in a radio resource connected state because the predetermined outage time interval is less than the expiration time of timer T310.
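The monitoring and attribute-extraction logic described above might look roughly like the following. Everything named here is an assumption made for illustration: the attribute names, their limits, the `SlotResult` record, and the cycle budget do not come from the text, which does not specify how inconsistencies are represented.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical attribute limits, for illustration only.
ATTRIBUTE_LIMITS: Dict[str, Tuple[int, int]] = {
    "num_prbs": (1, 273),
    "mcs_index": (0, 28),
}


@dataclass(frozen=True)
class SlotResult:
    attributes: Dict[str, int]   # attributes supplied for this transmission interval
    cycles_used: Optional[int]   # None means the accelerator never finished
    cycle_budget: int            # cycles available in one transmission interval


def extract_bad_attributes(attributes: Dict[str, int]) -> Dict[str, int]:
    """Return the attributes that are unexpected or outside their allowed range."""
    bad = {}
    for name, value in attributes.items():
        limits = ATTRIBUTE_LIMITS.get(name)
        if limits is None:
            bad[name] = value  # unexpected attribute
        elif not (limits[0] <= value <= limits[1]):
            bad[name] = value  # out-of-range attribute
    return bad


def should_reset_accelerator(result: SlotResult) -> bool:
    """Reset when attributes are bad or the accelerator overran or hung."""
    overran = result.cycles_used is None or result.cycles_used > result.cycle_budget
    return overran or bool(extract_bad_attributes(result.attributes))
```

A slot with `{"num_prbs": 400}` would be flagged by `extract_bad_attributes`, and a slot whose `cycles_used` is `None` (processing never completed) triggers a reset even when all attributes are in range.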

[0024] In one exemplary embodiment, the plurality of user devices connected to the network experience a radio link failure during a second time interval whose duration is less than the predetermined outage time interval.

[0025] In one exemplary embodiment, the plurality of user devices connected to the network do not transition to the RRC IDLE state.

[0026] In one exemplary embodiment, the request-response interaction further includes stopping the real-time downlink and uplink processing chain and transitioning the L1 module from the RUNNING state to the IDLE state. The L1 module may return a physical stop message to the L2 module. The request-response interaction further includes deactivation of the L1 module by the L2 module, wherein the L1 module continues to send any one or a combination of SLOT INDICATIONS so that the L2 scheduler keeps counting system frame numbers, without allowing the L2 module to clear one or more contexts associated with the plurality of user devices and the plurality of databases associated with the plurality of user devices. The L2 module may then send a new CONFIGURE CELL REQUEST with one or more predetermined CELL attributes, where the one or more predetermined CELL attributes match the attributes used in the original configuration of the L1 module. The L1 module may then configure its downlink and uplink processing chain and transition to the CONFIGURED state, after which the L1 module sends a CONFIGURE CELL RESPONSE message to the L2 module.

[0027] The request-response interaction includes the L2 module sending a PHY START message to restart the cell at a given system frame number derived from the SLOT INDICATIONS. The L1 module may then reset the plurality of databases, restart the real-time downlink and uplink processing chains, and send a PHY RESPONSE message to the L2 module.
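The L1 state transitions in the request-response interaction (RUNNING to IDLE to CONFIGURED and back to RUNNING) can be modeled as a small table-driven state machine. This is a sketch under assumptions: the table layout, `handle`, and `restart_cell` are hypothetical, and "PHY STOP" / "PHY STOP RESPONSE" are used as names for the stop exchange that the text above only calls a physical stop message.

```python
from enum import Enum
from typing import List, Tuple


class L1State(Enum):
    RUNNING = "RUNNING"
    IDLE = "IDLE"
    CONFIGURED = "CONFIGURED"


# (current L1 state, message from L2) -> (next L1 state, response sent to L2)
TRANSITIONS = {
    (L1State.RUNNING, "PHY STOP"): (L1State.IDLE, "PHY STOP RESPONSE"),
    (L1State.IDLE, "CONFIGURE CELL REQUEST"): (L1State.CONFIGURED, "CONFIGURE CELL RESPONSE"),
    (L1State.CONFIGURED, "PHY START"): (L1State.RUNNING, "PHY RESPONSE"),
}


def handle(state: L1State, message: str) -> Tuple[L1State, str]:
    """Apply one L2 message to the L1 state machine."""
    try:
        return TRANSITIONS[(state, message)]
    except KeyError:
        raise ValueError(f"{message!r} is not valid in state {state.value}") from None


def restart_cell(state: L1State = L1State.RUNNING) -> List[Tuple[str, str, L1State]]:
    """Run the full stop / configure / start exchange and return the trace."""
    trace = []
    for message in ("PHY STOP", "CONFIGURE CELL REQUEST", "PHY START"):
        state, response = handle(state, message)
        trace.append((message, response, state))
    return trace
```

Calling `restart_cell()` walks the three exchanges in order and ends with L1 back in `L1State.RUNNING`; a message sent in the wrong state, such as `handle(L1State.IDLE, "PHY START")`, raises `ValueError`.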

[0028] In one exemplary embodiment, the system (110) completes the request-response interaction within the sum of the predetermined expiration time of timer T310 and a second predetermined expiration time of timer T311. The predetermined time interval of the request-response interaction is less than or equal to the predetermined outage time interval.

[0029] In an exemplary embodiment, the system (110) may avoid reloading, through file input/output (FIFO) operations, the predetermined channels associated with one or more lookup tables for a plurality of physical signals defined by, but not limited to, the 5G L1 wireless protocol. Each such load would take a predetermined number of hundreds of microseconds, because file input/output is a slow operation. Instead, the system may keep the originally loaded lookup tables in protected memory when the cell is activated for the first time and retrieve them from memory for L1 software operations.
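The load-once behavior described here is essentially a cache in front of a slow loader. The sketch below is an assumption-laden illustration: `LookupTableCache`, `slow_loader`, and the table name are hypothetical, and an ordinary in-process dictionary stands in for the protected memory mentioned in the text.

```python
from typing import Callable, Dict, List


class LookupTableCache:
    """Keeps lookup tables in memory so a cell restart does not reload them from file."""

    def __init__(self, load_from_file: Callable[[str], List[int]]) -> None:
        self._load_from_file = load_from_file
        self._tables: Dict[str, List[int]] = {}
        self.file_loads = 0  # counts how often the slow path was taken

    def get(self, name: str) -> List[int]:
        if name not in self._tables:
            # Slow path: taken only the first time the cell is activated.
            self._tables[name] = self._load_from_file(name)
            self.file_loads += 1
        return self._tables[name]


def slow_loader(name: str) -> List[int]:
    """Stand-in for a slow file read; returns dummy table contents."""
    return [len(name)] * 4
```

With `cache = LookupTableCache(slow_loader)`, the first `cache.get("pdsch_scrambling")` takes the slow path, and a second call after a simulated restart returns the same object while `cache.file_loads` stays at 1.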

[0030] In an exemplary embodiment, the telecommunications network (108) may include, in particular, at least a portion of one or more networks having one or more nodes that transmit, receive, forward, generate, buffer, store, route, switch, and/or process one or more messages, packets, signals, waves, and/or voltage or current levels. The network may be, in particular, one or more of: a wireless network, a wired network, the Internet, an intranet, a public network, a private network, a packet-switched network, a circuit-switched network, a decentralized network, an infrastructure network, a public switched telephone network, a cable network, a cellular network, a satellite network, a fiber-optic network, or some combination thereof.

[0031] FIG. 2 depicts an exemplary flow diagram illustrating the proposed method, in accordance with one embodiment of the present invention.

[0032] As shown in the figure, the proposed method includes several steps. At step 204, when the L1 software reaches an unexpected state, or an unexpected event or unexpected attribute occurs in some state, an ERROR INDICATION message is sent from the L1 software to the L2 software to indicate that an unrecoverable error condition has occurred and that the L2 software should use the FAIL SAFE procedure to initiate the cell shutdown process in order to restart cell transmission and reception by the L1 software. In this process, the L1 software completes all current uplink and downlink tasks and does not queue any further uplink or downlink tasks during the 0.5 ms transmission interval (for 5G numerology mu=1).

[0033] At step 206, the method may include the L2 software sending a STOP message to the L1 software when the L2 software receives the ERROR INDICATION message from the L1 software. The L1 software is the software layer that provides services to the L2 software, so the L2 software acts as the master of the L1 software.

[0034] At step 208, the method may include executing the PHY STOP function provided by the L1 software: all uplink and downlink processing tasks are closed, and the L1 software gracefully terminates L1 uplink and downlink processing starting from that system frame number in the 0.5 ms transmission interval. The L1 software then sends a PHY STOP RESPONSE to the L2 software.

[0035] At step 210, the method may include reinitializing the L1 software and all of its uplink and downlink structures, contexts, and pointers, so that the L1 software is likely to be fully recovered from the unexpected state that arose in step 1. The L1 software transitions from the RUNNING state to the IDLE state.

[0036] At step 212, the method may include the L1 software continuing to send SLOT INDICATIONS to the L2 software every 0.5 ms even after the PHY STOP message has been received and the uplink and downlink processing chains have been stopped in steps 3-4. This is necessary because the L1 software is the timing master and must provide the 0.5 ms SLOT INDICATIONS as timing ticks to the L2 software, so that the L2 software timing (the live, or current, system frame number) and the user device synchronization continue when the cell is eventually restarted at STEP 10.

[0037] At step 212, the method may include the L2 software retaining all contexts of the user devices that were connected to the system when step 1 occurred; in addition, the L2 software keeps its scheduler synchronized on every SLOT INDICATION tick. Although the L1 software has stopped, all user device contexts remain ACTIVE in the L2 software until the sum of timers T310 and T311 expires. Meanwhile, the L2 software sends a new CONFIG REQUEST message to configure the L1 software with the same cell attributes that were used for the initial cell configuration.
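How the SLOT INDICATION ticks keep the L2 scheduler's frame count alive during the outage can be illustrated with a small counter. The class `L2Timebase` is hypothetical. The constants are not from the text above: they follow the 5G NR frame structure, where numerology mu=1 gives 20 slots of 0.5 ms per 10 ms frame and the system frame number wraps at 1024.

```python
SLOTS_PER_FRAME = 20  # mu=1: 0.5 ms slots in a 10 ms radio frame
SFN_MODULO = 1024     # system frame number wraps after 1023


class L2Timebase:
    """Hypothetical L2-side counter advanced by each SLOT INDICATION from L1."""

    def __init__(self, sfn: int = 0, slot: int = 0) -> None:
        self.sfn = sfn
        self.slot = slot

    def on_slot_indication(self) -> None:
        # One tick per 0.5 ms slot, whether or not L1 is processing traffic.
        self.slot += 1
        if self.slot == SLOTS_PER_FRAME:
            self.slot = 0
            self.sfn = (self.sfn + 1) % SFN_MODULO


def ticks_for_outage(outage_ms: float, slot_ms: float = 0.5) -> int:
    """Number of SLOT INDICATIONS delivered during an outage of the given length."""
    return int(outage_ms / slot_ms)
```

A 100 ms outage corresponds to 200 ticks, that is, exactly 10 frames: a timebase that starts at SFN 1020, slot 0 ends at SFN 6, slot 0, so the restart message can name a current frame number that L2 and the user devices still agree on.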

[0038] At step 214, the method may include the L1 software, upon receiving the CELL CONFIG REQUEST message from the L2 software, configuring its uplink and downlink processing chains and transitioning its software state from IDLE to CONFIGURED.

[0039] At step 216, the method may include the L1 software sending a CELL CONFIG RESPONSE message to the L2 software after the cell has been successfully configured.

[0040] At step 218, the method may include the L2 software, after receiving a positive CELL CONFIG RESPONSE message, starting the L1 software at the current live system frame number by sending it a PHY START message.

[0041] At step 220, the method may include the L1 software, upon receiving the PHY START message, resetting the uplink and downlink processing chain and its associated task contexts and pointers, and preparing to process transmitted downlink packets and received uplink packets in each 0.5 ms transmission interval.

[0042] At step 222, the method may further include the L1 software sending a PHY RESPONSE message indicating that the L1 software is now in the RUNNING state and is beginning to process transmitted downlink packets and received uplink packets in each 0.5 ms transmission interval according to the DL CONFIG REQUEST and UL CONFIG REQUEST configuration messages, respectively, received from the L2 software.
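Seen from the L2 side, steps 204 through 222 amount to answering each L1 message with the next request while keeping the user device contexts and the original cell attributes untouched. The following is a sketch only: the class `L2FailSafe`, its method names, and the payload shapes are hypothetical, while the message names are the ones used in the steps above.

```python
from typing import Dict, Optional, Tuple


class L2FailSafe:
    """Hypothetical L2-side driver for the fail-safe cell restart."""

    def __init__(self, cell_attributes: Dict[str, int], ue_contexts: Dict[str, dict]) -> None:
        self.cell_attributes = dict(cell_attributes)  # attributes from the initial cell configuration
        self.ue_contexts = dict(ue_contexts)          # retained for the whole restart
        self.current_sfn = 0

    def on_slot_indication(self, sfn: int) -> None:
        # L1 keeps ticking during the outage, so L2 always knows the live frame number.
        self.current_sfn = sfn

    def on_l1_message(self, message: str) -> Optional[Tuple[str, Optional[dict]]]:
        """Return the next (message, payload) L2 sends to L1, or None when recovery is done."""
        if message == "ERROR INDICATION":
            return ("STOP", None)
        if message == "PHY STOP RESPONSE":
            return ("CELL CONFIG REQUEST", dict(self.cell_attributes))
        if message == "CELL CONFIG RESPONSE":
            return ("PHY START", {"sfn": self.current_sfn})
        if message == "PHY RESPONSE":
            return None  # L1 is RUNNING again
        raise ValueError(f"unexpected L1 message: {message!r}")
```

Feeding the driver `"ERROR INDICATION"`, `"PHY STOP RESPONSE"`, `"CELL CONFIG RESPONSE"`, and `"PHY RESPONSE"` in turn produces `STOP`, a `CELL CONFIG REQUEST` carrying the stored attributes, and a `PHY START` carrying the most recent frame number, with `ue_contexts` unchanged throughout.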

[0043] In an exemplary embodiment, the L1 software may check the attributes, and if they prove erroneous, or if any other unexpected case arises in which the hardware accelerator needs too many cycles to complete, the L1 software proceeds to step 202, at which the queues in the hardware accelerator and the hardware accelerator core are reset and the accelerator is restarted. This step 0 reset typically completes within 100 ms, during which steps 202-220 are executed in parallel. The cell is thus restored within 100 ms (the sum of timers T310 and T311), which minimizes cell outage and degradation of key performance indicators, as disclosed in the methods discussed above.
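Because the accelerator reset and the L1/L2 restart exchange run in parallel, the recovery time is set by the longer of the two rather than by their sum. A minimal arithmetic sketch follows; the function names are hypothetical, and the durations and the T310/T311 split in the usage note are illustrative values, since the text gives only the 100 ms total.

```python
def recovery_time_ms(reset_ms: float, handshake_ms: float) -> float:
    """Parallel execution: the slower activity determines when the cell is back."""
    return max(reset_ms, handshake_ms)


def sequential_time_ms(reset_ms: float, handshake_ms: float) -> float:
    """For comparison: the time needed if the two activities ran back to back."""
    return reset_ms + handshake_ms


def recovers_before_timers_expire(
    reset_ms: float, handshake_ms: float, t310_ms: float, t311_ms: float
) -> bool:
    """True if the parallel recovery fits within the combined T310 + T311 window."""
    return recovery_time_ms(reset_ms, handshake_ms) <= t310_ms + t311_ms
```

With an 80 ms reset, a 30 ms restart exchange, and an assumed 50 ms + 50 ms timer split, the parallel recovery takes 80 ms and fits the 100 ms window, whereas running the same two activities sequentially would take 110 ms and miss it.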

[0044] The present invention thus provides a unique and efficient system that facilitates the implementation of network fault-tolerance methods and minimizes cell outage and degradation of key performance indicators.

[0045] Although this document focuses primarily on preferred embodiments of the invention, it will be apparent that many other embodiments are possible and that various changes may be made to the preferred embodiments without departing from the spirit of the invention. These and other changes to the preferred embodiments will be apparent to those skilled in the art from the description herein, and it should be understood that the description merely illustrates the invention and does not limit its scope.

ADVANTAGES OF THE PRESENT INVENTION

[0046] The present invention provides an approach to building a network with fault-tolerant nodes.

[0047] The present invention provides a system that minimizes network downtime and thereby minimizes degradation of the network's key performance indicators (KPIs).

[0048] The present invention provides a method that enables the L1 software to meet high-availability requirements and protects the gNodeB from any unexpected L1 software exceptions in live deployments.

Claims

1. A system for minimizing cell outages and degradation of key performance indicators by ensuring node fault tolerance, the system comprising a cell, characterized in that:

multiple user devices are connected to communicate with the network;

one or more electronic devices (gnodes) in the network, communicatively coupled to the multiple user devices, are configured to create and/or receive and/or transmit information over a communication channel;

one or more hardware accelerators are communicatively coupled to the one or more gnodes and are configured by a layer 1 (L1) module to process information received over the communication channel in accordance with an uplink and downlink configuration instruction received in each given transmission interval from a layer 2 (L2) module associated with the network, wherein the L1 module is associated with the physical (PHY) layer of the network and the L2 module is associated with the data link layer of the network, and wherein

the L1 module further comprises a processor that executes a set of executable instructions stored in memory, and when executing these instructions, the processor causes the L1 module to:

monitor for one or more inconsistencies in the information received by the one or more hardware accelerators;

extract from the one or more inconsistencies a set of attributes relating to out-of-range attributes received from the L2 module or unexpected attributes passed by the L1 module to the one or more hardware accelerators in the given transmission interval, as a result of which the one or more hardware accelerators require more cycles to complete processing of the information or do not complete processing at all;

reset any queue or combination of one or more queues in the hardware accelerator and the hardware accelerator core based on the set of extracted attributes, and

the reset takes a specified shutdown time interval such that the specified shutdown time interval is less than the expiration time associated with the radio link failure timer (T310) connected to the network,

the reset of any queue or combination of one or more queues in the hardware accelerator and the hardware accelerator core is performed in parallel with a request-response interaction between the L1 and L2 modules for restarting the cell during a specified time interval, wherein the request-response interaction for restarting the cell includes sending an error indication message to the L2 module,

the L2 module deactivates the L1 module until the combination of one or more queues in the hardware accelerator and the hardware accelerator core is restarted,

multiple user devices in a cell are configured to be in a radio resource connected (RRC) state as long as a predetermined disconnect time interval is less than the expiration time of timer T310.

2. The system according to claim 1, in which several user devices connected to the network experience a radio link failure during a second time interval, the duration of which is less than the duration of the specified outage time interval.

3. The system of claim 1, wherein multiple user devices connected to the network do not enter the RRC IDLE state.

4. The system according to claim 1, in which the request-response interaction further provides for stopping the processing chain of the downlink and uplink in real time and transferring the L1 module from the RUNNING state to the IDLE state and in which the L1 module returns a response message about the physical stop to the L2 module.

5. The system of claim 4, wherein the request-response interaction further includes deactivation of the L1 module by the L2 module, wherein the L1 module continues to send any one or a combination of SLOT INDICATIONS so that the L2 scheduler keeps counting system frame numbers, without allowing the L2 module to clear one or more contexts associated with the multiple user devices and multiple databases associated with the multiple user devices.

6. The system of claim 5, wherein the request-response interaction further involves the L2 module sending a new CONFIGURE CELL REQUEST message with one or more specified CELL attributes, wherein the one or more specified CELL attributes are associated with the attributes used in the original configuration of the L1 module.

7. The system of claim 6, wherein the request-response interaction involves configuring the downlink and uplink processing chain of the L1 module and transitioning to the CONFIGURED state, in which the L1 module sends a CONFIGURE CELL RESPONSE message to the L2 module after transitioning to the CONFIGURED state.

8. The system according to claim 7, in which the request-response interaction involves the L2 module sending a PHY START message to restart the cell at a given system frame number obtained from SLOT INDICATIONS.

9. The system according to claim 8, in which the request-response interaction further includes the L1 module resetting the multiple databases and restarting the real-time downlink and uplink processing chains, and subsequently sending a PHY RESPONSE message to the L2 module, after which the L1 module transitions to the RUNNING state.

10. The system of claim 1, wherein the predetermined request-response interaction interval for restarting the cell is less than or equal to the predetermined shutdown time interval.

11. A method for minimizing cell outages and degradation of key performance indicators by providing node fault tolerance, the method comprising the following steps:

monitoring one or more inconsistencies in information received by one or more hardware accelerators that are communicatively coupled to one or more electronic devices (gNodeBs) configured to create and/or receive and/or transmit information over a communication channel, the one or more electronic devices being communicatively coupled to multiple user devices;

wherein the one or more hardware accelerators are configured by a layer 1 (L1) module to process the information received over the communication channel in accordance with an uplink and downlink configuration instruction received, at each predetermined transmission interval, from a layer 2 (L2) module associated with a network, wherein the L1 module is coupled to the physical (PHY) layer of the network and the L2 module is coupled to the data link layer of the network, and wherein the L1 module further includes a processor executing a set of executable instructions stored in memory;

resetting any one or a combination of one or more queues in the hardware accelerator and the hardware accelerator core based on the set of extracted attributes,

wherein the reset takes a predetermined shutdown time interval, and the predetermined shutdown time interval is less than the expiry time of the radio link failure timer (T310) associated with the network,

wherein the reset of any one or a combination of the one or more queues in the hardware accelerator and the hardware accelerator core is performed in parallel with a request-response interaction between the L1 and L2 modules that lasts a predetermined time interval, wherein the request-response interaction for restarting the cell includes sending an error indication message to the L2 module,

wherein the L2 module keeps the L1 module deactivated until the one or the combination of the one or more queues in the hardware accelerator and the hardware accelerator core has been restarted, and

wherein the multiple user devices in a cell are configured to remain in the radio resource control (RRC) connected state as long as the predetermined shutdown time interval is less than the expiry time of timer T310.

12. The method of claim 11, wherein the multiple user devices connected to the network experience a radio link failure during a second time interval whose duration is less than that of the predetermined shutdown time interval.

13. The method of claim 11, wherein multiple user devices connected to the network do not enter the RRC IDLE state.

14. The method of claim 11, wherein the request-response interaction further comprises stopping the real-time downlink and uplink processing chains and transitioning the L1 module from the RUNNING state to the IDLE state, and wherein the L1 module returns a physical (PHY) stop response message to the L2 module.

15. The method of claim 14, wherein the request-response interaction further comprises deactivating the L1 module by the L2 module, wherein the L1 module continues to perform any one or a combination of: sending SLOT INDICATIONS so that the L2 scheduler keeps counting system frame numbers, and preventing the L2 module from clearing one or more contexts associated with the multiple user devices and multiple databases associated with the multiple user devices.

16. The method of claim 15, wherein the request-response interaction further comprises the L2 module sending a new CONFIGURE CELL REQUEST message with one or more specified CELL attributes, wherein the one or more specified CELL attributes are associated with the attributes used in the original configuration of the L1 module.

17. The method of claim 16, wherein the request-response interaction further comprises configuring the downlink and uplink processing chains of the L1 module and transitioning the L1 module to the CONFIGURED state, wherein the L1 module sends a CONFIGURE CELL RESPONSE message to the L2 module after entering the CONFIGURED state.

18. The method of claim 17, wherein the request-response interaction involves the L2 module sending a PHY START message to restart the cell at a given system frame number obtained from SLOT INDICATIONS.

19. The method of claim 18, wherein the request-response interaction further comprises the L1 module resetting multiple databases and restarting the downlink and uplink processing chains in real time, and subsequently sending a PHY RESPONSE message to the L2 module, after which the L1 module transitions to the RUNNING state.

20. The method of claim 11, wherein the predetermined request-response interaction interval for restarting the cell is less than or equal to the predetermined shutdown time interval.
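
For illustration only, the request-response interaction recited in the claims above can be sketched as a small state machine. This is a minimal sketch and not the implementation of the invention: the class and function names (`L1Module`, `restart_cell`, `ues_stay_connected`), the `pci` attribute in the test data, the one-indication-per-frame simplification, and all millisecond values are assumptions chosen for readability. Only the states (RUNNING, IDLE, CONFIGURED), the message names, and the timing relations come from the claims.

```python
from enum import Enum, auto


class L1State(Enum):
    IDLE = auto()
    CONFIGURED = auto()
    RUNNING = auto()


class L1Module:
    """Toy model of the L1 (PHY) side of the cell restart handshake."""

    def __init__(self, cell_attributes):
        self.cell_attributes = dict(cell_attributes)
        self.state = L1State.RUNNING
        self.sfn = 0             # system frame number reported to L2
        self.restart_sfn = None  # frame at which the cell was restarted

    def slot_indication(self):
        # Sent in every state, so the L2 scheduler keeps counting frames
        # while L1 is down. Simplification: one indication per frame.
        self.sfn = (self.sfn + 1) % 1024  # SFN wraps at 1024
        return self.sfn

    def phy_stop(self):
        if self.state is not L1State.RUNNING:
            raise RuntimeError("PHY STOP is only valid in RUNNING")
        self.state = L1State.IDLE
        return "PHY STOP RESPONSE"

    def configure_cell(self, attributes):
        if self.state is not L1State.IDLE:
            raise RuntimeError("CONFIGURE CELL REQUEST is only valid in IDLE")
        self.cell_attributes = dict(attributes)
        self.state = L1State.CONFIGURED
        return "CONFIGURE CELL RESPONSE"

    def phy_start(self, start_sfn):
        if self.state is not L1State.CONFIGURED:
            raise RuntimeError("PHY START is only valid in CONFIGURED")
        self.restart_sfn = start_sfn
        self.state = L1State.RUNNING
        return "PHY RESPONSE"


def restart_cell(l1):
    """L2-side sequence run after L1 has sent an error indication."""
    original = dict(l1.cell_attributes)  # reuse the original configuration
    replies = [l1.phy_stop()]
    l1.slot_indication()                 # L2 keeps tracking frames
    replies.append(l1.configure_cell(original))
    sfn = l1.slot_indication()
    replies.append(l1.phy_start(sfn))    # restart at an SFN known to L2
    return replies


def ues_stay_connected(shutdown_ms, handshake_ms, t310_ms):
    """Timing relations from the claims: handshake <= shutdown < T310."""
    return handshake_ms <= shutdown_ms < t310_ms
```

Calling `restart_cell(L1Module({"pci": 1}))` walks the module through RUNNING, IDLE, CONFIGURED and back to RUNNING, while `ues_stay_connected` expresses the condition under which the user devices are expected to keep their RRC connection.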