CN104113903A

CN104113903A - Interactive cognitive learning based downlink power adjusting method and device

Info

Publication number: CN104113903A
Application number: CN201410373841.7A
Authority: CN
Inventors: 高志斌; 温斌; 黄联芬; 蔡鸿祥; 姚彦; 张远见; 李馨
Original assignee: Xiamen University; Comba Telecom Systems Guangzhou Co Ltd
Current assignee: Xiamen University; Comba Network Systems Co Ltd
Priority date: 2014-07-31
Filing date: 2014-07-31
Publication date: 2014-10-22
Anticipated expiration: 2034-07-31
Also published as: CN104113903B

Abstract

A downlink power adjustment method and device based on interactive cognitive learning, relating to communication technology. Each femtocell acts as an intelligent agent capable of learning: its learning goal is to improve network performance, and it optimizes the network by adjusting the power allocated to resource blocks, which reduces interference in hybrid networking and improves throughput. In addition to learning on its own, femtocells use the femtocell gateway as a case library and learn interactively based on similarity and expertise level, which improves learning efficiency and saves the long-term energy cost of accumulating learning experience over the air interface.

Description

Downlink power adjustment method and device based on interactive cognitive learning

Technical field

The present invention relates to communication technology, and in particular to a downlink power adjustment method and device based on interactive cognitive learning.

Background art

A survey in "Femtocells: technologies and deployment" (Zhang, J., & De la Roche, G. (2010). Femtocells: technologies and deployment (pp. 1-13). New York: Wiley.) shows that in traditional cellular cells two-thirds of voice traffic and 90% of data traffic originate indoors, so the coverage and capacity of the indoor radio access network are especially important. Macro base stations, however, are very expensive, and site selection, equipment installation, commissioning and maintenance consume large amounts of manpower, money, material and time. Solving the problem by adding macro base stations would increase operators' expenses and create many complex network planning problems. Against this background, new devices such as femtocells have emerged.

A femtocell, also called a home base station, is a low-power wireless access point that operates in licensed spectrum. It is connected through the user's existing broadband line (such as DSL, cable, or optical fiber) and reaches the mobile core network from the IP network through a dedicated remote gateway. It is inexpensive, easy to install, self-configuring and plug-and-play, and it uses the same standard and frequency band as the operator's other base stations, so mobile phones and other terminals work with it unchanged. Its transmit power is close to that of Wi-Fi equipment, about 10 to 100 mW, with a coverage radius of 10 to 50 m; it supports several active users and is used to flexibly improve indoor and outdoor wireless coverage and to increase network capacity. However, interference is unavoidable in a hybrid network of macro base stations and femtocells. On the one hand, femtocells are consumer products that users buy and install freely, so operators do not know where they are located. On the other hand, as femtocells become widespread, their number will far exceed that of traditional macro base stations, which further affects the performance of the whole network. From a power point of view, without interference management the downlink may suffer from the following problems:

(1) The femtocell signal is too weak and is swamped by macrocell coverage, so the femtocell's coverage area is small and its signal quality is poor;

(2) If the femtocell signal is too strong relative to the macro base station signal, macrocell users may lose their connection to the macro base station and fall into a coverage hole (dead zone), that is, the macrocell user has no permission to access the femtocell operating in closed mode and cannot connect to the macro base station either; alternatively, macrocell users may suffer excessive interference from the femtocell and degraded performance.

Traditional downlink power adjustment methods for LTE femtocells mainly include the following:

Fixed power allocation, mentioned in "Improved Decentralized Q-learning Algorithm for Interference Reduction in LTE-Femtocells" (Serrano A M G. Self-organized Femtocells: a Time Difference Learning Approach[J]. 2012.).

The smart power control (SPC) method proposed in "Interference control for LTE Rel-9 HeNB cells" (Jeju, Interference control for LTE Rel-9 HeNB cells[S], 3GPP TSG RAN WG4, R4-094245, November 9th-13th, 2009).

The iterative water-filling algorithm mentioned in "Cognition and docition in OFDMA-based femtocell networks" (Galindo-Serrano A, Giupponi L, Dohler M. Cognition and docition in OFDMA-based femtocell networks[C]//Global Telecommunications Conference (GLOBECOM 2010), 2010 IEEE. IEEE, 2010: 1-6.).

These methods have the following shortcomings:

(1) First, LTE defines the resource block as the smallest time-frequency unit that can be allocated to a user, yet most traditional methods adjust only the maximum total transmit power and do not consider power adjustment per resource block;

(2) Second, in distributed management, the methods that do support per-resource-block power adjustment rarely consider interactive learning between femtocells; each femtocell aims only at improving its own performance.

Summary of the invention

The purpose of the present invention is to provide a downlink power adjustment method and device based on interactive cognitive learning that reduce the interference a femtocell causes to macrocell users and accelerate that reduction.

The downlink power adjustment method based on interactive cognitive learning comprises two processes:

1. Downlink power adjustment based on interactive cognition, comprising:

(1) Each femtocell maintains a downlink power information table. The table uses the resource block as its smallest unit and is used to decide the transmit power of each resource block. All data in the table are initialized;

(2) The femtocell periodically senses the interference caused by its current transmit power configuration, then updates the corresponding entries of the downlink power information table according to the sensed information and the applicable update rule;

(3) The updated downlink power information table and a prescribed allocation method are used to decide the transmit power configuration for the next period;

(4) Steps (2) and (3) are repeated, with the ultimate goal that the transmit power allocated from the sensed information is optimal each time;

2. Downlink power adjustment based on interactive learning, comprising:

(1) Each femtocell periodically reports its power information table, similarity parameters, expertise level (degree of professional knowledge) and other information as one case to a terminal device with an aggregation function; the femtocell gateway serves as this device. The femtocell gateway assigns each case a time to live, and a case is deleted automatically once its time to live expires;

(2) The femtocell performs active learning and the femtocell gateway performs active teaching; the two run concurrently to adjust the downlink power;

Active learning means: a femtocell that has been switched on for less than a threshold time actively sends a learning request to the femtocell gateway and reports its similarity parameters and expertise level. The gateway uses the reported similarity parameters to compute the similarity to each case in the case library and takes the cases whose similarity is above a threshold as candidate cases. It then compares the expertise levels of the candidate cases, selects the highest as the final case, sends that case's table to the requesting femtocell, and deletes the case from the case library. After the femtocell has performed power adjustment based on interactive cognition using that table, if its expertise level has increased it sends a confirmation signal to the gateway and overwrites its original table, and the learning process ends. Otherwise the gateway selects the case with the highest expertise level among the remaining candidate cases as the final case and sends it to the requesting femtocell to learn from. This continues until no candidate cases remain, at which point the gateway sends feedback to the femtocell and the learning process stops.

Active teaching means: for the most recently reported case, the femtocell gateway computes its similarity to each case in the case library, selects the most similar case, and compares the expertise levels of the reported case and the selected case. If the difference is within a threshold, the teaching process stops. Otherwise the table of the case with the higher expertise level is sent to the femtocell of the case with the lower expertise level, which performs power adjustment based on interactive cognition with it. If its expertise level has increased afterwards, it sends a confirmation signal to the gateway and overwrites its original table, ending the teaching process; otherwise it keeps its original table and the teaching process ends. This process need not be carried out for every newly reported case.

In practice, the interactive cognition and interactive learning processes described above are executed cyclically.

The downlink power adjustment device based on interactive cognitive learning comprises three modules on the base station side: an information storage module, an information sending/receiving module, and an information processing module;

The main functions of the information storage module are: 1) to store the power information table maintained by the power adjustment based on interactive cognition; the information processing module uses the table and writes it back to the information storage module after processing; 2) to temporarily store the case information obtained in the power adjustment based on interactive learning and to decide which table is currently in use;

The main functions of the information sending/receiving module are: 1) to periodically receive the information fed back by macrocell users in neighboring cells and report the received values to the information processing module. For interference information, if no neighboring macrocell user is interfered with, or the interference is negligible (below a predetermined threshold), the reported value is set to 0 by convention; if several neighboring macrocell users are interfered with, the largest interference among them is taken as the reported value, so that the final result brings the interference to all macrocell users below the threshold; 2) to send the femtocell's own case and other related information to the femtocell gateway;

The main functions of the information processing module are: 1) in the power adjustment based on interactive cognition, to process the stored information and the reported information with the same period as the information receiving module, specifically to update the table and decide the resource block transmit power; 2) in the power adjustment based on interactive learning, to assemble its own table, similarity parameters, expertise level and any other required information into a case and hand it to the information sending/receiving module.

The invention optimizes the network by adjusting the power allocated to resource blocks, which reduces interference in hybrid networking and improves throughput. Besides learning on its own, a femtocell can use the femtocell gateway as a case library and learn interactively based on similarity and expertise level, which improves learning efficiency and saves the long-term energy cost of accumulating learning experience over the air interface.

Description of drawings

Fig. 1 is a system architecture diagram of the hybrid network of LTE macro base stations and femtocells in an example of the present invention.

Fig. 2 is a functional relationship diagram of the modules of the femtocell's power adjustment device for interactive cognitive learning.

Fig. 3 is a flowchart of the femtocell's power adjustment based on interactive cognition.

Fig. 4 is a flowchart of the femtocell's active learning based on interactive learning.

Fig. 5 is a flowchart of the femtocell gateway's active teaching based on interactive learning.

Detailed description

The present invention is described in detail below with reference to the accompanying drawings.

The implementation scenario takes a hybrid network of LTE macro base stations and femtocells as an example; its system architecture is shown in Fig. 1.

The present invention is a downlink power adjustment method based on interactive cognitive learning. Its main processes are: 1. downlink power adjustment based on interactive cognition; 2. downlink power adjustment based on interactive learning. Processes 1 and 2 are executed cyclically in practice and are implemented by three modules: 1. the information storage module; 2. the sending/receiving module; 3. the information processing module. The functional relationship of the modules is shown in Fig. 2.

The main functions of the information storage module are: 1) to store the power information table maintained by the power adjustment based on interactive cognition; the information processing module uses the table and writes it back to the information storage module after processing; 2) to temporarily store the case information obtained in the power adjustment based on interactive learning and to decide which table is currently in use.

The main functions of the sending/receiving module are: 1) to periodically receive the information fed back by macrocell users in neighboring cells and report the received values to the information processing module. For interference information, if no neighboring macrocell user is interfered with, or the interference is negligible (below a predetermined threshold), the reported value is set to 0 by convention; if several neighboring macrocell users are interfered with, the largest interference among them is taken as the reported value, so that the final result brings the interference to all macrocell users below the threshold; 2) to send the femtocell's own case and other related information to the femtocell gateway.
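The reporting rule for interference information can be sketched in a few lines. This is only an illustration: the function name `interference_report`, the parameter `negligible_below`, and the choice of a plain list of per-user readings are assumptions, not part of the patent text.

```python
from typing import Sequence


def interference_report(measured: Sequence[float], negligible_below: float) -> float:
    """Collapse per-user interference readings into the single reported value.

    measured: interference readings, one per neighboring macrocell user
              (hypothetical representation).
    negligible_below: predetermined threshold under which interference is ignored.
    """
    # Readings under the threshold are treated as "no interference".
    significant = [value for value in measured if value >= negligible_below]
    # No significant interference: report 0 by convention.
    # Several interfered users: report the worst case, so that meeting the
    # target for this value meets it for every macrocell user.
    return max(significant) if significant else 0.0
```

Reporting the maximum is what lets a single scalar stand in for all neighboring users: if the worst-affected user ends up below the threshold, so does everyone else.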

The main functions of the information processing module are: 1) in the power adjustment based on interactive cognition, to process the stored information and the reported information with the same period as the information receiving module, specifically to update the table and decide the resource block transmit power; 2) in the power adjustment based on interactive learning, to assemble its own table, similarity parameters, expertise level and any other required information into a case and hand it to the information sending/receiving module.

The present invention provides an embodiment in which the femtocell allocates resources with a proportional fair algorithm, assigning resource blocks to different users independently; the downlink power adjustment based on interactive cognition is illustrated using Q-learning.

The agent, state, action and reward in Q-learning are defined as follows:

Agent: the femtocell. Femtocells are indexed i = {1, 2, …, N}, and the resource blocks of each femtocell are indexed r = {1, 2, …, R}.

State s of the i-th femtocell on the r-th resource block: s^{i,r} = {I^{i,r}, Pow^i}, where I^{i,r} is the signal-to-interference-plus-noise ratio (SINR) indicator.

$I^{i,r} = \begin{cases} 0 & \text{if } SINR_I^{i,r} < (SINR_{Th} - x) \\ 1 & \text{if } (SINR_{Th} - x) \leq SINR_I^{i,r} \leq (SINR_{Th} + x) \\ 2 & \text{if } SINR_I^{i,r} > (SINR_{Th} + x) \end{cases}$

SINR_I^{i,r} is the SINR reported by the neighboring macrocell user for the r-th resource block of the i-th femtocell. The specific method for obtaining the corresponding terminal user's information from the neighboring macro base station can follow patent CN 102045795A, "A method and device for obtaining information from a target base station". SINR_Th is the prescribed interference threshold, and x is a constant (in dB) used to fine-tune the range;

$Pow^{i} = \begin{cases} 0 & \text{if } Pow^{i} < (Pow_{Th} - y) \\ 1 & \text{if } (Pow_{Th} - y) \leq Pow^{i} \leq Pow_{Th} \\ 2 & \text{if } Pow^{i} > Pow_{Th} \end{cases}$

In the conditions, Pow^i is the actual transmit power of the i-th femtocell, Pow_Th is the femtocell's rated maximum transmit power, and y is a constant (in dBm) used to fine-tune the range. In practice, if the sum of the allocated powers exceeds the maximum transmit power, the maximum transmit power is divided among the resource blocks in proportion to the powers assigned by Q-learning, while the state recorded in the Q-value table is left unchanged; the probability of this situation tends to 0 as learning converges.
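The proportional fallback described above can be illustrated as follows. This is a sketch under stated assumptions: the per-resource-block powers are taken to be in dBm (as the action set is), the sum and the proportional split are done in linear milliwatts, and the names `cap_total_power`, `dbm_to_mw` and `mw_to_dbm` are hypothetical. The patent text does not spell out these details.

```python
import math
from typing import List


def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def mw_to_dbm(p_mw: float) -> float:
    """Convert a power level from milliwatts to dBm."""
    return 10 * math.log10(p_mw)


def cap_total_power(rb_dbm: List[float], max_dbm: float) -> List[float]:
    """Scale per-resource-block powers so their sum does not exceed the maximum.

    Assumption: powers add in the linear domain (mW), and the split of the
    maximum power keeps the ratios chosen by Q-learning.
    """
    rb_mw = [dbm_to_mw(p) for p in rb_dbm]
    total_mw = sum(rb_mw)
    max_mw = dbm_to_mw(max_dbm)
    if total_mw <= max_mw:
        return list(rb_dbm)  # within budget: use the learned powers as they are
    scale = max_mw / total_mw
    return [mw_to_dbm(p * scale) for p in rb_mw]
```

Only the transmitted powers are scaled here; consistent with the text, nothing in this helper touches the state stored in the Q-value table.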

The quantization granularity of I^{i,r} and Pow^i can be adjusted to suit the actual situation.
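The two three-level quantization rules that make up the state can be written directly as code. The sketch below follows the piecewise definitions given for the SINR indicator and the power indicator; the function names and the tuple representation of the state are assumptions made for illustration.

```python
from typing import Tuple


def sinr_indicator(sinr_i: float, sinr_th: float, x: float) -> int:
    """Quantize the reported macrocell-user SINR around SINR_Th with margin x (dB)."""
    if sinr_i < sinr_th - x:
        return 0
    if sinr_i <= sinr_th + x:
        return 1
    return 2


def power_indicator(pow_i: float, pow_th: float, y: float) -> int:
    """Quantize the femtocell transmit power against its rated maximum Pow_Th."""
    if pow_i < pow_th - y:
        return 0
    if pow_i <= pow_th:
        return 1
    return 2


def quantized_state(sinr_i: float, pow_i: float,
                    sinr_th: float, x: float,
                    pow_th: float, y: float) -> Tuple[int, int]:
    """Return the state pair (I, Pow) for one resource block."""
    return (sinr_indicator(sinr_i, sinr_th, x), power_indicator(pow_i, pow_th, y))
```

With three levels each, the state space per resource block has nine entries; a finer quantization, as the text allows, would only change the thresholds and the number of return values.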

Action a: the power level that can be allocated to each resource block, a^{i,r} ∈ {a₁, a₂, …, a_M}, in dBm.

Reward re of the i-th femtocell on the r-th resource block:

$re^{i,r} = \begin{cases} -1 & \text{if } Pow^{i} > Pow_{Th} \\ e^{-(SINR_I^{i,r} - SINR_{Th})^{2}} - e^{-SINR^{i,r}/K} & \text{if } Pow^{i} \leq Pow_{Th} \end{cases}$ where K is a positive constant and SINR^{i,r} is the SINR on the r-th resource block of the i-th femtocell.
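A small sketch of the reward may help in reading it. It assumes the second branch is the difference of two exponentials, one peaking when the macrocell user's SINR sits exactly at the threshold and one penalizing a low femtocell SINR; this reading of the printed formula, and the function and parameter names, are assumptions.

```python
import math


def reward(pow_i: float, pow_th: float,
           sinr_macro: float, sinr_th: float,
           sinr_femto: float, k: float) -> float:
    """Reward for one resource block of one femtocell.

    sinr_macro: SINR reported by the neighboring macrocell user (SINR_I).
    sinr_femto: SINR of the femtocell itself on this resource block.
    k: positive constant K.
    """
    if pow_i > pow_th:
        return -1.0  # exceeding the rated maximum power is always penalized
    macro_term = math.exp(-(sinr_macro - sinr_th) ** 2)  # 1 when SINR_I == SINR_Th
    femto_term = math.exp(-sinr_femto / k)               # shrinks as femto SINR grows
    return macro_term - femto_term
```

Under this reading the reward is largest when the macrocell user is held right at the interference threshold while the femtocell's own SINR is high, which matches the stated goal of limiting interference without sacrificing femtocell throughput.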

In addition, every signal-to-interference-plus-noise ratio (SINR) in this example may be replaced by a channel quality indicator (CQI).

Referring to Fig. 3, the downlink power adjustment based on interactive cognition is implemented as follows:

(1) The femtocell maintains a three-dimensional Q-value table and initializes the Q-value table, the state s of each resource block, and the action a. The first dimension of the Q-value table is the state s, the second is the action a, and the third is the resource block r;

(2) The femtocell periodically obtains the SINR fed back by macrocell users in neighboring cells;

(3) From the received SINR value and the state quantization rules, it obtains the quantized current state s' and computes the reward re;

(4) The Q-value table is updated by the following rule:

$Q(s, a, r) = (1 - lf)\, Q(s, a, r) + lf \left( re + \gamma \max_{a'} Q(s', a', r) \right)$

where γ ∈ (0, 1) is a constant discount factor that expresses the importance of future rewards relative to the current reward, and lf ∈ [0, 1) is a constant learning factor that controls the rate of convergence;

(5) For each resource block, an action a' is selected from the current Q-value table and state s' with the e-greedy (epsilon-greedy) algorithm and executed during the current period. Specifically, e is a small constant between 0 and 1; a random number p between 0 and 1 is generated; if p < e, an action is chosen at random; otherwise the action with the largest Q value in state s' is chosen;

(6) The state and action are updated, s = s', a = a', and the procedure returns to step (2).
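Steps (4) and (5) of the loop above can be sketched as two small functions over a nested-list Q-value table indexed as q[state][action][resource block]. This is a minimal illustration, not the patented implementation: the flattening of the state pair into a single integer index, the zero initialization, and all function names are assumptions.

```python
import random
from typing import List

QTable = List[List[List[float]]]  # q[state][action][resource_block]


def make_q_table(n_states: int, n_actions: int, n_rbs: int) -> QTable:
    """Create a zero-initialized three-dimensional Q-value table."""
    return [[[0.0] * n_rbs for _ in range(n_actions)] for _ in range(n_states)]


def q_update(q: QTable, s: int, a: int, r: int, re: float,
             s_next: int, lf: float, gamma: float) -> None:
    """Apply the update rule of step (4) to entry (s, a, r) in place."""
    best_next = max(q[s_next][a2][r] for a2 in range(len(q[s_next])))
    q[s][a][r] = (1 - lf) * q[s][a][r] + lf * (re + gamma * best_next)


def e_greedy(q: QTable, s: int, r: int, e: float, rng: random.Random) -> int:
    """Pick an action index for resource block r in state s, as in step (5)."""
    n_actions = len(q[s])
    if rng.random() < e:
        return rng.randrange(n_actions)  # explore: random power level
    # exploit: power level with the largest Q value in this state
    return max(range(n_actions), key=lambda a: q[s][a][r])
```

A per-period loop would call `q_update` with the previous state and action, then `e_greedy` with the new state, once per resource block. Passing the random generator in explicitly keeps the sketch reproducible in tests.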

The similarity parameters and expertise level used in the power adjustment based on interactive learning are defined in this example as follows:

Similarity parameter sim: sim^{i,r} = {en^{i,r}, act_en^{i,r}}, where en^{i,r} is the environment similarity on resource block r of femtocell i, represented here by SINR^{i,r}, although other embodiments are not limited to SINR^{i,r}; act_en^{i,r} is the similarity of the effect that actions have on the environment on resource block r of femtocell i; a^{i,r} is the Q-learning action on resource block r of femtocell i; and the subscripts t and t-1 denote the current period and the previous period respectively.

Expertise level exp: exp^{i,r} = θ₁·ηⁱ + θ₂·con^{i,r}, where θ₁ and θ₂ are weights with θ₁ + θ₂ = 1; ηⁱ is the energy efficiency of the i-th femtocell, C is the femtocell throughput, P is the actual transmit power, and w is the bandwidth in use; con^{i,r} is the degree of convergence of the Q-value table on resource block r of femtocell i, q^{i,r} is the Q-value table on resource block r of femtocell i, and the subscripts t and t-1 denote the current period and the previous period respectively.
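The weighted combination that defines the expertise level can be shown as a one-line computation. The sketch takes the energy efficiency and the convergence degree as already-computed inputs, because their defining expressions are not reproduced in this text; the function name and the validation of the weight are assumptions.

```python
def expertise(eta: float, con: float, theta1: float) -> float:
    """Combine energy efficiency and Q-table convergence into one expertise value.

    eta:    energy efficiency of the femtocell (computed elsewhere).
    con:    convergence degree of the Q-value table on one resource block
            (computed elsewhere).
    theta1: weight of eta; the weight of con is 1 - theta1 so the two sum to 1.
    """
    if not 0.0 <= theta1 <= 1.0:
        raise ValueError("theta1 must lie in [0, 1]")
    theta2 = 1.0 - theta1
    return theta1 * eta + theta2 * con
```

For example, with eta = 0.8, con = 0.4 and theta1 = 0.25, the expertise is 0.25·0.8 + 0.75·0.4 = 0.5.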

Referring to Figs. 4 and 5, the downlink power adjustment based on interactive learning is implemented as follows:

(1) The femtocell periodically combines its Q-value table, similarity parameters, expertise level and other information into a case and reports it over the S1 interface to its femtocell gateway, which stores it in the case library. A case is deleted from the case library automatically once its time to live expires. Suppose the period of one power adjustment based on interactive cognition is T, the femtocell reporting period is 1000T, and the time to live of each case is 1200T; when the case library is full, cases whose time to live is below a set value (for example 500T) are deleted;
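The case library's housekeeping can be sketched as a small class. Several points are assumptions rather than statements from the patent: time is counted in integer multiples of T, "time to live below the set value" is read as remaining lifetime, a newer report from the same femtocell replaces its older case, and the names `Case` and `CaseLibrary` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class Case:
    femto_id: int
    q_table: Any              # the reported Q-value table
    sim: Tuple[float, ...]    # similarity parameters
    exp: float                # expertise level
    reported_at: int          # report time, in units of T


class CaseLibrary:
    def __init__(self, capacity: int, ttl: int = 1200, evict_below: int = 500):
        self.capacity = capacity
        self.ttl = ttl                  # lifetime of each case, in units of T
        self.evict_below = evict_below  # eviction bound used when the library is full
        self.cases: Dict[int, Case] = {}

    def remaining(self, case: Case, now: int) -> int:
        """Remaining lifetime of a case at time `now`."""
        return self.ttl - (now - case.reported_at)

    def expire(self, now: int) -> None:
        """Drop every case whose lifetime has run out."""
        self.cases = {k: c for k, c in self.cases.items() if self.remaining(c, now) > 0}

    def report(self, case: Case, now: int) -> None:
        """Store a newly reported case, evicting short-lived ones if full."""
        self.expire(now)
        if case.femto_id not in self.cases and len(self.cases) >= self.capacity:
            self.cases = {k: c for k, c in self.cases.items()
                          if self.remaining(c, now) >= self.evict_below}
        # The text does not say what happens if the library is still full;
        # this sketch stores the case regardless.
        self.cases[case.femto_id] = case
```

With a 1000T reporting period and a 1200T lifetime, a femtocell's previous case is still alive when its next report arrives, so an active femtocell always has one live case in the library.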

(2) The femtocell sends a learning request to the gateway and reports its similarity parameters sim₁ and expertise level exp₁. From sim₁ the gateway computes the similarity δ to each case in the case library, using the following formula:

where υ_i are weights, |p_i1 - p_i2| is the absolute value of the difference between the corresponding similarity parameters of the two cases, and N is the number of similarity parameters, which is 2 in this example;

(3) The cases whose similarity exceeds the threshold d_Th are taken as candidate cases. If there are none, the gateway sends the corresponding feedback to the femtocell and the learning process stops. Otherwise the candidate with the highest exp value is selected, its Q-value table is sent to the femtocell, and the case is removed from the candidate set. The femtocell adjusts its power using this Q-value table and after 100T checks whether its expertise level has increased. If it has, the femtocell overwrites its original Q-value table and sends a confirmation signal to the gateway, the gateway releases the candidate cases, and the learning process ends. If it has not, the femtocell sends another learning request, and the gateway selects the candidate with the highest exp value among those remaining and sends its Q-value table to the femtocell to learn from. This continues until the candidate set is empty; if no case has satisfied the condition by then, the gateway sends feedback to the femtocell and the learning process stops.
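The candidate selection and retry loop of steps (2) and (3) can be sketched as follows. The similarity values are passed in precomputed, since the similarity formula is not reproduced in this text. The callback `try_table` stands for "the femtocell runs with this case's Q-value table for 100T and reports whether its expertise level increased"; it, the case identifiers, and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def candidate_cases(similarities: Dict[str, float], d_th: float) -> List[str]:
    """Cases whose similarity to the requesting femtocell exceeds d_Th."""
    return [case_id for case_id, delta in similarities.items() if delta > d_th]


def active_learning(similarities: Dict[str, float],
                    expertise: Dict[str, float],
                    d_th: float,
                    try_table: Callable[[str], bool]) -> Optional[str]:
    """Return the id of the case whose table was adopted, or None if none helped."""
    candidates = candidate_cases(similarities, d_th)
    while candidates:
        # Offer the most expert remaining candidate and remove it from the set.
        best = max(candidates, key=lambda case_id: expertise[case_id])
        candidates.remove(best)
        if try_table(best):
            return best  # femtocell confirms and overwrites its own table
    return None  # candidate set empty: gateway sends feedback, learning stops
```

Note the ordering: similarity acts only as a filter, while expertise decides which of the sufficiently similar cases is tried first.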

(4) The most recently reported case c1 is selected, and the gateway computes its similarity to each case in the case library as described in step (2). The case c2 with the highest similarity is selected, and the expertise levels of c1 and c2 are compared. If |exp₁ - exp₂| ≤ exp_Th, the teaching process stops. If exp₁ - exp₂ > exp_Th, the Q-value table of c1 is sent to the femtocell of c2, which adjusts its power using that table; if its expertise level improves, it sends a confirmation signal to the gateway and overwrites its original Q-value table, otherwise it keeps using its original Q-value table, and the teaching process ends. If exp₂ - exp₁ > exp_Th, the Q-value table of c2 is sent to the femtocell of c1, which adjusts its power using that table; if its expertise level improves, it sends a confirmation signal to the gateway and overwrites its original Q-value table, otherwise it keeps using its original Q-value table, and the teaching process ends. exp_Th is the threshold on the difference in expertise level.
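The three-way decision in the teaching step reduces to a comparison of two expertise values against exp_Th. The sketch below returns which case teaches which; the function name and the string labels "c1" and "c2" are illustrative assumptions.

```python
from typing import Optional, Tuple


def teaching_direction(exp1: float, exp2: float,
                       exp_th: float) -> Optional[Tuple[str, str]]:
    """Decide who teaches whom between the new case c1 and its nearest case c2.

    Returns (teacher, learner), or None when the expertise gap is within
    exp_Th and the teaching process stops.
    """
    if abs(exp1 - exp2) <= exp_th:
        return None
    # The Q-value table always flows from the more expert case to the less expert one.
    return ("c1", "c2") if exp1 > exp2 else ("c2", "c1")
```

The learner then trials the received table exactly as in active learning: it keeps the table only if its own expertise level improves.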

It can be understood that the downlink power adjustment method based on interactive cognition provided by the present invention includes the following steps:

Step 101: Periodically receive resource-block-based information fed back by macrocell users in neighboring cells;

Step 102: Determine the signal-to-interference-plus-noise ratio (SINR) of a single resource block according to the resource-block-based information;

Step 103: Update the power information table according to the SINR, and determine the transmit power of the resource block according to the power information table;

Step 104: Allocate the determined resource block transmit power to the corresponding femtocell.

Preferably, allocating the determined resource block transmit power to the femtocell corresponding to the neighboring-cell macrocell user specifically comprises sending the determined resource block transmit power to that femtocell. The power information table includes at least a state dimension parameter s, an action dimension parameter a, and a resource block dimension parameter r. The state dimension parameter s is the set formed by the SINR of the femtocell user's resource block and the transmit power corresponding to that femtocell user; the action dimension parameter a is the power level that can be allocated to each resource block.

In addition, in the embodiment of the present invention, updating the power information table according to the SINR specifically includes the following steps:

Here, Q(s,a,r) is the current configuration rule of the power information table based on the state dimension parameter s, the action dimension parameter a, and the resource block dimension parameter r; lf is a constant learning factor that controls the rate of convergence, lf∈[0,1); γ is a constant discount factor, γ∈(0,1); s' is the quantized value of the previous state dimension parameter, and a' is the quantized value of the previous action dimension parameter. It can be understood that updating the power information table in the embodiment of the present invention may also use other update rules, for example parameters in different ranges or additional influencing factors.
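To make the roles of Q(s,a,r), lf, and γ concrete, the sketch below applies a textbook Q-learning update to a three-dimensional table indexed by state, action, and resource block. It is an assumption, not the update rule of this document: the exact formula is not reproduced in the text above, the `reward` argument is a hypothetical quantity, and the helper names `make_table`, `q_update`, and `best_power_level` are invented for illustration. The second state argument is named `s_other` because the textbook rule uses the following state there, whereas the text above defines s' as the previous state.

```python
from typing import List

Table = List[List[List[float]]]  # indexed as q[s][a][r]


def make_table(n_states: int, n_actions: int, n_rbs: int,
               init: float = 0.0) -> Table:
    """Create an initialized power information table Q(s, a, r)."""
    return [[[init] * n_rbs for _ in range(n_actions)]
            for _ in range(n_states)]


def q_update(q: Table, s: int, a: int, r: int, reward: float,
             s_other: int, lf: float = 0.5, gamma: float = 0.9) -> float:
    """Textbook Q-learning step (assumed form), applied per resource block r.

    Q(s,a,r) <- (1 - lf) * Q(s,a,r) + lf * (reward + gamma * max_a' Q(s',a',r))
    """
    best_other = max(q[s_other][a2][r] for a2 in range(len(q[s_other])))
    q[s][a][r] = (1 - lf) * q[s][a][r] + lf * (reward + gamma * best_other)
    return q[s][a][r]


def best_power_level(q: Table, s: int, r: int) -> int:
    """Greedy choice: the power level with the largest Q value for block r."""
    return max(range(len(q[s])), key=lambda a: q[s][a][r])
```

Keeping the resource block as a separate index means each resource block learns its own power level, which matches the description of the table as resource-block based. A smaller lf makes each update more conservative, consistent with its stated role of controlling the convergence rate.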

The downlink power adjustment method based on interactive learning provided by the present invention includes the following steps:

Step 201: The femtocell combines the relevant information into a case and stores it in the case library; a case is deleted automatically once its time to live has expired;

Step 202: The femtocell sends a learning request to the gateway and reports the relevant information; the case library then calculates the similarity between the reported information and the cases in the case library;

Step 203: A suitable case is selected according to similarity and professional knowledge degree, and the table corresponding to that case is sent to the femtocell that issued the learning request. After the base station performs power adjustment based on interactive cognition using this table, the change in its professional knowledge degree determines whether the learning process continues;

Step 204: For the most recently reported case, the femtocell gateway selects a similar case and compares the professional knowledge degrees. If the difference is within the threshold, the teaching process stops. Otherwise, the table of the case with the higher professional knowledge degree is sent to the femtocell corresponding to the case with the lower professional knowledge degree, which performs power adjustment based on interactive cognition. If its professional knowledge degree improves afterwards, it sends a confirmation signal to the gateway and overwrites its original table; otherwise, it keeps its original table, and the teaching process ends.
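The case library with a time to live from step 201 can be sketched as a small in-memory store. This is a minimal illustration under stated assumptions: the class name `CaseLibrary`, keying cases by a femtocell identifier, replacing an older case when the same femtocell reports again, and passing the current time explicitly as `now` are all choices made for this example and are not specified in the text.

```python
from typing import Any, Dict, Tuple


class CaseLibrary:
    """Gateway-side case store; cases expire after a fixed time to live."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        # station_id -> (report_time, case); one live case per femtocell
        # is an assumption made for this sketch.
        self._cases: Dict[str, Tuple[float, Any]] = {}

    def report(self, station_id: str, case: Any, now: float) -> None:
        """Store a newly reported case, replacing any older one."""
        self._cases[station_id] = (now, case)

    def purge(self, now: float) -> None:
        """Delete every case whose age exceeds the time to live."""
        self._cases = {k: v for k, v in self._cases.items()
                       if now - v[0] <= self.ttl}

    def live_cases(self, now: float) -> Dict[str, Any]:
        """Return the cases that are still within their time to live."""
        self.purge(now)
        return {k: v[1] for k, v in self._cases.items()}
```

Passing `now` as an argument rather than reading a clock keeps the expiry logic deterministic, so the similarity search in steps 202 to 204 always runs over a well-defined set of live cases.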

With the above solution, in the embodiment of the present invention, the smallest time-frequency unit that can be allocated to a user is the resource block; furthermore, resource-block-based power adjustment enables interactive learning among femtocells and improves the femtocells' own performance.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

It can thus be seen that the present invention provides a downlink power adjustment method and device, together with a variety of optional adaptations. The above embodiments do not limit the technical solutions described in the present invention. Therefore, although this specification has described the present invention with reference to the above embodiments, those skilled in the art should understand that all technical solutions and improvements that do not depart from the spirit and scope of the present invention shall fall within the scope of the claims of the present invention.

Claims

1. A downlink power adjustment method based on interactive cognitive learning, characterized by comprising the following processes:

Process 1: downlink power adjustment based on interactive cognition;

Process 2: downlink power adjustment based on interactive learning.

2. The downlink power adjustment method based on interactive cognitive learning as claimed in claim 1, characterized in that, in process 1, the downlink power adjustment based on interactive cognition comprises:

(1) each femtocell maintains a downlink power information table, which takes the resource block as its smallest unit and is used to determine the transmit power of each resource block; all data in the downlink power information table are initialized;

(2) the femtocell periodically senses the interference caused by the current transmit power configuration, and then updates the corresponding data in the downlink power information table according to the sensed information and the corresponding update rule;

(3) the transmit power configuration for the next period is determined using the updated downlink power information table and the specified allocation method;

(4) steps (2) and (3) are repeated, the final goal being that each transmit power allocated according to the sensed information is optimal.

3. The downlink power adjustment method based on interactive cognitive learning as claimed in claim 1, characterized in that, in process 2, the downlink power adjustment based on interactive learning comprises:

(1) the femtocell periodically reports its self-maintained power information table, similarity parameters, and professional knowledge degree, as one case, to a terminal device with an aggregation function; the terminal device is a femtocell gateway; the femtocell gateway sets a time to live for each case, and a case is deleted automatically once its time to live has expired;

(2) the femtocell performs active learning and the femtocell gateway performs active teaching; the two modes run concurrently and are used for downlink power adjustment.

4. The downlink power adjustment method based on interactive cognitive learning as claimed in claim 3, characterized in that, in step (2), the active learning refers to: a femtocell whose time since power-on is less than a threshold actively sends a learning request to the femtocell gateway and reports its similarity parameters and professional knowledge degree; the gateway calculates, from the reported similarity parameters, the similarity to each case in the case library, and takes the cases whose similarity is above the threshold as candidate cases; it then compares the professional knowledge degrees of the candidate cases and takes the highest as the final selected case; the table corresponding to this case is sent to the femtocell that issued the learning request, and this case is deleted from the case library; after the base station performs power adjustment based on interactive cognition using this table, if its professional knowledge degree improves, it sends a confirmation signal to the gateway and overwrites its original table, and the learning process ends; otherwise, the gateway selects the case with the highest professional knowledge degree among the current candidate cases as the final selected case and sends it to the femtocell that issued the learning request for learning, until the candidate cases are exhausted, whereupon the gateway sends feedback to the base station and the learning process stops.

5. The downlink power adjustment method based on interactive cognitive learning as claimed in claim 3, characterized in that, in step (2), the active teaching refers to: for the most recently reported case, the femtocell gateway calculates the similarity between this case and each case in the case library, selects the case with the highest similarity, and compares the professional knowledge degrees of the reported case and the selected case; if the difference is within the threshold, the teaching process stops; otherwise, the table of the case with the higher professional knowledge degree is sent to the femtocell corresponding to the case with the lower professional knowledge degree, which performs power adjustment based on interactive cognition; if its professional knowledge degree improves afterwards, it sends a confirmation signal to the gateway and overwrites its original table, and the teaching process ends; otherwise, it keeps its original table, and the teaching process ends; this process need not be performed for every newly reported case.

6. The downlink power adjustment method based on interactive cognitive learning as claimed in claim 1, characterized in that the downlink power adjustment based on interactive cognition and the downlink power adjustment based on interactive learning are performed cyclically in practical operation.

7. A downlink power adjustment device based on interactive cognitive learning, characterized by comprising three modules on the base station side: an information storage module, an information sending/receiving module, and an information processing module.

8. The downlink power adjustment device based on interactive cognitive learning as claimed in claim 7, characterized in that:

The main functions of the information storage module are: 1) storing the power information table maintained in the power adjustment based on interactive cognition; the information processing module uses the power information table and, after processing, stores it back in the information storage module; 2) temporarily storing the case information obtained in the power adjustment based on interactive learning and determining the table currently in use;

The main functions of the information sending/receiving module are: 1) periodically receiving the various information fed back by neighboring-cell macrocell users and reporting the received values to the information processing module; for interference information, if there is no interference to neighboring-cell macrocell users or the interference is below a predetermined threshold, it can be ignored and the reported value is set to 0 by convention; if multiple neighboring-cell macrocell users are interfered with, the largest interference is chosen as the reported value, so that the final result brings the interference to all macrocell users below the threshold; 2) sending the femtocell's own case and other relevant information to the femtocell gateway;

The main functions of the information processing module are: 1) in the power adjustment based on interactive cognition, periodically processing the stored information and the reported information together with the information receiving module, specifically updating the table information and determining the resource block transmit power; 2) in the power adjustment based on interactive learning, assembling the self-maintained table, similarity parameters, professional knowledge degree, and other required information into a case and passing it to the information sending/receiving module.

9. A downlink power adjustment method, characterized by comprising the following steps:

(1) periodically receiving resource-block-based information fed back by neighboring-cell macrocell users;

(2) determining the signal-to-interference-plus-noise ratio (SINR) of a single resource block according to the resource-block-based information;

(3) updating the power information table according to the SINR, and determining the transmit power of the resource block according to the power information table;

(4) allocating the determined resource block transmit power to the corresponding femtocell user.

10. The method as claimed in claim 9, characterized in that allocating the determined resource block transmit power to the neighboring-cell macrocell user specifically comprises: sending the determined resource block transmit power to the neighboring-cell macrocell user; wherein the power information table includes at least a state dimension parameter s, an action dimension parameter a, and a resource block dimension parameter r; the state dimension parameter s is the set formed by the SINR of the femtocell user's resource block and the transmit power corresponding to that femtocell user; the action dimension parameter a is the power level that can be allocated to each resource block;

The specific steps for updating the power information table according to the SINR are as follows:

Wherein Q(s,a,r) is the current configuration rule of the power information table based on the state dimension parameter s, the action dimension parameter a, and the resource block dimension parameter r; lf is a constant learning factor that controls the rate of convergence, lf∈[0,1); γ is a constant discount factor, γ∈(0,1); s' is the quantized value of the previous state dimension parameter, and a' is the quantized value of the previous action dimension parameter.