WO2023109828A1 - Data collection method and apparatus, first device, and second device - Google Patents


Info

Publication number
WO2023109828A1
WO2023109828A1 · PCT/CN2022/138757 · CN2022138757W
Authority
WO
WIPO (PCT)
Prior art keywords
candidate
data
training data
training
data collection
Application number
PCT/CN2022/138757
Other languages
English (en)
French (fr)
Inventor
孙布勒 (SUN, Bule)
孙鹏 (SUN, Peng)
杨昂 (YANG, Ang)
Original Assignee
维沃移动通信有限公司 (Vivo Mobile Communication Co., Ltd.)
Application filed by Vivo Mobile Communication Co., Ltd. (维沃移动通信有限公司)
Publication of WO2023109828A1

Classifications

    • H04B 17/30 Monitoring; Testing of propagation channels
    • H04L 25/0202 Channel estimation
    • H04L 25/0224 Channel estimation using sounding signals
    • H04L 41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H04L 41/16 Arrangements for maintenance, administration or management of data switching networks using machine learning or artificial intelligence
    • H04L 5/0053 Allocation of signaling, i.e. of overhead other than pilot signals
    • H04W 24/10 Scheduling measurement reports; Arrangements for measurement reports
    • Y02D 30/70 Reducing energy consumption in wireless communication networks

Definitions

  • the present application belongs to the technical field of communications, and specifically relates to a data collection method and device, a first device, and a second device.
  • AI: Artificial Intelligence. Commonly used AI algorithms include neural networks, decision trees, support vector machines, and Bayesian classifiers.
  • When the AI module in a wireless communication system performs centralized training, a large amount of data needs to be collected from remote ends to construct a data set. Since remote wireless data overlaps considerably, it is not necessary to collect data from all users. In this way, on the one hand, the transmission pressure during data collection can be alleviated; on the other hand, too many repeated (or similar) samples in the data set can be avoided.
  • the embodiments of the present application provide a data collection method and device, a first device, and a second device, which can alleviate the transmission pressure during data collection and avoid excessive repeated samples in the data set.
  • a data collection method including:
  • the first device sends a first instruction to the second device, instructing the second device to collect and report training data for specific AI model training;
  • the first device receives the training data reported by the second device;
  • the first device uses the training data to construct a data set to train the specific AI model.
  • a data collection device including:
  • a sending module configured to send a first instruction to a second device, instructing the second device to collect and report training data for specific AI model training;
  • a receiving module configured to receive the training data reported by the second device;
  • a training module configured to use the training data to construct a data set to train the specific AI model.
  • a data collection method including:
  • the second device receives a first instruction from the first device, and the first instruction is used to instruct the second device to collect and report training data for specific AI model training;
  • the second device collects training data, and reports the training data to the first device.
  • a data collection device including:
  • a receiving module configured to receive a first instruction from the first device, where the first instruction is used to instruct the second device to collect and report training data for specific AI model training;
  • a processing module configured to collect training data and report the training data to the first device.
  • In a fifth aspect, a first device is provided, including a processor and a memory, where the memory stores a program or instructions runnable on the processor, and when the program or instructions are executed by the processor, the steps of the method described in the first aspect are implemented.
  • A first device is provided, including a processor and a communication interface, where the communication interface is used to send a first indication to a second device, instructing the second device to collect and report training data for specific AI model training, and to receive the training data reported by the second device; the processor is configured to use the training data to construct a data set to train the specific AI model.
  • In a seventh aspect, a second device is provided, including a processor and a memory, where the memory stores a program or instructions runnable on the processor, and when the program or instructions are executed by the processor, the steps of the method described in the third aspect are implemented.
  • A second device is provided, including a processor and a communication interface, where the communication interface is used to receive a first indication from the first device, the first indication being used to instruct the second device to collect and report training data for specific AI model training; the processor is used to collect the training data and report it to the first device.
  • In a ninth aspect, a data collection system is provided, including a first device and a second device, where the first device can be used to execute the steps of the data collection method described in the first aspect, and the second device can be used to execute the steps of the data collection method described in the third aspect.
  • A readable storage medium is provided, on which a program or instructions are stored, and when the program or instructions are executed by a processor, the steps of the method described in the first aspect, or the steps of the method described in the third aspect, are implemented.
  • In an eleventh aspect, a chip is provided, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is used to run a program or instructions to implement the method described in the first aspect or the method described in the third aspect.
  • A computer program/program product is provided, stored in a storage medium and executed by at least one processor to implement the steps of the data collection method described in the first aspect, or the steps of the data collection method described in the third aspect.
  • In the embodiments of the present application, the first device does not require all candidate second devices to collect and report training data. Instead, the first device first screens the candidate second devices to determine the second devices that need to collect and report training data, and then sends the first indication to those second devices, instructing them to collect and report training data. On the one hand, this alleviates the transmission pressure during data collection; on the other hand, it avoids too many repeated samples in the data set and reduces the model training pressure.
  • FIG. 1 is a block diagram of a wireless communication system to which an embodiment of the present application is applicable;
  • FIG. 2 is a schematic diagram of channel state information feedback;
  • FIG. 3 is a schematic diagram of the performance of AI training with different numbers of iterations;
  • FIG. 4 is a schematic flow diagram of a first device-side data collection method according to an embodiment of the present application.
  • FIG. 5 is a schematic flow diagram of a second device-side data collection method according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a communication device according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a network side device according to an embodiment of the present application.
  • The terms "first", "second" and the like in the specification and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the application can operate in sequences other than those illustrated or described herein. Objects distinguished by "first" and "second" are usually of one category, and the number of objects is not limited; for example, there may be one or more first objects.
  • "And/or" in the description and claims means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the related objects.
  • LTE: Long Term Evolution
  • LTE-A: Long Term Evolution-Advanced (LTE-Advanced)
  • CDMA: Code Division Multiple Access
  • TDMA: Time Division Multiple Access
  • FDMA: Frequency Division Multiple Access
  • OFDMA: Orthogonal Frequency Division Multiple Access
  • SC-FDMA: Single-Carrier Frequency Division Multiple Access
  • The terms "system" and "network" in the embodiments of the present application are often used interchangeably, and the described technology can be used both for the above-mentioned systems and radio technologies and for other systems and radio technologies.
  • NR: New Radio
  • The following description describes the New Radio (NR) system for illustrative purposes and uses NR terminology in most of the description, but these techniques can also be applied to systems other than NR, such as the 6th Generation (6G) communication system.
  • Fig. 1 shows a block diagram of a wireless communication system to which the embodiment of the present application is applicable.
  • the wireless communication system includes a terminal 11 and a network side device 12 .
  • The terminal 11 may be a mobile phone, a tablet computer, a laptop or notebook computer, a personal digital assistant (PDA), a palmtop computer, a netbook, an ultra-mobile personal computer (UMPC), a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, vehicle user equipment (VUE), pedestrian user equipment (PUE), a smart home device (a home appliance with wireless communication functions, such as a refrigerator, TV, washing machine or furniture), a game console, a personal computer (PC), a teller machine, a self-service machine, or another terminal-side device; wearable devices include smart watches, smart bracelets, and the like.
  • The network side device 12 may include an access network device or a core network device, where the access network device may also be called a radio access network device, a radio access network (RAN), a radio access network function, or a radio access network unit.
  • The access network device 12 may include a base station, a WLAN access point, a WiFi node, etc. The base station may be called a Node B, an evolved Node B (eNB), an access point, a Base Transceiver Station (BTS), a radio base station, a radio transceiver, a Basic Service Set (BSS), an Extended Service Set (ESS), a Home Node B, a Home Evolved Node B, a Transmitting Receiving Point (TRP), or some other appropriate term; as long as the same technical effect is achieved, the base station is not limited to specific technical vocabulary. It should be noted that the embodiments of this application only take the base station in the NR system as an example for introduction, and the specific type of the base station is not limited.
  • For different application scenarios, the selected AI algorithm and the adopted model also differ.
  • the main way to improve the performance of the 5th Generation (5G) network with the help of AI is to enhance or replace existing algorithms or processing modules through algorithms and models based on neural networks.
  • algorithms and models based on neural networks can achieve better performance than deterministic algorithms.
  • the more commonly used neural networks include deep neural networks, convolutional neural networks, and recurrent neural networks.
  • The construction, training, and verification of neural networks can be realized accordingly.
  • The embodiment of the present application provides a data collection method, as shown in FIG. 4, including:
  • Step 101: the first device sends a first indication to the second device, instructing the second device to collect and report training data for specific AI model training;
  • Step 102: the first device receives the training data reported by the second device;
  • Step 103: the first device uses the training data to construct a data set to train the specific AI model.
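  • The three steps above can be sketched as follows. This is an illustrative sketch only: the device objects, the screening predicate, and the training callback are hypothetical names, since this application defines no concrete API.

```python
# Illustrative sketch of the first-device flow (steps 101-103).
# All names are hypothetical; the application defines no concrete API.

def first_device_collect_and_train(candidates, screen, train_model):
    """candidates: candidate second devices within communication range;
    screen: the first screening condition, modeled as a predicate;
    train_model: trains the specific AI model on a constructed data set."""
    selected = [dev for dev in candidates if screen(dev)]
    for dev in selected:
        dev.receive_first_indication()              # step 101: send the first indication
    dataset = []
    for dev in selected:
        dataset.extend(dev.report_training_data())  # step 102: receive reported data
    return train_model(dataset)                     # step 103: train on the data set
```

  • Because only the screened devices receive the indication, unselected candidates never transmit, which is the transmission-pressure relief the embodiments describe.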
  • the first device does not require all candidate second devices to collect and report training data, but the first device first screens the candidate second devices to determine the first device that needs to collect and report training data. Two devices, and then send the first instruction to the second device, instructing the second device to collect and report training data, so that on the one hand, it can alleviate the transmission pressure during data collection, on the other hand, it can avoid too many repeated samples in the data set, and reduce model training pressure.
  • the first device sending the first indication to the second device includes:
  • the first device selects N second devices from M candidate second devices according to a preset first screening condition and unicasts the first indication to the N second devices, where M and N are positive integers and N is less than or equal to M; or
  • the first device broadcasts the first indication to the M candidate second devices, where the first indication carries a second screening condition used to screen the second devices that report the training data, and the second devices satisfy the second screening condition.
  • The candidate second devices are within the communication range of the first device, and the second devices reporting training data are selected from the candidate second devices. All candidate second devices may serve as second devices, or a subset of the candidate second devices may be screened out as second devices. Broadcast sends the first indication to all candidate second devices, while unicast sends it only to the selected second devices.
  • Candidate second devices that receive the unicast first indication need to collect and report training data.
  • Candidate second devices that receive the broadcast first indication need to judge whether they satisfy the second screening condition, and only those that satisfy it collect and report the training data.
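  • The two delivery modes can be contrasted in a short sketch. The dictionary layout and the identifier-set form of the second screening condition are assumptions for illustration, not an encoding defined by this application.

```python
# Hypothetical sketch: how a candidate second device decides whether to
# collect and report, depending on how the first indication arrived.

def should_report(device_id, indication):
    """Unicast is delivered only to the selected second devices, so merely
    receiving it obliges the device to report; under broadcast the device
    checks the carried second screening condition itself."""
    if indication["mode"] == "unicast":
        return True
    # Assumed form of the second screening condition: a set of identifiers
    # of devices required to collect and report training data.
    return device_id in indication["second_screening_condition"]
```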
  • the first device sends the first indication to the second device through at least one of the following:
  • Media Access Control Control Element (MAC CE);
  • Radio Resource Control (RRC) message;
  • Non-Access Stratum (NAS) message;
  • System Information Block (SIB);
  • Physical Downlink Shared Channel (PDSCH) information;
  • Physical Random Access Channel (PRACH) Message (MSG) 2 information;
  • PC5 interface signaling;
  • Physical Sidelink Control Channel (PSCCH) information;
  • Physical Sidelink Shared Channel (PSSCH) information;
  • Physical Sidelink Broadcast Channel (PSBCH) information;
  • Physical Sidelink Discovery Channel (PSDCH) information;
  • Physical Sidelink Feedback Channel (PSFCH) information.
  • Optionally, before the first device sends the first indication to the second device, the method further includes:
  • the first device receiving first training data and/or a first parameter reported by the candidate second devices, where the first parameter may be a judgment parameter of the first screening condition.
  • The candidate second devices may first report a small amount of training data (that is, the first training data) and/or the first parameter, and the first device determines, based on this small amount of training data and/or the first parameter, the data sources participating in training, so as to screen out the second devices that collect and report training data and avoid having all candidate second devices report training data.
  • the first device only receives first training data reported by the candidate second device, and determines the first parameter according to the first training data.
  • the first device may speculate, perceive, detect or infer the first parameter according to the first training data.
  • the first device may screen candidate second devices according to the first parameter to determine the second device.
  • the first parameter includes at least one of the following:
  • the service type of the candidate second device, such as Enhanced Mobile Broadband (eMBB), Ultra-Reliable Low-Latency Communications (URLLC), Massive Machine Type Communication (mMTC), other new 6G scenarios, etc.;
  • the working scenario of the candidate second device, including but not limited to: high speed, low speed, line-of-sight (LOS) propagation, non-line-of-sight (NLOS) propagation, high signal-to-noise ratio, low signal-to-noise ratio, and other working scenarios;
  • the communication network access mode of the candidate second device, including mobile network, WiFi, and fixed network, where the mobile network includes the 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5G, and 6G;
  • the power state of the candidate second device, such as the specific value of the available remaining power, a graded description result, or whether it is charging;
  • the storage state of the candidate second device, such as the specific value of available memory or a graded description result.
  • The candidate second devices may first report a small amount of training data (that is, the first training data) and/or the first parameter to the first device, where the first parameter may be a judgment parameter of the first screening condition; the first device determines, according to the first screening condition, the second devices that need to collect and report training data, and those second devices are selected from the candidate second devices.
  • The second devices that need to collect and report training data may be determined according to the data type of the candidate second devices: the candidate second devices are grouped by data type, with the data types of the candidate second devices in each group being the same or similar; K1 candidate second devices are then selected from each group as second devices that need to collect and report training data, where K1 is a positive integer. This ensures the diversity of the data participating in training, since each group of candidate second devices has a second device that collects and reports training data.
  • The second devices that need to collect and report training data may be determined according to the service type of the candidate second devices: the candidate second devices are grouped by service type, with the service types of the candidate second devices in each group being the same or similar; K2 candidate second devices are then selected from each group as second devices that need to collect and report training data, where K2 is a positive integer. This ensures the diversity of the data participating in training, since each group of candidate second devices has a second device that collects and reports training data.
  • The second devices that need to collect and report training data may be determined according to the data distribution parameters of the candidate second devices: the candidate second devices are grouped by data distribution parameters, with the data distribution parameters of the candidate second devices in each group being the same or similar; K3 candidate second devices are then selected from each group as second devices that need to collect and report training data, where K3 is a positive integer. This ensures the diversity of the data participating in training, since each group of candidate second devices has a second device that collects and reports training data.
  • The second devices that need to collect and report training data may be determined according to the working scenario of the candidate second devices: the candidate second devices are grouped by working scenario, with the working scenarios of the candidate second devices in each group being the same or similar; A candidate second devices are then selected from each group as second devices that need to collect and report training data, where A is a positive integer. This ensures the diversity of the data participating in training, since each group of candidate second devices has a second device that collects and reports training data.
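  • The four grouping criteria above (data type, service type, data distribution parameters, working scenario) share one pattern: group the candidates by an attribute and take up to K representatives per group. A minimal sketch, with all names assumed for illustration:

```python
from collections import defaultdict

def select_per_group(candidates, group_key, k):
    """Group candidate second devices by an attribute (e.g. data type,
    service type, data distribution, or working scenario) and pick up to
    k devices from each group, so every group contributes reporters while
    near-duplicate data sources stay silent."""
    groups = defaultdict(list)
    for dev in candidates:
        groups[group_key(dev)].append(dev)
    selected = []
    for members in groups.values():
        selected.extend(members[:k])  # any k representatives per group
    return selected
```

  • Which k representatives to take within a group is left open here; the application only requires that each group be represented.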
  • The second devices that need to collect and report training data may be determined according to the communication network access mode of the candidate second devices: the candidate second devices are prioritized by access mode. Access modes include fixed network, WiFi, and mobile network, and the mobile network includes 2G, 3G, 4G, 5G, 6G, etc. The priority of the fixed network is greater than or equal to that of WiFi, and the priority of WiFi is greater than or equal to that of the mobile network; within mobile networks, the higher the generation, the higher the priority of being selected. For example, a 5G candidate second device has a higher selection priority than a 4G candidate second device. B candidate second devices are then selected in descending order of priority as the second devices that need to collect and report training data, where B is a positive integer.
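  • The access-mode ranking can be sketched as a priority table plus a sort. The numeric values below are arbitrary assumptions, chosen only to respect the stated order (fixed network ahead of WiFi, WiFi ahead of mobile, higher generation first within mobile).

```python
# Hypothetical priority table; only the ordering matters, not the values.
ACCESS_PRIORITY = {
    "fixed": 100, "wifi": 90,
    "6G": 60, "5G": 50, "4G": 40, "3G": 30, "2G": 20,
}

def select_by_access(candidates, b):
    """Pick B candidate second devices in descending access-mode priority."""
    ranked = sorted(candidates,
                    key=lambda dev: ACCESS_PRIORITY[dev["access"]],
                    reverse=True)
    return ranked[:b]
```

  • The same table-plus-sort shape fits the later criteria (channel quality, collection difficulty, power state, storage state), with the key function swapped.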
  • The second devices that need to collect and report training data may be determined according to the channel quality of the candidate second devices: the candidate second devices are prioritized by channel quality, with higher channel quality giving a higher priority of being selected, and the required number of second devices is chosen in descending order of priority.
  • The second devices that need to collect and report training data may be determined according to the difficulty of data collection by the candidate second devices: the candidate second devices are prioritized by collection difficulty, with lower difficulty giving a higher priority of being selected, and D candidate second devices are selected in descending order of priority as the second devices that need to collect and report training data, where D is a positive integer. This reduces the difficulty of data collection.
  • The second devices that need to collect and report training data may be determined according to the power state of the candidate second devices: the candidate second devices are prioritized by power state, with more remaining power giving a higher priority of being selected and candidate second devices in the charging state having the highest priority; E candidate second devices are selected in descending order of priority as the second devices that need to collect and report training data, where E is a positive integer. This ensures that the second devices that collect and report training data have sufficient power.
  • The second devices that need to collect and report training data may be determined according to the storage state of the candidate second devices: the candidate second devices are prioritized by storage state, and the required number of second devices is selected in descending order of priority.
  • Optionally, the unicast first indication includes at least one of the following:
  • the number of samples of training data to be collected by the second device, where the numbers of samples collected by different second devices may be different or the same;
  • the time at which the second device collects the training data, where the collection times of different second devices may be different or the same;
  • the time at which the second device reports the training data to the first device, where the reporting times of different second devices may be different or the same;
  • the data format in which the second device reports the training data to the first device.
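  • The per-device fields above might be represented as a simple payload; the field names below are hypothetical, chosen only to mirror the list, since this application does not define a concrete encoding.

```python
def make_first_indication(num_samples, collect_time, report_time, data_format):
    """Build one unicast first indication; each field may differ between
    second devices, as noted above. All field names are assumptions."""
    return {
        "num_samples": num_samples,    # samples of training data to collect
        "collect_time": collect_time,  # when the second device collects
        "report_time": report_time,    # when it reports to the first device
        "data_format": data_format,    # format of the reported training data
    }
```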
  • the broadcasted first indication includes at least one of the following:
  • the number of samples of training data to be collected by the candidate second devices performing data collection, where the numbers of samples collected by different candidate second devices may be different or the same;
  • the time at which the candidate second devices performing data collection report the training data to the first device, where the reporting times of different candidate second devices may be different or the same;
  • Optionally, the second screening condition is the first screening condition.
  • The identifiers of the candidate second devices that collect data and the identifiers of those that do not collect data may constitute the second screening condition, and a candidate second device can judge whether it satisfies the second screening condition according to its own identifier.
  • the method further includes:
  • the first device sends the trained AI model and hyperparameters to L inference devices, where L may be greater than, equal to, or less than M.
  • The first device builds a training data set from the received training data, trains the specific AI model, and sends the converged AI model and its hyperparameters to the L inference devices.
  • The inference devices are second devices that need to perform performance verification and inference on the AI model; an inference device may be selected from the candidate second devices, or may be a second device other than the candidate second devices.
  • the first device sends the trained AI model and hyperparameters to the inference device through at least one of the following:
  • Non-Access Stratum (NAS) message;
  • System Information Block (SIB);
  • Physical Sidelink Discovery Channel (PSDCH) information;
  • Physical Sidelink Feedback Channel (PSFCH) information.
  • the AI model is a meta-learning model, and the hyperparameters may be determined by the first parameters.
  • the hyperparameters related to the meta-learning model include at least one of the following:
• the first device may be a network-side device, and the second device may be a terminal; or, the first device may be a network-side device, and the second device may be a network-side device, for example, a scenario where multiple network-side devices aggregate training data to one network-side device for training; or, the first device is a terminal, and the second device is a terminal, for example, a scenario where multiple terminals aggregate training data to one terminal for training.
  • the candidate second device may be a network side device or a terminal;
  • the reasoning device may be a network side device or a terminal.
  • the embodiment of the present application also provides a data collection method, as shown in Figure 5, including:
• Step 201: the second device receives a first indication from the first device, where the first indication is used to instruct the second device to collect and report training data for specific AI model training;
• Step 202: the second device collects training data and reports the training data to the first device.
• the first device does not require all candidate second devices to collect and report training data; instead, the first device first screens the candidate second devices to determine the second devices that need to collect and report training data, and then sends the first indication to those second devices, instructing them to collect and report training data. On the one hand, this relieves the transmission pressure during data collection; on the other hand, it avoids too many repeated samples in the data set and reduces the model training pressure.
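The screening-then-collection flow at the first device can be sketched as follows. This is a minimal illustration under assumed names (`CandidateDevice`, `run_data_collection`, and the eligibility flag are all hypothetical stand-ins, not from the patent):

```python
# Illustrative end-to-end flow at the first device: screen candidate second
# devices, send the first indication to the selected ones, gather their
# reported data, then train on the pooled data set.

class CandidateDevice:
    def __init__(self, dev_id, samples, eligible):
        self.dev_id = dev_id
        self.samples = samples        # training data this device can collect
        self.eligible = eligible      # stands in for the first filter condition
        self.indicated = False

    def send_first_indication(self):
        self.indicated = True

    def report_training_data(self):
        return self.samples if self.indicated else []

def run_data_collection(candidates, train):
    selected = [c for c in candidates if c.eligible]      # screening step
    for dev in selected:
        dev.send_first_indication()                       # unicast first indication
    dataset = []
    for dev in selected:
        dataset.extend(dev.report_training_data())        # pooled data set
    return train(dataset)

devices = [CandidateDevice("a", [1, 2], True),
           CandidateDevice("b", [3, 4], False),
           CandidateDevice("c", [5], True)]
model = run_data_collection(devices, train=lambda ds: {"n_samples": len(ds)})
print(model)  # {'n_samples': 3}
```

Note that device "b" is screened out, so its samples never reach the training set, which is the transmission-pressure benefit described above.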
  • the second device reports training data to the first device through at least one of the following:
• Non-access stratum NAS message;
• Physical sidelink discovery channel PSDCH information;
• Physical sidelink feedback channel PSFCH information.
  • the second device receiving the first indication from the first device includes:
• the second device receives the first indication unicast by the first device, where the second device is a second device selected by the first device from candidate second devices according to a preset first filter condition; or
• the second device receives the first indication broadcast by the first device to candidate second devices, where the first indication carries a second filter condition, the second filter condition is used to filter the second devices that report the training data, and the second device satisfies the second filter condition.
• the second device collecting training data and reporting the training data to the first device includes:
• if the second device receives the first indication unicast by the first device, the second device collects and reports the training data; or
• if the second device receives the first indication broadcast by the first device and satisfies the second filter condition, the second device collects and reports the training data.
• the candidate second devices are within the communication range of the first device, and the second devices that report training data are selected from the candidate second devices. All candidate second devices may serve as second devices, or part of the candidate second devices may be filtered out to serve as second devices. Broadcasting sends the first indication to all candidate second devices, while unicasting sends the first indication only to the selected second devices.
• Candidate second devices that receive the unicast first indication need to collect and report training data.
• Candidate second devices that receive the broadcast first indication need to judge whether they meet the second screening condition, and only the candidate second devices that meet the second screening condition collect and report the training data.
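The unicast/broadcast branching at a candidate second device can be sketched as below. The message fields (`mode`, `second_filter`) are illustrative assumptions; the patent does not specify an encoding:

```python
# Sketch of a candidate second device's handling of the first indication
# (the message fields and names here are assumptions for illustration).

def handle_first_indication(indication: dict, own_id: str) -> bool:
    """Return True if this device should collect and report training data."""
    if indication.get("mode") == "unicast":
        # A unicast indication is only delivered to already-selected devices.
        return True
    if indication.get("mode") == "broadcast":
        # A broadcast indication carries the second filter condition;
        # only devices satisfying it collect and report.
        condition = indication.get("second_filter", set())
        return own_id in condition
    return False

print(handle_first_indication({"mode": "unicast"}, "ue-1"))  # True
print(handle_first_indication({"mode": "broadcast",
                               "second_filter": {"ue-2"}}, "ue-1"))  # False
```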
• before the second device receives the first indication from the first device, the method further includes:
  • the candidate second device reports first training data and/or a first parameter to the first device, and the first parameter may be a judgment parameter of the first screening condition.
• the candidate second devices may first report a small amount of training data (that is, the first training data) and/or the first parameter, and the first device determines, according to the small amount of training data and/or the first parameter, the data sources that participate in training, so as to screen out the second devices that collect and report training data, instead of having all candidate second devices report training data.
  • the second device reports the first training data and/or the first parameters to the first device through at least one of the following:
• Non-access stratum NAS message;
• Physical sidelink discovery channel PSDCH information;
• Physical sidelink feedback channel PSFCH information.
  • the candidate second device only reports the first training data to the first device, and the first training data is used to determine the first parameter.
  • the first device only receives first training data reported by the candidate second device, and determines the first parameter according to the first training data.
  • the first device may speculate, perceive, detect or infer the first parameter according to the first training data.
  • the first device may screen candidate second devices according to the first parameter to determine the second device.
  • the first parameter includes at least one of the following:
• the business type of the candidate second device, such as enhanced mobile broadband (eMBB), ultra-reliable low-latency communication (URLLC), massive machine-type communication (mMTC), other 6G new scenarios, etc.;
• the working scenario of the candidate second device, including but not limited to: high speed, low speed, line-of-sight (LOS) propagation, non-line-of-sight (NLOS) propagation, high signal-to-noise ratio, low signal-to-noise ratio, and other working scenarios;
• the communication network access method of the candidate second device, including mobile network, WiFi, and fixed network, where the mobile network includes 2G, 3G, 4G, 5G, and 6G;
• the power state of the candidate second device, such as a specific value of the available remaining power or a graded description result, and whether the device is charging;
• the storage state of the candidate second device, such as a specific value of the available memory or a graded description result.
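One possible container for the first-parameter fields listed above, as a hedged sketch (the field names and value encodings are illustrative assumptions, not mandated by the patent):

```python
# Illustrative data structure for the "first parameter" a candidate second
# device reports: service type, working scenario, network access, power and
# storage state. All names/encodings are assumptions for this sketch.

from dataclasses import dataclass
from typing import Optional

@dataclass
class FirstParameter:
    service_type: str              # e.g. "eMBB", "URLLC", "mMTC"
    working_scenario: str          # e.g. "high-speed", "LOS", "low-SNR"
    network_access: str            # e.g. "fixed", "wifi", "5G"
    battery_percent: Optional[float] = None   # remaining power, if reported
    charging: bool = False
    free_storage_mb: Optional[float] = None   # available memory, if reported

p = FirstParameter("URLLC", "LOS", "5G", battery_percent=80.0, charging=True)
print(p.service_type)  # URLLC
```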
• the candidate second devices may first report a small amount of training data (that is, the first training data) and/or the first parameter to the first device, where the first parameter may be a judgment parameter of the first screening condition; the first device determines, according to the first screening condition, the second devices that need to collect and report training data, and the second devices are selected from the candidate second devices.
• the second devices that need to collect and report training data can be determined according to the data type of the candidate second devices: the candidate second devices are grouped by data type, and the data types of the candidate second devices in each group are the same or similar.
• K1 candidate second devices are selected from each group of candidate second devices as second devices that need to collect and report training data, where K1 is a positive integer. This ensures the diversity of the data participating in training, since each group of candidate second devices has a second device that collects and reports training data.
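The group-and-pick-K1 step above can be sketched as follows (a minimal illustration; the grouping key and candidate representation are assumptions):

```python
# Minimal sketch of the grouping step described above: candidates are grouped
# by data type, and up to K1 devices are taken from each group.

from collections import defaultdict

def select_per_group(candidates, key, k):
    """Group candidates by `key` and pick up to `k` device ids per group."""
    groups = defaultdict(list)
    for cand in candidates:
        groups[key(cand)].append(cand[0])   # cand = (device_id, data_type)
    selected = []
    for members in groups.values():
        selected.extend(members[:k])        # K1 devices per group
    return selected

candidates = [("u1", "csi"), ("u2", "csi"), ("u3", "rsrp"), ("u4", "rsrp")]
chosen = select_per_group(candidates, key=lambda c: c[1], k=1)
print(sorted(chosen))  # ['u1', 'u3']
```

The same sketch applies to grouping by business type (K2), data distribution parameters (K3), or working scenario, by swapping the `key` function.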
• the second devices that need to collect and report the training data may be determined according to the business type of the candidate second devices: the candidate second devices are grouped by business type, and the business types of the candidate second devices in each group are the same or similar.
• K2 candidate second devices are selected from each group of candidate second devices as second devices that need to collect and report training data, where K2 is a positive integer. This ensures the diversity of the data participating in training, since each group of candidate second devices has a second device that collects and reports training data.
• the second devices that need to collect and report training data can be determined according to the data distribution parameters of the candidate second devices: the candidate second devices are grouped by data distribution parameters, and the data distribution parameters of the candidate second devices in each group are the same or similar.
• K3 candidate second devices are selected from each group of candidate second devices as second devices that need to collect and report training data, where K3 is a positive integer. This ensures the diversity of the data participating in training, since each group of candidate second devices has a second device that collects and reports training data.
• the second devices that need to collect and report training data can be determined according to the working scenario of the candidate second devices: the candidate second devices are grouped by working scenario, and the working scenarios of the candidate second devices in each group are the same or similar.
• A candidate second devices are selected from each group of candidate second devices as second devices that need to collect and report training data, where A is a positive integer. This ensures the diversity of the data participating in training, since each group of candidate second devices has a second device that collects and reports training data.
• the second devices that need to collect and report training data can be determined according to the communication network access method of the candidate second devices: the candidate second devices are prioritized according to their communication network access method.
• Communication network access methods include fixed network, WiFi, and mobile network, and the mobile network includes 2G, 3G, 4G, 5G, 6G, etc.
• The priority of the fixed network is greater than or equal to the priority of WiFi, and the priority of WiFi is greater than or equal to the priority of the mobile network. Within the mobile network, the higher the generation number, the higher the priority of being screened out.
• For example, the priority of a 5G candidate second device is higher than that of a 4G candidate second device. B candidate second devices are selected from the candidate second devices in descending order of priority as the second devices that need to collect and report the training data, where B is a positive integer.
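The access-method priority above (fixed ≥ WiFi ≥ mobile, newer generations before older) can be sketched with assumed numeric ranks; the actual priority values are not specified by the patent:

```python
# Sketch of access-method prioritization: fixed >= WiFi >= mobile network,
# and a higher mobile generation outranks a lower one. Rank values assumed.

ACCESS_PRIORITY = {"fixed": 100, "wifi": 90,
                   "6G": 60, "5G": 50, "4G": 40, "3G": 30, "2G": 20}

def select_by_access(candidates, b):
    """Pick B device ids in descending access-method priority."""
    ranked = sorted(candidates,
                    key=lambda c: ACCESS_PRIORITY.get(c[1], 0),
                    reverse=True)
    return [dev_id for dev_id, _ in ranked[:b]]

cands = [("u1", "4G"), ("u2", "fixed"), ("u3", "5G"), ("u4", "wifi")]
print(select_by_access(cands, b=2))  # ['u2', 'u4']
```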
• the second devices that need to collect and report the training data can be determined according to the channel quality of the candidate second devices: the candidate second devices are prioritized according to channel quality, and the higher the channel quality, the higher the priority of being screened out.
• the second devices that need to collect and report training data can be determined according to the difficulty with which the candidate second devices collect data: the candidate second devices are prioritized according to their data collection difficulty, a candidate second device whose data collection difficulty is lower has a higher priority of being screened out, and D candidate second devices are selected from the candidate second devices in descending order of priority as the second devices that need to collect and report training data, where D is a positive integer, which can reduce the difficulty of data collection.
• the second devices that need to collect and report the training data can be determined according to the power state of the candidate second devices: the candidate second devices are prioritized according to their power state, and the higher the remaining power, the higher the priority of being screened out.
• A candidate second device in the charging state has the highest priority of being screened out, and E candidate second devices are selected from the candidate second devices in descending order of priority as the second devices that need to collect and report training data, where E is a positive integer, which ensures that the second devices that need to collect and report training data have sufficient power.
• the second devices that need to collect and report training data can be determined according to the storage state of the candidate second devices: the candidate second devices are prioritized according to their storage state, and the larger the available storage, the higher the priority of being screened out.
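The channel-quality, collection-difficulty, power-state, and storage-state screenings above all follow one pattern: score each candidate, then take the top K. A generic sketch of that pattern, with an assumed power-state scoring rule (charging devices first, then by remaining power):

```python
# Generic priority screening: rank candidates by a score and keep the top K.
# The scoring rule below is an illustrative assumption for the power-state
# case; channel quality, collection difficulty, or free storage would each
# plug in their own score function.

def select_top_k(candidates, score, k):
    """Return the k candidate ids with the highest screening priority."""
    ranked = sorted(candidates, key=score, reverse=True)
    return [c["id"] for c in ranked[:k]]

cands = [
    {"id": "u1", "battery": 35, "charging": False},
    {"id": "u2", "battery": 20, "charging": True},   # charging => top priority
    {"id": "u3", "battery": 90, "charging": False},
]
power_score = lambda c: (1000 if c["charging"] else 0) + c["battery"]
print(select_top_k(cands, power_score, k=2))  # ['u2', 'u3']
```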
• before reporting the training data to the first device, the method further includes:
  • the second device sends a first request to the first device, requesting to collect and report training data.
• the unicast first indication includes at least one of the following:
• the number of samples of training data collected by the second device, where the number of samples of training data collected by different second devices is different or the same;
• the time when the second device collects the training data, where the time when different second devices collect the training data is different or the same;
• the time when the second device reports the training data to the first device, where the time when different second devices report the training data is different or the same;
• a data format of the training data reported by the second device to the first device.
  • the broadcasted first indication includes at least one of the following:
• the number of samples of training data collected by a candidate second device that performs data collection, where the number of samples of training data collected by different candidate second devices is different or the same;
• the time at which a candidate second device that performs data collection reports the training data to the first device, where the time at which different candidate second devices report the training data is different or the same;
• the second filter condition.
• the identifiers of candidate second devices that collect data and the identifiers of candidate second devices that do not collect data may constitute the second screening condition, and a candidate second device can judge whether it satisfies the second screening condition according to its own identifier.
  • the method further includes:
  • the reasoning device receives the trained AI model and hyperparameters sent by the first device.
• the first device builds a training data set based on the received training data, trains the specific AI model, and sends the converged AI model and the hyperparameters to L inference devices.
• the inference device is a second device that needs to perform performance verification and inference on the AI model; the inference device may be selected from the candidate second devices, or may be a second device other than the candidate second devices.
  • the AI model is a meta-learning model, and the hyperparameters may be determined by the first parameters.
  • the hyperparameters include at least one of the following:
  • the method further includes:
• the reasoning device performs performance verification on the AI model;
• if the performance verification result satisfies the preset first condition, the reasoning device uses the AI model for reasoning.
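The verify-then-infer step can be sketched as below. The evaluation metric, threshold, and model interface are illustrative assumptions; the patent only requires that the verification result satisfy a preset first condition:

```python
# Sketch of the verify-then-infer step at an inference device: the model is
# used for inference only if its verified performance meets the first
# condition (threshold and metric are assumed for illustration).

def maybe_infer(model, validation_set, inputs, threshold=0.9):
    accuracy = model.evaluate(validation_set)       # performance verification
    if accuracy >= threshold:                       # preset first condition
        return model.predict(inputs)                # use the model for inference
    return None                                     # model rejected; no inference

class ToyModel:
    def evaluate(self, _):
        return 0.95
    def predict(self, xs):
        return [x * 2 for x in xs]

print(maybe_infer(ToyModel(), validation_set=[], inputs=[1, 2]))  # [2, 4]
```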
  • the first condition may be configured or pre-configured by the first device or stipulated in a protocol.
• after the inference device verifies the performance of the AI model, it may also report to the first device the result of whether it will perform inference.
  • the AI model for performance verification is the AI model delivered by the first device, or a fine-tuned model of the AI model delivered by the first device.
  • the reasoning device may directly use the AI model delivered by the first device to perform performance verification, or perform performance verification after fine-tuning the AI model delivered by the first device.
  • the special hyperparameters related to meta-learning can be different for each inference device.
• The meta-learning-related special hyperparameters of each inference device can be determined according to the first parameter corresponding to that inference device (mainly according to the difficulty of data collection, the power state, the storage state, etc. in the first parameter).
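One way this per-device mapping could look is sketched below. The specific rule (fewer adaptation steps for low-power devices) and the hyperparameter names (`inner_steps`, `inner_lr`) are assumptions for illustration only; the patent does not fix the mapping:

```python
# Illustrative mapping from an inference device's first parameter to its
# meta-learning special hyperparameters. The rule and values are assumed:
# e.g. devices with scarce power get fewer (cheaper) adaptation steps.

def adaptation_hyperparams(first_param: dict) -> dict:
    steps = 10
    if first_param.get("battery_percent", 100) < 30:
        steps = 2                     # scarce power: cheap fine-tuning
    elif first_param.get("collection_difficulty", "low") == "high":
        steps = 5                     # hard-to-collect data: fewer steps
    return {"inner_steps": steps, "inner_lr": 1e-2}

print(adaptation_hyperparams({"battery_percent": 20}))
# {'inner_steps': 2, 'inner_lr': 0.01}
```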
• the first device may be a network-side device, and the second device may be a terminal; or, the first device may be a network-side device, and the second device may be a network-side device, for example, a scenario where multiple network-side devices aggregate training data to one network-side device for training; or, the first device is a terminal, and the second device is a terminal, for example, a scenario where multiple terminals aggregate training data to one terminal for training.
  • the candidate second device may be a network side device or a terminal;
  • the reasoning device may be a network side device or a terminal.
  • the specific AI model may be a channel estimation model, a mobility prediction model, and the like.
  • the technical solutions of the embodiments of the present application can be applied to 6G networks, and can also be applied to 5G and 5.5G networks.
  • the data collection method provided in the embodiment of the present application may be executed by a data collection device.
• In the embodiments of the present application, the data collection method being performed by the data collection device is taken as an example to describe the data collection device provided in the embodiments of the present application.
  • An embodiment of the present application provides a data collection device, including:
  • a sending module configured to send a first instruction to a second device, instructing the second device to collect and report training data for specific AI model training;
• a receiving module configured to receive the training data reported by the second device;
  • a training module configured to use the training data to construct a data set to train the specific AI model.
• the sending module is specifically configured to: select N second devices from M candidate second devices according to a preset first filter condition, and unicast the first indication to the N second devices,
• where M and N are positive integers, and N is less than or equal to M; or
• broadcast the first indication to the candidate second devices, where the first indication carries a second filter condition, the second filter condition is used to filter the second devices that report the training data, and the second device satisfies the second filter condition.
  • the receiving module is further configured to receive first training data and/or a first parameter reported by the candidate second device, and the first parameter may be a judgment parameter of the first screening condition.
  • the receiving module is configured to only receive first training data reported by the candidate second device, and determine the first parameter according to the first training data.
  • the first parameter includes at least one of the following:
• the communication network access method of the candidate second device;
• the storage state of the candidate second device.
• the unicast first indication includes at least one of the following:
• the number of samples of training data collected by the second device, where the number of samples of training data collected by different second devices is different or the same;
• the time when the second device collects the training data, where the time when different second devices collect the training data is different or the same;
• the time when the second device reports the training data to the first device, where the time when different second devices report the training data is different or the same;
• a data format of the training data reported by the second device to the first device.
  • the broadcasted first indication includes at least one of the following:
• the number of samples of training data collected by a candidate second device that performs data collection, where the number of samples of training data collected by different candidate second devices is different or the same;
• the time at which a candidate second device that performs data collection reports the training data to the first device, where the time at which different candidate second devices report the training data is different or the same;
• the second filter condition.
  • the sending module is further configured to send the trained AI model and hyperparameters to L inference devices, where L is greater than M, equal to M or less than M.
  • the AI model is a meta-learning model, and the hyperparameters are determined by the first parameters.
  • the hyperparameters include at least one of the following:
• the first device may be a network-side device, and the second device may be a terminal; or, the first device may be a network-side device, and the second device may be a network-side device, for example, a scenario where multiple network-side devices aggregate training data to one network-side device for training; or, the first device is a terminal, and the second device is a terminal, for example, a scenario where multiple terminals aggregate training data to one terminal for training.
  • the candidate second device may be a network side device or a terminal;
  • the reasoning device may be a network side device or a terminal.
  • the embodiment of the present application also provides a data collection device, including:
  • a receiving module configured to receive a first instruction from the first device, where the first instruction is used to instruct the second device to collect and report training data for specific AI model training;
  • a processing module configured to collect training data and report the training data to the first device.
• the receiving module is configured to receive the first indication unicast by the first device, where the second device is a second device selected by the first device from candidate second devices according to a preset first filter condition; or
• receive the first indication broadcast by the first device to candidate second devices, where the first indication carries a second filter condition, the second filter condition is used to filter the second devices that report the training data, and the second device satisfies the second filter condition.
• the processing module is configured to: if the second device receives the first indication unicast by the first device, collect and report the training data; or
• if the second device receives the first indication broadcast by the first device and the second device satisfies the second filter condition, collect and report the training data.
  • the candidate second device reports first training data and/or a first parameter to the first device
  • the first parameter may be a judgment parameter of the first screening condition
  • the candidate second device only reports the first training data to the first device, and the first training data is used to determine the first parameter.
  • the first parameter includes at least one of the following:
• the communication network access method of the candidate second device;
• the storage state of the candidate second device.
  • the processing module is further configured to send a first request to the first device, requesting to collect and report training data.
• the unicast first indication includes at least one of the following:
• the number of samples of training data collected by the second device, where the number of samples of training data collected by different second devices is different or the same;
• the time when the second device collects the training data, where the time when different second devices collect the training data is different or the same;
• the time when the second device reports the training data to the first device, where the time when different second devices report the training data is different or the same;
• a data format of the training data reported by the second device to the first device.
  • the broadcasted first indication includes at least one of the following:
• the number of samples of training data collected by a candidate second device that performs data collection, where the number of samples of training data collected by different candidate second devices is different or the same;
• the time at which a candidate second device that performs data collection reports the training data to the first device, where the time at which different candidate second devices report the training data is different or the same;
• the second filter condition.
  • the reasoning device receives the trained AI model and hyperparameters sent by the first device.
  • the AI model is a meta-learning model, and the hyperparameters are determined by the first parameters.
  • the hyperparameters include at least one of the following:
  • the data collection device also includes:
• a reasoning module, configured to perform performance verification on the AI model; and if the performance verification result satisfies the preset first condition, use the AI model for reasoning.
  • the AI model for performance verification is the AI model delivered by the first device, or a fine-tuned model of the AI model delivered by the first device.
• the first device may be a network-side device, and the second device may be a terminal; or, the first device may be a network-side device, and the second device may be a network-side device, for example, a scenario where multiple network-side devices aggregate training data to one network-side device for training; or, the first device is a terminal, and the second device is a terminal, for example, a scenario where multiple terminals aggregate training data to one terminal for training.
  • the candidate second device may be a network side device or a terminal; the reasoning device may be a network side device or a terminal.
  • the data collection apparatus in the embodiment of the present application may be an electronic device, such as an electronic device with an operating system, or a component in the electronic device, such as an integrated circuit or a chip.
  • the electronic device may be a terminal, or other devices other than the terminal.
  • the terminal may include, but not limited to, the types of terminal 11 listed above, and other devices may be servers, Network Attached Storage (NAS), etc., which are not specifically limited in this embodiment of the present application.
  • the data collection device provided by the embodiment of the present application can implement the various processes realized by the method embodiments in Fig. 4 to Fig. 5 and achieve the same technical effect. To avoid repetition, details are not repeated here.
• this embodiment of the present application also provides a communication device 600, including a processor 601 and a memory 602, where the memory 602 stores programs or instructions that can run on the processor 601.
• When the communication device 600 is the first device, each step of the above data collection method embodiment can be implemented when the program or instruction is executed by the processor 601, and the same technical effect can be achieved.
• When the communication device 600 is the second device, each step of the above data collection method embodiment can be implemented when the program or instruction is executed by the processor 601, and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
• the embodiment of the present application also provides a first device, where the first device includes a processor and a memory, the memory stores programs or instructions that can run on the processor, and when the programs or instructions are executed by the processor, the steps of the data collection method described above are implemented.
  • the embodiment of the present application also provides a first device, including a processor and a communication interface, wherein the communication interface is used to send a first indication to the second device, instructing the second device to collect and report the specific training data for AI model training; receiving the training data reported by the second device; the processor is configured to use the training data to construct a data set to train the specific AI model.
• the embodiment of the present application also provides a second device, where the second device includes a processor and a memory, the memory stores programs or instructions that can run on the processor, and when the programs or instructions are executed by the processor, the steps of the data collection method described above are implemented.
  • the embodiment of the present application also provides a second device, including a processor and a communication interface, wherein the communication interface is used to receive a first indication from the first device, and the first indication is used to indicate that the second device Collecting and reporting training data for training a specific AI model; the processor is used to collect training data and report the training data to the first device.
  • the foregoing first device may be a network-side device or a terminal
  • the second device may be a network-side device or a terminal.
  • FIG. 7 is a schematic diagram of a hardware structure of a terminal implementing an embodiment of the present application.
• the terminal 700 includes, but is not limited to, at least some of the following components: a radio frequency unit 701, a network module 702, an audio output unit 703, an input unit 704, a sensor 705, a display unit 706, a user input unit 707, an interface unit 708, a memory 709, and a processor 710.
• the terminal 700 can also include a power supply (such as a battery) for supplying power to various components, and the power supply can be logically connected to the processor 710 through a power management system, so that functions such as charging management, discharging management, and power consumption management can be realized through the power management system.
  • the terminal structure shown in FIG. 7 does not constitute a limitation on the terminal, and the terminal may include more or fewer components than shown in the figure, or combine some components, or arrange different components, which will not be repeated here.
  • the input unit 704 may include a graphics processing unit (Graphics Processing Unit, GPU) 7041 and a microphone 7042, where the graphics processor 7041 processes image data of still pictures or video obtained by an image capture device (such as a camera).
  • the display unit 706 may include a display panel 7061, and the display panel 7061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the user input unit 707 includes at least one of a touch panel 7071 and other input devices 7072 .
  • the touch panel 7071 is also called a touch screen.
  • the touch panel 7071 may include two parts, a touch detection device and a touch controller.
  • Other input devices 7072 may include, but are not limited to, physical keyboards, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, and joysticks, which will not be described in detail here.
  • the radio frequency unit 701 may transmit the downlink data from the network side device to the processor 710 for processing after receiving the downlink data; in addition, the radio frequency unit 701 may send uplink data to the network side device.
  • the radio frequency unit 701 includes, but is not limited to, an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like.
  • the memory 709 can be used to store software programs or instructions as well as various data.
  • the memory 709 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system, an application program or instructions required by at least one function (such as a sound playing function, image playback function, etc.), etc.
  • memory 709 may include volatile memory or nonvolatile memory, or, memory 709 may include both volatile and nonvolatile memory.
  • the non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
  • volatile memory can be random access memory (Random Access Memory, RAM), static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synch-link dynamic random access memory (Synch-link DRAM, SLDRAM), or direct Rambus random access memory (Direct Rambus RAM, DRRAM).
  • the processor 710 may include one or more processing units; optionally, the processor 710 integrates an application processor and a modem processor, where the application processor mainly handles operations related to the operating system, user interface, and application programs, and the modem processor (such as a baseband processor) mainly handles wireless communication signals. It can be understood that the modem processor may alternatively not be integrated into the processor 710.
  • when the first device is a terminal, the processor 710 is configured to: send a first indication to the second device, instructing the second device to collect and report training data for training a specific AI model; receive the training data reported by the second device; and construct a data set from the training data to train the specific AI model.
  • the processor 710 is specifically configured to: select N second devices from M candidate second devices according to a preset first filter condition and unicast the first indication to the N second devices, where M and N are positive integers and N is less than or equal to M; or broadcast the first indication to the M candidate second devices, where the first indication carries a second filter condition used to screen the second devices that report the training data, and the second devices satisfy the second filter condition.
  • the processor 710 is further configured to receive first training data and/or a first parameter reported by the candidate second devices, where the first parameter may be a judgment parameter of the first filter condition.
  • the processor 710 may be configured to receive only the first training data reported by the candidate second devices and determine the first parameter according to the first training data.
  • the first parameter includes at least one of the following: the data type of the candidate second device; the data distribution parameter of the candidate second device; the service type of the candidate second device; the working scenario of the candidate second device; the communication network access method of the candidate second device; the channel quality of the candidate second device; the difficulty of data collection for the candidate second device; the battery state of the candidate second device; the storage state of the candidate second device.
  • the unicast first indication includes at least one of the following:
  • the number of samples of training data to be collected by the second device, where the numbers of samples collected by different second devices are different or the same;
  • the time at which the second device collects the training data, where the collection times of different second devices are different or the same;
  • the time at which the second device reports the training data to the first device, where the reporting times of different second devices are different or the same;
  • whether the collected data needs to be preprocessed;
  • the way in which the collected data is preprocessed;
  • the data format of the training data reported by the second device to the first device.
  • the broadcast first indication includes at least one of the following:
  • the identifiers of candidate second devices that perform data collection;
  • the identifiers of candidate second devices that do not perform data collection;
  • the number of samples of training data to be collected by the candidate second devices performing data collection, where the numbers of samples collected by different candidate second devices are different or the same;
  • the time at which the candidate second devices performing data collection collect the training data, where the collection times of different candidate second devices are different or the same;
  • the time at which the candidate second devices performing data collection report the training data to the first device, where the reporting times of different candidate second devices are different or the same;
  • whether the collected data needs to be preprocessed;
  • the way in which the collected data is preprocessed;
  • the data format of the training data reported to the first device by the candidate second devices performing data collection;
  • the first filter condition.
  • the processor 710 is further configured to send the trained AI model and hyperparameters to L inference devices, where L is greater than, equal to, or less than M.
  • the AI model is a meta-learning model, and the hyperparameters may be determined by the first parameter.
  • the hyperparameters include at least one of the following: the outer-iteration learning rate; the inner-iteration learning rates corresponding to different training tasks or to the inference devices; the meta learning rate; the numbers of inner iterations corresponding to different training tasks or to the inference devices; the numbers of outer iterations corresponding to different training tasks or to the inference devices.
  • the first device may be a network-side device and the second device a terminal; or the first device may be a network-side device and the second device a network-side device, as in a scenario where multiple network-side devices aggregate training data to one network-side device for training; or the first device may be a terminal and the second device a terminal, as in a scenario where multiple terminals aggregate training data to one terminal for training.
  • the candidate second device may be a network side device or a terminal; the reasoning device may be a network side device or a terminal.
  • when the second device is a terminal, the processor 710 is configured to: receive a first indication from the first device, the first indication instructing the second device to collect and report training data for training a specific AI model; collect the training data; and report the training data to the first device.
  • the processor 710 is configured to receive the first indication unicast by the first device, where the second device is a second device screened out from the candidate second devices by the first device according to a preset first filter condition; or to receive the first indication broadcast by the first device to the candidate second devices, where the first indication carries a second filter condition used to screen the second devices that report the training data, and the second device satisfies the second filter condition.
  • the processor 710 is configured such that the second device collects and reports the training data if the second device receives the first indication unicast by the first device, or the second device collects and reports the training data if the second device receives the first indication broadcast by the first device.
  • the candidate second device reports first training data and/or a first parameter to the first device, where the first parameter may be a judgment parameter of the first filter condition; or the candidate second device reports only the first training data to the first device, and the first training data is used to determine the first parameter.
  • the first parameter includes at least one of the following: the data type of the candidate second device; the data distribution parameter of the candidate second device; the service type of the candidate second device; the working scenario of the candidate second device; the communication network access method of the candidate second device; the channel quality of the candidate second device; the difficulty of data collection for the candidate second device; the battery state of the candidate second device; the storage state of the candidate second device.
  • the processor 710 is further configured to send a first request to the first device, requesting to collect and report training data.
  • the unicast first indication includes at least one of the following:
  • the number of samples of training data to be collected by the second device, where the numbers of samples collected by different second devices are different or the same;
  • the time at which the second device collects the training data, where the collection times of different second devices are different or the same;
  • the time at which the second device reports the training data to the first device, where the reporting times of different second devices are different or the same;
  • whether the collected data needs to be preprocessed;
  • the way in which the collected data is preprocessed;
  • the data format of the training data reported by the second device to the first device.
  • the broadcast first indication includes at least one of the following:
  • the identifiers of candidate second devices that perform data collection;
  • the identifiers of candidate second devices that do not perform data collection;
  • the number of samples of training data to be collected by the candidate second devices performing data collection, where the numbers of samples collected by different candidate second devices are different or the same;
  • the time at which the candidate second devices performing data collection collect the training data, where the collection times of different candidate second devices are different or the same;
  • the time at which the candidate second devices performing data collection report the training data to the first device, where the reporting times of different candidate second devices are different or the same;
  • whether the collected data needs to be preprocessed;
  • the way in which the collected data is preprocessed;
  • the data format of the training data reported to the first device by the candidate second devices performing data collection;
  • the first filter condition.
  • the reasoning device receives the trained AI model and hyperparameters sent by the first device.
  • the AI model is a meta-learning model, and the hyperparameters may be determined by the first parameter.
  • the hyperparameters include at least one of the following: the outer-iteration learning rate; the inner-iteration learning rates corresponding to different training tasks or to the inference devices; the meta learning rate; the numbers of inner iterations corresponding to different training tasks or to the inference devices; the numbers of outer iterations corresponding to different training tasks or to the inference devices.
  • the processor 710 is configured to perform performance verification on the AI model, and if the performance verification result satisfies a preset first condition, use the AI model for inference.
  • the AI model for performance verification is the AI model delivered by the first device, or a fine-tuned model of the AI model delivered by the first device.
  • the embodiment of the present application further provides a network side device, including a processor and a communication interface.
  • the network-side device embodiment corresponds to the above network-side device method embodiment; each implementation process and implementation manner of the above method embodiment is applicable to this network-side device embodiment and can achieve the same technical effect.
  • the embodiment of the present application also provides a network side device.
  • the network side device 800 includes: an antenna 81 , a radio frequency device 82 , a baseband device 83 , a processor 84 and a memory 85 .
  • the antenna 81 is connected to a radio frequency device 82 .
  • the radio frequency device 82 receives information through the antenna 81, and sends the received information to the baseband device 83 for processing.
  • the baseband device 83 processes the information to be sent and sends it to the radio frequency device 82
  • the radio frequency device 82 processes the received information and sends it out through the antenna 81 .
  • the method performed by the network side device in the above embodiments may be implemented in the baseband device 83, where the baseband device 83 includes a baseband processor.
  • the baseband device 83 may include at least one baseband board on which a plurality of chips are arranged; as shown in FIG. 8, one of the chips calls a program in the memory 85 to perform the network device operations shown in the above method embodiments.
  • the network side device may also include a network interface 86, such as a common public radio interface (common public radio interface, CPRI).
  • the network-side device 800 in this embodiment of the present invention also includes instructions or programs stored in the memory 85 and runnable on the processor 84; the processor 84 calls the instructions or programs in the memory 85 to execute the above data collection method and achieve the same technical effect, which is not repeated here to avoid repetition.
  • an embodiment of the present application also provides a readable storage medium on which programs or instructions are stored; when the programs or instructions are executed by a processor, each process of the above data collection method embodiments is realized and the same technical effect can be achieved, which is not repeated here to avoid repetition.
  • the processor is the processor in the terminal described in the foregoing embodiments.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory ROM, a random access memory RAM, a magnetic disk or an optical disk, and the like.
  • an embodiment of the present application further provides a chip, the chip including a processor and a communication interface, where the communication interface is coupled to the processor and the processor is configured to run programs or instructions to implement each process of the above data collection method embodiments and achieve the same technical effect, which is not repeated here to avoid repetition.
  • the chip mentioned in the embodiments of the present application may also be called a system-on-chip, a chip system, or a system-on-a-chip.
  • an embodiment of the present application further provides a computer program/program product, stored in a storage medium, where the computer program/program product is executed by at least one processor to implement each process of the above data collection method embodiments and achieve the same technical effect, which is not repeated here to avoid repetition.
  • an embodiment of the present application also provides a data collection system, including a first device and a second device, where the first device can be used to perform the steps of the data collection method described above for the first device, and the second device can be used to perform the steps of the data collection method described above for the second device.
  • the term "comprising", "including", or any other variation thereof is intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus comprising a set of elements includes not only those elements but also other elements not expressly listed, or elements inherent in the process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus comprising that element.
  • the scope of the methods and apparatus in the embodiments of the present application is not limited to performing functions in the order shown or discussed, and may also include performing functions in a substantially simultaneous manner or in the reverse order depending on the functions involved; for example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
  • the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application can be embodied in the form of a computer software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or a CD-ROM), including several instructions to cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the methods described in the various embodiments of the present application.


Abstract

This application discloses a data collection method and apparatus, a first device, and a second device, belonging to the field of communication technology. The data collection method of the embodiments of this application includes: a first device sends a first indication to a second device, instructing the second device to collect and report training data for training a specific AI model; the first device receives the training data reported by the second device; and the first device constructs a data set from the training data and trains the specific AI model.

Description

Data collection method and apparatus, first device, and second device
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202111540035.0, filed in China on December 15, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
This application belongs to the field of communication technology, and specifically relates to a data collection method and apparatus, a first device, and a second device.
Background
Artificial intelligence (AI) is currently widely applied in various fields. An AI module can be implemented in many ways, for example as a neural network, a decision tree, a support vector machine, or a Bayesian classifier.
When an AI module in a wireless communication system is trained in a centralized manner, a large amount of data needs to be collected from remote ends to construct a data set. Since remote wireless data overlaps considerably, it is not necessary to collect the data of all users. On the one hand this relieves the transmission pressure during data collection, and on the other hand it avoids too many duplicate (or similar) samples in the data set.
Summary
Embodiments of this application provide a data collection method and apparatus, a first device, and a second device, which can relieve the transmission pressure during data collection and avoid too many duplicate samples in the data set.
In a first aspect, a data collection method is provided, including:
a first device sends a first indication to a second device, instructing the second device to collect and report training data for training a specific AI model;
the first device receives the training data reported by the second device;
the first device constructs a data set from the training data and trains the specific AI model.
In a second aspect, a data collection apparatus is provided, including:
a sending module, configured to send a first indication to a second device, instructing the second device to collect and report training data for training a specific AI model;
a receiving module, configured to receive the training data reported by the second device;
a training module, configured to construct a data set from the training data and train the specific AI model.
In a third aspect, a data collection method is provided, including:
a second device receives a first indication from a first device, the first indication instructing the second device to collect and report training data for training a specific AI model;
the second device collects training data and reports the training data to the first device.
In a fourth aspect, a data collection apparatus is provided, including:
a receiving module, configured to receive a first indication from a first device, the first indication instructing the second device to collect and report training data for training a specific AI model;
a processing module, configured to collect training data and report the training data to the first device.
In a fifth aspect, a first device is provided, including a processor and a memory, where the memory stores programs or instructions that can run on the processor, and when the programs or instructions are executed by the processor, the steps of the method according to the first aspect are implemented.
In a sixth aspect, a first device is provided, including a processor and a communication interface, where the communication interface is configured to send a first indication to a second device, instructing the second device to collect and report training data for training a specific AI model, and to receive the training data reported by the second device; and the processor is configured to construct a data set from the training data and train the specific AI model.
In a seventh aspect, a second device is provided, including a processor and a memory, where the memory stores programs or instructions that can run on the processor, and when the programs or instructions are executed by the processor, the steps of the method according to the third aspect are implemented.
In an eighth aspect, a second device is provided, including a processor and a communication interface, where the communication interface is configured to receive a first indication from a first device, the first indication instructing the second device to collect and report training data for training a specific AI model; and the processor is configured to collect training data and report the training data to the first device.
In a ninth aspect, a data collection system is provided, including a first device and a second device, where the first device can be used to perform the steps of the data collection method according to the first aspect, and the second device can be used to perform the steps of the data collection method according to the third aspect.
In a tenth aspect, a readable storage medium is provided, on which programs or instructions are stored, and when the programs or instructions are executed by a processor, the steps of the method according to the first aspect or the steps of the method according to the third aspect are implemented.
In an eleventh aspect, a chip is provided, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run programs or instructions to implement the method according to the first aspect or the method according to the third aspect.
In a twelfth aspect, a computer program/program product is provided, stored in a storage medium, where the computer program/program product is executed by at least one processor to implement the data collection method according to the first aspect or the steps of the data collection method according to the third aspect.
In the embodiments of this application, the first device does not require all candidate second devices to collect and report training data. Instead, the first device first screens the candidate second devices to determine the second devices that need to collect and report training data, and then sends a first indication to those second devices, instructing them to collect and report training data. On the one hand this relieves the transmission pressure during data collection, and on the other hand it avoids too many duplicate samples in the data set, reducing the burden of model training.
Brief Description of the Drawings
FIG. 1 is a block diagram of a wireless communication system to which embodiments of this application are applicable;
FIG. 2 is a schematic diagram of channel state information feedback;
FIG. 3 is a schematic diagram of performance at different numbers of AI training iterations;
FIG. 4 is a schematic flowchart of the data collection method on the first device side according to an embodiment of this application;
FIG. 5 is a schematic flowchart of the data collection method on the second device side according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of a communication device according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of a terminal according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of a network-side device according to an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application fall within the protection scope of this application.
The terms "first", "second", and the like in the specification and claims of this application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It should be understood that the terms so used are interchangeable where appropriate, so that the embodiments of this application can be implemented in orders other than those illustrated or described here; moreover, the objects distinguished by "first" and "second" are usually of one type, and the number of objects is not limited, for example, there may be one or more first objects. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
It should be noted that the techniques described in the embodiments of this application are not limited to Long Term Evolution (LTE)/LTE-Advanced (LTE-A) systems, and can also be used in other wireless communication systems, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single-carrier Frequency Division Multiple Access (SC-FDMA), and other systems. The terms "system" and "network" in the embodiments of this application are often used interchangeably, and the described techniques can be used for both the above-mentioned systems and radio technologies and for other systems and radio technologies. The following description describes a New Radio (NR) system for example purposes, and NR terminology is used in much of the description below, but these techniques can also be applied to applications other than NR systems, such as 6th Generation (6G) communication systems.
FIG. 1 shows a block diagram of a wireless communication system to which embodiments of this application are applicable. The wireless communication system includes a terminal 11 and a network-side device 12. The terminal 11 may be a mobile phone, a tablet personal computer, a laptop computer (also called a notebook computer), a personal digital assistant (PDA), a palmtop computer, a netbook, an ultra-mobile personal computer (UMPC), a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, vehicle user equipment (VUE), pedestrian user equipment (PUE), a smart home device (a home device with a wireless communication function, such as a refrigerator, television, washing machine, or furniture), a game console, a personal computer (PC), a teller machine, a self-service machine, or other terminal-side device. Wearable devices include smart watches, smart bracelets, smart earphones, smart glasses, smart jewelry (smart bangles, smart bracelets, smart rings, smart necklaces, smart anklets, smart ankle chains, etc.), smart wristbands, smart clothing, and the like. It should be noted that the embodiments of this application do not limit the specific type of the terminal 11. The network-side device 12 may include an access network device or a core network device, where the access network device 12 may also be called a radio access network device, a radio access network (Radio Access Network, RAN), a radio access network function, or a radio access network unit. The access network device 12 may include a base station, a WLAN access point, a WiFi node, or the like. The base station may be called a Node B, an evolved Node B (eNB), an access point, a base transceiver station (Base Transceiver Station, BTS), a radio base station, a radio transceiver, a basic service set (Basic Service Set, BSS), an extended service set (Extended Service Set, ESS), a home Node B, a home evolved Node B, a transmitting receiving point (Transmitting Receiving Point, TRP), or some other suitable term in the field; as long as the same technical effect is achieved, the base station is not limited to specific technical vocabulary. It should be noted that in the embodiments of this application, only the base station in an NR system is taken as an example for introduction, without limiting the specific type of the base station.
Generally speaking, the chosen AI algorithm and the adopted model differ depending on the type of problem to be solved. The main way to improve 5th Generation (5G) network performance with AI is to enhance or replace existing algorithms or processing modules with neural-network-based algorithms and models. In specific scenarios, neural-network-based algorithms and models can achieve better performance than deterministic algorithms. Commonly used neural networks include deep neural networks, convolutional neural networks, and recurrent neural networks. With existing AI tools, the construction, training, and validation of neural networks can be realized.
Replacing modules in existing systems with AI methods can effectively improve system performance. For the Channel State Information (CSI) feedback shown in FIG. 2, replacing conventional CSI computation with an AI encoder and an AI decoder can greatly improve the corresponding system performance at the same overhead. With an AI-based scheme, the spectral efficiency of the system can be improved by about 30%.
The performance of AI training at different numbers of iterations is shown in FIG. 3, where the horizontal axis is the training epoch and the vertical axis is the square of the correlation. Different iterations require different training data, and it can be seen that a large number of training iterations are needed to reach performance convergence.
When AI is applied in a wireless communication system, many models work on the terminal side, and many tasks working on the base-station side need to be trained with data collected on the terminal side. Since the computing power of a terminal itself is limited, one feasible solution is to report the data to the network side and perform centralized training on the network side. Because many terminals have the same or similar terminal types and service types and work in the same or similar environments, the data of these terminals will be highly similar. Identical data should be avoided as much as possible in the data set used for training, because identical data contributes little to model convergence and instead causes overfitting or poor generalization performance. Meta-learning is a learning method that improves the generalization ability of a model: multiple tasks are constructed based on the data set, multi-task learning is performed, and an optimal initialization model is obtained. In a new scenario, the initialization model obtained by meta-learning can be fine-tuned and converge quickly, and has high adaptability. When constructing multiple tasks, the data features of different tasks are required to have certain differences. Therefore, both conventional learning schemes and meta-learning schemes impose requirements on the degree of overlap or similarity of the data in the data set.
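As a toy illustration of this similarity requirement (not part of the embodiments, and with an arbitrarily chosen threshold), near-duplicate samples can be filtered with a cosine-similarity test before they enter the training data set:

```python
import numpy as np

def deduplicate(samples, threshold=0.99):
    """Keep a sample only if its cosine similarity to every
    already-kept sample is below `threshold`."""
    kept = []
    for s in samples:
        v = np.asarray(s, dtype=float)
        if all(
            np.dot(v, k) / (np.linalg.norm(v) * np.linalg.norm(k)) < threshold
            for k in kept
        ):
            kept.append(v)
    return kept

data = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
unique = deduplicate(data)
print(len(unique))  # the two nearly identical vectors collapse into one -> 2
```

In practice the embodiments avoid this duplication at the source, by not collecting from devices whose data would largely repeat what is already in the set.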
The data collection method provided by the embodiments of this application is described in detail below through some embodiments and their application scenarios with reference to the accompanying drawings.
An embodiment of this application provides a data collection method, as shown in FIG. 4, including:
Step 101: a first device sends a first indication to a second device, instructing the second device to collect and report training data for training a specific AI model;
Step 102: the first device receives the training data reported by the second device;
Step 103: the first device constructs a data set from the training data and trains the specific AI model.
In this embodiment of the application, the first device does not require all candidate second devices to collect and report training data. Instead, the first device first screens the candidate second devices to determine the second devices that need to collect and report training data, and then sends the first indication to those second devices, instructing them to collect and report training data. On the one hand this relieves the transmission pressure during data collection, and on the other hand it avoids too many duplicate samples in the data set, reducing the burden of model training.
In some embodiments, the first device sending the first indication to the second device includes:
the first device selects N second devices from M candidate second devices according to a preset first filter condition and unicasts the first indication to the N second devices, where M and N are positive integers and N is less than or equal to M; or
the first device broadcasts the first indication to the M candidate second devices, where the first indication carries a second filter condition used to screen the second devices that report the training data, and the second devices satisfy the second filter condition.
In this embodiment, the devices within the communication range of the first device are candidate second devices, and the second devices that report training data are selected from the candidate second devices; all candidate second devices may be taken as second devices, or some candidate second devices may be screened out as second devices. Broadcasting sends the first indication to all candidate second devices, while unicasting sends the first indication only to the screened second devices. Every candidate second device that receives the unicast first indication needs to collect and report training data, whereas a candidate second device that receives the broadcast first indication needs to judge whether it satisfies the second filter condition, and only the candidate second devices satisfying the second filter condition collect and report training data.
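A minimal sketch of this screening and indication step, with all attribute names and the ranking criterion invented for illustration (the embodiments leave the first filter condition open):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    dev_id: int
    battery: float          # illustrative judgment parameters
    channel_quality: float

def select_second_devices(candidates, n, key):
    """First filter condition (illustrative): rank the M candidates by
    `key` and keep the top N, with N <= M."""
    return sorted(candidates, key=key, reverse=True)[:n]

def build_indications(candidates, n):
    chosen = select_second_devices(candidates, n,
                                   key=lambda c: c.channel_quality)
    # Unicast: the first indication goes only to the N screened devices.
    unicast = [("unicast", c.dev_id) for c in chosen]
    # Broadcast alternative: one message to all M candidates carrying the
    # second filter condition (here, the list of IDs that should collect).
    broadcast = ("broadcast", {"collect_ids": [c.dev_id for c in chosen]})
    return unicast, broadcast

cands = [Candidate(1, 0.9, 0.2), Candidate(2, 0.5, 0.8), Candidate(3, 0.7, 0.6)]
uni, bcast = build_indications(cands, 2)
print([d for _, d in uni])      # -> [2, 3]
print(bcast[1]["collect_ids"])  # -> [2, 3]
```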
In some embodiments, the first device sends the first indication to the second device through at least one of the following:
a Medium Access Control (MAC) Control Element (CE);
a Radio Resource Control (RRC) message;
a Non-Access Stratum (NAS) message;
a management and orchestration message;
user-plane data;
downlink control information;
a System Information Block (SIB);
layer-1 signaling of the Physical Downlink Control Channel (PDCCH);
information of the Physical Downlink Shared Channel (PDSCH);
Message (MSG) 2 information of the Physical Random Access Channel (PRACH);
MSG 4 information of the PRACH;
MSG B information of the PRACH;
information or signaling of a broadcast channel;
Xn interface signaling;
PC5 interface signaling;
information or signaling of the Physical Sidelink Control Channel (PSCCH);
information of the Physical Sidelink Shared Channel (PSSCH);
information of the Physical Sidelink Broadcast Channel (PSBCH);
information of the Physical Sidelink Discovery Channel (PSDCH);
information of the Physical Sidelink Feedback Channel (PSFCH).
In some embodiments, before the first device sends the first indication to the second device, the method further includes:
the first device receives first training data and/or a first parameter reported by the candidate second devices, where the first parameter may be a judgment parameter of the first filter condition.
In this embodiment, a candidate second device may first report a small amount of training data (i.e., the first training data) and/or the first parameter, and the first device determines the data sources participating in training according to the small amount of training data and/or the first parameter, screening out the second devices that collect and report training data, so that not all second devices report training data.
In some embodiments, the first device receives only the first training data reported by the candidate second devices and determines the first parameter according to the first training data. The first device may infer, perceive, detect, or deduce the first parameter from the first training data, and may screen the candidate second devices according to the first parameter to determine the second devices.
In some embodiments, the first parameter includes at least one of the following:
the data type of the candidate second device;
the data distribution parameter of the candidate second device;
the service type of the candidate second device, such as enhanced mobile broadband (Enhanced Mobile Broadband, eMBB), ultra-reliable low-latency communications (Ultra-Reliable Low-Latency Communications, URLLC), massive machine-type communication (Massive Machine Type Communication, mMTC), or other new 6G scenarios;
the working scenario of the candidate second device, including but not limited to: high speed, low speed, line of sight (Line of Sight, LOS), non-line of sight (Non Line of Sight, NLOS), high signal-to-noise ratio, low signal-to-noise ratio, and other working scenarios;
the communication network access method of the candidate second device, including mobile network, WiFi, and fixed network, where the mobile network includes 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5G, and 6G;
the channel quality of the candidate second device;
the difficulty of data collection for the candidate second device;
the battery state of the candidate second device, such as the specific value of the available remaining battery, or a graded description result, or whether the device is charging;
the storage state of the candidate second device, such as the specific value of the available memory, or a graded description result.
In this embodiment, the candidate second devices may first report a small amount of training data (i.e., the first training data) and/or the first parameter to the first device, where the first parameter may be a judgment parameter of the first filter condition. The first device determines, according to the first filter condition, the second devices that need to collect and report training data; these second devices are selected from the candidate second devices. Specifically, there may be M candidate second devices, from which N second devices are determined to collect and report training data, where N may be less than or equal to M.
In a specific example, the second devices that need to collect and report training data may be determined according to the data type of the candidate second devices: the candidate second devices are grouped by data type, with the data types of the candidate second devices within each group being the same or similar. When screening second devices, K1 candidate second devices are selected from each group as second devices that need to collect and report training data, where K1 is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices has second devices collecting and reporting training data.
In a specific example, the second devices that need to collect and report training data may be determined according to the service type of the candidate second devices: the candidate second devices are grouped by service type, with the service types of the candidate second devices within each group being the same or similar. When screening second devices, K2 candidate second devices are selected from each group as second devices that need to collect and report training data, where K2 is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices has second devices collecting and reporting training data.
In a specific example, the second devices that need to collect and report training data may be determined according to the data distribution parameters of the candidate second devices: the candidate second devices are grouped by data distribution parameter, with the data distribution parameters of the candidate second devices within each group being the same or similar. When screening second devices, K3 candidate second devices are selected from each group as second devices that need to collect and report training data, where K3 is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices has second devices collecting and reporting training data.
In a specific example, the second devices that need to collect and report training data may be determined according to the working scenario of the candidate second devices: the candidate second devices are grouped by working scenario, with the working scenarios of the candidate second devices within each group being the same or similar. When screening second devices, A candidate second devices are selected from each group as second devices that need to collect and report training data, where A is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices has second devices collecting and reporting training data.
In a specific example, the second devices that need to collect and report training data may be determined according to the communication network access method of the candidate second devices: the candidate second devices are prioritized according to their communication network access methods, which include fixed network, WiFi, and mobile network, where the mobile network includes 2G, 3G, 4G, 5G, 6G, and so on. The priority with which a fixed-network device is screened is greater than or equal to that of a WiFi device, and the priority of a WiFi device is greater than or equal to that of a mobile-network device. Within the mobile network, the higher the generation, the higher the screening priority; for example, a 5G candidate second device has a higher screening priority than a 4G candidate second device. B candidate second devices are selected from the candidate second devices in descending order of priority as second devices that need to collect and report training data, where B is a positive integer.
In a specific example, the second devices that need to collect and report training data may be determined according to the channel quality of the candidate second devices: the candidate second devices are prioritized by channel quality, with higher channel quality giving a higher screening priority, and C candidate second devices are selected from the candidate second devices in descending order of priority as second devices that need to collect and report training data, where C is a positive integer. This ensures that second devices with good channel quality collect and report training data, guaranteeing the training quality of the specific AI model.
In a specific example, the second devices that need to collect and report training data may be determined according to the difficulty of data collection for the candidate second devices: the candidate second devices are prioritized by collection difficulty, with lower difficulty giving a higher screening priority, and D candidate second devices are selected from the candidate second devices in descending order of priority as second devices that need to collect and report training data, where D is a positive integer. This reduces the difficulty of data collection.
In a specific example, the second devices that need to collect and report training data may be determined according to the battery state of the candidate second devices: the candidate second devices are prioritized by battery state, with a higher battery level giving a higher screening priority; in addition, candidate second devices that are charging have the highest priority. E candidate second devices are selected from the candidate second devices in descending order of priority as second devices that need to collect and report training data, where E is a positive integer. This ensures that the second devices collecting and reporting training data have sufficient battery.
In a specific example, the second devices that need to collect and report training data may be determined according to the storage state of the candidate second devices: the candidate second devices are prioritized by storage state, with larger available storage space giving a higher screening priority, and F candidate second devices are selected from the candidate second devices in descending order of priority as second devices that need to collect and report training data, where F is a positive integer. This ensures that the second devices collecting and reporting training data have sufficient available storage space to store the training data.
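The grouping-based examples above share one pattern: partition the candidate second devices by a first parameter and select at most K devices per group. A sketch under assumed attribute names (the tie-breaking by device ID is an arbitrary illustrative choice):

```python
from collections import defaultdict

def pick_k_per_group(candidates, group_key, k):
    """Group candidates by `group_key` (e.g. data type, service type, or
    working scenario) and select at most k devices from each group, so
    every group contributes training data without flooding the data set."""
    groups = defaultdict(list)
    for dev_id, attrs in candidates.items():
        groups[attrs[group_key]].append(dev_id)
    selected = []
    for members in groups.values():
        selected.extend(sorted(members)[:k])  # deterministic tie-break
    return sorted(selected)

candidates = {
    1: {"data_type": "CSI"},
    2: {"data_type": "CSI"},
    3: {"data_type": "beam"},
    4: {"data_type": "beam"},
    5: {"data_type": "positioning"},
}
print(pick_k_per_group(candidates, "data_type", k=1))  # -> [1, 3, 5]
```

The priority-based examples (access method, channel quality, battery, storage) replace the per-group cap with a single ranking over all candidates, as in the `select_second_devices` sketch earlier.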
In some embodiments, the unicast first indication includes at least one of the following:
the number of samples of training data to be collected by the second device, where the numbers of samples collected by different second devices are different or the same;
the time at which the second device collects the training data, where the collection times of different second devices are different or the same;
the time at which the second device reports the training data to the first device, where the reporting times of different second devices are different or the same;
whether the collected data needs to be preprocessed;
the way in which the collected data is preprocessed;
the data format of the training data reported by the second device to the first device.
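Several of these fields (sample count, preprocessing flag and method, data format) directly parameterize what a second device does before reporting. A toy sketch, with every field name and the "normalize" preprocessing method invented for illustration:

```python
import numpy as np

def prepare_report(samples, indication):
    """Apply the preprocessing and data format named in the first
    indication before reporting (field names are illustrative)."""
    data = np.asarray(samples, dtype=float)
    if indication.get("preprocess") == "normalize":
        # One assumed preprocessing method: per-feature standardization.
        data = (data - data.mean(axis=0)) / (data.std(axis=0) + 1e-9)
    n = indication.get("num_samples", len(data))      # indicated sample count
    return data[:n].astype(indication.get("dtype", np.float32))

ind = {"preprocess": "normalize", "num_samples": 2, "dtype": np.float16}
report = prepare_report([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], ind)
print(report.shape, report.dtype)  # -> (2, 2) float16
```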
In some embodiments, the broadcast first indication includes at least one of the following:
the identifiers of candidate second devices that perform data collection;
the identifiers of candidate second devices that do not perform data collection;
the number of samples of training data to be collected by the candidate second devices performing data collection, where the numbers of samples collected by different candidate second devices are different or the same;
the time at which the candidate second devices performing data collection collect the training data, where the collection times of different candidate second devices are different or the same;
the time at which the candidate second devices performing data collection report the training data to the first device, where the reporting times of different candidate second devices are different or the same;
whether the collected data needs to be preprocessed;
the way in which the collected data is preprocessed;
the data format of the training data reported to the first device by the candidate second devices performing data collection;
the first filter condition.
The identifiers of candidate second devices that perform data collection and the identifiers of candidate second devices that do not perform data collection constitute the second filter condition, and a candidate second device can judge whether it satisfies the second filter condition according to its own identifier.
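Because the broadcast indication can carry both an identifier list of candidates that should collect and one of candidates that should not, a candidate's check of the second filter condition reduces to membership tests. A minimal sketch (field names are illustrative, not from the embodiments):

```python
def should_collect(dev_id, indication):
    """Decide whether this candidate second device satisfies the second
    filter condition carried in a broadcast first indication."""
    if dev_id in indication.get("no_collect_ids", ()):
        return False
    collect = indication.get("collect_ids")
    # If an explicit collect list is present, membership is required;
    # otherwise any device not excluded may collect and report.
    return dev_id in collect if collect is not None else True

indication = {"collect_ids": [7, 9], "no_collect_ids": [8]}
print(should_collect(7, indication))  # -> True
print(should_collect(8, indication))  # -> False
print(should_collect(5, indication))  # -> False
```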
In some embodiments, after the specific AI model is trained, the method further includes:
the first device sends the trained AI model and hyperparameters to L inference devices, where L is greater than, equal to, or less than M.
In this embodiment, the first device constructs a training data set based on the received training data, trains the specific AI model, and delivers the converged AI model and hyperparameters to the L inference devices. An inference device is a second device that needs to perform performance verification and inference on the AI model; it may be selected from the candidate second devices, or may be another second device outside the candidate second devices.
In some embodiments, the first device sends the trained AI model and hyperparameters to the inference devices through at least one of the following:
a MAC CE;
an RRC message;
a NAS message;
a management and orchestration message;
user-plane data;
downlink control information;
a SIB;
layer-1 signaling of the PDCCH;
information of the PDSCH;
MSG 2 information of the PRACH;
MSG 4 information of the PRACH;
MSG B information of the PRACH;
information or signaling of a broadcast channel;
Xn interface signaling;
PC5 interface signaling;
information or signaling of the PSCCH;
information of the PSSCH;
information of the PSBCH;
information of the PSDCH;
information of the PSFCH.
In some embodiments, the AI model is a meta-learning model, and the hyperparameters may be determined by the first parameter.
In some embodiments, the hyperparameters related to the meta-learning model include at least one of the following:
the outer-iteration learning rate;
the inner-iteration learning rates corresponding to different training tasks or to the inference devices;
the meta learning rate;
the numbers of inner iterations corresponding to different training tasks or to the inference devices;
the numbers of outer iterations corresponding to different training tasks or to the inference devices.
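As one illustration of how these hyperparameters could drive a meta-learning loop, the sketch below uses a Reptile-style update on toy least-squares tasks; the task construction and loss are stand-ins, not part of the embodiments:

```python
import numpy as np

def meta_train(tasks, outer_iters, inner_iters, inner_lr, outer_lr):
    """Reptile-style loop: `outer_iters`/`inner_iters` play the role of the
    outer/inner iteration counts, and `inner_lr`/`outer_lr` the inner and
    outer (meta) learning rates listed above. Each task is (X, y) for a toy
    least-squares problem; the result is an initialization that adapts
    quickly to every task."""
    w = np.zeros(2)
    for _ in range(outer_iters):
        for X, y in tasks:
            w_task = w.copy()
            for _ in range(inner_iters):
                grad = X.T @ (X @ w_task - y) / len(y)  # inner gradient step
                w_task -= inner_lr * grad
            w += outer_lr * (w_task - w)  # meta update toward adapted weights
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 2))
tasks = [(X, X @ np.array([1.0, 2.0])), (X, X @ np.array([1.5, 1.5]))]
w0 = meta_train(tasks, outer_iters=50, inner_iters=5, inner_lr=0.1, outer_lr=0.5)
print(w0)  # an initialization between the two task optima
```

A delivered initialization of this kind is what an inference device would then fine-tune locally with its own inner iterations.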
In this embodiment, the first device may be a network-side device and the second device a terminal; or the first device is a network-side device and the second device is a network-side device, as in a scenario where multiple network-side devices aggregate training data to one network-side device for training; or the first device is a terminal and the second device is a terminal, as in a scenario where multiple terminals aggregate training data to one terminal for training.
In addition, a candidate second device may be a network-side device or a terminal, and an inference device may be a network-side device or a terminal.
An embodiment of this application also provides a data collection method, as shown in FIG. 5, including:
Step 201: a second device receives a first indication from a first device, the first indication instructing the second device to collect and report training data for training a specific AI model;
Step 202: the second device collects training data and reports the training data to the first device.
In this embodiment of the application, the first device does not require all candidate second devices to collect and report training data. Instead, the first device first screens the candidate second devices to determine the second devices that need to collect and report training data, and then sends the first indication to those second devices, instructing them to collect and report training data. On the one hand this relieves the transmission pressure during data collection, and on the other hand it avoids too many duplicate samples in the data set, reducing the burden of model training.
In some embodiments, the second device reports the training data to the first device through at least one of the following:
a MAC CE;
an RRC message;
a NAS message;
layer-1 signaling of the Physical Uplink Control Channel (PUCCH);
MSG 1 information of the PRACH;
MSG 3 information of the PRACH;
MSG A information of the PRACH;
information of the Physical Uplink Shared Channel (PUSCH);
Xn interface signaling;
PC5 interface signaling;
information or signaling of the PSCCH;
information of the PSSCH;
information of the PSBCH;
information of the PSDCH;
information of the PSFCH.
In some embodiments, the second device receiving the first indication from the first device includes:
the second device receives the first indication unicast by the first device, where the second device is a second device screened out from the candidate second devices by the first device according to a preset first filter condition; or
the second device receives the first indication broadcast by the first device to the candidate second devices, where the first indication carries a second filter condition used to screen the second devices that report the training data, and the second device satisfies the second filter condition.
In some embodiments, the second device collecting training data and reporting the training data to the first device includes:
the second device collects and reports the training data if the second device receives the first indication unicast by the first device; or
the second device collects and reports the training data if the second device receives the first indication broadcast by the first device.
In this embodiment, the devices within the communication range of the first device are candidate second devices, and the second devices that report training data are selected from the candidate second devices; all candidate second devices may be taken as second devices, or some candidate second devices may be screened out as second devices. Broadcasting sends the first indication to all candidate second devices, while unicasting sends the first indication only to the screened second devices. Every candidate second device that receives the unicast first indication needs to collect and report training data, whereas a candidate second device that receives the broadcast first indication needs to judge whether it satisfies the second filter condition, and only the candidate second devices satisfying the second filter condition collect and report training data.
In some embodiments, before the second device receives the first indication from the first device, the method further includes:
a candidate second device reports first training data and/or a first parameter to the first device, where the first parameter may be a judgment parameter of the first filter condition.
In this embodiment, a candidate second device may first report a small amount of training data (i.e., the first training data) and/or the first parameter, and the first device determines the data sources participating in training according to the small amount of training data and/or the first parameter, screening out the second devices that collect and report training data, so that not all second devices report training data.
In some embodiments, the second device reports the first training data and/or the first parameter to the first device through at least one of the following:
a MAC CE;
an RRC message;
a NAS message;
layer-1 signaling of the PUCCH;
MSG 1 information of the PRACH;
MSG 3 information of the PRACH;
MSG A information of the PRACH;
information of the PUSCH;
Xn interface signaling;
PC5 interface signaling;
information or signaling of the PSCCH;
information of the PSSCH;
information of the PSBCH;
information of the PSDCH;
information of the PSFCH.
In some embodiments, the candidate second device reports only the first training data to the first device, and the first training data is used to determine the first parameter.
In some embodiments, the first device receives only the first training data reported by the candidate second devices and determines the first parameter according to the first training data. The first device may infer, perceive, detect, or deduce the first parameter from the first training data, and may screen the candidate second devices according to the first parameter to determine the second devices.
In some embodiments, the first parameter includes at least one of the following:
the data type of the candidate second device;
the data distribution parameter of the candidate second device;
the service type of the candidate second device, such as enhanced mobile broadband (eMBB), ultra-reliable low-latency communications (URLLC), massive machine-type communication (mMTC), or other new 6G scenarios;
the working scenario of the candidate second device, including but not limited to: high speed, low speed, line of sight (LOS), non-line of sight (NLOS), high signal-to-noise ratio, low signal-to-noise ratio, and other working scenarios;
the communication network access method of the candidate second device, including mobile network, WiFi, and fixed network, where the mobile network includes 2G, 3G, 4G, 5G, and 6G;
the channel quality of the candidate second device;
the difficulty of data collection for the candidate second device;
the battery state of the candidate second device, such as the specific value of the available remaining battery, or a graded description result, or whether the device is charging;
the storage state of the candidate second device, such as the specific value of the available memory, or a graded description result.
本实施例中,候选第二设备可以先向第一设备上报少量的训练数据(即第一训练数据)和/或第一参数,其中,第一参数可以是第一筛选条件的判断参数,第一设备根据第一筛选条件确定需要进行训练数据收集和上报的第二设备,该第二设备是选自候选第二设备,具体地,可以有M个候选第二设备,从其中确定N个第二设备需要进行训练数据收集和上报,N可以小于M,也可以等于M。
一具体示例中,可以根据候选第二设备的数据类型确定需要进行训练数据收集和上报的第二设备,按照候选第二设备的数据类型对候选第二设备进行分组,每组内的候选第二设备的数据类型相同或相近。在筛选第二设备时,从每一组候选第二设备中选取K1个候选第二设备作为需要进行训练数据收集和上报的第二设备,K1为正整数,这样可以保证参与训练的数据的多样性,保证每一组候选第二设备都有第二设备收集和上报训练数据。
一具体示例中,可以根据候选第二设备的业务类型确定需要进行训练数据收集和上报的第二设备,按照候选第二设备的业务类型对候选第二设备进行分组,每组内的候选第二设备的业务类型相同或相近。在筛选第二设备时,从每一组候选第二设备中选取K2个候选第二设备作为需要进行训练数据收集和上报的第二设备,K2为正整数,这样可以保证参与训练的数据的多样性, 保证每一组候选第二设备都有第二设备收集和上报训练数据。
一具体示例中,可以根据候选第二设备的数据分布参数确定需要进行训练数据收集和上报的第二设备,按照候选第二设备的数据分布参数对候选第二设备进行分组,每组内的候选第二设备的数据分布参数相同或相近。在筛选第二设备时,从每一组候选第二设备中选取K3个候选第二设备作为需要进行训练数据收集和上报的第二设备,K3为正整数,这样可以保证参与训练的数据的多样性,保证每一组候选第二设备都有第二设备收集和上报训练数据。
一具体示例中,可以根据候选第二设备的工作场景确定需要进行训练数据收集和上报的第二设备,按照候选第二设备的工作场景对候选第二设备进行分组,每组内的候选第二设备的工作场景相同或相近。在筛选第二设备时,从每一组候选第二设备中选取A个候选第二设备作为需要进行训练数据收集和上报的第二设备,A为正整数,这样可以保证参与训练的数据的多样性,保证每一组候选第二设备都有第二设备收集和上报训练数据。
一具体示例中,可以根据候选第二设备的通信网络接入方式确定需要进行训练数据收集和上报的第二设备,按照候选第二设备的通信网络接入方式对候选第二设备进行优先级排序,通信网络接入方式包括固网、WiFi和移动网络,移动网络包括2G,3G,4G,5G,6G等。其中,固网被筛选到的优先级大于等于WiFi被筛选到的优先级,WiFi被筛选到的优先级大于等于移动网络被筛选到的优先级。移动网络中代数越高,被筛选到的优先级越高,比如5G候选第二设备被筛选到的优先级高于4G候选第二设备被筛选到的优先级,按照优先级的从高到低从候选第二设备中选取B个候选第二设备作为需要进行训练数据收集和上报的第二设备,B为正整数。
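上述按通信网络接入方式排序筛选的规则(固网优先于WiFi,WiFi优先于移动网络,移动网络内代数越高优先级越高),可以用如下草图表达(优先级数值与字段名均为本文假设):

```python
# 数值越小优先级越高;固网 >= WiFi >= 移动网络,移动网络代数越高优先级越高
ACCESS_PRIORITY = {
    "fixed": 0,
    "wifi": 1,
    "6G": 2, "5G": 3, "4G": 4, "3G": 5, "2G": 6,
}

def select_by_access(candidates, b):
    """按通信网络接入方式的优先级从高到低选取 B 个候选第二设备。"""
    ranked = sorted(candidates, key=lambda d: ACCESS_PRIORITY[d["access"]])
    return ranked[:b]
```
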
一具体示例中,可以根据候选第二设备的信道质量确定需要进行训练数据收集和上报的第二设备,按照候选第二设备的信道质量对候选第二设备进行优先级排序,信道质量越高的候选第二设备,被筛选到的优先级越高,按照优先级的从高到低从候选第二设备中选取C个候选第二设备作为需要进行训练数据收集和上报的第二设备,C为正整数,这样可以保证信道质量好的第二设备收集和上报训练数据,保证特定AI模型的训练质量。
一具体示例中,可以根据候选第二设备收集数据的难易程度确定需要进行训练数据收集和上报的第二设备,按照候选第二设备收集数据的难易程度对候选第二设备进行优先级排序,收集数据难度越小的候选第二设备,被筛选到的优先级越高,按照优先级的从高到低从候选第二设备中选取D个候选第二设备作为需要进行训练数据收集和上报的第二设备,D为正整数,这样可以降低数据收集的难度。
一具体示例中,可以根据候选第二设备的电量状态确定需要进行训练数据收集和上报的第二设备,按照候选第二设备的电量状态对候选第二设备进行优先级排序,候选第二设备的电量越高,被筛选到的优先级越高,另外,处于充电状态的候选第二设备被筛选到的优先级最高,按照优先级的从高到低从候选第二设备中选取E个候选第二设备作为需要进行训练数据收集和上报的第二设备,E为正整数,这样可以保证进行训练数据收集和上报的第二设备有足够的电量。
一具体示例中,可以根据候选第二设备的存储状态确定需要进行训练数据收集和上报的第二设备,按照候选第二设备的存储状态对候选第二设备进行优先级排序,候选第二设备的可用存储空间越大,被筛选到的优先级越高,按照优先级的从高到低从候选第二设备中选取F个候选第二设备作为需要进行训练数据收集和上报的第二设备,F为正整数,这样可以保证进行训练数据收集和上报的第二设备有足够的可用存储空间来进行训练数据的存储。
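信道质量、收集难易程度、电量状态、存储状态等示例都属于"按单一数值指标排序、取前若干个"的筛选,可以统一用如下草图表达;其中"充电状态优先级最高"的规则对应电量状态示例(字段名为本文假设):

```python
def select_by_metric(candidates, metric, n, charging_first=False):
    """按某一数值指标(信道质量、剩余电量、可用存储空间等)
    从高到低选取 N 个候选第二设备;charging_first 为 True 时,
    处于充电状态的候选第二设备被筛选到的优先级最高。"""
    def priority(d):
        # 充电设备排最前,其余按指标值降序排列
        return (0 if charging_first and d.get("charging") else 1, -d[metric])
    return sorted(candidates, key=priority)[:n]
```
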
一些实施例中,向所述第一设备上报所述训练数据之前,所述方法还包括:
所述第二设备向所述第一设备发送第一请求,请求进行训练数据的收集和上报。
一些实施例中,单播的所述第一指示包括以下至少一项:
所述第二设备收集的训练数据的样本数,不同第二设备收集的训练数据的样本数不同或相同;
所述第二设备收集训练数据的时间,不同第二设备收集训练数据的时间不同或相同;
所述第二设备向所述第一设备上报训练数据的时间,不同第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
所述第二设备向所述第一设备上报的训练数据的数据格式。
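单播的第一指示所包含的各项内容,可以整理为如下示意性的数据结构;字段命名与类型均为本文为说明而假设,并非信令格式的限定:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UnicastIndication:
    """单播第一指示的示意性字段,与上述各项一一对应。"""
    num_samples: int                  # 收集的训练数据的样本数
    collect_time: str                 # 收集训练数据的时间
    report_time: str                  # 向第一设备上报训练数据的时间
    need_preprocess: bool             # 是否需要对收集的数据进行预处理
    preprocess_method: Optional[str]  # 对收集的数据进行预处理的方式
    data_format: str                  # 上报的训练数据的数据格式
```

不同第二设备收到的指示中,样本数、收集时间、上报时间等字段的取值可以不同,也可以相同。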
一些实施例中,广播的所述第一指示包括以下至少一项:
进行数据收集的候选第二设备的标识;
不进行数据收集的候选第二设备的标识;
进行数据收集的候选第二设备所需要收集的训练数据的样本数,不同候选第二设备收集的训练数据的样本数不同或相同;
进行数据收集的候选第二设备收集训练数据的时间,不同候选第二设备收集训练数据的时间不同或相同;
进行数据收集的候选第二设备向所述第一设备上报训练数据的时间,不同候选第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
进行数据收集的候选第二设备向所述第一设备上报的训练数据的数据格式;
所述第一筛选条件。
其中,进行数据收集的候选第二设备的标识和不进行数据收集的候选第二设备的标识组成所述第二筛选条件,候选第二设备可以根据自身的标识判断自身是否满足第二筛选条件。
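候选第二设备根据自身标识判断是否满足由两类标识组成的第二筛选条件,可以用如下草图说明(集合形式为本文假设):

```python
def satisfies_second_filter(device_id, collect_ids, no_collect_ids):
    """候选第二设备根据自身的标识判断是否满足第二筛选条件:
    第二筛选条件由进行数据收集的候选第二设备的标识集合
    与不进行数据收集的候选第二设备的标识集合共同组成。"""
    if device_id in no_collect_ids:
        return False          # 明确排除的设备不收集
    return device_id in collect_ids
```
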
一些实施例中,所述第二设备收集训练数据,并向所述第一设备上报所述训练数据之后,所述方法还包括:
推理设备接收所述第一设备发送的训练后的AI模型和超参数。
本实施例中,第一设备基于接收到的训练数据构建训练数据集,进行特定AI模型的训练,将训练收敛的AI模型和超参数下发给L个推理设备,推理设备是需要对AI模型进行性能验证和推理的第二设备,推理设备可以选自候选第二设备,还可以是候选第二设备之外的其他第二设备。
一些实施例中,所述AI模型为元学习模型,所述超参数可以由所述第一参数决定。
一些实施例中,所述超参数包括以下至少一项:
外迭代学习率;
不同训练任务或所述推理设备对应的内迭代学习率;
元学习率;
不同训练任务或所述推理设备对应的内迭代次数;
不同训练任务或所述推理设备对应的外迭代次数。
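上述超参数在元学习训练循环中的作用,可以用如下一阶元学习(类Reptile)的简化草图说明。此处将外迭代学习率与元学习率简化为同一个 meta_lr,并对所有任务取同一外迭代次数,且以一维标量参数为例;这些简化均为本文假设,并非本申请限定的训练流程:

```python
def meta_train(tasks, theta, inner_lr, meta_lr, inner_steps, outer_steps):
    """一阶元学习训练循环的简化草图。

    tasks: {任务标识: 该任务损失关于参数的梯度函数 grad(theta)}
    inner_lr / inner_steps: 不同训练任务(或推理设备)对应的
        内迭代学习率与内迭代次数,可各不相同
    meta_lr: 元学习率(此处同时充当外迭代学习率)
    outer_steps: 外迭代次数(此处对所有任务取同一值以简化)
    """
    for _ in range(outer_steps):
        for task_id, grad in tasks.items():
            phi = theta
            # 内迭代:在该任务上做若干步梯度下降,得到任务适配参数
            for _ in range(inner_steps[task_id]):
                phi -= inner_lr[task_id] * grad(phi)
            # 外迭代:按元学习率把初始参数向任务适配后的参数移动
            theta += meta_lr * (phi - theta)
    return theta
```
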
一些实施例中,所述推理设备接收所述第一设备发送的训练后的AI模型和超参数之后,所述方法还包括:
所述推理设备对所述AI模型进行性能验证;
若性能验证结果满足预设的第一条件,所述推理设备将所述AI模型用于推理。其中,第一条件可以是第一设备配置或预配置或协议约定的,推理设备对所述AI模型进行性能验证后,还可以将是否进行推理的结果上报给第一设备。
一些实施例中,进行性能验证的AI模型为所述第一设备下发的AI模型,或,所述第一设备下发的AI模型经过微调(fine-tuning)后的模型。
本实施例中,推理设备可以直接利用第一设备下发的AI模型进行性能验证,也可以是将第一设备下发的AI模型进行微调后再进行性能验证。对于元学习的微调,每个推理设备对应的元学习相关的特殊超参数可以不同。可以根据每个推理设备对应的第一参数(主要是根据第一参数中的数据收集难易程度、电量状态、存储状态等)来决定每个推理设备的元学习相关的特殊超参数。
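推理设备"先性能验证、满足第一条件才用于推理、否则可先微调"的流程,可以用如下草图勾勒。其中将第一条件假设为验证集平均绝对误差低于阈值,指标与阈值的定义均为本文假设:

```python
def validate_and_deploy(model_fn, val_data, threshold, fine_tune=None):
    """推理设备对第一设备下发的AI模型进行性能验证:
    若验证指标满足预设的第一条件(此处假设为平均绝对误差不超过阈值),
    则将模型用于推理;否则可先对模型微调,再重新验证。"""
    error = sum(abs(model_fn(x) - y) for x, y in val_data) / len(val_data)
    if error <= threshold:
        return True, model_fn                  # 直接将下发的模型用于推理
    if fine_tune is not None:
        tuned = fine_tune(model_fn, val_data)  # 微调后的模型再次验证
        error = sum(abs(tuned(x) - y) for x, y in val_data) / len(val_data)
        if error <= threshold:
            return True, tuned
    return False, None                         # 不满足第一条件,不用于推理
```

推理设备还可以把是否进行推理的结果上报给第一设备,这与上文描述一致。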
本实施例中,所述第一设备可以为网络侧设备,所述第二设备可以为终端;或,所述第一设备为网络侧设备,所述第二设备为网络侧设备,如多个网络侧设备将训练数据汇聚到一个网络侧设备进行训练的场景;或,所述第一设备为终端,所述第二设备为终端,如多个终端将训练数据汇聚到一个终端进行训练的场景。
另外,候选第二设备可以是网络侧设备,也可以是终端;推理设备可以是网络侧设备,也可以是终端。
上述实施例中,特定AI模型可以为信道估计模型、移动性预测模型等。本申请实施例的技术方案可以应用于6G网络中,还可以应用于5G和5.5G网络中。
本申请实施例提供的数据收集方法,执行主体可以为数据收集装置。本申请实施例中以数据收集装置执行数据收集方法为例,说明本申请实施例提供的数据收集装置。
本申请实施例提供一种数据收集装置,包括:
发送模块,用于向第二设备发送第一指示,指示所述第二设备收集并上报用以进行特定AI模型训练的训练数据;
接收模块,用于接收所述第二设备上报的训练数据;
训练模块,用于利用所述训练数据构造数据集,对所述特定AI模型进行训练。
一些实施例中,所述发送模块具体用于按照预设的第一筛选条件从M个候选第二设备中筛选出N个所述第二设备,向所述N个所述第二设备单播所述第一指示,M,N为正整数,N小于或等于M;或
向所述M个候选第二设备广播所述第一指示,所述第一指示携带有第二筛选条件,所述第二筛选条件用于筛选上报所述训练数据的第二设备,所述第二设备满足所述第二筛选条件。
一些实施例中,所述接收模块还用于接收所述候选第二设备上报的第一训练数据和/或第一参数,所述第一参数可以是所述第一筛选条件的判断参数。
一些实施例中,所述接收模块用于仅接收所述候选第二设备上报的第一训练数据,根据所述第一训练数据确定所述第一参数。
一些实施例中,所述第一参数包括以下至少一项:
所述候选第二设备的数据类型;
所述候选第二设备的数据分布参数;
所述候选第二设备的业务类型;
所述候选第二设备的工作场景;
所述候选第二设备的通信网络接入方式;
所述候选第二设备的信道质量;
所述候选第二设备收集数据的难易程度;
所述候选第二设备的电量状态;
所述候选第二设备的存储状态。
一些实施例中,单播的所述第一指示包括以下至少一项:
所述第二设备收集的训练数据的样本数,不同第二设备收集的训练数据的样本数不同或相同;
所述第二设备收集训练数据的时间,不同第二设备收集训练数据的时间不同或相同;
所述第二设备向所述第一设备上报训练数据的时间,不同第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
所述第二设备向所述第一设备上报的训练数据的数据格式。
一些实施例中,广播的所述第一指示包括以下至少一项:
进行数据收集的候选第二设备的标识;
不进行数据收集的候选第二设备的标识;
进行数据收集的候选第二设备所需要收集的训练数据的样本数,不同候选第二设备收集的训练数据的样本数不同或相同;
进行数据收集的候选第二设备收集训练数据的时间,不同候选第二设备收集训练数据的时间不同或相同;
进行数据收集的候选第二设备向所述第一设备上报训练数据的时间,不同候选第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
进行数据收集的候选第二设备向所述第一设备上报的训练数据的数据格式;
所述第一筛选条件。
一些实施例中,所述发送模块还用于向L个推理设备发送训练后的AI模型和超参数,所述L大于M、等于M或小于M。
一些实施例中,所述AI模型为元学习模型,所述超参数由所述第一参数决定。
一些实施例中,所述超参数包括以下至少一项:
外迭代学习率;
不同训练任务或所述推理设备对应的内迭代学习率;
元学习率;
不同训练任务或所述推理设备对应的内迭代次数;
不同训练任务或所述推理设备对应的外迭代次数。
本实施例中,所述第一设备可以为网络侧设备,所述第二设备可以为终端;或,所述第一设备为网络侧设备,所述第二设备为网络侧设备,如多个网络侧设备将训练数据汇聚到一个网络侧设备进行训练的场景;或,所述第一设备为终端,所述第二设备为终端,如多个终端将训练数据汇聚到一个终端进行训练的场景。
另外,候选第二设备可以是网络侧设备,也可以是终端;推理设备可以是网络侧设备,也可以是终端。
本申请实施例还提供了一种数据收集装置,包括:
接收模块,用于接收第一设备的第一指示,所述第一指示用以指示所述第二设备收集并上报用以进行特定AI模型训练的训练数据;
处理模块,用于收集训练数据,并向所述第一设备上报所述训练数据。
一些实施例中,所述接收模块用于接收所述第一设备单播的所述第一指示,所述第二设备为所述第一设备按照预设的第一筛选条件从候选第二设备中筛选出的第二设备;或
接收所述第一设备向候选第二设备广播的所述第一指示,所述第一指示携带有第二筛选条件,所述第二筛选条件用于筛选上报所述训练数据的第二设备,所述第二设备满足所述第二筛选条件。
一些实施例中,所述处理模块用于若所述第二设备接收所述第一设备单播的所述第一指示,所述第二设备收集并上报所述训练数据;或
若所述第二设备接收所述第一设备广播的所述第一指示,所述第二设备收集并上报所述训练数据。
一些实施例中,候选第二设备向所述第一设备上报第一训练数据和/或第一参数,所述第一参数可以是所述第一筛选条件的判断参数。
一些实施例中,所述候选第二设备向所述第一设备仅上报所述第一训练数据,所述第一训练数据用于确定所述第一参数。
一些实施例中,所述第一参数包括以下至少一项:
所述候选第二设备的数据类型;
所述候选第二设备的数据分布参数;
所述候选第二设备的业务类型;
所述候选第二设备的工作场景;
所述候选第二设备的通信网络接入方式;
所述候选第二设备的信道质量;
所述候选第二设备收集数据的难易程度;
所述候选第二设备的电量状态;
所述候选第二设备的存储状态。
一些实施例中,所述处理模块还用于向所述第一设备发送第一请求,请求进行训练数据的收集和上报。
一些实施例中,单播的所述第一指示包括以下至少一项:
所述第二设备收集的训练数据的样本数,不同第二设备收集的训练数据的样本数不同或相同;
所述第二设备收集训练数据的时间,不同第二设备收集训练数据的时间不同或相同;
所述第二设备向所述第一设备上报训练数据的时间,不同第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
所述第二设备向所述第一设备上报的训练数据的数据格式。
一些实施例中,广播的所述第一指示包括以下至少一项:
进行数据收集的候选第二设备的标识;
不进行数据收集的候选第二设备的标识;
进行数据收集的候选第二设备所需要收集的训练数据的样本数,不同候选第二设备收集的训练数据的样本数不同或相同;
进行数据收集的候选第二设备收集训练数据的时间,不同候选第二设备收集训练数据的时间不同或相同;
进行数据收集的候选第二设备向所述第一设备上报训练数据的时间,不同候选第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
进行数据收集的候选第二设备向所述第一设备上报的训练数据的数据格式;
所述第一筛选条件。
一些实施例中,推理设备接收所述第一设备发送的训练后的AI模型和超参数。
一些实施例中,所述AI模型为元学习模型,所述超参数由所述第一参数决定。
一些实施例中,所述超参数包括以下至少一项:
外迭代学习率;
不同训练任务或所述推理设备对应的内迭代学习率;
元学习率;
不同训练任务或所述推理设备对应的内迭代次数;
不同训练任务或所述推理设备对应的外迭代次数。
一些实施例中,所述数据收集装置还包括:
推理模块,用于对所述AI模型进行性能验证;若性能验证结果满足预设的第一条件,将所述AI模型用于推理。
一些实施例中,进行性能验证的AI模型为所述第一设备下发的AI模型,或,所述第一设备下发的AI模型经过微调后的模型。
本实施例中,所述第一设备可以为网络侧设备,所述第二设备可以为终端;或,所述第一设备为网络侧设备,所述第二设备为网络侧设备,如多个网络侧设备将训练数据汇聚到一个网络侧设备进行训练的场景;或,所述第一设备为终端,所述第二设备为终端,如多个终端将训练数据汇聚到一个终端进行训练的场景。
另外,候选第二设备可以是网络侧设备,也可以是终端;推理设备可以是网络侧设备,也可以是终端。
本申请实施例中的数据收集装置可以是电子设备,例如具有操作系统的电子设备,也可以是电子设备中的部件,例如集成电路或芯片。该电子设备可以是终端,也可以为除终端之外的其他设备。示例性的,终端可以包括但不限于上述所列举的终端11的类型,其他设备可以为服务器、网络附属存储器(Network Attached Storage,NAS)等,本申请实施例不作具体限定。
本申请实施例提供的数据收集装置能够实现图4至图5的方法实施例实现的各个过程,并达到相同的技术效果,为避免重复,这里不再赘述。
可选的,如图6所示,本申请实施例还提供一种通信设备600,包括处理器601和存储器602,存储器602上存储有可在所述处理器601上运行的程序或指令,例如,该通信设备600为第一设备时,该程序或指令被处理器601执行时实现上述数据收集方法实施例的各个步骤,且能达到相同的技术效果。该通信设备600为第二设备时,该程序或指令被处理器601执行时实现上述数据收集方法实施例的各个步骤,且能达到相同的技术效果,为避免重复,这里不再赘述。
本申请实施例还提供了一种第一设备,该第一设备包括处理器和存储器,所述存储器存储可在所述处理器上运行的程序或指令,所述程序或指令被所述处理器执行时实现如上所述的数据收集方法的步骤。
本申请实施例还提供了一种第一设备,包括处理器及通信接口,其中,所述通信接口用于向第二设备发送第一指示,指示所述第二设备收集并上报用以进行特定AI模型训练的训练数据;接收所述第二设备上报的训练数据;所述处理器用于利用所述训练数据构造数据集,对所述特定AI模型进行训练。
本申请实施例还提供了一种第二设备,该第二设备包括处理器和存储器,所述存储器存储可在所述处理器上运行的程序或指令,所述程序或指令被所述处理器执行时实现如上所述的数据收集方法的步骤。
本申请实施例还提供了一种第二设备,包括处理器及通信接口,其中,所述通信接口用于接收第一设备的第一指示,所述第一指示用以指示所述第二设备收集并上报用以进行特定AI模型训练的训练数据;所述处理器用于收集训练数据,并向所述第一设备上报所述训练数据。
上述第一设备可以为网络侧设备或终端,第二设备可以为网络侧设备或终端。
当第一设备和/或第二设备为终端时,本申请实施例还提供一种终端,包括处理器和通信接口,该终端实施例与上述终端侧方法实施例对应,上述方法实施例的各个实施过程和实现方式均可适用于该终端实施例中,且能达到相同的技术效果。具体地,图7为实现本申请实施例的一种终端的硬件结构示意图。
该终端700包括但不限于:射频单元701、网络模块702、音频输出单元703、输入单元704、传感器705、显示单元706、用户输入单元707、接口单元708、存储器709以及处理器710等中的至少部分部件。
本领域技术人员可以理解,终端700还可以包括给各个部件供电的电源(比如电池),电源可以通过电源管理系统与处理器710逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。图7中示出的终端结构并不构成对终端的限定,终端可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置,在此不再赘述。
应理解的是,本申请实施例中,输入单元704可以包括图形处理单元(Graphics Processing Unit,GPU)7041和麦克风7042,图形处理器7041对在视频捕获模式或图像捕获模式中由图像捕获装置(如摄像头)获得的静态图片或视频的图像数据进行处理。显示单元706可包括显示面板7061,可以采用液晶显示器、有机发光二极管等形式来配置显示面板7061。用户输入单元707包括触控面板7071以及其他输入设备7072中的至少一种。触控面板7071,也称为触摸屏。触控面板7071可包括触摸检测装置和触摸控制器两个部分。其他输入设备7072可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆,在此不再赘述。
本申请实施例中,射频单元701接收来自网络侧设备的下行数据后,可以传输给处理器710进行处理;另外,射频单元701可以向网络侧设备发送上行数据。通常,射频单元701包括但不限于天线、放大器、收发信机、耦合器、低噪声放大器、双工器等。
存储器709可用于存储软件程序或指令以及各种数据。存储器709可主要包括存储程序或指令的第一存储区和存储数据的第二存储区,其中,第一存储区可存储操作系统、至少一个功能所需的应用程序或指令(比如声音播放功能、图像播放功能等)等。此外,存储器709可以包括易失性存储器或非易失性存储器,或者,存储器709可以包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDRSDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synch link DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DRRAM)。本申请实施例中的存储器709包括但不限于这些和任意其它适合类型的存储器。
处理器710可包括一个或多个处理单元;可选的,处理器710集成应用处理器和调制解调处理器,其中,应用处理器主要处理涉及操作系统、用户界面和应用程序等的操作,调制解调处理器主要处理无线通信信号,如基带处理器。可以理解的是,上述调制解调处理器也可以不集成到处理器710中。
一些实施例中,第一设备为终端,处理器710用于向第二设备发送第一指示,指示所述第二设备收集并上报用以进行特定AI模型训练的训练数据;接收所述第二设备上报的训练数据;利用所述训练数据构造数据集,对所述特定AI模型进行训练。
一些实施例中,处理器710具体用于按照预设的第一筛选条件从M个候选第二设备中筛选出N个所述第二设备,向所述N个所述第二设备单播所述第一指示,M,N为正整数,N小于或等于M;或
向所述M个候选第二设备广播所述第一指示,所述第一指示携带有第二筛选条件,所述第二筛选条件用于筛选上报所述训练数据的第二设备,所述第二设备满足所述第二筛选条件。
一些实施例中,处理器710还用于接收所述候选第二设备上报的第一训练数据和/或第一参数,所述第一参数可以是所述第一筛选条件的判断参数。
一些实施例中,处理器710用于仅接收所述候选第二设备上报的第一训练数据,根据所述第一训练数据确定所述第一参数。
一些实施例中,所述第一参数包括以下至少一项:
所述候选第二设备的数据类型;
所述候选第二设备的数据分布参数;
所述候选第二设备的业务类型;
所述候选第二设备的工作场景;
所述候选第二设备的通信网络接入方式;
所述候选第二设备的信道质量;
所述候选第二设备收集数据的难易程度;
所述候选第二设备的电量状态;
所述候选第二设备的存储状态。
一些实施例中,单播的所述第一指示包括以下至少一项:
所述第二设备收集的训练数据的样本数,不同第二设备收集的训练数据的样本数不同或相同;
所述第二设备收集训练数据的时间,不同第二设备收集训练数据的时间不同或相同;
所述第二设备向所述第一设备上报训练数据的时间,不同第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
所述第二设备向所述第一设备上报的训练数据的数据格式。
一些实施例中,广播的所述第一指示包括以下至少一项:
进行数据收集的候选第二设备的标识;
不进行数据收集的候选第二设备的标识;
进行数据收集的候选第二设备所需要收集的训练数据的样本数,不同候选第二设备收集的训练数据的样本数不同或相同;
进行数据收集的候选第二设备收集训练数据的时间,不同候选第二设备收集训练数据的时间不同或相同;
进行数据收集的候选第二设备向所述第一设备上报训练数据的时间,不同候选第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
进行数据收集的候选第二设备向所述第一设备上报的训练数据的数据格式;
所述第一筛选条件。
一些实施例中,处理器710还用于向L个推理设备发送训练后的AI模型和超参数,所述L大于M、等于M或小于M。
一些实施例中,所述AI模型为元学习模型,所述超参数由所述第一参数决定。
一些实施例中,所述超参数包括以下至少一项:
外迭代学习率;
不同训练任务或所述推理设备对应的内迭代学习率;
元学习率;
不同训练任务或所述推理设备对应的内迭代次数;
不同训练任务或所述推理设备对应的外迭代次数。
本实施例中,所述第一设备可以为网络侧设备,所述第二设备可以为终端;或,所述第一设备为网络侧设备,所述第二设备为网络侧设备,如多个网络侧设备将训练数据汇聚到一个网络侧设备进行训练的场景;或,所述第一设备为终端,所述第二设备为终端,如多个终端将训练数据汇聚到一个终端进行训练的场景。
另外,候选第二设备可以是网络侧设备,也可以是终端;推理设备可以是网络侧设备,也可以是终端。
一些实施例中,第二设备为终端,处理器710用于接收第一设备的第一指示,所述第一指示用以指示所述第二设备收集并上报用以进行特定AI模型训练的训练数据;收集训练数据,并向所述第一设备上报所述训练数据。
一些实施例中,处理器710用于接收所述第一设备单播的所述第一指示,所述第二设备为所述第一设备按照预设的第一筛选条件从候选第二设备中筛选出的第二设备;或
接收所述第一设备向候选第二设备广播的所述第一指示,所述第一指示携带有第二筛选条件,所述第二筛选条件用于筛选上报所述训练数据的第二设备,所述第二设备满足所述第二筛选条件。
一些实施例中,处理器710用于若所述第二设备接收所述第一设备单播的所述第一指示,所述第二设备收集并上报所述训练数据;或
若所述第二设备接收所述第一设备广播的所述第一指示,所述第二设备收集并上报所述训练数据。
一些实施例中,候选第二设备向所述第一设备上报第一训练数据和/或第一参数,所述第一参数可以是所述第一筛选条件的判断参数。
一些实施例中,所述候选第二设备向所述第一设备仅上报所述第一训练数据,所述第一训练数据用于确定所述第一参数。
一些实施例中,所述第一参数包括以下至少一项:
所述候选第二设备的数据类型;
所述候选第二设备的数据分布参数;
所述候选第二设备的业务类型;
所述候选第二设备的工作场景;
所述候选第二设备的通信网络接入方式;
所述候选第二设备的信道质量;
所述候选第二设备收集数据的难易程度;
所述候选第二设备的电量状态;
所述候选第二设备的存储状态。
一些实施例中,处理器710还用于向所述第一设备发送第一请求,请求进行训练数据的收集和上报。
一些实施例中,单播的所述第一指示包括以下至少一项:
所述第二设备收集的训练数据的样本数,不同第二设备收集的训练数据的样本数不同或相同;
所述第二设备收集训练数据的时间,不同第二设备收集训练数据的时间不同或相同;
所述第二设备向所述第一设备上报训练数据的时间,不同第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
所述第二设备向所述第一设备上报的训练数据的数据格式。
一些实施例中,广播的所述第一指示包括以下至少一项:
进行数据收集的候选第二设备的标识;
不进行数据收集的候选第二设备的标识;
进行数据收集的候选第二设备所需要收集的训练数据的样本数,不同候选第二设备收集的训练数据的样本数不同或相同;
进行数据收集的候选第二设备收集训练数据的时间,不同候选第二设备收集训练数据的时间不同或相同;
进行数据收集的候选第二设备向所述第一设备上报训练数据的时间,不同候选第二设备上报训练数据的时间不同或相同;
是否需要对收集的数据进行预处理;
对收集的数据进行预处理的方式;
进行数据收集的候选第二设备向所述第一设备上报的训练数据的数据格式;
所述第一筛选条件。
一些实施例中,推理设备接收所述第一设备发送的训练后的AI模型和超参数。
一些实施例中,所述AI模型为元学习模型,所述超参数由所述第一参数决定。
一些实施例中,所述超参数包括以下至少一项:
外迭代学习率;
不同训练任务或所述推理设备对应的内迭代学习率;
元学习率;
不同训练任务或所述推理设备对应的内迭代次数;
不同训练任务或所述推理设备对应的外迭代次数。
一些实施例中,处理器710用于对所述AI模型进行性能验证;若性能验证结果满足预设的第一条件,将所述AI模型用于推理。
一些实施例中,进行性能验证的AI模型为所述第一设备下发的AI模型,或,所述第一设备下发的AI模型经过微调后的模型。
当第一设备和/或第二设备为网络侧设备时,本申请实施例还提供一种网络侧设备,包括处理器和通信接口。该网络侧设备实施例与上述网络侧设备方法实施例对应,上述方法实施例的各个实施过程和实现方式均可适用于该网络侧设备实施例中,且能达到相同的技术效果。
具体地,本申请实施例还提供了一种网络侧设备。如图8所示,该网络侧设备800包括:天线81、射频装置82、基带装置83、处理器84和存储器85。天线81与射频装置82连接。在上行方向上,射频装置82通过天线81接收信息,将接收的信息发送给基带装置83进行处理。在下行方向上,基带装置83对要发送的信息进行处理,并发送给射频装置82,射频装置82对收到的信息进行处理后经过天线81发送出去。
以上实施例中网络侧设备执行的方法可以在基带装置83中实现,该基带装置83包括基带处理器。
基带装置83例如可以包括至少一个基带板,该基带板上设置有多个芯片,如图8所示,其中一个芯片例如为基带处理器,通过总线接口与存储器85连接,以调用存储器85中的程序,执行以上方法实施例中所示的网络设备操作。
该网络侧设备还可以包括网络接口86,该接口例如为通用公共无线接口(common public radio interface,CPRI)。
具体地,本发明实施例的网络侧设备800还包括:存储在存储器85上并可在处理器84上运行的指令或程序,处理器84调用存储器85中的指令或程序执行如上所述的数据收集方法,并达到相同的技术效果,为避免重复,故不在此赘述。
本申请实施例还提供一种可读存储介质,所述可读存储介质上存储有程序或指令,该程序或指令被处理器执行时实现上述数据收集方法实施例的各个过程,且能达到相同的技术效果,为避免重复,这里不再赘述。
其中,所述处理器为上述实施例中所述的终端中的处理器。所述可读存储介质,包括计算机可读存储介质,如计算机只读存储器ROM、随机存取存储器RAM、磁碟或者光盘等。
本申请实施例另提供了一种芯片,所述芯片包括处理器和通信接口,所述通信接口和所述处理器耦合,所述处理器用于运行程序或指令,实现上述数据收集方法实施例的各个过程,且能达到相同的技术效果,为避免重复,这里不再赘述。
应理解,本申请实施例提到的芯片还可以称为系统级芯片,系统芯片,芯片系统或片上系统芯片等。
本申请实施例另提供了一种计算机程序/程序产品,所述计算机程序/程序产品被存储在存储介质中,所述计算机程序/程序产品被至少一个处理器执行以实现上述数据收集方法实施例的各个过程,且能达到相同的技术效果,为避免重复,这里不再赘述。
本申请实施例还提供了一种数据收集系统,包括:第一设备及第二设备,所述第一设备可用于执行如上所述的数据收集方法的步骤,所述第二设备可用于执行如上所述的数据收集方法的步骤。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。此外,需要指出的是,本申请实施方式中的方法和装置的范围不限按示出或讨论的顺序来执行功能,还可包括根据所涉及的功能按基本同时的方式或按相反的顺序来执行功能,例如,可以按不同于所描述的次序来执行所描述的方法,并且还可以添加、省去、或组合各种步骤。另外,参照某些示例所描述的特征可在其他示例中被 组合。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以计算机软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端(可以是手机,计算机,服务器,空调器,或者网络设备等)执行本申请各个实施例所述的方法。
上面结合附图对本申请的实施例进行了描述,但是本申请并不局限于上述的具体实施方式,上述的具体实施方式仅仅是示意性的,而不是限制性的,本领域的普通技术人员在本申请的启示下,在不脱离本申请宗旨和权利要求所保护的范围情况下,还可做出很多形式,均属于本申请的保护之内。

Claims (33)

  1. 一种数据收集方法,包括:
    第一设备向第二设备发送第一指示,指示所述第二设备收集并上报用以进行特定AI模型训练的训练数据;
    所述第一设备接收所述第二设备上报的训练数据;
    所述第一设备利用所述训练数据构造数据集,对所述特定AI模型进行训练。
  2. 根据权利要求1所述的数据收集方法,其中,所述第一设备向第二设备发送第一指示包括:
    所述第一设备按照预设的第一筛选条件从M个候选第二设备中筛选出N个所述第二设备,向所述N个所述第二设备单播所述第一指示,M,N为正整数,N小于或等于M;或
    所述第一设备向所述M个候选第二设备广播所述第一指示,所述第一指示携带有第二筛选条件,所述第二筛选条件用于筛选上报所述训练数据的第二设备,所述第二设备满足所述第二筛选条件。
  3. 根据权利要求2所述的数据收集方法,其中,所述第一设备向第二设备发送第一指示之前,所述方法还包括:
    所述第一设备接收所述候选第二设备上报的第一训练数据和/或第一参数,所述第一参数是所述第一筛选条件的判断参数。
  4. 根据权利要求3所述的数据收集方法,其中,
    所述第一设备仅接收所述候选第二设备上报的第一训练数据,根据所述第一训练数据确定所述第一参数。
  5. 根据权利要求3或4所述的数据收集方法,其中,所述第一参数包括以下至少一项:
    所述候选第二设备的数据类型;
    所述候选第二设备的数据分布参数;
    所述候选第二设备的业务类型;
    所述候选第二设备的工作场景;
    所述候选第二设备的通信网络接入方式;
    所述候选第二设备的信道质量;
    所述候选第二设备收集数据的难易程度;
    所述候选第二设备的电量状态;
    所述候选第二设备的存储状态。
  6. 根据权利要求2所述的数据收集方法,其中,单播的所述第一指示包括以下至少一项:
    所述第二设备收集的训练数据的样本数;
    所述第二设备收集训练数据的时间;
    所述第二设备向所述第一设备上报训练数据的时间;
    是否需要对收集的数据进行预处理;
    对收集的数据进行预处理的方式;
    所述第二设备向所述第一设备上报的训练数据的数据格式。
  7. 根据权利要求2所述的数据收集方法,其中,广播的所述第一指示包括以下至少一项:
    进行数据收集的候选第二设备的标识;
    不进行数据收集的候选第二设备的标识;
    进行数据收集的候选第二设备所需要收集的训练数据的样本数;
    进行数据收集的候选第二设备收集训练数据的时间;
    进行数据收集的候选第二设备向所述第一设备上报训练数据的时间;
    是否需要对收集的数据进行预处理;
    对收集的数据进行预处理的方式;
    进行数据收集的候选第二设备向所述第一设备上报的训练数据的数据格式;
    所述第一筛选条件。
  8. 根据权利要求3所述的数据收集方法,其中,对特定AI模型进行训练之后,所述方法还包括:
    所述第一设备向L个推理设备发送训练后的AI模型和超参数,所述L大于M、等于M或小于M。
  9. 根据权利要求8所述的数据收集方法,其中,所述AI模型为元学习模型,所述超参数由所述第一参数决定。
  10. 根据权利要求8所述的数据收集方法,其中,所述超参数包括以下至少一项:
    外迭代学习率;
    不同训练任务或所述推理设备对应的内迭代学习率;
    元学习率;
    不同训练任务或所述推理设备对应的内迭代次数;
    不同训练任务或所述推理设备对应的外迭代次数。
  11. 根据权利要求1所述的数据收集方法,
    所述第一设备为网络侧设备,所述第二设备为终端;或
    所述第一设备为网络侧设备,所述第二设备为网络侧设备;或
    所述第一设备为终端,所述第二设备为终端。
  12. 一种数据收集方法,包括:
    第二设备接收第一设备的第一指示,所述第一指示用以指示所述第二设备收集并上报用以进行特定AI模型训练的训练数据;
    所述第二设备收集训练数据,并向所述第一设备上报所述训练数据。
  13. 根据权利要求12所述的数据收集方法,其中,所述第二设备接收第一设备的第一指示包括:
    所述第二设备接收所述第一设备单播的所述第一指示,所述第二设备为所述第一设备按照预设的第一筛选条件从候选第二设备中筛选出的第二设备;或
    所述第二设备接收所述第一设备向候选第二设备广播的所述第一指示,所述第一指示携带有第二筛选条件,所述第二筛选条件用于筛选上报所述训练数据的第二设备,所述第二设备满足所述第二筛选条件。
  14. 根据权利要求13所述的数据收集方法,其中,所述第二设备收集训练数据,并向所述第一设备上报所述训练数据包括:
    若所述第二设备接收所述第一设备单播的所述第一指示,所述第二设备收集并上报所述训练数据;或
    若所述第二设备接收所述第一设备广播的所述第一指示,所述第二设备收集并上报所述训练数据。
  15. 根据权利要求13所述的数据收集方法,其中,所述第二设备接收第一设备的第一指示之前,所述方法还包括:
    候选第二设备向所述第一设备上报第一训练数据和/或第一参数,所述第一参数是所述第一筛选条件的判断参数。
  16. 根据权利要求15所述的数据收集方法,其中,所述候选第二设备向所述第一设备仅上报所述第一训练数据,所述第一训练数据用于确定所述第一参数。
  17. 根据权利要求15所述的数据收集方法,其中,所述第一参数包括以下至少一项:
    所述候选第二设备的数据类型;
    所述候选第二设备的数据分布参数;
    所述候选第二设备的业务类型;
    所述候选第二设备的工作场景;
    所述候选第二设备的通信网络接入方式;
    所述候选第二设备的信道质量;
    所述候选第二设备收集数据的难易程度;
    所述候选第二设备的电量状态;
    所述候选第二设备的存储状态。
  18. 根据权利要求12所述的数据收集方法,其中,向所述第一设备上报所述训练数据之前,所述方法还包括:
    所述第二设备向所述第一设备发送第一请求,请求进行训练数据的收集和上报。
  19. 根据权利要求13所述的数据收集方法,其中,单播的所述第一指示包括以下至少一项:
    所述第二设备收集的训练数据的样本数;
    所述第二设备收集训练数据的时间;
    所述第二设备向所述第一设备上报训练数据的时间;
    是否需要对收集的数据进行预处理;
    对收集的数据进行预处理的方式;
    所述第二设备向所述第一设备上报的训练数据的数据格式。
  20. 根据权利要求13所述的数据收集方法,其中,广播的所述第一指示包括以下至少一项:
    进行数据收集的候选第二设备的标识;
    不进行数据收集的候选第二设备的标识;
    进行数据收集的候选第二设备所需要收集的训练数据的样本数;
    进行数据收集的候选第二设备收集训练数据的时间;
    进行数据收集的候选第二设备向所述第一设备上报训练数据的时间;
    是否需要对收集的数据进行预处理;
    对收集的数据进行预处理的方式;
    进行数据收集的候选第二设备向所述第一设备上报的训练数据的数据格式;
    所述第一筛选条件。
  21. 根据权利要求15所述的数据收集方法,其中,所述第二设备收集训练数据,并向所述第一设备上报所述训练数据之后,所述方法还包括:
    推理设备接收所述第一设备发送的训练后的AI模型和超参数。
  22. 根据权利要求21所述的数据收集方法,其中,所述AI模型为元学习模型,所述超参数由所述第一参数决定。
  23. 根据权利要求21所述的数据收集方法,其中,所述超参数包括以下至少一项:
    外迭代学习率;
    不同训练任务或所述推理设备对应的内迭代学习率;
    元学习率;
    不同训练任务或所述推理设备对应的内迭代次数;
    不同训练任务或所述推理设备对应的外迭代次数。
  24. 根据权利要求21所述的数据收集方法,其中,所述推理设备接收所述第一设备发送的训练后的AI模型和超参数之后,所述方法还包括:
    所述推理设备对所述AI模型进行性能验证;
    若性能验证结果满足预设的第一条件,所述推理设备将所述AI模型用于推理。
  25. 根据权利要求24所述的数据收集方法,其中,进行性能验证的AI模型为所述第一设备下发的AI模型,或,所述第一设备下发的AI模型经过微调后的模型。
  26. 根据权利要求12所述的数据收集方法,其中,
    所述第一设备为网络侧设备,所述第二设备为终端;或
    所述第一设备为网络侧设备,所述第二设备为网络侧设备;或
    所述第一设备为终端,所述第二设备为终端。
  27. 一种数据收集装置,包括:
    发送模块,用于向第二设备发送第一指示,指示所述第二设备收集并上报用以进行特定AI模型训练的训练数据;
    接收模块,用于接收所述第二设备上报的训练数据;
    训练模块,用于利用所述训练数据构造数据集,对所述特定AI模型进行训练。
  28. 根据权利要求27所述的数据收集装置,其中,
    所述发送模块具体用于按照预设的第一筛选条件从M个候选第二设备中筛选出N个所述第二设备,向所述N个所述第二设备单播所述第一指示,M,N为正整数,N小于或等于M;或
    向所述M个候选第二设备广播所述第一指示,所述第一指示携带有第二筛选条件,所述第二筛选条件用于筛选上报所述训练数据的第二设备,所述第二设备满足所述第二筛选条件。
  29. 一种数据收集装置,包括:
    接收模块,用于接收第一设备的第一指示,所述第一指示用以指示第二设备收集并上报用以进行特定AI模型训练的训练数据;
    处理模块,用于收集训练数据,并向所述第一设备上报所述训练数据。
  30. 根据权利要求29所述的数据收集装置,其中,所述数据收集装置还包括:
    推理模块,用于对所述AI模型进行性能验证;若性能验证结果满足预设的第一条件,将所述AI模型用于推理。
  31. 一种第一设备,包括处理器和存储器,所述存储器存储可在所述处理器上运行的程序或指令,所述程序或指令被所述处理器执行时实现如权利要求1至11任一项所述的数据收集方法的步骤。
  32. 一种第二设备,包括处理器和存储器,所述存储器存储可在所述处理器上运行的程序或指令,所述程序或指令被所述处理器执行时实现如权利要求12至26任一项所述的数据收集方法的步骤。
  33. 一种可读存储介质,所述可读存储介质上存储程序或指令,所述程序或指令被处理器执行时实现如权利要求1-11任一项所述的数据收集方法,或者实现如权利要求12至26任一项所述的数据收集方法的步骤。
PCT/CN2022/138757 2021-12-15 2022-12-13 数据收集方法及装置、第一设备、第二设备 WO2023109828A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111540035.0 2021-12-15
CN202111540035.0A CN116264712A (zh) 2021-12-15 2021-12-15 数据收集方法及装置、第一设备、第二设备

Publications (1)

Publication Number Publication Date
WO2023109828A1 true WO2023109828A1 (zh) 2023-06-22

Family

ID=86722671

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/138757 WO2023109828A1 (zh) 2021-12-15 2022-12-13 数据收集方法及装置、第一设备、第二设备

Country Status (2)

Country Link
CN (1) CN116264712A (zh)
WO (1) WO2023109828A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107741899A (zh) * 2017-10-16 2018-02-27 北京小米移动软件有限公司 处理终端数据的方法、装置及系统
CN110457476A (zh) * 2019-08-06 2019-11-15 北京百度网讯科技有限公司 用于生成分类模型的方法和装置
US10576380B1 (en) * 2018-11-05 2020-03-03 Sony Interactive Entertainment LLC Artificial intelligence (AI) model training using cloud gaming network
CN111226238A (zh) * 2017-11-07 2020-06-02 华为技术有限公司 一种预测方法及终端、服务器
CN111931876A (zh) * 2020-10-12 2020-11-13 支付宝(杭州)信息技术有限公司 一种用于分布式模型训练的目标数据方筛选方法及系统
CN113177367A (zh) * 2021-05-28 2021-07-27 北京邮电大学 高能效的联邦学习方法、装置、边缘服务器及用户设备

Also Published As

Publication number Publication date
CN116264712A (zh) 2023-06-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22906573

Country of ref document: EP

Kind code of ref document: A1