WO2022048387A1 - Data storage method and system, and data calling method and system - Google Patents

Data storage method and system, and data calling method and system

Info

Publication number
WO2022048387A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
target data
data set
target
information
Prior art date
Application number
PCT/CN2021/110847
Other languages
French (fr)
Chinese (zh)
Inventor
闵令昂
Original Assignee
北京航迹科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202010931768.6A (CN112069368B)
Application filed by 北京航迹科技有限公司
Publication of WO2022048387A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/903 Querying

Definitions

  • The present application relates to the field of information technology, and in particular to a data storage and invocation method and system.
  • Autonomous driving test vehicles undergoing road tests usually need to collect a large amount of data (referred to as "road test data") for analysis, debugging, and other purposes.
  • The demand of the relevant personnel for road test data is usually targeted; that is, only part of the large amount of data needs to be called.
  • A specific team usually needs to call only road test data of a specific type rather than all types of road test data (e.g., an image processing team needs only image-type road test data).
  • Alternatively, only road test data within a specific range (for example, a specific time period) needs to be called.
  • One aspect of the present application provides a data storage method executed by a computing device. The method includes: acquiring an original data set that includes a plurality of data elements, each data element having type information marking the type of the data element; obtaining the number N of different types according to the type information of the data elements in the original data set, and establishing N different target data sets accordingly, the N different target data sets corresponding to different types of data elements, where N is an integer greater than or equal to 2; and, based on the type information of the data elements in the original data set and on each target data set, storing each data element in its corresponding target data set, the target data sets being stored in a first storage device.
  • The data set is a file, and the data elements of the file are messages.
  • The types include one or more of an image class, a location class, a sensor class, a packet class, and a controller area network (CAN) bus class.
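  • The storage step described above can be illustrated with a minimal sketch. It assumes that each message carries a type label and a timestamp; the class `DataElement`, its field names, and the function `split_by_type` are hypothetical and are not taken from the application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """One message of the original data set (all field names are hypothetical)."""
    type_name: str    # type information, e.g. "image", "location", "sensor"
    timestamp: float  # acquisition time in seconds
    payload: bytes


def split_by_type(original):
    """Build one target data set per distinct type found in the original data set."""
    targets = defaultdict(list)
    for element in original:
        targets[element.type_name].append(element)
    if len(targets) < 2:
        # The text requires N >= 2; this check is an illustrative choice.
        raise ValueError("expected at least two distinct types")
    return dict(targets)


original = [
    DataElement("image", 1.0, b"frame-0"),
    DataElement("location", 1.1, b"fix-0"),
    DataElement("image", 2.0, b"frame-1"),
]
targets = split_by_type(original)
n = len(targets)  # the number N of target data sets
```

  • In this sketch each target data set is an in-memory list; in the described method it would be a file kept in the first storage device.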
  • The method further includes: establishing index information of the target data set, where the index information includes at least meta identification information and storage location information corresponding to each data element in the target data set; the meta identification information is the identification information of the corresponding data element.
  • The data elements in the target data set are arranged in chronological order, and the meta identification information includes time information of the corresponding data elements.
  • The index information further includes set identification information of the original data set corresponding to each data element in the target data set; the set identification information is the identification information of the original data set.
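  • One possible shape for such index information is sketched below. It assumes that the storage location is a byte offset and length inside a single target file and that the set identification is a trip label; `IndexEntry`, `build_index`, and `read_element` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    timestamp: float  # meta identification information (time of the element)
    offset: int       # storage location: byte offset inside the target file
    length: int       # storage location: payload size in bytes
    source_set: str   # set identification information of the original data set


def build_index(elements):
    """elements: iterable of (timestamp, payload, source_set) tuples.

    Returns the concatenated target file bytes and an index sorted by time.
    """
    blob = bytearray()
    index = []
    for timestamp, payload, source_set in sorted(elements, key=lambda e: e[0]):
        index.append(IndexEntry(timestamp, len(blob), len(payload), source_set))
        blob.extend(payload)
    return bytes(blob), index


def read_element(blob, entry):
    """Fetch one data element using only the storage location in its index entry."""
    return blob[entry.offset:entry.offset + entry.length]


blob, index = build_index([
    (2.0, b"frame-1", "trip-A"),
    (1.0, b"frame-0", "trip-A"),
    (3.0, b"frame-2", "trip-B"),
])
```

  • Because the entries are sorted by time, a later lookup can locate a time range without scanning the stored payloads.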
  • The data in the original data set includes data generated or collected during operation of an autonomous vehicle.
  • The method further includes: receiving a data call request sent by a client, the data call request including at least the type of the data to be called; determining, based on the data call request, the target data set of the corresponding type from the N different target data sets; obtaining the data to be called based on the data elements in the determined target data set; and sending the data to be called to a second storage device of the client.
  • Obtaining the data to be called based on the data elements in the determined target data set further comprises: acquiring the target data set and the data elements stored in it from the first storage device; dividing the target data set into multiple target data subsets at preset time intervals; and acquiring, based on the data call request, the data elements corresponding to some of the multiple target data subsets, the data to be called including the data elements corresponding to those target data subsets.
  • Obtaining the data to be called based on the data elements in the determined target data set further comprises: sending the determined target data set acquired from the first storage device, together with the data elements stored in it, to a third storage device, the first storage device being farther from the client than the third storage device is; dividing the target data set into multiple target data subsets at preset time intervals and storing them in the third storage device; establishing multiple logical files, each logical file corresponding to one of the multiple target data subsets and including index information corresponding to the data elements in that subset; and obtaining, based on the data call request and the logical files, the data elements stored in some of the multiple target data subsets from the third storage device, the data to be called including the data elements corresponding to those target data subsets.
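  • The division into target data subsets at preset time intervals can be sketched as follows, assuming timestamps in seconds and a closed request range; `split_into_subsets` and `select_subsets` are hypothetical helpers, and the 60-second interval is only an example value.

```python
from collections import defaultdict


def split_into_subsets(elements, interval):
    """Group (timestamp, payload) pairs into buckets of `interval` seconds.

    Bucket k covers the half-open range [k * interval, (k + 1) * interval).
    """
    subsets = defaultdict(list)
    for timestamp, payload in elements:
        subsets[int(timestamp // interval)].append((timestamp, payload))
    return dict(subsets)


def select_subsets(subsets, interval, start, end):
    """Return only the buckets that overlap the requested range [start, end]."""
    first = int(start // interval)
    last = int(end // interval)
    return {k: v for k, v in subsets.items() if first <= k <= last}


elements = [(5.0, b"a"), (65.0, b"b"), (125.0, b"c"), (185.0, b"d")]
subsets = split_into_subsets(elements, interval=60)
selected = select_subsets(subsets, interval=60, start=70, end=130)
```

  • Whole subsets are returned rather than individual elements, so only the subsets that overlap the request need to be read or transferred.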
  • The data storage system includes an original data set acquisition module, a target data set establishment module, and a storage module.
  • The original data set acquisition module is used for acquiring an original data set; the original data set includes a plurality of data elements, and each data element has type information marking the type of the data element.
  • The target data set establishment module is used to obtain the number N of different types according to the type information of the data elements in the original data set, and to establish N different target data sets accordingly, the N different target data sets corresponding to different types of data elements, where N is an integer greater than or equal to 2.
  • The storage module is configured to store each data element in its corresponding target data set based on the type information of the data elements in the original data set and on the target data sets.
  • The data set is a file, and the data elements of the file are messages.
  • The types include one or more of an image class, a location class, a sensor class, a packet class, and a controller area network (CAN) bus class.
  • The system further includes an index information establishment module configured to establish index information of the target data set, where the index information includes at least meta identification information and storage location information in one-to-one correspondence with each data element in the target data set; the meta identification information is the identification information of the corresponding data element.
  • The data elements in the target data set are arranged in chronological order, and the meta identification information includes time information of the corresponding data elements.
  • The index information further includes set identification information of the original data set corresponding to each data element in the target data set; the set identification information is the identification information of the original data set.
  • Another aspect of the present application provides a storage medium for storing computer instructions; after a computer reads the computer instructions in the storage medium, the computer executes the above data storage method.
  • Yet another aspect of the present application provides a data calling method executed by a computing device, wherein the data elements in the original data set are stored in corresponding target data sets according to the above data storage method, and the target data sets are stored in a first storage device associated with the computing device.
  • The data calling method includes: acquiring a data call request sent by a client, the data call request including at least the type of the data to be called; acquiring, based on the data call request, part of the data in the target data sets to obtain the data to be called, the partial data including the data elements in the target data set corresponding to the type of the data to be called; and sending the data to be called to a second storage device of the client.
  • The target data set has corresponding index information, which includes at least meta identification information and storage location information corresponding to each data element in the target data set, the meta identification information being the identification information of the corresponding data element; the data call request further includes a meta qualification related to the meta identification information.
  • Acquiring, based on the data call request, the data elements in the target data set of the corresponding type includes: acquiring, based on the data call request, index information that corresponds to the corresponding type and satisfies the meta qualification; and obtaining the data elements based on the storage locations in the acquired index information.
  • The data elements in the target data set are arranged in chronological order, the meta identification information includes time information of the corresponding data elements, and the meta qualification includes a time range corresponding to the data to be called.
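  • A call that combines a type with a time range can be sketched as a binary search over a time-sorted index. The `INDEXES` table, its file-style location strings, and the `query` function are hypothetical; the application does not prescribe a search algorithm.

```python
from bisect import bisect_left, bisect_right

# Hypothetical per-type indexes: each entry is (timestamp, storage_location),
# kept sorted by timestamp as the text describes.
INDEXES = {
    "image": [(1.0, "img.bin:0"), (2.0, "img.bin:7"), (3.0, "img.bin:14")],
    "location": [(1.5, "loc.bin:0"), (2.5, "loc.bin:5")],
}


def query(type_name, start, end):
    """Return storage locations of `type_name` elements with start <= time <= end."""
    index = INDEXES[type_name]
    times = [t for t, _ in index]
    lo = bisect_left(times, start)   # first entry at or after `start`
    hi = bisect_right(times, end)    # one past the last entry at or before `end`
    return [location for _, location in index[lo:hi]]


locations = query("image", 1.5, 3.0)
```

  • Only the index is consulted to resolve the request; the data elements themselves are read afterwards from the returned storage locations.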
  • Acquiring part of the data in the target data set based on the data call request further includes: dividing the target data set into multiple target data subsets at preset time intervals; and obtaining, based on the data call request, the data elements corresponding to some of the target data subsets, the data to be called including the data elements corresponding to those target data subsets.
  • Acquiring part of the data in the target data set based on the data call request further includes: sending the target data set acquired from the first storage device, together with the data elements stored in it, to a third storage device, the first storage device being farther from the client than the third storage device is; dividing the target data set into multiple target data subsets at preset time intervals and storing them in the third storage device; establishing multiple logical files, each logical file corresponding to one of the multiple target data subsets and including index information corresponding to the data elements in that subset; and obtaining, based on the data call request and the logical files, the data elements corresponding to some of the multiple target data subsets from the third storage device, the data to be called including the data elements corresponding to those target data subsets.
  • The index information further includes set identification information of the original data set corresponding to each data element in the target data set, the set identification information being the identification information of the original data set; the data call request further includes a set qualification related to the set identification information.
  • Acquiring, based on the data call request, index information that corresponds to the corresponding type and satisfies the meta qualification includes: acquiring, based on the data call request, index information that corresponds to the corresponding type and satisfies the set qualification.
  • The data calling system includes a user request acquisition module and a calling module.
  • The user request acquisition module is configured to obtain a data call request sent by a client, the data call request including at least the type of the data to be called.
  • The calling module is configured to acquire, based on the data call request, part of the data in the target data set to obtain the data to be called, the partial data including the data elements in the target data set corresponding to the type of the data to be called.
  • The target data set has corresponding index information, which includes at least meta identification information and storage location information corresponding to each data element in the target data set, the meta identification information being the identification information of the corresponding data element; the data call request further includes a meta qualification related to the meta identification information.
  • The calling module includes an index information acquisition unit, a segment storage unit, and a data element acquisition unit.
  • The index information acquisition unit is configured to obtain, based on the data call request, index information that corresponds to the corresponding type and satisfies the meta qualification.
  • The segment storage unit is configured to divide the target data set into a plurality of target data subsets at preset time intervals and to store each target data subset separately.
  • The data element acquisition unit is configured to obtain the data elements based on the storage locations in the obtained index information.
  • The data elements in the target data set are arranged in chronological order, the meta identification information includes time information of the corresponding data elements, and the meta qualification includes a time range corresponding to the data to be called.
  • The data calling system further includes a synchronization module configured to send the data to be called to a second storage device of the client.
  • The index information further includes set identification information of the original data set corresponding to each data element in the target data set, the set identification information being the identification information of the original data set; the data call request further includes a set qualification related to the set identification information.
  • The index information acquisition unit is further configured to obtain, based on the data call request, index information that corresponds to the corresponding type and satisfies the set qualification.
  • Another aspect of the present application provides a storage medium, wherein the storage medium is used for storing computer instructions, and after the computer reads the computer instructions in the storage medium, the above data calling method is executed.
  • FIG. 1 is a schematic diagram of an application scenario of a data processing system according to some embodiments of the present application.
  • FIG. 2 is a block diagram of an exemplary processing device according to some embodiments of the present application.
  • FIG. 3 is a block diagram of another exemplary processing device according to some embodiments of the present application.
  • FIG. 4 is an exemplary flowchart of a data storage method according to some embodiments of the present application.
  • FIG. 5 is a schematic diagram of storing different types of data elements in an original data set in a corresponding target data set according to some embodiments of the present application.
  • FIG. 6 is a schematic diagram of index information corresponding to a target dataset according to some embodiments of the present application.
  • FIG. 7 is an exemplary flowchart of a data calling method according to some embodiments of the present application.
  • FIG. 8 is a schematic diagram of a data calling process according to some embodiments of the present application.
  • FIG. 9 is a schematic diagram of a data calling scenario according to some embodiments of the present application.
  • FIG. 10 is a schematic diagram of data invocation according to some embodiments of the present application.
  • FIG. 11 is a schematic diagram of data storage and invocation according to some embodiments of the present application.
  • FIG. 12 is a schematic diagram of a user interaction interface according to some embodiments of the present application.
  • As used herein, "system", "device", "unit", and/or "module" are terms for distinguishing different components, elements, parts, or assemblies at different levels.
  • The embodiments of the present application can be applied to data storage and invocation scenarios in which the amount of data is large and the user's demand for data is specific. In such scenarios, the user usually needs to call only part of the large amount of data.
  • The large amount of data may include road test data collected by an autonomous driving test vehicle during a road test.
  • The amount of road test data collected by an autonomous driving test vehicle can reach about 17 MB per second per vehicle, the average amount of road test data called each time can exceed 11 GB, and the average amount of road test data called each day can exceed 8 TB.
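  • To give a sense of scale, the stated rate of about 17 MB per second per vehicle can be converted into hourly and fleet volumes. The sketch below assumes decimal units (1 GB = 1000 MB); the fleet size and test duration are hypothetical values chosen only for illustration.

```python
RATE_MB_PER_S = 17        # per vehicle, from the text
SECONDS_PER_HOUR = 3600

mb_per_hour = RATE_MB_PER_S * SECONDS_PER_HOUR  # 61200 MB per vehicle per hour


def fleet_gb(vehicles, hours):
    """Total collected volume in GB for a fleet, assuming 1 GB = 1000 MB."""
    return vehicles * hours * mb_per_hour / 1000


# Hypothetical example: 10 vehicles, each tested for 8 hours.
total_gb = fleet_gb(10, 8)
```

  • At roughly 61 GB per vehicle per hour, even a small fleet produces terabytes per day, which is why calling only the needed part of the data matters.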
  • The embodiments of the present application provide a data storage and/or calling method that can efficiently call, from the large amount of data, the part that meets the specific needs of the user.
  • The application scenarios of the data storage and invocation method and system described here are only some examples or embodiments of the present application. Those of ordinary skill in the art can also apply the present application to other similar scenarios according to these drawings without creative effort.
  • Although this application mainly takes road test data as an example, the principles of this application can also be applied to the storage and invocation of other data for which the amount of data is large and user demand is targeted, for example, positioning data, production data, and monitoring data.
  • FIG. 1 is a schematic diagram of an application scenario of a data processing system according to some embodiments of the present application.
  • The data processing system 100 may include a vehicle 110 (e.g., vehicles 110-1, 110-2, ..., and/or 110-n), a server 120, a terminal device 130, a storage device 140, a network 150, and a positioning and navigation system 160.
  • the data processing system 100 can be applied to taxi services, security systems, network monitoring, driverless vehicles, and the like. It should be noted that the description about autonomous driving in this application is for illustration purposes only, and does not limit the scope of this application.
  • Vehicle 110 may be any type of autonomous vehicle, drone, or the like.
  • An unmanned vehicle or drone can refer to a vehicle capable of achieving a certain level of driving automation.
  • Exemplary levels of driving automation may include: Level 1, where the vehicle is primarily supervised by a human and has certain autonomous functions (e.g., autonomous steering or acceleration); Level 2, where the vehicle has one or more advanced driver assistance systems (ADAS) (e.g., adaptive cruise control, lane keeping systems) that can control braking, steering, and/or acceleration of the vehicle; Level 3, where the vehicle is capable of driving itself when one or more specific conditions are met; Level 4, where the vehicle can operate without human input or supervision but is still subject to certain constraints (e.g., restricted to a certain area); Level 5, where the vehicle operates autonomously in all situations; or any combination thereof.
  • Vehicle 110 may also be a car or other vehicle that is driven under human control for the purpose of collecting data.
  • the vehicle 110 may have an equivalent structure that enables the vehicle 110 to move around or fly.
  • vehicle 110 may include conventional vehicle structures such as a chassis, suspension, steering (eg, steering wheel), braking (eg, brake pedal), accelerator, and the like.
  • vehicle 110 may have a body and at least one wheel.
  • the body can be any body type, such as a sports car, coupe, sedan, pickup truck, station wagon, sport utility vehicle (SUV), minivan, or converted van.
  • The at least one wheel may be configured for all-wheel drive (AWD), front-wheel drive (FWD), rear-wheel drive (RWD), or the like.
  • the vehicle 110 may be an electric vehicle, a fuel cell vehicle, a hybrid vehicle, a conventional internal combustion engine vehicle, or the like.
  • the vehicle 110 can sense its environment and navigate using one or more detection units 112 .
  • The detection unit 112 may include a global positioning system (GPS) module, a radar (e.g., light detection and ranging (LiDAR)), an inertial measurement unit (IMU), a camera, etc., or any combination thereof.
  • a GPS module may refer to a device capable of receiving geographic location and time information from GPS satellites and calculating its geographic location.
  • An IMU can refer to an electronic device that uses various inertial sensors to measure and provide the vehicle's specific force, angular velocity, and sometimes the magnetic field surrounding the vehicle.
  • the various inertial sensors may include acceleration sensors (eg, piezoelectric sensors), velocity sensors (eg, Hall sensors), distance sensors (eg, radar, LIDAR, infrared sensors), rotational angle sensors (eg, tilt sensors), Traction related sensors (eg, force sensors).
  • the camera may be configured to acquire one or more images related to objects within the camera range (eg, people, animals, trees, roadblocks, buildings, or vehicles).
  • server 120 may be a single server or a group of servers.
  • the server group may be centralized or distributed (eg, server 120 may be a distributed system).
  • server 120 may be local or remote.
  • server 120 may access information and/or data stored in terminal device 130 , detection unit 112 , vehicle 110 , storage device 140 , and/or positioning and navigation system 160 via network 150 .
  • the server 120 may be directly connected to the terminal device 130 , the detection unit 112 , the vehicle 110 and/or the storage device 140 to access stored information and/or data.
  • server 120 may be implemented on a cloud platform or on-board computer.
  • The cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, etc., or any combination thereof.
  • server 120 may execute on a computing device that includes one or more components.
  • server 120 may include processing device 122 .
  • Processing device 122 may process information and/or data to perform one or more of the functions described herein. For example, the processing device 122 may establish corresponding target data sets according to the type information of the data elements in the original data set, and store each data element in its corresponding target data set. Further, the processing device 122 may store the target data sets including the data elements in the storage device 140 or another storage device or system. As another example, the processing device 122 may create a query index for data stored in the storage device 140 or other storage devices or systems. Specifically, the original data may include data generated by multiple vehicles during road tests, such as camera data and radar data.
  • The processing device 122 may build a query index based on each vehicle's trip ID, the time range, and the types of the data elements in the original data.
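  • A query index keyed by trip ID and data type, with a time range per stored segment, might look like the following sketch. The key layout, the segment paths, and the functions `register` and `lookup` are hypothetical illustrations rather than the structure used by the processing device 122.

```python
from collections import defaultdict

# (trip_id, type_name) -> list of (start_time, end_time, path) segments
query_index = defaultdict(list)


def register(trip_id, type_name, start, end, path):
    """Record that `path` holds `type_name` data of `trip_id` for [start, end]."""
    query_index[(trip_id, type_name)].append((start, end, path))


def lookup(trip_id, type_name, start, end):
    """Paths of stored segments that overlap the closed range [start, end]."""
    return [
        path
        for seg_start, seg_end, path in query_index.get((trip_id, type_name), [])
        if seg_start <= end and seg_end >= start
    ]


register("trip-1", "camera", 0, 60, "trip-1/camera/0.bin")
register("trip-1", "camera", 60, 120, "trip-1/camera/1.bin")
register("trip-1", "radar", 0, 60, "trip-1/radar/0.bin")

hits = lookup("trip-1", "camera", 70, 90)
```

  • Keying on the pair of trip ID and type keeps each lookup limited to the segments of one vehicle trip and one data type.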
  • the processing device 122 may include one or more processing engines (eg, a single-chip processing engine or a multi-chip processing engine).
  • The processing device 122 may include one or more hardware processors, such as a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction set computer (RISC), a microprocessor, etc., or any combination thereof.
  • the processing device 122 may be integrated in the terminal device 130 .
  • the terminal device 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a vehicle built-in device 130-4, 130-5, etc., or any combination thereof.
  • the mobile device 130-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof.
  • smart home devices may include smart lighting devices, control devices for smart appliances, smart monitoring devices, smart TVs, smart cameras, walkie-talkies, etc., or any combination thereof.
  • the wearable device may include a smart bracelet, smart footwear, smart glasses, smart helmets, smart watches, smart clothing, smart backpacks, smart accessories, etc., or any combination thereof.
  • the smart mobile device may include a smart phone, a personal digital assistant (PDA), a gaming device, a navigation device, a POS device, etc., or any combination thereof.
  • the virtual reality device and/or the augmented reality device may include a virtual reality headset, virtual reality glasses, virtual reality goggles, augmented reality helmet, augmented reality glasses, augmented reality goggles, etc. or any combination thereof.
  • Virtual reality devices and/or augmented reality devices may include Google Glass, Oculus Rift, HoloLens, Gear VR, and the like.
  • vehicle built-in devices 130-4 may include an on-board computer, on-board television, and the like.
  • the server 120 may be integrated into the terminal device 130 .
  • the terminal device 130 may include a device with a positioning function to determine the location of the user and/or the terminal device 130 .
  • Storage device 140 may store data and/or instructions.
  • storage device 140 may store data obtained from vehicle 110 , detection unit 112 , processing device 122 , terminal device 130 , positioning and navigation system 160 , and/or external devices.
  • The storage device 140 may store road test data obtained from the vehicle 110.
  • Storage device 140 may store data and/or instructions that may be executed or used to perform the example methods described in this application.
  • Storage device 140 may store instructions that processing device 122 may execute to store and/or call road test data.
  • storage device 140 may include mass storage, removable storage, volatile read-write memory, read-only memory (ROM), the like, or any combination thereof.
  • Exemplary mass storage may include magnetic disks, optical disks, solid state drives, and the like.
  • Exemplary removable storage may include flash drives, floppy disks, optical disks, memory cards, magnetic disks, tapes, and the like.
  • Exemplary volatile read-write memory may include random access memory (RAM).
  • Exemplary RAM may include dynamic random access memory (DRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), static random access memory (SRAM), thyristor random access memory (T-RAM), zero-capacitor random access memory (Z-RAM), etc.
  • Exemplary read-only memory may include mask read-only memory (MROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM), digital versatile disc read-only memory, etc.
  • The storage device 140 may further include a distributed file system (e.g., the Hadoop Distributed File System, HDFS).
  • The distributed file systems may be located in different regions (e.g., different countries, different areas, or different sites) and associated with each other.
  • The distributed file system may include a first distributed file system and a second distributed file system, where the server of the first distributed file system belongs to a first region and the server of the second distributed file system belongs to a second region.
  • The road test data is collected in the first region and stored in the first distributed file system according to any method shown in the embodiments of this application.
  • The client is located in the second region, and the distance between the client and the server of the second distributed file system is smaller than the distance between the client and the server of the first distributed file system.
  • The user may call at least part of the data stored in the first distributed file system through any method shown in the embodiments of this application.
  • The data processing system 100 can synchronize this at least part of the data from the first distributed file system to the second distributed file system, and the user can then call it from the second distributed file system using any method shown in the embodiments of this application.
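  • The two-region arrangement can be sketched with plain dictionaries standing in for the two distributed file systems. The class `TwoRegionStore` and its on-demand synchronization policy are assumptions made for illustration; the application only states that at least part of the data is synchronized to the nearer system before it is called.

```python
class TwoRegionStore:
    """Hypothetical wrapper around a far (first) and a near (second) store."""

    def __init__(self, far, near):
        self.far = far        # first distributed file system, where data is collected
        self.near = near      # second distributed file system, closer to the client
        self.sync_count = 0   # number of transfers from the far store

    def call(self, key):
        """Serve `key` from the near store, synchronizing it first if missing."""
        if key not in self.near:
            self.near[key] = self.far[key]
            self.sync_count += 1
        return self.near[key]


store = TwoRegionStore(far={"image/0": b"frame-0"}, near={})
first = store.call("image/0")   # triggers one synchronization
second = store.call("image/0")  # served from the near store
```

  • Repeated calls for the same data then cross the long-distance link only once.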
  • the storage device 140 may execute on a cloud platform.
  • The cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, etc., or any combination thereof.
  • storage device 140 may be connected to network 150 for communication with one or more components in data processing system 100 (eg, server 120 , terminal device 130 , detection unit 112 , vehicle 110 , and/or positioning and navigation system 160 ). ) communication.
  • One or more components in data processing system 100 may access data or instructions stored in storage device 140 via network 150 .
  • storage device 140 may be directly connected to, or communicate with, one or more components in data processing system 100 (eg, server 120, terminal device 130, detection unit 112, vehicle 110, and/or positioning and navigation system 160).
  • storage device 140 may be part of server 120 .
  • the storage device 140 may be integrated in the vehicle 110 .
  • Network 150 may facilitate the exchange of information and/or data.
  • one or more components in data processing system 100 (eg, server 120, terminal device 130, detection unit 112, vehicle 110, storage device 140, and/or positioning and navigation system 160) may send information and/or data to, and/or obtain information and/or data from, other components in data processing system 100 via network 150.
  • processing device 122 may obtain drive test data from vehicle 110 via network 150 .
  • the processing device 122 may obtain the data call request input by the user from the terminal device 130 via the network 150 .
  • the network 150 may be a wired network or a wireless network, or the like, or any combination thereof.
  • the network 150 may include a cable network, a wired network, a fiber optic network, a telecommunications network, an internal network, the Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a Public Switched Telephone Network (PSTN), a Bluetooth network, a ZigBee network, a Near Field Communication (NFC) network, etc., or any combination thereof.
  • network 150 may include one or more network access points.
  • network 150 may include wired or wireless network access points (eg, base stations and/or Internet exchange points 150-1, 150-2) through which one or more components of data processing system 100 may connect to network 150 to exchange data and/or information.
  • Positioning and navigation system 160 may determine information associated with objects, eg, terminal device 130, vehicle 110, and the like.
  • the positioning and navigation system 160 may be a Global Positioning System (GPS), Global Navigation Satellite System (GLONASS), Compass Navigation System (COMPASS), Beidou Navigation Satellite System, Galileo Positioning System, Quasi-Zenith Satellite System (QZSS), etc.
  • the information may include the object's position, height, velocity or acceleration, current time, and the like.
  • Positioning and navigation system 160 may include one or more satellites, such as satellite 160-1, satellite 160-2, and satellite 160-3. The satellites 160-1 to 160-3 may independently or collectively determine the above information.
  • the positioning and navigation system 160 may transmit the above-mentioned information to the network 150 , the terminal device 130 or the vehicle 110 via a wireless connection.
  • an element may execute through electrical and/or electromagnetic signals.
  • the processor of terminal device 130 may generate an electrical signal that encodes the request.
  • the processor of the terminal device 130 may then transmit the electrical signal to the output port.
  • the output port may be physically connected to a cable, which may also transmit electrical signals to the input port of the server 120 .
  • the output port of the terminal device 130 may be one or more antennas that convert electrical signals into electromagnetic signals.
  • in an electronic device such as terminal device 130 and/or server 120, when its processor processes instructions, issues instructions, and/or performs actions, the instructions and/or actions are performed through electrical signals.
  • when a processor retrieves data from or saves data to a storage medium (eg, storage device 140), it can send electrical signals to a read/write device of the storage medium, which can read structured data from or write structured data to the storage medium.
  • the structured data can be sent to the processor in the form of electrical signals over the bus of the electronic device.
  • an electrical signal may refer to one electrical signal, a series of electrical signals, and/or a plurality of discrete electrical signals.
  • FIG. 2 is a block diagram of an exemplary processing device shown in accordance with some embodiments of the present application.
  • processing device 122 may be used for data storage.
  • the processing device 122 may include an original data set acquisition module 210 , a target data set establishment module 220 , an index establishment module 230 and a storage module 240 .
  • the raw data set obtaining module 210 can be used to obtain a raw data set, the raw data set includes a plurality of data elements, and each data element has type information marking the type of the data element.
  • a data set may refer to a collection that includes a plurality of data elements.
  • the dataset may be a file, and the data elements of the file are messages.
  • different data sets and/or data elements in the same data set may have respective identification information.
  • the raw data set obtaining module 210 may obtain the raw data set (ie, drive test data) from a test vehicle (eg, vehicle 110 ) via the network 150 .
  • the drive test data may be a message with a temporal nature.
  • the original data set acquisition module 210 may organize the messages collected by a test vehicle during a test trip into a file (for example, a bag file) for storage to obtain an original data set.
  • the original data set acquisition module 210 may also use identification information related to the test vehicle and the itinerary as the identification information of the file.
  • the identification information of the file may be set according to the id of the test vehicle and the id of the test itinerary.
  • the original data set obtaining module 210 may further use the time information of the message as the identification information of the message in the file.
  • the identification information of the message in the file may be set according to the timestamp of the message.
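  • as an illustrative sketch only (not the format used by the application), the identification scheme described above could be modeled as follows. The class names, the `file_id` naming rule, and the underscore separator are assumptions introduced for illustration; the application only states that the file is identified from the vehicle id and trip id, and each message from its timestamp.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    timestamp: float   # time information, used as the message's identification
    type_info: str     # type of the data element, e.g. "image" or "location"
    payload: bytes


@dataclass
class RawDataSet:
    vehicle_id: str    # id of the test vehicle
    trip_id: str       # id of the test trip
    messages: List[Message] = field(default_factory=list)

    @property
    def file_id(self) -> str:
        # Hypothetical naming rule: vehicle id and trip id joined by "_".
        return f"{self.vehicle_id}_{self.trip_id}"

    def add(self, message: Message) -> None:
        # Keep messages ordered by timestamp, since they are temporal.
        self.messages.append(message)
        self.messages.sort(key=lambda m: m.timestamp)


raw = RawDataSet("car7", "trip42")
raw.add(Message(2.0, "image", b"frame"))
raw.add(Message(1.0, "location", b"fix"))
print(raw.file_id, [m.timestamp for m in raw.messages])
```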
  • each data element in the original data set has type information that identifies the type of the data element.
  • the types may include one or more of an image class, a location class, a sensor class, a packet class, and a Controller Area Network Bus (CAN Bus) class.
  • the target data set establishment module 220 may be configured to establish different target data sets according to the type information of the data elements in the original data set.
  • Each type of data element can correspond to a target data set.
  • if the number of types of data elements in the original data set is N, N different target data sets can be established, each corresponding to a different type of data element, where N is an integer greater than or equal to 2.
  • the target data set establishment module 220 may identify the type of each data element in the original data set and determine the number of different types. Further, the target data set establishment module 220 may establish different target data sets corresponding to different types of data elements.
  • the type of data element can be represented by the type of the device that acquires the drive test data.
  • the devices may include cameras, radars, inertial measurement units (IMUs), and the like.
  • the raw data may include camera data, radar data, IMU data, and the like.
  • the different target datasets may include camera-type target datasets, radar-type target datasets, and IMU-type target datasets.
  • the type of a data element can be expressed by a data type.
  • the data types may include audio data, image data, text data, and the like.
  • the different target data sets may include audio target data sets, image target data sets, text target data sets, and the like.
  • the target data set establishment module 220 may further set the identification information of the target data set according to the corresponding type of the target data set, so as to identify different types of target data sets. Since the identification information and type information of the target data set are corresponding, in some embodiments, the identification information may include type information common to all data elements in the same target data set.
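  • the type-based split described above can be sketched as follows. This is a minimal illustration, assuming each data element is available as a `(type_info, payload)` pair; the function name `build_target_datasets` and the use of the type string as the target data set's identification are hypothetical choices, not taken from the application.

```python
from collections import defaultdict


def build_target_datasets(raw_elements):
    """Group (type_info, payload) pairs into one target data set per type.

    The dictionary key doubles as the identification information of the
    target data set, since identification and type correspond.
    """
    targets = defaultdict(list)
    for type_info, payload in raw_elements:
        targets[type_info].append(payload)
    return dict(targets)


raw_elements = [
    ("camera", b"frame0"),
    ("radar", b"scan0"),
    ("camera", b"frame1"),
    ("imu", b"imu0"),
]
targets = build_target_datasets(raw_elements)
print(sorted(targets))  # the N distinct types found in the original data set
```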
  • the index building module 230 may build an index of the original data set to provide an indexing function for data elements in the original data set.
  • the index building module 230 may determine the meta-service information of the original data set, which may also be referred to as meta-information.
  • the meta-service information may be used to describe the structure, semantics, purpose, usage, etc. of the original data set or data elements in the original data set.
  • the meta-service information may also be referred to as index information or include index information for determining the storage location of the original data set or data elements in the original data set in the storage device.
  • the meta-service information may at least include meta-identification information (eg, timestamp) and storage location information (eg, offset) corresponding to each data element in the target data set.
  • the meta identification information may refer to identification information (eg, timestamp) of the corresponding data element.
  • the meta-service information may further include identification information of the target data set, where the identification information of the target data set corresponds to the type.
  • the meta-service information of the target data set may further include set identification information of the original data set corresponding to each data element in the target data set (eg, the id of the test vehicle and/or the id of the test trip, etc.).
  • indexing module 230 or storage module 240 may store the meta-service information in a storage device, eg, storage module 240, storage device 140, or other storage device. Processing device 122 (eg, calling module 320) can access the storage device based on the user's data invocation request, and further locate the data element corresponding to the user's data invocation request based on the meta-service information and the identification information corresponding to the target data set (for example, an image target data set), that is, determine the storage location of the data element in the storage device.
  • the index building module 230 may build meta-service information for each target dataset separately. The meta-service information of each target data set is stored in a storage device in the form of a list, for example, the storage device 140 or other storage devices.
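  • a minimal sketch of such a per-data-set list of meta-service information is given below, assuming the data elements of one target data set are written back to back into a single byte blob. The length-prefixed layout, the dictionary keys, and the function names are assumptions for illustration; the application only specifies that each entry carries meta identification information (eg, timestamp) and storage location information (eg, offset).

```python
import io
import struct


def store_target_dataset(elements, set_id):
    """Write (timestamp, payload) pairs and return (blob, meta_list).

    Each meta entry records the timestamp, the byte offset of the element
    in the blob, and the id of the original data set it came from.
    """
    buf = io.BytesIO()
    meta = []
    for timestamp, payload in elements:
        offset = buf.tell()
        buf.write(struct.pack("<I", len(payload)))  # 4-byte length prefix (assumed layout)
        buf.write(payload)
        meta.append({"set_id": set_id, "timestamp": timestamp, "offset": offset})
    return buf.getvalue(), meta


def read_element(blob, offset):
    """Read back one data element from its recorded storage location."""
    (length,) = struct.unpack_from("<I", blob, offset)
    return blob[offset + 4 : offset + 4 + length]


blob, meta = store_target_dataset([(1.0, b"frame0"), (2.0, b"frame1")], "car7_trip42")
print(meta[1], read_element(blob, meta[1]["offset"]))
```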
  • the storage module 240 may be configured to store the data elements corresponding to the target data set in the corresponding target data set based on the type information of the data elements in the original data set and the target data set. In some embodiments, the storage module 240 may determine the type of each data element in the original data set and store the data element in the corresponding target data set. For example, if the data element is an image-type data element, the processing device 122 may store the data element in an image-type target dataset. Further, the storage module 240 may also store the target dataset (eg, an image-type target dataset) and the data elements stored therein in a storage device, eg, the storage device 140 (eg, a distributed file system).
  • the storage module 240 may store the data elements of the same original data set in a physically contiguous memory space, or may store them in physically non-contiguous memory space, with the non-contiguously stored data elements linked to one another by pointers.
  • the storage module 240 may store the data elements in the target dataset in a distributed file system (HDFS), which may be physical storage.
  • the storage module 240 may also store meta-service information corresponding to the target dataset (eg, an image-type target dataset) in a storage device, eg, the storage device 140 (eg, a distributed file system).
  • the meta-service information may point to the target data set through a pointer. The user can locate the target dataset through the meta-service information.
  • the storage module 240 may store the meta-service information corresponding to the target dataset in the form of a list.
  • the storage device used to store the target dataset and the data elements therein may be the same as or different from the storage device used to store the meta-service information.
  • the above description of the processing device 122 and its modules is only for convenience of description, and cannot limit the present application to the scope of the illustrated embodiments. It can be understood that for those skilled in the art, after understanding the principle of the system, various modules may be combined arbitrarily, or a subsystem may be formed to connect with other modules without departing from the principle.
  • the original data set acquisition module 210, the target data set establishment module 220, the index establishment module 230, and the storage module 240 disclosed in FIG. 2 may be different modules in a system, or may be one module that implements the functions of two or more of the above modules.
  • the storage module 240 and the index building module 230 may be two independent modules, or one module may have the functions of data storage, index building and caching at the same time. Such variations are all within the protection scope of the present application.
  • FIG. 3 is a block diagram of another exemplary processing device shown in accordance with some embodiments of the present application.
  • processing device 122 may be used to invoke data.
  • the processing device 122 may include a user request obtaining module 310 and a calling module 320 .
  • the user request acquisition module 310 may be configured to acquire a user's data invocation request, where the data invocation request at least includes the type of the data to be invoked.
  • the data call request may be input by the user through a mobile device (eg, an input/output interface of the terminal device 130 ) or a computing device.
  • the input/output interface of the terminal device 130 may include an input device, such as a keyboard, a mouse, a touch screen, a microphone, a trackball, etc., or any combination thereof, and the user may use the input device to input the data calling request.
  • the user request acquisition module 310 may acquire the data call request (eg, via the network 150).
  • the data includes drive test data.
  • the data invocation request may also include information such as the time range information of the drive test data collection, the id of the test vehicle and/or the id of the test trip.
  • the calling module 320 may be configured to obtain the data elements in the target data set of the corresponding type based on the data calling request, and obtain the data to be called.
  • the calling module 320 may include an index information obtaining unit 322 , a segment storage unit 323 , and a data element obtaining unit 324 .
  • the index information obtaining unit 322 may be configured to obtain index information corresponding to the data calling request based on the data calling request.
  • the data element obtaining unit 324 may be configured to obtain the data element based on the index information.
  • data processing system 100 may provide an indexing mechanism.
  • the indexing module 230 may determine meta-service information for the target dataset, which may be stored in a storage device.
  • the index information obtaining unit 322 may obtain the user's index request information from the data calling request.
  • the data element obtaining unit 324 may match the index request information in the data invocation request with the meta-service information stored in the storage device, thereby determining the index information (or meta-service information) that matches the index request information in the data invocation request, so as to obtain data elements from the storage location pointed to by the index information or the meta-service information.
  • the data invocation request at least includes the type of the data to be invoked.
  • the data to be called may be of one or more types. Since the identification information and type information of the target data set are corresponding, the index information obtaining unit 322 can access the meta-service information based on one or more types (that is, user index request information) selected by the user included in the data call request, and determine the index information in the meta-service information that matches the one or more types. Further, the data element obtaining unit 324 may determine the storage location of the corresponding target data set based on the index information, so as to call the data element in the corresponding target data set.
  • the data invocation request may further include more filter conditions related to the data to be invoked.
  • the index information may include at least the meta identification information (eg, timestamp) and the storage location information of each data element in the target data set, in one-to-one correspondence.
  • the data invocation request may further include meta-qualification conditions (eg, time range) related to the meta-identification information for the data to be invoked.
  • the index information obtaining unit 322 may access the meta-service information based on the data call request, and determine the index information in the meta-service information that corresponds to the one or more types and satisfies the meta-qualifying condition.
  • the data element obtaining unit 324 may obtain the data element according to the storage location corresponding to the index information.
  • the index information may further include set identification information of the original data set corresponding to each data element in the target data set, wherein the set identification information refers to the identification information of the original data set.
  • the data invocation request may further include a set qualification related to the set identification information for the data to be invoked.
  • the index information obtaining unit 322 may access meta-service information based on the data call request, and determine index information in the meta-service information that corresponds to the one or more types and satisfies the meta-qualifying condition and/or the set-qualifying condition. Further, the data element obtaining unit 324 may obtain the data element according to the storage location corresponding to the index information.
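  • the matching of a data invocation request against the meta-service information can be sketched as a simple filter. This is an illustration under assumptions: the meta-service information is modeled as a dictionary from type to a list of entries, the meta-qualifying condition is an inclusive time range, and the set-qualifying condition is an exact match on a hypothetical `set_id` string; none of these representations is prescribed by the application.

```python
def find_index_entries(meta_service, types, time_range=None, set_id=None):
    """Return (type, offset) pairs whose entries satisfy the request.

    meta_service: {type_info: [{"set_id", "timestamp", "offset"}, ...]}
    types:        types of the data to be called (at least one)
    time_range:   optional (start, end) meta-qualifying condition, inclusive
    set_id:       optional set-qualifying condition
    """
    start, end = time_range if time_range else (float("-inf"), float("inf"))
    matches = []
    for type_info in types:
        for entry in meta_service.get(type_info, []):
            if not (start <= entry["timestamp"] <= end):
                continue
            if set_id is not None and entry["set_id"] != set_id:
                continue
            matches.append((type_info, entry["offset"]))
    return matches


meta_service = {
    "image": [
        {"set_id": "car7_trip42", "timestamp": 5.0, "offset": 0},
        {"set_id": "car7_trip42", "timestamp": 15.0, "offset": 128},
        {"set_id": "car9_trip01", "timestamp": 6.0, "offset": 256},
    ],
    "radar": [{"set_id": "car7_trip42", "timestamp": 7.0, "offset": 0}],
}
print(find_index_entries(meta_service, ["image"], (0.0, 10.0), "car7_trip42"))
```

  • the returned offsets play the role of the storage location information from which the data element obtaining unit would then read the data elements.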
  • the data may include drive test data.
  • the user's data invocation request may include index request information such as the type of the drive test data collection device, the time range information of the drive test data collection, the id of the test vehicle and/or the id of the test trip.
  • the index request information determined by the data calling request may include at least the id of the test vehicle and/or the id of the test trip, the type of the data to be called, the time range, the data time length, and the like.
  • the segment storage unit 323 may be configured to obtain, based on the meta-service information matched with the user's data invocation request, the data elements stored in the target data set from a storage device (eg, a distributed file system) in which the target data set is stored. Further, based on the time information (eg, timestamp) identified by the data elements, that is, the time information corresponding to the target data set and the time information in the user data call request, the segment storage unit 323 may divide the target data set into multiple target data subsets according to a preset time interval. Each target data subset corresponds to a time interval, and the data elements obtained in each time interval (eg, every 10 s) are stored in the corresponding target data subset (also called a physical data file).
  • the target data set is divided into multiple target data subsets at preset time intervals. For example, if the time length of the data element corresponding to each target data set is 100 seconds, and the time length in the user data call request is 20 seconds, the segment storage unit 323 may divide the target data set into 10 target data subsets, Each target data subset corresponds to 10 seconds of data elements.
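  • the division in the example above (100 seconds of data, 10-second subsets) can be sketched as follows. The function name and the representation of a data element as a `(timestamp, payload)` pair are assumptions for illustration; subsets are keyed by their interval number counted from the earliest timestamp.

```python
def split_into_subsets(elements, interval):
    """Divide (timestamp, payload) pairs into subsets of `interval` seconds.

    Returns {interval_number: [elements]}, where interval 0 starts at the
    earliest timestamp in the target data set.
    """
    if not elements:
        return {}
    start = min(timestamp for timestamp, _ in elements)
    subsets = {}
    for timestamp, payload in elements:
        index = int((timestamp - start) // interval)
        subsets.setdefault(index, []).append((timestamp, payload))
    return subsets


# One message per second over 100 seconds, split at a 10-second interval.
elements = [(float(t), f"msg{t}") for t in range(100)]
subsets = split_into_subsets(elements, interval=10)
print(len(subsets), len(subsets[0]))
```

  • with this split, a request covering 20 seconds only needs the two subsets whose intervals it spans, rather than the whole target data set.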
  • the time range here may refer to the time range in which the data elements are collected.
  • the segment storage unit 323 may store each target data subset and the data elements stored therein in the memory of the processing device 122 by means of physical storage. After the user completes the invocation of some data elements in the target data set, the target data subset and the data elements stored therein can be erased. Further, the data element obtaining unit 324 may obtain, from the storage device, data elements in the target data subsets matching the time information in each time interval based on the time information of the user data call request.
  • the processing device 122 may distribute the data based on the above method to the user's terminal (also referred to as the user terminal, eg, terminal device 130).
  • the segment storage unit 323 may determine the target data set pointed to by the meta-service information based on the meta-service information matching the user's data invocation request, and, based on the time information (eg, time range) corresponding to the target data set and the time information of the user data invocation request, establish multiple logical files corresponding to the target data set. For example, when the location of the client is not in the same region (eg, city or country) as the storage device storing the original data set (referred to as the first storage device, such as a distributed file system), while the processing device 122 and the first storage device are in the same region (eg, city or country), the processing device 122 can create multiple logical files corresponding to the target dataset.
  • the processing device 122 may further send the target data set and its stored data elements to a second storage device located in the same region (eg, city or country) as the client.
  • the server where the second storage device is located may divide the received target data set (physical data file) into multiple target data subsets at preset time intervals, and store the data elements in the corresponding target data subsets respectively.
  • the plurality of logical files point to the target data subset in the second storage device by way of pointers.
  • the data element obtaining unit 324 can determine the logical file matching the user invocation request by matching the time information in the user data invocation request with the time information in each logical file, and, based on the target data subset in the second storage device pointed to by the matched logical file, instruct the server of the second storage device to send the data elements in the matched target data subset to the client. Further, the server of the second storage device may combine the data elements in the multiple target data subsets and send them to the client.
  • the logical file does not store the data elements of the target data subset; instead, it may store information about the data elements (eg, part of the meta-service information).
  • Logical files can point to physical data files (ie, target data subsets) by way of pointers.
  • the target data set can be divided into multiple target data subsets according to preset time intervals, each target data subset corresponds to a time interval, and a logical file can be created for each target data subset.
  • Each logical file includes meta-service information of data elements stored in each target data subset.
  • the user may locate the target data subset in the second storage device by matching the time information in the user call request against the meta-service information in the logical file.
  • the logical file and the physically stored target data subset may be linked by a pointer.
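  • the relationship between logical files and physical target data subsets can be sketched as follows. This is an illustration under assumptions: the "pointer" is modeled as a path string, the path pattern `part-0000` and the region prefix are hypothetical, and each logical file covers a half-open interval `[start, end)`.

```python
from dataclasses import dataclass


@dataclass
class LogicalFile:
    start: float       # inclusive start of the covered time interval
    end: float         # exclusive end of the covered time interval
    subset_path: str   # "pointer" to the physical target data subset (hypothetical path)


def make_logical_files(total_start, total_end, interval, prefix):
    """Create one logical file per preset time interval."""
    files = []
    t, i = total_start, 0
    while t < total_end:
        files.append(LogicalFile(t, min(t + interval, total_end), f"{prefix}/part-{i:04d}"))
        t += interval
        i += 1
    return files


def match_logical_files(files, req_start, req_end):
    """Select logical files whose interval overlaps the requested time range."""
    return [f for f in files if f.start < req_end and f.end > req_start]


files = make_logical_files(0.0, 100.0, 10.0, "/region2/image/car7_trip42")
for f in match_logical_files(files, 20.0, 40.0):
    print(f.subset_path)
```

  • only the small logical files need to be consulted to resolve a request; the data elements themselves are then read from the physical subsets the matched files point to.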
  • the second storage device may store each data element in the same target sub-data set in a physically contiguous memory space, or may store it in a physically non-contiguous memory space.
  • the second storage device may store the data elements in the target sub-dataset in a distributed file system (HDFS), which may be physical storage.
  • the user can call the data actually corresponding to the specified time period (that is, the physically stored target data subset) from the second storage device located in the same area as the user terminal by accessing the logical file in the first storage device, so as to realize the function of quickly calling part of the data.
  • the segment storage unit 323 may set the time interval according to the minimum value of the time period specified by the user for the data to be called, so as to ensure, as far as possible, that the target data subsets called according to the time period are consistent with the data actually corresponding to that time period.
  • the segment storage unit 323 may directly set the minimum value of the time period specified by the user for the data to be called as the time interval.
  • the segment storage unit 323 may determine and acquire the target data set from the storage device storing the original data set based on the time information (eg, timestamp) identified by the data element, and divide the target data set into multiple target data subsets according to a preset time interval, with each target data subset stored separately.
  • the processing device 122 does not need to send the data in the entire target data set to the client; it only needs to send the data elements corresponding to the time range information in the user's data call request (that is, the data elements in the target data subset), so as to realize the function of quickly calling part of the data and improve the efficiency of data calling.
  • the above description of the processing device 122 and its modules is only for convenience of description, and cannot limit the present application to the scope of the illustrated embodiments. It can be understood that for those skilled in the art, after understanding the principle of the system, various modules may be combined arbitrarily, or a subsystem may be formed to connect with other modules without departing from the principle.
  • the user request acquisition module 310 and the invocation module 320 disclosed in FIG. 3 may be different modules in a system, or may be a module that implements the functions of the above-mentioned two or more modules.
  • the user request obtaining module 310 and the calling module 320 may be two modules, or one module may have the functions of obtaining user requests and calling data at the same time. Such variations are all within the protection scope of the present application.
  • the system and its modules shown in Figures 2 and 3 may be implemented in various ways.
  • the system and its modules may be implemented in hardware, software, or a combination of software and hardware.
  • the hardware part can be realized by using dedicated logic;
  • the software part can be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware.
  • the methods and systems described above may be implemented using computer-executable instructions and/or embodied in processor control code, for example on a carrier medium such as a disk, CD or DVD-ROM, on a memory such as a read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier.
  • the system and its modules of the present application can be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, etc., or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., but also by, for example, software executed by various types of processors, or by a combination of the above-mentioned hardware circuits and software (eg, firmware).
  • FIG. 4 is an exemplary flowchart of a data storage method according to some embodiments of the present application. As shown in Figure 4, the data storage method may include:
  • Step 410 Obtain an original data set, where the original data set includes a plurality of data elements, and each data element has type information indicating the type of the data element.
  • step 410 may be performed by processing device 122 (eg, raw data set acquisition module 210).
  • a data set may refer to a collection that includes a plurality of data elements.
  • a unit is a dataset.
  • the dataset may be a file, and the data elements of the file are messages.
  • the file may be in a bag format, hereinafter referred to as a "bag file".
  • the data elements in the data set may be associated with each other.
  • different data sets and/or data elements in the same data set may have respective identification information.
  • the drive test data collected by the test vehicle may be a message with a temporal nature (for example, the message may have a time stamp).
  • the processing device 122 may organize the messages collected by a test vehicle during a test trip into a file (eg, a bag file) for storage to obtain an original data set. Further, the processing device 122 may also use identification information related to the test vehicle and the itinerary as the identification information of the file. For example, the identification information of the file may be set according to the id of the test vehicle and the id of the test trip. In some embodiments, the processing device 122 may also use the time information of the message as the identification information of the message in the file. For example, the identification information of the message in the file may be set according to the timestamp of the message.
  • each data element in the original data set has type information that identifies the type of the data element.
  • the types may include one or more of an image class, a location class, a sensor class, a packet class, and a Controller Area Network Bus (CAN Bus) class.
  • the data element can mark the type to which it belongs by carrying type information. That is, a storage unit with a size capable of accommodating the data element and its type information can be allocated to the data element, and the data element and its type information can be organized together according to preset rules for storage.
  • the data element and the corresponding type information may be connected by a preset connection symbol, one side of the connection symbol is the data element, and the other side of the connection symbol is the type information of the data element.
  • a storage unit for storing a data element and its type information may be divided into at least two partitions, including a first partition for storing the data element itself and a second partition for storing the type information of the data element.
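  • As a minimal sketch of the connection-symbol organization described above, the following Python example joins a data element and its type information into one stored record and splits it again. The connector byte, the function names `pack_element` and `unpack_element`, and the choice to place the type information first are assumptions made for illustration; the application does not specify them.

```python
from typing import Tuple

# Hypothetical connection symbol; the application does not name one.
CONNECTOR = b"|"


def pack_element(type_info: str, element: bytes) -> bytes:
    """Join type information and a data element with the connection symbol."""
    encoded_type = type_info.encode("utf-8")
    if CONNECTOR in encoded_type:
        raise ValueError("type information must not contain the connector")
    # Type information sits on one side of the connector, the element on the other.
    return encoded_type + CONNECTOR + element


def unpack_element(record: bytes) -> Tuple[str, bytes]:
    """Split a stored record back into (type information, data element)."""
    # Split on the first connector only, so the element itself may contain the byte.
    encoded_type, element = record.split(CONNECTOR, 1)
    return encoded_type.decode("utf-8"), element


record = pack_element("image", b"\x89PNG|raw-bytes")
type_info, element = unpack_element(record)
```

  • Placing the type information before the connector lets the reader split on the first occurrence only, so arbitrary binary payloads remain intact.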
  • Step 420 Establish different target data sets according to the type information of the data elements in the original data set.
  • Each type of data element can correspond to a target data set.
  • if the number of types of data elements in the original data set is N, N different target data sets can be established, each corresponding to a different type of data element, where N is an integer greater than or equal to 2.
  • step 420 may be performed by processing device 122 (eg, target dataset establishment module 220).
  • processing device 122 may identify the type of each data element in the original data set and determine the number of different types.
  • the original data set may include three types of data elements: image type, position type, and speed type.
  • the processing device 122 may determine that the number of types of data elements in the original data set is three.
  • the processing device 122 may establish three different target data sets corresponding to different types of data elements.
  • the three different target datasets may be an image class target dataset, a location class target dataset, and a speed class target dataset.
  • the types of data elements may be divided according to the type of the device that acquires the drive test data.
  • the devices may include cameras, radars, inertial measurement units (IMUs), and the like.
  • the raw data may include camera data, radar data, IMU data, and the like.
  • the processing device 122 may establish different target data sets corresponding to different types of data elements.
  • the different target datasets may be camera-type target datasets, radar-type target datasets, and IMU-type target datasets.
  • the processing device 122 may further set the identification information of the target data set according to the type corresponding to the target data set, so as to identify different types of target data sets. Since the identification information and type information of the target data set are corresponding, in some embodiments, the identification information may include type information common to all data elements in the same target data set.
  • the processing device 122 may store a target dataset in one file and store identification information corresponding to the target dataset in another file, and the target dataset and the corresponding identification information can be connected by a preset connection symbol. One side of the connection symbol is the target data set, and the other side of the connection symbol is the identification information of the target data set.
  • Step 430 Based on the type information of the data elements in the original data set and the target data set, store the data elements corresponding to the target data set in the corresponding target data set.
  • step 430 may be performed by processing device 122 (eg, storage module 240).
  • processing device 122 may determine the type of each data element in the original dataset and store the data element in the corresponding target dataset. For example, if the data element is an image class data element, the processing device 122 may store the data element in the image class target dataset.
  • FIG. 5 is a schematic diagram of storing different types of data elements in the original data set in the corresponding target data set according to some embodiments of the present application. As shown in FIG. 5 , the data element types of an original data set include three types A, B, and C, and three target data sets are established corresponding to these three types respectively.
  • The data elements A1, A2, and A3 belonging to type A are stored in the target data set corresponding to type A; the data elements B1 and B2 belonging to type B are stored in the target data set corresponding to type B; and the data elements C1, C2, C3, and C4 belonging to type C are stored in the target data set corresponding to type C.
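  • Steps 420 and 430 can be illustrated with a short Python sketch that reproduces the FIG. 5 example. It assumes, purely for illustration, that the original data set is a list of (type information, data element) pairs; the function name `build_target_datasets` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_target_datasets(original: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Create one target data set per type and store each element in its set."""
    targets: Dict[str, List[str]] = defaultdict(list)
    for type_info, element in original:
        # Elements keep their original relative order inside each target data set.
        targets[type_info].append(element)
    return dict(targets)


# Original data set with interleaved elements of types A, B, and C (as in FIG. 5).
original = [
    ("A", "A1"), ("B", "B1"), ("C", "C1"),
    ("A", "A2"), ("C", "C2"), ("B", "B2"),
    ("C", "C3"), ("A", "A3"), ("C", "C4"),
]
targets = build_target_datasets(original)
```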
  • the processing device 122 may also store the identification information corresponding to the target data set in the storage device as meta-service information.
  • the identification information may point to the image class target dataset through a pointer.
  • the user may locate the image class target dataset based on the meta-service information. For example, a user may input a query request related to the image class target data set through the input/output interface of the terminal device 130, and the processing device 122 may access the storage device based on the query request, thereby determining the location of the image class target data set.
  • each data element in the same original data set may be stored in a physically contiguous memory space, or may be stored in a physically non-contiguous memory space, and the data elements stored in a non-contiguous manner are linked by pointers.
  • the processing device 122 (eg, the index building module 230) may establish index information of the target data set.
  • the index information may include at least one-to-one corresponding meta identification information and storage location information of each data element in the target data set.
  • the meta identification information may refer to the identification information of the corresponding data element.
  • the index information may further include identification information of the target dataset. Based on this, the processing device 122 may determine the location of the target dataset to be invoked based on the identification information of the dataset in the user's data invocation request. In some embodiments, the index information may be stored in a storage device, and the processing device 122 may access the storage device based on the user's data call request, and further locate the data element corresponding to the user's data call request based on the index information in the storage device. In some embodiments, processing device 122 may establish index information for each target data set separately.
  • the meta identification information may include time information of the corresponding data element. Further, in some embodiments, the time information may include a time stamp. Timestamps can be used to uniquely identify when a piece of data was generated (eg, when the data element was collected).
  • the storage location information may include an offset. The offset may refer to the distance between the actual address of the storage unit (eg, the address of the data element) and the segment address of the segment in which it is located (eg, the target data set). For the specific implementation of the index information, reference may be made to FIG. 6 and related descriptions.
  • FIG. 6 is a schematic diagram of index information corresponding to a target dataset according to some embodiments of the present application. As shown in FIG. 6, the type file represents the target data set, the type index file represents the index information of the target data set, Timestamp represents the timestamp, and Offset represents the offset.
  • Each message (indicated by Msg) in the type index file corresponding to the type file points to (links to) a message in the type file, and includes the time stamp and storage location of the pointed message. Based on the established type index file, the corresponding message can be determined by the time stamp. Similarly, it can be understood that by establishing index information, corresponding data elements can be determined by time information.
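  • The relationship between a type file and its type index file can be sketched in Python as follows. The length-prefixed record layout, the in-memory byte buffer standing in for the type file, and the function names are assumptions made for illustration; the application only states that each index entry holds a timestamp and an offset.

```python
import struct
from typing import List, Tuple


def write_type_file(messages: List[Tuple[int, bytes]]) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Serialize messages sorted by timestamp and build (timestamp, offset) index entries."""
    type_file = bytearray()
    index: List[Tuple[int, int]] = []
    for timestamp, payload in messages:
        # The offset is the distance from the start of the type file to this message.
        index.append((timestamp, len(type_file)))
        # Hypothetical layout: 4-byte little-endian length followed by the payload.
        type_file += struct.pack("<I", len(payload)) + payload
    return bytes(type_file), index


def read_message(type_file: bytes, offset: int) -> bytes:
    """Read one message from the type file at the offset given by the index."""
    (length,) = struct.unpack_from("<I", type_file, offset)
    return type_file[offset + 4 : offset + 4 + length]


type_file, index = write_type_file([(100, b"msg1"), (110, b"message2"), (120, b"m3")])
offset_by_timestamp = dict(index)
```

  • With the index in hand, a message is found by timestamp without scanning the type file, for example `read_message(type_file, offset_by_timestamp[110])`.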
  • the system can obtain the time period specified by the user, query the index information containing time information belonging to the time period, and determine the location of the data elements belonging to the time period according to the storage location information corresponding to that time information, so that the corresponding part of the data elements is called according to the time period specified by the user.
  • the data elements in the target data set may be stored contiguously in chronological order. Based on this, for the time period specified by the user for the data to be called, the system can determine the start time and end time of the time period, determine the data element corresponding to the start time (called the "starting data element") and the data element corresponding to the end time (called the "terminating data element") according to the start time, the end time, and the established index information, and then call all the data elements from the starting data element to the terminating data element (that is, all data elements belonging to the time period).
  • the messages in the type file are stored continuously in the order of their respective timestamps. The system can determine, from the type index file, the start offset and end offset corresponding to the start timestamp and end timestamp, then determine the corresponding start message and end message in the type file according to the start offset and end offset, and call all messages from the start message to the end message (eg, Msg3 to Msg5 in FIG. 6).
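  • Because the index entries are ordered by timestamp, the start and end positions can be found with a binary search. The sketch below uses Python's standard `bisect` module; the function name `select_range`, the inclusive treatment of both ends of the time period, and the sample offsets are assumptions for illustration.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple


def select_range(index: List[Tuple[int, int]], start_ts: int, end_ts: int) -> List[int]:
    """Return the offsets of all messages with start_ts <= timestamp <= end_ts.

    index is a list of (timestamp, offset) pairs sorted by timestamp.
    """
    timestamps = [ts for ts, _ in index]
    first = bisect_left(timestamps, start_ts)   # position of the start message
    last = bisect_right(timestamps, end_ts)     # one past the end message
    return [offset for _, offset in index[first:last]]


# Six messages Msg1..Msg6 with timestamps 1..6 and made-up offsets.
index = [(1, 0), (2, 40), (3, 95), (4, 130), (5, 180), (6, 260)]
offsets = select_range(index, 3, 5)  # corresponds to Msg3 through Msg5
```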
  • each data element in the original data set may also be continuously stored in time sequence.
  • the data elements of the same type determined from the original data set can be spliced together in their original order in the original data set, so as to obtain a target dataset in which the data elements are stored consecutively in chronological order.
  • the index information of the target data set may further include set identification information of the original data set corresponding to each data element in the target data set.
  • the set identification information may include the id of the test vehicle and/or the id of the test trip. Based on this, according to the index conditions set by the user for the set identification information, only the data elements in the target dataset that meet the conditions can be called.
  • the index condition may include a range of test vehicle ids, a range of test trip ids, the ids of a specific one or more test vehicles/test trips, etc., or any combination thereof.
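  • A hedged Python sketch of such an index condition follows. The `IndexEntry` fields, the function name `filter_by_set_id`, and the two supported condition forms (a set of specific vehicle ids and an inclusive range of trip ids) are illustrative assumptions rather than the structure used in the application.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class IndexEntry:
    timestamp: int
    offset: int
    vehicle_id: int  # id of the test vehicle (set identification information)
    trip_id: int     # id of the test trip (set identification information)


def filter_by_set_id(
    entries: Iterable[IndexEntry],
    vehicle_ids: Optional[Set[int]] = None,
    trip_id_range: Optional[Tuple[int, int]] = None,
) -> List[IndexEntry]:
    """Keep entries that satisfy every given condition; None means no restriction."""
    selected = []
    for entry in entries:
        if vehicle_ids is not None and entry.vehicle_id not in vehicle_ids:
            continue
        if trip_id_range is not None:
            low, high = trip_id_range
            if not low <= entry.trip_id <= high:
                continue
        selected.append(entry)
    return selected


entries = [
    IndexEntry(100, 0, vehicle_id=1, trip_id=10),
    IndexEntry(101, 64, vehicle_id=2, trip_id=11),
    IndexEntry(102, 128, vehicle_id=1, trip_id=12),
    IndexEntry(103, 192, vehicle_id=3, trip_id=12),
]
hits = filter_by_set_id(entries, vehicle_ids={1, 3}, trip_id_range=(11, 12))
```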
  • when calling data, the user can directly obtain data from the corresponding target data set according to the type of the data to be called, rather than extracting data of some types from the original data set. The calling method is direct and the amount of data called is small, so part of the data that meets the specific needs of users can be called efficiently from a huge amount of data.
  • data elements of the same type in the original data set can be stored in a target data set, and the system only needs to find and access the target data set belonging to the user-specified type, and then the data of the user-specified type can be called from it.
  • each data element in the target data set can be continuously stored in chronological order, and the user can further obtain target data of a specified type within a specified time period.
  • the target data set can be divided into multiple target data subsets at preset time intervals and cached respectively, and the user can obtain only the data elements corresponding to some target data subsets in the multiple target data subsets based on the data call request.
  • the data storage method provided by the embodiment of the present application makes the data calling process simpler and the amount of data access smaller, and can better improve the efficiency of data calling.
  • FIG. 7 is an exemplary flowchart of a data calling method according to some embodiments of the present application. As shown in Figure 7, the data calling method may include:
  • Step 710 Acquire a data invocation request sent by the client, where the data invocation request at least includes the type of the data to be invoked.
  • step 710 may be performed by processing device 122 (eg, user request acquisition module 310).
  • the data call request may be input by the user through a mobile device (eg, an input/output interface of the terminal device 130 ) or a computing device.
  • the input/output interface of the terminal device 130 may include an input device, such as a keyboard, a mouse, a touch screen, a microphone, a trackball, etc., or any combination thereof, and the user may use the input device to input the data calling request.
  • the data invocation request may be further sent (eg, via network 150 ) to processing device 122 and/or other components of data processing system 100 .
  • the mobile device or computing device may provide a data query interface, which may support the user to input filter conditions related to the data to be called.
  • the mobile device or computing device generates a corresponding data call request after acquiring the filter condition input by the user, and sends the data call request to the processing device 122 and/or other components of the data processing system 100 .
  • the data includes drive test data.
  • the data invocation request may also include information such as the time range information of the drive test data collection, the id of the test vehicle and/or the id of the test trip.
  • Step 720 Obtain partial data in the original data based on the data calling request to obtain the data to be called, where the partial data includes data elements in the target data set corresponding to the type to which the data to be called belongs.
  • step 720 may be performed by processing device 122 (eg, calling module 320).
  • data processing system 100 may provide an indexing mechanism.
  • the processing device 122 may determine meta-service information of the target dataset to provide an indexing function for data elements in the target dataset.
  • the meta-service information may be stored in a storage device.
  • the processing device 122 (for example, the index information obtaining unit 322) may obtain the user's index request information from the data call request. Further, the processing device 122 may match the index request information in the data call request against the meta-service information stored in the storage device to determine the index information (or meta-service information) that matches the index request information, and obtain the data element based on the storage location pointed to by the index information or the meta-service information.
  • the data invocation request at least includes the type of the data to be invoked.
  • the data to be called may belong to one or more types.
  • the data query interface provided by the mobile device or computing device can display multiple candidate types. After the user selects one or more types to which the data to be called belongs, the mobile device or computing device generates a data call request including the one or more types selected by the user and sends the data call request to data processing system 100 .
  • the index information may include at least identification information of the target dataset.
  • the processing device 122 can access the meta-service information based on one or more types selected by the user (that is, the user index request information) included in the data call request, and determine the index information corresponding to the one or more types in the meta-service information. Further, the processing device 122 may determine the storage location of the corresponding target data set based on the index information, so as to call the data elements in the corresponding target data set.
  • the data invocation request may further include more filter conditions related to the data to be invoked.
  • the processing device 122 may acquire the data to be invoked that meets the multiple filtering conditions in various ways. For example, in some embodiments, the processing device 122 may first determine multiple target data sets corresponding one-to-one to the multiple types selected by the user, and then filter out, from the target data set of each type, the data elements that meet the other filtering conditions in the data call request, so as to obtain the data to be called that satisfies multiple filter conditions.
  • the processing device 122 may first filter out data elements that meet the other filtering conditions from all types of target data sets, and then filter out, from the filtered data elements, the data elements belonging to the multiple types selected by the user, so as to obtain the data to be called that satisfies multiple filter conditions.
  • the processing device 122 may access the meta-service information based on the data invocation request, and determine the storage location of the data to be invoked that satisfies multiple screening conditions based on the index information in the meta-service information, so as to obtain the data to be invoked that satisfies the multiple screening conditions.
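  • The first two orders of filtering described above select the same data elements and differ only in how much data is scanned. The Python sketch below illustrates this under assumed structures: target data sets are held in a dictionary keyed by type, each element is a (timestamp, payload) pair, and the extra filter condition is a time window. All names are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Element = Tuple[int, str]  # (timestamp, payload); a hypothetical element layout


def select_type_first(
    datasets: Dict[str, List[Element]], types: Set[str], condition: Callable[[Element], bool]
) -> List[Element]:
    """Pick the target data sets of the requested types, then filter each one."""
    return [e for t in sorted(types) for e in datasets.get(t, []) if condition(e)]


def select_condition_first(
    datasets: Dict[str, List[Element]], types: Set[str], condition: Callable[[Element], bool]
) -> List[Element]:
    """Filter every target data set first, then keep only the requested types."""
    survivors = [(t, e) for t, elems in datasets.items() for e in elems if condition(e)]
    return [e for t, e in survivors if t in types]


datasets = {
    "camera": [(1, "c1"), (5, "c5")],
    "radar": [(2, "r2"), (6, "r6")],
    "imu": [(3, "i3"), (7, "i7")],
}


def in_window(element: Element) -> bool:
    return 2 <= element[0] <= 6


first = select_type_first(datasets, {"camera", "radar"}, in_window)
second = select_condition_first(datasets, {"camera", "radar"}, in_window)
```

  • Selecting by type first touches only the requested target data sets, which is usually the cheaper order when the user asks for a few of many types.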
  • the index information may include at least one-to-one correspondence of meta identification information and storage location information of each data element in the target data set, wherein the meta identification information refers to identification information of a corresponding data element.
  • the data invocation request may further include meta-qualification conditions related to meta-identification information for the data to be invoked.
  • the processing device 122 may access meta-service information based on the data call request, and determine index information in the meta-service information that corresponds to the one or more types and satisfies the meta-qualifying condition. Further, the processing device 122 may acquire the data element according to the storage location corresponding to the index information.
  • the data may include drive test data.
  • the user's data invocation request may include index request information such as the type of the drive test data collection device, the time range information of the drive test data collection, the id of the test vehicle and/or the id of the test trip.
  • the index request information determined by the data calling request may include at least the id of the test vehicle and/or the id of the test trip, the type of the data to be called, the time range, the data time length, and the like.
  • data processing system 100 may provide an edge caching mechanism.
  • the processing device 122 (eg, segment storage unit 323) may divide the target data set into multiple target data subsets according to preset time intervals, based on the time information (eg, timestamp) identified by the data elements, that is, the time information corresponding to the target data set, and the time information in the user data call request. Each target data subset corresponds to a time interval, and the data elements acquired in each time interval (for example, every 10 s) are stored in the corresponding target data subset (also called a physical data file).
  • the target data set is divided into multiple target data subsets at preset time intervals. For example, if the time length of the data elements corresponding to each target data set is 100 seconds, and the time length in the user data call request is 20 seconds, the processing device 122 may divide the target data set into 10 target data subsets, each corresponding to 10 seconds of data elements.
  • the time range here may refer to the time range in which the data elements are collected.
  • the time interval may be set according to the minimum value of the time period specified by the user for the data to be called, so as to ensure that the target data subsets called according to the time period actually match the time period.
  • the minimum value of the time period specified by the user for the data to be called may be directly set as the time interval.
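  • The division into target data subsets can be sketched in Python as follows, using the 100-second, 10-second-interval example above. The representation of elements as (timestamp, payload) pairs, the half-open bucket convention, and the function names are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

Element = Tuple[int, str]  # (timestamp in seconds, payload)


def split_by_interval(elements: List[Element], interval: int) -> Dict[int, List[Element]]:
    """Group time-ordered elements into subsets keyed by the start of their interval."""
    subsets: Dict[int, List[Element]] = {}
    if not elements:
        return subsets
    origin = elements[0][0]
    for timestamp, payload in elements:
        # Each subset covers the half-open interval [start, start + interval).
        start = origin + ((timestamp - origin) // interval) * interval
        subsets.setdefault(start, []).append((timestamp, payload))
    return subsets


def subsets_for_request(
    subsets: Dict[int, List[Element]], interval: int, req_start: int, req_end: int
) -> List[int]:
    """Return the start keys of the subsets overlapping the inclusive request range."""
    return [s for s in sorted(subsets) if s <= req_end and s + interval > req_start]


elements = [(t, f"msg{t}") for t in range(100)]   # 100 seconds of data
subsets = split_by_interval(elements, 10)          # 10 subsets of 10 seconds each
needed = subsets_for_request(subsets, 10, 20, 39)  # a 20-second request
```

  • In this sketch a 20-second request aligned to the interval boundaries touches exactly two of the ten subsets, so only those two need to be read or transferred.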
  • the processing device 122 may store each target data subset and the data elements stored therein in the memory of the processing device 122 by means of physical storage. After the user completes the invocation of some data elements in the target data set, the target data subset and the data elements stored therein can be erased. Further, the processing device 122 (eg, the data obtaining unit 324 ) may obtain, from the storage device, data elements in the target data subset matching the time information in each time interval based on the time information of the user data call request.
  • the processing device 122 may distribute the data based on the above method to the user's terminal (also referred to as the user terminal, eg, terminal device 130).
  • the processing device 122 may determine the target dataset to which the meta-service information points based on the meta-service information that matches the user's data invocation request, and, based on the time information (eg, time range) corresponding to the target data set and the time information of the user data invocation request, establish multiple logical files corresponding to the target data set. For example, when the client is not located in the same area (eg, city or country) as the storage device storing the original data set (referred to as the first storage device, such as a distributed file system), while the processing device 122 and the first storage device are in the same area (eg, city or country), the processing device 122 can create multiple logical files corresponding to the target dataset.
  • the processing device 122 may further send the target data set and its stored data elements to a second storage device located in the same region (eg, city or country) as the client.
  • the distance between the second storage device and the user terminal is smaller than the distance between the first storage device and the user terminal.
  • the server where the second storage device is located may divide the received target data set (physical data file) into multiple target data subsets at preset time intervals, and store the data elements in the corresponding target data subsets respectively.
  • the plurality of logical files point to the target data subset in the second storage device by way of pointers.
  • the processing device 122 may determine the logical file matching the user invocation request by matching the time information in the user data invocation request with the time information in each logical file, and, based on the target data subset in the second storage device pointed to by the matched logical file, instruct the server of the second storage device to send the data elements in the matched target data subset to the client.
  • the data elements in the target data subset are not stored in the logical file; instead, information about the data elements (eg, part of the meta-service information) may be stored.
  • Logical files can point to physical data files (ie, target data subsets) by way of pointers.
  • the target data set can be divided into multiple target data subsets according to preset time intervals, each target data subset corresponds to a time interval, and a logical file can be created for each target data subset.
  • Each logical file includes meta-service information of data elements stored in each target data subset.
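  • The logical-file mechanism can be sketched in Python as a list of lightweight records that hold only a time interval and a pointer to the physical data file. The `LogicalFile` fields, the path scheme under `/second-storage/`, and the function names are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogicalFile:
    start: int          # start of the time interval, inclusive
    end: int            # end of the time interval, exclusive
    physical_path: str  # pointer to the target data subset (physical data file)


def make_logical_files(name: str, start: int, end: int, interval: int) -> List[LogicalFile]:
    """Create one logical file per preset time interval of a target data set."""
    files = []
    for low in range(start, end, interval):
        high = min(low + interval, end)
        # Hypothetical path scheme on the second storage device.
        files.append(LogicalFile(low, high, f"/second-storage/{name}/{low}-{high}.bin"))
    return files


def match_logical_files(files: List[LogicalFile], req_start: int, req_end: int) -> List[LogicalFile]:
    """Return the logical files whose interval overlaps the request [req_start, req_end)."""
    return [f for f in files if f.start < req_end and f.end > req_start]


logical_files = make_logical_files("camera", 0, 100, 10)
matched = match_logical_files(logical_files, 25, 45)
```

  • Only the physical files referenced by `matched` would need to be sent to the client; the logical files themselves carry no data elements.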
  • Step 730 Synchronize the data to be called to the storage device of the client.
  • step 730 may be performed by processing device 122 (eg, a synchronization module (not shown)).
  • the processing device 122 may further combine the data elements in the multiple target data subsets and send them to the storage device of the client, so as to realize the synchronization of the data to be called on the client.
  • when the data call request further includes the time range corresponding to the data to be called, and that time range is smaller than the time range corresponding to the target data set, the processing device 122 does not need to send the data in the entire target data set to the client. It only needs to send the data elements corresponding to the time range information in the user's data call request (that is, the data elements in the target data subsets) to the client, so as to realize the function of quickly calling part of the data and improve the efficiency of data calling.
  • step 720 may also include an intelligent recommendation process.
  • the processing device 122 may record the calling habits of the user, and recommend calling results to the user according to the calling habits.
  • the processing device 122 may also predict the user's search behavior based on a machine learning algorithm.
  • FIG. 8 is a schematic diagram of a data calling process according to some embodiments of the present application.
  • the data invocation request input by the user may include information related to the trip ID (ie, the id of the test trip), the time range and the type.
  • the multiple target datasets may include A-type files, B-type files, and C-type files.
  • the processing device 122 can access the meta-service information based on the data call request.
  • the meta-service information may include the type file index and the type file information as shown in FIG. 8. The type file index may be index information related to the trip ID (that is, the id of the test trip), the time range, etc. The type file information may be index information related to identification information of the target data set, where the identification information of the target data set corresponds to the type.
  • the processing device 122 may determine, according to the type file information and the type file index, the target data set corresponding to the type, and the specific position (for example, the offset start point and the offset end point) of the data elements to be called in each target data set, thereby obtaining the data elements to be called in each target data set. Further, the processing device 122 may combine the data elements to be called to generate a data packet. The data packet can be transmitted to the client as a result of the data call.
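  • The lookup stage of the FIG. 8 flow can be sketched in Python as follows. The sketch assumes that the meta-service information is a dictionary keyed by (trip id, type) whose values are type file indexes of (timestamp, offset) pairs, and that the offset end point is the offset of the last matching message; these structures and the function name `locate` are illustrative assumptions. Reading the bytes and packing them into a data packet is omitted.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Tuple

# Hypothetical meta-service information: (trip id, type) -> sorted (timestamp, offset) pairs.
MetaService = Dict[Tuple[str, str], List[Tuple[int, int]]]


def locate(
    meta: MetaService, trip_id: str, types: List[str], start_ts: int, end_ts: int
) -> Dict[str, Tuple[int, int]]:
    """Return, per requested type, the (offset start point, offset end point) to call."""
    spans: Dict[str, Tuple[int, int]] = {}
    for type_name in types:
        index = meta.get((trip_id, type_name), [])
        timestamps = [ts for ts, _ in index]
        first = bisect_left(timestamps, start_ts)
        last = bisect_right(timestamps, end_ts)
        if first < last:  # at least one message falls inside the time range
            spans[type_name] = (index[first][1], index[last - 1][1])
    return spans


meta = {
    ("trip-1", "A"): [(10, 0), (20, 100), (30, 200), (40, 300)],
    ("trip-1", "B"): [(15, 0), (25, 50), (35, 100)],
    ("trip-1", "C"): [(12, 0), (22, 80)],
}
spans = locate(meta, "trip-1", ["A", "B"], 20, 35)
```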
  • FIG. 9 is a schematic diagram of a data calling scenario according to some embodiments of the present application.
  • the data invocation scenario may include a client, a local server (ie, the data processing system 100 ), and a remote data center.
  • the user may input a data call request through the client side, that is, the computing device (eg, the input/output interface of the terminal device 130).
  • a local server may include an upper file system (which may also be referred to as a logical file system) and an underlying file system.
  • the upper file system (or logical file system) can be used to define the interface (ie access) between the local server and the client.
  • the upper file system may provide an indexing mechanism.
  • the index may be established by the processing device 122 based on the raw data set.
  • the upper file system can also define information such as files and their attributes, operations allowed by files, and directories of files.
  • the processing device 122 can determine, through the upper-layer file system, the index request information corresponding to the data invocation request, determine the meta-service information in the underlying file system corresponding to the invocation request based on the data index information and the file directory, and determine the storage location of the data element that satisfies the data calling request based on the meta-service information (eg, storage location information), so as to obtain the data to be called through the underlying file system.
  • the underlying file system is used to map the upper file system to a physical storage device (eg, a hard disk in a local server) or a memory device.
  • the underlying file system may include meta-service information, including index information of the target data set (for example, the identification information of the target data set, the one-to-one corresponding meta identification information and storage location information of each data element in the target data set, the set identification information of the original data set, etc.).
  • the underlying file system can match the meta-service information based on the index request information determined in the upper-level file system, and determine the location where the data pointed to by the matching meta-service information is stored in the physical storage device, thereby obtaining the data elements and realizing the mapping from the upper-level file system to the physical storage device.
  • the local server may provide an edge caching mechanism.
  • the local server may store the target data set in the remote data center to the storage device of the local server according to the method described in the process 400 .
  • the local server may divide the target data set into multiple target data subsets at preset time intervals based on the time information (eg, timestamp) identified by the data element, and store them in the storage device respectively.
  • the target data subset includes time information of the metadata, which may be the address at which the data element is stored on the storage device.
  • the local server may acquire, from the storage device, data elements in the target data subsets that match the time information in each time interval.
  • the local server can distribute the data to the client based on the above method.
  • the local server can create multiple logical files corresponding to the target dataset.
  • the local server may further send the target data set and the data elements stored therein to a second storage device, where the second storage device and the client are located in the same area.
  • the server where the second storage device is located may divide the received target data set (physical data file) into multiple target data subsets at preset time intervals, and store the data elements in the corresponding target data subsets respectively.
  • the plurality of logical files point to the target data subset in the second storage device by way of pointers.
  • the local server can determine the logical file matching the user's invocation request by matching the time information in the user data invocation request with the time information in each logical file, and, based on the target data subset in the second storage device pointed to by the matched logical file, instruct the server of the second storage device to send the data elements in the matched target data subset to the client.
  • the server of the second storage device may combine the data elements in the multiple target data subsets and send them to the client.
  • when the data invocation request further includes the time range corresponding to the data to be invoked, the data actually corresponding to the time range (that is, a subset of the target data in physical storage) can be invoked by accessing the meta-service information corresponding to the target data subsets within the specified time range. In this way, only the data actually corresponding to the time range is synchronized, rather than the entire target data set, so as to realize the function of quickly calling part of the data and improve the efficiency of data calling.
  • a user can obtain data stored in a remote data center in a preset test station for further analysis and processing (eg, R&D program debugging, test simulation, problem data analysis, etc.).
  • FIG. 10 is a schematic diagram of data calling according to some embodiments of the present application.
  • a data call request can be input through the user terminal (step 1), and the data processing system 100 can obtain the meta-service information (or index request information) matching the call request from the meta-service module based on the data call request (step 2). Further, the data processing system 100 can determine the target data set to which the meta-service information points.
  • the data processing system 100 and the local storage device storing the original data set are in a first area, the client and the second HDFS are in a second area, and the first area and the second area are different areas; the client is closer to the server of the second HDFS than it is to the server of the first HDFS.
  • the first region may be located in the United States and the second region may be located in China.
  • the first HDFS may be a data center established in the United States, and the second HDFS may be a data center established in China (eg, the Inner Mongolia (NMG) data center).
  • after the data processing system 100 determines the target data set (for example, the camera data file) pointed to by the meta-service information, it can establish multiple logical files (eg, logical camera data files) corresponding to the target data set, based on the time information (for example, the time range) corresponding to the target data set and the time information of the user data call request. Further, the data processing system 100 may send the target data set and its stored data elements to the second HDFS.
  • the server where the second HDFS is located may divide the received target data set (physical data file) into multiple target data subsets at preset time intervals, and store the data elements in the corresponding target data subsets respectively.
  • the plurality of logical files point to the target data subset in the second HDFS by way of pointers.
  • the data processing system 100 can determine the logical file matching the user's invocation request by matching the time information in the user's data invocation request with the time information in each logical file, and, based on the target data subset in the second HDFS pointed to by the matched logical file, instruct the server of the second HDFS to send the data elements in the matched target data subset to the client (step 3).
  • the server of the second HDFS may combine the data elements in the multiple target data subsets and send them to the client (step 4).
  • FIG. 11 is a schematic diagram of data storage and invocation according to some embodiments of the present application.
  • data processing system 100 includes a meta-service module that obtains data packets (as described elsewhere herein).
  • the meta-service module generates a data packet processing task in response to the received data packet (step 1), and sends the data packet processing task and the data packet to the processing module.
  • the processing module may acquire a data packet (ie, original data set) processing task and process and store the data packet.
  • the processing module may process and store the original data packet (original data set) based on the process 400 described in FIG. 4 .
  • the processing module may establish different target data sets according to the type information of the data elements in the original data packets.
  • N different target data sets may be established, and the N different target data sets correspond to different types of data elements.
  • the processing module may also set the identification information of the target data set according to the type corresponding to the target data set, so as to identify different types of target data sets. Further, the processing module may determine the meta-service information (also referred to as index information, or including index information) of the original data set, so as to determine the storage location of the original data set or of the data elements in the original data set in the storage device.
  • the processing module may generate a target data set and corresponding meta-service information. Further, the processing module can upload the target data set and its stored data elements (step 2) and store them in a local storage device or system (ie, the first distributed file system (HDFS)) (step 3), and store the meta-service information in the storage device associated with the meta-service module (step 4).
  • a local storage device or system refers to a storage device or system in the same region (eg, city or country) as data processing system 100 .
  • the first HDFS may synchronize the processed data packets to the second HDFS.
  • the second HDFS is located in a different region (eg, a different city or country) from the first HDFS, so that the user terminal located in the region where the second HDFS is located can call data.
  • for more description on data calling based on the second HDFS, reference may be made to FIG. 10.
  • the meta-service information may include meta-identification information (for example, a timestamp) and storage location information (for example, an offset) that correspond to each data element in the target data set, the identification information of the target data set, and the corresponding data elements in the target data set.
  • the user can obtain the download address or the access address of the data in the first HDFS (step 0).
  • the user can input a data invocation request through the user terminal (step 5).
  • the data invocation request may include information such as the type of the data to be invoked, meta-qualification conditions (eg, time range) related to the meta-identification information.
  • the data call request may include index request information such as the type of the drive test data collection device, the time range information of the drive test data collection, the id of the test vehicle and/or the id of the test trip.
  • the data processing system 100 may obtain meta-service information (or index request information) matching the call request from the meta-service module based on the data call request (step 6), and, according to the meta-service information, call the target data subset stored in the distributed file system (HDFS) near the client (step 7).
  • calling data from a distributed file system near the client may refer to calling data from the first HDFS or the second HDFS (step 8).
  • the data processing system 100 may acquire the data elements stored in the target data set from the first HDFS based on the meta-service information.
  • the data processing system 100 may, based on the time information (for example, timestamps) identifying the data elements, that is, the time information corresponding to the target data set, and the time information in the user data call request, divide the target data set into a plurality of target data subsets at preset time intervals, and store each target data subset and the data elements stored therein in the memory of the data processing system 100 by means of physical storage.
  • the data processing system 100 may further acquire, from the storage device, data elements in the target data subsets that match the time information in each time interval.
  • the distance between the second HDFS and the user terminal is smaller than the distance between the first HDFS and the user terminal.
  • the data processing system 100 may further combine the data elements in multiple target data subsets obtained from a distributed file system (HDFS) near the client and send them to the client (step 9).
  • FIG. 12 is a schematic diagram of a user interaction interface according to some embodiments of the present application.
  • the user interaction interface may include a time selection area 1410 , a type selection area 1420 , a data representation area 1430 , a download address area 1440 and a processing progress area 1450 .
  • the user can input the time range corresponding to the data to be called.
  • a user may input the time range through an input device (eg, keyboard, mouse, touch screen, microphone, trackball) associated with the user interface.
  • the user can input the type corresponding to the data to be called. For example, as shown in FIG. 12 , the user can input the type corresponding to the data to be called by checking the selection box corresponding to the type through an input device (eg, a mouse).
  • the data representation area 1430 may be used to display the data to be called corresponding to the time range and type input by the user. For example, as shown in FIG. 12 , the data to be called corresponding to the time range and type input by the user can be displayed in the data representation area 1430 as a combination of a timeline and data subsets, so that the user can check or confirm whether the input information is correct.
  • the download address area 1440 may be used to provide a download link corresponding to the data to be called. For example, the user can click on the download link to trigger the data invocation process.
  • the data to be called obtained during the data calling process can be combined to generate a file package, which is further downloaded on the user end.
  • the processing progress area 1450 may be used to display the progress of data processing.
  • the progress of data processing may include processing completed, unprocessed, and unable to process.
  • the user can determine the progress of data processing (eg, data calling) through the processing progress area 1450 .
  • the possible beneficial effects of the embodiments of the present application include, but are not limited to: (1) data elements of the same type in the original data set are stored in one target data set, so that data of the type specified by the user can be called simply by finding and accessing the target data set of that type; (2) the data elements in the target data set can be stored contiguously in chronological order, so that the user can further obtain target data of the specified type within a specified time period; (3) the target data set can be divided into multiple target data subsets at preset time intervals, and the user can obtain, based on the data calling request, only the data elements corresponding to some of the target data subsets, realizing the function of quickly calling partial data.
  • compared with calling data based on the original data set, the data storage method provided by the embodiments of the present application makes the data calling process simpler and the amount of data accessed smaller, and can better improve the efficiency of data calling. It should be noted that different embodiments may have different beneficial effects, and in different embodiments, the possible beneficial effects may be any one or a combination of the above, or any other possible beneficial effects.
  • aspects of this application may be illustrated and described in several patentable categories or situations, including any new and useful process, machine, product, or composition of matter, or any new and useful improvement thereof. Accordingly, various aspects of the present application may be performed entirely by hardware, entirely by software (including firmware, resident software, microcode, etc.), or by a combination of hardware and software.
  • the above hardware or software may be referred to as a "data block”, “module”, “engine”, “unit”, “component” or “system”.
  • aspects of the present application may be embodied as a computer product comprising computer readable program code embodied in one or more computer readable media.
  • a computer storage medium may contain a propagated data signal with the computer program code embodied therein, for example, on baseband or as part of a carrier wave.
  • the propagating signal may take a variety of manifestations, including electromagnetic, optical, etc., or a suitable combination.
  • Computer storage media can be any computer-readable media other than computer-readable storage media that can communicate, propagate, or transmit a program for use by coupling to an instruction execution system, apparatus, or device.
  • Program code on a computer storage medium may be transmitted over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or a combination of any of the foregoing.
  • the computer program code required for the operation of the various parts of this application may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python, etc.; conventional procedural programming languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP; dynamic programming languages such as Python, Ruby and Groovy; or other programming languages.
  • the program code may run entirely on the user's computer, or as a stand-alone software package on the user's computer, or partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any network, such as a local area network (LAN) or wide area network (WAN), or connected to an external computer (eg, through the Internet), or used in a cloud computing environment, or used as a service, eg, software as a service (SaaS).

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data storage method and system, and a data calling method and system. The data storage method comprises: obtaining an original data set (410), the original data set comprising a plurality of data elements, and each data element having type information marking the type of the data element; obtaining the number N of different types according to the type information, and establishing N different target data sets (420); and storing, on the basis of the type information of the data elements in the original data set and each target data set, in corresponding target data sets the data elements corresponding to the target data sets (430), the target data sets being stored in a first storage device. The data calling method comprises: obtaining a data calling request sent by a user end (710), the data calling request comprising at least the type of data to be called; obtaining part of the data in a target data set on the basis of the data calling request to obtain the data to be called (720); and sending said data to a second storage device of the user end (730).

Description

Data storage and calling method and system
Cross reference
This application claims priority to Chinese Patent Application No. 202010931768.6, filed on September 7, 2020, the contents of which are incorporated herein by reference in their entirety.
Technical field
The present application relates to the field of information technology, and in particular, to a data storage and calling method and system.
Background
In road testing, autonomous driving test vehicles usually need to collect a large amount of data (referred to as "road test data") for analysis, debugging, and other purposes. The demand of relevant personnel for road test data is usually targeted, that is, only part of the large amount of data needs to be called. For example, a specific team usually only needs to call road test data of a specific type rather than all types of road test data (eg, an image processing team only needs image-related road test data). As another example, in some cases only road test data within a specific range (eg, a specific time period) needs to be called.
Therefore, it is desirable to provide a data storage and/or calling solution that can efficiently call, from a huge amount of data, the part of the data that meets a user's specific needs.
Summary of the invention
One aspect of the present application provides a data storage method executed by a computing device, characterized in that the method includes: acquiring an original data set, the original data set including a plurality of data elements, each data element having type information marking the type of the data element; obtaining the number N of different types according to the type information of the data elements in the original data set, and correspondingly establishing N different target data sets, the N different target data sets corresponding to different types of data elements, wherein N is an integer greater than or equal to 2; and, based on the type information of the data elements in the original data set and each target data set, storing the data elements corresponding to each target data set in the corresponding target data set, the target data sets being stored in a first storage device.
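The grouping step of the storage method can be illustrated with a minimal sketch. It assumes in-memory data elements; the `DataElement` class, its field names, and the type tags "image" and "location" are hypothetical and chosen only for illustration, and an actual embodiment would write each target data set to the first storage device rather than keep it in a dictionary.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    type_info: str    # hypothetical type tag, e.g. "image" or "location"
    timestamp: float  # hypothetical time information of the element
    payload: bytes


def build_target_data_sets(original_data_set):
    """Group data elements by type; returns {type: [elements]}."""
    target_sets = defaultdict(list)
    for element in original_data_set:
        target_sets[element.type_info].append(element)
    # The method requires N >= 2 distinct types.
    if len(target_sets) < 2:
        raise ValueError("expected at least two distinct types (N >= 2)")
    return dict(target_sets)


original = [
    DataElement("image", 0.0, b"img0"),
    DataElement("location", 0.1, b"gps0"),
    DataElement("image", 0.2, b"img1"),
]
target_sets = build_target_data_sets(original)
print(sorted(target_sets))  # the N distinct types found in the original set
```

With this layout, a request for a single type only touches one entry of `target_sets` instead of scanning the whole original data set.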
In some embodiments, the data set is a file, and the data elements of the file are messages.
In some embodiments, the types include one or more of an image type, a location type, a sensor type, a data packet type, and a controller area network (CAN) bus type.
In some embodiments, the method further includes: establishing index information of the target data set, the index information including at least meta-identification information and storage location information in one-to-one correspondence for each data element in the target data set, wherein the meta-identification information refers to the identification information of the corresponding data element.
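One way to picture such index information is a list of entries pairing a timestamp (as meta-identification information) with a byte offset (as storage location information). The sketch below is an assumption-laden illustration: the flat byte blob stands in for a target data set file, and the `length` field is an addition for convenience that the application does not mention.

```python
def store_with_index(elements):
    """elements: list of (timestamp, payload_bytes) pairs.

    Returns (blob, index): the payloads concatenated in chronological
    order, plus one index entry per data element.
    """
    blob = bytearray()
    index = []
    for timestamp, payload in sorted(elements, key=lambda e: e[0]):
        index.append({
            "timestamp": timestamp,   # meta-identification information
            "offset": len(blob),      # storage location information
            "length": len(payload),   # hypothetical helper field
        })
        blob.extend(payload)
    return bytes(blob), index


def read_element(blob, entry):
    """Read one data element back using only its index entry."""
    return blob[entry["offset"]:entry["offset"] + entry["length"]]


blob, index = store_with_index([(2.0, b"cc"), (1.0, b"a"), (3.0, b"ddd")])
print(read_element(blob, index[2]))
```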
In some embodiments, the data elements in the target data set are arranged in chronological order, and the meta-identification information includes time information of the corresponding data elements.
In some embodiments, the index information further includes set identification information of the original data set corresponding to each data element in the target data set, wherein the set identification information refers to the identification information of the original data set.
In some embodiments, the data in the original data set includes data generated or collected by an autonomous vehicle during operation.
In some embodiments, the method further includes: receiving a data calling request sent by a client, the data calling request including at least the type of the data to be called; determining, based on the data calling request, the target data set of the corresponding type from the N different target data sets; obtaining the data to be called based on the data elements in the determined target data set; and sending the data to be called to a second storage device of the client.
In some embodiments, obtaining the data to be called based on the data elements in the determined target data set further includes: acquiring the target data set and the data elements stored therein from the first storage device; dividing the target data set into a plurality of target data subsets at preset time intervals; and acquiring, based on the data calling request, the data elements corresponding to some of the plurality of target data subsets, the data to be called including the data elements corresponding to those target data subsets.
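The division into target data subsets and the selection of only the relevant subsets can be sketched as follows. This is a minimal illustration under stated assumptions: timestamps are integers in seconds, each subset covers a half-open window of `interval` seconds, and the function names are hypothetical.

```python
from collections import defaultdict


def split_into_subsets(elements, interval):
    """Divide (timestamp, payload) pairs into subsets keyed by window start."""
    subsets = defaultdict(list)
    for timestamp, payload in elements:
        window_start = (timestamp // interval) * interval
        subsets[window_start].append((timestamp, payload))
    return dict(subsets)


def select_subsets(subsets, interval, start, end):
    """Keep subsets whose window [w, w + interval) overlaps [start, end]."""
    return {
        w: items
        for w, items in sorted(subsets.items())
        if w <= end and w + interval > start
    }


elements = [(0, "a"), (30, "b"), (60, "c"), (95, "d"), (130, "e")]
subsets = split_into_subsets(elements, interval=60)
selected = select_subsets(subsets, interval=60, start=70, end=100)
print(selected)
```

Only the subset whose window overlaps the requested range is returned, so the remaining subsets never need to be read or transferred.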
In some embodiments, obtaining the data to be called based on the data elements in the determined target data set further includes: sending the determined target data set acquired from the first storage device and the data elements stored therein to a third storage device, the first storage device being farther from the client than the third storage device is; dividing the target data set into a plurality of target data subsets at preset time intervals and storing them in the third storage device; establishing a plurality of logical files, each logical file corresponding to one of the plurality of target data subsets and including the index information corresponding to the data elements in that target data subset; and acquiring, based on the data calling request and the logical files, the data elements stored in some of the plurality of target data subsets from the third storage device, the data to be called including the data elements corresponding to those target data subsets.
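The relationship between logical files and the physically stored subsets can be sketched roughly as below. Everything here is an assumption made for illustration: the third storage device is modeled as a plain dictionary, the `LogicalFile` class and its `pointer` key format are hypothetical, and each logical file carries only a time window rather than full index information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalFile:
    start: int    # inclusive start of the time window
    end: int      # exclusive end of the time window
    pointer: str  # hypothetical key of the subset in the third storage device


def replicate_and_split(target_set, interval, third_storage, name):
    """Store time-windowed subsets in third_storage; return logical files."""
    windows = {}
    for timestamp, payload in target_set:
        window_start = timestamp // interval * interval
        windows.setdefault(window_start, []).append((timestamp, payload))
    logical_files = []
    for window_start in sorted(windows):
        key = f"{name}/{window_start}"
        third_storage[key] = windows[window_start]
        logical_files.append(LogicalFile(window_start, window_start + interval, key))
    return logical_files


def call_data(logical_files, third_storage, req_start, req_end):
    """Follow the pointers of logical files overlapping the requested range."""
    result = []
    for lf in logical_files:
        if lf.start <= req_end and lf.end > req_start:
            result.extend(third_storage[lf.pointer])
    return result


third_storage = {}
camera = [(0, "f0"), (59, "f1"), (60, "f2"), (125, "f3")]
files = replicate_and_split(camera, 60, third_storage, "camera")
called = call_data(files, third_storage, req_start=50, req_end=70)
print(called)
```

The request is matched against the small logical files first, and only the subsets they point to are read from the storage device near the client.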
Another aspect of the present application provides a data storage system, characterized in that the system includes an original data set acquisition module, a target data set establishment module, and a storage module. The original data set acquisition module is configured to acquire an original data set, the original data set including a plurality of data elements, each data element having type information marking the type of the data element. The target data set establishment module is configured to obtain the number N of different types according to the type information of the data elements in the original data set, and correspondingly establish N different target data sets, the N different target data sets corresponding to different types of data elements, wherein N is an integer greater than or equal to 2. The storage module is configured to store, based on the type information of the data elements in the original data set and the target data sets, the data elements corresponding to each target data set in the corresponding target data set.
In some embodiments, the data set is a file, and the data elements of the file are messages.
In some embodiments, the types include one or more of an image type, a location type, a sensor type, a data packet type, and a controller area network (CAN) bus type.
In some embodiments, the system further includes an index information establishment module configured to establish index information of the target data set, the index information including at least meta-identification information and storage location information in one-to-one correspondence for each data element in the target data set, wherein the meta-identification information refers to the identification information of the corresponding data element.
In some embodiments, the data elements in the target data set are arranged in chronological order, and the meta-identification information includes time information of the corresponding data elements.
In some embodiments, the index information further includes set identification information of the original data set corresponding to each data element in the target data set, wherein the set identification information refers to the identification information of the original data set.
Yet another aspect of the present application provides a storage medium, characterized in that the storage medium is used to store computer instructions, and when a computer reads the computer instructions in the storage medium, the computer executes the above data storage method.
Yet another aspect of the present application provides a data calling method executed by a computing device, characterized in that the data elements in an original data set are stored in corresponding target data sets according to the above data storage method, and the target data sets are stored in a first storage device associated with the computing device. The data calling method includes: acquiring a data calling request sent by a client, the data calling request including at least the type of the data to be called; acquiring part of the data in the target data sets based on the data calling request to obtain the data to be called, the part of the data including the data elements in the target data set corresponding to the type of the data to be called; and sending the data to be called to a second storage device of the client.
In some embodiments, the target data set has corresponding index information, the index information including at least meta-identification information and storage location information in one-to-one correspondence for each data element in the target data set, wherein the meta-identification information refers to the identification information of the corresponding data element; the data calling request further includes a meta-qualification condition related to the meta-identification information. Acquiring the data elements in the target data set of the corresponding type based on the data calling request includes: acquiring, based on the data calling request, the index information that corresponds to the corresponding type and satisfies the meta-qualification condition; and acquiring the data elements based on the storage locations in the acquired index information.
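The two-step lookup (first the matching index information, then the data elements at the stored locations) might look like the following sketch. It assumes the index entries are already sorted by timestamp, that the meta-qualification condition is an inclusive time range, and that each entry carries a hypothetical `length` field alongside the offset; the request dictionary layout is likewise an assumption.

```python
from bisect import bisect_left, bisect_right


def find_index_entries(index, start, end):
    """Return index entries with start <= timestamp <= end (index is sorted)."""
    timestamps = [entry["timestamp"] for entry in index]
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)
    return index[lo:hi]


def fetch(blob, entries):
    """Read the data elements at the storage locations named by the entries."""
    return [blob[e["offset"]:e["offset"] + e["length"]] for e in entries]


# Hypothetical per-type index information and stored target data sets.
indexes = {
    "image": [
        {"timestamp": 10, "offset": 0, "length": 2},
        {"timestamp": 20, "offset": 2, "length": 2},
        {"timestamp": 30, "offset": 4, "length": 2},
    ]
}
blobs = {"image": b"i0i1i2"}

request = {"type": "image", "time_range": (15, 30)}
entries = find_index_entries(indexes[request["type"]], *request["time_range"])
data = fetch(blobs[request["type"]], entries)
print(data)
```

Because the data elements are arranged in chronological order, a binary search over the timestamps is enough to find the matching slice of the index.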
In some embodiments, the data elements in the target data set are arranged in chronological order, and the meta-identification information includes time information of the corresponding data elements; the meta-qualification condition includes the time range corresponding to the data to be called.
In some embodiments, acquiring part of the data in the target data set based on the data calling request further includes: dividing the target data set into a plurality of target data subsets at preset time intervals; and acquiring, based on the data calling request, the data elements corresponding to some of the target data subsets, the data to be called including the data elements corresponding to those target data subsets.
In some embodiments, acquiring part of the data in the target data set based on the data calling request further includes: sending the target data set acquired from the first storage device and the data elements stored therein to a third storage device, the first storage device being farther from the client than the third storage device is; dividing the target data set into a plurality of target data subsets at preset time intervals and storing them in the third storage device; establishing a plurality of logical files, each logical file corresponding to one of the plurality of target data subsets and including the index information corresponding to the data elements in that target data subset; and acquiring, based on the data calling request and the logical files, the data elements corresponding to some of the plurality of target data subsets from the third storage device, the data to be called including the data elements corresponding to those target data subsets.
In some embodiments, the index information further includes set identification information of the original data set corresponding to each data element in the target data set, wherein the set identification information refers to the identification information of the original data set; the data calling request further includes a set-qualification condition related to the set identification information. Acquiring, based on the data calling request, the index information that corresponds to the corresponding type and satisfies the meta-qualification condition includes: acquiring, based on the data calling request, the index information that corresponds to the corresponding type and satisfies the set-qualification condition.
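A set-qualification condition can be thought of as one more filter over the index entries. The sketch below is illustrative only: the `set_id` field name and the "trip-A"/"trip-B" identifiers are hypothetical, and the optional time range shows how a set-qualification condition and a meta-qualification condition could be combined.

```python
def select_entries(index, set_ids, time_range=None):
    """Keep entries from the given original data sets, optionally within a time range."""
    wanted = set(set_ids)
    matched = [entry for entry in index if entry["set_id"] in wanted]
    if time_range is not None:
        start, end = time_range
        matched = [e for e in matched if start <= e["timestamp"] <= end]
    return matched


# Hypothetical index of one target data set built from two original data sets.
index = [
    {"timestamp": 10, "offset": 0, "set_id": "trip-A"},
    {"timestamp": 20, "offset": 4, "set_id": "trip-B"},
    {"timestamp": 30, "offset": 8, "set_id": "trip-A"},
]

by_set = select_entries(index, ["trip-A"])
by_set_and_time = select_entries(index, ["trip-A"], time_range=(25, 35))
print([e["offset"] for e in by_set_and_time])
```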
本申请的又一个方面提供一种数据调用系统,其特征在于,原始数据集中的数据元按上述数据存储方法被存储于相应的目标数据集,所述目标数据集存储于计算机装置相关联的第一存储设备中。所述数据调用系统包括用户请求获取模块和调用模块。所述用户请求获取模块用于获取由用户端发送的数据调用请求,所述数据调用请求至少包括待调用数据所属的类型。所述调用模块用于基于所述数据调用请求获取所述目标数据集中的部分数据以得到所述待调用数据,所述部分数据包括与所述待调用数据所属的类型相对应的目标数据集中的数据元。Yet another aspect of the present application provides a data calling system, characterized in that the data elements in the original data set are stored in the corresponding target data set according to the above data storage method, and the target data set is stored in the first data set associated with the computer device. in a storage device. The data calling system includes a user request acquiring module and a calling module. The user request obtaining module is configured to obtain a data calling request sent by the client, where the data calling request at least includes the type of the data to be called. The calling module is configured to acquire, based on the data calling request, part of the data in the target data set to obtain the data to be called, where the partial data includes the data in the target data set corresponding to the type to which the data to be called belongs. data element.
In some embodiments, the target data set has corresponding index information, and the index information includes at least meta identification information and storage location information in one-to-one correspondence with each data element in the target data set, where the meta identification information refers to the identification information of the corresponding data element; the data calling request further includes a meta qualification condition related to the meta identification information. The calling module includes an index information obtaining unit, a segment storage unit and a data element obtaining unit. The index information obtaining unit is configured to obtain, based on the data calling request, index information that corresponds to the requested type and satisfies the meta qualification condition. The segment storage unit is configured to divide the target data set into a plurality of target data subsets at a preset time interval and to store each target data subset separately. The data element obtaining unit is configured to obtain data elements based on the storage locations in the obtained index information.
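Merely by way of illustration, the division performed by the segment storage unit can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the application: the function name `split_by_interval`, the representation of a data element as a `(timestamp, payload)` pair, and the 60-second interval are all hypothetical.

```python
from collections import defaultdict


def split_by_interval(elements, interval_s):
    """Group (timestamp, payload) pairs into subsets keyed by interval start.

    `elements` and `interval_s` are hypothetical names; the application only
    states that a target data set is divided at a preset time interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    subsets = defaultdict(list)
    for timestamp, payload in elements:
        # Floor division maps every timestamp to the start of its interval.
        bucket_start = int(timestamp // interval_s) * interval_s
        subsets[bucket_start].append((timestamp, payload))
    return dict(subsets)


# Four data elements divided into 60-second target data subsets.
elements = [(0.5, "a"), (59.9, "b"), (60.0, "c"), (125.0, "d")]
subsets = split_by_interval(elements, 60)
```

Each resulting subset could then be written to its own file or object, so that a request covering a short time range only touches the subsets that overlap that range.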
In some embodiments, the data elements in the target data set are arranged in chronological order, and the meta identification information includes the time information of the corresponding data element; the meta qualification condition includes a time range corresponding to the data to be called.
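Because the data elements are arranged in chronological order, a time-range qualification can be resolved with a binary search over the index instead of a full scan. The following is a sketch only; the index layout (a list of `(timestamp, location)` pairs) and the function name `locations_in_range` are assumptions made for illustration and do not come from the application.

```python
from bisect import bisect_left, bisect_right


def locations_in_range(index, start, end):
    """Return storage locations whose timestamp lies in [start, end].

    `index` is assumed to be a list of (timestamp, location) pairs that is
    already sorted by timestamp, mirroring the chronological arrangement.
    """
    timestamps = [ts for ts, _ in index]
    lo = bisect_left(timestamps, start)   # first entry with ts >= start
    hi = bisect_right(timestamps, end)    # one past the last entry with ts <= end
    return [location for _, location in index[lo:hi]]


# Hypothetical index: timestamps in seconds, locations as byte offsets.
index = [(100, 0), (105, 512), (110, 1024), (120, 2048)]
result = locations_in_range(index, 105, 110)
```

In this example `result` holds the two offsets whose timestamps fall inside the requested range; the data element obtaining unit would then read only those locations.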
In some embodiments, the data calling system further includes a synchronization module, and the synchronization module is configured to send the data to be called to a second storage device of the client.
In some embodiments, the index information further includes set identification information of the original data set corresponding to each data element in the target data set, where the set identification information refers to the identification information of the original data set; the data calling request further includes a set qualification condition related to the set identification information. The index information obtaining unit is further configured to obtain, based on the data calling request, index information that corresponds to the requested type and satisfies the set qualification condition.
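As an illustrative sketch, the type, the set qualification condition and the meta qualification condition can be applied together as successive filters over index entries. The `IndexEntry` fields and the `select_entries` function below are hypothetical names chosen for this example; the application does not prescribe a concrete index schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    data_type: str    # type of the data element, e.g. "camera"
    set_id: str       # identifier of the original data set
    timestamp: float  # meta identification information (time)
    location: int     # storage location inside the target data set


def select_entries(entries, data_type, set_ids=None, time_range=None):
    """Keep entries of the requested type that satisfy the optional conditions."""
    selected = []
    for entry in entries:
        if entry.data_type != data_type:
            continue
        if set_ids is not None and entry.set_id not in set_ids:
            continue  # set qualification condition not met
        if time_range is not None and not (
            time_range[0] <= entry.timestamp <= time_range[1]
        ):
            continue  # meta qualification condition (time range) not met
        selected.append(entry)
    return selected


entries = [
    IndexEntry("camera", "car1_trip1", 10.0, 0),
    IndexEntry("camera", "car2_trip7", 11.0, 64),
    IndexEntry("lidar", "car1_trip1", 10.5, 0),
    IndexEntry("camera", "car1_trip1", 30.0, 128),
]
hits = select_entries(
    entries, "camera", set_ids={"car1_trip1"}, time_range=(0.0, 20.0)
)
```

Only the first entry satisfies all three conditions here, so a single storage location would be read.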
Yet another aspect of the present application provides a storage medium, characterized in that the storage medium is used to store computer instructions, and when a computer reads the computer instructions in the storage medium, the computer executes the above data calling method.
Description of drawings
The present application will be further described by way of exemplary embodiments, which will be described in detail with reference to the accompanying drawings. These embodiments are not limiting; in these embodiments, the same reference numerals denote the same structures, wherein:
FIG. 1 is a schematic diagram of an application scenario of a data processing system according to some embodiments of the present application;
FIG. 2 is a block diagram of an exemplary processing device according to some embodiments of the present application;
FIG. 3 is a block diagram of another exemplary processing device according to some embodiments of the present application;
FIG. 4 is an exemplary flowchart of a data storage method according to some embodiments of the present application;
FIG. 5 is a schematic diagram of storing different types of data elements of an original data set in corresponding target data sets according to some embodiments of the present application;
FIG. 6 is a schematic diagram of index information corresponding to a target data set according to some embodiments of the present application;
FIG. 7 is an exemplary flowchart of a data calling method according to some embodiments of the present application;
FIG. 8 is a schematic diagram of a data calling process according to some embodiments of the present application;
FIG. 9 is a schematic diagram of a data calling scenario according to some embodiments of the present application;
FIG. 10 is a schematic diagram of data calling according to some embodiments of the present application;
FIG. 11 is a schematic diagram of data storage and calling according to some embodiments of the present application; and
FIG. 12 is a schematic diagram of a user interaction interface according to some embodiments of the present application.
Detailed description
In order to explain the technical solutions of the embodiments of the present application more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. It is apparent that the drawings described below are only some examples or embodiments of the present application, and a person of ordinary skill in the art can apply the present application to other similar scenarios according to these drawings without creative effort. Unless apparent from the context or otherwise stated, the same reference numerals in the drawings denote the same structures or operations.
It should be understood that "system", "device", "unit" and/or "module" as used herein are a way of distinguishing different components, elements, parts, sections or assemblies at different levels. However, these words may be replaced by other expressions if the other expressions achieve the same purpose.
As used in this application and the claims, unless the context clearly indicates otherwise, the words "a", "an" and/or "the" do not refer specifically to the singular and may also include the plural. In general, the terms "include" and "comprise" only indicate that the explicitly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and a method or device may also include other steps or elements.
Flowcharts are used in this application to illustrate the operations performed by a system according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed exactly in order. Instead, the steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more operations may be removed from them.
The embodiments of the present application can be applied to data storage and calling scenarios in which the amount of data is large and the user's demand for the data is targeted. In such a scenario, the user usually only needs to call part of a large volume of data. In some embodiments, the large volume of data may include road test data collected by an autonomous driving test vehicle during a road test. For example, in some application scenarios, the road test data collected by the autonomous driving test vehicle may reach about 17 MB per second per vehicle, the average amount of road test data per call may exceed 11G, and the average amount of road test data called per day may exceed 8T, which is a large amount of data. The embodiments of the present application provide a data storage and/or calling method that can efficiently call, from the large volume of data, the part of the data that meets the specific needs of the user. It should be understood that the application scenarios of the data storage and calling methods and systems of the present application are only some examples or embodiments of the present application, and a person of ordinary skill in the art can apply the present application to other similar scenarios according to these drawings without creative effort. Although this application is described mainly with road test data as an example, it should be noted that the principles of this application can also be applied to the storage and calling of other data that is large in volume and subject to targeted user demand, for example, positioning data, production data, monitoring data, and the like.
FIG. 1 is a schematic diagram of an application scenario of a data processing system according to some embodiments of the present application. In some embodiments, the data processing system 100 may include a vehicle 110 (e.g., vehicles 110-1, 110-2, ... and/or 110-n), a server 120, a terminal device 130, a storage device 140, a network 150, and a positioning and navigation system 160. The data processing system 100 can be applied to ride-hailing services, security systems, network monitoring, unmanned driving, and the like. It should be noted that the description of autonomous driving in this application is for illustrative purposes only and does not limit the scope of this application.
The vehicle 110 may be any type of autonomous vehicle, unmanned aerial vehicle, or the like. An unmanned vehicle or unmanned aerial vehicle may refer to a vehicle capable of achieving a certain level of driving automation. Exemplary levels of driving automation may include: a first level, in which the vehicle is mainly supervised by a human and has specific autonomous functions (e.g., autonomous steering or acceleration); a second level, in which the vehicle has one or more advanced driver assistance systems (ADAS) (e.g., an adaptive cruise control system, a lane keeping system) that can control the braking, steering and/or acceleration of the vehicle; a third level, in which the vehicle can drive itself when one or more specific conditions are met; a fourth level, in which the vehicle can operate without human input or supervision but is still subject to certain constraints (e.g., restricted to a certain area); a fifth level, in which the vehicle can operate autonomously in all situations; or the like, or any combination thereof. The vehicle 110 may also be a vehicle or other means of transport that is driven under human control for the purpose of collecting data.
In some embodiments, the vehicle 110 may have an equivalent structure that enables the vehicle 110 to move around or fly. For example, the vehicle 110 may include the structures of a conventional vehicle, such as a chassis, a suspension, a steering device (e.g., a steering wheel), a braking device (e.g., a brake pedal), an accelerator, and the like. As another example, the vehicle 110 may have a body and at least one wheel. The body may be of any body type, such as a sports car, a coupe, a sedan, a pickup truck, a station wagon, a sport utility vehicle (SUV), a minivan, or a conversion van. The at least one wheel may be configured for all-wheel drive (AWD), front-wheel drive (FWD), rear-wheel drive (RWD), or the like. In some embodiments, the vehicle 110 may be an electric vehicle, a fuel cell vehicle, a hybrid vehicle, a conventional internal combustion engine vehicle, or the like.
In some embodiments, the vehicle 110 can sense its environment and navigate using one or more detection units 112. The detection unit 112 may include a global positioning system (GPS) module, a radar (e.g., light detection and ranging (LiDAR)), an inertial measurement unit (IMU), a camera, or the like, or any combination thereof. The radar (e.g., LiDAR) may be used to scan the surrounding environment and generate point cloud data. The point cloud data may then be used to create a digital 3D representation of one or more objects around the vehicle 110. The GPS module may refer to a device capable of receiving geographic location and time information from GPS satellites and calculating its own geographic location. The IMU may refer to an electronic device that uses various inertial sensors to measure and provide the specific force and angular velocity of the vehicle, and sometimes the magnetic field around the vehicle. The various inertial sensors may include an acceleration sensor (e.g., a piezoelectric sensor), a speed sensor (e.g., a Hall sensor), a distance sensor (e.g., a radar, a LiDAR, an infrared sensor), a steering angle sensor (e.g., a tilt sensor), and a traction-related sensor (e.g., a force sensor). The camera may be configured to acquire one or more images related to objects (e.g., people, animals, trees, roadblocks, buildings, or vehicles) within its field of view.
In some embodiments, the server 120 may be a single server or a server group. The server group may be centralized or distributed (e.g., the server 120 may be a distributed system). In some embodiments, the server 120 may be local or remote. For example, the server 120 may access, via the network 150, information and/or data stored in the terminal device 130, the detection unit 112, the vehicle 110, the storage device 140, and/or the positioning and navigation system 160. As another example, the server 120 may be directly connected to the terminal device 130, the detection unit 112, the vehicle 110 and/or the storage device 140 to access stored information and/or data. In some embodiments, the server 120 may be implemented on a cloud platform or an on-board computer. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-layer cloud, or the like, or any combination thereof. In some embodiments, the server 120 may be implemented on a computing device that includes one or more components.
In some embodiments, the server 120 may include a processing device 122. The processing device 122 may process information and/or data to perform one or more of the functions described in this application. For example, the processing device 122 may establish corresponding target data sets according to the type information of the data elements in an original data set, and store each data element in the target data set that corresponds to its type. Further, the processing device 122 may store the target data sets containing the data elements in the storage device 140 or in other storage devices or systems. As another example, the processing device 122 may build a query index for the data stored in the storage device 140 or in other storage devices or systems. Specifically, the original data may include data generated by multiple vehicles during a road test, and may include camera data, radar data, and the like. The processing device 122 may build the query index based on the trip ID of each vehicle, the time range, and the types of the data elements in the original data. In some embodiments, the processing device 122 may include one or more processing engines (e.g., a single-chip processing engine or a multi-chip processing engine). Merely by way of example, the processing device 122 may include one or more hardware processors, such as a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction set computer (RISC), a microprocessor, or the like, or any combination thereof. In some embodiments, the processing device 122 may be integrated in the terminal device 130.
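The type-based storage and query index described for the processing device 122 can be illustrated with a short sketch. Everything named here is an assumption made for the example: the function `store_by_type`, the tuple layout `(timestamp, data_type, payload)`, the set identifiers, and the use of a list position as the "storage location". An actual system would write to files or a distributed file system rather than in-memory lists.

```python
from collections import defaultdict


def store_by_type(raw_sets):
    """Split original data sets into per-type target data sets plus an index.

    raw_sets: mapping of set_id -> list of (timestamp, data_type, payload).
    Returns (target_sets, index), where index[data_type] is a list of
    (timestamp, set_id, position) tuples in chronological order.
    """
    # Flatten all elements and sort by time so each target set is chronological.
    flat = [
        (timestamp, set_id, data_type, payload)
        for set_id, elements in raw_sets.items()
        for timestamp, data_type, payload in elements
    ]
    flat.sort(key=lambda item: item[0])

    target_sets = defaultdict(list)
    index = defaultdict(list)
    for timestamp, set_id, data_type, payload in flat:
        position = len(target_sets[data_type])  # hypothetical storage location
        target_sets[data_type].append(payload)
        index[data_type].append((timestamp, set_id, position))
    return dict(target_sets), dict(index)


# Two hypothetical original data sets, one per vehicle trip.
raw_sets = {
    "car1_trip1": [(1.0, "camera", b"img1"), (2.0, "lidar", b"pc1")],
    "car2_trip7": [(1.5, "camera", b"img2")],
}
target_sets, index = store_by_type(raw_sets)
```

With this layout, a request for camera data touches only `target_sets["camera"]` and its index, and never has to read the lidar payloads.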
In some embodiments, the terminal device 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a vehicle built-in device 130-4, 130-5, or the like, or any combination thereof. In some embodiments, the mobile device 130-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home device may include a smart lighting device, a control device for smart appliances, a smart monitoring device, a smart TV, a smart camera, a walkie-talkie, or the like, or any combination thereof. In some embodiments, the wearable device may include a smart bracelet, smart footwear, smart glasses, a smart helmet, a smart watch, smart clothing, a smart backpack, a smart accessory, or the like, or any combination thereof. In some embodiments, the smart mobile device may include a smartphone, a personal digital assistant (PDA), a gaming device, a navigation device, a point-of-sale (POS) device, or the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, virtual reality glasses, a virtual reality eye mask, an augmented reality helmet, augmented reality glasses, an augmented reality eye mask, or the like, or any combination thereof. For example, the virtual reality device and/or the augmented reality device may include Google Glass, Oculus Rift, HoloLens, Gear VR, and the like. In some embodiments, the vehicle built-in device 130-4 may include an on-board computer, an on-board television, and the like. In some embodiments, the server 120 may be integrated into the terminal device 130. In some embodiments, the terminal device 130 may include a device with a positioning function to determine the location of the user and/or the terminal device 130.
The storage device 140 may store data and/or instructions. In some embodiments, the storage device 140 may store data obtained from the vehicle 110, the detection unit 112, the processing device 122, the terminal device 130, the positioning and navigation system 160, and/or an external device. For example, the storage device 140 may store road test data obtained from the vehicle 110. In some embodiments, the storage device 140 may store data and/or instructions that can be executed or used to perform the exemplary methods described in this application. For example, the storage device 140 may store instructions that the processing device 122 can execute to store and/or call road test data. In some embodiments, the storage device 140 may include a mass storage device, a removable storage device, a volatile read-write memory, a read-only memory (ROM), or the like, or any combination thereof. Exemplary mass storage devices may include magnetic disks, optical disks, solid-state drives, and the like. Exemplary removable storage devices may include flash drives, floppy disks, optical disks, memory cards, magnetic disks, magnetic tapes, and the like. Exemplary volatile read-write memory may include random access memory (RAM). Exemplary RAM may include dynamic random access memory (DRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), static random access memory (SRAM), thyristor random access memory (T-RAM), zero-capacitor random access memory (Z-RAM), and the like. Exemplary ROM may include mask read-only memory (MROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM), digital versatile disc read-only memory, and the like. In some embodiments, the storage device 140 may further include a distributed file system (e.g., the Hadoop Distributed File System, HDFS). In some embodiments, distributed file systems may be deployed in different regions (e.g., different countries, different areas, different sites, etc.) and associated with one another. A user can access the distributed file system of the user's own region to obtain the data stored in it, and can also call data in the distributed file systems of other regions through the distributed file system of the user's own region. For example, the distributed file systems may include a first distributed file system and a second distributed file system, where the server of the first distributed file system belongs to a first region and the server of the second distributed file system belongs to a second region. The road test data is collected in the first region and stored in the first distributed file system according to any of the methods shown in the embodiments of this application. The client is located in the second region, and the distance between the client and the server of the second distributed file system is smaller than the distance between the client and the server of the first distributed file system. In some embodiments, the user may call at least part of the data stored in the first distributed file system through any of the methods shown in the embodiments of this application. In some embodiments, the data processing system 100 may synchronize at least part of the data in the first distributed file system to the second distributed file system, and the user may call that data from the second distributed file system through any of the methods shown in the embodiments of this application. In some embodiments, the storage device 140 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-layer cloud, or the like, or any combination thereof.
In some embodiments, the storage device 140 may be connected to the network 150 to communicate with one or more components of the data processing system 100 (e.g., the server 120, the terminal device 130, the detection unit 112, the vehicle 110, and/or the positioning and navigation system 160). One or more components of the data processing system 100 may access the data or instructions stored in the storage device 140 via the network 150. In some embodiments, the storage device 140 may be directly connected to, or communicate directly with, one or more components of the data processing system 100 (e.g., the server 120, the terminal device 130, the detection unit 112, the vehicle 110, and/or the positioning and navigation system 160). In some embodiments, the storage device 140 may be part of the server 120. In some embodiments, the storage device 140 may be integrated in the vehicle 110.
The network 150 may facilitate the exchange of information and/or data. In some embodiments, one or more components of the data processing system 100 (e.g., the server 120, the terminal device 130, the detection unit 112, the vehicle 110, the storage device 140, and/or the positioning and navigation system 160) may send information and/or data to, or obtain information and/or data from, other components of the data processing system 100 via the network 150. For example, the processing device 122 may obtain road test data from the vehicle 110 via the network 150. As another example, the processing device 122 may obtain, via the network 150, a data calling request entered by a user on the terminal device 130. In some embodiments, the network 150 may be a wired network, a wireless network, or the like, or any combination thereof. Merely by way of example, the network 150 may include a cable network, a wired network, a fiber-optic network, a telecommunications network, an intranet, the Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public switched telephone network (PSTN), a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network 150 may include one or more network access points. For example, the network 150 may include wired or wireless network access points (e.g., base stations and/or Internet exchange points 150-1, 150-2), through which one or more components of the data processing system 100 may connect to the network 150 to exchange data and/or information.
The positioning and navigation system 160 may determine information associated with an object, for example, the terminal device 130, the vehicle 110, or the like. In some embodiments, the positioning and navigation system 160 may be the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the Compass navigation system (COMPASS), the BeiDou Navigation Satellite System, the Galileo positioning system, the Quasi-Zenith Satellite System (QZSS), or the like. The information may include the position, altitude, speed or acceleration of the object, the current time, and the like. The positioning and navigation system 160 may include one or more satellites, such as satellite 160-1, satellite 160-2 and satellite 160-3. The satellites 160-1 to 160-3 may determine the above information independently or jointly. The positioning and navigation system 160 may send the above information to the network 150, the terminal device 130 or the vehicle 110 via a wireless connection.
A person of ordinary skill in the art will understand that when an element (or component) of the data processing system 100 performs an operation, the element may do so through electrical signals and/or electromagnetic signals. For example, when the terminal device 130 sends a request to the server 120, the processor of the terminal device 130 may generate an electrical signal that encodes the request. The processor of the terminal device 130 may then transmit the electrical signal to an output port. If the terminal device 130 communicates with the server 120 via a wired network, the output port may be physically connected to a cable, which in turn transmits the electrical signal to an input port of the server 120. If the terminal device 130 communicates with the server 120 via a wireless network, the output port of the terminal device 130 may be one or more antennas, which convert the electrical signal into an electromagnetic signal. Within an electronic device such as the terminal device 130 and/or the server 120, when its processor processes an instruction, issues an instruction and/or performs an action, the instruction and/or action is carried out through electrical signals. For example, when the processor retrieves data from or saves data to a storage medium (e.g., the storage device 140), it may send an electrical signal to a read/write device of the storage medium, and the read/write device may read structured data from or write structured data to the storage medium. The structured data may be sent to the processor in the form of electrical signals over a bus of the electronic device. Here, an electrical signal may refer to one electrical signal, a series of electrical signals, and/or a plurality of discrete electrical signals.
FIG. 2 is a block diagram of an exemplary processing device according to some embodiments of the present application. In some embodiments, the processing device 122 may be used for data storage. As shown in FIG. 2, the processing device 122 may include an original data set acquisition module 210, a target data set establishment module 220, an index establishment module 230, and a storage module 240.
The original data set acquisition module 210 may be used to acquire an original data set. The original data set includes a plurality of data elements, each having type information that indicates the type of the data element. In some embodiments, a data set may refer to a collection of data that includes a plurality of data elements. In some embodiments, a data set may be a file whose data elements are messages. In some embodiments, different data sets, and/or the data elements within the same data set, may each have their own identification information.
In some embodiments, taking road test data as an example, the original data set acquisition module 210 may acquire the original data set (i.e., the road test data) from a test vehicle (e.g., the vehicle 110) via the network 150. Specifically, the road test data may be messages of a temporal nature. The original data set acquisition module 210 may organize the messages collected by one test vehicle during one test trip into a file (e.g., a bag file) for storage, thereby obtaining an original data set. Further, the original data set acquisition module 210 may use identification information related to the test vehicle and the trip as the identification information of the file; for example, the identification information of the file may be set according to the id of the test vehicle and the id of the test trip. In some embodiments, the original data set acquisition module 210 may also use the time information of a message as the identification information of the message within the file. For example, the identification information of the message within the file may be set according to the timestamp of the message.
In some embodiments, each data element in the original data set has type information that indicates the type of the data element. In some embodiments, the types may include one or more of an image type, a location type, a sensor type, a packets type, and a Controller Area Network Bus (CAN Bus) type.
The target data set establishment module 220 may be used to establish different target data sets according to the type information of the data elements in the original data set. Each type of data element may correspond to one target data set. For example, if the data elements in the original data set have N different types, N different target data sets may be established, one for each type of data element, where N is an integer greater than or equal to 2. In some embodiments, the target data set establishment module 220 may identify the type of each data element in the original data set and determine the number of different types. Further, the target data set establishment module 220 may establish different target data sets corresponding to the different types of data elements. For example, the type of a data element may be represented by the type of device that acquired the road test data. The devices may include a camera, a radar, an inertial measurement unit (IMU), and the like. The original data may then include camera data, radar data, IMU data, and the like, and the different target data sets may include a camera target data set, a radar target data set, and an IMU target data set. As another example, the type of a data element may be represented by its data type. The data types may include audio data, image data, text data, and the like, and the different target data sets may include an audio target data set, an image target data set, a text target data set, and the like.
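Merely by way of illustration, the per-type grouping described above might be sketched as follows. The names `DataElement`, `elem_type`, and `build_target_datasets` are hypothetical and do not appear in the present application; the sketch assumes each data element carries its type as a plain string.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """One message of an original data set (hypothetical structure)."""
    timestamp: float  # acquisition time, used as the element's identifier
    elem_type: str    # type information, e.g. "camera", "radar", "imu"
    payload: bytes    # raw message content


def build_target_datasets(original_dataset):
    """Group data elements into one target data set per type."""
    targets = defaultdict(list)
    for element in original_dataset:
        targets[element.elem_type].append(element)
    return dict(targets)


original = [
    DataElement(0.0, "camera", b"img0"),
    DataElement(0.1, "radar", b"scan0"),
    DataElement(0.2, "imu", b"imu0"),
    DataElement(0.3, "camera", b"img1"),
]
targets = build_target_datasets(original)
print(sorted(targets))  # the N distinct types found in the original data set
```

Here the dictionary key plays the role of the identification information of each target data set, since that identification corresponds to the type.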
In some embodiments, the target data set establishment module 220 may also set the identification information of a target data set according to the type to which the target data set corresponds, so that target data sets of different types can be distinguished. Because the identification information of a target data set corresponds to its type information, in some embodiments the identification information may include the type information shared by all data elements in the same target data set.
The index establishment module 230 may establish an index of the original data set to provide an indexing function for the data elements in the original data set. In some embodiments, the index establishment module 230 may determine meta-service information of the original data set, also referred to simply as meta-information. The meta-service information may be used to describe the structure, semantics, purpose, usage, etc. of the original data set or of the data elements in the original data set. In some embodiments, the meta-service information may also be referred to as index information, or may include index information, and is used to determine the storage location of the original data set, or of the data elements in the original data set, in a storage device. In some embodiments, the meta-service information may at least include, in one-to-one correspondence with each data element in the target data set, element identification information (e.g., a timestamp) and storage location information (e.g., an offset). The element identification information refers to the identification information of the corresponding data element. In some embodiments, the meta-service information may further include the identification information of the target data set, which corresponds to the type. In some embodiments, the meta-service information of a target data set may further include set identification information of the original data set to which each data element in the target data set belongs (e.g., the id of the test vehicle and/or the id of the test trip).
In some embodiments, the index establishment module 230 or the storage module 240 may store the meta-service information in a storage device, e.g., the storage module 240, the storage device 140, or another storage device. The processing device 122 (e.g., the calling module 320) may access the storage device based on a user's data calling request and, based on the meta-service information and the identification information of a target data set (e.g., an image target data set), locate the data elements corresponding to the user's data calling request, i.e., determine the storage location of the data elements in the storage device. In some embodiments, the index establishment module 230 may establish the meta-service information of each target data set separately, and store the meta-service information of each target data set as a list in a storage device, e.g., the storage device 140 or another storage device.
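As a minimal sketch of such meta-service information, the following example writes the data elements of one target data set to a byte stream and records one index entry per element. All names are hypothetical; the `length` field is an assumption added so that an element can be read back, since the description above mentions only an offset, and an in-memory `io.BytesIO` stands in for the storage device.

```python
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One row of the meta-service information (hypothetical layout)."""
    dataset_id: str   # identification of the target data set (its type)
    set_id: str       # identification of the source original data set
    timestamp: float  # element identification information
    offset: int       # storage location information
    length: int       # assumption: payload size, not named in the text


def store_with_index(dataset_id, set_id, messages, stream):
    """Append each payload to the stream and return the index as a list."""
    entries = []
    for timestamp, payload in messages:
        offset = stream.tell()  # position where this payload starts
        stream.write(payload)
        entries.append(
            IndexEntry(dataset_id, set_id, timestamp, offset, len(payload))
        )
    return entries


def read_element(entry, stream):
    """Fetch one data element using only its index entry."""
    stream.seek(entry.offset)
    return stream.read(entry.length)


stream = io.BytesIO()
index = store_with_index(
    "camera", "car7-trip3", [(0.0, b"img0"), (0.5, b"image1")], stream
)
second = read_element(index[1], stream)
```

Keeping the index as a plain list per target data set mirrors the list-style storage mentioned above; a real deployment might use a database or key-value store instead.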
The storage module 240 may be used to store, based on the type information of the data elements in the original data set and on the target data sets, the data elements corresponding to each target data set in that target data set. In some embodiments, the storage module 240 may determine the type of each data element in the original data set and store the data element in the corresponding target data set. For example, if the data element is an image data element, the processing device 122 may store it in the image target data set. Further, the storage module 240 may also store the target data set (e.g., the image target data set) and the data elements stored therein in a storage device, e.g., the storage device 140 (e.g., a distributed file system). In some embodiments, the storage module 240 may store the data elements of the same original data set in a physically contiguous memory space, or may store them in a physically non-contiguous memory space with the non-contiguously stored data elements linked by pointers. Merely as an example, in some embodiments the storage module 240 may store the data elements of the target data set in a distributed file system (HDFS), and the storage may be physical storage.
In some embodiments, the storage module 240 may also store the meta-service information corresponding to the target data set (e.g., the image target data set) in a storage device, e.g., the storage device 140 (e.g., a distributed file system). The meta-service information may point to the target data set by means of a pointer, and a user can locate the target data set through the meta-service information. In some embodiments, the storage module 240 may store the meta-service information corresponding to the target data set as a list. In some embodiments, the storage device that stores the target data set and its data elements and the storage device that stores the meta-service information may be the same or different.
It should be noted that the above description of the processing device 122 and its modules is provided for convenience of description only and does not limit the present application to the scope of the illustrated embodiments. It will be understood that those skilled in the art, having understood the principle of the system, may combine the modules arbitrarily, or form a subsystem connected to other modules, without departing from that principle. For example, the original data set acquisition module 210, the target data set establishment module 220, the index establishment module 230, and the storage module 240 disclosed in FIG. 2 may be different modules in one system, or a single module may implement the functions of two or more of these modules. For example, the storage module 240 and the index establishment module 230 may be two independent modules, or a single module may provide the data storage, index establishment, and caching functions together. Variations such as these are all within the protection scope of the present application.
FIG. 3 is a block diagram of another exemplary processing device according to some embodiments of the present application. In some embodiments, the processing device 122 may be used to call data. As shown in FIG. 3, the processing device 122 may include a user request acquisition module 310 and a calling module 320.
The user request acquisition module 310 may be used to acquire a user's data calling request, where the data calling request at least includes the type to which the data to be called belongs. In some embodiments, the data calling request may be input by the user through a mobile device (e.g., an input/output interface of the terminal device 130) or a computing device. For example, the input/output interface of the terminal device 130 may include an input device such as a keyboard, a mouse, a touch screen, a microphone, a trackball, or any combination thereof, and the user may use the input device to enter the data calling request. In some embodiments, the user request acquisition module 310 may acquire the data calling request (e.g., via the network 150). In some embodiments, the data includes road test data. The data calling request may further include information such as the time range over which the road test data was collected, the id of the test vehicle, and/or the id of the test trip.
The calling module 320 may be used to acquire, based on the data calling request, the data elements in the target data set of the corresponding type, thereby obtaining the data to be called. In some embodiments, the calling module 320 may include an index information acquisition unit 322, a segment storage unit 323, and a data element acquisition unit 324.
The index information acquisition unit 322 may be used to acquire, based on the data calling request, the index information corresponding to the data calling request. The data element acquisition unit 324 may be used to acquire data elements based on the index information. In some embodiments, the data processing system 100 may provide an indexing mechanism. For example, the index establishment module 230 may determine the meta-service information of the target data set, and the meta-service information may be stored in a storage device. The index information acquisition unit 322 may obtain the user's index request information from the data calling request. Further, the data element acquisition unit 324 may match the index request information in the data calling request against the meta-service information stored in the storage device, thereby determining the index information (or meta-service information) that matches the index request information, and acquire the data elements from the storage location to which the index information or meta-service information points.
In some embodiments, the data calling request at least includes the type to which the data to be called belongs. In some embodiments, the data to be called may belong to one type or to multiple types. Because the identification information of a target data set corresponds to its type information, the index information acquisition unit 322 may access the meta-service information based on the one or more types selected by the user and contained in the data calling request (i.e., the user's index request information), and determine the index information in the meta-service information that matches the one or more types. Further, the data element acquisition unit 324 may determine the storage location of the corresponding target data set based on the index information, and thereby call the data elements in the corresponding target data set.
In some embodiments, the data calling request may further include additional filter conditions related to the data to be called. For example, in some embodiments, the index information may at least include, in one-to-one correspondence with each data element in the target data set, element identification information (e.g., a timestamp) and storage location information. Correspondingly, the data calling request may further include an element-level condition on the data to be called that relates to the element identification information (e.g., a time range). The index information acquisition unit 322 may access the meta-service information based on the data calling request and determine the index information in the meta-service information that corresponds to the one or more types and satisfies the element-level condition. Further, the data element acquisition unit 324 may acquire the data elements from the storage location corresponding to the index information.
In some embodiments, the index information may further include set identification information of the original data set to which each data element in the target data set belongs, where the set identification information refers to the identification information of the original data set. Correspondingly, the data calling request may further include a set-level condition on the data to be called that relates to the set identification information. The index information acquisition unit 322 may access the meta-service information based on the data calling request and determine the index information in the meta-service information that corresponds to the one or more types and satisfies the element-level condition and/or the set-level condition. Further, the data element acquisition unit 324 may acquire the data elements from the storage location corresponding to the index information.
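The matching of a data calling request against the index might be sketched as below, under the assumption that the index is a list of entries and that the request carries a set of types plus optional time-range and set-id conditions. `IndexEntry` and `match_index` are illustrative names only, not part of the present application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Index information for one data element (hypothetical layout)."""
    elem_type: str    # type, i.e. which target data set holds the element
    set_id: str       # set identification, e.g. vehicle id plus trip id
    timestamp: float  # element identification information
    offset: int       # storage location information


def match_index(index, types, time_range=None, set_ids=None):
    """Return entries of the requested types that satisfy the optional
    element-level (time range) and set-level (set id) conditions."""
    matched = []
    for entry in index:
        if entry.elem_type not in types:
            continue
        if time_range is not None:
            start, end = time_range  # inclusive bounds, an assumption
            if not (start <= entry.timestamp <= end):
                continue
        if set_ids is not None and entry.set_id not in set_ids:
            continue
        matched.append(entry)
    return matched


index = [
    IndexEntry("camera", "car7-trip3", 0.0, 0),
    IndexEntry("camera", "car7-trip3", 12.0, 100),
    IndexEntry("radar", "car7-trip3", 5.0, 0),
    IndexEntry("camera", "car9-trip1", 3.0, 0),
]
hits = match_index(
    index, {"camera"}, time_range=(0.0, 10.0), set_ids={"car7-trip3"}
)
```

The type is the only mandatory condition, matching the statement that the request at least includes the type; the other two conditions narrow the result only when supplied.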
In some embodiments, the data may include road test data. The user's data calling request may include index request information such as the type of the road test data collection device, the time range over which the road test data was collected, the id of the test vehicle, and/or the id of the test trip. The index request information determined from the data calling request may at least include the id of the test vehicle and/or the id of the test trip, the type to which the data to be called belongs, the time range, the time length of the data, and the like.
In some embodiments, the segment storage unit 323 may be used to acquire the data elements stored in a target data set from the storage device that stores the target data set (e.g., a distributed file system), based on the meta-service information that matches the user's data calling request. Further, based on the time information carried by the data elements (e.g., timestamps), that is, the time information corresponding to the target data set, together with the time information in the user's data calling request, the segment storage unit 323 may divide the target data set into multiple target data subsets at a preset time interval. Each target data subset corresponds to one time interval, and the data elements acquired in each time interval (e.g., every 10 s) are stored in the corresponding target data subset (also referred to as a physical data file). For example, when the length of the time range in the user's data calling request is shorter than the length of the time range of the target data set, the target data set is divided into multiple target data subsets at the preset time interval. For example, if the data elements of each target data set span 100 seconds and the time length in the user's data calling request is 20 seconds, the segment storage unit 323 may divide the target data set into 10 target data subsets, each holding 10 seconds of data elements. As described herein, the time range here may refer to the time range over which the data elements were collected.
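The 100-second example above can be worked through in a short sketch. It assumes one data element per second, integer timestamps, and a half-open request range [t0, t1); the function names are hypothetical.

```python
import math


def split_by_interval(elements, start, interval):
    """Divide (timestamp, payload) pairs into subsets keyed by the index
    of the time interval each timestamp falls into."""
    subsets = {}
    for timestamp, payload in elements:
        key = int((timestamp - start) // interval)
        subsets.setdefault(key, []).append((timestamp, payload))
    return subsets


def subset_keys_for_range(start, interval, t0, t1):
    """Keys of the subsets overlapping the half-open range [t0, t1)."""
    first = int((t0 - start) // interval)
    last = math.ceil((t1 - start) / interval)  # exclusive upper key
    return list(range(first, last))


# A target data set spanning 100 s, one element per second (assumption).
elements = [(t, b"x") for t in range(100)]
subsets = split_by_interval(elements, start=0, interval=10)

# A 20 s request covering [30, 50) touches two 10 s subsets.
wanted = subset_keys_for_range(start=0, interval=10, t0=30, t1=50)
selected = [e for key in wanted for e in subsets[key]]
```

With a 10 s interval, the 20 s request needs two of the ten subsets, so only a fifth of the target data set has to be read or transmitted.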
In some embodiments, the segment storage unit 323 may physically store each target data subset, together with the data elements stored therein, in the memory of the processing device 122. After the user finishes calling the relevant data elements of the target data set, the target data subset and the data elements stored therein may be erased. Further, the data element acquisition unit 324 may, based on the time information in the user's data calling request, acquire from the storage device the data elements in the target data subsets whose time intervals match that time information. For example, when the user is located in the same city or country as the storage device that stores the original data set (referred to as the first storage device, e.g., a distributed file system), the processing device 122 (referred to as the local server) may distribute data to the user's terminal (also referred to as the client, e.g., the terminal device 130) using the above method.
In some embodiments, the segment storage unit 323 may determine, based on the meta-service information that matches the user's data calling request, the target data set to which the meta-service information points, and establish multiple logical files corresponding to the target data set based on the time information (e.g., time range) corresponding to the target data set and the time information in the user's data calling request. For example, when the client is not located in the same region (e.g., city or country) as the storage device that stores the original data set (referred to as the first storage device, e.g., a distributed file system), while the processing device 122 is located in the same region as the first storage device, the processing device 122 may establish the multiple logical files corresponding to the target data set. The processing device 122 may further send the target data set and the data elements stored therein to a second storage device, which is located in the same region (e.g., city or country) as the client. The server hosting the second storage device may divide the received target data set (a physical data file) into multiple target data subsets at a preset time interval and store the data elements in the corresponding target data subsets. The multiple logical files point, by means of pointers, to the target data subsets in the second storage device. The data element acquisition unit 324 may determine the logical files that match the user's calling request by comparing the time information in the request with the time information in each logical file and, based on the target data subsets in the second storage device to which the matched logical files point, instruct the server of the second storage device to send the data elements in the matched target data subsets to the client. Further, the server of the second storage device may merge the data elements from multiple target data subsets before sending them to the client.
A logical file does not store the data elements of a target data subset; it may store information about the data elements (e.g., part of the meta-service information). A logical file may point to a physical data file (i.e., a target data subset) by means of a pointer. For example, the target data set may be divided into multiple target data subsets at a preset time interval, with each target data subset corresponding to one time interval, and one logical file may be established for each target data subset. Each logical file includes the meta-service information of the data elements stored in the corresponding target data subset.
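One possible reading of the logical-file arrangement is sketched below: each logical file holds only a time range and a pointer to a physical subset, and a request is resolved to the subsets it overlaps. The `LogicalFile` structure, the path template, and the half-open time ranges are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalFile:
    """Metadata only; the data elements live in the physical subset."""
    start: float        # start of the time interval (inclusive)
    end: float          # end of the time interval (exclusive)
    physical_path: str  # pointer to the target data subset (hypothetical)


def build_logical_files(dataset_start, dataset_end, interval, path_template):
    """Create one logical file per preset time interval."""
    files = []
    index = 0
    t = dataset_start
    while t < dataset_end:
        files.append(
            LogicalFile(t, min(t + interval, dataset_end),
                        path_template.format(index))
        )
        t += interval
        index += 1
    return files


def resolve(files, t0, t1):
    """Paths of the physical subsets overlapping the request [t0, t1)."""
    return [f.physical_path for f in files if f.start < t1 and f.end > t0]


files = build_logical_files(0, 100, 10, "camera/part-{:02d}.bin")
paths = resolve(files, 30, 50)
```

Because the logical files are small, they can stay with the first storage device while the bulky physical subsets are served from a second storage device near the client.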
Using the time information in the user's calling request, the user may locate the target data subset in the second storage device through the meta-service information in the logical files. In some embodiments, the logical files and the physically stored target data subsets may be linked by pointers. In some embodiments, the second storage device may store the data elements of the same target data subset in a physically contiguous memory space, or in a physically non-contiguous memory space. Merely as an example, in some embodiments the second storage device may store the data elements of the target data subset in a distributed file system (HDFS), and the storage may be physical storage. Accordingly, by accessing the logical files in the first storage device, the user can call the data that actually corresponds to the specified time period (i.e., the physically stored target data subsets) from the second storage device located in the same region as the client, so that part of the data can be called quickly. In some embodiments, the segment storage unit 323 may set the time interval according to the minimum time period that users specify for the data to be called, so that the target data subsets called for a given time period match the data actually corresponding to that time period as closely as possible. In some embodiments, the segment storage unit 323 may directly set the time interval to the minimum time period that users specify for the data to be called.
In some embodiments, the segment storage unit 323 may, based on the time information carried by the data elements (e.g., timestamps), determine and acquire the target data set from the storage device that stores the original data set, divide the target data set into multiple target data subsets at a preset time interval, and store each target data subset separately. Further, when the data calling request also includes the time range of the data to be called, and that time range is shorter than the time range of the target data set, the processing device 122 does not need to send all of the data in the target data set to the client. It only needs to send the data elements corresponding to the time range in the user's data calling request (i.e., the data elements in the matching target data subsets), so that part of the data can be called quickly and data calling becomes more efficient.
It should be noted that the above description of the processing device 122 and its modules is provided for convenience of description only and does not limit the present application to the scope of the illustrated embodiments. It will be understood that those skilled in the art, having understood the principle of the system, may combine the modules arbitrarily, or form a subsystem connected to other modules, without departing from that principle. For example, the user request acquisition module 310 and the calling module 320 disclosed in FIG. 3 may be different modules in one system, or a single module may implement the functions of two or more of these modules. For example, the user request acquisition module 310 and the calling module 320 may be two modules, or a single module may provide both the user request acquisition and the data calling functions. Variations such as these are all within the protection scope of the present application.
It should be understood that the systems and modules shown in FIG. 2 and FIG. 3 may be implemented in various ways. For example, in some embodiments, the system and its modules may be implemented in hardware, in software, or in a combination of software and hardware. The hardware part may be implemented with dedicated logic; the software part may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer-executable instructions and/or processor control code; such code may be provided, for example, on a carrier medium such as a magnetic disk, CD, or DVD-ROM, on a programmable memory such as a read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The system and its modules of the present application may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; by software executed by, for example, various types of processors; or by a combination of the above hardware circuits and software (e.g., firmware).
FIG. 4 is an exemplary flowchart of a data storage method according to some embodiments of the present application. As shown in FIG. 4, the data storage method may include:
Step 410: obtain an original data set. The original data set includes a plurality of data elements, and each data element has type information indicating the type of that data element. In some embodiments, step 410 may be performed by the processing device 122 (e.g., the original data set acquisition module 210).
In some embodiments, a data set may refer to a collection of data that includes a plurality of data elements. For example, when the amount of data is large, it is desirable to organize the data into larger units that are stored separately; each such unit is a data set. In some embodiments, the data set may be a file whose data elements are messages. Further, the file may be in a bag format, hereinafter referred to as a "bag file". In some embodiments, the data elements in a data set may be associated with each other. In some embodiments, different data sets and/or the data elements within the same data set may each have their own identification information. Taking road test data as an example, the road test data collected by a test vehicle may be messages with a temporal nature (for example, each message may have a timestamp). The processing device 122 may organize the messages collected by one test vehicle during one test trip into one file (e.g., a bag file) for storage, thereby obtaining an original data set. Further, the processing device 122 may use identification information related to the test vehicle and the trip as the identification information of the file; for example, the identification information of the file may be set according to the id of the test vehicle and the id of the test trip. In some embodiments, the processing device 122 may also use the time information of a message as the identification information of that message within the file. For example, the identification information of the message in the file may be set according to the timestamp of the message.
In some embodiments, each data element in the original data set has type information indicating the type of that data element. In some embodiments, the types may include one or more of an image type, a location type, a sensor type, a packets type, and a Controller Area Network bus (CAN bus) type.
In some embodiments, a data element may indicate the type to which it belongs by carrying type information. That is, a storage unit large enough to hold the data element and its type information may be allocated to the data element, and the data element and its type information may be organized together according to a preset rule and then stored. In some embodiments, the data element and its type information may be joined by a preset connection symbol, with the data element on a predetermined side of the symbol and its type information on the other side. In some embodiments, the storage unit used to store the data element and its type information may be divided into at least two partitions, including a first partition for storing the data element itself and a second partition for storing the type information of the data element.
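The two ways of attaching type information described above can be sketched as follows. This is only an illustrative sketch: the separator byte, the length-prefixed header, and all function names are assumptions introduced here and do not come from the application.

```python
import struct
from typing import Tuple

# Hypothetical connection symbol; the application does not specify one.
SEPARATOR = b"|"

def pack_with_separator(type_info: str, payload: bytes) -> bytes:
    """Join type information and payload with a preset connection symbol.

    The type information is placed on the left so that splitting on the
    first separator stays correct even if the payload contains the symbol
    (assuming the type name itself never contains it).
    """
    return type_info.encode("utf-8") + SEPARATOR + payload

def unpack_with_separator(record: bytes) -> Tuple[str, bytes]:
    type_bytes, payload = record.split(SEPARATOR, 1)
    return type_bytes.decode("utf-8"), payload

# Hypothetical header: payload length and type length, 4 bytes each.
HEADER = struct.Struct("<II")

def pack_partitioned(type_info: str, payload: bytes) -> bytes:
    """Store the payload in a first partition and the type in a second one."""
    type_bytes = type_info.encode("utf-8")
    return HEADER.pack(len(payload), len(type_bytes)) + payload + type_bytes

def unpack_partitioned(record: bytes) -> Tuple[str, bytes]:
    payload_len, type_len = HEADER.unpack_from(record)
    start = HEADER.size
    payload = record[start:start + payload_len]
    type_bytes = record[start + payload_len:start + payload_len + type_len]
    return type_bytes.decode("utf-8"), payload
```

Either layout lets a reader recover the type of a data element without interpreting its payload, which is what the later splitting step relies on.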
Step 420: establish different target data sets according to the type information of the data elements in the original data set. Each type of data element may correspond to one target data set. For example, if the data elements in the original data set have N types, N different target data sets may be established, each corresponding to a different type of data element, where N is an integer greater than or equal to 2. In some embodiments, step 420 may be performed by the processing device 122 (e.g., the target data set establishment module 220).
In some embodiments, the processing device 122 may identify the type of each data element in the original data set and determine the number of distinct types. Taking road test data as an example, the original data set may include three types of data elements: image, position, and speed. The processing device 122 may determine that the number of types of data elements in the original data set is three and may then establish three different target data sets corresponding to the different types, for example an image-class target data set, a position-class target data set, and a speed-class target data set. As another example, the types of data elements may be divided according to the type of device that acquired the road test data. The devices may include cameras, radars, inertial measurement units (IMUs), and the like, so the original data may include camera data, radar data, IMU data, and so on. The processing device 122 may then establish target data sets corresponding to these types, for example a camera-class target data set, a radar-class target data set, and an IMU-class target data set.
In some embodiments, the processing device 122 may further set the identification information of a target data set according to the type corresponding to that target data set, so that target data sets of different types can be distinguished. Because the identification information of a target data set corresponds to its type information, in some embodiments the identification information may include the type information shared by all data elements in that target data set.
By way of example only, in some embodiments, the processing device 122 may store a target data set in one file and store the identification information corresponding to that target data set in another file. The target data set and its identification information may be linked by a preset connection symbol, with the target data set on a predetermined side of the symbol and its identification information on the other side.
Step 430: based on the type information of the data elements in the original data set and on the target data sets, store the data elements corresponding to each target data set in that target data set. In some embodiments, step 430 may be performed by the processing device 122 (e.g., the storage module 240).
In some embodiments, the processing device 122 may determine the type of each data element in the original data set and store the data element in the corresponding target data set. For example, if the data element is an image-class data element, the processing device 122 may store it in the image-class target data set. FIG. 5 is a schematic diagram of storing different types of data elements from the original data set in the corresponding target data sets according to some embodiments of the present application. As shown in FIG. 5, the data elements of an original data set have three types, A, B, and C, so three target data sets are established, one for each type. The data elements A1, A2, and A3 belonging to type A are stored in the target data set corresponding to type A, the data elements B1 and B2 belonging to type B are stored in the target data set corresponding to type B, and the data elements C1, C2, C3, and C4 belonging to type C are stored in the target data set corresponding to type C.
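The splitting shown in FIG. 5 can be sketched in a few lines. The `DataElement` record and the `split_by_type` function are hypothetical names used only for illustration; the application does not prescribe a concrete data structure.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple

class DataElement(NamedTuple):
    """Hypothetical in-memory form of one data element (message)."""
    type_info: str
    timestamp: float
    payload: bytes

def split_by_type(original: Iterable[DataElement]) -> Dict[str, List[DataElement]]:
    """Create one target data set per type and fill it in original order."""
    targets: Dict[str, List[DataElement]] = defaultdict(list)
    for element in original:
        targets[element.type_info].append(element)
    return dict(targets)

# Interleaved original data set mirroring FIG. 5 (3 of type A, 2 of B, 4 of C).
original = [
    DataElement("A", 1.0, b"A1"), DataElement("C", 1.5, b"C1"),
    DataElement("B", 2.0, b"B1"), DataElement("A", 2.5, b"A2"),
    DataElement("C", 3.0, b"C2"), DataElement("C", 3.5, b"C3"),
    DataElement("B", 4.0, b"B2"), DataElement("A", 4.5, b"A3"),
    DataElement("C", 5.0, b"C4"),
]
targets = split_by_type(original)
```

Because elements are appended in the order they are read, each target data set keeps the chronological order of the original data set whenever the original was itself stored chronologically.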
Further, the processing device 122 may store the identification information corresponding to a target data set in the storage device as meta-service information. In the storage device, the identification information may point to the corresponding target data set (for example, the image-class target data set) through a pointer. In some embodiments, a user may locate the image-class target data set based on the meta-service information. For example, the user may enter a query request related to the image-class target data set through the input/output interface of the terminal device 130, and the processing device 122 may access the storage device based on the query request to determine the location of the image-class target data set.
In some embodiments, the data elements of the same original data set may be stored in a physically contiguous memory space, or they may be stored in a physically non-contiguous memory space with the non-contiguously stored data elements linked by pointers. In some embodiments, the processing device 122 (e.g., the index building module 230) may also build index information for a target data set to provide an indexing function for the data elements in that target data set. In some embodiments, the index information may include at least the meta identification information and the storage location information of each data element in the target data set, in one-to-one correspondence, where the meta identification information refers to the identification information of the corresponding data element. Once the meta identification information of a data element is determined, the storage location information corresponding to it can be determined, and the data element can then be called according to that storage location information. In some embodiments, the index information may further include the identification information of the target data set, so that the processing device 122 can determine the location of the target data set to be called based on the data set identification information in the user's data call request. In some embodiments, the index information may be stored in a storage device; the processing device 122 may access the storage device based on the user's data call request and then use the index information in the storage device to locate the data elements corresponding to that request. In some embodiments, the processing device 122 may build index information for each target data set separately.
In some embodiments, the meta identification information may include the time information of the corresponding data element. Further, in some embodiments, the time information may include a timestamp. A timestamp can be used to uniquely identify the time at which a piece of data was generated (e.g., the time at which the data element was collected). In some embodiments, the storage location information may include an offset. The offset may refer to the distance between the actual address of a storage unit (e.g., the address of the data element) and the segment address of the segment in which it is located (e.g., the target data set). For a specific implementation of the index information, reference may be made to FIG. 6 and its related description. FIG. 6 is a schematic diagram of index information corresponding to a target data set according to some embodiments of the present application. As shown in FIG. 6, the type file represents the target data set, and the type index file represents the index information of that target data set. In the type index file, Timestamp denotes the timestamp and Offset denotes the offset. Each entry (denoted Msg) in the type index file points (links) to the corresponding message in the type file and includes the timestamp and storage location of the message it points to. By building the type index file, the corresponding message can be determined from a timestamp. More generally, by building index information, the corresponding data elements can be determined from time information. For example, when a user wishes to call data from a specific time period, the system can obtain the time period specified by the user, query the index information for entries whose time information falls within that period, and use the storage location information in those entries to locate the data elements belonging to the period, thereby calling the corresponding data elements according to the time period specified by the user.
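A minimal sketch of a type file with its accompanying timestamp/offset index is given below. The length-prefixed record layout and the names `write_type_file` and `read_message` are assumptions made for illustration; the application only states that each index entry holds a timestamp and an offset.

```python
import io
import struct
from typing import BinaryIO, Iterable, List, Tuple

# Hypothetical record layout: a 4-byte little-endian length, then the payload.
LENGTH_PREFIX = struct.Struct("<I")

def write_type_file(
    messages: Iterable[Tuple[float, bytes]], out: BinaryIO
) -> List[Tuple[float, int]]:
    """Append messages to a type file and return (timestamp, offset) entries."""
    index: List[Tuple[float, int]] = []
    for timestamp, payload in messages:
        offset = out.tell()  # distance from the start of the type file
        out.write(LENGTH_PREFIX.pack(len(payload)))
        out.write(payload)
        index.append((timestamp, offset))
    return index

def read_message(data_file: BinaryIO, offset: int) -> bytes:
    """Jump straight to a message using the offset stored in the index."""
    data_file.seek(offset)
    (length,) = LENGTH_PREFIX.unpack(data_file.read(LENGTH_PREFIX.size))
    return data_file.read(length)

type_file = io.BytesIO()
type_index = write_type_file([(1.0, b"a"), (2.0, b"bcd")], type_file)
```

With this layout, looking up a timestamp in the index yields an offset that can be passed directly to `read_message`, so the type file never has to be scanned from the beginning.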
In some embodiments, the data elements in a target data set may be stored contiguously in chronological order. In that case, for the time period that the user specifies for the data to be called, the system can determine the start time and end time of the period and, using these together with the index information, determine the data element corresponding to the start time (the "start data element") and the data element corresponding to the end time (the "end data element"). It can then call all data elements from the start data element to the end data element, that is, all data elements belonging to the period. Referring again to FIG. 6, the messages in the type file are stored contiguously in timestamp order. The system determines a start timestamp (Start Timestamp) and an end timestamp (End Timestamp), queries the index information for the start offset and end offset corresponding to them, determines the corresponding start message and end message in the type file from those offsets, and calls all messages from the start message to the end message (e.g., Msg3 to Msg5 in FIG. 6). In this way, all data elements belonging to the specified time period can be called by locating only the start and end data elements, which improves data call efficiency compared with locating every data element in the period individually.
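The start/end lookup can be sketched with a binary search over a timestamp-sorted index. The function name `select_range`, the inclusive treatment of both bounds, and the sample offsets are assumptions for illustration only.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

def select_range(
    index: List[Tuple[float, int]], start_ts: float, end_ts: float
) -> List[Tuple[float, int]]:
    """Return the index entries whose timestamps lie in [start_ts, end_ts].

    Assumes `index` is sorted by timestamp, which holds when the target
    data set is stored contiguously in chronological order.
    """
    timestamps = [ts for ts, _ in index]
    first = bisect_left(timestamps, start_ts)   # position of the start element
    last = bisect_right(timestamps, end_ts)     # one past the end element
    return index[first:last]

# Six messages, Msg1..Msg6, at timestamps 1..6 with made-up offsets.
sample_index = [(1, 0), (2, 100), (3, 200), (4, 300), (5, 400), (6, 500)]
selected = select_range(sample_index, 3, 5)     # entries for Msg3..Msg5
```

Only two binary searches are needed regardless of how many messages fall inside the period, and because the messages are contiguous the selected span can be read from the type file in a single pass from the first offset to the end of the last message.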
It is worth noting that, in some embodiments, the data elements in the original data set may also be stored contiguously in chronological order. In that case, when the original data set is divided into multiple target data sets by data element type, the data elements of the same type taken from the original data set can be concatenated in the order in which they appeared in the original data set, yielding target data sets whose data elements are stored contiguously in chronological order.
In some embodiments, the index information of a target data set may further include the set identification information of the original data set to which each data element in the target data set belongs. As described above, the set identification information may include the id of the test vehicle and/or the id of the test trip. Accordingly, based on an index condition that the user sets on the set identification information, only the data elements in the target data set that satisfy the condition are called. For example, in some embodiments, the index condition may include a range of test vehicle ids, a range of test trip ids, the ids of one or more specific test vehicles or test trips, or any combination thereof.
It should be noted that the above description of the data storage method 400 is provided only for convenience of description and does not limit the present application to the scope of the illustrated embodiments. It can be understood that those skilled in the art, after understanding the principle of the method, may combine the steps arbitrarily, or add or delete steps, without departing from that principle.
By splitting the original data set into multiple target data sets corresponding to different types and storing them separately, a user calling data can obtain it directly from the target data set that matches the type of the data to be called. Compared with extracting data elements of certain types from the original data set, this way of calling is direct and touches less data, so the portion of data that meets the user's specific needs can be called efficiently from a very large volume of data. With the data storage method provided by the embodiments of the present application, data elements of the same type in the original data set can be stored in one target data set, and the system only needs to find and access the target data set of the user-specified type in order to call data of that type. In addition, the data elements in a target data set can be stored contiguously in chronological order, so the user can further obtain target data of a specified type within a specified time period. Further, a target data set can be divided into multiple target data subsets at a preset time interval, each cached separately, so that based on a data call request the user obtains only the data elements of some of those subsets, which allows part of the data to be called quickly. Compared with calling data from the original data set, the data storage method provided by the embodiments of the present application makes the data call process simpler and the amount of data accessed smaller, and can therefore improve the efficiency of data calls.
FIG. 7 is an exemplary flowchart of a data calling method according to some embodiments of the present application. As shown in FIG. 7, the data calling method may include:
Step 710: obtain a data call request sent by a user terminal, where the data call request includes at least the type to which the data to be called belongs. In some embodiments, step 710 may be performed by the processing device 122 (e.g., the user request acquisition module 310).
In some embodiments, the data call request may be entered by the user through a mobile device (e.g., the input/output interface of the terminal device 130) or a computing device. For example, the input/output interface of the terminal device 130 may include an input device such as a keyboard, a mouse, a touch screen, a microphone, a trackball, or any combination thereof, and the user may use the input device to enter the data call request. In some embodiments, the data call request may then be sent (e.g., via the network 150) to the processing device 122 and/or other components of the data processing system 100. By way of example only, the mobile device or computing device may provide a data query interface that lets the user enter filter conditions related to the data to be called. After obtaining the filter conditions entered by the user, the mobile device or computing device generates a corresponding data call request and sends it to the processing device 122 and/or other components of the data processing system 100. In some embodiments, the data includes road test data, and the data call request may further include information such as the collection time range of the road test data, the id of the test vehicle, and/or the id of the test trip.
Step 720: based on the data call request, obtain part of the original data as the data to be called, where that part includes data elements in the target data set corresponding to the type to which the data to be called belongs. In some embodiments, step 720 may be performed by the processing device 122 (e.g., the calling module 320).
In some embodiments, the data processing system 100 may provide an indexing mechanism. For example, as described with reference to FIG. 4, the processing device 122 may determine meta-service information of the target data set to provide an indexing function for the data elements in the target data set. The meta-service information may be stored in a storage device. The processing device 122 (e.g., the index information acquisition unit 322) may obtain the user's index request information from the data call request. The processing device 122 may then match the index request information in the data call request against the meta-service information stored in the storage device, determine the index information (or meta-service information) that matches the index request information, and obtain the data elements from the storage location to which that index information or meta-service information points.
In some embodiments, the data call request includes at least the type to which the data to be called belongs. In some embodiments, the data to be called may belong to one type or to multiple types. For example, the data query interface provided by the mobile device or computing device may display multiple candidate types; after the user selects the one or more types to which the data to be called belongs, the mobile device or computing device generates a data call request that includes the selected types and sends it to the data processing system 100. In some embodiments, the index information may include at least the identification information of the target data set. Because the identification information of a target data set corresponds to its type information, the processing device 122 can access the meta-service information based on the one or more user-selected types contained in the data call request (that is, the user's index request information) and determine the index information in the meta-service information that corresponds to those types. Further, the processing device 122 may determine the storage location of the corresponding target data set based on the index information and then call the data elements in that target data set.
In some embodiments, the data call request may further include additional filter conditions related to the data to be called. It is worth noting that, when the data call request includes multiple user-selected types, the processing device 122 may obtain the data to be called that satisfies the multiple filter conditions in several ways. For example, in some embodiments, the processing device 122 may first determine the target data sets that correspond one-to-one to the user-selected types and then, according to the other filter conditions in the data call request, select the qualifying data elements from the target data set of each type. As another example, in some embodiments, the processing device 122 may first select, from the target data sets of all types, the data elements that satisfy the other filter conditions and then select from those the data elements that belong to the user-selected types. As yet another example, the processing device 122 may access the meta-service information based on the data call request and use the index information in the meta-service information to determine the storage locations of the data to be called that satisfies the multiple filter conditions. Each of these approaches yields the data to be called that satisfies the multiple filter conditions.
In some embodiments, the index information may include at least the meta identification information and the storage location information of each data element in the target data set, in one-to-one correspondence, where the meta identification information refers to the identification information of the corresponding data element. Correspondingly, the data call request may further include meta-qualification conditions on the meta identification information of the data to be called. The processing device 122 may access the meta-service information based on the data call request and determine the index information in the meta-service information that corresponds to the one or more types and satisfies the meta-qualification conditions. Further, the processing device 122 may obtain the data elements from the storage locations corresponding to that index information.
In some embodiments, the data may include road test data. The user's data call request may include index request information such as the type of road test data collection device, the collection time range of the road test data, the id of the test vehicle, and/or the id of the test trip. The index request information determined from the data call request may include at least the id of the test vehicle and/or the id of the test trip, the type to which the data to be called belongs, the time range, the time length of the data, and the like.
In some embodiments, the data processing system 100 may provide an edge caching mechanism. For example, in some embodiments, the processing device 122 (e.g., the segment storage unit 323) may, based on the meta-service information that matches the user's data call request, obtain the data elements stored in the target data set from the storage device (e.g., a distributed file system) that stores the target data set. Further, the processing device 122 may, based on the time information (e.g., timestamps) identified by the data elements, that is, the time information corresponding to the target data set, together with the time information in the user's data call request, divide the target data set into multiple target data subsets at a preset time interval. Each target data subset corresponds to one time interval, and the data elements acquired in each time interval (e.g., every 10 s) are stored in the corresponding target data subset (also referred to as a physical data file). For example, the target data set is divided in this way when the length of the time range in the user's data call request is smaller than the length of the time range of the target data set. For example, if the data elements of each target data set span 100 seconds and the time length in the user's data call request is 20 seconds, the processing device 122 may divide the target data set into 10 target data subsets, each holding 10 seconds of data elements. As used herein, the time range may refer to the time range in which the data elements were collected. In some embodiments, the time interval may be set according to the minimum time period that users specify for the data to be called, so that the target data subsets called for a given time period match, as closely as possible, the data that actually corresponds to that time period. In some embodiments, the minimum time period specified by users for the data to be called may be used directly as the time interval.
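The 100-second example above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of the application: the names `DataElement`, `split_by_interval`, and `subsets_for_request` are hypothetical, timestamps are assumed to be integer seconds, and the requested range is assumed to be half-open.

```python
import math
from dataclasses import dataclass


@dataclass
class DataElement:
    timestamp: int   # collection time in seconds (assumed unit)
    payload: bytes


def split_by_interval(elements, start, interval):
    """Group data elements into subsets keyed by time-interval index."""
    subsets = {}
    for element in elements:
        index = (element.timestamp - start) // interval
        subsets.setdefault(index, []).append(element)
    return subsets


def subsets_for_request(start, interval, req_start, req_end):
    """Indices of the subsets overlapping the half-open range [req_start, req_end)."""
    first = (req_start - start) // interval
    last_exclusive = math.ceil((req_end - start) / interval)
    return list(range(first, last_exclusive))


# A 100-second target data set with one data element per second,
# divided at a preset interval of 10 seconds.
elements = [DataElement(t, b"x") for t in range(100)]
subsets = split_by_interval(elements, start=0, interval=10)

# A 20-second request touches only two of the ten subsets.
needed = subsets_for_request(start=0, interval=10, req_start=30, req_end=50)
```

Under these assumptions, a 20-second request aligned to the interval boundaries needs two of the ten subsets; a request that is not aligned may touch one more subset, which is why a smaller interval tracks the requested period more closely.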
In some embodiments, the processing device 122 may physically store each target data subset, together with the data elements stored therein, in the memory of the processing device 122. After the user finishes calling the partial data elements of the target data set, the target data subset and the data elements stored therein may be erased. Further, the processing device 122 (e.g., the data acquisition unit 324) may, based on the time information of the user's data call request, obtain from the storage device the data elements in the target data subsets whose time intervals match that time information. For example, when the user's location is in the same city or country as the storage device storing the original data set (referred to as the first storage device, e.g., a distributed file system), the processing device 122 (referred to as the local server) may distribute data to the user's terminal (also referred to as the client, e.g., the terminal device 130) based on the above method.
In some embodiments, the processing device 122 may determine, based on the meta-service information that matches the user's data call request, the target data set to which the meta-service information points, and may create multiple logical files corresponding to the target data set based on the time information (e.g., time range) corresponding to the target data set and the time information of the user's data call request. For example, when the client is not located in the same region (e.g., city or country) as the storage device storing the original data set (referred to as the first storage device, e.g., a distributed file system), while the processing device 122 is located in the same region as the first storage device, the processing device 122 may create the multiple logical files corresponding to the target data set. The processing device 122 may further send the target data set and the data elements stored therein to a second storage device located in the same region (e.g., city or country) as the client. The second storage device is closer to the client than the first storage device is. The server hosting the second storage device may divide the received target data set (a physical data file) into multiple target data subsets at a preset time interval and store the data elements in the corresponding target data subsets. The multiple logical files point, by way of pointers, to the target data subsets in the second storage device. The processing device 122 may determine the logical files matching the user's call request by matching the time information in the request against the time information in each logical file, and, based on the target data subsets in the second storage device to which the matched logical files point, instruct the server of the second storage device to send the data elements in the matched target data subsets to the client.
A logical file does not store the data elements of a target data subset; it may store information about the data elements (e.g., part of the meta-service information). A logical file may point to a physical data file (i.e., a target data subset) by way of a pointer. For example, the target data set may be divided into multiple target data subsets at a preset time interval, each target data subset corresponding to one time interval, and one logical file may be created for each target data subset. Each logical file includes the meta-service information of the data elements stored in its target data subset.
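The relationship between logical files and physical target data subsets can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `LogicalFile` class, the helper names, and the path scheme used for the pointer are hypothetical, and time ranges are treated as half-open intervals in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalFile:
    start: int     # inclusive start of the time interval, in seconds
    end: int       # exclusive end of the time interval, in seconds
    pointer: str   # hypothetical location of the physical target data subset


def build_logical_files(dataset_start, dataset_end, interval, base):
    """Create one logical file per time interval; no data elements are copied."""
    files = []
    t = dataset_start
    while t < dataset_end:
        end = min(t + interval, dataset_end)
        files.append(LogicalFile(t, end, f"{base}/part-{t}-{end}"))
        t = end
    return files


def match_logical_files(files, req_start, req_end):
    """Return the logical files whose interval overlaps [req_start, req_end)."""
    return [f for f in files if f.start < req_end and req_start < f.end]


files = build_logical_files(0, 100, 10, "second-storage/camera")
hits = match_logical_files(files, 20, 40)
pointers = [f.pointer for f in hits]
```

In this sketch the matching step only touches the lightweight logical files; the pointers of the matched files identify which physical subsets the server holding them would be asked to send.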
Step 730: synchronize the data to be called to the storage device of the client. In some embodiments, step 730 may be performed by the processing device 122 (e.g., a synchronization module (not shown in the figure)). In some embodiments, the processing device 122 may further merge the data elements in multiple target data subsets and send the merged data to the storage device of the client, thereby synchronizing the data to be called to the client.
According to the above method, when the data call request further includes the time range corresponding to the data to be called, and that time range is smaller than the time range corresponding to the target data set, the processing device 122 does not need to send all the data in the target data set to the client. It only needs to send the data elements corresponding to the time range in the user's data call request (i.e., the data elements in the matching target data subsets), which enables fast calling of partial data and improves the efficiency of data calling.
It should be noted that the above description of the data calling method 700 is provided for convenience of description only and does not limit the present application to the scope of the illustrated embodiments. It can be understood that those skilled in the art, after understanding the principle of the method, may combine the steps arbitrarily, or add or remove steps, without departing from this principle. For example, step 720 may further include an intelligent recommendation process. Specifically, the processing device 122 may record the user's calling habits and recommend call results to the user according to those habits. As another example, the processing device 122 may also predict the user's search behavior based on a machine learning algorithm.
FIG. 8 is a schematic diagram of a data calling process according to some embodiments of the present application. As shown in FIG. 8, the data call request input by the user may include information related to the trip ID (i.e., the ID of the test trip), the time range, and the type. The multiple target data sets may include type A files, type B files, and type C files. According to the indexing mechanism provided by the data processing system 100, the processing device 122 may access the meta-service information based on the data call request. The meta-service information may include the type file index and the type file information shown in FIG. 8. The type file index may be index information related to the trip ID, the time range, and the like; the type file information may be index information related to the identification information of the target data sets, where the identification information of each target data set corresponds to a type. The processing device 122 may determine, according to the type file information and the type file index, the target data set corresponding to each type and the specific position (e.g., the offset start and offset end) of the data elements to be called in each target data set, and thereby obtain the data elements to be called from each target data set. Further, the processing device 122 may merge the data elements to be called into a data packet. The data packet may be transmitted to the client as the result of the data call.
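The lookup in FIG. 8 can be sketched as follows, assuming that the data elements of each type file are stored contiguously in time order so that a time range maps to a single byte range. The index layout, the function names, and the sample contents are hypothetical and serve only to illustrate the offset start and offset end idea.

```python
import bisect


def slice_by_time(blob, timestamps, offsets, t_start, t_end):
    """Return the bytes of the elements with t_start <= timestamp < t_end.

    Element i occupies blob[offsets[i]:offsets[i + 1]], so `offsets` has
    one more entry than `timestamps`.
    """
    lo = bisect.bisect_left(timestamps, t_start)   # index of the offset start
    hi = bisect.bisect_left(timestamps, t_end)     # index of the offset end
    return blob[offsets[lo]:offsets[hi]]


def build_packet(index, types, t_start, t_end):
    """Collect the requested slice of every requested type into one packet."""
    return {
        name: slice_by_time(
            index[name]["blob"],
            index[name]["timestamps"],
            index[name]["offsets"],
            t_start,
            t_end,
        )
        for name in types
    }


# Hypothetical per-type index for one trip: type A elements are 2 bytes long,
# type B elements are 3 bytes long.
index = {
    "A": {"blob": b"a0a1a2a3", "timestamps": [0, 1, 2, 3], "offsets": [0, 2, 4, 6, 8]},
    "B": {"blob": b"bb0bb1bb2bb3", "timestamps": [0, 1, 2, 3], "offsets": [0, 3, 6, 9, 12]},
}
packet = build_packet(index, ["A", "B"], t_start=1, t_end=3)
```

Because only the index is searched, the type files themselves are read once, at the computed byte range, which is the source of the reduced data access described in the text.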
FIG. 9 is a schematic diagram of a data calling scenario according to some embodiments of the present application. As shown in FIG. 9, the data calling scenario may include a client, a local server (and the data processing system 100), and a remote data center.
At the client, the user may input a data call request through the client, that is, a computing device (e.g., through the input/output interface of the terminal device 130).
In some embodiments, the local server (e.g., the data processing system 100) may include an upper-layer file system (which may also be referred to as a logical file system) and an underlying file system. The upper-layer file system may be used to define the interface (i.e., the access) between the local server and the client. For example, the upper-layer file system may provide an indexing mechanism; the index may be established by the processing device 122 based on the original data set. The upper-layer file system may also define information such as files and their attributes, the operations permitted on files, and file directories. Further, through the upper-layer file system, the processing device 122 may determine the index request information corresponding to the data call request, determine, based on the data index information and the file directory, the meta-service information in the underlying file system that corresponds to the call request, and determine, based on the meta-service information (e.g., storage location information), the storage location of the data elements that satisfy the data call request, so as to obtain the data to be called through the underlying file system.
The underlying file system is used to map the upper-layer file system to a physical storage device (e.g., a hard disk in the local server) or a memory device. For example, the underlying file system may include the meta-service information, which includes the index information of the target data set (e.g., the identification information of the target data set, the meta-identification information and storage location information in one-to-one correspondence with each data element in the target data set, the set identification information of the original data set, etc.). The underlying file system may match the index request information determined in the upper-layer file system against the meta-service information and determine where on the physical storage device the data pointed to by the matching meta-service information is stored, thereby obtaining the data elements and realizing the mapping from the upper-layer file system to the physical storage device.
In some embodiments, in the underlying file system, the local server may provide an edge caching mechanism. For example, the local server may store the target data set held in the remote data center into the storage device of the local server according to the method described in process 400. Merely as an example, the local server may, based on the time information (e.g., timestamps) identified by the data elements, divide the target data set into multiple target data subsets at a preset time interval and store them separately in the storage device. A target data subset includes the time information of the metadata, and the time information may be the address at which the data elements are stored in the storage device. Further, the local server may, based on the time information of the user's data call request, obtain from the storage device the data elements in the target data subsets whose time intervals match that time information. For example, when the user's location is in the same region as the storage device storing the original data set, the local server may distribute data to the client based on the above method. As another example, when the client is not located in the same region as the storage device storing the original data set (referred to as the first storage device), while the local server is located in the same region as the first storage device, the local server may create multiple logical files corresponding to the target data set. The local server may further send the target data set and the data elements stored therein to a second storage device located in the same region as the client. The server hosting the second storage device may divide the received target data set (a physical data file) into multiple target data subsets at a preset time interval and store the data elements in the corresponding target data subsets. The multiple logical files point, by way of pointers, to the target data subsets in the second storage device. The local server may determine the logical files matching the user's call request by matching the time information in the request against the time information in each logical file, and, based on the target data subsets in the second storage device to which the matched logical files point, instruct the server of the second storage device to send the data elements in the matched target data subsets to the client. Further, the server of the second storage device may merge the data elements in multiple target data subsets before sending them to the client. According to the above method, when the data call request further includes the time range corresponding to the data to be called, the data actually corresponding to that time range (i.e., the physically stored target data subsets) can be called by accessing the meta-service information corresponding to the target data subsets within the specified time range. Only the data actually corresponding to that time range is synchronized, rather than the entire target data set, which enables fast calling of partial data and improves the efficiency of data calling. For example, as shown in FIG. 9, a user at a preset test station may obtain data stored in the remote data center for further analysis and processing (e.g., debugging of programs under development, test simulation, analysis of problem data, etc.).
FIG. 10 is a schematic diagram of data calling according to some embodiments of the present application. As shown in FIG. 10, when the user needs to call data, the user may input a data call request through the client (step 1), and the data processing system 100 may obtain, from the meta-service module, the meta-service information (or index request information) matching the call request (step 2). Further, the data processing system 100 may determine the target data set to which the meta-service information points. Merely as an example, assume that the data processing system 100 and the local storage device storing the original data set (referred to as the first distributed file system (HDFS)) are located in a first region, while the client and a second HDFS are located in a second region different from the first region, or that the client is closer to the server of the second HDFS than to the server of the first HDFS. For example, the first region may be in the United States and the second region may be in China. Correspondingly, the first HDFS may be a data center established in the United States, and the second HDFS may be a data center established in China (e.g., the Inner Mongolia (NMG) data center). After determining the target data set (e.g., a camera data file) to which the meta-service information points, the data processing system 100 may create multiple logical files (e.g., logical camera data files) corresponding to the target data set based on the time information (e.g., time range) corresponding to the target data set and the time information of the user's data call request. Further, the data processing system 100 may send the target data set and the data elements stored therein to the second HDFS. The server hosting the second HDFS may divide the received target data set (a physical data file) into multiple target data subsets at a preset time interval and store the data elements in the corresponding target data subsets. The multiple logical files point, by way of pointers, to the target data subsets in the second HDFS. The data processing system 100 may determine the logical files matching the user's call request by matching the time information in the request against the time information in each logical file, and, based on the target data subsets in the second HDFS to which the matched logical files point, instruct the server of the second HDFS to send the data elements in the matched target data subsets to the client (step 3). Further, the server of the second HDFS may merge the data elements in multiple target data subsets before sending them to the client (step 4).
FIG. 11 is a schematic diagram of data storage and calling according to some embodiments of the present application. As shown in FIG. 11, the data processing system 100 includes a meta-service module, which obtains a data packet (the original data set described elsewhere herein) through a network. In response to the received data packet, the meta-service module generates a data packet processing task (step 1) and sends the task and the data packet to a processing module. Specifically, the processing module may obtain the data packet (i.e., original data set) processing task, and process and store the data packet. For example, the processing module may process and store the original data packet (original data set) based on the process 400 described in FIG. 4. The processing module may establish different target data sets according to the type information of the data elements in the original data packet. For example, if the data elements in the original data packet are of N types, N different target data sets may be established, each corresponding to one type of data element. The processing module may also set the identification information of each target data set according to its type, so that target data sets of different types can be distinguished. Further, the processing module may determine the meta-service information of the original data set (which may also be referred to as, or may include, index information), which is used to determine the storage location of the original data set, or of the data elements in the original data set, in the storage device.
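The storage step performed by the processing module can be sketched as below. This is a simplified illustration under stated assumptions, not the process 400 itself: elements are modeled as `(type, timestamp, payload)` tuples, each target data set is a single byte string, and the field names of the meta-service records are hypothetical.

```python
from collections import defaultdict


def build_target_datasets(original_set_id, elements):
    """Split an original data set by type and record per-element index entries.

    `elements` is an iterable of (type_name, timestamp, payload) tuples.
    Returns (datasets, meta): one byte string per type, plus one list of
    index records per type, ordered by timestamp.
    """
    datasets = defaultdict(bytearray)
    meta = defaultdict(list)
    # Sort by timestamp so each target data set is stored in time order.
    for type_name, timestamp, payload in sorted(elements, key=lambda e: e[1]):
        offset = len(datasets[type_name])      # where this element will start
        datasets[type_name] += payload
        meta[type_name].append({
            "timestamp": timestamp,            # meta-identification information
            "offset": offset,                  # storage location information
            "length": len(payload),
            "set_id": original_set_id,         # original data set it came from
        })
    return {k: bytes(v) for k, v in datasets.items()}, dict(meta)


elements = [
    ("camera", 2, b"CC"),
    ("lidar", 1, b"LLL"),
    ("camera", 1, b"C"),
]
datasets, meta = build_target_datasets("trip-001", elements)
```

With two element types in the input, two target data sets result, matching the N-types-to-N-data-sets rule in the text; the identifier `trip-001` is a made-up placeholder.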
Based on this processing, the processing module may generate the target data sets and the corresponding meta-service information. Further, the processing module may upload the target data sets and the data elements stored therein (step 2) for storage in a local storage device or system (i.e., the first distributed file system (HDFS)) (step 3), and store the meta-service information in the storage device associated with the meta-service module (step 4). As used herein, a local storage device or system refers to a storage device or system located in the same region (e.g., city or country) as the data processing system 100. In some embodiments, the first HDFS may synchronize the processed data packet to the second HDFS. The second HDFS is located in a different region (e.g., a different city or country) from the first HDFS, so that clients in the region of the second HDFS can call the data. For more description of data calling based on the second HDFS, see FIG. 10.
The meta-service information may include the meta-identification information (e.g., timestamps) and storage location information (e.g., offsets) in one-to-one correspondence with each data element in the target data set, the identification information of the target data set, the set identification information of the original data set to which each data element in the target data set belongs, and the like.
When the user needs to call data, the user may obtain the download address or access address of the data in the first HDFS (step 0). The user may input a data call request through the client (step 5). The data call request may include information such as the type of the data to be called and meta-qualification conditions (e.g., a time range) related to the meta-identification information. For example, when the user wants to call road test data, the data call request may include index request information such as the type of the road test data collection device, the time range of road test data collection, the ID of the test vehicle, and/or the ID of the test trip.
The data processing system 100 may obtain, from the meta-service module, the meta-service information (or index request information) matching the data call request (step 6), and, according to the meta-service information, call the data in the target data subsets stored in a distributed file system (HDFS) near the client (step 7). Calling data from a distributed file system near the client may mean calling data from the first HDFS or the second HDFS (step 8). For example, when the client is located in the same region as the first HDFS storing the original data set, the data processing system 100 may obtain the data elements stored in the target data set from the first HDFS based on the meta-service information. Further, the data processing system 100 may, based on the time information (e.g., timestamps) identified by the data elements, that is, the time information corresponding to the target data set, together with the time information in the user's data call request, divide the target data set into multiple target data subsets at a preset time interval, and physically store each target data subset, together with the data elements stored therein, in the memory of the data processing system 100. Based on the time information of the user's data call request, the data processing system 100 may further obtain from the storage device the data elements in the target data subsets whose time intervals match that time information.
As another example, the first HDFS may synchronize the processed data packet to the second HDFS. The second HDFS is located in a different region (e.g., a different city or country) from the first HDFS, so that clients in the region of the second HDFS can call the data. The second HDFS is closer to the client than the first HDFS is. For more description of data calling based on the second HDFS, see FIG. 10. In some embodiments, the data processing system 100 may further merge the data elements of the multiple target data subsets obtained from the distributed file system (HDFS) near the client and send the merged data to the client (step 9).
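The choice between the first and second HDFS can be illustrated with a small selection routine. The text only states that data is called from a distributed file system near the client, so the rule below (prefer a store in the client's region, otherwise take the nearest one) is an assumption, and the store names, region codes, and distances are invented for the example.

```python
def pick_store(client_region, stores):
    """Pick the store to serve a client.

    `stores` is a list of dicts with hypothetical keys 'name', 'region',
    and 'distance_km' (distance from the client). A store in the client's
    region is preferred; otherwise the nearest store is used.
    """
    same_region = [s for s in stores if s["region"] == client_region]
    candidates = same_region or stores
    return min(candidates, key=lambda s: s["distance_km"])


# Illustrative values only: a client in region "CN" with two candidate stores.
stores = [
    {"name": "first-hdfs", "region": "US", "distance_km": 9500},
    {"name": "second-hdfs", "region": "CN", "distance_km": 400},
]
chosen = pick_store("CN", stores)
```

Here the client in region "CN" is served from `second-hdfs`; a client in a region with no store of its own falls back to whichever store is nearest.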
FIG. 12 is a schematic diagram of a user interaction interface according to some embodiments of the present application. As shown in FIG. 12, the user interaction interface may include a time selection area 1410, a type selection area 1420, a data display area 1430, a download address area 1440, and a processing progress area 1450.
In the time selection area 1410, the user may input the time range corresponding to the data to be called. For example, the user may input the time range through an input device (e.g., a keyboard, mouse, touch screen, microphone, or trackball) associated with the user interaction interface.
In the type selection area 1420, the user may input the type corresponding to the data to be called. For example, as shown in FIG. 12, the user may input the type by using an input device (e.g., a mouse) to tick the check box corresponding to that type.
The data display area 1430 may be used to display the data to be called that corresponds to the time range and type input by the user. For example, as shown in FIG. 12, this data may be displayed in the data display area 1430 as a combination of a timeline and data subsets, so that the user can check or confirm whether the input information is correct.
The download address area 1440 may be used to provide a download link corresponding to the data to be called. For example, the user may click the download link to trigger the data calling process. The data obtained during the data calling process may be merged into a file package, which is then downloaded to the client.
The processing progress area 1450 may be used to display the progress of data processing. For example, as shown in FIG. 12, the progress states may include processed, unprocessed, and unable to process. The user can determine the progress of data processing (e.g., data calling) through the processing progress area 1450.
The beneficial effects that the embodiments of the present application may bring include, but are not limited to: (1) data elements of the same type in the original data set are stored in one target data set, so data of a user-specified type can be called simply by finding and accessing the target data set of that type; (2) the data elements in a target data set may be stored contiguously in chronological order, so the user can further obtain the target data of a specified type within a specified time period; (3) a target data set may be divided into multiple target data subsets at a preset time interval, so the user can, based on a data call request, obtain only the data elements of some of those target data subsets, which enables fast calling of partial data. Compared with calling data from the original data set, the data storage method provided by the embodiments of the present application makes the data calling process simpler and reduces the amount of data accessed, which can improve the efficiency of data calling. It should be noted that different embodiments may produce different beneficial effects; in different embodiments, the beneficial effects may be any one or a combination of the above, or any other beneficial effect that may be obtained.
The basic concepts have been described above. Obviously, for those skilled in the art, the above detailed disclosure is merely an example and does not constitute a limitation on the present application. Although not explicitly stated here, those skilled in the art may make various modifications, improvements, and corrections to the present application. Such modifications, improvements, and corrections are suggested in the present application, and therefore still fall within the spirit and scope of the exemplary embodiments of the present application.
Meanwhile, the present application uses specific words to describe its embodiments. For example, "one embodiment", "an embodiment", and/or "some embodiments" mean a certain feature, structure, or characteristic related to at least one embodiment of the present application. Therefore, it should be emphasized and noted that two or more references to "an embodiment", "one embodiment", or "an alternative embodiment" in different places in this specification do not necessarily refer to the same embodiment. In addition, certain features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.
In addition, those skilled in the art will understand that aspects of the present application may be illustrated and described in terms of several patentable categories or situations, including any new and useful process, machine, product, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present application may be executed entirely by hardware, entirely by software (including firmware, resident software, microcode, etc.), or by a combination of hardware and software. Any of the above hardware or software may be referred to as a "data block", "module", "engine", "unit", "component", or "system". Furthermore, aspects of the present application may be embodied as a computer product located in one or more computer-readable media, the product including computer-readable program code.
A computer storage medium may contain a propagated data signal carrying computer program code, for example on baseband or as part of a carrier wave. The propagated signal may take a variety of forms, including electromagnetic form, optical form, or any suitable combination thereof. A computer storage medium may be any computer-readable medium other than a computer-readable storage medium, and the medium may be connected to an instruction execution system, apparatus, or device to communicate, propagate, or transmit a program for use. Program code located on a computer storage medium may be propagated through any suitable medium, including radio, cable, fiber optic cable, RF, or similar media, or any combination of the foregoing.
The computer program code required for the operation of each part of the present application may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and Python; conventional procedural programming languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, and ABAP; dynamic programming languages such as Python, Ruby, and Groovy; or other programming languages. The program code may run entirely on the user's computer, as a stand-alone software package on the user's computer, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or connected to an external computer (for example, through the Internet), or used in a cloud computing environment, or used as a service, such as software as a service (SaaS).
In addition, unless explicitly stated in the claims, the order of the processing elements and sequences described in the present application, the use of numbers and letters, and the use of other names are not intended to limit the order of the processes and methods of the present application. Although the above disclosure discusses, through various examples, some embodiments of the invention currently considered useful, it should be understood that such details are for illustrative purposes only, and that the appended claims are not limited to the disclosed embodiments; on the contrary, the claims are intended to cover all modifications and equivalent combinations that fall within the spirit and scope of the embodiments of the present application. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that, in order to simplify the presentation of the disclosure and thereby aid the understanding of one or more embodiments of the invention, the foregoing description of the embodiments sometimes groups multiple features into a single embodiment, drawing, or description thereof. However, this method of disclosure does not mean that the subject matter of the present application requires more features than are recited in the claims. Indeed, an embodiment may have fewer than all of the features of a single embodiment disclosed above.
Some embodiments use numbers to describe quantities of components and attributes. It should be understood that such numbers used in the description of the embodiments are, in some examples, modified by "about", "approximately", or "substantially". Unless otherwise stated, "about", "approximately", or "substantially" indicates that the stated number allows a variation of ±20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations, which may change depending on the characteristics required by individual embodiments. In some embodiments, the numerical parameters should take into account the specified number of significant digits and apply ordinary rounding. Although the numerical ranges and parameters used in some embodiments of the present application to establish the breadth of their scope are approximations, in specific embodiments such values are set as precisely as practicable.
Each patent, patent application, patent application publication, and other material cited in the present application, such as articles, books, specifications, publications, and documents, is hereby incorporated into the present application by reference in its entirety, except for application history documents that are inconsistent with or conflict with the content of the present application, and except for documents (currently or later attached to the present application) that limit the broadest scope of the claims of the present application. It should be noted that if the descriptions, definitions, and/or use of terms in the materials attached to the present application are inconsistent with or conflict with the content of the present application, the descriptions, definitions, and/or use of terms in the present application shall prevail.
Finally, it should be understood that the embodiments described in the present application are only intended to illustrate the principles of the embodiments of the present application. Other variations may also fall within the scope of the present application. Therefore, by way of example and not limitation, alternative configurations of the embodiments of the present application may be regarded as consistent with the teachings of the present application. Accordingly, the embodiments of the present application are not limited to those expressly introduced and described herein.

Claims (29)

  1. A data storage method executed by a computing device, wherein the method comprises:
    obtaining an original data set, the original data set including a plurality of data elements, each data element having type information indicating the type of that data element;
    obtaining, according to the type information of the data elements in the original data set, the number N of different types, and correspondingly establishing N different target data sets, the N different target data sets corresponding to the different types of data elements, wherein N is an integer greater than or equal to 2; and
    storing, based on the type information of the data elements in the original data set and the target data sets, the data elements corresponding to each target data set in that target data set, the target data sets being stored in a first storage device.
  2. The method according to claim 1, wherein the data set is a file, and the data elements of the file are messages.
  3. The method according to claim 1, wherein the types include one or more of an image type, a location type, a sensor type, a data packet type, and a controller area network bus type.
  4. The method according to claim 1, wherein the method further comprises:
    establishing index information of the target data set, the index information including at least element identification information and storage location information in one-to-one correspondence with each data element in the target data set, wherein the element identification information refers to identification information of the corresponding data element.
  5. The method according to claim 4, wherein the data elements in the target data set are arranged in chronological order, and the element identification information includes time information of the corresponding data element.
  6. The method according to claim 4, wherein the index information further includes set identification information of the original data set corresponding to each data element in the target data set, wherein the set identification information refers to identification information of the original data set.
  7. The method according to claim 1, wherein the data in the original data set includes data generated or collected by an autonomous vehicle during operation.
  8. The method according to claim 1, wherein:
    a data calling request sent by a user terminal is received, the data calling request including at least the type to which the data to be called belongs;
    a target data set of the corresponding type is determined from the N different target data sets based on the data calling request;
    the data to be called is obtained based on the data elements in the determined target data set; and
    the data to be called is sent to a second storage device of the user terminal.
  9. The method according to claim 8, wherein obtaining the data to be called based on the data elements in the determined target data set further comprises:
    obtaining the target data set and the data elements stored therein from the first storage device;
    dividing the target data set into a plurality of target data subsets at a preset time interval; and
    obtaining, based on the data calling request, the data elements corresponding to some of the plurality of target data subsets, the data to be called including the data elements corresponding to those target data subsets.
  10. The method according to claim 8, wherein obtaining the data to be called based on the data elements in the determined target data set further comprises:
    sending the determined target data set obtained from the first storage device, together with the data elements stored therein, to a third storage device, the first storage device being farther from the user terminal than the third storage device is;
    dividing the target data set into a plurality of target data subsets at a preset time interval and storing them in the third storage device;
    establishing a plurality of logical files, each logical file corresponding to one of the plurality of target data subsets and including index information corresponding to the data elements in that target data subset; and
    obtaining from the third storage device, based on the data calling request and the logical files, the data elements stored in some of the plurality of target data subsets, the data to be called including the data elements corresponding to those target data subsets.
  11. A data storage system, wherein the system comprises:
    an original data set obtaining module, configured to obtain an original data set, the original data set including a plurality of data elements, each data element having type information indicating the type of that data element;
    a target data set establishing module, configured to obtain, according to the type information of the data elements in the original data set, the number N of different types, and to correspondingly establish N different target data sets, the N different target data sets corresponding to the different types of data elements, wherein N is an integer greater than or equal to 2; and
    a storage module, configured to store, based on the type information of the data elements in the original data set and the target data sets, the data elements corresponding to each target data set in that target data set.
  12. The system according to claim 11, wherein the data set is a file, and the data elements of the file are messages.
  13. The system according to claim 11, wherein the types include one or more of an image type, a location type, a sensor type, a data packet type, and a controller area network bus type.
  14. The system according to claim 11, wherein the system further comprises:
    an index information establishing module, configured to establish index information of the target data set, the index information including at least element identification information and storage location information in one-to-one correspondence with each data element in the target data set, wherein the element identification information refers to identification information of the corresponding data element.
  15. The system according to claim 14, wherein the data elements in the target data set are arranged in chronological order, and the element identification information includes time information of the corresponding data element.
  16. The system according to claim 14, wherein the index information further includes set identification information of the original data set corresponding to each data element in the target data set, wherein the set identification information refers to identification information of the original data set.
  17. A storage medium, wherein the storage medium is configured to store computer instructions, and after a computer reads the computer instructions in the storage medium, the computer executes the data storage method according to any one of claims 1 to 10.
  18. A data calling method executed by a computing device, wherein data elements in an original data set are stored in corresponding target data sets according to the data storage method of any one of claims 1 to 10, the target data sets are stored in a first storage device associated with the computing device, and the data calling method comprises:
    obtaining a data calling request sent by a user terminal, the data calling request including at least the type to which the data to be called belongs;
    obtaining, based on the data calling request, partial data in the target data sets to obtain the data to be called, the partial data including data elements in the target data set corresponding to the type to which the data to be called belongs; and
    sending the data to be called to a second storage device of the user terminal.
  19. The method according to claim 18, wherein the target data set has corresponding index information, the index information including at least element identification information and storage location information in one-to-one correspondence with each data element in the target data set, wherein the element identification information refers to identification information of the corresponding data element; the data calling request further includes an element qualification condition related to the element identification information;
    obtaining, based on the data calling request, the data elements in the target data set of the corresponding type comprises:
    obtaining, based on the data calling request, index information that corresponds to the corresponding type and satisfies the element qualification condition; and
    obtaining the data elements based on the storage locations in the obtained index information.
  20. The method according to claim 19, wherein the data elements in the target data set are arranged in chronological order, the element identification information includes time information of the corresponding data element, and the element qualification condition includes a time range corresponding to the data to be called.
  21. The method according to claim 18, wherein obtaining partial data in the target data set based on the data calling request further comprises:
    dividing the target data set into a plurality of target data subsets at a preset time interval; and
    obtaining, based on the data calling request, the data elements corresponding to some of the target data subsets, the data to be called including the data elements corresponding to those target data subsets.
  22. The method according to claim 18, wherein obtaining partial data in the target data set based on the data calling request further comprises:
    sending the target data set obtained from the first storage device, together with the data elements stored therein, to a third storage device, the first storage device being farther from the user terminal than the third storage device is;
    dividing the target data set into a plurality of target data subsets at a preset time interval and storing them in the third storage device;
    establishing a plurality of logical files, each logical file corresponding to one of the plurality of target data subsets and including index information corresponding to the data elements in that target data subset; and
    obtaining from the third storage device, based on the data calling request and the logical files, the data elements corresponding to some of the plurality of target data subsets, the data to be called including the data elements corresponding to those target data subsets.
  23. The method according to claim 19, wherein the index information further includes set identification information of the original data set corresponding to each data element in the target data set, wherein the set identification information refers to identification information of the original data set; the data calling request further includes a set qualification condition related to the set identification information;
    obtaining, based on the data calling request, the index information that corresponds to the corresponding type and satisfies the element qualification condition comprises:
    obtaining, based on the data calling request, index information that corresponds to the corresponding type and satisfies the set qualification condition.
  24. A data calling system, wherein data elements in an original data set are stored in corresponding target data sets according to the data storage method of any one of claims 1 to 10, the target data sets are stored in a first storage device associated with a computing device, and the data calling system comprises:
    a user request obtaining module, configured to obtain a data calling request sent by a user terminal, the data calling request including at least the type to which the data to be called belongs; and
    a calling module, configured to obtain, based on the data calling request, partial data in the target data sets to obtain the data to be called, the partial data including data elements in the target data set corresponding to the type to which the data to be called belongs.
  25. The system according to claim 24, wherein the target data set has corresponding index information, the index information including at least element identification information and storage location information in one-to-one correspondence with each data element in the target data set, wherein the element identification information refers to identification information of the corresponding data element; the data calling request further includes an element qualification condition related to the element identification information;
    the calling module comprises:
    an index information obtaining unit, configured to obtain, based on the data calling request, index information that corresponds to the corresponding type and satisfies the element qualification condition;
    a segmented storage unit, configured to divide the target data set into a plurality of target data subsets at a preset time interval and to store each target data subset separately; and
    a data element obtaining unit, configured to obtain the data elements based on the storage locations in the obtained index information.
  26. The system according to claim 25, wherein the data elements in the target data set are arranged in chronological order, the element identification information includes time information of the corresponding data element, and the element qualification condition includes a time range corresponding to the data to be called.
  27. The system according to claim 26, wherein the data calling system further comprises a synchronization module, the synchronization module being configured to send the data to be called to a second storage device of the user terminal.
  28. The system according to claim 25, wherein the index information further includes set identification information of the original data set corresponding to each data element in the target data set, wherein the set identification information refers to identification information of the original data set; the data calling request further includes a set qualification condition related to the set identification information;
    the index information obtaining unit is further configured to:
    obtain, based on the data calling request, index information that corresponds to the corresponding type and satisfies the set qualification condition.
  29. A storage medium, wherein the storage medium is configured to store computer instructions, and after a computer reads the computer instructions in the storage medium, the computer executes the data calling method according to any one of claims 18 to 23.
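
As an illustration only, and not as part of the claims, the index-based calling flow recited in claims 18 to 20 and 23 might be sketched as follows. The names `IndexEntry` and `call_data`, the byte-offset form of the storage location, and the dictionary standing in for the first storage device are all assumptions made for this sketch; the application does not prescribe them.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Index information for one data element (hypothetical layout)."""
    timestamp: float  # element identification information (time)
    offset: int       # storage location: byte offset in the target data set
    length: int       # storage location: number of bytes
    source_set: str   # set identification of the original data set


def call_data(storage, index, data_type, start, end, source_set=None):
    """Return payloads of `data_type` with start <= timestamp <= end."""
    entries = index[data_type]  # assumed to be sorted by timestamp
    times = [entry.timestamp for entry in entries]
    lo, hi = bisect_left(times, start), bisect_right(times, end)
    blob = storage[data_type]
    return [
        blob[entry.offset:entry.offset + entry.length]
        for entry in entries[lo:hi]
        if source_set is None or entry.source_set == source_set
    ]


# Stand-in for the first storage device: one byte string per target data set.
storage = {"image": b"AAAABBBBCCCC", "location": b"xxyy"}
index = {
    "image": [
        IndexEntry(1.0, 0, 4, "drive_001"),
        IndexEntry(5.0, 4, 4, "drive_001"),
        IndexEntry(9.0, 8, 4, "drive_002"),
    ],
    "location": [
        IndexEntry(2.0, 0, 2, "drive_001"),
        IndexEntry(6.0, 2, 2, "drive_002"),
    ],
}

# Type plus time range (element qualification condition).
by_time = call_data(storage, index, "image", start=4.0, end=10.0)
# Type, time range, and original data set (set qualification condition).
by_set = call_data(storage, index, "image", 0.0, 10.0, source_set="drive_002")
```

Because the index entries are ordered by time, the time-range lookup here is a binary search over the index rather than a scan of the stored data; only the matching byte ranges are read.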
PCT/CN2021/110847 2020-09-07 2021-08-05 Data storage method and system, and data calling method and system WO2022048387A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010931768.6 2020-09-07
CN202010931768.6A CN112069368B (en) 2020-09-07 Data storage and calling method and system

Publications (1)

Publication Number Publication Date
WO2022048387A1 true WO2022048387A1 (en) 2022-03-10

Family

ID=73664155

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/110847 WO2022048387A1 (en) 2020-09-07 2021-08-05 Data storage method and system, and data calling method and system

Country Status (1)

Country Link
WO (1) WO2022048387A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109886541A (en) * 2019-01-14 2019-06-14 北京百度网讯科技有限公司 Automatic driving vehicle Data Quality Assessment Methodology, device and storage medium
CN110619693A (en) * 2018-06-20 2019-12-27 北京图森未来科技有限公司 Automatic driving data management system and method and data processing system
CN110830555A (en) * 2019-10-15 2020-02-21 图灵人工智能研究院(南京)有限公司 Data processing method, control device and storage medium for unmanned equipment
CN111258974A (en) * 2020-01-20 2020-06-09 吉利汽车研究院(宁波)有限公司 Vehicle offline scene data processing method and system
CN112069368A (en) * 2020-09-07 2020-12-11 北京航迹科技有限公司 Data storage and calling method and system

Also Published As

Publication number Publication date
CN112069368A (en) 2020-12-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21863455

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 23/06/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21863455

Country of ref document: EP

Kind code of ref document: A1